掌握eBPF - tailcall — seamaner's blog

本文从C语言中函数调用的jump优化入手，理解 eBPF 的重要概念 - tailcall.

什么是Tail-call

“tail-call”（尾调用）是指一个函数在执行结束时，最后一步调用的是另一个函数，且该调用的返回值直接作为当前函数的返回值。简单来说，就是当前函数的最后一步是调用另一个函数，并将该函数的返回值返回给当前函数的调用者，而不需要在当前函数中进行其他的操作。

尾调用优化是一种编译器优化技术，它可以在某些情况下避免在函数调用时创建新的栈帧，从而减少函数调用的开销和内存消耗。这种优化通常在函数嵌套调用较深或需要进行递归计算时会带来明显的性能提升。
看一个C语言计算斐波那契数列的例子。

#include <stdint.h>

static uint64_t fib_tail(uint64_t n, uint64_t a, uint64_t b)
{
    if (n == 0)
        return a;
    if (n == 1)
        return b;

    return fib_tail(n - 1, b, a + b);
}

uint64_t fib(uint64_t n)
{
    return fib_tail(n, 1, 1);
}
int main()
{
    int n = fib(10000000);
    printf("fib(%d) is %d\n", 10000000, n);
}

使用gcc -c fib.c -o fib.o编译，查看汇编代码（使用objdump -d fib.o），可以看到fib函数调用fib_tail(call 0 <fib_tail>):

0000000000000000 <fib_tail>:
   0:   f3 0f 1e fa             endbr64
   4:   55                      push   %rbp
   5:   48 89 e5                mov    %rsp,%rbp
   8:   48 83 ec 20             sub    $0x20,%rsp
   c:   48 89 7d f8             mov    %rdi,-0x8(%rbp)
  10:   48 89 75 f0             mov    %rsi,-0x10(%rbp)
  14:   48 89 55 e8             mov    %rdx,-0x18(%rbp)
  18:   48 83 7d f8 00          cmpq   $0x0,-0x8(%rbp)
  1d:   75 06                   jne    25 <fib_tail+0x25>
  1f:   48 8b 45 f0             mov    -0x10(%rbp),%rax
  23:   eb 2f                   jmp    54 <fib_tail+0x54>
  25:   48 83 7d f8 01          cmpq   $0x1,-0x8(%rbp)
  2a:   75 06                   jne    32 <fib_tail+0x32>
  2c:   48 8b 45 e8             mov    -0x18(%rbp),%rax
  30:   eb 22                   jmp    54 <fib_tail+0x54>
  32:   48 8b 55 f0             mov    -0x10(%rbp),%rdx
  36:   48 8b 45 e8             mov    -0x18(%rbp),%rax
  3a:   48 01 c2                add    %rax,%rdx
  3d:   48 8b 45 f8             mov    -0x8(%rbp),%rax
  41:   48 8d 48 ff             lea    -0x1(%rax),%rcx
  45:   48 8b 45 e8             mov    -0x18(%rbp),%rax
  49:   48 89 c6                mov    %rax,%rsi
  4c:   48 89 cf                mov    %rcx,%rdi
  4f:   e8 ac ff ff ff          call   0 <fib_tail>
  54:   c9                      leave
  55:   c3                      ret

gcc fib.c -o fib编译，为了便于查看执行结果，用gdb运行：

gdb ./fib
(gdb) run
Starting program: /home/ubuntu/ebpf/eBPF-tutorial/tail-call/fib
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Program received signal SIGSEGV, Segmentation fault.
0x0000555555555198 in fib_tail ()
(gdb) bt -1
#92469 0x000055555555519d in fib_tail ()

可以看到程序执行到92469次，栈溢出sement fault退出。

函数的最后一行是函数调用，由于调用返回后会接着返回，因此当前函数栈frame不需要保存，call可以优化成jmp。
call相当于，push ret-address + jump ，优化后省掉了压栈操作。这样Tail-call达到了栈不增长的效果。 gcc -O3优化后，可以看到call被优化掉, 汇编代码中没有了 call 0 <fib_tail>这样的指令：

fib.o:     file format elf64-x86-64


Disassembly of section .text:

0000000000000000 <fib>:
   0:   f3 0f 1e fa             endbr64
   4:   48 83 ff 01             cmp    $0x1,%rdi
   8:   76 2e                   jbe    38 <fib+0x38>
   a:   b8 01 00 00 00          mov    $0x1,%eax
   f:   ba 01 00 00 00          mov    $0x1,%edx
  14:   eb 0d                   jmp    23 <fib+0x23>
  16:   66 2e 0f 1f 84 00 00    cs nopw 0x0(%rax,%rax,1)
  1d:   00 00 00
  20:   4c 89 c0                mov    %r8,%rax
  23:   48 83 ef 01             sub    $0x1,%rdi
  27:   4c 8d 04 02             lea    (%rdx,%rax,1),%r8
  2b:   48 89 c2                mov    %rax,%rdx
  2e:   48 83 ff 01             cmp    $0x1,%rdi
  32:   75 ec                   jne    20 <fib+0x20>
  34:   4c 89 c0                mov    %r8,%rax
  37:   c3                      ret
  38:   41 b8 01 00 00 00       mov    $0x1,%r8d
  3e:   eb f4                   jmp    34 <fib+0x34>

优化后的fib可以正常运行，不会stack溢出崩溃掉，最终计算出结果：

gcc fib.c -O3 -o fib.o3
 ./fib.o3
fib(10000000) is 8644293272739028509

什么是eBPF Tail-call

       long bpf_tail_call(void *ctx, struct bpf_map *prog_array_map, u32
       index)

              Description
                     This special helper is used to trigger a "tail
                     call", or in other words, to jump into another eBPF
                     program. The same stack frame is used (but values
                     on stack and in registers for the caller are not
                     accessible to the callee). This mechanism allows
                     for program chaining, either for raising the
                     maximum number of available eBPF instructions, or
                     to execute given programs in conditional blocks.
                     For security reasons, there is an upper limit to
                     the number of successive tail calls that can be
                     performed.

                     Upon call of this helper, the program attempts to
                     jump into a program referenced at index index in
                     prog_array_map, a special map of type
                     BPF_MAP_TYPE_PROG_ARRAY, and passes ctx, a pointer
                     to the context.

                     If the call succeeds, the kernel immediately runs
                     the first instruction of the new program. This is
                     not a function call, and it never returns to the
                     previous program. If the call fails, then the
                     helper has no effect, and the caller continues to
                     run its subsequent instructions. A call can fail if
                     the destination program for the jump does not exist
                     (i.e. index is superior to the number of entries in
                     prog_array_map), or if the maximum number of tail
                     calls has been reached for this chain of programs.
                     This limit is defined in the kernel by the macro
                     MAX_TAIL_CALL_CNT (not accessible to user space),
                     which is currently set to 33.

              Return 0 on success, or a negative error in case of
                     failure.

被调用的函数执行完，函数返回到哪里？
相当于从调用bpf_tail_call helper函数的地方直接返回了，上上一层函数的地方接着往下执行。

A:
A1: codeA1
A2: codeA2
A3: call B
A4: codeA4

B:
B1: codeB1
B2: tail_call C
B3: codeB3

C:
C1: codeC1
C2: codeC2   -- ret

B2: tail_call c成功后，B3不会再执行，执行流返回到A4继续执行。
执行的代码是：
A1 -> A2 -> A3(call B) -> B1 -> B2(tailcall C) -> C1 -> C2(ret) ->A4
注意因为tailcall的缘故，B3没有执行。

文档信息

本文作者：seamaner
本文链接：https://seamaner.github.io/2024/11/04/ebpf-tailcall/
版权声明：自由转载-非商用-非衍生-保持署名（创意共享3.0许可证）

seamaner's blog

掌握eBPF - tailcall

什么是Tail-call

什么是eBPF Tail-call

文档信息

Search

Table of Contents