Commit ee17c5d

author
Herton R. Krzesinski
committed
Merge: bpf, xdp: update to 6.0
MR: https://gitlab.com/redhat/centos-stream/src/kernel/centos-stream-9/-/merge_requests/1742

bpf, xdp: update to 6.0

Bugzilla: https://bugzilla.redhat.com/2137876

Signed-off-by: Artem Savkov <asavkov@redhat.com>

Approved-by: Jiri Benc <jbenc@redhat.com>
Approved-by: Prarit Bhargava <prarit@redhat.com>
Approved-by: Jerome Marchand <jmarchan@redhat.com>
Approved-by: Yauheni Kaliuta <ykaliuta@redhat.com>
Approved-by: Michael Petlan <mpetlan@redhat.com>

Signed-off-by: Herton R. Krzesinski <herton@redhat.com>
2 parents a91d469 + 4722c43 commit ee17c5d

246 files changed, +13471 -8263 lines changed

Documentation/bpf/bpf_design_QA.rst

Lines changed: 25 additions & 0 deletions
@@ -214,6 +214,12 @@ A: NO. Tracepoints are tied to internal implementation details hence they are
 subject to change and can break with newer kernels. BPF programs need to change
 accordingly when this happens.
 
+Q: Are places where kprobes can attach part of the stable ABI?
+--------------------------------------------------------------
+A: NO. The places to which kprobes can attach are internal implementation
+details, which means that they are subject to change and can break with
+newer kernels. BPF programs need to change accordingly when this happens.
+
 Q: How much stack space a BPF program uses?
 -------------------------------------------
 A: Currently all program types are limited to 512 bytes of stack
@@ -273,3 +279,22 @@ cc (congestion-control) implementations. If any of these kernel
 functions has changed, both the in-tree and out-of-tree kernel tcp cc
 implementations have to be changed. The same goes for the bpf
 programs and they have to be adjusted accordingly.
+
+Q: Attaching to arbitrary kernel functions is an ABI?
+-----------------------------------------------------
+Q: BPF programs can be attached to many kernel functions. Do these
+kernel functions become part of the ABI?
+
+A: NO.
+
+The kernel function prototypes will change, and BPF programs attaching to
+them will need to change. The BPF compile-once-run-everywhere (CO-RE)
+should be used in order to make it easier to adapt your BPF programs to
+different versions of the kernel.
+
+Q: Marking a function with BTF_ID makes that function an ABI?
+-------------------------------------------------------------
+A: NO.
+
+The BTF_ID macro does not cause a function to become part of the ABI
+any more than does the EXPORT_SYMBOL_GPL macro.

Documentation/bpf/btf.rst

Lines changed: 5 additions & 1 deletion
@@ -369,7 +369,8 @@ No additional type data follow ``btf_type``.
 * ``name_off``: offset to a valid C identifier
 * ``info.kind_flag``: 0
 * ``info.kind``: BTF_KIND_FUNC
-* ``info.vlen``: 0
+* ``info.vlen``: linkage information (BTF_FUNC_STATIC, BTF_FUNC_GLOBAL
+or BTF_FUNC_EXTERN)
 * ``type``: a BTF_KIND_FUNC_PROTO type
 
 No additional type data follow ``btf_type``.
@@ -380,6 +381,9 @@ type. The BTF_KIND_FUNC may in turn be referenced by a func_info in the
 :ref:`BTF_Ext_Section` (ELF) or in the arguments to :ref:`BPF_Prog_Load`
 (ABI).
 
+Currently, only linkage values of BTF_FUNC_STATIC and BTF_FUNC_GLOBAL are
+supported in the kernel.
+
 2.2.13 BTF_KIND_FUNC_PROTO
 ~~~~~~~~~~~~~~~~~~~~~~~~~~
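
The btf.rst change above says that a BTF_KIND_FUNC entry now carries its linkage in ``info.vlen``. The sketch below shows one way to read that field from the 32-bit ``info`` word. It is a userspace illustration only: the ``DEMO_``/``demo_`` names are hypothetical, and the numeric values and bit positions are copied from what include/uapi/linux/btf.h is understood to define, so check them against the header for the kernel you target.

```c
#include <stdint.h>

/*
 * Assumption: these values mirror include/uapi/linux/btf.h. They are
 * redefined with a DEMO_ prefix so the sketch builds without kernel headers.
 */
#define DEMO_BTF_KIND_FUNC 12

enum demo_btf_func_linkage {
	DEMO_BTF_FUNC_STATIC = 0,
	DEMO_BTF_FUNC_GLOBAL = 1,
	DEMO_BTF_FUNC_EXTERN = 2,
};

/* btf_type.info layout: bits 0-15 vlen, bits 24-28 kind, bit 31 kind_flag. */
static inline uint32_t demo_btf_info_vlen(uint32_t info)
{
	return info & 0xffff;
}

static inline uint32_t demo_btf_info_kind(uint32_t info)
{
	return (info >> 24) & 0x1f;
}

static inline uint32_t demo_btf_info_kflag(uint32_t info)
{
	return info >> 31;
}

/* Returns the linkage stored in vlen, or -1 if the entry is not a FUNC. */
static inline int demo_btf_func_linkage(uint32_t info)
{
	if (demo_btf_info_kind(info) != DEMO_BTF_KIND_FUNC)
		return -1;
	return (int)demo_btf_info_vlen(info);
}

/* Per the note added in the diff, only STATIC and GLOBAL are accepted. */
static inline int demo_kernel_accepts_linkage(int linkage)
{
	return linkage == DEMO_BTF_FUNC_STATIC ||
	       linkage == DEMO_BTF_FUNC_GLOBAL;
}
```

A loader could use a check like ``demo_kernel_accepts_linkage`` to reject BTF_FUNC_EXTERN entries before handing BTF to the kernel, rather than waiting for the load to fail.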
385389

Documentation/bpf/index.rst

Lines changed: 1 addition & 0 deletions
@@ -19,6 +19,7 @@ that goes into great technical depth about the BPF Architecture.
 faq
 syscall_api
 helpers
+kfuncs
 programs
 maps
 bpf_prog_run

Documentation/bpf/instruction-set.rst

Lines changed: 2 additions & 2 deletions
@@ -127,7 +127,7 @@ BPF_XOR | BPF_K | BPF_ALU64 means::
 Byte swap instructions
 ----------------------
 
-The byte swap instructions use an instruction class of ``BFP_ALU`` and a 4-bit
+The byte swap instructions use an instruction class of ``BPF_ALU`` and a 4-bit
 code field of ``BPF_END``.
 
 The byte swap instructions operate on the destination register
@@ -351,7 +351,7 @@ These instructions have seven implicit operands:
 * Register R0 is an implicit output which contains the data fetched from
 the packet.
 * Registers R1-R5 are scratch registers that are clobbered after a call to
-``BPF_ABS | BPF_LD`` or ``BPF_IND`` | BPF_LD instructions.
+``BPF_ABS | BPF_LD`` or ``BPF_IND | BPF_LD`` instructions.
 
 These instructions have an implicit program exit condition as well. When an
 eBPF program is trying to access the data beyond the packet boundary, the
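
The first instruction-set.rst hunk above fixes the class name for byte swap instructions: class ``BPF_ALU`` with a 4-bit code field of ``BPF_END``. As a rough illustration of what that means for the 8-bit opcode, the sketch below splits an opcode into its class, source and code fields and tests for a byte swap. The ``DEMO_``/``demo_`` names are hypothetical, and the constants are assumed to match the kernel UAPI headers.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Assumption: values mirror the kernel UAPI definitions of BPF_ALU,
 * BPF_END, BPF_TO_LE and BPF_TO_BE, redefined with a DEMO_ prefix.
 */
#define DEMO_BPF_ALU   0x04
#define DEMO_BPF_END   0xd0
#define DEMO_BPF_TO_LE 0x00
#define DEMO_BPF_TO_BE 0x08

/* Opcode layout: bits 0-2 class, bit 3 source, bits 4-7 code. */
static inline uint8_t demo_bpf_class(uint8_t opcode)
{
	return opcode & 0x07;
}

static inline uint8_t demo_bpf_src(uint8_t opcode)
{
	return opcode & 0x08;
}

static inline uint8_t demo_bpf_op(uint8_t opcode)
{
	return opcode & 0xf0;
}

/* True when the opcode is BPF_ALU | BPF_END, in either direction. */
static inline bool demo_is_byte_swap(uint8_t opcode)
{
	return demo_bpf_class(opcode) == DEMO_BPF_ALU &&
	       demo_bpf_op(opcode) == DEMO_BPF_END;
}
```

For a byte swap the source bit selects the direction, so ``demo_bpf_src`` returns ``DEMO_BPF_TO_LE`` or ``DEMO_BPF_TO_BE`` rather than a register or immediate selector.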

Documentation/bpf/kfuncs.rst

Lines changed: 170 additions & 0 deletions
@@ -0,0 +1,170 @@
+=============================
+BPF Kernel Functions (kfuncs)
+=============================
+
+1. Introduction
+===============
+
+BPF Kernel Functions, more commonly known as kfuncs, are functions in the Linux
+kernel which are exposed for use by BPF programs. Unlike normal BPF helpers,
+kfuncs do not have a stable interface and can change from one kernel release to
+another. Hence, BPF programs need to be updated in response to changes in the
+kernel.
+
+2. Defining a kfunc
+===================
+
+There are two ways to expose a kernel function to BPF programs: either make an
+existing function in the kernel visible, or add a new wrapper for BPF. In both
+cases, care must be taken that a BPF program can only call such a function in
+a valid context. To enforce this, visibility of a kfunc can be per program type.
+
+If you are not creating a BPF wrapper for an existing kernel function, skip
+ahead to :ref:`BPF_kfunc_nodef`.
+
+2.1 Creating a wrapper kfunc
+----------------------------
+
+When defining a wrapper kfunc, the wrapper function should have extern linkage.
+This prevents the compiler from optimizing away dead code, as this wrapper kfunc
+is not invoked anywhere in the kernel itself. It is not necessary to provide a
+prototype in a header for the wrapper kfunc.
+
+An example is given below::
+
+	/* Disables missing prototype warnings */
+	__diag_push();
+	__diag_ignore_all("-Wmissing-prototypes",
+			  "Global kfuncs as their definitions will be in BTF");
+
+	struct task_struct *bpf_find_get_task_by_vpid(pid_t nr)
+	{
+		return find_get_task_by_vpid(nr);
+	}
+
+	__diag_pop();
+
+A wrapper kfunc is often needed when we need to annotate parameters of the
+kfunc. Otherwise one may directly make the kfunc visible to the BPF program by
+registering it with the BPF subsystem. See :ref:`BPF_kfunc_nodef`.
+
+2.2 Annotating kfunc parameters
+-------------------------------
+
+Similar to BPF helpers, the verifier sometimes requires additional context to
+make the usage of kernel functions safer and more useful. Hence, we can
+annotate a parameter by suffixing the name of the argument of the kfunc with a
+__tag, where tag may be one of the supported annotations.
+
+2.2.1 __sz Annotation
+---------------------
+
+This annotation is used to indicate a memory and size pair in the argument list.
+An example is given below::
+
+	void bpf_memzero(void *mem, int mem__sz)
+	{
+	...
+	}
+
+Here, the verifier will treat the first argument as a PTR_TO_MEM, and the second
+argument as its size. By default, without the __sz annotation, the size of the
+type of the pointer is used. Without the __sz annotation, a kfunc cannot accept
+a void pointer.
+
+.. _BPF_kfunc_nodef:
+
+2.3 Using an existing kernel function
+-------------------------------------
+
+When an existing function in the kernel is fit for consumption by BPF programs,
+it can be directly registered with the BPF subsystem. However, care must still
+be taken to review the context in which it will be invoked by the BPF program
+and whether it is safe to do so.
+
+2.4 Annotating kfuncs
+---------------------
+
+In addition to kfuncs' arguments, the verifier may need more information about
+the type of kfunc(s) being registered with the BPF subsystem. To do so, we
+define flags on a set of kfuncs as follows::
+
+	BTF_SET8_START(bpf_task_set)
+	BTF_ID_FLAGS(func, bpf_get_task_pid, KF_ACQUIRE | KF_RET_NULL)
+	BTF_ID_FLAGS(func, bpf_put_pid, KF_RELEASE)
+	BTF_SET8_END(bpf_task_set)
+
+This set encodes the BTF ID of each kfunc listed above, and encodes the flags
+along with it. Of course, it is also allowed to specify no flags.
+
+2.4.1 KF_ACQUIRE flag
+---------------------
+
+The KF_ACQUIRE flag is used to indicate that the kfunc returns a pointer to a
+refcounted object. The verifier will then ensure that the pointer to the object
+is eventually released using a release kfunc, or transferred to a map using a
+referenced kptr (by invoking bpf_kptr_xchg). If not, the verifier rejects the
+BPF program: loading succeeds only when no lingering references remain in any
+explored state of the program.
+
+2.4.2 KF_RET_NULL flag
+----------------------
+
+The KF_RET_NULL flag is used to indicate that the pointer returned by the kfunc
+may be NULL. Hence, it forces the user to do a NULL check on the pointer
+returned from the kfunc before making use of it (dereferencing or passing to
+another helper). This flag is often used together with the KF_ACQUIRE flag,
+but the two are orthogonal to each other.
+
+2.4.3 KF_RELEASE flag
+---------------------
+
+The KF_RELEASE flag is used to indicate that the kfunc releases the pointer
+passed in to it. There can be only one referenced pointer that can be passed in.
+All copies of the pointer being released are invalidated as a result of invoking
+a kfunc with this flag.
+
+2.4.4 KF_KPTR_GET flag
+----------------------
+
+The KF_KPTR_GET flag is used to indicate that the kfunc takes the first argument
+as a pointer to kptr, safely increments the refcount of the object it points to,
+and returns a reference to the user. The rest of the arguments may be normal
+arguments of a kfunc. The KF_KPTR_GET flag should be used in conjunction with
+KF_ACQUIRE and KF_RET_NULL flags.
+
+2.4.5 KF_TRUSTED_ARGS flag
+--------------------------
+
+The KF_TRUSTED_ARGS flag is used for kfuncs taking pointer arguments. It
+indicates that all pointer arguments will always be refcounted, and have
+their offset set to 0. It can be used to enforce that a pointer to a refcounted
+object acquired from a kfunc or BPF helper is passed as an argument to this
+kfunc without any modifications (e.g. pointer arithmetic) such that it is
+trusted and points to the original object. This flag is often used for kfuncs
+that operate (change some property, perform some operation) on an object that
+was obtained using an acquire kfunc. Such kfuncs need an unchanged pointer to
+ensure the integrity of the operation being performed on the expected object.
+
+2.5 Registering the kfuncs
+--------------------------
+
+Once the kfunc is prepared for use, the final step to making it visible is
+registering it with the BPF subsystem. Registration is done per BPF program
+type. An example is shown below::
+
+	BTF_SET8_START(bpf_task_set)
+	BTF_ID_FLAGS(func, bpf_get_task_pid, KF_ACQUIRE | KF_RET_NULL)
+	BTF_ID_FLAGS(func, bpf_put_pid, KF_RELEASE)
+	BTF_SET8_END(bpf_task_set)
+
+	static const struct btf_kfunc_id_set bpf_task_kfunc_set = {
+		.owner = THIS_MODULE,
+		.set   = &bpf_task_set,
+	};
+
+	static int init_subsystem(void)
+	{
+		return register_btf_kfunc_id_set(BPF_PROG_TYPE_TRACING, &bpf_task_kfunc_set);
+	}
+	late_initcall(init_subsystem);
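
The KF_ACQUIRE, KF_RET_NULL and KF_RELEASE sections in kfuncs.rst describe a contract between a kfunc and its caller. The sketch below models that contract in plain userspace C, purely to make the pairing concrete. Everything named ``demo_`` is hypothetical and not a kernel API; a real acquire kfunc would typically take its reference atomically (for example with something like ``refcount_inc_not_zero``), and the checks shown in the caller are what the verifier would enforce at load time rather than anything done at run time.

```c
#include <stddef.h>

/* Hypothetical refcounted object standing in for a kernel object. */
struct demo_obj {
	int refcount;
};

/*
 * Models a kfunc flagged KF_ACQUIRE | KF_RET_NULL: it returns a new
 * reference, or NULL when the object is already dead (refcount of 0).
 * Assumption: a plain int is enough for this single-threaded sketch.
 */
static inline struct demo_obj *demo_obj_acquire(struct demo_obj *obj)
{
	if (obj == NULL || obj->refcount == 0)
		return NULL;
	obj->refcount++;
	return obj;
}

/* Models a kfunc flagged KF_RELEASE: drops the reference passed in. */
static inline void demo_obj_release(struct demo_obj *obj)
{
	obj->refcount--;
}

/*
 * The caller pattern the verifier would insist on: NULL-check the
 * acquired pointer before use, and release it on every path.
 * Returns the refcount observed while the reference was held, or -1.
 */
static inline int demo_use_object(struct demo_obj *obj)
{
	struct demo_obj *ref = demo_obj_acquire(obj);
	int seen;

	if (ref == NULL)
		return -1;	/* nothing acquired, nothing to release */

	seen = ref->refcount;
	demo_obj_release(ref);	/* ref must not be used after this point */
	return seen;
}
```

Dropping either the NULL check or the release call from ``demo_use_object`` gives the shape of program that the documentation says the verifier refuses to load.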

Documentation/bpf/libbpf/libbpf_naming_convention.rst

Lines changed: 2 additions & 11 deletions
@@ -9,8 +9,8 @@ described here. It's recommended to follow these conventions whenever a
 new function or type is added to keep libbpf API clean and consistent.
 
 All types and functions provided by libbpf API should have one of the
-following prefixes: ``bpf_``, ``btf_``, ``libbpf_``, ``xsk_``,
-``btf_dump_``, ``ring_buffer_``, ``perf_buffer_``.
+following prefixes: ``bpf_``, ``btf_``, ``libbpf_``, ``btf_dump_``,
+``ring_buffer_``, ``perf_buffer_``.
 
 System call wrappers
 --------------------
@@ -59,15 +59,6 @@ Auxiliary functions and types that don't fit well in any of categories
 described above should have ``libbpf_`` prefix, e.g.
 ``libbpf_get_error`` or ``libbpf_prog_type_by_name``.
 
-AF_XDP functions
--------------------
-
-AF_XDP functions should have an ``xsk_`` prefix, e.g.
-``xsk_umem__get_data`` or ``xsk_umem__create``. The interface consists
-of both low-level ring access functions and high-level configuration
-functions. These can be mixed and matched. Note that these functions
-are not reentrant for performance reasons.
-
 ABI
 ---
