BPF-maps are basically generic *key-value* stores (see =key_size= and
41
-
=value_size=), with a given =type=, and maximum allowed entries
41
+
BPF maps are generic *key/value* stores (hence the =key_size= and
42
+
=value_size= parameters), with a given =type=, and maximum allowed entries
42
43
=max_entries=. Here we focus on the simple =BPF_MAP_TYPE_ARRAY=, which means
43
-
=max_entries= gets allocated when map is created.
44
+
=max_entries= array elements get allocated when the map is first created.
44
45
45
-
The BPF-map is both accessible from BPF-prog (kernel) side and userspace.
46
-
How this is done and how they differ is part of this lesson.
46
+
The BPF map is accessible from both the BPF program (kernel) side and from
47
+
userspace. How this is done and how they differ is part of this lesson.
47
48
48
-
** Lesson#2: libbpf map ELF relocation
49
+
** libbpf map ELF relocation
49
50
50
-
The libbpf library (fortunately) handles ELF-object decoding and map
51
-
references relocation, when the map is referenced from the BPF code.
51
+
It is worth pointing out that everything goes through the bpf syscall. This
52
+
means that the user space program /must/ create the maps and programs with
53
+
separate invocations of the bpf syscall. So how does a BPF program reference
54
+
a BPF map?
52
55
53
-
It is worth pointing out that everything goes through the bpf-syscall. This
54
-
means that libbpf /must/ create the maps and programs with separate
55
-
invocations of the bpf-syscall. Then how can a BPF-prog reference a BPF-map?
56
-
This happen via first loading all the BPF-maps, and get back their
57
-
corresponding file-descriptor (FD). Then the ELF-relocation table is used
58
-
for identifying when the BPF-prog reference a given map, and then rewrite
59
-
those BPF-byte-code instructions to use the map FD, before loading BPF-prog
60
-
into the kernel.
56
+
This happens by first loading all the BPF maps, and storing their
57
+
corresponding file descriptors (FDs). Then the ELF relocation table is used
58
+
to identify each reference the BPF program makes to a given map; each such
59
+
reference is then rewritten, so the BPF byte code instructions use the right
60
+
map FD for each map.
61
61
62
-
** Lesson#3: bpf_object to bpf_map
62
+
All this needs to be done before the BPF program itself can be loaded into
63
+
the kernel. Fortunately, the libbpf library handles the ELF object decoding
64
+
and map reference relocation, transparently to the user space program
65
+
performing the loads.
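
As a rough illustration of the relocation step described above, the
following self-contained sketch mimics it in plain C. The types
=struct toy_insn= and =struct toy_reloc= and the function
=toy_relocate_maps()= are hypothetical stand-ins invented for this example.
They are not libbpf's real data structures, and the real library does
considerably more work.

#+begin_src C
#include <stddef.h>

/* Hypothetical, simplified stand-in for one BPF instruction. */
struct toy_insn {
	int imm;	/* immediate operand; holds the map FD after relocation */
};

/* Hypothetical relocation entry: instruction insn_idx refers to map map_idx. */
struct toy_reloc {
	size_t insn_idx;
	size_t map_idx;
};

/*
 * Rewrite every instruction named in the relocation table so that it
 * carries the FD of the map it refers to. Returns 0 on success, or -1
 * if an entry points outside the instruction or map arrays.
 */
static int toy_relocate_maps(struct toy_insn *insns, size_t n_insns,
			     const struct toy_reloc *relocs, size_t n_relocs,
			     const int *map_fds, size_t n_maps)
{
	for (size_t i = 0; i < n_relocs; i++) {
		if (relocs[i].insn_idx >= n_insns || relocs[i].map_idx >= n_maps)
			return -1;
		insns[relocs[i].insn_idx].imm = map_fds[relocs[i].map_idx];
	}
	return 0;
}
#+end_src

The map FDs passed in would be the ones returned when the maps were
created, which is why the maps have to exist before the program is loaded.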
66
+
67
+
** bpf_object to bpf_map
63
68
64
69
As you learned in [[file:../basic02-prog-by-name/][basic02]] the libbpf API has "objects" and functions
65
-
working on/with these objects. The struct =bpf_object= represents ELF object
66
-
itself (which is returned from our =load_bpf_and_xdp_attach()= function).
70
+
working on/with these objects. The struct =bpf_object= represents the ELF
71
+
object itself (which is returned from our =load_bpf_and_xdp_attach()=
72
+
function).
67
73
68
-
In our function find_map_fd() (in [[file:xdp_load_and_stats.c]]) the function
69
-
=bpf_object__find_map_by_name()= is used for finding the =bpf_map= object
74
+
Similarly to what we did for BPF functions, our loader has a function called
75
+
=find_map_fd()= (in [[file:xdp_load_and_stats.c]]), which uses the library
76
+
function =bpf_object__find_map_by_name()= for finding the =bpf_map= object
70
77
with a given name. (Note, the length of the map name is provided by ELF and
71
-
is longer than what the name kernel stores, after loading it). Next step is
72
-
obtaining the map file-descriptor (FD) via =bpf_map__fd()=. There is also a
73
-
libbpf function that wrap these two steps, which is called
78
+
is longer than the name the kernel stores after loading it). After finding
79
+
the =bpf_map= object, we obtain the map file descriptor via =bpf_map__fd()=.
80
+
There is also a libbpf function that wraps these two steps, which is called
74
81
=bpf_object__find_map_fd_by_name()=.
75
82
76
-
** Lesson#4: read map-value from userspace
77
-
78
-
The contents of the map is read from userspace via the function
79
-
=bpf_map_lookup_elem()=, which is a simple syscall-wrapper, that operate on
80
-
the map file-descriptor (FD), lookup the =key= and store the value into the
81
-
memory area supplied by the value pointer. It is userspace own
82
-
responsibility to known what map it is reading and know the value size, and
83
-
thus have allocated memory large enough to store the value. In our example
84
-
we demonstrate how userspace can query the map-FD and get back some info in
85
-
struct =bpf_map_info= via syscall-wrapper =bpf_obj_get_info_by_fd()=.
83
+
** Reading map values from userspace
86
84
87
-
The program =xdp_load_and_stats= will periodically read the xdp_stats_map
88
-
value and produce some stats.
85
+
The contents of a map are read from userspace via the function
86
+
=bpf_map_lookup_elem()=, which is a simple syscall wrapper that operates on
87
+
the map file descriptor (FD). The syscall looks up the =key= and stores the
88
+
value into the memory area supplied by the value pointer. It is up to the
89
+
calling userspace program to ensure that the memory allocated to hold the
90
+
returned value is large enough to store the type of data contained in the
91
+
map. In our example we demonstrate how userspace can query the map FD and
92
+
get back some info in struct =bpf_map_info= via the syscall wrapper
93
+
=bpf_obj_get_info_by_fd()=.
89
94
95
+
For example, the program =xdp_load_and_stats= will periodically read the
96
+
xdp_stats_map value and produce some stats.
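
The size check that userspace is responsible for can be sketched as
follows. This is a minimal, self-contained illustration:
=struct toy_map_info= is a hypothetical stand-in carrying only a few of the
fields found in =struct bpf_map_info=, and =toy_check_value_buf()= is a
helper name invented for this example.

#+begin_src C
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the few struct bpf_map_info fields used here. */
struct toy_map_info {
	uint32_t key_size;
	uint32_t value_size;
	uint32_t max_entries;
};

/*
 * Return 0 if a buffer of buf_size bytes can safely receive one map
 * value, or -1 if the map's values are larger than the buffer.
 */
static int toy_check_value_buf(const struct toy_map_info *info, size_t buf_size)
{
	return buf_size >= info->value_size ? 0 : -1;
}
#+end_src

In a real program the info fields would be filled in by
=bpf_obj_get_info_by_fd()=, and =buf_size= would typically be the size of
the struct that is passed as the value pointer to =bpf_map_lookup_elem()=.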
90
97
91
98
* Assignments
92
99
93
100
The assignments have "hint" marks in the code via =Assignment#num=
94
101
comments.
95
102
96
-
** Assignment#1: Add bytes counter
103
+
** Assignment 1: Add bytes counter
97
104
98
105
The current assignment code only counts packets. It is your *assignment* to
99
106
extend this to also count bytes.
100
107
101
-
Notice how BPF-map =xdp_stats_map= used:
108
+
Notice that the BPF map =xdp_stats_map= uses:
102
109
- =.value_size = sizeof(struct datarec)=
103
110
104
-
The BPF-map have no knowledge about the data-structure used for the value
111
+
The BPF map has no knowledge about the data structure used for the value
105
112
record, it only knows the size. (The [[https://github.com/torvalds/linux/blob/master/Documentation/bpf/btf.rst][BPF Type Format]] ([[https://www.kernel.org/doc/html/latest/bpf/btf.html][BTF]]) is an advanced
106
-
topic, that allow for associating data-struct knowledge via debug info, but
107
-
we ignore that for now). Thus, it is up-to the two-sides (userspace and
108
-
BPF-prog kernel side) to stay in-sync on the content and structure of
109
-
=value=. The hint here on the data-structure used comes from =sizeof(struct
110
-
datarec)=, which indicate that =struct datarec= is used.
113
+
topic that allows for associating data struct knowledge via debug info, but
114
+
we ignore that for now). Thus, it is up to the two sides (userspace and
115
+
BPF program kernel side) to ensure they stay in sync on the content and
116
+
structure of =value=. The hint here on the data structure used comes from
117
+
=sizeof(struct datarec)=, which indicates that =struct datarec= is used.
111
118
112
119
This =struct datarec= is defined in the include [[file:common_kern_user.h]] as:
113
120
@@ -119,18 +126,17 @@ struct datarec {
119
126
};
120
127
#+end_src
121
128
122
-
*** Assignment#1.1: Update BPF-prog
129
+
*** Assignment 1.1: Update the BPF program
123
130
124
-
Next step is update BPF-prog kernel side program: [[file:xdp_prog_kern.c]].
131
+
The next step is to update the kernel-side BPF program: [[file:xdp_prog_kern.c]].
125
132
126
133
To figure out the length of the packet, you need to learn about the context
127
-
variable =*ctx= with type [[https://elixir.bootlin.com/linux/v5.0/ident/xdp_md][struct xdp_md]] that the BPF-prog gets a pointer
128
-
to, when invoked by the kernel. This =struct xdp_md= is a little odd, as all
129
-
members have type =__u32=, which is not actually their real data-types, as
130
-
access to this data-structure is remapped by the kernel at BPF-load time
131
-
(the BPF-byte-code instructions are rewritten by [[https://elixir.bootlin.com/linux/latest/ident/xdp_convert_ctx_access][xdp_convert_ctx_access()]]
132
-
and [[https://elixir.bootlin.com/linux/latest/ident/xdp_is_valid_access][xdp_is_valid_access()]] assign types for the verifier). Access gets
133
-
remapped to struct =xdp_buff= and also struct =xdp_rxq_info=.
134
+
variable =*ctx= with type [[https://elixir.bootlin.com/linux/v5.0/ident/xdp_md][struct xdp_md]] that the BPF program gets a pointer
135
+
to when invoked by the kernel. This =struct xdp_md= is a little odd, as all
136
+
members have type =__u32=. However, this is not actually their real data
137
+
types, as access to this data-structure is remapped by the kernel when the
138
+
program is loaded into the kernel. Access gets remapped to struct =xdp_buff=
139
+
and also struct =xdp_rxq_info=.
134
140
135
141
#+begin_src C
136
142
struct xdp_md {
@@ -144,52 +150,55 @@ struct xdp_md {
144
150
};
145
151
#+end_src
146
152
147
-
First order of business in [[file:xdp_prog_kern.c]], is type-cast the data_end
148
-
and data into void pointers:
153
+
While we know this, the compiler doesn't. So we need to type-cast the fields
154
+
into void pointers before we can use them:
149
155
150
156
#+begin_src C
151
157
void *data_end = (void *)(long)ctx->data_end;
152
158
void *data = (void *)(long)ctx->data;
153
159
#+end_src
154
160
155
-
Next step is calculating the number of bytes, by simply subtracting =data=
156
-
from =data_end=, and update the datarec member.
161
+
The next step is calculating the number of bytes in each packet, by simply
162
+
subtracting =data= from =data_end=, and updating the datarec member.
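
The byte calculation can be tried out in ordinary userspace C with the
sketch below. It rests on assumptions: the =struct datarec= layout shown
here is a guess for illustration, with =rx_packets= as the existing packet
counter and =rx_bytes= as a hypothetical name for the new member, and
=toy_record_packet()= is a helper invented for this example.

#+begin_src C
#include <stdint.h>

/*
 * Assumed layout for this sketch: rx_packets is the existing counter,
 * rx_bytes is a hypothetical name for the member the assignment adds.
 */
struct datarec {
	uint64_t rx_packets;
	uint64_t rx_bytes;
};

/* Account for one packet occupying the bytes in [data, data_end). */
static void toy_record_packet(struct datarec *rec,
			      const void *data, const void *data_end)
{
	uint64_t bytes = (uint64_t)((const char *)data_end - (const char *)data);

	rec->rx_packets += 1;
	rec->rx_bytes += bytes;
}
#+end_src

Inside a real BPF program the record may be updated from several CPUs at
once, so an atomic add would likely be needed in place of the plain =+==
used in this sketch.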