Commit 4a1373c

basic03: Text nits
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
1 parent 170257b commit 4a1373c

1 file changed

Lines changed: 103 additions & 94 deletions

basic03-map-counter/README.org

Lines changed: 103 additions & 94 deletions
Original file line number | Diff line number | Diff line change
@@ -1,31 +1,32 @@
11
# -*- fill-column: 76; -*-
2-
#+TITLE: Tutorial: Basic03
2+
#+TITLE: Tutorial: Basic03 - counting with BPF maps
33
#+OPTIONS: ^:nil
44

5-
In this lesson you will learn about BPF-maps, and in the assignment get
6-
hands-on experience with extending the "value" size/content, and reading the
7-
contents from userspace.
5+
In this lesson you will learn about BPF maps, the persistent storage
6+
mechanism available to BPF programs. The assignments will give you hands-on
7+
experience with extending the "value" size/content, and reading the contents
8+
from userspace.
89

9-
We will only cover two simple maps types:
10+
In this lesson we will only cover two simple map types:
1011
- =BPF_MAP_TYPE_ARRAY= and
1112
- =BPF_MAP_TYPE_PERCPU_ARRAY=.
1213

13-
* Overview of exercise :TOC:
14-
- [[#lesson][Lesson]]
15-
- [[#lesson1-defining-a-map][Lesson#1: defining a map]]
16-
- [[#lesson2-libbpf-map-elf-relocation][Lesson#2: libbpf map ELF relocation]]
17-
- [[#lesson3-bpf_object-to-bpf_map][Lesson#3: bpf_object to bpf_map]]
18-
- [[#lesson4-read-map-value-from-userspace][Lesson#4: read map-value from userspace]]
14+
* Table of Contents :TOC:
15+
- [[#things-you-will-learn-in-this-lesson][Things you will learn in this lesson]]
16+
- [[#defining-a-map][Defining a map]]
17+
- [[#libbpf-map-elf-relocation][libbpf map ELF relocation]]
18+
- [[#bpf_object-to-bpf_map][bpf_object to bpf_map]]
19+
- [[#reading-map-values-from-userspace][Reading map values from userspace]]
1920
- [[#assignments][Assignments]]
20-
- [[#assignment1-add-bytes-counter][Assignment#1: Add bytes counter]]
21-
- [[#assignment2-handle-other-xdp-actions-stats][Assignment#2: Handle other XDP actions stats]]
22-
- [[#assignment3-per-cpu-stats][Assignment#3: Per CPU stats]]
21+
- [[#assignment-1-add-bytes-counter][Assignment 1: Add bytes counter]]
22+
- [[#assignment-2-handle-other-xdp-actions-stats][Assignment 2: Handle other XDP actions stats]]
23+
- [[#assignment-3-per-cpu-stats][Assignment 3: Per CPU stats]]
2324

24-
* Lesson
25+
* Things you will learn in this lesson
2526

26-
** Lesson#1: defining a map
27+
** Defining a map
2728

28-
Creating a BPF-map is done by defining a global struct =bpf_map_def= (in
29+
Creating a BPF map is done by defining a global struct =bpf_map_def= (in
2930
[[file:xdp_prog_kern.c]]), with a special =SEC("maps")= as below:
3031

3132
#+begin_src C
@@ -37,77 +38,83 @@ struct bpf_map_def SEC("maps") xdp_stats_map = {
3738
};
3839
#+end_src
3940

40-
BPF-maps are basically generic *key-value* stores (see =key_size= and
41-
=value_size=), with a given =type=, and maximum allowed entries
41+
BPF maps are generic *key/value* stores (hence the =key_size= and
42+
=value_size= parameters), with a given =type=, and maximum allowed entries
4243
=max_entries=. Here we focus on the simple =BPF_MAP_TYPE_ARRAY=, which means
43-
=max_entries= gets allocated when map is created.
44+
=max_entries= array elements get allocated when the map is first created.
4445

45-
The BPF-map is both accessible from BPF-prog (kernel) side and userspace.
46-
How this is done and how they differ is part of this lesson.
46+
The BPF map is accessible from both the BPF program (kernel) side and from
47+
userspace. How this is done and how they differ is part of this lesson.
4748

48-
** Lesson#2: libbpf map ELF relocation
49+
** libbpf map ELF relocation
4950

50-
The libbpf library (fortunately) handles ELF-object decoding and map
51-
references relocation, when the map is referenced from the BPF code.
51+
It is worth pointing out that everything goes through the bpf syscall. This
52+
means that the user space program /must/ create the maps and programs with
53+
separate invocations of the bpf syscall. So how does a BPF program reference
54+
a BPF map?
5255

53-
It is worth pointing out that everything goes through the bpf-syscall. This
54-
means that libbpf /must/ create the maps and programs with separate
55-
invocations of the bpf-syscall. Then how can a BPF-prog reference a BPF-map?
56-
This happen via first loading all the BPF-maps, and get back their
57-
corresponding file-descriptor (FD). Then the ELF-relocation table is used
58-
for identifying when the BPF-prog reference a given map, and then rewrite
59-
those BPF-byte-code instructions to use the map FD, before loading BPF-prog
60-
into the kernel.
56+
This happens by first loading all the BPF maps, and storing their
57+
corresponding file descriptors (FDs). Then the ELF relocation table is used
58+
to identify each reference the BPF program makes to a given map; each such
59+
reference is then rewritten, so the BPF byte code instructions use the right
60+
map FD for each map.
6161

62-
** Lesson#3: bpf_object to bpf_map
62+
All this needs to be done before the BPF program itself can be loaded into
63+
the kernel. Fortunately, the libbpf library handles the ELF object decoding
64+
and map reference relocation, transparently to the user space program
65+
performing the loads.
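
The rewriting step described above can be pictured with a small toy model.
This is only a sketch under loud assumptions: =struct toy_insn=,
=struct toy_reloc= and =toy_relocate_maps()= are hypothetical names invented
for illustration, and they are far simpler than the real BPF instruction
encoding and the ELF relocation records that libbpf processes.

#+begin_src C
#include <stddef.h>

/* Toy model only: hypothetical types, NOT the real BPF instruction
 * layout and NOT libbpf's internal data structures. */
struct toy_insn {
	int opcode;	/* what the instruction does (unused here) */
	int imm;	/* immediate operand; the map FD ends up here */
};

struct toy_reloc {
	size_t insn_idx;	/* which instruction refers to a map */
	size_t map_idx;		/* which map it refers to */
};

/* Patch every instruction listed in the relocation table so that its
 * immediate holds the FD of the map it refers to. Returns the number
 * of patched instructions, or -1 if an entry is out of range. */
static int toy_relocate_maps(struct toy_insn *insns, size_t nr_insns,
			     const struct toy_reloc *relocs, size_t nr_relocs,
			     const int *map_fds, size_t nr_maps)
{
	size_t i;

	for (i = 0; i < nr_relocs; i++) {
		if (relocs[i].insn_idx >= nr_insns ||
		    relocs[i].map_idx >= nr_maps)
			return -1;
		insns[relocs[i].insn_idx].imm = map_fds[relocs[i].map_idx];
	}
	return (int)nr_relocs;
}
#+end_src

The point of the model is the ordering: the map FDs must exist before the
instructions can be patched, which is why the maps are created first and the
program is loaded last.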
66+
67+
** bpf_object to bpf_map
6368

6469
As you learned in [[file:../basic02-prog-by-name/][basic02]] the libbpf API has "objects" and functions
65-
working on/with these objects. The struct =bpf_object= represents ELF object
66-
itself (which is returned from our =load_bpf_and_xdp_attach()= function).
70+
working on/with these objects. The struct =bpf_object= represents the ELF
71+
object itself (which is returned from our =load_bpf_and_xdp_attach()=
72+
function).
6773

68-
In our function find_map_fd() (in [[file:xdp_load_and_stats.c]]) the function
69-
=bpf_object__find_map_by_name()= is used for finding the =bpf_map= object
74+
Similarly to what we did for BPF functions, our loader has a function called
75+
=find_map_fd()= (in [[file:xdp_load_and_stats.c]]), which uses the library
76+
function =bpf_object__find_map_by_name()= for finding the =bpf_map= object
7077
with a given name. (Note, the length of the map name is provided by ELF and
71-
is longer than what the name kernel stores, after loading it). Next step is
72-
obtaining the map file-descriptor (FD) via =bpf_map__fd()=. There is also a
73-
libbpf function that wrap these two steps, which is called
78+
may be longer than what the kernel stores after loading it). After finding
79+
the =bpf_map= object, we obtain the map file descriptor via =bpf_map__fd()=.
80+
There is also a libbpf function that wraps these two steps, which is called
7481
=bpf_object__find_map_fd_by_name()=.
7582

76-
** Lesson#4: read map-value from userspace
77-
78-
The contents of the map is read from userspace via the function
79-
=bpf_map_lookup_elem()=, which is a simple syscall-wrapper, that operate on
80-
the map file-descriptor (FD), lookup the =key= and store the value into the
81-
memory area supplied by the value pointer. It is userspace own
82-
responsibility to known what map it is reading and know the value size, and
83-
thus have allocated memory large enough to store the value. In our example
84-
we demonstrate how userspace can query the map-FD and get back some info in
85-
struct =bpf_map_info= via syscall-wrapper =bpf_obj_get_info_by_fd()=.
83+
** Reading map values from userspace
8684

87-
The program =xdp_load_and_stats= will periodically read the xdp_stats_map
88-
value and produce some stats.
85+
The contents of a map are read from userspace via the function
86+
=bpf_map_lookup_elem()=, which is a simple syscall-wrapper, that operates on
87+
the map file descriptor (FD). The syscall looks up the =key= and stores the
88+
value into the memory area supplied by the value pointer. It is up to the
89+
calling userspace program to ensure that the memory allocated to hold the
90+
returned value is large enough to store the type of data contained in the
91+
map. In our example we demonstrate how userspace can query the map FD and
92+
get back some info in struct =bpf_map_info= via the syscall wrapper
93+
=bpf_obj_get_info_by_fd()=.
8994

95+
For example, the program =xdp_load_and_stats= will periodically read the
96+
xdp_stats_map value and produce some stats.
9097

9198
* Assignments
9299

93100
The assignments have "hint" marks in the code via =Assignment#num=
94101
comments.
95102

96-
** Assignment#1: Add bytes counter
103+
** Assignment 1: Add bytes counter
97104

98105
The current assignment code only counts packets. It is your *assignment* to
99106
extend this to also count bytes.
100107

101-
Notice how BPF-map =xdp_stats_map= used:
108+
Notice how the BPF map =xdp_stats_map= uses:
102109
- =.value_size = sizeof(struct datarec)=
103110

104-
The BPF-map have no knowledge about the data-structure used for the value
111+
The BPF map has no knowledge about the data-structure used for the value
105112
record, it only knows the size. (The [[https://github.com/torvalds/linux/blob/master/Documentation/bpf/btf.rst][BPF Type Format]] ([[https://www.kernel.org/doc/html/latest/bpf/btf.html][BTF]]) is an advanced
106-
topic, that allow for associating data-struct knowledge via debug info, but
107-
we ignore that for now). Thus, it is up-to the two-sides (userspace and
108-
BPF-prog kernel side) to stay in-sync on the content and structure of
109-
=value=. The hint here on the data-structure used comes from =sizeof(struct
110-
datarec)=, which indicate that =struct datarec= is used.
113+
topic, that allows for associating data struct knowledge via debug info, but
114+
we ignore that for now). Thus, it is up to the two sides (userspace and
115+
BPF-prog kernel side) to ensure they stay in sync on the content and
116+
structure of =value=. The hint here on the data structure used comes from
117+
=sizeof(struct datarec)=, which indicates that =struct datarec= is used.
111118

112119
This =struct datarec= is defined in the include [[file:common_kern_user.h]] as:
113120

@@ -119,18 +126,17 @@ struct datarec {
119126
};
120127
#+end_src
121128

122-
*** Assignment#1.1: Update BPF-prog
129+
*** Assignment 1.1: Update the BPF program
123130

124-
Next step is update BPF-prog kernel side program: [[file:xdp_prog_kern.c]].
131+
Next step is to update the kernel side BPF program: [[file:xdp_prog_kern.c]].
125132

126133
To figure out the length of the packet, you need to learn about the context
127-
variable =*ctx= with type [[https://elixir.bootlin.com/linux/v5.0/ident/xdp_md][struct xdp_md]] that the BPF-prog gets a pointer
128-
to, when invoked by the kernel. This =struct xdp_md= is a little odd, as all
129-
members have type =__u32=, which is not actually their real data-types, as
130-
access to this data-structure is remapped by the kernel at BPF-load time
131-
(the BPF-byte-code instructions are rewritten by [[https://elixir.bootlin.com/linux/latest/ident/xdp_convert_ctx_access][xdp_convert_ctx_access()]]
132-
and [[https://elixir.bootlin.com/linux/latest/ident/xdp_is_valid_access][xdp_is_valid_access()]] assign types for the verifier). Access gets
133-
remapped to struct =xdp_buff= and also struct =xdp_rxq_info=.
134+
variable =*ctx= with type [[https://elixir.bootlin.com/linux/v5.0/ident/xdp_md][struct xdp_md]] that the BPF program gets a pointer
135+
to when invoked by the kernel. This =struct xdp_md= is a little odd, as all
136+
members have type =__u32=. However, this is not actually their real data
137+
types, as access to this data-structure is remapped by the kernel when the
138+
program is loaded into the kernel. Access gets remapped to struct =xdp_buff=
139+
and also struct =xdp_rxq_info=.
134140

135141
#+begin_src C
136142
struct xdp_md {
@@ -144,52 +150,55 @@ struct xdp_md {
144150
};
145151
#+end_src
146152

147-
First order of business in [[file:xdp_prog_kern.c]], is type-cast the data_end
148-
and data into void pointers:
153+
While we know this, the compiler doesn't. So we need to type-cast the fields
154+
into void pointers before we can use them:
149155

150156
#+begin_src C
151157
void *data_end = (void *)(long)ctx->data_end;
152158
void *data = (void *)(long)ctx->data;
153159
#+end_src
154160

155-
Next step is calculating the number of bytes, by simply subtracting =data=
156-
from =data_end=, and update the datarec member.
161+
The next step is calculating the number of bytes in each packet, by simply
162+
subtracting =data= from =data_end=, and updating the datarec member.
157163

158164
#+begin_src C
159165
__u64 bytes = data_end - data; /* Calculate packet length */
160166
lock_xadd(&rec->rx_bytes, bytes);
161167
#+end_src
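
To try the same arithmetic outside the kernel, the two steps can be wrapped
in a plain C function. This is a sketch under assumptions: the =rx_packets=
field and the helper name =toy_count_packet()= are made up for illustration,
and the compiler builtin =__sync_fetch_and_add()= is used directly, on the
assumption that =lock_xadd()= is a thin wrapper around it.

#+begin_src C
#include <stdint.h>

/* Assumed layout of the value record; only rx_bytes appears in the
 * snippet above, rx_packets is an assumption for this sketch. */
struct datarec {
	uint64_t rx_packets;
	uint64_t rx_bytes;
};

/* Hypothetical helper: account one packet that spans [data, data_end). */
static void toy_count_packet(struct datarec *rec,
			     const void *data, const void *data_end)
{
	/* Packet length is the distance between the two pointers */
	uint64_t bytes = (uint64_t)((const char *)data_end -
				    (const char *)data);

	/* Atomic adds, so concurrent updaters do not lose increments */
	__sync_fetch_and_add(&rec->rx_packets, 1);
	__sync_fetch_and_add(&rec->rx_bytes, bytes);
}
#+end_src

Calling this twice with a 64-byte buffer leaves the record at two packets
and 128 bytes.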
162168

163-
*** Assignment#1.2: Update userspace prog
169+
*** Assignment 1.2: Update the userspace program
164170

165-
Now it is time to update the userspace program reading stats in
166-
[[file:xdp_load_and_stats.c]].
171+
Now it is time to update the userspace program that reads stats (in
172+
[[file:xdp_load_and_stats.c]]).
167173

168-
Update functions:
174+
Update the functions:
169175
- =map_collect()= to also collect rx_bytes.
170176
- =stats_print()= to also print rx_bytes (adjust fmt string)
171177

172-
** Assignment#2: Handle other XDP actions stats
178+
** Assignment 2: Handle other XDP actions stats
173179

174-
Notice how the BPF-map =xdp_stats_map= we defined (in [[#lesson1-defining-a-map][Lesson#1: defining a
175-
map]]) is an array with more elements =max_entries=XDP_ACTION_MAX=. The idea
176-
is to keep stats per [[https://elixir.bootlin.com/linux/latest/ident/xdp_action][(enum) xdp_action]], but our program does not take
177-
advantage of this.
180+
Notice how the BPF map =xdp_stats_map= we defined above is actually an
181+
array, with =max_entries=XDP_ACTION_MAX=. The idea with this is to keep
182+
stats per [[https://elixir.bootlin.com/linux/latest/ident/xdp_action][(enum) xdp_action]], but our program does not yet take advantage of
183+
this.
178184

179-
The *assignment* is primarily to extend userspace stats tool (in
180-
[[file:xdp_load_and_stats.c]] to collect and print these extra stats.
185+
The *assignment* is to extend the userspace stats tool (in
186+
[[file:xdp_load_and_stats.c]]) to collect and print these extra stats.
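
One possible shape for the printing side is sketched below. It is not the
tutorial's solution: the =toy_= names are hypothetical, the enum is a local
stand-in that mirrors the order of the kernel's =enum xdp_action=, the
=rx_packets= field is an assumption, and the records are passed in as a plain
array instead of being read from the map.

#+begin_src C
#include <stdint.h>
#include <stdio.h>

/* Local stand-in mirroring the order of the kernel's enum xdp_action */
enum toy_xdp_action {
	TOY_XDP_ABORTED = 0,
	TOY_XDP_DROP,
	TOY_XDP_PASS,
	TOY_XDP_TX,
	TOY_XDP_REDIRECT,
	TOY_XDP_ACTION_MAX,
};

/* Assumed layout of the value record */
struct datarec {
	uint64_t rx_packets;
	uint64_t rx_bytes;
};

/* Hypothetical helper: human-readable name for an action key */
static const char *toy_action_name(unsigned int action)
{
	static const char *const names[TOY_XDP_ACTION_MAX] = {
		"XDP_ABORTED", "XDP_DROP", "XDP_PASS", "XDP_TX", "XDP_REDIRECT",
	};

	return action < TOY_XDP_ACTION_MAX ? names[action] : "UNKNOWN";
}

/* Print one line per action (one per array key) and return the total
 * number of packets seen across all actions. */
static uint64_t toy_stats_print(const struct datarec *recs)
{
	uint64_t total = 0;
	unsigned int key;

	for (key = 0; key < TOY_XDP_ACTION_MAX; key++) {
		printf("%-12s %llu pkts %llu bytes\n", toy_action_name(key),
		       (unsigned long long)recs[key].rx_packets,
		       (unsigned long long)recs[key].rx_bytes);
		total += recs[key].rx_packets;
	}
	return total;
}
#+end_src

In the real tool each element of the array would come from one map lookup
per key.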
181187

182-
** Assignment#3: Per CPU stats
188+
** Assignment 3: Per CPU stats
183189

184-
Avoid the atomic stats counter, by using another per-CPU array type, and
185-
move the burden of summing to userspace.
190+
Thus far, we have used atomic operations to increment our stats counters;
191+
however, this is expensive as it inserts memory barriers to make sure
192+
different CPUs don't garble each other's data. We can avoid this by using
193+
another array type that stores its data in per-CPU storage. The drawback of
194+
this is that we move the burden of summing to userspace.
186195

187-
First step is to change map =type= (in [[file:xdp_prog_kern.c]]) to use
188-
=BPF_MAP_TYPE_PERCPU_ARRAY=. If you only make this change, the userspace
189-
program will detect this and complain, as we query the map FD for some info
190-
(via =bpf_obj_get_info_by_fd()=) and e.g. check the map type. Remember it is
191-
userspace responsibility to make sure the data record for the value is large
192-
enough.
196+
To achieve this, the first step is to change map =type= (in
197+
[[file:xdp_prog_kern.c]]) to use =BPF_MAP_TYPE_PERCPU_ARRAY=. If you only make
198+
this change, the userspace program will detect this and complain, as we
199+
query the map FD for some info (via =bpf_obj_get_info_by_fd()=) and e.g.
200+
check the map type. Remember it is userspace's responsibility to make sure
201+
the data record for the value is large enough.
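
The summing itself is simple once the values are in userspace. The sketch
below assumes that a lookup on a per-CPU map fills a buffer with one record
per possible CPU for the requested key; =toy_sum_percpu()= and the
=rx_packets= field are hypothetical names used only for illustration, and
obtaining the CPU count and doing the lookup are left out.

#+begin_src C
#include <stddef.h>
#include <stdint.h>

/* Assumed layout of the value record */
struct datarec {
	uint64_t rx_packets;
	uint64_t rx_bytes;
};

/* Hypothetical helper: values[] holds nr_cpus records, one per CPU,
 * all belonging to the same map key. Returns their element-wise sum. */
static struct datarec toy_sum_percpu(const struct datarec *values,
				     size_t nr_cpus)
{
	struct datarec sum = { 0, 0 };
	size_t cpu;

	for (cpu = 0; cpu < nr_cpus; cpu++) {
		sum.rx_packets += values[cpu].rx_packets;
		sum.rx_bytes   += values[cpu].rx_bytes;
	}
	return sum;
}
#+end_src

Note the consequence for buffer sizing: the memory handed to the lookup must
hold =nr_cpus= records, not one.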
193202

194203
Next step is writing a function that gets the values per CPU and sums them.
195204
In the [[file:xdp_load_and_stats.c]]. You can copy paste this, and call it from
