@@ -347,270 +347,6 @@ However, only the value of register ``r1`` is important to successfully finish
347347verification. The goal of the liveness tracking algorithm is to spot this fact
348348and figure out that both states are actually equivalent.
349349
350- Data structures
351- ~~~~~~~~~~~~~~~
352-
353- Liveness is tracked using the following data structures::
354-
355- enum bpf_reg_liveness {
356- REG_LIVE_NONE = 0,
357- REG_LIVE_READ32 = 0x1,
358- REG_LIVE_READ64 = 0x2,
359- REG_LIVE_READ = REG_LIVE_READ32 | REG_LIVE_READ64,
360- REG_LIVE_WRITTEN = 0x4,
361- REG_LIVE_DONE = 0x8,
362- };
363-
364- struct bpf_reg_state {
365- ...
366- struct bpf_reg_state *parent;
367- ...
368- enum bpf_reg_liveness live;
369- ...
370- };
371-
372- struct bpf_stack_state {
373- struct bpf_reg_state spilled_ptr;
374- ...
375- };
376-
377- struct bpf_func_state {
378- struct bpf_reg_state regs[MAX_BPF_REG];
379- ...
380- struct bpf_stack_state *stack;
381- }
382-
383- struct bpf_verifier_state {
384- struct bpf_func_state *frame[MAX_CALL_FRAMES];
385- struct bpf_verifier_state *parent;
386- ...
387- }
388-
389- * ``REG_LIVE_NONE `` is an initial value assigned to ``->live `` fields upon new
390- verifier state creation;
391-
392- * ``REG_LIVE_WRITTEN `` means that the value of the register (or stack slot) is
393- defined by some instruction verified between this verifier state's parent and
394- verifier state itself;
395-
396- * ``REG_LIVE_READ{32,64} `` means that the value of the register (or stack slot)
397- is read by a some child state of this verifier state;
398-
399- * ``REG_LIVE_DONE `` is a marker used by ``clean_verifier_state() `` to avoid
400- processing same verifier state multiple times and for some sanity checks;
401-
402- * ``->live `` field values are formed by combining ``enum bpf_reg_liveness ``
403- values using bitwise or.
404-
405- Register parentage chains
406- ~~~~~~~~~~~~~~~~~~~~~~~~~
407-
408- In order to propagate information between parent and child states, a *register
409- parentage chain * is established. Each register or stack slot is linked to a
410- corresponding register or stack slot in its parent state via a ``->parent ``
411- pointer. This link is established upon state creation in ``is_state_visited() ``
412- and might be modified by ``set_callee_state() `` called from
413- ``__check_func_call() ``.
414-
415- The rules for correspondence between registers / stack slots are as follows:
416-
417- * For the current stack frame, registers and stack slots of the new state are
418- linked to the registers and stack slots of the parent state with the same
419- indices.
420-
421- * For the outer stack frames, only callee saved registers (r6-r9) and stack
422- slots are linked to the registers and stack slots of the parent state with the
423- same indices.
424-
425- * When function call is processed a new ``struct bpf_func_state `` instance is
426- allocated, it encapsulates a new set of registers and stack slots. For this
427- new frame, parent links for r6-r9 and stack slots are set to nil, parent links
428- for r1-r5 are set to match caller r1-r5 parent links.
429-
430- This could be illustrated by the following diagram (arrows stand for
431- ``->parent `` pointers)::
432-
433- ... ; Frame #0, some instructions
434- --- checkpoint #0 ---
435- 1 : r6 = 42 ; Frame #0
436- --- checkpoint #1 ---
437- 2 : call foo() ; Frame #0
438- ... ; Frame #1, instructions from foo()
439- --- checkpoint #2 ---
440- ... ; Frame #1, instructions from foo()
441- --- checkpoint #3 ---
442- exit ; Frame #1, return from foo()
443- 3 : r1 = r6 ; Frame #0 <- current state
444-
445- +-------------------------------+-------------------------------+
446- | Frame #0 | Frame #1 |
447- Checkpoint +-------------------------------+-------------------------------+
448- #0 | r0 | r1-r5 | r6-r9 | fp-8 ... |
449- +-------------------------------+
450- ^ ^ ^ ^
451- | | | |
452- Checkpoint +-------------------------------+
453- #1 | r0 | r1-r5 | r6-r9 | fp-8 ... |
454- +-------------------------------+
455- ^ ^ ^
456- |_______|_______|_______________
457- | | |
458- nil nil | | | nil nil
459- | | | | | | |
460- Checkpoint +-------------------------------+-------------------------------+
461- #2 | r0 | r1-r5 | r6-r9 | fp-8 ... | r0 | r1-r5 | r6-r9 | fp-8 ... |
462- +-------------------------------+-------------------------------+
463- ^ ^ ^ ^ ^
464- nil nil | | | | |
465- | | | | | | |
466- Checkpoint +-------------------------------+-------------------------------+
467- #3 | r0 | r1-r5 | r6-r9 | fp-8 ... | r0 | r1-r5 | r6-r9 | fp-8 ... |
468- +-------------------------------+-------------------------------+
469- ^ ^
470- nil nil | |
471- | | | |
472- Current +-------------------------------+
473- state | r0 | r1-r5 | r6-r9 | fp-8 ... |
474- +-------------------------------+
475- \
476- r6 read mark is propagated via these links
477- all the way up to checkpoint #1.
478- The checkpoint #1 contains a write mark for r6
479- because of instruction (1), thus read propagation
480- does not reach checkpoint #0 (see section below).
481-
482- Liveness marks tracking
483- ~~~~~~~~~~~~~~~~~~~~~~~
484-
485- For each processed instruction, the verifier tracks read and written registers
486- and stack slots. The main idea of the algorithm is that read marks propagate
487- back along the state parentage chain until they hit a write mark, which 'screens
488- off' earlier states from the read. The information about reads is propagated by
489- function ``mark_reg_read() `` which could be summarized as follows::
490-
491- mark_reg_read(struct bpf_reg_state *state, ...):
492- parent = state->parent
493- while parent:
494- if state->live & REG_LIVE_WRITTEN:
495- break
496- if parent->live & REG_LIVE_READ64:
497- break
498- parent->live |= REG_LIVE_READ64
499- state = parent
500- parent = state->parent
501-
502- Notes:
503-
504- * The read marks are applied to the **parent ** state while write marks are
505- applied to the **current ** state. The write mark on a register or stack slot
506- means that it is updated by some instruction in the straight-line code leading
507- from the parent state to the current state.
508-
509- * Details about REG_LIVE_READ32 are omitted.
510-
511- * Function ``propagate_liveness() `` (see section :ref: `read_marks_for_cache_hits `)
512- might override the first parent link. Please refer to the comments in the
513- ``propagate_liveness() `` and ``mark_reg_read() `` source code for further
514- details.
515-
516- Because stack writes could have different sizes ``REG_LIVE_WRITTEN `` marks are
517- applied conservatively: stack slots are marked as written only if write size
518- corresponds to the size of the register, e.g. see function ``save_register_state() ``.
519-
520- Consider the following example::
521-
522- 0: (*u64)(r10 - 8) = 0 ; define 8 bytes of fp-8
523- --- checkpoint #0 ---
524- 1: (*u32)(r10 - 8) = 1 ; redefine lower 4 bytes
525- 2: r1 = (*u32)(r10 - 8) ; read lower 4 bytes defined at (1)
526- 3: r2 = (*u32)(r10 - 4) ; read upper 4 bytes defined at (0)
527-
528- As stated above, the write at (1) does not count as ``REG_LIVE_WRITTEN ``. Should
529- it be otherwise, the algorithm above wouldn't be able to propagate the read mark
530- from (3) to checkpoint #0.
531-
532- Once the ``BPF_EXIT `` instruction is reached ``update_branch_counts() `` is
533- called to update the ``->branches `` counter for each verifier state in a chain
534- of parent verifier states. When the ``->branches `` counter reaches zero the
535- verifier state becomes a valid entry in a set of cached verifier states.
536-
537- Each entry of the verifier states cache is post-processed by a function
538- ``clean_live_states() ``. This function marks all registers and stack slots
539- without ``REG_LIVE_READ{32,64} `` marks as ``NOT_INIT `` or ``STACK_INVALID ``.
540- Registers/stack slots marked in this way are ignored in function ``stacksafe() ``
541- called from ``states_equal() `` when a state cache entry is considered for
542- equivalence with a current state.
543-
544- Now it is possible to explain how the example from the beginning of the section
545- works::
546-
547- 0: call bpf_get_prandom_u32()
548- 1: r1 = 0
549- 2: if r0 == 0 goto +1
550- 3: r0 = 1
551- --- checkpoint[0] ---
552- 4: r0 = r1
553- 5: exit
554-
555- * At instruction #2 branching point is reached and state ``{ r0 == 0, r1 == 0, pc == 4 } ``
556- is pushed to states processing queue (pc stands for program counter).
557-
558- * At instruction #4:
559-
560- * ``checkpoint[0] `` states cache entry is created: ``{ r0 == 1, r1 == 0, pc == 4 } ``;
561- * ``checkpoint[0].r0 `` is marked as written;
562- * ``checkpoint[0].r1 `` is marked as read;
563-
564- * At instruction #5 exit is reached and ``checkpoint[0] `` can now be processed
565- by ``clean_live_states() ``. After this processing ``checkpoint[0].r1 `` has a
566- read mark and all other registers and stack slots are marked as ``NOT_INIT ``
567- or ``STACK_INVALID ``
568-
569- * The state ``{ r0 == 0, r1 == 0, pc == 4 } `` is popped from the states queue
570- and is compared against a cached state ``{ r1 == 0, pc == 4 } ``, the states
571- are considered equivalent.
572-
573- .. _read_marks_for_cache_hits :
574-
575- Read marks propagation for cache hits
576- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
577-
578- Another point is the handling of read marks when a previously verified state is
579- found in the states cache. Upon cache hit verifier must behave in the same way
580- as if the current state was verified to the program exit. This means that all
581- read marks, present on registers and stack slots of the cached state, must be
582- propagated over the parentage chain of the current state. Example below shows
583- why this is important. Function ``propagate_liveness() `` handles this case.
584-
585- Consider the following state parentage chain (S is a starting state, A-E are
586- derived states, -> arrows show which state is derived from which)::
587-
588- r1 read
589- <------------- A[r1] == 0
590- C[r1] == 0
591- S ---> A ---> B ---> exit E[r1] == 1
592- |
593- ` ---> C ---> D
594- |
595- ` ---> E ^
596- |___ suppose all these
597- ^ states are at insn #Y
598- |
599- suppose all these
600- states are at insn #X
601-
602- * Chain of states ``S -> A -> B -> exit `` is verified first.
603-
604- * While ``B -> exit `` is verified, register ``r1 `` is read and this read mark is
605- propagated up to state ``A ``.
606-
607- * When chain of states ``C -> D `` is verified the state ``D `` turns out to be
608- equivalent to state ``B ``.
609-
610- * The read mark for ``r1 `` has to be propagated to state ``C ``, otherwise state
611- ``C `` might get mistakenly marked as equivalent to state ``E `` even though
612- values for register ``r1 `` differ between ``C `` and ``E ``.
613-
614350Understanding eBPF verifier messages
615351====================================
616352
0 commit comments