The source for the Linux kernel used in Windows Subsystem for Linux 2 (WSL2)
Перейти к файлу
Daniel Borkmann 7be14c1c90 bpf: Fix __reg_bound_offset 64->32 var_off subreg propagation
Xu reports that after commit 3f50f132d8 ("bpf: Verifier, do explicit ALU32
bounds tracking"), the following BPF program is rejected by the verifier:

   0: (61) r2 = *(u32 *)(r1 +0)          ; R2_w=pkt(off=0,r=0,imm=0)
   1: (61) r3 = *(u32 *)(r1 +4)          ; R3_w=pkt_end(off=0,imm=0)
   2: (bf) r1 = r2
   3: (07) r1 += 1
   4: (2d) if r1 > r3 goto pc+8
   5: (71) r1 = *(u8 *)(r2 +0)           ; R1_w=scalar(umax=255,var_off=(0x0; 0xff))
   6: (18) r0 = 0x7fffffffffffff10
   8: (0f) r1 += r0                      ; R1_w=scalar(umin=0x7fffffffffffff10,umax=0x800000000000000f)
   9: (18) r0 = 0x8000000000000000
  11: (07) r0 += 1
  12: (ad) if r0 < r1 goto pc-2
  13: (b7) r0 = 0
  14: (95) exit

And the verifier log says:

  func#0 @0
  0: R1=ctx(off=0,imm=0) R10=fp0
  0: (61) r2 = *(u32 *)(r1 +0)          ; R1=ctx(off=0,imm=0) R2_w=pkt(off=0,r=0,imm=0)
  1: (61) r3 = *(u32 *)(r1 +4)          ; R1=ctx(off=0,imm=0) R3_w=pkt_end(off=0,imm=0)
  2: (bf) r1 = r2                       ; R1_w=pkt(off=0,r=0,imm=0) R2_w=pkt(off=0,r=0,imm=0)
  3: (07) r1 += 1                       ; R1_w=pkt(off=1,r=0,imm=0)
  4: (2d) if r1 > r3 goto pc+8          ; R1_w=pkt(off=1,r=1,imm=0) R3_w=pkt_end(off=0,imm=0)
  5: (71) r1 = *(u8 *)(r2 +0)           ; R1_w=scalar(umax=255,var_off=(0x0; 0xff)) R2_w=pkt(off=0,r=1,imm=0)
  6: (18) r0 = 0x7fffffffffffff10       ; R0_w=9223372036854775568
  8: (0f) r1 += r0                      ; R0_w=9223372036854775568 R1_w=scalar(umin=9223372036854775568,umax=9223372036854775823,s32_min=-240,s32_max=15)
  9: (18) r0 = 0x8000000000000000       ; R0_w=-9223372036854775808
  11: (07) r0 += 1                      ; R0_w=-9223372036854775807
  12: (ad) if r0 < r1 goto pc-2         ; R0_w=-9223372036854775807 R1_w=scalar(umin=9223372036854775568,umax=9223372036854775809)
  13: (b7) r0 = 0                       ; R0_w=0
  14: (95) exit

  from 12 to 11: R0_w=-9223372036854775807 R1_w=scalar(umin=9223372036854775810,umax=9223372036854775823,var_off=(0x8000000000000000; 0xffffffff)) R2_w=pkt(off=0,r=1,imm=0) R3_w=pkt_end(off=0,imm=0) R10=fp0
  11: (07) r0 += 1                      ; R0_w=-9223372036854775806
  12: (ad) if r0 < r1 goto pc-2         ; R0_w=-9223372036854775806 R1_w=scalar(umin=9223372036854775810,umax=9223372036854775810,var_off=(0x8000000000000000; 0xffffffff))
  13: safe

  [...]

  from 12 to 11: R0_w=-9223372036854775795 R1=scalar(umin=9223372036854775822,umax=9223372036854775823,var_off=(0x8000000000000000; 0xffffffff)) R2=pkt(off=0,r=1,imm=0) R3=pkt_end(off=0,imm=0) R10=fp0
  11: (07) r0 += 1                      ; R0_w=-9223372036854775794
  12: (ad) if r0 < r1 goto pc-2         ; R0_w=-9223372036854775794 R1=scalar(umin=9223372036854775822,umax=9223372036854775822,var_off=(0x8000000000000000; 0xffffffff))
  13: safe

  from 12 to 11: R0_w=-9223372036854775794 R1=scalar(umin=9223372036854775823,umax=9223372036854775823,var_off=(0x8000000000000000; 0xffffffff)) R2=pkt(off=0,r=1,imm=0) R3=pkt_end(off=0,imm=0) R10=fp0
  11: (07) r0 += 1                      ; R0_w=-9223372036854775793
  12: (ad) if r0 < r1 goto pc-2         ; R0_w=-9223372036854775793 R1=scalar(umin=9223372036854775823,umax=9223372036854775823,var_off=(0x8000000000000000; 0xffffffff))
  13: safe

  from 12 to 11: R0_w=-9223372036854775793 R1=scalar(umin=9223372036854775824,umax=9223372036854775823,var_off=(0x8000000000000000; 0xffffffff)) R2=pkt(off=0,r=1,imm=0) R3=pkt_end(off=0,imm=0) R10=fp0
  11: (07) r0 += 1                      ; R0_w=-9223372036854775792
  12: (ad) if r0 < r1 goto pc-2         ; R0_w=-9223372036854775792 R1=scalar(umin=9223372036854775824,umax=9223372036854775823,var_off=(0x8000000000000000; 0xffffffff))
  13: safe

  [...]

The 64bit umin=9223372036854775810 bound continuously bumps by +1 while
umax=9223372036854775823 stays as-is until the verifier complexity limit
is reached and the program gets finally rejected. During this simulation,
the umin also eventually surpasses umax. Looking at the first 'from 12
to 11' output line from the loop, R1 has the following state:

  R1_w=scalar(umin=0x8000000000000002 (9223372036854775810),
              umax=0x800000000000000f (9223372036854775823),
          var_off=(0x8000000000000000;
                           0xffffffff))

The var_off has technically not an inconsistent state but it's very
imprecise and far off surpassing 64bit umax bounds whereas the expected
output with refined known bits in var_off should have been like:

  R1_w=scalar(umin=0x8000000000000002 (9223372036854775810),
              umax=0x800000000000000f (9223372036854775823),
          var_off=(0x8000000000000000;
                                  0xf))

In the above log, var_off stays as var_off=(0x8000000000000000; 0xffffffff)
and does not converge into a narrower mask where more bits become known,
eventually transforming R1 into a constant upon umin=9223372036854775823,
umax=9223372036854775823 case where the verifier would have terminated and
let the program pass.

The __reg_combine_64_into_32() marks the subregister unknown and propagates
64bit {s,u}min/{s,u}max bounds to their 32bit equivalents iff they are within
the 32bit universe. The question came up whether __reg_combine_64_into_32()
should special case the situation that when 64bit {s,u}min bounds have
the same value as 64bit {s,u}max bounds to then assign the latter as
well to the 32bit reg->{s,u}32_{min,max}_value. As can be seen from the
above example however, that is just /one/ special case and not a /generic/
solution given above example would still not be addressed this way and
remain at an imprecise var_off=(0x8000000000000000; 0xffffffff).

The improvement is needed in __reg_bound_offset() to refine var32_off with
the updated var64_off instead of the prior reg->var_off. The reg_bounds_sync()
code first refines information about the register's min/max bounds via
__update_reg_bounds() from the current var_off, then in __reg_deduce_bounds()
from sign bit and with the potentially learned bits from bounds it'll
update the var_off tnum in __reg_bound_offset(). For example, intersecting
with the old var_off might have improved bounds slightly, e.g. if umax
was 0x7f...f and var_off was (0; 0xf...fc), then new var_off will then
result in (0; 0x7f...fc). The intersected var64_off holds then the
universe which is a superset of var32_off. The point for the latter is
not to broaden, but to further refine known bits based on the intersection
of var_off with 32 bit bounds, so that we later construct the final var_off
from upper and lower 32 bits. The final __update_reg_bounds() can then
potentially still slightly refine bounds if more bits became known from the
new var_off.

After the improvement, we can see R1 converging successively:

  func#0 @0
  0: R1=ctx(off=0,imm=0) R10=fp0
  0: (61) r2 = *(u32 *)(r1 +0)          ; R1=ctx(off=0,imm=0) R2_w=pkt(off=0,r=0,imm=0)
  1: (61) r3 = *(u32 *)(r1 +4)          ; R1=ctx(off=0,imm=0) R3_w=pkt_end(off=0,imm=0)
  2: (bf) r1 = r2                       ; R1_w=pkt(off=0,r=0,imm=0) R2_w=pkt(off=0,r=0,imm=0)
  3: (07) r1 += 1                       ; R1_w=pkt(off=1,r=0,imm=0)
  4: (2d) if r1 > r3 goto pc+8          ; R1_w=pkt(off=1,r=1,imm=0) R3_w=pkt_end(off=0,imm=0)
  5: (71) r1 = *(u8 *)(r2 +0)           ; R1_w=scalar(umax=255,var_off=(0x0; 0xff)) R2_w=pkt(off=0,r=1,imm=0)
  6: (18) r0 = 0x7fffffffffffff10       ; R0_w=9223372036854775568
  8: (0f) r1 += r0                      ; R0_w=9223372036854775568 R1_w=scalar(umin=9223372036854775568,umax=9223372036854775823,s32_min=-240,s32_max=15)
  9: (18) r0 = 0x8000000000000000       ; R0_w=-9223372036854775808
  11: (07) r0 += 1                      ; R0_w=-9223372036854775807
  12: (ad) if r0 < r1 goto pc-2         ; R0_w=-9223372036854775807 R1_w=scalar(umin=9223372036854775568,umax=9223372036854775809)
  13: (b7) r0 = 0                       ; R0_w=0
  14: (95) exit

  from 12 to 11: R0_w=-9223372036854775807 R1_w=scalar(umin=9223372036854775810,umax=9223372036854775823,var_off=(0x8000000000000000; 0xf),s32_min=0,s32_max=15,u32_max=15) R2_w=pkt(off=0,r=1,imm=0) R3_w=pkt_end(off=0,imm=0) R10=fp0
  11: (07) r0 += 1                      ; R0_w=-9223372036854775806
  12: (ad) if r0 < r1 goto pc-2         ; R0_w=-9223372036854775806 R1_w=-9223372036854775806
  13: safe

  from 12 to 11: R0_w=-9223372036854775806 R1_w=scalar(umin=9223372036854775811,umax=9223372036854775823,var_off=(0x8000000000000000; 0xf),s32_min=0,s32_max=15,u32_max=15) R2_w=pkt(off=0,r=1,imm=0) R3_w=pkt_end(off=0,imm=0) R10=fp0
  11: (07) r0 += 1                      ; R0_w=-9223372036854775805
  12: (ad) if r0 < r1 goto pc-2         ; R0_w=-9223372036854775805 R1_w=-9223372036854775805
  13: safe

  [...]

  from 12 to 11: R0_w=-9223372036854775798 R1=scalar(umin=9223372036854775819,umax=9223372036854775823,var_off=(0x8000000000000008; 0x7),s32_min=8,s32_max=15,u32_min=8,u32_max=15) R2=pkt(off=0,r=1,imm=0) R3=pkt_end(off=0,imm=0) R10=fp0
  11: (07) r0 += 1                      ; R0_w=-9223372036854775797
  12: (ad) if r0 < r1 goto pc-2         ; R0_w=-9223372036854775797 R1=-9223372036854775797
  13: safe

  from 12 to 11: R0_w=-9223372036854775797 R1=scalar(umin=9223372036854775820,umax=9223372036854775823,var_off=(0x800000000000000c; 0x3),s32_min=12,s32_max=15,u32_min=12,u32_max=15) R2=pkt(off=0,r=1,imm=0) R3=pkt_end(off=0,imm=0) R10=fp0
  11: (07) r0 += 1                      ; R0_w=-9223372036854775796
  12: (ad) if r0 < r1 goto pc-2         ; R0_w=-9223372036854775796 R1=-9223372036854775796
  13: safe

  from 12 to 11: R0_w=-9223372036854775796 R1=scalar(umin=9223372036854775821,umax=9223372036854775823,var_off=(0x800000000000000c; 0x3),s32_min=12,s32_max=15,u32_min=12,u32_max=15) R2=pkt(off=0,r=1,imm=0) R3=pkt_end(off=0,imm=0) R10=fp0
  11: (07) r0 += 1                      ; R0_w=-9223372036854775795
  12: (ad) if r0 < r1 goto pc-2         ; R0_w=-9223372036854775795 R1=-9223372036854775795
  13: safe

  from 12 to 11: R0_w=-9223372036854775795 R1=scalar(umin=9223372036854775822,umax=9223372036854775823,var_off=(0x800000000000000e; 0x1),s32_min=14,s32_max=15,u32_min=14,u32_max=15) R2=pkt(off=0,r=1,imm=0) R3=pkt_end(off=0,imm=0) R10=fp0
  11: (07) r0 += 1                      ; R0_w=-9223372036854775794
  12: (ad) if r0 < r1 goto pc-2         ; R0_w=-9223372036854775794 R1=-9223372036854775794
  13: safe

  from 12 to 11: R0_w=-9223372036854775794 R1=-9223372036854775793 R2=pkt(off=0,r=1,imm=0) R3=pkt_end(off=0,imm=0) R10=fp0
  11: (07) r0 += 1                      ; R0_w=-9223372036854775793
  12: (ad) if r0 < r1 goto pc-2
  last_idx 12 first_idx 12
  parent didn't have regs=1 stack=0 marks: R0_rw=P-9223372036854775801 R1_r=scalar(umin=9223372036854775815,umax=9223372036854775823,var_off=(0x8000000000000000; 0xf),s32_min=0,s32_max=15,u32_max=15) R2=pkt(off=0,r=1,imm=0) R3=pkt_end(off=0,imm=0) R10=fp0
  last_idx 11 first_idx 11
  regs=1 stack=0 before 11: (07) r0 += 1
  parent didn't have regs=1 stack=0 marks: R0_rw=P-9223372036854775805 R1_rw=scalar(umin=9223372036854775812,umax=9223372036854775823,var_off=(0x8000000000000000; 0xf),s32_min=0,s32_max=15,u32_max=15) R2_w=pkt(off=0,r=1,imm=0) R3_w=pkt_end(off=0,imm=0) R10=fp0
  last_idx 12 first_idx 0
  regs=1 stack=0 before 12: (ad) if r0 < r1 goto pc-2
  regs=1 stack=0 before 11: (07) r0 += 1
  regs=1 stack=0 before 12: (ad) if r0 < r1 goto pc-2
  regs=1 stack=0 before 11: (07) r0 += 1
  regs=1 stack=0 before 12: (ad) if r0 < r1 goto pc-2
  regs=1 stack=0 before 11: (07) r0 += 1
  regs=1 stack=0 before 9: (18) r0 = 0x8000000000000000
  last_idx 12 first_idx 12
  parent didn't have regs=2 stack=0 marks: R0_rw=P-9223372036854775801 R1_r=Pscalar(umin=9223372036854775815,umax=9223372036854775823,var_off=(0x8000000000000000; 0xf),s32_min=0,s32_max=15,u32_max=15) R2=pkt(off=0,r=1,imm=0) R3=pkt_end(off=0,imm=0) R10=fp0
  last_idx 11 first_idx 11
  regs=2 stack=0 before 11: (07) r0 += 1
  parent didn't have regs=2 stack=0 marks: R0_rw=P-9223372036854775805 R1_rw=Pscalar(umin=9223372036854775812,umax=9223372036854775823,var_off=(0x8000000000000000; 0xf),s32_min=0,s32_max=15,u32_max=15) R2_w=pkt(off=0,r=1,imm=0) R3_w=pkt_end(off=0,imm=0) R10=fp0
  last_idx 12 first_idx 0
  regs=2 stack=0 before 12: (ad) if r0 < r1 goto pc-2
  regs=2 stack=0 before 11: (07) r0 += 1
  regs=2 stack=0 before 12: (ad) if r0 < r1 goto pc-2
  regs=2 stack=0 before 11: (07) r0 += 1
  regs=2 stack=0 before 12: (ad) if r0 < r1 goto pc-2
  regs=2 stack=0 before 11: (07) r0 += 1
  regs=2 stack=0 before 9: (18) r0 = 0x8000000000000000
  regs=2 stack=0 before 8: (0f) r1 += r0
  regs=3 stack=0 before 6: (18) r0 = 0x7fffffffffffff10
  regs=2 stack=0 before 5: (71) r1 = *(u8 *)(r2 +0)
  13: safe

  from 4 to 13: safe
  verification time 322 usec
  stack depth 0
  processed 56 insns (limit 1000000) max_states_per_insn 1 total_states 3 peak_states 3 mark_read 1

This also fixes up a test case along with this improvement where we match
on the verifier log. The updated log now has a refined var_off, too.

Fixes: 3f50f132d8 ("bpf: Verifier, do explicit ALU32 bounds tracking")
Reported-by: Xu Kuohai <xukuohai@huaweicloud.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Reviewed-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20230314203424.4015351-2-xukuohai@huaweicloud.com
Link: https://lore.kernel.org/bpf/20230322213056.2470-1-daniel@iogearbox.net
2023-03-22 16:49:25 -07:00
Documentation bpf, docs: Libbpf overview documentation 2023-03-18 10:17:39 -07:00
LICENSES LICENSES: Add the copyleft-next-0.3.1 license 2022-11-08 15:44:01 +01:00
arch bpf-next-for-netdev 2023-03-06 20:36:39 -08:00
block Driver core changes for 6.3-rc1 2023-02-24 12:58:55 -08:00
certs Kbuild updates for v6.3 2023-02-26 11:53:25 -08:00
crypto Networking changes for 6.3. 2023-02-21 18:24:12 -08:00
drivers net: microchip: sparx5: Add TC template support 2023-03-08 13:19:43 +00:00
fs ARM: SoC drivers for 6.3 2023-02-27 10:04:49 -08:00
include bpf: return long from bpf_map_ops funcs 2023-03-22 15:11:30 -07:00
init Kbuild updates for v6.3 2023-02-26 11:53:25 -08:00
io_uring net: reclaim skb->scm_io_uring bit 2023-03-08 13:21:47 +00:00
ipc Merge branch 'work.namespace' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2023-02-24 19:20:07 -08:00
kernel bpf: Fix __reg_bound_offset 64->32 var_off subreg propagation 2023-03-22 16:49:25 -07:00
lib Kernel concurrency sanitizer (KCSAN) updates for v6.3 2023-02-25 13:02:20 -08:00
mm memblock: small optimizations 2023-02-27 09:34:53 -08:00
net bpf: return long from bpf_map_ops funcs 2023-03-22 15:11:30 -07:00
rust Kbuild updates for v6.3 2023-02-26 11:53:25 -08:00
samples bpf: use canonical ftrace path 2023-03-13 21:51:30 -07:00
scripts Kbuild updates for v6.3 2023-02-26 11:53:25 -08:00
security powerpc updates for 6.3 2023-02-25 11:00:06 -08:00
sound ARM: SoC drivers for 6.3 2023-02-27 10:04:49 -08:00
tools bpf: Fix __reg_bound_offset 64->32 var_off subreg propagation 2023-03-22 16:49:25 -07:00
usr usr/gen_init_cpio.c: remove unnecessary -1 values from int file 2022-10-03 14:21:44 -07:00
virt KVM/riscv changes for 6.3 2023-02-15 12:33:28 -05:00
.clang-format media: subdev: Add for_each_active_route() macro 2023-01-22 09:35:57 +01:00
.cocciconfig scripts: add Linux .cocciconfig for coccinelle 2016-07-22 12:13:39 +02:00
.get_maintainer.ignore get_maintainer: add Alan to .get_maintainer.ignore 2022-08-20 15:17:44 -07:00
.gitattributes .gitattributes: use 'dts' diff driver for *.dtso files 2023-02-26 15:28:23 +09:00
.gitignore .gitignore: ignore *.cover and *.mbx 2023-02-05 18:51:22 +09:00
.mailmap 12 hotfixes, mostly against mm/. Five of these fixes are cc:stable. 2023-02-13 14:09:20 -08:00
.rustfmt.toml rust: add `.rustfmt.toml` 2022-09-28 09:02:20 +02:00
COPYING COPYING: state that all contributions really are covered by this file 2020-02-10 13:32:20 -08:00
CREDITS There is no particular theme here - mainly quick hits all over the tree. 2023-02-23 17:55:40 -08:00
Kbuild Kbuild updates for v6.1 2022-10-10 12:00:45 -07:00
Kconfig kbuild: ensure full rebuild when the compiler is updated 2020-05-12 13:28:33 +09:00
MAINTAINERS ARM: SoC fixes for 6.3, part 1 2023-02-27 10:09:40 -08:00
Makefile Kbuild updates for v6.3 2023-02-26 11:53:25 -08:00
README Drop all 00-INDEX files from Documentation/ 2018-09-09 15:08:58 -06:00

README

Linux kernel
============

There are several guides for kernel developers and users. These guides can
be rendered in a number of formats, like HTML and PDF. Please read
Documentation/admin-guide/README.rst first.

In order to build the documentation, use ``make htmldocs`` or
``make pdfdocs``.  The formatted documentation can also be read online at:

    https://www.kernel.org/doc/html/latest/

There are various text files in the Documentation/ subdirectory,
several of them using the Restructured Text markup notation.

Please read the Documentation/process/changes.rst file, as it contains the
requirements for building and running the kernel, and information about
the problems which may result by upgrading your kernel.