Граф коммитов

20 Коммитов

Автор SHA1 Сообщение Дата
Daniel Borkmann 286aad3c40 net: bpf: be friendly to kmemcheck
Reported by Mikulas Patocka, kmemcheck currently barks out a
false positive since we don't have special kmemcheck annotation
for bitfields used in bpf_prog structure.

We currently have jited:1, len:31 and thus when accessing len
while CONFIG_KMEMCHECK enabled, kmemcheck throws a warning that
we're reading uninitialized memory.

As we don't need the whole bit universe for pages member, we
can just split it to u16 and use a bool flag for jited instead
of a bitfield.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Acked-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-09 16:58:56 -07:00
Daniel Borkmann 60a3b2253c net: bpf: make eBPF interpreter images read-only
With eBPF getting more extended and exposure to user space is on it's way,
hardening the memory range the interpreter uses to steer its command flow
seems appropriate.  This patch moves the to be interpreted bytecode to
read-only pages.

In case we execute a corrupted BPF interpreter image for some reason e.g.
caused by an attacker which got past a verifier stage, it would not only
provide arbitrary read/write memory access but arbitrary function calls
as well. After setting up the BPF interpreter image, its contents do not
change until destruction time, thus we can setup the image on immutable
made pages in order to mitigate modifications to that code. The idea
is derived from commit 314beb9bca ("x86: bpf_jit_comp: secure bpf jit
against spraying attacks").

This is possible because bpf_prog is not part of sk_filter anymore.
After setup bpf_prog cannot be altered during its life-time. This prevents
any modifications to the entire bpf_prog structure (incl. function/JIT
image pointer).

Every eBPF program (including classic BPF that are migrated) have to call
bpf_prog_select_runtime() to select either interpreter or a JIT image
as a last setup step, and they all are being freed via bpf_prog_free(),
including non-JIT. Therefore, we can easily integrate this into the
eBPF life-time, plus since we directly allocate a bpf_prog, we have no
performance penalty.

Tested with seccomp and test_bpf testsuite in JIT/non-JIT mode and manual
inspection of kernel_page_tables.  Brad Spengler proposed the same idea
via Twitter during development of this patch.

Joint work with Hannes Frederic Sowa.

Suggested-by: Brad Spengler <spender@grsecurity.net>
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: Kees Cook <keescook@chromium.org>
Acked-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-05 12:02:48 -07:00
Alexei Starovoitov 7ae457c1e5 net: filter: split 'struct sk_filter' into socket and bpf parts
clean up names related to socket filtering and bpf in the following way:
- everything that deals with sockets keeps 'sk_*' prefix
- everything that is pure BPF is changed to 'bpf_*' prefix

split 'struct sk_filter' into
struct sk_filter {
	atomic_t        refcnt;
	struct rcu_head rcu;
	struct bpf_prog *prog;
};
and
struct bpf_prog {
        u32                     jited:1,
                                len:31;
        struct sock_fprog_kern  *orig_prog;
        unsigned int            (*bpf_func)(const struct sk_buff *skb,
                                            const struct bpf_insn *filter);
        union {
                struct sock_filter      insns[0];
                struct bpf_insn         insnsi[0];
                struct work_struct      work;
        };
};
so that 'struct bpf_prog' can be used independent of sockets and cleans up
'unattached' bpf use cases

split SK_RUN_FILTER macro into:
    SK_RUN_FILTER to be used with 'struct sk_filter *' and
    BPF_PROG_RUN to be used with 'struct bpf_prog *'

__sk_filter_release(struct sk_filter *) gains
__bpf_prog_release(struct bpf_prog *) helper function

also perform related renames for the functions that work
with 'struct bpf_prog *', since they're on the same lines:

sk_filter_size -> bpf_prog_size
sk_filter_select_runtime -> bpf_prog_select_runtime
sk_filter_free -> bpf_prog_free
sk_unattached_filter_create -> bpf_prog_create
sk_unattached_filter_destroy -> bpf_prog_destroy
sk_store_orig_filter -> bpf_prog_store_orig_filter
sk_release_orig_filter -> bpf_release_orig_filter
__sk_migrate_filter -> bpf_migrate_filter
__sk_prepare_filter -> bpf_prepare_filter

API for attaching classic BPF to a socket stays the same:
sk_attach_filter(prog, struct sock *)/sk_detach_filter(struct sock *)
and SK_RUN_FILTER(struct sk_filter *, ctx) to execute a program
which is used by sockets, tun, af_packet

API for 'unattached' BPF programs becomes:
bpf_prog_create(struct bpf_prog **)/bpf_prog_destroy(struct bpf_prog *)
and BPF_PROG_RUN(struct bpf_prog *, ctx) to execute a program
which is used by isdn, ppp, team, seccomp, ptp, xt_bpf, cls_bpf, test_bpf

Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-02 15:03:58 -07:00
Markos Chandras d8214ef14a MIPS: bpf: Fix stack space allocation for BPF memwords on MIPS64
When allocating stack space for BPF memwords we need to use the
appropriate 32 or 64-bit instruction to avoid losing the top 32 bits
of the stack pointer.

Signed-off-by: Markos Chandras <markos.chandras@imgtec.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Daniel Borkmann <dborkman@redhat.com>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: netdev@vger.kernel.org
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/7135/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2014-06-26 10:48:23 +01:00
Markos Chandras b6a14a9845 MIPS: BPF: Use 32 or 64-bit load instruction to load an address to register
When loading a pointer to register we need to use the appropriate
32 or 64bit instruction to preserve the pointers' top 32bits.

Signed-off-by: Markos Chandras <markos.chandras@imgtec.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Daniel Borkmann <dborkman@redhat.com>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: netdev@vger.kernel.org
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/7180/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2014-06-26 10:48:22 +01:00
Markos Chandras b4fe0ec86d MIPS: bpf: Fix PKT_TYPE case for big-endian cores
The skb->pkt_type field is defined as follows:

u8 pkt_type:3,
   fclone:2,
   ipvs_property:1,
   peeked:1,
   nf_trace:1

resulting to the following layout in big-endian systems

[pkt_type][fclone][ipvs_propery][peeked][nf_trace]
^                                                ^
|                                                |
LSB                                             MSB

As a result, the existing code did not work because it was trying to
match pkt_type == 7 whereas in reality it is 7<<5 on big-endian
systems.

This has been fixed in the interpreter in
0dcceabb0c
"net: filter: fix SKF_AD_PKTTYPE extension on big-endian"

The fix is to look for 7<<5 on big-endian systems for the pkt_type
field, and shift by 5 so the packet type will be at the lower 3 bits
of the A register.

Signed-off-by: Markos Chandras <markos.chandras@imgtec.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Daniel Borkmann <dborkman@redhat.com>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: netdev@vger.kernel.org
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/7132/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2014-06-26 10:48:22 +01:00
Markos Chandras 95782bf434 MIPS: BPF: Prevent kernel fall over for >=32bit shifts
Remove BUG_ON() if the shift immediate is >=32 to avoid kernel crashes
due to malicious user input. If the shift immediate is >= 32,
we simply load the destination register with 0 since only
32-bit instructions are used by JIT so this will do the
correct thing even on MIPS64.

Signed-off-by: Markos Chandras <markos.chandras@imgtec.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Daniel Borkmann <dborkman@redhat.com>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: netdev@vger.kernel.org
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/7179/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2014-06-26 10:48:22 +01:00
Markos Chandras e5bb48b055 MIPS: bpf: Drop update_on_xread and always initialize the X register
Previously, update_on_xread() only set the reset flag if SEEN_X hasn't
been set already. However, SEEN_X is used to indicate that X is used
as destination or source register so there are some cases where X
is only used as source register and we really need to make sure that it
has been initialized in time. As a result of which, drop this function and
always set X to zero if it's used in any of the opcodes.

Signed-off-by: Markos Chandras <markos.chandras@imgtec.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Daniel Borkmann <dborkman@redhat.com>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: netdev@vger.kernel.org
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/7133/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2014-06-26 10:48:22 +01:00
Markos Chandras 10c4d614d2 MIPS: bpf: Fix is_range() semantics
is_range() was meant to check whether the number is within
the s16 range or not. However the return values and consumers expected
the exact opposite. We fix that by inverting the logic in the function
to return 'true' for < s16 and 'false' for > s16.

Signed-off-by: Markos Chandras <markos.chandras@imgtec.com>
Reported-by: Alexei Starovoitov <ast@plumgrid.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Daniel Borkmann <dborkman@redhat.com>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: netdev@vger.kernel.org
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/7131/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2014-06-26 10:48:21 +01:00
Markos Chandras 78b95b662c MIPS: bpf: Use pr_debug instead of pr_warn for unhandled opcodes
We should prevent spamming the logs during normal execution of bpf-jit.

Signed-off-by: Markos Chandras <markos.chandras@imgtec.com>
Suggested-by: Alexei Starovoitov <ast@plumgrid.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Daniel Borkmann <dborkman@redhat.com>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: netdev@vger.kernel.org
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/7129/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2014-06-26 10:48:21 +01:00
Markos Chandras 91a41d7f97 MIPS: bpf: Fix return values for VLAN_TAG_PRESENT case
If VLAN_TAG_PRESENT is not zero, then return 1 as expected by
classic BPF. Otherwise return 0.

Signed-off-by: Markos Chandras <markos.chandras@imgtec.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Daniel Borkmann <dborkman@redhat.com>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: netdev@vger.kernel.org
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/7128/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2014-06-26 10:48:21 +01:00
Markos Chandras 6e86c59d4d MIPS: bpf: Use correct mask for VLAN_TAG case
Using VLAN_VID_MASK is not correct to get the vlan tag. Use
~VLAN_PRESENT_MASK instead and make sure it's u16 so the top 16-bits
will be removed. This will ensure that the emit_andi() code will not
treat this as a big 32-bit unsigned value.

Signed-off-by: Markos Chandras <markos.chandras@imgtec.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Daniel Borkmann <dborkman@redhat.com>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: netdev@vger.kernel.org
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/7127/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2014-06-26 10:48:21 +01:00
Markos Chandras 1ab24a4e3d MIPS: bpf: Fix branch conditional for BPF_J{GT/GE} cases
The sltiu and sltu instructions will set the scratch register
to 1 if A <= X|K so fix the emitted branch conditional to check
for scratch != zero rather than scratch >= zero which would complicate
the resuling branch logic given that MIPS does not have a BGT or BGET
instructions to compare general purpose registers directly.

Signed-off-by: Markos Chandras <markos.chandras@imgtec.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Daniel Borkmann <dborkman@redhat.com>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: netdev@vger.kernel.org
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/7126/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2014-06-26 10:48:21 +01:00
Markos Chandras 9eebfe478d MIPS: bpf: Add SEEN_SKB to flags when looking for the PKT_TYPE
The SKF_AD_PKTTYPE uses the skb pointer so make sure it's in the
flags so it will be initialized in time.

Signed-off-by: Markos Chandras <markos.chandras@imgtec.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Daniel Borkmann <dborkman@redhat.com>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: netdev@vger.kernel.org
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/7125/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2014-06-26 10:48:20 +01:00
Markos Chandras 9ee1606e8a MIPS: bpf: Use 'andi' instead of 'and' for the VLAN cases
The VLAN_VID_MASK and VLAN_TAG_PRESENT are immediates, so using
'and' which expects 3 registers will produce wrong results. Fix
this by using the 'andi' instruction.

Signed-off-by: Markos Chandras <markos.chandras@imgtec.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Daniel Borkmann <dborkman@redhat.com>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: netdev@vger.kernel.org
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/7124/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2014-06-26 10:48:20 +01:00
Markos Chandras 55393ee535 MIPS: bpf: Return error code if the offset is a negative number
Previously, the negative offset was not checked leading to failures
due to trying to load data beyond the skb struct boundaries. Until we
have proper asm helpers in place, it's best if we return ENOSUPP if K
is negative when trying to JIT the filter or 0 during runtime if we
do an indirect load where the value of X is unknown during build time.

Signed-off-by: Markos Chandras <markos.chandras@imgtec.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Daniel Borkmann <dborkman@redhat.com>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: netdev@vger.kernel.org
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/7123/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2014-06-26 10:48:20 +01:00
Markos Chandras 35a8e16abe MIPS: bpf: Use the LO register to get division's quotient
Reading from the HI register to get the division result is wrong.
The quotient is placed in the LO register.

Signed-off-by: Markos Chandras <markos.chandras@imgtec.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Daniel Borkmann <dborkman@redhat.com>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: netdev@vger.kernel.org
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/7122/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2014-06-26 10:48:20 +01:00
Daniel Borkmann a83d081ed1 MIPS: BPF JIT: Fix build error.
mips: allmodconfig fails in 3.16-rc1 with lots of undefined symbols.

  arch/mips/net/bpf_jit.c: In function 'is_load_to_a':
  arch/mips/net/bpf_jit.c:559:7: error: 'BPF_S_LD_W_LEN' undeclared (first use in this function)
  arch/mips/net/bpf_jit.c:559:7: note: each undeclared identifier is reported only once for each function it appears in
  arch/mips/net/bpf_jit.c:560:7: error: 'BPF_S_LD_W_ABS' undeclared (first use in this function)
  [...]

The reason behind this is that 3480593131 ("net: filter: get rid of
BPF_S_* enum") was routed via net-next tree, that takes all BPF-related
changes, at a time where MIPS BPF JIT was not part of net-next, while
c6610de353 ("MIPS: net: Add BPF JIT") was routed via mips arch tree
and went into mainline within the same merge window. Thus, fix it up by
converting BPF_S_* in a similar fashion as in 3480593131 for MIPS.

Reported-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: Markos Chandras <markos.chandras@imgtec.com>
Cc: linux-kernel@vger.kernel.org <linux-kernel@vger.kernel.org>
Cc: Linux MIPS Mailing List <linux-mips@linux-mips.org>
Patchwork: https://patchwork.linux-mips.org/patch/7099/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2014-06-26 10:48:18 +01:00
Ralf Baechle b4f16c938e MIPS: BFP: Simplify code slightly.
This keeps the if condition slightly simpler - it's going to become ore
complication.

Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2014-06-04 22:50:40 +02:00
Markos Chandras c6610de353 MIPS: net: Add BPF JIT
This adds initial support for BPF-JIT on MIPS

Tested on mips32 LE/BE and mips64 BE/n64 using
dhcp, ping and various tcpdump filters.

Benchmarking:

Assuming the remote MIPS target uses 192.168.154.181
as its IP address, and the local host uses 192.168.154.136,
the following results can be obtained using the following
tcpdump filter (catches no frames) and a simple
'time ping -f -c 1000000' command.

[root@(none) ~]# tcpdump -p -n -s 0 -i eth0 net 10.0.0.0/24 -d
(000) ldh      [12]
(001) jeq      #0x800           jt 2	jf 8
(002) ld       [26]
(003) and      #0xffffff00
(004) jeq      #0xa000000       jt 16	jf 5
(005) ld       [30]
(006) and      #0xffffff00
(007) jeq      #0xa000000       jt 16	jf 17
(008) jeq      #0x806           jt 10	jf 9
(009) jeq      #0x8035          jt 10	jf 17
(010) ld       [28]
(011) and      #0xffffff00
(012) jeq      #0xa000000       jt 16	jf 13
(013) ld       [38]
(014) and      #0xffffff00
(015) jeq      #0xa000000       jt 16	jf 17
(016) ret      #65535

- BPF-JIT Disabled

real    1m38.005s
user    0m1.510s
sys     0m6.710s

- BPF-JIT Enabled

real    1m35.215s
user    0m1.200s
sys     0m4.140s

[ralf@linux-mips.org: Resolved conflict.]

Signed-off-by: Markos Chandras <markos.chandras@imgtec.com>
2014-05-30 16:10:20 +02:00