Commit graph

723525 commits

Author SHA1 Message Date
David Miller 5f5a641116 bpf: sparc64: Add JIT support for multi-function programs.
Modelled strongly upon the arm64 implementation.

Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2017-12-23 01:00:52 +01:00
Yonghong Song c475ffad58 tools/bpf: adjust rlimit RLIMIT_MEMLOCK for test_dev_cgroup
The default rlimit RLIMIT_MEMLOCK is 64KB. In certain cases,
e.g. in a test machine mimicking our production system, this test may
fail because the required memory cannot be charged for prog load:

  $ ./test_dev_cgroup
  libbpf: load bpf program failed: Operation not permitted
  libbpf: failed to load program 'cgroup/dev'
  libbpf: failed to load object './dev_cgroup.o'
  Failed to load DEV_CGROUP program
  ...

Changing the default rlimit RLIMIT_MEMLOCK to unlimited
makes the test pass.

This patch also fixed a problem where when bpf_prog_load fails,
cleanup_cgroup_environment() should not be called since
setup_cgroup_environment() has not been invoked. Otherwise,
the following confusing message will appear:
  ...
  (/home/yhs/local/linux/tools/testing/selftests/bpf/cgroup_helpers.c:95:
   errno: No such file or directory) Opening Cgroup Procs: /mnt/cgroup.procs
  ...

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2017-12-20 19:15:54 -08:00
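The rlimit change described in the test_dev_cgroup commit above can be sketched in user space roughly as follows. This is a minimal illustration, not the patch itself; the helper names unlimited_memlock() and raise_memlock_limit() are hypothetical, while setrlimit(), RLIMIT_MEMLOCK and RLIM_INFINITY are the standard interfaces.

```c
#include <stdio.h>
#include <sys/resource.h>

/* Hypothetical helper: build an rlimit that removes the cap entirely. */
static struct rlimit unlimited_memlock(void)
{
	struct rlimit r = {
		.rlim_cur = RLIM_INFINITY,
		.rlim_max = RLIM_INFINITY,
	};
	return r;
}

/*
 * Hypothetical helper: apply the limit before loading any BPF program.
 * Returns 0 on success and -1 on failure (raising the hard limit
 * usually needs privileges).
 */
static int raise_memlock_limit(void)
{
	struct rlimit r = unlimited_memlock();

	if (setrlimit(RLIMIT_MEMLOCK, &r) != 0) {
		perror("setrlimit(RLIMIT_MEMLOCK)");
		return -1;
	}
	return 0;
}
```

A test program would presumably call raise_memlock_limit() once, early in main(), and bail out before any cgroup setup if it fails.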
Alexei Starovoitov 7466177635 Merge branch 'bpftool-improvements-kallsymfix'
Daniel Borkmann says:

====================
This work adds correlation of maps and calls into the bpftool
xlated dump in order to help debugging and introspection of
loaded BPF progs. First patch makes kallsyms work on subprogs
with bpf calls, and second implements the actual correlation.
Details and example output can be found in the 2nd patch.
====================

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2017-12-20 18:09:42 -08:00
Daniel Borkmann 7105e828c0 bpf: allow for correlation of maps and helpers in dump
Currently a dump of an xlated prog (post verifier stage) doesn't
correlate used helpers as well as maps. The prog info lists
involved map ids, however there's no correlation of where in the
program they are used as of today. Likewise, bpftool does not
correlate helper calls with the target functions.

The latter can be done w/o any kernel changes through kallsyms,
and also has the advantage that this works with inlined helpers
and BPF calls.

Example, via interpreter:

  # tc filter show dev foo ingress
  filter protocol all pref 49152 bpf chain 0
  filter protocol all pref 49152 bpf chain 0 handle 0x1 foo.o:[ingress] \
                      direct-action not_in_hw id 1 tag c74773051b364165   <-- prog id:1

  * Output before patch (calls/maps remain unclear):

  # bpftool prog dump xlated id 1             <-- dump prog id:1
   0: (b7) r1 = 2
   1: (63) *(u32 *)(r10 -4) = r1
   2: (bf) r2 = r10
   3: (07) r2 += -4
   4: (18) r1 = 0xffff95c47a8d4800
   6: (85) call unknown#73040
   7: (15) if r0 == 0x0 goto pc+18
   8: (bf) r2 = r10
   9: (07) r2 += -4
  10: (bf) r1 = r0
  11: (85) call unknown#73040
  12: (15) if r0 == 0x0 goto pc+23
  [...]

  * Output after patch:

  # bpftool prog dump xlated id 1
   0: (b7) r1 = 2
   1: (63) *(u32 *)(r10 -4) = r1
   2: (bf) r2 = r10
   3: (07) r2 += -4
   4: (18) r1 = map[id:2]                     <-- map id:2
   6: (85) call bpf_map_lookup_elem#73424     <-- helper call
   7: (15) if r0 == 0x0 goto pc+18
   8: (bf) r2 = r10
   9: (07) r2 += -4
  10: (bf) r1 = r0
  11: (85) call bpf_map_lookup_elem#73424
  12: (15) if r0 == 0x0 goto pc+23
  [...]

  # bpftool map show id 2                     <-- show/dump/etc map id:2
  2: hash_of_maps  flags 0x0
        key 4B  value 4B  max_entries 3  memlock 4096B

Example, JITed, same prog:

  # tc filter show dev foo ingress
  filter protocol all pref 49152 bpf chain 0
  filter protocol all pref 49152 bpf chain 0 handle 0x1 foo.o:[ingress] \
                  direct-action not_in_hw id 3 tag c74773051b364165 jited

  # bpftool prog show id 3
  3: sched_cls  tag c74773051b364165
        loaded_at Dec 19/13:48  uid 0
        xlated 384B  jited 257B  memlock 4096B  map_ids 2

  # bpftool prog dump xlated id 3
   0: (b7) r1 = 2
   1: (63) *(u32 *)(r10 -4) = r1
   2: (bf) r2 = r10
   3: (07) r2 += -4
   4: (18) r1 = map[id:2]                      <-- map id:2
   6: (85) call __htab_map_lookup_elem#77408   <-+ inlined rewrite
   7: (15) if r0 == 0x0 goto pc+2                |
   8: (07) r0 += 56                              |
   9: (79) r0 = *(u64 *)(r0 +0)                <-+
  10: (15) if r0 == 0x0 goto pc+24
  11: (bf) r2 = r10
  12: (07) r2 += -4
  [...]

Example, same prog, but kallsyms disabled (in that case we are
also not allowed to pass any relative offsets, etc, so prog
becomes pointer sanitized on dump):

  # sysctl kernel.kptr_restrict=2
  kernel.kptr_restrict = 2

  # bpftool prog dump xlated id 3
   0: (b7) r1 = 2
   1: (63) *(u32 *)(r10 -4) = r1
   2: (bf) r2 = r10
   3: (07) r2 += -4
   4: (18) r1 = map[id:2]
   6: (85) call bpf_unspec#0
   7: (15) if r0 == 0x0 goto pc+2
  [...]

Example, BPF calls via interpreter:

  # bpftool prog dump xlated id 1
   0: (85) call pc+2#__bpf_prog_run_args32
   1: (b7) r0 = 1
   2: (95) exit
   3: (b7) r0 = 2
   4: (95) exit

Example, BPF calls via JIT:

  # sysctl net.core.bpf_jit_enable=1
  net.core.bpf_jit_enable = 1
  # sysctl net.core.bpf_jit_kallsyms=1
  net.core.bpf_jit_kallsyms = 1

  # bpftool prog dump xlated id 1
   0: (85) call pc+2#bpf_prog_3b185187f1855c4c_F
   1: (b7) r0 = 1
   2: (95) exit
   3: (b7) r0 = 2
   4: (95) exit

And finally, an example for tail calls that is now working
as well wrt correlation:

  # bpftool prog dump xlated id 2
  [...]
  10: (b7) r2 = 8
  11: (85) call bpf_trace_printk#-41312
  12: (bf) r1 = r6
  13: (18) r2 = map[id:1]
  15: (b7) r3 = 0
  16: (85) call bpf_tail_call#12
  17: (b7) r1 = 42
  18: (6b) *(u16 *)(r6 +46) = r1
  19: (b7) r0 = 0
  20: (95) exit

  # bpftool map show id 1
  1: prog_array  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2017-12-20 18:09:40 -08:00
Daniel Borkmann 4f74d80971 bpf: fix kallsyms handling for subprogs
Right now kallsyms handling is not working with JITed subprogs.
The reason is that when in 1c2a088a66 ("bpf: x64: add JIT support
for multi-function programs") in jit_subprogs() they are passed
to bpf_prog_kallsyms_add(), then their prog type is 0, which BPF
core will think it's a cBPF program as only cBPF programs have a
0 type. Thus, they need to inherit the type from the main prog.

Once that is fixed, they are indeed added to the BPF kallsyms
infra, but their tag is 0. Therefore, since intention is to add
them as bpf_prog_F_<tag>, we need to pass them to bpf_prog_calc_tag()
first. And once this is resolved, there is a use-after-free on
prog cleanup: we remove the kallsyms entry from the main prog,
later walk all subprogs and call bpf_jit_free() on them. However,
the kallsyms linkage was never released on them. Thus, do that
for all subprogs right in __bpf_prog_put() when refcount hits 0.

Fixes: 1c2a088a66 ("bpf: x64: add JIT support for multi-function programs")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2017-12-20 18:09:40 -08:00
David Miller 7d9890ef50 libbpf: Fix build errors.
These elf object pieces are of type Elf64_Xword and therefore could be
"long long" on some builds.

Cast to "long long" and use printf format %lld to deal with this since
we are building with -Werror=format.

Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2017-12-20 01:33:25 +01:00
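The cast described in the libbpf build fix above can be illustrated with a small sketch. It assumes uint64_t as a stand-in for Elf64_Xword so the example does not depend on <elf.h>; format_xword() is a hypothetical helper, not a libbpf function.

```c
#include <stdint.h>
#include <stdio.h>

/*
 * Assumption: uint64_t stands in for Elf64_Xword. Depending on the
 * build it may be "unsigned long" or "unsigned long long", which is
 * why a fixed printf format without a cast can trip -Werror=format.
 */
typedef uint64_t xword_t;

/* Hypothetical helper: format an Xword-sized value portably. */
static int format_xword(char *buf, size_t len, xword_t value)
{
	/* The explicit cast makes the argument match %lld on every build. */
	return snprintf(buf, len, "%lld", (long long)value);
}
```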
Yonghong Song 06ef0ccb5a bpf/cgroup: fix a verification error for a CGROUP_DEVICE type prog
The tools/testing/selftests/bpf test program
test_dev_cgroup fails with the following error
when compiled with llvm 6.0. (I did not try
with earlier versions.)

  libbpf: load bpf program failed: Permission denied
  libbpf: -- BEGIN DUMP LOG ---
  libbpf:
  0: (61) r2 = *(u32 *)(r1 +4)
  1: (b7) r0 = 0
  2: (55) if r2 != 0x1 goto pc+8
   R0=inv0 R1=ctx(id=0,off=0,imm=0) R2=inv1 R10=fp0
  3: (69) r2 = *(u16 *)(r1 +0)
  invalid bpf_context access off=0 size=2
  ...

The culprit is the following statement in dev_cgroup.c:
  short type = ctx->access_type & 0xFFFF;
This code is typical as the ctx->access_type is assigned
as below in kernel/bpf/cgroup.c:
  struct bpf_cgroup_dev_ctx ctx = {
        .access_type = (access << 16) | dev_type,
        .major = major,
        .minor = minor,
  };

The compiler converts it to u16 access while
the verifier cgroup_dev_is_valid_access rejects
any non u32 access.

This patch permits the field access_type to be accessible
with type u16 and u8 as well.

Signed-off-by: Yonghong Song <yhs@fb.com>
Tested-by: Roman Gushchin <guro@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2017-12-19 01:43:29 +01:00
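The access_type layout quoted in the CGROUP_DEVICE commit above packs two values into one 32-bit word. A minimal sketch of that packing and of the masking that triggered the narrow load follows; the helper names are hypothetical and the values used with them are arbitrary, not the kernel's constants.

```c
#include <stdint.h>

/* Hypothetical helper mirroring "(access << 16) | dev_type". */
static uint32_t pack_access_type(uint32_t access, uint32_t dev_type)
{
	return (access << 16) | dev_type;
}

/*
 * Low 16 bits: the device type. This is the "& 0xFFFF" masking that a
 * compiler may turn into a 16-bit load of the context field.
 */
static uint16_t unpack_dev_type(uint32_t access_type)
{
	return access_type & 0xFFFF;
}

/* High 16 bits: the requested access. */
static uint16_t unpack_access(uint32_t access_type)
{
	return access_type >> 16;
}
```

Because both halves are valid on their own, reading the field with a u16 or u8 load is a reasonable thing for generated code to do, which is what the patch permits.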
Xiongwei Song c060bc6115 bpf: make function xdp_do_generic_redirect_map() static
The function xdp_do_generic_redirect_map() is only used in this file, so
make it static.

Clean up sparse warning:
net/core/filter.c:2687:5: warning: no previous prototype
for 'xdp_do_generic_redirect_map' [-Wmissing-prototypes]

Signed-off-by: Xiongwei Song <sxwjean@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2017-12-19 01:37:16 +01:00
Jakub Kicinski 4ca998fe46 selftests/bpf: add netdevsim to config
BPF offload tests (test_offload.py) will require netdevsim
to be built, add it to config.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2017-12-19 01:35:12 +01:00
Alexei Starovoitov 5ee7f784cd bpf: arm64: fix uninitialized variable
fix the following issue:
arch/arm64/net/bpf_jit_comp.c: In function 'bpf_int_jit_compile':
arch/arm64/net/bpf_jit_comp.c:982:18: error: 'image_size' may be used
uninitialized in this function [-Werror=maybe-uninitialized]

Fixes: db496944fd ("bpf: arm64: add JIT support for multi-function programs")
Reported-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2017-12-19 01:29:25 +01:00
Colin Ian King fa2d41adb9 bpf: make function skip_callee static and return NULL rather than 0
Function skip_callee is local to the source and does not need to
be in global scope, so make it static. Also return NULL rather than 0.
Cleans up two sparse warnings:

symbol 'skip_callee' was not declared. Should it be static?
Using plain integer as NULL pointer

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2017-12-19 01:26:04 +01:00
Colin Ian King e90004d56b bpf: fix spelling mistake: "funcation"-> "function"
Trivial fix to spelling mistake in error message text.

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2017-12-19 01:22:59 +01:00
Bjorn Helgaas 962b582785 cxgb4: Simplify PCIe Completion Timeout setting
Simplify PCIe Completion Timeout setting by using the
pcie_capability_clear_and_set_word() interface.  No functional change
intended.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-18 15:12:57 -05:00
David S. Miller 0dc6905a6a Merge branch 'erspan-a-couple-fixes'
William Tu says:

====================
net: erspan: a couple fixes

Haishuang Yan reports a couple of issues (wrong return value,
pskb_may_pull) on erspan V1.  Since erspan V2 is in net-next,
this series fixes similar issues on v2.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-18 15:11:26 -05:00
William Tu d91e8db5b6 net: erspan: reload pointer after pskb_may_pull
pskb_may_pull() can change skb->data, so we need to re-load pkt_md
and ershdr at the right place.

Fixes: 94d7d8f292 ("ip6_gre: add erspan v2 support")
Fixes: f551c91de2 ("net: erspan: introduce erspan v2 for ip_gre")
Signed-off-by: William Tu <u9012063@gmail.com>
Cc: Haishuang Yan <yanhaishuang@cmss.chinamobile.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-18 15:11:25 -05:00
William Tu ae3e13373b net: erspan: fix wrong return value
If pskb_may_pull() fails, return PACKET_REJECT
instead of -ENOMEM.

Fixes: 94d7d8f292 ("ip6_gre: add erspan v2 support")
Fixes: f551c91de2 ("net: erspan: introduce erspan v2 for ip_gre")
Signed-off-by: William Tu <u9012063@gmail.com>
Cc: Haishuang Yan <yanhaishuang@cmss.chinamobile.com>
Acked-by: Haishuang Yan <yanhaishuang@cmss.chinamobile.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-18 15:11:25 -05:00
David S. Miller 814a178413 Merge branch 'sfp-phylink-fixes'
Russell King says:

====================
More SFP/phylink fixes

This series fixes a few more bits with sfp/phylink, particularly
confusion with the right way to test for the RTNL mutex being
held, a change in 2016 to the mdiobus_scan() behaviour that wasn't
noticed, and a fix for reading module EEPROMs.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-18 14:57:49 -05:00
Russell King 8b874514c1 phylink: fix locking asserts
Use ASSERT_RTNL() rather than WARN_ON(!lockdep_rtnl_is_held()) which
stops working when lockdep fires, and we end up with lots of warnings.

Fixes: 9525ae8395 ("phylink: add phylink infrastructure")
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-18 14:57:48 -05:00
Russell King 2794ffc441 sfp: fix EEPROM reading in the case of non-SFF8472 SFPs
The EEPROM reading was trying to read from the second EEPROM address
if we requested the last byte from the SFF8079 EEPROM, which caused a
failure when the second EEPROM is not present.  Discovered with a
S-RJ01 SFP module.  Fix this.

Fixes: 7397005545 ("sfp: add SFP module support")
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-18 14:57:48 -05:00
Russell King 20b56ed9f8 sfp: fix non-detection of PHY
The detection of a PHY changed in commit e98a3aabf8 ("mdio_bus: don't
return NULL from mdiobus_scan()") which now causes sfp to print an
error message.  Update for this change.

Fixes: 7397005545 ("sfp: add SFP module support")
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-18 14:57:48 -05:00
Samuel Mendoza-Jonas 75e8e15635 net/ncsi: Don't take any action on HNCDSC AEN
The current HNCDSC handler takes the status flag from the AEN packet and
will update or change the current channel based on this flag and the
current channel status.

However the flag from the HNCDSC packet merely represents the host link
state. While the state of the host interface is potentially interesting
information it should not affect the state of the NCSI link. Indeed the
NCSI specification makes no mention of any recommended action related to
the host network controller driver state.

Update the HNCDSC handler to record the host network driver status but
take no other action.

Signed-off-by: Samuel Mendoza-Jonas <sam@mendozajonas.com>
Acked-by: Jeremy Kerr <jk@ozlabs.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-18 14:50:11 -05:00
David S. Miller 27e9f4b3e5 Merge branch 'phy-meson-gxl-clean-up-and-improvements'
Jerome Brunet says:

====================
net: phy: meson-gxl: clean-up and improvements

This patchset adds defines for the control registers and helpers to access
the banked registers. The goal being to make it easier to understand what
the driver actually does.
Then the CONFIG_A6 setting is removed, since this statement had no effect.
Finally, interrupt support is added, speeding things up a little.

This series has been tested on the libretech-cc and khadas VIM

Changes since v2 [0]:
Drop LPA corruption fix which has been merged through net. Apart from this,
series remains the same.

[0]: https://lkml.kernel.org/r/20171207142715.32578-1-jbrunet@baylibre.com
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-18 13:24:57 -05:00
Jerome Brunet afb4fa47fe net: phy: meson-gxl: join the authors
Following previous changes, join the other authors of this driver and
take the blame with them

Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Jerome Brunet <jbrunet@baylibre.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-18 13:24:56 -05:00
Jerome Brunet cf127ff20a net: phy: meson-gxl: add interrupt support
Enable interrupt support in meson-gxl PHY driver

Signed-off-by: Jerome Brunet <jbrunet@baylibre.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-18 13:24:56 -05:00
Jerome Brunet 30e43f1334 net: phy: meson-gxl: leave CONFIG_A6 untouched
The PHY performs just as well when left in its default configuration and
it makes sense because this poke gets reset just after init.

According to the documentation, all registers in the Analog/DSP bank are
reset when there is a mode switch from 10BT to 100BT. The bank is also
reset on power down and soft reset, so we will never see the value which
may have been set by the bootloader.

In the end, we have used the default configuration so far and there is no
reason to change now. Remove CONFIG_A6 poke to make this clear.

Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Jerome Brunet <jbrunet@baylibre.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-18 13:24:56 -05:00
Jerome Brunet c1e535510f net: phy: meson-gxl: use genphy_config_init
Use the generic init function to populate some of the phydev
structure fields

Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Jerome Brunet <jbrunet@baylibre.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-18 13:24:56 -05:00
Jerome Brunet fdaa84c371 net: phy: meson-gxl: add read and write helpers for banked registers
Add read and write helpers to manipulate banked registers on this PHY
This helps clarify the settings applied to these registers and what the
driver actually does

Signed-off-by: Neil Armstrong <narmstrong@baylibre.com>
Signed-off-by: Jerome Brunet <jbrunet@baylibre.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-18 13:24:55 -05:00
Jerome Brunet 00fd73eb29 net: phy: meson-gxl: define control registers
Define registers and bits in the meson-gxl PHY driver to make it a bit
more human friendly. No functional change.

Signed-off-by: Neil Armstrong <narmstrong@baylibre.com>
Signed-off-by: Jerome Brunet <jbrunet@baylibre.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-18 13:24:55 -05:00
Jerome Brunet 9042b46eda net: phy: meson-gxl: check phy_write return value
Always check phy_write return values. Better to be safe than sorry

Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Jerome Brunet <jbrunet@baylibre.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-18 13:24:55 -05:00
David S. Miller e765508597 Merge branch 'sfc-Medford2'
Edward Cree says:

====================
sfc: Initial X2000-series (Medford2) support

Basic PCI-level changes to support X2000-series NICs.
Also fix unexpected-PTP-event log messages, since the timestamp format has
 been changed in these NICs and that causes us to fail to probe PTP (but we
 still get the PPS events).
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-18 13:07:50 -05:00
Bert Kenward 0bc959a95e sfc: populate the timer reload field
The timer mode register now has a separate field for the reload value.
Since we always use this timer with the reload (for interrupt moderation)
we set this to the same as the initial value.

Previous hardware ignores this field, so we can safely set these bits
on all hardware that uses this register.

Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-18 13:07:50 -05:00
Bert Kenward d8d8ccf277 sfc: update EF10 register definitions
The RX_L4_CLASS field has shrunk from 3 bits to 2 bits. The upper
bit was never used in previous hardware, so we can use the new
definition throughout.

The TSO OUTER_IPID field was previously spelt differently from the
external definitions.

Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-18 13:07:50 -05:00
Edward Cree acaef3c156 sfc: improve PTP error reporting
Log a message if PTP probing fails; if we then, unexpectedly, get PTP
 events, only log a message for the first one on each device.

Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-18 13:07:49 -05:00
Edward Cree aae5a31663 sfc: add Medford2 (SFC9250) PCI Device IDs
Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-18 13:07:49 -05:00
Edward Cree 7182744301 sfc: support VI strides other than 8k
Medford2 can also have 16k or 64k VI stride.  This is reported by MCDI in
 GET_CAPABILITIES, which fortunately is called before the driver does
 anything sensitive to the VI stride (such as accessing or even allocating
 VIs past the zeroth).

Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-18 13:07:49 -05:00
Edward Cree 03714bbb22 sfc: make mem_bar a function rather than a constant
Support using BAR 0 on SFC9250, even though the driver doesn't bind to such
 devices yet.

Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-18 13:07:49 -05:00
David S. Miller 59436c9ee1 Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Daniel Borkmann says:

====================
pull-request: bpf-next 2017-12-18

The following pull-request contains BPF updates for your *net-next* tree.

The main changes are:

1) Allow arbitrary function calls from one BPF function to another BPF function.
   As of today when writing BPF programs, __always_inline had to be used in
   the BPF C programs for all functions, unnecessarily causing LLVM to inflate
   code size. Handle this more naturally with support for BPF to BPF calls
   such that this __always_inline restriction can be overcome. As a result,
   it allows for better optimized code and finally enables to introduce core
   BPF libraries in the future that can be reused out of different projects.
   x86 and arm64 JIT support was added as well, from Alexei.

2) Add infrastructure for tagging functions as error injectable and allow for
   BPF to return arbitrary error values when BPF is attached via kprobes on
   those. This way of injecting errors generically eases testing and debugging
   without having to recompile or restart the kernel. Tags for opting-in for
   this facility are added with BPF_ALLOW_ERROR_INJECTION(), from Josef.

3) For BPF offload via nfp JIT, add support for bpf_xdp_adjust_head() helper
   call for XDP programs. First part of this work adds handling of BPF
   capabilities included in the firmware, and the later patches add support
   to the nfp verifier part and JIT as well as some small optimizations,
   from Jakub.

4) The bpftool now also gets support for basic cgroup BPF operations such
   as attaching, detaching and listing current BPF programs. As a requirement
   for the attach part, bpftool can now also load object files through
   'bpftool prog load'. This reuses libbpf which we have in the kernel tree
   as well. bpftool-cgroup man page is added along with it, from Roman.

5) Back then commit e87c6bc385 ("bpf: permit multiple bpf attachments for
   a single perf event") added support for attaching multiple BPF programs
   to a single perf event. Given they are configured through perf's ioctl()
   interface, the interface has been extended with a PERF_EVENT_IOC_QUERY_BPF
   command in this work in order to return an array of one or multiple BPF
   prog ids that are currently attached, from Yonghong.

6) Various minor fixes and cleanups to the bpftool's Makefile as well
   as a new 'uninstall' and 'doc-uninstall' target for removing bpftool
   itself or prior installed documentation related to it, from Quentin.

7) Add CONFIG_CGROUP_BPF=y to the BPF kernel selftest config file which is
   required for the test_dev_cgroup test case to run, from Naresh.

8) Fix reporting of XDP prog_flags for nfp driver, from Jakub.

9) Fix libbpf's exit code from the Makefile when libelf was not found in
   the system, also from Jakub.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-18 10:51:06 -05:00
Josef Bacik 46df3d209d trace: reenable preemption if we modify the ip
Things got moved around between the original bpf_override_return patches
and the final version, and now the ftrace kprobe dispatcher assumes if
you modified the ip that you also enabled preemption.  Make a comment of
this and enable preemption, this fixes the lockdep splat that happened
when using this feature.

Fixes: 9802d86585 ("bpf: add a bpf_override_function helper")
Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2017-12-17 20:47:32 +01:00
Jakub Kicinski 4a29c0db69 nfp: set flags in the correct member of netdev_bpf
netdev_bpf.flags is the input member for installing the program.
netdev_bpf.prog_flags is the output member for querying.  Set
the correct one on query.

Fixes: 92f0292b35 ("net: xdp: report flags program was installed with on query")
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2017-12-17 20:41:59 +01:00
Jakub Kicinski 21567eded9 libbpf: fix Makefile exit code if libelf not found
/bin/sh's exit does not recognize -1 as a number, leading to
the following error message:

/bin/sh: 1: exit: Illegal number: -1

Use 1 as the exit code.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2017-12-17 20:40:29 +01:00
Daniel Borkmann ef9fde06a2 Merge branch 'bpf-to-bpf-function-calls'
Alexei Starovoitov says:

====================
First of all huge thank you to Daniel, John, Jakub, Edward and others who
reviewed multiple iterations of this patch set over the last many months
and to Dave and others who gave critical feedback during netconf/netdev.

The patch is solid enough and we thought through numerous corner cases,
but it's not the end. More followups with code reorg and features to follow.

TLDR: Allow arbitrary function calls from bpf function to another bpf function.

Since the beginning of bpf all bpf programs were represented as a single function
and program authors were forced to use always_inline for all functions
in their C code. That was causing llvm to unnecessarily inflate the code size
and forcing developers to move code to header files with little code reuse.

With a bit of additional complexity teach verifier to recognize
arbitrary function calls from one bpf function to another as long as
all of functions are presented to the verifier as a single bpf program.
Extended program layout:
..
r1 = ..    // arg1
r2 = ..    // arg2
call pc+1  // function call pc-relative
exit
.. = r1    // access arg1
.. = r2    // access arg2
..
call pc+20 // second level of function call
...

It allows for better optimized code and finally allows to introduce
the core bpf libraries that can be reused in different projects,
since programs are no longer limited by single elf file.
With function calls bpf can be compiled into multiple .o files.

This patch is the first step. It detects programs that contain
multiple functions and checks that calls between them are valid.
It splits the sequence of bpf instructions (one program) into a set
of bpf functions that call each other. Calls to only known
functions are allowed. Since all functions are presented to
the verifier at once conceptually it is 'static linking'.

Future plans:
- introduce BPF_PROG_TYPE_LIBRARY and allow a set of bpf functions
  to be loaded into the kernel that can be later linked to other
  programs with concrete program types. Aka 'dynamic linking'.

- introduce function pointer type and indirect calls to allow
  bpf functions call other dynamically loaded bpf functions while
  the caller bpf function is already executing. Aka 'runtime linking'.
  This will be more generic and more flexible alternative
  to bpf_tail_calls.

FAQ:
Q: Do the interpreter and JIT changes mean that a new instruction is introduced?
A: No. The call instruction technically stays the same. Now it can call
   both kernel helpers and other bpf functions.
   Calling convention stays the same as well.
   From uapi point of view the call insn got new 'relocation' BPF_PSEUDO_CALL
   similar to BPF_PSEUDO_MAP_FD 'relocation' of bpf_ldimm64 insn.

Q: What had to change on LLVM side?
A: Trivial LLVM patch to allow calls was applied to upcoming 6.0 release:
   https://reviews.llvm.org/rL318614
   with few bugfixes as well.
   Make sure to build the latest llvm to have bpf_call support.

More details in the patches.
====================

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2017-12-17 20:34:37 +01:00
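The pc-relative call in the extended program layout above can be sketched as follows. This is an illustrative model only: struct toy_insn is a simplified stand-in for struct bpf_insn, and the two helpers are hypothetical. The opcode 0x85 matches the "(85) call" lines in the bpftool dumps, and the target rule (index + imm + 1) matches the "call pc+2" examples there, where a call at index 0 lands on instruction 3.

```c
#include <stdint.h>

#define BPF_JMP_CALL_OPCODE 0x85 /* BPF_JMP | BPF_CALL, "(85)" in dumps */
#define BPF_PSEUDO_CALL     1    /* src_reg marker for bpf-to-bpf calls */

/* Simplified stand-in for struct bpf_insn (illustrative only). */
struct toy_insn {
	uint8_t code;
	uint8_t src_reg;
	int32_t imm;
};

/* A call is bpf-to-bpf when it carries the pseudo-call marker. */
static int is_pseudo_call(const struct toy_insn *insn)
{
	return insn->code == BPF_JMP_CALL_OPCODE &&
	       insn->src_reg == BPF_PSEUDO_CALL;
}

/* "call pc+N" at index idx transfers control to index idx + N + 1. */
static int call_target(int idx, const struct toy_insn *insn)
{
	return idx + insn->imm + 1;
}
```

In the layout shown in the cover letter, "call pc+1" sits at index 2, so the callee starts at index 4, right after the caller's exit.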
Daniel Borkmann 28ab173e96 selftests/bpf: additional bpf_call tests
Add some additional checks for few more corner cases.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2017-12-17 20:34:36 +01:00
Alexei Starovoitov db496944fd bpf: arm64: add JIT support for multi-function programs
similar to x64 add support for bpf-to-bpf calls.
When program has calls to in-kernel helpers the target call offset
is known at JIT time and arm64 architecture needs 2 passes.
With bpf-to-bpf calls the dynamically allocated function start
is unknown until all functions of the program are JITed.
Therefore (just like x64) arm64 JIT needs one extra pass over
the program to emit correct call offsets.

Implementation detail:
Avoid being too clever in 64-bit immediate moves and
always use 4 instructions (instead of 3-4 depending on the address)
to make sure only one extra pass is needed.
If some future optimization would make it worth while to optimize
'call 64-bit imm' further, the JIT would need to do 4 passes
over the program instead of 3 as in this patch.
For typical bpf program address the mov needs 3 or 4 insns,
so unconditional 4 insns to save extra pass is a worthy trade off
at this state of JIT.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2017-12-17 20:34:36 +01:00
Alexei Starovoitov 1c2a088a66 bpf: x64: add JIT support for multi-function programs
Typical JIT does several passes over bpf instructions to
compute total size and relative offsets of jumps and calls.
With multiple bpf functions calling each other, all relative calls
will have invalid offsets initially; therefore we need an additional
last pass over the program to emit calls with correct offsets.
For example in case of three bpf functions:
main:
  call foo
  call bpf_map_lookup
  exit
foo:
  call bar
  exit
bar:
  exit

We will call bpf_int_jit_compile() independently for main(), foo() and bar().
x64 JIT typically does 4-5 passes to converge.
After these initial passes the image for these 3 functions
will be good except call targets, since start addresses of
foo() and bar() are unknown when we were JITing main()
(note that call bpf_map_lookup will be resolved properly
during initial passes).
Once start addresses of 3 functions are known we patch
call_insn->imm to point to right functions and call
bpf_int_jit_compile() again which needs only one pass.
Additional safety checks are done to make sure this
last pass doesn't produce image that is larger or smaller
than previous pass.
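
The flow above can be modelled with a small Python sketch. It is not the
kernel code: the instruction sizes, load address, helper address and the
names jit_function() and jit_program() are all assumptions made for
illustration, and the initial passes are reduced to a sizing step.

```python
INSN_SIZE = {"call": 5, "exit": 1}  # assumed byte sizes

PROGRAM = {
    "main": [("call", "foo"), ("call", "bpf_map_lookup"), ("exit", None)],
    "foo": [("call", "bar"), ("exit", None)],
    "bar": [("exit", None)],
}
HELPERS = {"bpf_map_lookup": 0x9000}  # hypothetical helper address


def jit_function(insns, base, targets):
    """Emit (op, relative offset) pairs; unknown call targets get offset 0."""
    image, pc = [], base
    for op, arg in insns:
        pc += INSN_SIZE[op]  # offsets are relative to the next instruction
        if op == "call":
            target = targets.get(arg)
            image.append((op, target - pc if target is not None else 0))
        else:
            image.append((op, None))
    return image, pc - base


def jit_program(program, helpers, load_addr=0x1000):
    # Initial passes: each function is compiled on its own, so only its
    # size is trustworthy; bpf-to-bpf call targets are still unknown.
    sizes = {name: jit_function(insns, 0, helpers)[1]
             for name, insns in program.items()}

    # All sizes are known now, so every function gets a start address.
    starts, addr = {}, load_addr
    for name in program:
        starts[name] = addr
        addr += sizes[name]

    # Extra pass: re-emit with real call targets and check the size is stable.
    targets = {**helpers, **starts}
    images = {}
    for name, insns in program.items():
        image, size = jit_function(insns, starts[name], targets)
        if size != sizes[name]:
            raise RuntimeError("image size changed in the extra pass")
        images[name] = image
    return starts, images


starts, images = jit_program(PROGRAM, HELPERS)
```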

When constant blinding is on, it's applied to all functions
at the first pass, since doing it once again at the last
pass can change the size of the JITed code.

Tested on x64 and arm64 hw with JIT on/off, blinding on/off.
The x64 JIT handles bpf-to-bpf calls correctly while arm64 falls back to
the interpreter.
All other JITs that support normal BPF_CALL will behave the same way
since a bpf-to-bpf call is equivalent to a bpf-to-kernel call from
a JIT's point of view.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2017-12-17 20:34:36 +01:00
Alexei Starovoitov 60b58afc96 bpf: fix net.core.bpf_jit_enable race
The global bpf_jit_enable variable is tested multiple times in JITs,
blinding and verifier core. A malicious root user can try to toggle
it while programs are being loaded. This race condition was accounted
for and there should be no issues, but it's safer to avoid
the race altogether.
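
One way to avoid such a race is to read the global setting once per
program and reuse that snapshot everywhere. The Python sketch below
assumes that approach; the class names, the jit_requested field and the
load_prog() helper are illustrative, not taken from the patch.

```python
class Sysctl:
    """Stand-in for the global net.core.bpf_jit_enable knob."""

    def __init__(self, bpf_jit_enable):
        self.bpf_jit_enable = bpf_jit_enable


class Prog:
    def __init__(self, sysctl):
        # Read the global exactly once, when the program is allocated.
        self.jit_requested = bool(sysctl.bpf_jit_enable)


def load_prog(sysctl, toggle_during_load=False):
    """Return the JIT decision seen by each stage of a program load."""
    prog = Prog(sysctl)
    decisions = []
    for stage in ("verifier", "blinding", "jit"):
        if toggle_during_load:
            # Simulate root flipping the knob while the load is in progress.
            sysctl.bpf_jit_enable = 0 if sysctl.bpf_jit_enable else 1
        # Every stage consults the per-program snapshot, not the global.
        decisions.append((stage, prog.jit_requested))
    return decisions


decisions = load_prog(Sysctl(1), toggle_during_load=True)
```

With the snapshot, all stages agree on whether the program is JITed even
if the knob changes between them.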

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2017-12-17 20:34:36 +01:00
Alexei Starovoitov 1ea47e01ad bpf: add support for bpf_call to interpreter
Though bpf_call is still the same call instruction and the
calling convention for 'bpf to bpf' and 'bpf to helper' is the same,
the interpreter has to operate on 'struct bpf_insn *'.
To distinguish these two cases add a kernel internal opcode and
mark call insns with it.
This opcode is seen by the interpreter only. JITs will never see it.
Also add a tiny bit of debug code to aid interpreter debugging.
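
A toy Python interpreter can illustrate why the two kinds of call need
different opcodes: a helper call jumps to native code, while a
bpf-to-bpf call continues interpreting at another instruction. The
opcode names, the helper table and run() are hypothetical and only
model the idea.

```python
CALL_HELPER = "call_helper"  # imm names a native helper
CALL_BPF = "call_bpf"        # hypothetical internal opcode: imm is a relative insn offset
ADD = "add"
EXIT = "exit"

HELPERS = {"double": lambda x: x * 2}  # stand-in for in-kernel helpers


def run(insns, pc=0, r1=0):
    """Interpret insns starting at index pc; r1 is the argument, r0 the result."""
    r0 = 0
    while True:
        op, imm = insns[pc]
        pc += 1
        if op == ADD:
            r0 += imm
        elif op == CALL_HELPER:
            # Helper call: leave the instruction stream and run native code.
            r0 = HELPERS[imm](r1)
        elif op == CALL_BPF:
            # bpf-to-bpf call: keep interpreting, at an offset from the next insn.
            r0 = run(insns, pc + imm, r1)
        elif op == EXIT:
            return r0


PROGRAM = [
    (CALL_BPF, 2),           # 0: call the function starting at index 3
    (ADD, 1),                # 1: r0 += 1
    (EXIT, 0),               # 2: return from main
    (CALL_HELPER, "double"), # 3: callee: r0 = double(r1)
    (EXIT, 0),               # 4: return from callee
]

result = run(PROGRAM, r1=5)
```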

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2017-12-17 20:34:36 +01:00
Alexei Starovoitov b0b04fc49e selftests/bpf: add xdp noinline test
add a large semi-artificial XDP test with 18 functions to stress test
the bpf call verification logic

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2017-12-17 20:34:36 +01:00
Alexei Starovoitov 3bc35c63cb selftests/bpf: add bpf_call test
strip always_inline from test_l4lb.c and compile it with -fno-inline
to let the verifier go through 11 functions with various function
arguments and return values

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2017-12-17 20:34:36 +01:00
Alexei Starovoitov 48cca7e44f libbpf: add support for bpf_call
- recognize the relocation emitted by llvm
- since all regular functions will be kept in the .text section and llvm
  takes care of pc-relative offsets in bpf_call instructions,
  simply copy all of .text to the relevant program section while adjusting
  bpf_call instructions in the program section to point to the newly copied
  body of instructions from .text
- do so for all programs in the elf file
- set all program types to the one passed to bpf_prog_load()
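
The copy-and-adjust step can be sketched in Python as follows. This is a
simplified model, not libbpf code: link_text() is a hypothetical name,
and it assumes each relocated call initially holds the target's
instruction index within .text.

```python
def link_text(prog_insns, text_insns, relocs):
    """Append .text to a program section and fix up calls into it.

    prog_insns, text_insns: lists of (op, imm) tuples.
    relocs: indices of calls in prog_insns that target .text.
    """
    text_start = len(prog_insns)
    linked = list(prog_insns) + list(text_insns)
    for idx in relocs:
        op, text_idx = linked[idx]
        # Make the call pc-relative, counted from the instruction after it.
        linked[idx] = (op, text_start + text_idx - (idx + 1))
    return linked


prog = [("call", 2), ("exit", 0)]  # calls the .text function at index 2
text = [
    ("mov", 7),    # .text index 0: first function
    ("exit", 0),
    ("call", -3),  # .text index 2: second function, calls the first
    ("exit", 0),
]

linked = link_text(prog, text, relocs=[0])
```

Calls inside .text stay untouched because their pc-relative offsets are
still valid after the section is copied as a whole.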

Note: for elf files with multiple programs that use different
functions in the .text section we need to do 'linker' style logic.
This work is still TBD.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2017-12-17 20:34:35 +01:00
Alexei Starovoitov d98588cef0 selftests/bpf: add tests for stack_zero tracking
adjust two tests, since the verifier got smarter,
and add a new one to test the stack_zero logic

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2017-12-17 20:34:35 +01:00