Граф коммитов

577019 Коммитов

Автор SHA1 Сообщение Дата
Jes Sorensen 4de2481919 rtl8xxxu: Fix incorrect test for auto LLT failure
The logic for testing auto load failure in rtl8xxxu_auto_llt_table()
was inverted.

Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2016-03-10 15:28:54 +02:00
Jes Sorensen a47b9d477c rtl8xxxu: Init the LLT after we start the firmware
To match the flow of the vendor driver, move the LLT init to after the
firmware is started.

Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2016-03-10 15:28:54 +02:00
Jes Sorensen 07bb46be53 rtl8xxxu: Init page boundaries before starting the firmware
This reorganizes the device initialization to init page boundaries
before starting the firmware. This matches the flow in the 8192eu
vendor driver.

Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2016-03-10 15:28:54 +02:00
Jes Sorensen 74b99bed87 rtl8xxxu: Add rtl8xxxu_auto_llt_table()
Newer chips can auto load the LLT table, it is no longer necessary to
build it manually in the driver.

Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2016-03-10 15:28:53 +02:00
Jes Sorensen c05a9dbfb2 rtl8xxxu: Implment rtl8192eu_power_on()
This implements the rtl8192eu power on sequence, and splits it off
from the rtl8192cu/rtl8723au power on sequence.

Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2016-03-10 15:28:53 +02:00
Jes Sorensen b001e086c4 rtl8xxxu: Add rtl8192eu_nic.bin to the MODULE_FIRMWARE list
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2016-03-10 15:28:53 +02:00
Jes Sorensen 4a82ffe3f8 rtl8xxxu: Use 1024 byte block loads for 8192eu firmware
The rtl8192eu can handle 1024 byte block writes, unlike it's
predecessors (8192cu/8188cu).

Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2016-03-10 15:28:53 +02:00
Jes Sorensen 0e5d435a61 rtl8xxxu: Identify chip vendors correctly
This identifies the chip vendors correctly and also picks the correct
firmware for rtl8192eu.

Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2016-03-10 15:28:52 +02:00
Jes Sorensen 3307d84024 rtl8xxxu: Add initial code to parse rtl8192eu efuse
This is the start of 8192eu support. For now just detect the device
and parse the efuse.

Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2016-03-10 15:28:52 +02:00
Eliad Peller 8cf77e176f wlcore/wl18xx: add radar_debug_mode handling
Add debugfs key (under CFG80211_CERTIFICATION_ONUS
configuration) to set/clear radar_debug_mode.
In this mode, the driver simply ignores radar
events (but prints them).

The fw is notified about this mode through
a special generic_cfg_feature command.

This mode is relevant only for ap mode. look for
it when initializing ap vif.

Signed-off-by: Eliad Peller <eliad@wizery.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2016-03-10 15:00:23 +02:00
Eliad Peller 87cba16960 wlcore: don't WARN_ON in case of existing ROC
When working with AP + P2P, it's possible to get into
a state when the AP is in ROC (due to assiciating station)
while trying to ROC on the P2P interface.

Replace the WARN_ON with wl1271_error to avoid warnings
in this case.

Signed-off-by: Eliad Peller <eliad@wizery.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2016-03-10 15:00:20 +02:00
Amitkumar Karwar 54f008497b mwifiex: Empty Tx queue during suspend
In cfg80211 suspend handler, stop the netif queue and
wait until all the Tx queues become empty. Start the
queues in resume handler.

Signed-off-by: Amitkumar Karwar <akarwar@marvell.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2016-03-10 14:58:39 +02:00
Hui Wang 107b871333 brcmfmac: Remove waitqueue_active check
We met a problem of pm_suspend  when repeated closing/opening the lid
on a Lenovo laptop (1/20 reproduce rate), below is the log:

[ 199.735876] PM: Entering mem sleep
[ 199.750516] e1000e: EEE TX LPI TIMER: 00000011
[ 199.856638] Trying to free nonexistent resource <000000000000d000-000000000000d0ff>
[ 201.753566] brcmfmac: brcmf_pcie_suspend: Timeout on response for entering D3 substate
[ 201.753581] pci_legacy_suspend(): brcmf_pcie_suspend+0x0/0x1f0 [brcmfmac] returns -5
[ 201.753585] dpm_run_callback(): pci_pm_suspend+0x0/0x160 returns -5
[ 201.753589] PM: Device 0000:04:00.0 failed to suspend async: error -5

Through debugging, we found when problem happens, it is not the device
fails to enter D3, but the signal D3_ACK comes too early to pass the
waitqueue_active() check.

Just like this:
brcmf_pcie_send_mb_data(devinfo, BRCMF_H2D_HOST_D3_INFORM);
// signal is triggered here
wait_event_timeout(devinfo->mbdata_resp_wait, devinfo->mbdata_completed,
		   BRCMF_PCIE_MBDATA_TIMEOUT);

So far I think it is safe to remove waitqueue_active check since there
is only one place to trigger this signal (sending
BRCMF_H2D_HOST_D3_INFORM). And it is not a problem calling wake_up
event earlier than calling wait_event.

Cc: Brett Rudley <brudley@broadcom.com>
Cc: Hante Meuleman <meuleman@broadcom.com>
Cc: Franky (Zhenhui) Lin <frankyl@broadcom.com>
Cc: Pieter-Paul Giesberts <pieterpg@broadcom.com>
Cc: Arend van Spriel <arend@broadcom.com>
Signed-off-by: Hui Wang <hui.wang@canonical.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2016-03-10 14:57:56 +02:00
Dan Carpenter 3691ac4a9c libertas: fix an error code in probe
We accidentally return success instead of a negative error code.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2016-03-10 14:56:46 +02:00
Kalle Valo 6ab882a8aa * update GSCAN capabilities (Ayala)
* fix AES-CMAC in AP mode (Johannes)
 * adapt prints to new firmware API
 * rx path improvements (Sara and Gregory)
 * fixes for the thermal / cooling device code (Chaya Rachel)
 * fixes for GO uAPSD handling
 * more code for the 9000 device family (Sara)
 * infrastructure work for firmware notification (Chaya Rachel)
 * improve association reliablity (Sara)
 * runtime PM fixes
 * fixes for ROC (HS2.0)
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJW4HQ5AAoJEC0Llv5uNjIBfHIP/j68hRW4reTJdYo0Q7MG1WPa
 47ir4d/+Yz7nnXj4RFY9s8Sv6irC8gsY1bFFTZCNXDTB/BeJjvszBMGVwHuUPq+w
 wg8zz8uhiAb8IlmPKnEPhuiwRrdcgqy443m06k89VNjz/kKv4/iRDQ+bL3P4O9En
 LrKz9md91k5svLOuLmUXMdP+6eXKip4LBVeCgIIt2lkMNW1jYDi1Af2YEXp7Ifad
 u5ow9n5aAe16QKTnMUtnShlCrdflYsBHzcxza9jKXO8hqxrHXNg9TmBNL0ZguPlQ
 AdqzFmEDeUzv7GDTE4Po/DrJT13sto5gn2v36XMXLDgwS26hSqoLX1amyZ7baQyh
 J986recmF5JGx/y67QE6KgSfpdDPI2E4a28d0Y4ZsFHL32SgaD0qCcOiiszjjrCD
 zFYDYTPZdU9Cz6nwP27wwze0FcCCB2pOvtW5oGZb0eW1hWNb0BQlSdYIjfup7P/y
 wzMIg6a5k7mGEbXpUIPpVriasTBLwNe3rmU413sm9vZkR5T5r4f3FYWHtg7dc3Rt
 jP2Nb78hMC0KH/+KPBuTPdgf8hrz8JBQIJaisnfEpYjqPE1auTAsNyVNlgssHtbC
 XA2l78tRe9lZNjrp6oJIon+XYGuPDzVX6geT1P9qYB7PnSI8C/rL96GVjjohqeko
 1nFwhlAk0WvWjr/0NHJx
 =j8++
 -----END PGP SIGNATURE-----

Merge tag 'iwlwifi-next-for-kalle-2016-03-09_2' of https://git.kernel.org/pub/scm/linux/kernel/git/iwlwifi/iwlwifi-next

* update GSCAN capabilities (Ayala)
* fix AES-CMAC in AP mode (Johannes)
* adapt prints to new firmware API
* rx path improvements (Sara and Gregory)
* fixes for the thermal / cooling device code (Chaya Rachel)
* fixes for GO uAPSD handling
* more code for the 9000 device family (Sara)
* infrastructure work for firmware notification (Chaya Rachel)
* improve association reliablity (Sara)
* runtime PM fixes
* fixes for ROC (HS2.0)
2016-03-10 14:53:35 +02:00
Ramesh Shanmugasundaram e481ab23c5 can: rcar_can: Add r8a7795 support
Added r8a7795 SoC support.

Signed-off-by: Ramesh Shanmugasundaram <ramesh.shanmugasundaram@bp.renesas.com>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2016-03-10 10:30:21 +01:00
Marek Vasut 6cc6426605 can: ifi: Add obscure bit swap for EFF frame IDs
In case of CAN2.0 EFF frame, the controller handles frame IDs in a
rather bizzare way. The ID is split into an extended part, IDX[28:11]
and standard part, ID[10:0]. In the TX path, the core first sends the
top 11 bits of the IDX, followed by ID and finally the rest of IDX.
In the RX path, the core stores the ID the LSbit part of IDX field,
followed by the LSbit parts of real IDX. The MSbit parts of IDX are
stored in ID field of the register.

This patch implements the necessary bit shuffling to mitigate this
obscure behavior. In case two of these controllers are connected
together, the RX and TX bit swapping nullifies itself and the issue
does not manifest. The issue only manifests when talking to another
different CAN controller.

Signed-off-by: Marek Vasut <marex@denx.de>
Cc: Marc Kleine-Budde <mkl@pengutronix.de>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Oliver Hartkopp <socketcan@hartkopp.net>
Cc: Wolfgang Grandegger <wg@grandegger.com>
Reviewed-by: Oliver Hartkopp <socketcan@hartkopp.net>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2016-03-10 10:19:09 +01:00
Marek Vasut 223654355c can: ifi: Fix RX and TX ID mask
The RX and TX ID mask for CAN2.0 is 11 bits wide. This patch fixes
the incorrect mask, which caused the CAN IDs to miss the MSBit both
on receive and transmit.

Signed-off-by: Marek Vasut <marex@denx.de>
Cc: Marc Kleine-Budde <mkl@pengutronix.de>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Oliver Hartkopp <socketcan@hartkopp.net>
Cc: Wolfgang Grandegger <wg@grandegger.com>
Reviewed-by: Oliver Hartkopp <socketcan@hartkopp.net>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2016-03-10 10:19:09 +01:00
Marek Vasut f1deaee0c3 can: ifi: Fix TX DLC configuration
The TX DLC, the transmission length information, was not written
into the transmit configuration register. When using the CAN core
with different CAN controller, the receiving CAN controller will
receive only the ID part of the CAN frame, but no data at all.

This patch adds the TX DLC into the register to fix this issue.

Signed-off-by: Marek Vasut <marex@denx.de>
Cc: Marc Kleine-Budde <mkl@pengutronix.de>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Oliver Hartkopp <socketcan@hartkopp.net>
Cc: Wolfgang Grandegger <wg@grandegger.com>
Reviewed-by: Oliver Hartkopp <socketcan@hartkopp.net>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2016-03-10 10:19:09 +01:00
Marek Vasut 99312c377f can: ifi: Fix clock generator configuration
The clock generation does not match reality when using the CAN IP
core outside of the FPGA design. This patch fixes the computation
of values which are programmed into the clock generator registers.

First, there are some off-by-one errors which manifest themselves
only when communicating with different controller, so those are
fixed.

Second, the bits in the clock generator registers have different
meaning depending on whether the core is in ISO CANFD mode or any
of the other modes (BOSCH CANFD or CAN2.0). Detect the ISO CANFD
mode and fix handling of this special case of clock configuration.

Finally, the CAN clock speed is in CANCLOCK register, not SYSCLOCK
register, so fix this as well.

Signed-off-by: Marek Vasut <marex@denx.de>
Cc: Marc Kleine-Budde <mkl@pengutronix.de>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Oliver Hartkopp <socketcan@hartkopp.net>
Cc: Wolfgang Grandegger <wg@grandegger.com>
Reviewed-by: Oliver Hartkopp <socketcan@hartkopp.net>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2016-03-10 10:19:09 +01:00
Alexei Starovoitov cdc4e47da8 bpf: avoid copying junk bytes in bpf_get_current_comm()
Lots of places in the kernel use memcpy(buf, comm, TASK_COMM_LEN); but
the result is typically passed to print("%s", buf) and extra bytes
after zero don't cause any harm.
In bpf the result of bpf_get_current_comm() is used as the part of
map key and was causing spurious hash map mismatches.
Use strlcpy() to guarantee zero-terminated string.
bpf verifier checks that output buffer is zero-initialized,
so even for short task names the output buffer don't have junk bytes.
Note it's not a security concern, since kprobe+bpf is root only.

Fixes: ffeedafbf0 ("bpf: introduce current->pid, tgid, uid, gid, comm accessors")
Reported-by: Tobias Waldekranz <tobias@waldekranz.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-09 23:27:30 -05:00
Alexei Starovoitov b8cdc05173 bpf: bpf_stackmap_copy depends on CONFIG_PERF_EVENTS
0-day bot reported build error:
kernel/built-in.o: In function `map_lookup_elem':
>> kernel/bpf/.tmp_syscall.o:(.text+0x329b3c): undefined reference to `bpf_stackmap_copy'
when CONFIG_BPF_SYSCALL is set and CONFIG_PERF_EVENTS is not.
Add weak definition to resolve it.
This code path in map_lookup_elem() is never taken
when CONFIG_PERF_EVENTS is not set.

Fixes: 557c0c6e7d ("bpf: convert stackmap to pre-allocation")
Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-09 23:26:51 -05:00
David S. Miller 3b8377dca1 Merge branch 'variable-length-ll-headers'
Willem de Bruijn says:

====================
net: validate variable length ll headers

Allow device-specific validation of link layer headers. Existing
checks drop all packets shorter than hard_header_len. For variable
length protocols, such packets can be valid.

patch 1 adds header_ops.validate and dev_validate_header
patch 2 implements the protocol specific callback for AX25
patch 3 replaces ll_header_truncated with dev_validate_header
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-09 22:13:01 -05:00
Willem de Bruijn 9ed988cd59 packet: validate variable length ll headers
Replace link layer header validation check ll_header_truncate with
more generic dev_validate_header.

Validation based on hard_header_len incorrectly drops valid packets
in variable length protocols, such as AX25. dev_validate_header
calls header_ops.validate for such protocols to ensure correctness
below hard_header_len.

See also http://comments.gmane.org/gmane.linux.network/401064

Fixes 9c7077622d ("packet: make packet_snd fail on len smaller than l2 header")
Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-09 22:13:01 -05:00
Willem de Bruijn ea47781c26 ax25: add link layer header validation function
As variable length protocol, AX25 fails link layer header validation
tests based on a minimum length. header_ops.validate allows protocols
to validate headers that are shorter than hard_header_len. Implement
this callback for AX25.

See also http://comments.gmane.org/gmane.linux.network/401064

Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-09 22:13:01 -05:00
Willem de Bruijn 2793a23aac net: validate variable length ll headers
Netdevice parameter hard_header_len is variously interpreted both as
an upper and lower bound on link layer header length. The field is
used as upper bound when reserving room at allocation, as lower bound
when validating user input in PF_PACKET.

Clarify the definition to be maximum header length. For validation
of untrusted headers, add an optional validate member to header_ops.

Allow bypassing of validation by passing CAP_SYS_RAWIO, for instance
for deliberate testing of corrupt input. In this case, pad trailing
bytes, as some device drivers expect completely initialized headers.

See also http://comments.gmane.org/gmane.linux.network/401064

Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-09 22:13:01 -05:00
Jean Delvare 87aca73737 NFC: microread: Drop platform data header file
Originally I only wanted to drop the unneeded inclusion of
<linux/i2c.h>, but then noticed that struct
microread_nfc_platform_data isn't actually used, and
MICROREAD_DRIVER_NAME is redefined in the only file where it is used,
so we can get rid of the header file and dead code altogether.

Signed-off-by: Jean Delvare <jdelvare@suse.de>
Cc: Lauro Ramos Venancio <lauro.venancio@openbossa.org>
Cc: Aloisio Almeida Jr <aloisio.almeida@openbossa.org>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2016-03-09 23:25:45 +01:00
David S. Miller 9531ab65f4 Merge branch 'kcm'
Tom Herbert says:

====================
kcm: Kernel Connection Multiplexor (KCM)

Kernel Connection Multiplexor (KCM) is a facility that provides a
message based interface over TCP for generic application protocols.
The motivation for this is based on the observation that although
TCP is byte stream transport protocol with no concept of message
boundaries, a common use case is to implement a framed application
layer protocol running over TCP. To date, most TCP stacks offer
byte stream API for applications, which places the burden of message
delineation, message I/O operation atomicity, and load balancing
in the application. With KCM an application can efficiently send
and receive application protocol messages over TCP using a
datagram interface.

In order to delineate message in a TCP stream for receive in KCM, the
kernel implements a message parser. For this we chose to employ BPF
which is applied to the TCP stream. BPF code parses application layer
messages and returns a message length. Nearly all binary application
protocols are parsable in this manner, so KCM should be applicable
across a wide range of applications. Other than message length
determination in receive, KCM does not require any other application
specific awareness. KCM does not implement any other application
protocol semantics-- these are are provided in userspace or could be
implemented in a kernel module layered above KCM.

KCM implements an NxM multiplexor in the kernel as diagrammed below:

+------------+   +------------+   +------------+   +------------+
| KCM socket |   | KCM socket |   | KCM socket |   | KCM socket |
+------------+   +------------+   +------------+   +------------+
      |                 |               |                |
      +-----------+     |               |     +----------+
                  |     |               |     |
               +----------------------------------+
               |           Multiplexor            |
               +----------------------------------+
                 |   |           |           |  |
       +---------+   |           |           |  ------------+
       |             |           |           |              |
+----------+  +----------+  +----------+  +----------+ +----------+
|  Psock   |  |  Psock   |  |  Psock   |  |  Psock   | |  Psock   |
+----------+  +----------+  +----------+  +----------+ +----------+
      |              |           |            |             |
+----------+  +----------+  +----------+  +----------+ +----------+
| TCP sock |  | TCP sock |  | TCP sock |  | TCP sock | | TCP sock |
+----------+  +----------+  +----------+  +----------+ +----------+

The KCM sockets provide the datagram interface to applications,
Psocks are the state for each attached TCP connection (i.e. where
message delineation is performed on receive).

A description of the APIs and design can be found in the included
Documentation/networking/kcm.txt.

In this patch set:

  - Add MSG_BATCH flag. This is used in sendmsg msg_hdr flags to
    indicate that more messages will be sent on the socket. The stack
    may batch messages up if it is beneficial for transmission.
  - In sendmmsg, set MSG_BATCH in all sub messages except for the last
    one.
  - In order to allow sendmmsg to contain multiple messages with
    SOCK_SEQPAKET we allow each msg_hdr in the sendmmsg to set MSG_EOR.
  - Add KCM module
    - This supports SOCK_DGRAM and SOCK_SEQPACKET.
  - KCM documentation

v2:
  - Added splice and page operations.
  - Assemble receive messages in place on TCP socket (don't have a
    separate assembly queue.
  - Based on above, enforce maxmimum receive message to be the size
    of the recceive socket buffer.
  - Support message assembly timeout. Use the timeout value in
    sk_rcvtimeo on the TCP socket.
  - Tested some with a couple of other production applications,
    see ~5% improvement in application latency.

Testing:

Dave Watson has integrated KCM into Thrift and we intend to put these
changes into open source. Example of this is in:

https://github.com/djwatson/fbthrift/commit/
dd7e0f9cf4e80912fdb90f6cd394db24e61a14cc

Some initial KCM Thrift benchmark numbers (comment from Dave)

Thrift by default ties a single connection to a single thread.  KCM is
instead able to load balance multiple connections across multiple epoll
loops easily.

A test sending ~5k bytes of data to a kcm thrift server, dropping the
bytes on recv:

QPS     Latency / std dev Latency
  without KCM
    70336     209/123
  with KCM
    70353     191/124

A test sending a small request, then doing work in the epoll thread,
before serving more requests:

QPS     Latency / std dev Latency
without KCM
    14282     559/602
with KCM
    23192     344/234

At the high end, there's definitely some additional kernel overhead:

Cranking the pipelining way up, with lots of small requests

QPS     Latency / std dev Latency
without KCM
   1863429     127/119
with KCM
   1337713     192/241

---

So for a "realistic" workload, KCM performs pretty well (second case).
Under extreme conditions of highest tps we still have some work to do.
In its nature a multiplexor will spread work between CPUs which is
logically good for load balancing but coan conflict with the goal
promoting affinity. Batching messages on both send and receive are
the means to recoup performance.

Future support:

 - Integration with TLS (TLS-in-kernel is a separate initiative).
 - Page operations/splice support
 - Unconnected KCM sockets. Will be able to attach sockets to different
   destinations, AF_KCM addresses with be used in sendmsg and recvmsg
   to indicate destination
 - Explore more utility in performing BPF inline with a TCP data stream
   (setting SO_MARK, rxhash for messages being sent received on
   KCM sockets).
 - Performance work
   - Diagnose performance issues under high message load

FAQ (Questions posted on LWN)

Q: Why do this in the kernel?

A: Because the kernel is good at scheduling threads and steering packets
   to threads. KCM fits well into this model since it allows the unit
   of work for scheduling and steering to be the application layer
   messages themselves. KCM should be thought of as generic application
   protocol acceleration. It to the philosophy that the kernel provides
   generic and extensible interfaces.

Q: How can adding code in the path yield better performance?

A: It is true that for just sending receiving a single message there
   would be some performance loss since the code path is longer (for
   instance comparing netperf to KCM). But for real production
   applications performance takes on many dynamics. Parallelism, context
   switching, affinity, granularity of locking, and load balancing are
   all relevant. The theory of KCM is that by an application-centric
   interface, the kernel can provide better support for these
   performance characteristics.

Q: Why not use an existing message-oriented protocol such as RUDP,
   DCCP, SCTP, RDS, and others?

A: Because that would entail using a completely new transport protocol.
   Deploying a new protocol at scale is either a huge undertaking or
   fundamentally infeasible. This is true in either the Internet and in
   the data center due in a large part to protocol ossification.
   Besides, KCM we want KCM to work existing, well deployed application
   protocols that we couldn't change even if we wanted to (e.g. http/2).

   KCM simply defines a new interface method, it does not redefine any
   aspect of the transport protocol nor application protocol, nor set
   any new requirements on these. Neither does KCM attempt to implement
   any application protocol logic other than message deliniation in the
   stream. These are fundamental requirement of KCM.

Q: How does this affect TCP?

A: It doesn't, not in the slightest. The use of KCM can be one-sided,
   KCM has no effect on the wire.

Q: Why force TCP into doing something it's not designed for?

A: TCP is defined as transport protocol and there is no standard that
   says the API into TCP must be stream based sockets, or for that
   matter sockets at all (or even that TCP needs to be implemented in a
   kernel). KCM is not inconsistent with the design of TCP just because
   to makes an message based interface over TCP, if it were then every
   application protocol sending messages over TCP would also be! :-)

Q: What about the problem of a connections with very slow rate of
   incoming data? As a result your application can get storms of very
   short reads. And it actually happens a lot with connection from
   mobile devices and it is a problem for servers handling a lot of
   connections.

A: The storm of short reads will occur regardless of whether KCM is used
   or not. KCM does have one advantage in this scenario though, it will
   only wake up the application when a full message has been received,
   not for each packet that makes up part of a bigger messages. If a
   bunch of small messages are received, the application can receive
   messages in batches using recvmmsg.

Q: Why not just use DPDK, or at least provide KCM like functionality in
   DPDK?

A: DPDK, or more generally OS bypass presumably with a TCP stack in
   userland, presents a different model of load balancing than that of
   KCM (and the kernel). KCM implements load balancing of messages
   across the threads of an application, whereas DPDK load balances
   based on queues which are more static and coarse-grained since
   multiple connections are bound to queues. DPDK works best when
   processing of packets is silo'ed in a thread on the CPU processing
   a queue, and packet processing (for both the stack and application)
   is fairly uniform. KCM works well for applications where the amount
   of work to process messages varies an application work is commonly
   delegated to worker threads often on different CPUs.

   The message based interface over TCP is something that could be
   provide by a DPDK or OS bypass library.

Q: I'm not quite seeing this for HTTP. Maybe for HTTP/2, I guess, or web
   sockets?

A: Yes. KCM is most appropriate for message based protocols over TCP
   where is easy to deduce the message length (e.g. a length field)
   and the protocol implements its own message ordering semantics.
   Fortunately this encompasses many modern protocols.

Q: How is memory limited and controlled?

A: In v2 all data for messages is now kept in socket buffers, either
   those for TCP or KCM, so socket buffer limits are applicable.
   This includes receive messages assembly which is now done ont teh
   TCP socket buffer instead of a separate queue-- this has the
   consequence that the TCP socket buffer limit provides an
   enforceable maxmimum message size.

   Additionally, a timeout may be set for messages assembly. The
   value used for this is taken from sk_rcvtimeo of the TCP socket.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-09 16:36:16 -05:00
Tom Herbert 10016594f4 kcm: Add description in Documentation
Add kcm.txt to desribe KCM and interfaces.

Signed-off-by: Tom Herbert <tom@herbertland.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-09 16:36:16 -05:00
Tom Herbert 29152a34f7 kcm: Add receive message timeout
This patch adds receive timeout for message assembly on the attached TCP
sockets. The timeout is set when a new messages is started and the whole
message has not been received by TCP (not in the receive queue). If the
completely message is subsequently received the timer is cancelled, if the
timer expires the RX side is aborted.

The timeout value is taken from the socket timeout (SO_RCVTIMEO) that is
set on a TCP socket (i.e. set by get sockopt before attaching a TCP socket
to KCM.

Signed-off-by: Tom Herbert <tom@herbertland.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-09 16:36:15 -05:00
Tom Herbert 7ced95ef52 kcm: Add memory limit for receive message construction
Message assembly is performed on the TCP socket. This is logically
equivalent of an application that performs a peek on the socket to find
out how much memory is needed for a receive buffer. The receive socket
buffer also provides the maximum message size which is checked.

The receive algorithm is something like:

   1) Receive the first skbuf for a message (or skbufs if multiple are
      needed to determine message length).
   2) Check the message length against the number of bytes in the TCP
      receive queue (tcp_inq()).
	- If all the bytes of the message are in the queue (incluing the
	  skbuf received), then proceed with message assembly (it should
	  complete with the tcp_read_sock)
        - Else, mark the psock with the number of bytes needed to
	  complete the message.
   3) In TCP data ready function, if the psock indicates that we are
      waiting for the rest of the bytes of a messages, check the number
      of queued bytes against that.
        - If there are still not enough bytes for the message, just
	  return
        - Else, clear the waiting bytes and proceed to receive the
	  skbufs.  The message should now be received in one
	  tcp_read_sock

Signed-off-by: Tom Herbert <tom@herbertland.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-09 16:36:15 -05:00
Tom Herbert f29698fc6b kcm: Sendpage support
Implement kcm_sendpage. Set in sendpage to kcm_sendpage in both
dgram and seqpacket ops.

Signed-off-by: Tom Herbert <tom@herbertland.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-09 16:36:15 -05:00
Tom Herbert 91687355b9 kcm: Splice support
Implement kcm_splice_read. This is supported only for seqpacket.
Add kcm_seqpacket_ops and set splice read to kcm_splice_read.

Signed-off-by: Tom Herbert <tom@herbertland.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-09 16:36:15 -05:00
Tom Herbert cd6e111bf5 kcm: Add statistics and proc interfaces
This patch adds various counters for KCM. These include counters for
messages and bytes received or sent, as well as counters for number of
attached/unattached TCP sockets and other error or edge events.

The statistics are exposed via a proc interface. /proc/net/kcm provides
statistics per KCM socket and per psock (attached TCP sockets).
/proc/net/kcm_stats provides aggregate statistics.

Signed-off-by: Tom Herbert <tom@herbertland.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-09 16:36:14 -05:00
Tom Herbert ab7ac4eb98 kcm: Kernel Connection Multiplexor module
This module implements the Kernel Connection Multiplexor.

Kernel Connection Multiplexor (KCM) is a facility that provides a
message based interface over TCP for generic application protocols.
With KCM an application can efficiently send and receive application
protocol messages over TCP using datagram sockets.

For more information see the included Documentation/networking/kcm.txt

Signed-off-by: Tom Herbert <tom@herbertland.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-09 16:36:14 -05:00
Tom Herbert 473bd239b8 tcp: Add tcp_inq to get available receive bytes on socket
Create a common kernel function to get the number of bytes available
on a TCP socket. This is based on code in INQ getsockopt and we now call
the function for that getsockopt.

Signed-off-by: Tom Herbert <tom@herbertland.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-09 16:36:14 -05:00
Tom Herbert fa9835e52e net: Walk fragments in __skb_splice_bits
Add walking of fragments in __skb_splice_bits.

Signed-off-by: Tom Herbert <tom@herbertland.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-09 16:36:14 -05:00
Tom Herbert f092276d85 net: Add MSG_BATCH flag
Add a new msg flag called MSG_BATCH. This flag is used in sendmsg to
indicate that more messages will follow (i.e. a batch of messages is
being sent). This is similar to MSG_MORE except that the following
messages are not merged into one packet, they are sent individually.
sendmmsg is updated so that each contained message except for the
last one is marked as MSG_BATCH.

MSG_BATCH is a performance optimization in cases where a socket
implementation can benefit by transmitting packets in a batch.

Signed-off-by: Tom Herbert <tom@herbertland.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-09 16:36:13 -05:00
Tom Herbert 28a94d8fb3 net: Allow MSG_EOR in each msghdr of sendmmsg
This patch allows setting MSG_EOR in each individual msghdr passed
in sendmmsg. This allows a sendmmsg to send multiple messages when
using SOCK_SEQPACKET.

Signed-off-by: Tom Herbert <tom@herbertland.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-09 16:36:13 -05:00
Tom Herbert f4a00aacdb net: Make sock_alloc exportable
Export it for cases where we want to create sockets by hand.

Signed-off-by: Tom Herbert <tom@herbertland.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-09 16:36:13 -05:00
Tom Herbert ff3c44e675 rcu: Add list_next_or_null_rcu
This is a convenience function that returns the next entry in an RCU
list or NULL if at the end of the list.

Signed-off-by: Tom Herbert <tom@herbertland.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-09 16:36:13 -05:00
Ayala Beker 5ed47226e0 iwlwifi: mvm: update GSCAN capabilities
Gscan capabilities were updated with new capabilities supported
by the device. While at it, simplify the firmware support
conditional and move both conditions into the WARN() to make it
easier to undertand and use the unlikely() for both.

Signed-off-by: Ayala Beker <ayala.beker@intel.com>
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
2016-03-09 21:05:17 +02:00
Johannes Berg 81279c49ce iwlwifi: mvm: don't try to offload AES-CMAC in AP/IBSS modes
The firmware/hardware only supports checking AES-CMAC on RX, not
using it on TX. For station mode this is fine, since it's the only
thing it will ever do. For AP mode, it never receives such frames,
but must be able to transmit them. This is currently broken since
we try to enable them for hardware crypto (for RX only) and then
treat them as TX_CMD_SEC_EXT, leading to FIFO underruns during TX
so the frames never go out to the air.

To fix this, simply use software on TX in AP (and IBSS) mode.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
2016-03-09 21:05:17 +02:00
Emmanuel Grumbach 7d3ca7f4b1 iwlwifi: mvm: adapt the firmware assert log to new firmware
Newer firmware versions put different data in the memory
which is read by the driver upon firmware crash. Just
change the variable names in the code and the name of the
data in the log that we print withouth any functional
change.
On older firmware, there will be a mismatch between the
names that are printed and the content itself, but that's
harmless.

Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
2016-03-09 21:05:16 +02:00
Gregory Greenman e0e168dc8c iwlwifi: pcie: avoid restocks inside rx loop if not emergency
When trying to reach high Rx throughput of more than 500Mbps on
a device with a relatively weak CPU (Atom x5-Z8500), CPU utilization
may become a bottleneck. Analysis showed that we are looping in
iwl_pcie_rx_handle for very long periods which led to starvation
of other threads (iwl_pcie_rx_handle runs with _bh disabled).
We were handling Rx and allocating new buffers and the new buffers
were ready quickly enough to be available before we had finished
handling all the buffers available in the hardware. As a
consequence, we called iwl_pcie_rxq_restock to refill the hardware
with the new buffers, and start again handling new buffers without
exiting the function. Since we read the hardware pointer again when
we goto restart, new buffers were handled immediately instead of
exiting the function.

This patch avoids refilling RBs inside rx handling loop, unless an
emergency situation is reached. It also doesn't read the hardware
pointer again unless we are in an emergency (unlikely) case.
This significantly reduce the maximal time we spend in
iwl_pcie_rx_handle with _bh disabled.

Signed-off-by: Gregory Greenman <gregory.greenman@intel.com>
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
2016-03-09 21:05:16 +02:00
Chaya Rachel Ivgi b358993b3f iwlwifi: mvm: return the cooling state index instead of the budget
iwl_mvm_tcool_get_cur_state is the function that returns the
cooling state index to the sysfs handler. This function returns
mvm->cooling_dev.cur_state but that variable was set to the
budget and not the cooling state index. Fix that.
Add a missing blank line while at it.

Signed-off-by: Chaya Rachel Ivgi <chaya.rachel.ivgi@intel.com>
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
2016-03-09 21:05:13 +02:00
Emmanuel Grumbach 416cb2467b iwlwifi: mvm: remove RRM advertisement
mac80211 advertises this feature for all its drivers.

Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
2016-03-09 20:59:20 +02:00
Emmanuel Grumbach 532beba378 iwlwifi: mvm: don't let NDPs mess the packet tracking
We need to track the next packet that we will reclaim in
order to know when the Tx queues are empty. This is useful
when we open or tear down an A-MPDU session which requires
to switch queue.
The next packet being reclaimed is identified by its WiFi
sequence number and this is relevant only when we use QoS.
QoS NDPs do have a TID but have a meaningless sequence
number. The spec mandates the receiver to ignore the
sequence number in this case, allowing the transmitter to
put any sequence number. Our implementation leaves it 0.
When we reclaim a QoS NDP, we can't update the next_relcaim
counter since the sequence number of the QoS NDP itself is
invalid.
We used to update the next_reclaim based on the sequence
number of the QoS NDP which reset it to 1 (0 + 1) and
because of this, we never knew when the queue got empty.
This had to sad consequence to stuck the A-MPDU state
machine in a transient state.
To fix this, don't update next_reclaim when we reclaim
a QoS NDP.

Alesya saw this bug when testing u-APSD. Because the
A-MPDU state machine was stuck in EMPTYING_DELBA, we
updated mac80211 that we still have frames for that
station when it got back to sleep. mac80211 then wrongly
set the TIM bit in the beacon and requested to release
non-existent frames from the A-MPDU queue. This led to
a situation where the client was trying to poll frames
but we had no frames to send.

Reported-by: Alesya Shapira <alesya.shapira@intel.com>
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
2016-03-09 20:59:20 +02:00
Sara Sharon 17c867bfe8 iwlwifi: add support for getting HW address from CSR
From 9000 family on, we need to get HW address from host
CSR registers.
OEM can override it by fusing the override registers - read
those first, and if those are 0 - read the OTP registers instead.

In addition - bail out if no valid mac address is present. Make
it shared for all NICs.

Signed-off-by: Sara Sharon <sara.sharon@intel.com>
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
2016-03-09 20:59:19 +02:00
Sara Sharon 7b5424361e iwlwifi: pcie: fine tune number of rxbs
We kick the allocator when we have 2 RBDs that don't have
attached RBs, and the allocator allocates 8 RBs meaning
that it needs another 6 RBDs to attach the RBs to.
The design is that allocator should always have enough RBDs
to fulfill requests, so we give in advance 6 RBDs to the
allocator so that when it is kicked, it gets additional 2 RBDs
and has enough RBDs.
These RBDs were taken from the Rx queue itself, meaning
that each Rx queue didn't have the maximal number of
RBDs, but MAX - 6.
Change initial number of RBDs in the system to include both
queue size and allocator reserves.
Note the multi-queue is always 511 instead of 512 to avoid a
full queue since we cannot detect this state easily enough in
the 9000 arch.

Signed-off-by: Sara Sharon <sara.sharon@intel.com>
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
2016-03-09 20:59:19 +02:00