WSL2-Linux-Kernel

Граф коммитов

Автор	SHA1	Сообщение	Дата
Jakub Kicinski	82192cb497	Merge branch 'ena-capabilities-field-and-cosmetic-changes' Arthur Kiyanovski says: ==================== ENA: capabilities field and cosmetic changes Add a new capabilities bitmask field to get indication of capabilities supported by the device. Use the capabilities field to query the device for ENI stats support. Other patches are cosmetic changes like fixing readme mistakes, removing unused variables etc... ==================== Link: https://lore.kernel.org/r/20220107202346.3522-1-akiyano@amazon.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-01-07 19:25:58 -08:00
Arthur Kiyanovski	9fe890cc5b	net: ena: Extract recurring driver reset code into a function Create an inline function for resetting the driver to reduce code duplication. Signed-off-by: Nati Koler <nkoler@amazon.com> Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-01-07 19:25:52 -08:00
Arthur Kiyanovski	d0e8831d6c	net: ena: Change the name of bad_csum variable Changed bad_csum to csum_bad to align with csum_unchecked & csum_good Signed-off-by: Nati Koler <nkoler@amazon.com> Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-01-07 19:25:51 -08:00
Arthur Kiyanovski	9b648bb1d8	net: ena: Add debug prints for invalid req_id resets Add qid and req_id to error prints when ENA_REGS_RESET_INV_TX_REQ_ID reset occurs. Switch from %hu to %u, since u16 should be printed with %u, as explained in [1]. [1] - https://www.kernel.org/doc/html/latest/core-api/printk-formats.html Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-01-07 19:25:51 -08:00
Arthur Kiyanovski	c215941aba	net: ena: Remove ena_calc_queue_size_ctx struct This struct was used to pass data from callee function to its caller. Its usage can be avoided. Removing it results in less code without any damage to code readability. Also it allows to consolidate ring size calculation into a single function (ena_calc_io_queue_size()). Signed-off-by: Shay Agroskin <shayagr@amazon.com> Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-01-07 19:25:51 -08:00
Arthur Kiyanovski	e344546980	net: ena: Move reset completion print to the reset function The print that indicates that device reset has finished is currently called from ena_restore_device(). Move it to ena_fw_reset_device() as it is the more natural location for it. Signed-off-by: Shay Agroskin <shayagr@amazon.com> Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-01-07 19:25:51 -08:00
Arthur Kiyanovski	09f8676eae	net: ena: Remove redundant return code check The ena_com_indirect_table_fill_entry() function only returns -EINVAL or 0, no need to check for -EOPNOTSUPP. Signed-off-by: Shay Agroskin <shayagr@amazon.com> Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-01-07 19:25:50 -08:00
Arthur Kiyanovski	273a2397fc	net: ena: Update LLQ header length in ena documentation LLQ entry length is 128 bytes. Therefore the maximum header in the entry is calculated by: tx_max_header_size = LLQ_ENTRY_SIZE - DESCRIPTORS_NUM_BEFORE_HEADER * 16 = 128 - 2 * 16 = 96 This patch updates the documentation so that it states the correct max header length. Signed-off-by: Shay Agroskin <shayagr@amazon.com> Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-01-07 19:25:50 -08:00
Arthur Kiyanovski	394c48e08b	net: ena: Change ENI stats support check to use capabilities field Use the capabilities field to query the device for ENI stats support. This replaces the previous method that tried to get the ENI stats during ena_probe() and used the success or failure as an indication for support by the device. Remove eni_stats_supported field from struct ena_adapter. This field was used for the previous method of queriying for ENI stats support. Change the severity level of the print in case of ena_com_get_eni_stats() failure from info to error. With the previous method of querying form ENI stats support, failure to get ENI stats was normal for devices that don't support it. With the use of the capabilities field such a failure is unexpected, as it is called only if the device reported that it supports ENI stats. Signed-off-by: Shay Agroskin <shayagr@amazon.com> Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-01-07 19:25:50 -08:00
Arthur Kiyanovski	a2d5d6a70f	net: ena: Add capabilities field with support for ENI stats capability This bitmask field indicates what capabilities are supported by the device. The capabilities field differs from the 'supported_features' field which indicates what sub-commands for the set/get feature commands are supported. The sub-commands are specified in the 'feature_id' field of the 'ena_admin_set_feat_cmd' struct in the following way: struct ena_admin_set_feat_cmd cmd; cmd.aq_common_descriptor.opcode = ENA_ADMIN_SET_FEATURE; cmd.feat_common.feature_ The 'capabilities' field, on the other hand, specifies different capabilities of the device. For example, whether the device supports querying of ENI stats. Also add an enumerator which contains all the capabilities. The first added capability macro is for ENI stats feature. Capabilities are queried along with the other device attributes (in ena_com_get_dev_attr_feat()) during device initialization and are stored in the ena_com_dev struct. They can be later queried using the ena_com_get_cap() helper function. Signed-off-by: Shay Agroskin <shayagr@amazon.com> Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-01-07 19:25:50 -08:00
Arthur Kiyanovski	7dcf922152	net: ena: Change return value of ena_calc_io_queue_size() to void ena_calc_io_queue_size() always returns 0, therefore make it a void function and update the calling function to stop checking the return value. Signed-off-by: Shay Agroskin <shayagr@amazon.com> Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-01-07 19:25:50 -08:00
Eric Dumazet	bf44077c1b	af_packet: fix tracking issues in packet_do_bind() It appears that my changes in packet_do_bind() were slightly wrong. syzbot found that calling bind() twice would trigger a false positive. Remove proto_curr/dev_curr variables and rewrite things to be less confusing (like not having to use netdev_tracker_alloc(), and instead use the standard dev_hold_track()) Fixes: `f1d9268e06` ("net: add net device refcount tracker to struct packet_type") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Link: https://lore.kernel.org/r/20220107183953.3886647-1-eric.dumazet@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-01-07 19:11:55 -08:00
Sunil Goutham	6dc9a23e29	octeontx2-af: Fix interrupt name strings Fixed interrupt name string logic which currently results in wrong memory location being accessed while dumping /proc/interrupts. Fixes: `4826090719` ("octeontx2-af: Enable CPT HW interrupts") Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com> Link: https://lore.kernel.org/r/1641538505-28367-1-git-send-email-sbhatta@marvell.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-01-07 19:07:06 -08:00
Jakub Kicinski	d8caa2ed47	Merge branch 'mptcp-refactoring-for-one-selftest-and-csum-validation' Mat Martineau says: ==================== mptcp: Refactoring for one selftest and csum validation Patch 1 changes the MPTCP join self tests to depend more on events rather than delays, so the script runs faster and has more consistent results. Patches 2 and 3 get rid of some duplicate code in MPTCP's checksum validation by modifying and leveraging an existing helper function. ==================== Link: https://lore.kernel.org/r/20220107192524.445137-1-mathew.j.martineau@linux.intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-01-07 19:00:46 -08:00
Geliang Tang	8401e87f5a	mptcp: reuse __mptcp_make_csum in validate_data_csum This patch reused __mptcp_make_csum() in validate_data_csum() instead of open-coding. Signed-off-by: Geliang Tang <geliang.tang@suse.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-01-07 19:00:44 -08:00
Geliang Tang	c312ee2191	mptcp: change the parameter of __mptcp_make_csum This patch changed the type of the last parameter of __mptcp_make_csum() from __sum16 to __wsum. And export this function in protocol.h. Signed-off-by: Geliang Tang <geliang.tang@suse.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-01-07 19:00:44 -08:00
Paolo Abeni	327b9a94e2	selftests: mptcp: more stable join tests-cases MPTCP join self-tests are a bit fragile as they reply on delays instead of events to catch-up with the expected sockets states. Replace the delay with state checking where possible and reduce the number of sleeps in the most complex scenarios. This will both reduce the tests run-time and will improve stability. Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-01-07 19:00:43 -08:00
Vladimir Oltean	5cad43a52e	net: dsa: felix: add port fast age support Add support for flushing the MAC table on a given port in the ocelot switch library, and use this functionality in the felix DSA driver. This operation is needed when a port leaves a bridge to become standalone, and when the learning is disabled, and when the STP state changes to a state where no FDB entry should be present. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Link: https://lore.kernel.org/r/20220107144229.244584-1-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-01-07 18:58:25 -08:00
Vladimir Oltean	a14e6b69f3	net: mscc: ocelot: fix incorrect balancing with down LAG ports Assuming the test setup described here: https://patchwork.kernel.org/project/netdevbpf/cover/20210205130240.4072854-1-vladimir.oltean@nxp.com/ (swp1 and swp2 are in bond0, and bond0 is in a bridge with swp0) it can be seen that when swp1 goes down (on either board A or B), then traffic that should go through that port isn't forwarded anywhere. A dump of the PGID table shows the following: PGID_DST[0] = ports 0 PGID_DST[1] = ports 1 PGID_DST[2] = ports 2 PGID_DST[3] = ports 3 PGID_DST[4] = ports 4 PGID_DST[5] = ports 5 PGID_DST[6] = no ports PGID_AGGR[0] = ports 0, 1, 2, 3, 4, 5 PGID_AGGR[1] = ports 0, 1, 2, 3, 4, 5 PGID_AGGR[2] = ports 0, 1, 2, 3, 4, 5 PGID_AGGR[3] = ports 0, 1, 2, 3, 4, 5 PGID_AGGR[4] = ports 0, 1, 2, 3, 4, 5 PGID_AGGR[5] = ports 0, 1, 2, 3, 4, 5 PGID_AGGR[6] = ports 0, 1, 2, 3, 4, 5 PGID_AGGR[7] = ports 0, 1, 2, 3, 4, 5 PGID_AGGR[8] = ports 0, 1, 2, 3, 4, 5 PGID_AGGR[9] = ports 0, 1, 2, 3, 4, 5 PGID_AGGR[10] = ports 0, 1, 2, 3, 4, 5 PGID_AGGR[11] = ports 0, 1, 2, 3, 4, 5 PGID_AGGR[12] = ports 0, 1, 2, 3, 4, 5 PGID_AGGR[13] = ports 0, 1, 2, 3, 4, 5 PGID_AGGR[14] = ports 0, 1, 2, 3, 4, 5 PGID_AGGR[15] = ports 0, 1, 2, 3, 4, 5 PGID_SRC[0] = ports 1, 2 PGID_SRC[1] = ports 0 PGID_SRC[2] = ports 0 PGID_SRC[3] = no ports PGID_SRC[4] = no ports PGID_SRC[5] = no ports PGID_SRC[6] = ports 0, 1, 2, 3, 4, 5 Whereas a "good" PGID configuration for that setup should have looked like this: PGID_DST[0] = ports 0 PGID_DST[1] = ports 1, 2 PGID_DST[2] = ports 1, 2 PGID_DST[3] = ports 3 PGID_DST[4] = ports 4 PGID_DST[5] = ports 5 PGID_DST[6] = no ports PGID_AGGR[0] = ports 0, 2, 3, 4, 5 PGID_AGGR[1] = ports 0, 2, 3, 4, 5 PGID_AGGR[2] = ports 0, 2, 3, 4, 5 PGID_AGGR[3] = ports 0, 2, 3, 4, 5 PGID_AGGR[4] = ports 0, 2, 3, 4, 5 PGID_AGGR[5] = ports 0, 2, 3, 4, 5 PGID_AGGR[6] = ports 0, 2, 3, 4, 5 PGID_AGGR[7] = ports 0, 2, 3, 4, 5 PGID_AGGR[8] = ports 0, 2, 3, 4, 5 PGID_AGGR[9] = ports 0, 2, 3, 4, 5 PGID_AGGR[10] = ports 0, 2, 3, 4, 5 PGID_AGGR[11] = ports 0, 2, 3, 4, 5 PGID_AGGR[12] = ports 0, 2, 3, 4, 5 PGID_AGGR[13] = ports 0, 2, 3, 4, 5 PGID_AGGR[14] = ports 0, 2, 3, 4, 5 PGID_AGGR[15] = ports 0, 2, 3, 4, 5 PGID_SRC[0] = ports 1, 2 PGID_SRC[1] = ports 0 PGID_SRC[2] = ports 0 PGID_SRC[3] = no ports PGID_SRC[4] = no ports PGID_SRC[5] = no ports PGID_SRC[6] = ports 0, 1, 2, 3, 4, 5 In other words, in the "bad" configuration, the attempt is to remove the inactive swp1 from the destination ports via PGID_DST. But when a MAC table entry is learned, it is learned towards PGID_DST 1, because that is the logical port id of the LAG itself (it is equal to the lowest numbered member port). So when swp1 becomes inactive, if we set PGID_DST[1] to contain just swp1 and not swp2, the packet will not have any chance to reach the destination via swp2. The "correct" way to remove swp1 as a destination is via PGID_AGGR (remove swp1 from the aggregation port groups for all aggregation codes). This means that PGID_DST[1] and PGID_DST[2] must still contain both swp1 and swp2. This makes the MAC table still treat packets destined towards the single-port LAG as "multicast", and the inactive ports are removed via the aggregation code tables. The change presented here is a design one: the ocelot_get_bond_mask() function used to take an "only_active_ports" argument. We don't need that. The only call site that specifies only_active_ports=true, ocelot_set_aggr_pgids(), must retrieve the entire bonding mask, because it must program that into PGID_DST. Additionally, it must also clear the inactive ports from the bond mask here, which it can't do if bond_mask just contains the active ports: ac = ocelot_read_rix(ocelot, ANA_PGID_PGID, i); ac &= ~bond_mask; <---- here /* Don't do division by zero if there was no active * port. Just make all aggregation codes zero. */ if (num_active_ports) ac \|= BIT(aggr_idx[i % num_active_ports]); ocelot_write_rix(ocelot, ac, ANA_PGID_PGID, i); So it becomes the responsibility of ocelot_set_aggr_pgids() to take ocelot_port->lag_tx_active into consideration when populating the aggr_idx array. Fixes: `23ca3b727e` ("net: mscc: ocelot: rebalance LAGs on link up/down events") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://lore.kernel.org/r/20220107164332.402133-1-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-01-07 18:54:59 -08:00
Jakub Kicinski	a5e7d9bbc3	Merge branch '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue Tony Nguyen says: ==================== 40GbE Intel Wired LAN Driver Updates 2022-01-07 This series contains updates to i40e and iavf drivers. Karen limits per VF MAC filters so that one VF does not consume all filters for i40e. Jedrzej reduces busy wait time for admin queue calls for i40e. Mateusz updates firmware versions to reflect new supported NVM images and renames an error to remove non-inclusive language for i40e. Yang Li fixes a set but not used warning for i40e. Jason Wang removes an unneeded variable for iavf. * '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue: iavf: remove an unneeded variable i40e: remove variables set but not used i40e: Remove non-inclusive language i40e: Update FW API version i40e: Minimize amount of busy-waiting during AQ send i40e: Add ensurance of MacVlan resources for every trusted VF ==================== Link: https://lore.kernel.org/r/20220107175704.438387-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-01-07 18:51:47 -08:00
Gal Pressman	ffef737fd0	net/tls: Fix skb memory leak when running kTLS traffic The cited Fixes commit introduced a memory leak when running kTLS traffic (with/without hardware offloads). I'm running nginx on the server side and wrk on the client side and get the following: unreferenced object 0xffff8881935e9b80 (size 224): comm "softirq", pid 0, jiffies 4294903611 (age 43.204s) hex dump (first 32 bytes): 80 9b d0 36 81 88 ff ff 00 00 00 00 00 00 00 00 ...6............ 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace: [<00000000efe2a999>] build_skb+0x1f/0x170 [<00000000ef521785>] mlx5e_skb_from_cqe_mpwrq_linear+0x2bc/0x610 [mlx5_core] [<00000000945d0ffe>] mlx5e_handle_rx_cqe_mpwrq+0x264/0x9e0 [mlx5_core] [<00000000cb675b06>] mlx5e_poll_rx_cq+0x3ad/0x17a0 [mlx5_core] [<0000000018aac6a9>] mlx5e_napi_poll+0x28c/0x1b60 [mlx5_core] [<000000001f3369d1>] __napi_poll+0x9f/0x560 [<00000000cfa11f72>] net_rx_action+0x357/0xa60 [<000000008653b8d7>] __do_softirq+0x282/0x94e [<00000000644923c6>] __irq_exit_rcu+0x11f/0x170 [<00000000d4085f8f>] irq_exit_rcu+0xa/0x20 [<00000000d412fef4>] common_interrupt+0x7d/0xa0 [<00000000bfb0cebc>] asm_common_interrupt+0x1e/0x40 [<00000000d80d0890>] default_idle+0x53/0x70 [<00000000f2b9780e>] default_idle_call+0x8c/0xd0 [<00000000c7659e15>] do_idle+0x394/0x450 I'm not familiar with these areas of the code, but I've added this sk_defer_free_flush() to tls_sw_recvmsg() based on a hunch and it resolved the issue. Fixes: `f35f821935` ("tcp: defer skb freeing after socket lock is released") Signed-off-by: Gal Pressman <gal@nvidia.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/20220102081253.9123-1-gal@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-01-07 18:42:18 -08:00
Linus Torvalds	d1587f7bfe	Merge branch 'for-5.16-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup Pull cgroup fixes from Tejun Heo: "This contains the cgroup.procs permission check fixes so that they use the credentials at the time of open rather than write, which also fixes the cgroup namespace lifetime bug" * 'for-5.16-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: selftests: cgroup: Test open-time cgroup namespace usage for migration checks selftests: cgroup: Test open-time credential usage for migration checks selftests: cgroup: Make cg_create() use 0755 for permission instead of 0644 cgroup: Use open-time cgroup namespace for process migration perm checks cgroup: Allocate cgroup_file_ctx for kernfs_open_file->priv cgroup: Use open-time credentials for process migraton perm checks	2022-01-07 15:58:06 -08:00
Linus Torvalds	35632d92ef	block-5.16-2022-01-07 -----BEGIN PGP SIGNATURE----- iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmHYRcAQHGF4Ym9lQGtl cm5lbC5kawAKCRD301j7KXHgplYyEACwEaL/JVb1fYf5fv7nQiVQPQxDhIFwMB66 1bBV5y3SoQmq0ZO0mEkb3SIProbEVadnTOjcF8+8SB6K0qLefhhBfTRB9jnIfQqL qTGbIMcWv0q2wnBGnRIv1ADzu6KkRT+pXLW9RgLs8ZbQDxH2w9tMSvRYeBKV+xHp J6PPNp5OjuA+Obd8fi7ILNTDh5EIkWQBpZNoByyAUmz9iH8InsALhM3aae2jPbO+ uWdPTxK7QKUvGIS+mfnTF/F+pURyQDwrPLirxq8yNfiAe++ZaryLb6jQcF/m3yKP s1FlGZrYsbRhroRJxa1uRZzVkenHbWTSea/iqFQdf+rJs4RASOr2/lHimorWKt0H HfV+xAcVmZR775KINh+zwImuAvB5ymOyUePUMeAvDyaAeiPsQGQ+VuylNFBvcHNg JvEmSaQfKw5H0qAUyIN70v3bqeOaPF7b50Kyg35a9RbELAvMeVIASokIjmJJ2ndb uVdPo+9FozwDFQ4iMbdJ4fjl9XHKxSKxJKaPRfzz+Tfbx1QSKXkcepT8b8WJ0S7O yweuLRwrlrUW1kFMbAS9sspL94z0o1rF61FjRJdd24D5YnGTqUK9Ciok2mpYqokk jBQX05+nyiyKwrPYJfblDxnQGLzv5VCkhkGmpARla0J1si9OrK9Oxp1lq9Q1XeWY EdO4WmGMDA== =Liaw -----END PGP SIGNATURE----- Merge tag 'block-5.16-2022-01-07' of git://git.kernel.dk/linux-block Pull block fix from Jens Axboe: "Just the md bitmap regression this time" * tag 'block-5.16-2022-01-07' of git://git.kernel.dk/linux-block: md/raid1: fix missing bitmap update w/o WriteMostly devices	2022-01-07 13:28:20 -08:00
Linus Torvalds	494603e06b	Fix 10nm EDAC driver to release and unmap resources on systems without HBM -----BEGIN PGP SIGNATURE----- iIoEABYIADIWIQQW3WBGcnu5yJnSXn0kTJLX0iGMLAUCYdSQwhQcdG9ueS5sdWNr QGludGVsLmNvbQAKCRAkTJLX0iGMLAtJAQCMRv7I8aEC/MuqAG+wlJ1ffhrh27re Q5WwcaltQDPA3AEA9pO2/SKnZHKTs4RZ9cKTJiKrsFNAfhf6bCKww0tYLg0= =51UZ -----END PGP SIGNATURE----- Merge tag 'edac_urgent_for_v5.16' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras Pull EDAC fix from Tony Luck: "Fix 10nm EDAC driver to release and unmap resources on systems without HBM" * tag 'edac_urgent_for_v5.16' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras: EDAC/i10nm: Release mdev/mbase when failing to detect HBM	2022-01-07 13:22:58 -08:00
Wolfram Sang	a19f75de73	Revert "i2c: core: support bus regulator controlling in adapter" This largely reverts commit `5a7b95fb99`. It breaks suspend with AMD GPUs, and we couldn't incrementally fix it. So, let's remove the code and go back to the drawing board. We keep the header extension to not break drivers already populating the regulator. We expect to re-add the code handling it soon. Fixes: `5a7b95fb99` ("i2c: core: support bus regulator controlling in adapter") Reported-by: "Tareque Md.Hanif" <tarequemd.hanif@yahoo.com> Link: https://lore.kernel.org/r/1295184560.182511.1639075777725@mail.yahoo.com Reported-by: Konstantin Kharlamov <hi-angel@yandex.ru> Link: https://lore.kernel.org/r/7143a7147978f4104171072d9f5225d2ce355ec1.camel@yandex.ru BugLink: https://gitlab.freedesktop.org/drm/amd/-/issues/1850 Tested-by: "Tareque Md.Hanif" <tarequemd.hanif@yahoo.com> Tested-by: Konstantin Kharlamov <hi-angel@yandex.ru> Signed-off-by: Wolfram Sang <wsa@kernel.org> Cc: <stable@vger.kernel.org> # 5.14+	2022-01-07 21:27:15 +01:00
Arnaldo Carvalho de Melo	dc9f2dd5de	Revert "libtraceevent: Increase libtraceevent logging when verbose" This reverts commit `08efcb4a63`. This breaks the build as it will prefer using libbpf-devel header files, even when not using LIBBPF_DYNAMIC=1, breaking the build. This was detected on OpenSuSE Tumbleweed with libtraceevent-devel 1.3.0, as described by Jiri Slaby: ======================================================================= It breaks build with LIBTRACEEVENT_DYNAMIC and version 1.3.0: > util/debug.c: In function ‘perf_debug_option’: > util/debug.c:243:17: error: implicit declaration of function ‘tep_set_loglevel’ [-Werror=implicit-function-declaration] > 243 \| tep_set_loglevel(TEP_LOG_INFO); > \| ^~~~~~~~~~~~~~~~ > util/debug.c:243:34: error: ‘TEP_LOG_INFO’ undeclared (first use in this function); did you mean ‘TEP_PRINT_INFO’? > 243 \| tep_set_loglevel(TEP_LOG_INFO); > \| ^~~~~~~~~~~~ > \| TEP_PRINT_INFO > util/debug.c:243:34: note: each undeclared identifier is reported only once for each function it appears in > util/debug.c:245:34: error: ‘TEP_LOG_DEBUG’ undeclared (first use in this function) > 245 \| tep_set_loglevel(TEP_LOG_DEBUG); > \| ^~~~~~~~~~~~~ > util/debug.c:247:34: error: ‘TEP_LOG_ALL’ undeclared (first use in this function) > 247 \| tep_set_loglevel(TEP_LOG_ALL); > \| ^~~~~~~~~~~ It is because the gcc's command line looks like: gcc ... -I/home/abuild/rpmbuild/BUILD/tools/lib/ ... -DLIBTRACEEVENT_VERSION=65790 ... ======================================================================= The proper way to fix this is more involved and so not suitable for this late in the 5.16-rc stage. Reported-by: Jiri Slaby <jirislaby@kernel.org> Link: https://lore.kernel.org/lkml/bc2b0786-8965-1bcd-2316-9d9bb37b9c31@kernel.org Cc: Andrii Nakryiko <andrii.nakryiko@gmail.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Song Liu <songliubraving@fb.com> Cc: Steven Rostedt <rostedt@goodmis.org> Link: https://lore.kernel.org/lkml/YddGjjmlMZzxUZbN@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2022-01-07 16:08:50 -03:00
Jiri Olsa	f06a82f9d3	perf trace: Avoid early exit due to running SIGCHLD handler before it makes sense to When running 'perf trace' with an BPF object like: # perf trace -e openat,tools/perf/examples/bpf/hello.c the event parsing eventually calls llvm__get_kbuild_opts() that runs a script and that ends up with SIGCHLD delivered to the 'perf trace' handler, which assumes the workload process is done and quits 'perf trace'. Move the SIGCHLD handler setup directly to trace__run(), where the event is parsed and the object is already compiled. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Christy Lee <christyc.y.lee@gmail.com> Cc: Ian Rogers <irogers@google.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20220106222030.227499-1-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2022-01-07 15:45:19 -03:00
Linus Torvalds	24556728c3	Two small fixes for x86: * lockdep WARN due to missing lock nesting annotation * NULL pointer dereference when accessing debugfs -----BEGIN PGP SIGNATURE----- iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmHYcyIUHHBib256aW5p QHJlZGhhdC5jb20ACgkQv/vSX3jHroNleQf/VyM140Z76sxY9IOp6RxhxP4UsMmd mRyFb1tf+nZcN2iw0E4XTu457xi3wkk7lERTGU4YM+FyB5/3ZUiZoUv/91IZMQ5p cDhdB7VgXX+kWt9sgEV8T6slNpE7EiqcFRQNwCumH46AlzE8VTR7S7U5MhRfbFpb fQSJd833fFPgZ7bJLoQoSYn/z7S5lZkzt636MHvgGAh6FD7195QgBQTof7baEJtV PtNZ60qLTzm1IxbfE78lJBhcyRHNHHHfwR5apdsMXGrkBgmREdzPd9qA/K7zHSB5 nNDX/EmKh/5mbaF8YjhfARdt5hUZKIH1O8jlcRkWn7WCxNMyyYY+Kyvu1w== =4acD -----END PGP SIGNATURE----- Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm Pull kvm fixes from Paolo Bonzini: "Two small fixes for x86: - lockdep WARN due to missing lock nesting annotation - NULL pointer dereference when accessing debugfs" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: x86: Check for rmaps allocation KVM: SEV: Mark nested locking of kvm->lock	2022-01-07 09:28:37 -08:00
Linus Torvalds	7a6043cc2e	drm fixes for 5.16 final amdgpu: - suspend/resume fix - fix runtime PM regression -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEEKbZHaGwW9KfbeusDHTzWXnEhr4FAmHXsYcACgkQDHTzWXnE hr5xCBAAhXyvx51ekZvUSCmsWblgkPwMS/+jmTI0yw02TDHfREbr1FZrac6TvOWY RpxFbDf8iXbBWRWBIFwH8oH1brqlXwcrJqhtLBK31xT8SjBRTgo2B4oHYSAluBE4 yF//sMZcCi9U3AoFjBFSACJ/+rkvecbqx/9tvMwtwDYD8x2afLKDG7g+QWF1jmXM XRfYPKRwxU69j1FLhEIbjjo8MeqhR8+nMp8Bx1uPbT3LVTq2jzaHbc3KNd3yAFqj VzBLuhP0mQ9IZvPbEjx3ixLH9rFn+gNOZAe7BMYEkMKPO0WzW7CVw9YN+RELdC0x VtQTZM4Cp/nzYZcRksqGSW7tvTczKHfv1KbegwvEse97P/+hUAAv60t0ro9sXwrT 9iYFwb36XV1NeFW+e9PkrOzBQa7FdHAT2wdmzP/0YdNWXueCvlp12ndbasSQ7pna tMuWctYAtDseyss/o+Oww9P6k3ElQ5nWxtraYbVAyJoFCCB+zqhTu2v+ZQpEgn8p fGXSrT/b6/Z+hj583G1SHlWsc5vYqBosgSinnUVqOUFAJXfeGapAjZa1uIFx7bve 7h4kgMSw+bePZDETB7pbMogQyQExv6MrVqm4E5EoeyCg4n/iXNrc/Fshb5ijOuhr nXOPh3Es4aV27ZbP/qvfWWy7jLGgBnLF0EQu36HGbDNNO7+mdNo= =Y3vO -----END PGP SIGNATURE----- Merge tag 'drm-fixes-2022-01-07' of git://anongit.freedesktop.org/drm/drm Pull drm fixes from Dave Airlie: "There is only the amdgpu runtime pm regression fix in here: amdgpu: - suspend/resume fix - fix runtime PM regression" * tag 'drm-fixes-2022-01-07' of git://anongit.freedesktop.org/drm/drm: drm/amdgpu: disable runpm if we are the primary adapter fbdev: fbmem: add a helper to determine if an aperture is used by a fw fb drm/amd/pm: keep the BACO feature enabled for suspend	2022-01-07 09:17:53 -08:00
Jason Wang	5322c68e58	iavf: remove an unneeded variable The variable `ret_code' used for returning is never changed in function `iavf_shutdown_adminq'. So that it can be removed and just return its initial value 0 at the end of `iavf_shutdown_adminq' function. Signed-off-by: Jason Wang <wangborong@cdjrlc.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2022-01-07 09:04:21 -08:00
Yang Li	a127adf2fc	i40e: remove variables set but not used The code that uses variables pe_cntx_size and pe_filt_size has been removed, so they should be removed as well. Eliminate the following clang warnings: drivers/net/ethernet/intel/i40e/i40e_common.c:4139:20: warning: variable 'pe_filt_size' set but not used. drivers/net/ethernet/intel/i40e/i40e_common.c:4139:6: warning: variable 'pe_cntx_size' set but not used. Reported-by: Abaci Robot <abaci@linux.alibaba.com> Signed-off-by: Yang Li <yang.lee@linux.alibaba.com> Tested-by: Gurucharan G <gurucharanx.g@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2022-01-07 09:04:21 -08:00
Mateusz Palczewski	17b33d4319	i40e: Remove non-inclusive language Remove non-inclusive language from the driver. Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2022-01-07 09:04:21 -08:00
Mateusz Palczewski	9c83ca8a63	i40e: Update FW API version Update FW API versions to the newest supported NVM images. Signed-off-by: Sylwester Dziedziuch <sylwesterx.dziedziuch@intel.com> Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com> Tested-by: Gurucharan G <gurucharanx.g@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2022-01-07 09:04:21 -08:00
Jedrzej Jagielski	ef39584ddb	i40e: Minimize amount of busy-waiting during AQ send The i40e_asq_send_command will now use a non blocking usleep_range if possible (non-atomic context), instead of busy-waiting udelay. The usleep_range function uses hrtimers to provide better performance and removes the negative impact of busy-waiting in time-critical environments. 1. Rename i40e_asq_send_command to i40e_asq_send_command_atomic and add 5th parameter to inform if called from an atomic context. Call inside usleep_range (if non-atomic) or udelay (if atomic). 2. Change i40e_asq_send_command to invoke i40e_asq_send_command_atomic(..., false). 3. Change two functions: - i40e_aq_set_vsi_uc_promisc_on_vlan - i40e_aq_set_vsi_mc_promisc_on_vlan to explicitly use i40e_asq_send_command_atomic(..., true) instead of i40e_asq_send_command, as they use spinlocks and do some work in an atomic context. All other calls to i40e_asq_send_command remain unchanged. Signed-off-by: Dawid Lukwinski <dawid.lukwinski@intel.com> Signed-off-by: Jedrzej Jagielski <jedrzej.jagielski@intel.com> Tested-by: Tony Brelinski <tony.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2022-01-07 09:04:21 -08:00
Nikunj A Dadhania	fffb532378	KVM: x86: Check for rmaps allocation With TDP MMU being the default now, access to mmu_rmaps_stat debugfs file causes following oops: BUG: kernel NULL pointer dereference, address: 0000000000000000 PGD 0 P4D 0 Oops: 0000 [#1] PREEMPT SMP NOPTI CPU: 7 PID: 3185 Comm: cat Not tainted 5.16.0-rc4+ #204 RIP: 0010:pte_list_count+0x6/0x40 Call Trace: <TASK> ? kvm_mmu_rmaps_stat_show+0x15e/0x320 seq_read_iter+0x126/0x4b0 ? aa_file_perm+0x124/0x490 seq_read+0xf5/0x140 full_proxy_read+0x5c/0x80 vfs_read+0x9f/0x1a0 ksys_read+0x67/0xe0 __x64_sys_read+0x19/0x20 do_syscall_64+0x3b/0xc0 entry_SYSCALL_64_after_hwframe+0x44/0xae RIP: 0033:0x7fca6fc13912 Return early when rmaps are not present. Reported-by: Vasant Hegde <vasant.hegde@amd.com> Tested-by: Vasant Hegde <vasant.hegde@amd.com> Signed-off-by: Nikunj A Dadhania <nikunj@amd.com> Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: Sean Christopherson <seanjc@google.com> Message-Id: <20220105040337.4234-1-nikunj@amd.com> Cc: stable@vger.kernel.org Fixes: `3bcd0662d6` ("KVM: X86: Introduce mmu_rmaps_stat per-vm debugfs file") Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-01-07 12:04:01 -05:00
Karen Sornek	cfb1d572c9	i40e: Add ensurance of MacVlan resources for every trusted VF Trusted VF can use up every resource available, leaving nothing to other trusted VFs. Introduce define, which calculates MacVlan resources available based on maximum available MacVlan resources, bare minimum for each VF and number of currently allocated VFs. Signed-off-by: Przemyslaw Patynowski <przemyslawx.patynowski@intel.com> Signed-off-by: Karen Sornek <karen.sornek@intel.com> Tested-by: Tony Brelinski <tony.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2022-01-07 09:03:44 -08:00
Wanpeng Li	597cb7968c	KVM: SEV: Mark nested locking of kvm->lock Both source and dest vms' kvm->locks are held in sev_lock_two_vms. Mark one with a different subtype to avoid false positives from lockdep. Fixes: `c9d61dcb0b` (KVM: SEV: accept signals in sev_lock_two_vms) Reported-by: Yiru Xu <xyru1999@gmail.com> Tested-by: Jinrong Liang <cloudliang@tencent.com> Signed-off-by: Wanpeng Li <wanpengli@tencent.com> Message-Id: <1641364863-26331-1-git-send-email-wanpengli@tencent.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-01-07 12:01:55 -05:00
Dave Hansen	2056e2989b	x86/sgx: Fix NULL pointer dereference on non-SGX systems == Problem == Nathan Chancellor reported an oops when aceessing the 'sgx_total_bytes' sysfs file: https://lore.kernel.org/all/YbzhBrimHGGpddDM@archlinux-ax161/ The sysfs output code accesses the sgx_numa_nodes[] array unconditionally. However, this array is allocated during SGX initialization, which only occurs on systems where SGX is supported. If the sysfs file is accessed on systems without SGX support, sgx_numa_nodes[] is NULL and an oops occurs. == Solution == To fix this, hide the entire nodeX/x86/ attribute group on systems without SGX support using the ->is_visible attribute group callback. Unfortunately, SGX is initialized via a device_initcall() which occurs _after_ the ->is_visible() callback. Instead of moving SGX initialization earlier, call sysfs_update_group() during SGX initialization to update the group visiblility. This update requires moving the SGX sysfs code earlier in sgx/main.c. There are no code changes other than the addition of arch_update_sysfs_visibility() and a minor whitespace fixup to arch_node_attr_is_visible() which checkpatch caught. CC: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: linux-sgx@vger.kernel.org Cc: x86@kernel.org Fixes: `50468e4313` ("x86/sgx: Add an attribute for the amount of SGX memory in a NUMA node") Reported-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Tested-by: Nathan Chancellor <nathan@kernel.org> Tested-by: Jarkko Sakkinen <jarkko@kernel.org> Link: https://lkml.kernel.org/r/20220104171527.5E8416A8@davehans-spike.ostc.intel.com	2022-01-07 08:47:23 -08:00
Kevin Bracey	c25af830ab	sch_cake: revise Diffserv docs Documentation incorrectly stated that CS1 is equivalent to LE for diffserv8. But when LE was added to the table, CS1 was pushed into tin 1, leaving only LE in tin 0. Also "TOS1" no longer exists, as that is the same codepoint as LE. Make other tweaks properly distinguishing codepoints from classes and putting current Diffserve codepoints ahead of legacy ones. Signed-off-by: Kevin Bracey <kevin@bracey.fi> Acked-by: Toke Høiland-Jørgensen <toke@toke.dk> Link: https://lore.kernel.org/r/20220106215637.3132391-1-kevin@bracey.fi Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-01-07 08:41:29 -08:00
Dan Carpenter	dc35616e6c	netrom: fix api breakage in nr_setsockopt() This needs to copy an unsigned int from user space instead of a long to avoid breaking user space with an API change. I have updated all the integer overflow checks from ULONG to UINT as well. This is a slight API change but I do not expect it to affect anything in real life. Fixes: `3087a6f36e` ("netrom: fix copying in user data in nr_setsockopt") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-01-07 14:11:05 +00:00
Dan Carpenter	9371937092	ax25: uninitialized variable in ax25_setsockopt() The "opt" variable is unsigned long but we only copy 4 bytes from the user so the lower 4 bytes are uninitialized. I have changed the integer overflow checks from ULONG to UINT as well. This is a slight API change but I don't expect it to break anything. Fixes: `a7b75c5a8c` ("net: pass a sockptr_t into ->setsockopt") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-01-07 14:10:26 +00:00
David S. Miller	b69c5b5886	Merge branch 'octeontx2-ptp-bugs' Subbaraya Sundeep says: ==================== octeontx2: Fix PTP bugs This patchset addresses two problems found when using ptp. Patch 1 - Increases the refcount of ptp device before use which was missing and it lead to refcount increment after use bug when module is loaded and unloaded couple of times. Patch 2 - PTP resources allocated by VF are not being freed during VF teardown. This patch fixes that. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2022-01-07 14:04:19 +00:00
Rakesh Babu Saladi	eabd0f88b0	octeontx2-nicvf: Free VF PTP resources. When a VF is removed respective PTP resources are not being freed currently. This patch fixes it. Fixes: `43510ef4dd` ("octeontx2-nicvf: Add PTP hardware clock support to NIX VF") Signed-off-by: Rakesh Babu Saladi <rsaladi2@marvell.com> Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com> Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-01-07 14:04:19 +00:00
Subbaraya Sundeep	93440f4888	octeontx2-af: Increment ptp refcount before use Before using the ptp pci device by AF driver increment the reference count of it. Fixes: `a8b90c9d26` ("octeontx2-af: Add PTP device id for CN10K and 95O silcons") Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com> Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-01-07 14:04:19 +00:00
David S. Miller	fff63521cd	Merge branch 'mptcp-fixes' Mat Martineau says: ==================== mptcp: Fixes for buffer reclaim and option writing Here are three fixes dealing with a syzkaller crash MPTCP triggers in the memory manager in 5.16-rc8, and some option writing problems. Patches 1 and 2 fix some corner cases in MPTCP option writing. Patch 3 addresses a crash that syzkaller found a way to trigger in the mm subsystem by passing an invalid value to __sk_mem_reduce_allocated(). ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2022-01-07 11:29:45 +00:00
Mat Martineau	269bda9e7d	mptcp: Check reclaim amount before reducing allocation syzbot found a page counter underflow that was triggered by MPTCP's reclaim code: page_counter underflow: -4294964789 nr_pages=4294967295 WARNING: CPU: 2 PID: 3785 at mm/page_counter.c:56 page_counter_cancel+0xcf/0xe0 mm/page_counter.c:56 Modules linked in: CPU: 2 PID: 3785 Comm: kworker/2:6 Not tainted 5.16.0-rc1-syzkaller #0 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.14.0-2 04/01/2014 Workqueue: events mptcp_worker RIP: 0010:page_counter_cancel+0xcf/0xe0 mm/page_counter.c:56 Code: c7 04 24 00 00 00 00 45 31 f6 eb 97 e8 2a 2b b5 ff 4c 89 ea 48 89 ee 48 c7 c7 00 9e b8 89 c6 05 a0 c1 ba 0b 01 e8 95 e4 4b 07 <0f> 0b eb a8 4c 89 e7 e8 25 5a fb ff eb c7 0f 1f 00 41 56 41 55 49 RSP: 0018:ffffc90002d4f918 EFLAGS: 00010082 RAX: 0000000000000000 RBX: ffff88806a494120 RCX: 0000000000000000 RDX: ffff8880688c41c0 RSI: ffffffff815e8f28 RDI: fffff520005a9f15 RBP: ffffffff000009cb R08: 0000000000000000 R09: 0000000000000000 R10: ffffffff815e2cfe R11: 0000000000000000 R12: ffff88806a494120 R13: 00000000ffffffff R14: 0000000000000000 R15: 0000000000000001 FS: 0000000000000000(0000) GS:ffff88802cc00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000001b2de21000 CR3: 000000005ad59000 CR4: 0000000000150ee0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <TASK> page_counter_uncharge+0x2e/0x60 mm/page_counter.c:160 drain_stock+0xc1/0x180 mm/memcontrol.c:2219 refill_stock+0x139/0x2f0 mm/memcontrol.c:2271 __sk_mem_reduce_allocated+0x24d/0x550 net/core/sock.c:2945 __mptcp_rmem_reclaim net/mptcp/protocol.c:167 [inline] __mptcp_mem_reclaim_partial+0x124/0x410 net/mptcp/protocol.c:975 mptcp_mem_reclaim_partial net/mptcp/protocol.c:982 [inline] mptcp_alloc_tx_skb net/mptcp/protocol.c:1212 [inline] mptcp_sendmsg_frag+0x18c6/0x2190 net/mptcp/protocol.c:1279 __mptcp_push_pending+0x232/0x720 net/mptcp/protocol.c:1545 mptcp_release_cb+0xfe/0x200 net/mptcp/protocol.c:2975 release_sock+0xb4/0x1b0 net/core/sock.c:3306 mptcp_worker+0x51e/0xc10 net/mptcp/protocol.c:2443 process_one_work+0x9b2/0x1690 kernel/workqueue.c:2298 worker_thread+0x658/0x11f0 kernel/workqueue.c:2445 kthread+0x405/0x4f0 kernel/kthread.c:327 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:295 </TASK> __mptcp_mem_reclaim_partial() could call __mptcp_rmem_reclaim() with a negative value, which passed that negative value to __sk_mem_reduce_allocated() and triggered the splat above. Check for a reclaim amount that is positive and large enough for __mptcp_rmem_reclaim() to actually adjust rmem_fwd_alloc (much like the sk_mem_reclaim_partial() code the function is based on). v2: Use '>' instead of '>=', since SK_MEM_QUANTUM - 1 would get right-shifted into nothing by __mptcp_rmem_reclaim. Fixes: `6511882cdd` ("mptcp: allocate fwd memory separately on the rx and tx path") Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/252 Reported-and-tested-by: syzbot+bc9e2d2dbcb347dd215a@syzkaller.appspotmail.com Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Michal Hocko <mhocko@suse.com> Acked-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-01-07 11:29:45 +00:00
Geliang Tang	110b6d1fe9	mptcp: fix a DSS option writing error 'ptr += 1;' was omitted in the original code. If the DSS is the last option -- which is what we have most of the time -- that's not an issue. But it is if we need to send something else after like a RM_ADDR or an MP_PRIO. Fixes: `1bff1e43a3` ("mptcp: optimize out option generation") Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Geliang Tang <geliang.tang@suse.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-01-07 11:29:45 +00:00
Matthieu Baerts	04fac2cae9	mptcp: fix opt size when sending DSS + MP_FAIL When these two options had to be sent -- which is not common -- the DSS size was not being taken into account in the remaining size. Additionally in this situation, the reported size was only the one of the MP_FAIL which can cause issue if at the end, we need to write more in the TCP options than previously said. Here we use a dedicated variable for MP_FAIL size to keep the WARN_ON_ONCE() just after. Fixes: `c25aeb4e09` ("mptcp: MP_FAIL suboption sending") Acked-and-tested-by: Geliang Tang <geliang.tang@suse.com> Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-01-07 11:29:45 +00:00
David S. Miller	ca1a6705b2	Merge branch 'mptcp-next' Mat Martineau says: ==================== mptcp: New features and cleanup These patches have been tested in the MPTCP tree for a longer than usual time (thanks to holiday schedules), and are ready for the net-next branch. Changes include feature updates, small fixes, refactoring, and some selftest changes. Patch 1 fixes an OUTQ ioctl issue with TCP fallback sockets. Patches 2, 3, and 6 add support of the MPTCP fastclose option (quick shutdown of the full MPTCP connection, similar to TCP RST in regular TCP), and a related self test. Patch 4 cleans up some accept and poll code that is no longer needed after the fastclose changes. Patch 5 add userspace disconnect using AF_UNSPEC, which is used when testing fastclose and makes the MPTCP socket's handling of AF_UNSPEC in connect() more TCP-like. Patches 7-11 refactor subflow creation to make better use of multiple local endpoints and to better handle individual connection failures when creating multiple subflows. Includes self test updates. Patch 12 cleans up the way subflows are added to the MPTCP connection list, eliminating the need for calls throughout the MPTCP code that had to check the intermediate "join list" for entries to shift over to the main "connection list". Patch 13 refactors the MPTCP release_cb flags to use separate storage for values only accessed with the socket lock held (no atomic ops needed), and for values that need atomic operations. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2022-01-07 11:27:07 +00:00
Paolo Abeni	e9d09baca6	mptcp: avoid atomic bit manipulation when possible Currently the msk->flags bitmask carries both state for the mptcp_release_cb() - mostly touched under the mptcp data lock - and others state info touched even outside such lock scope. As a consequence, msk->flags is always manipulated with atomic operations. This change splits such bitmask in two separate fields, so that we use plain bit operations when touching the cb-related info. The MPTCP_PUSH_PENDING bit needs additional care, as it is the only CB related field currently accessed either under the mptcp data lock or the mptcp socket lock. Let's add another mask just for such bit's sake. Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-01-07 11:27:07 +00:00

... 2 3 4 5 6 ...

1065521 Коммитов Все ветки Поиск

1065521 Коммитов

Все ветки