Граф коммитов

1153395 Коммитов

Автор SHA1 Сообщение Дата
Evan Quan 7a18e089ef drm/amd/pm: update SMU13.0.0 reported maximum shader clock
Update the reported maximum shader clock to the value which can
be guarded to be achieved on all cards. This is to align with
Window setting.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org # 6.0.x
2022-12-15 12:18:08 -05:00
Evan Quan 32a7819ff8 drm/amd/pm: correct SMU13.0.0 pstate profiling clock settings
Correct the pstate standard/peak profiling mode clock settings
for SMU13.0.0.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org # 6.0.x
2022-12-15 12:17:55 -05:00
Evan Quan 62b9f835a6 drm/amd/pm: enable GPO dynamic control support for SMU13.0.7
To better support UMD pstate profilings, the GPO feature needs
to be switched on/off accordingly.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org # 6.0.x
2022-12-15 12:17:44 -05:00
Evan Quan 1794f6a953 drm/amd/pm: enable GPO dynamic control support for SMU13.0.0
To better support UMD pstate profilings, the GPO feature needs
to be switched on/off accordingly.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org # 6.0.x
2022-12-15 12:16:43 -05:00
Jason Gunthorpe 5fc24e6022 RDMA/rxe: Fix compile warnings on 32-bit
Move the conditional code into a function, with two varients so it is
harder to make these kinds of mistakes.

 drivers/infiniband/sw/rxe/rxe_resp.c: In function 'atomic_write_reply':
 drivers/infiniband/sw/rxe/rxe_resp.c:794:13: error: unused variable 'payload' [-Werror=unused-variable]
   794 |         int payload = payload_size(pkt);
       |             ^~~~~~~
 drivers/infiniband/sw/rxe/rxe_resp.c:793:24: error: unused variable 'mr' [-Werror=unused-variable]
   793 |         struct rxe_mr *mr = qp->resp.mr;
       |                        ^~
 drivers/infiniband/sw/rxe/rxe_resp.c:791:19: error: unused variable 'dst' [-Werror=unused-variable]
   791 |         u64 src, *dst;
       |                   ^~~
 drivers/infiniband/sw/rxe/rxe_resp.c:791:13: error: unused variable 'src' [-Werror=unused-variable]
   791 |         u64 src, *dst;

Fixes: 034e285f8b ("RDMA/rxe: Make responder support atomic write on RC service")
Link: https://lore.kernel.org/linux-rdma/Y5s+EVE7eLWQqOwv@nvidia.com/
Reported-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-12-15 11:32:06 -04:00
Pavel Begunkov a8cf95f936 io_uring: fix overflow handling regression
Because the single task locking series got reordered ahead of the
timeout and completion lock changes, two hunks inadvertently ended up
using __io_fill_cqe_req() rather than io_fill_cqe_req(). This meant
that we dropped overflow handling in those two spots. Reinstate the
correct CQE filling helper.

Fixes: f66f73421f ("io_uring: skip spinlocking for ->task_complete")
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-12-15 08:20:10 -07:00
Vladimir Oltean a7d82367da net: dsa: mv88e6xxx: avoid reg_lock deadlock in mv88e6xxx_setup_port()
In the blamed commit, it was not noticed that one implementation of
chip->info->ops->phylink_get_caps(), called by mv88e6xxx_get_caps(),
may access hardware registers, and in doing so, it takes the
mv88e6xxx_reg_lock(). Namely, this is mv88e6352_phylink_get_caps().

This is a problem because mv88e6xxx_get_caps(), apart from being
a top-level function (method invoked by dsa_switch_ops), is now also
directly called from mv88e6xxx_setup_port(), which runs under the
mv88e6xxx_reg_lock() taken by mv88e6xxx_setup(). Therefore, when running
on mv88e6352, the reg_lock would be acquired a second time and the
system would deadlock on driver probe.

The things that mv88e6xxx_setup() can compete with in terms of register
access with are the IRQ handlers and MDIO bus operations registered by
mv88e6xxx_probe(). So there is a real need to acquire the register lock.

The register lock can, in principle, be dropped and re-acquired pretty
much at will within the driver, as long as no operations that involve
waiting for indirect access to complete (essentially, callers of
mv88e6xxx_smi_direct_wait() and mv88e6xxx_wait_mask()) are interrupted
with the lock released. However, I would guess that in mv88e6xxx_setup(),
the critical section is kept open for such a long time just in order to
optimize away multiple lock/unlock operations on the registers.

We could, in principle, drop the reg_lock right before the
mv88e6xxx_setup_port() -> mv88e6xxx_get_caps() call, and
re-acquire it immediately afterwards. But this would look ugly, because
mv88e6xxx_setup_port() would release a lock which it didn't acquire, but
the caller did.

A cleaner solution to this issue comes from the observation that struct
mv88e6xxxx_ops methods generally assume they are called with the
reg_lock already acquired. Whereas mv88e6352_phylink_get_caps() is more
the exception rather than the norm, in that it acquires the lock itself.

Let's enforce the same locking pattern/convention for
chip->info->ops->phylink_get_caps() as well, and make
mv88e6xxx_get_caps(), the top-level function, acquire the register lock
explicitly, for this one implementation that will access registers for
port 4 to work properly.

This means that mv88e6xxx_setup_port() will no longer call the top-level
function, but the low-level mv88e6xxx_ops method which expects the
correct calling context (register lock held).

Compared to chip->info->ops->phylink_get_caps(), mv88e6xxx_get_caps()
also fixes up the supported_interfaces bitmap for internal ports, since
that can be done generically and does not require per-switch knowledge.
That's code which will no longer execute, however mv88e6xxx_setup_port()
doesn't need that. It just needs to look at the mac_capabilities bitmap.

Fixes: cc1049ccee ("net: dsa: mv88e6xxx: fix speed setting for CPU/DSA ports")
Reported-by: Maksim Kiselev <bigunclemax@gmail.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Tested-by: Maksim Kiselev <bigunclemax@gmail.com>
Link: https://lore.kernel.org/r/20221214110120.3368472-1-vladimir.oltean@nxp.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-12-15 15:42:23 +01:00
Biju Das c72a7e4259 ravb: Fix "failed to switch device to config mode" message during unbind
This patch fixes the error "ravb 11c20000.ethernet eth0: failed to switch
device to config mode" during unbind.

We are doing register access after pm_runtime_put_sync().

We usually do cleanup in reverse order of init. Currently in
remove(), the "pm_runtime_put_sync" is not in reverse order.

Probe
	reset_control_deassert(rstc);
	pm_runtime_enable(&pdev->dev);
	pm_runtime_get_sync(&pdev->dev);

remove
	pm_runtime_put_sync(&pdev->dev);
	unregister_netdev(ndev);
	..
	ravb_mdio_release(priv);
	pm_runtime_disable(&pdev->dev);

Consider the call to unregister_netdev()
unregister_netdev->unregister_netdevice_queue->rollback_registered_many
that calls the below functions which access the registers after
pm_runtime_put_sync()
 1) ravb_get_stats
 2) ravb_close

Fixes: c156633f13 ("Renesas Ethernet AVB driver proper")
Cc: stable@vger.kernel.org
Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Link: https://lore.kernel.org/r/20221214105118.2495313-1-biju.das.jz@bp.renesas.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-12-15 15:36:05 +01:00
Krzysztof Kozlowski a12a383e59
ASoC: lochnagar: Fix unused lochnagar_of_match warning
lochnagar_of_match is used unconditionally, so COMPILE_TEST builds
without OF warn:

  sound/soc/codecs/lochnagar-sc.c:247:34: error: ‘lochnagar_of_match’ defined but not used [-Werror=unused-const-variable=]

Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://lore.kernel.org/r/20221215134337.77944-1-krzysztof.kozlowski@linaro.org
Signed-off-by: Mark Brown <broonie@kernel.org>
2022-12-15 13:53:00 +00:00
Gaosheng Cui 2cb815cfc7 net: stmmac: fix errno when create_singlethread_workqueue() fails
We should set the return value to -ENOMEM explicitly when
create_singlethread_workqueue() fails in stmmac_dvr_probe(),
otherwise we'll lose the error value.

Fixes: a137f3f27f ("net: stmmac: fix possible memory leak in stmmac_dvr_probe()")
Signed-off-by: Gaosheng Cui <cuigaosheng1@huawei.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Link: https://lore.kernel.org/r/20221214080117.3514615-1-cuigaosheng1@huawei.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-12-15 14:00:34 +01:00
Ming Lei d36a9ea5e7 block: fix use-after-free of q->q_usage_counter
For blk-mq, queue release handler is usually called after
blk_mq_freeze_queue_wait() returns. However, the
q_usage_counter->release() handler may not be run yet at that time, so
this can cause a use-after-free.

Fix the issue by moving percpu_ref_exit() into blk_free_queue_rcu().
Since ->release() is called with rcu read lock held, it is agreed that
the race should be covered in caller per discussion from the two links.

Reported-by: Zhang Wensheng <zhangwensheng@huaweicloud.com>
Reported-by: Zhong Jinghua <zhongjinghua@huawei.com>
Link: https://lore.kernel.org/linux-block/Y5prfOjyyjQKUrtH@T590/T/#u
Link: https://lore.kernel.org/lkml/Y4%2FmzMd4evRg9yDi@fedora/
Cc: Hillf Danton <hdanton@sina.com>
Cc: Yu Kuai <yukuai3@huawei.com>
Cc: Dennis Zhou <dennis@kernel.org>
Fixes: 2b0d3d3e4f ("percpu_ref: reduce memory footprint of percpu_ref in fast path")
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/20221215021629.74870-1-ming.lei@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-12-15 05:23:11 -07:00
Yuwei Guan 1eb206208b block, bfq: only do counting of pending-request for BFQ_GROUP_IOSCHED
The 'bfqd->num_groups_with_pending_reqs' is used when
CONFIG_BFQ_GROUP_IOSCHED is enabled, so let the variables and processes
take effect when CONFIG_BFQ_GROUP_IOSCHED is enabled.

Cc: Yu Kuai <yukuai3@huawei.com>
Signed-off-by: Yuwei Guan <Yuwei.Guan@zeekrlife.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Yu Kuai <yukuai3@huawei.com>
Link: https://lore.kernel.org/r/20221110112622.389332-1-Yuwei.Guan@zeekrlife.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-12-15 05:11:59 -07:00
Johan Hovold cb3543cff9
regulator: core: fix deadlock on regulator enable
When updating the operating mode as part of regulator enable, the caller
has already locked the regulator tree and drms_uA_update() must not try
to do the same in order not to trigger a deadlock.

The lock inversion is reported by lockdep as:

  ======================================================
  WARNING: possible circular locking dependency detected
  6.1.0-next-20221215 #142 Not tainted
  ------------------------------------------------------
  udevd/154 is trying to acquire lock:
  ffffc11f123d7e50 (regulator_list_mutex){+.+.}-{3:3}, at: regulator_lock_dependent+0x54/0x280

  but task is already holding lock:
  ffff80000e4c36e8 (regulator_ww_class_acquire){+.+.}-{0:0}, at: regulator_enable+0x34/0x80

  which lock already depends on the new lock.

  ...

   Possible unsafe locking scenario:

         CPU0                    CPU1
         ----                    ----
    lock(regulator_ww_class_acquire);
                                 lock(regulator_list_mutex);
                                 lock(regulator_ww_class_acquire);
    lock(regulator_list_mutex);

   *** DEADLOCK ***

just before probe of a Qualcomm UFS controller (occasionally) deadlocks
when enabling one of its regulators.

Fixes: 9243a195be ("regulator: core: Change voltage setting path")
Fixes: f8702f9e4a ("regulator: core: Use ww_mutex for regulators locking")
Cc: stable@vger.kernel.org      # 5.0
Signed-off-by: Johan Hovold <johan+linaro@kernel.org>
Link: https://lore.kernel.org/r/20221215104646.19818-1-johan+linaro@kernel.org
Signed-off-by: Mark Brown <broonie@kernel.org>
2022-12-15 11:56:26 +00:00
Li Zetao 7e43039a49 r6040: Fix kmemleak in probe and remove
There is a memory leaks reported by kmemleak:

  unreferenced object 0xffff888116111000 (size 2048):
    comm "modprobe", pid 817, jiffies 4294759745 (age 76.502s)
    hex dump (first 32 bytes):
      00 c4 0a 04 81 88 ff ff 08 10 11 16 81 88 ff ff  ................
      08 10 11 16 81 88 ff ff 00 00 00 00 00 00 00 00  ................
    backtrace:
      [<ffffffff815bcd82>] kmalloc_trace+0x22/0x60
      [<ffffffff827e20ee>] phy_device_create+0x4e/0x90
      [<ffffffff827e6072>] get_phy_device+0xd2/0x220
      [<ffffffff827e7844>] mdiobus_scan+0xa4/0x2e0
      [<ffffffff827e8be2>] __mdiobus_register+0x482/0x8b0
      [<ffffffffa01f5d24>] r6040_init_one+0x714/0xd2c [r6040]
      ...

The problem occurs in probe process as follows:
  r6040_init_one:
    mdiobus_register
      mdiobus_scan    <- alloc and register phy_device,
                         the reference count of phy_device is 3
    r6040_mii_probe
      phy_connect     <- connect to the first phy_device,
                         so the reference count of the first
                         phy_device is 4, others are 3
    register_netdev   <- fault inject succeeded, goto error handling path

    // error handling path
    err_out_mdio_unregister:
      mdiobus_unregister(lp->mii_bus);
    err_out_mdio:
      mdiobus_free(lp->mii_bus);    <- the reference count of the first
                                       phy_device is 1, it is not released
                                       and other phy_devices are released
  // similarly, the remove process also has the same problem

The root cause is traced to the phy_device is not disconnected when
removes one r6040 device in r6040_remove_one() or on error handling path
after r6040_mii probed successfully. In r6040_mii_probe(), a net ethernet
device is connected to the first PHY device of mii_bus, in order to
notify the connected driver when the link status changes, which is the
default behavior of the PHY infrastructure to handle everything.
Therefore the phy_device should be disconnected when removes one r6040
device or on error handling path.

Fix it by adding phy_disconnect() when removes one r6040 device or on
error handling path after r6040_mii probed successfully.

Fixes: 3831861b4a ("r6040: implement phylib")
Signed-off-by: Li Zetao <lizetao1@huawei.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Link: https://lore.kernel.org/r/20221213125614.927754-1-lizetao1@huawei.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-12-15 12:48:34 +01:00
Andreas Gruenbacher 6b46a06100 gfs2: Remove support for glock holder auto-demotion (2)
As a follow-up to the previous commit, move the recovery related code in
__gfs2_glock_dq() to gfs2_glock_dq() where it better fits.  No
functional change.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2022-12-15 12:41:22 +01:00
Andreas Gruenbacher ba3e77a4a2 gfs2: Remove support for glock holder auto-demotion
Remove the support for glock holder auto-demotion (commit dc732906c2
and folow-ups) as we are not planning to use this feature, and the
additional code therefore only adds unnecessary complexity.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2022-12-15 12:41:22 +01:00
Kirill Tkhai 3ff8bff704 unix: Fix race in SOCK_SEQPACKET's unix_dgram_sendmsg()
There is a race resulting in alive SOCK_SEQPACKET socket
may change its state from TCP_ESTABLISHED to TCP_CLOSE:

unix_release_sock(peer)                  unix_dgram_sendmsg(sk)
  sock_orphan(peer)
    sock_set_flag(peer, SOCK_DEAD)
                                           sock_alloc_send_pskb()
                                             if !(sk->sk_shutdown & SEND_SHUTDOWN)
                                               OK
                                           if sock_flag(peer, SOCK_DEAD)
                                             sk->sk_state = TCP_CLOSE
  sk->sk_shutdown = SHUTDOWN_MASK

After that socket sk remains almost normal: it is able to connect, listen, accept
and recvmsg, while it can't sendmsg.

Since this is the only possibility for alive SOCK_SEQPACKET to change
the state in such way, we should better fix this strange and potentially
danger corner case.

Note, that we will return EPIPE here like this is normally done in sock_alloc_send_pskb().
Originally used ECONNREFUSED looks strange, since it's strange to return
a specific retval in dependence of race in kernel, when user can't affect on this.

Also, move TCP_CLOSE assignment for SOCK_DGRAM sockets under state lock
to fix race with unix_dgram_connect():

unix_dgram_connect(other)            unix_dgram_sendmsg(sk)
                                       unix_peer(sk) = NULL
                                       unix_state_unlock(sk)
  unix_state_double_lock(sk, other)
  sk->sk_state  = TCP_ESTABLISHED
  unix_peer(sk) = other
  unix_state_double_unlock(sk, other)
                                       sk->sk_state  = TCP_CLOSED

This patch fixes both of these races.

Fixes: 83301b5367 ("af_unix: Set TCP_ESTABLISHED for datagram sockets too")
Signed-off-by: Kirill Tkhai <tkhai@ya.ru>
Link: https://lore.kernel.org/r/135fda25-22d5-837a-782b-ceee50e19844@ya.ru
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-12-15 11:35:18 +01:00
Toke Høiland-Jørgensen f506439ec3 selftests/bpf: Add a test for using a cpumap from an freplace-to-XDP program
This adds a simple test for inserting an XDP program into a cpumap that is
"owned" by an XDP program that was loaded as PROG_TYPE_EXT (as libxdp
does). Prior to the kernel fix this would fail because the map type
ownership would be set to PROG_TYPE_EXT instead of being resolved to
PROG_TYPE_XDP.

v5:
- Fix a few nits from Andrii, add his ACK
v4:
- Use skeletons for selftest
v3:
- Update comment to better explain the cause
- Add Yonghong's ACK

Acked-by: Yonghong Song <yhs@fb.com>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://lore.kernel.org/r/20221214230254.790066-2-toke@redhat.com
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2022-12-14 21:30:40 -08:00
Toke Høiland-Jørgensen 1c123c567f bpf: Resolve fext program type when checking map compatibility
The bpf_prog_map_compatible() check makes sure that BPF program types are
not mixed inside BPF map types that can contain programs (tail call maps,
cpumaps and devmaps). It does this by setting the fields of the map->owner
struct to the values of the first program being checked against, and
rejecting any subsequent programs if the values don't match.

One of the values being set in the map owner struct is the program type,
and since the code did not resolve the prog type for fext programs, the map
owner type would be set to PROG_TYPE_EXT and subsequent loading of programs
of the target type into the map would fail.

This bug is seen in particular for XDP programs that are loaded as
PROG_TYPE_EXT using libxdp; these cannot insert programs into devmaps and
cpumaps because the check fails as described above.

Fix the bug by resolving the fext program type to its target program type
as elsewhere in the verifier.

v3:
- Add Yonghong's ACK

Fixes: f45d5b6ce2 ("bpf: generalise tail call map compatibility check")
Acked-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://lore.kernel.org/r/20221214230254.790066-1-toke@redhat.com
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2022-12-14 21:30:27 -08:00
Minsuk Kang 9f28157778 nfc: pn533: Clear nfc_target before being used
Fix a slab-out-of-bounds read that occurs in nla_put() called from
nfc_genl_send_target() when target->sensb_res_len, which is duplicated
from an nfc_target in pn533, is too large as the nfc_target is not
properly initialized and retains garbage values. Clear nfc_targets with
memset() before they are used.

Found by a modified version of syzkaller.

BUG: KASAN: slab-out-of-bounds in nla_put
Call Trace:
 memcpy
 nla_put
 nfc_genl_dump_targets
 genl_lock_dumpit
 netlink_dump
 __netlink_dump_start
 genl_family_rcv_msg_dumpit
 genl_rcv_msg
 netlink_rcv_skb
 genl_rcv
 netlink_unicast
 netlink_sendmsg
 sock_sendmsg
 ____sys_sendmsg
 ___sys_sendmsg
 __sys_sendmsg
 do_syscall_64

Fixes: 673088fb42 ("NFC: pn533: Send ATR_REQ directly for active device detection")
Fixes: 361f3cb7f9 ("NFC: DEP link hook implementation for pn533")
Signed-off-by: Minsuk Kang <linuxlovemin@yonsei.ac.kr>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://lore.kernel.org/r/20221214015139.119673-1-linuxlovemin@yonsei.ac.kr
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-12-14 20:51:29 -08:00
Vladimir Oltean 628050ec95 net: enetc: avoid buffer leaks on xdp_do_redirect() failure
Before enetc_clean_rx_ring_xdp() calls xdp_do_redirect(), each software
BD in the RX ring between index orig_i and i can have one of 2 refcount
values on its page.

We are the owner of the current buffer that is being processed, so the
refcount will be at least 1.

If the current owner of the buffer at the diametrically opposed index
in the RX ring (i.o.w, the other half of this page) has not yet called
kfree(), this page's refcount could even be 2.

enetc_page_reusable() in enetc_flip_rx_buff() tests for the page
refcount against 1, and [ if it's 2 ] does not attempt to reuse it.

But if enetc_flip_rx_buff() is put after the xdp_do_redirect() call,
the page refcount can have one of 3 values. It can also be 0, if there
is no owner of the other page half, and xdp_do_redirect() for this
buffer ran so far that it triggered a flush of the devmap/cpumap bulk
queue, and the consumers of those bulk queues also freed the buffer,
all by the time xdp_do_redirect() returns the execution back to enetc.

This is the reason why enetc_flip_rx_buff() is called before
xdp_do_redirect(), but there is a big flaw with that reasoning:
enetc_flip_rx_buff() will set rx_swbd->page = NULL on both sides of the
enetc_page_reusable() branch, and if xdp_do_redirect() returns an error,
we call enetc_xdp_free(), which does not deal gracefully with that.

In fact, what happens is quite special. The page refcounts start as 1.
enetc_flip_rx_buff() figures they're reusable, transfers these
rx_swbd->page pointers to a different rx_swbd in enetc_reuse_page(), and
bumps the refcount to 2. When xdp_do_redirect() later returns an error,
we call the no-op enetc_xdp_free(), but we still haven't lost the
reference to that page. A copy of it is still at rx_ring->next_to_alloc,
but that has refcount 2 (and there are no concurrent owners of it in
flight, to drop the refcount). What really kills the system is when
we'll flip the rx_swbd->page the second time around. With an updated
refcount of 2, the page will not be reusable and we'll really leak it.
Then enetc_new_page() will have to allocate more pages, which will then
eventually leak again on further errors from xdp_do_redirect().

The problem, summarized, is that we zeroize rx_swbd->page before we're
completely done with it, and this makes it impossible for the error path
to do something with it.

Since the packet is potentially multi-buffer and therefore the
rx_swbd->page is potentially an array, manual passing of the old
pointers between enetc_flip_rx_buff() and enetc_xdp_free() is a bit
difficult.

For the sake of going with a simple solution, we accept the possibility
of racing with xdp_do_redirect(), and we move the flip procedure to
execute only on the redirect success path. By racing, I mean that the
page may be deemed as not reusable by enetc (having a refcount of 0),
but there will be no leak in that case, either.

Once we accept that, we have something better to do with buffers on
XDP_REDIRECT failure. Since we haven't performed half-page flipping yet,
we won't, either (and this way, we can avoid enetc_xdp_free()
completely, which gives the entire page to the slab allocator).
Instead, we'll call enetc_xdp_drop(), which will recycle this half of
the buffer back to the RX ring.

Fixes: 9d2b68cc10 ("net: enetc: add support for XDP_REDIRECT")
Suggested-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://lore.kernel.org/r/20221213001908.2347046-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-12-14 20:24:23 -08:00
John Stultz 76d62f24db pstore: Switch pmsg_lock to an rt_mutex to avoid priority inversion
Wei Wang reported seeing priority inversion caused latencies
caused by contention on pmsg_lock, and suggested it be switched
to a rt_mutex.

I was initially hesitant this would help, as the tasks in that
trace all seemed to be SCHED_NORMAL, so the benefit would be
limited to only nice boosting.

However, another similar issue was raised where the priority
inversion was seen did involve a blocked RT task so it is clear
this would be helpful in that case.

Cc: Wei Wang <wvw@google.com>
Cc: Midas Chien<midaschieh@google.com>
Cc: Connor O'Brien <connoro@google.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Anton Vorontsov <anton@enomsg.org>
Cc: Colin Cross <ccross@android.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: kernel-team@android.com
Fixes: 9d5438f462 ("pstore: Add pmsg - user-space accessible pstore object")
Reported-by: Wei Wang <wvw@google.com>
Signed-off-by: John Stultz <jstultz@google.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20221214231834.3711880-1-jstultz@google.com
2022-12-14 19:37:42 -08:00
Nathan Chancellor d6a9fb87e9 security: Restrict CONFIG_ZERO_CALL_USED_REGS to gcc or clang > 15.0.6
A bad bug in clang's implementation of -fzero-call-used-regs can result
in NULL pointer dereferences (see the links above the check for more
information). Restrict CONFIG_CC_HAS_ZERO_CALL_USED_REGS to either a
supported GCC version or a clang newer than 15.0.6, which will catch
both a theoretical 15.0.7 and the upcoming 16.0.0, which will both have
the bug fixed.

Cc: stable@vger.kernel.org # v5.15+
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20221214232602.4118147-1-nathan@kernel.org
2022-12-14 16:05:36 -08:00
Kristina Martsenko f68022ae0a lkdtm: cfi: Make PAC test work with GCC 7 and 8
The CFI test uses the branch-protection=none compiler attribute to
disable PAC return address protection on a function. While newer GCC
versions support this attribute, older versions (GCC 7 and 8) instead
supported the sign-return-address=none attribute, leading to a build
failure when the test is built with older compilers. Fix it by checking
which attribute is supported and using the correct one.

Fixes: 2e53b877dc ("lkdtm: Add CFI_BACKWARD to test ROP mitigations")
Reported-by: Daniel Díaz <daniel.diaz@linaro.org>
Signed-off-by: Kristina Martsenko <kristina.martsenko@arm.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/all/CAEUSe78kDPxQmQqCWW-_9LCgJDFhAeMoVBFnX9QLx18Z4uT4VQ@mail.gmail.com/
2022-12-14 16:05:09 -08:00
Masami Hiramatsu (Google) d4505aa6af tracing/probes: Reject symbol/symstr type for uprobe
Since uprobe's argument must contain the user-space data, that
should not be converted to kernel symbols. Reject if user
specifies these types on uprobe events. e.g.

 /sys/kernel/debug/tracing # echo 'p /bin/sh:10 %ax:symbol' >> uprobe_events
 sh: write error: Invalid argument
 /sys/kernel/debug/tracing # echo 'p /bin/sh:10 %ax:symstr' >> uprobe_events
 sh: write error: Invalid argument
 /sys/kernel/debug/tracing # cat error_log
 [ 1783.134883] trace_uprobe: error: Unknown type is specified
   Command: p /bin/sh:10 %ax:symbol
                             ^
 [ 1792.201120] trace_uprobe: error: Unknown type is specified
   Command: p /bin/sh:10 %ax:symstr
                             ^
Link: https://lore.kernel.org/all/166679931679.1528100.15540755370726009882.stgit@devnote3/

Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
2022-12-15 09:00:20 +09:00
Masami Hiramatsu (Google) b26a124cbf tracing/probes: Add symstr type for dynamic events
Add 'symstr' type for storing the kernel symbol as a string data
instead of the symbol address. This allows us to filter the
events by wildcard symbol name.

e.g.
  # echo 'e:wqfunc workqueue.workqueue_execute_start symname=$function:symstr' >> dynamic_events
  # cat events/eprobes/wqfunc/format
  name: wqfunc
  ID: 2110
  format:
  	field:unsigned short common_type;	offset:0;	size:2;	signed:0;
  	field:unsigned char common_flags;	offset:2;	size:1;	signed:0;
  	field:unsigned char common_preempt_count;	offset:3;	size:1;	signed:0;
  	field:int common_pid;	offset:4;	size:4;	signed:1;

  	field:__data_loc char[] symname;	offset:8;	size:4;	signed:1;

  print fmt: " symname=\"%s\"", __get_str(symname)

Note that there is already 'symbol' type which just change the
print format (so it still stores the symbol address in the tracing
ring buffer.) On the other hand, 'symstr' type stores the actual
"symbol+offset/size" data as a string.

Link: https://lore.kernel.org/all/166679930847.1528100.4124308529180235965.stgit@devnote3/

Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
2022-12-15 08:48:57 +09:00
wuqiang 3b7ddab8a1 kprobes: kretprobe events missing on 2-core KVM guest
Default value of maxactive is set as num_possible_cpus() for nonpreemptable
systems. For a 2-core system, only 2 kretprobe instances would be allocated
in default, then these 2 instances for execve kretprobe are very likely to
be used up with a pipelined command.

Here's the testcase: a shell script was added to crontab, and the content
of the script is:

  #!/bin/sh
  do_something_magic `tr -dc a-z < /dev/urandom | head -c 10`

cron will trigger a series of program executions (4 times every hour). Then
events loss would be noticed normally after 3-4 hours of testings.

The issue is caused by a burst of series of execve requests. The best number
of kretprobe instances could be different case by case, and should be user's
duty to determine, but num_possible_cpus() as the default value is inadequate
especially for systems with small number of cpus.

This patch enables the logic for preemption as default, thus increases the
minimum of maxactive to 10 for nonpreemptable systems.

Link: https://lore.kernel.org/all/20221110081502.492289-1-wuqiang.matt@bytedance.com/

Signed-off-by: wuqiang <wuqiang.matt@bytedance.com>
Reviewed-by: Solar Designer <solar@openwall.com>
Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
2022-12-15 08:48:40 +09:00
Linus Torvalds 041fae9c10 f2fs-for-6.2-rc1
In this round, we've added two features: 1) F2FS_IOC_START_ATOMIC_REPLACE and
 2) per-block age-based extent cache. 1) is a variant of the previous atomic
 write feature which guarantees a per-file atomicity. It would be more efficient
 than AtomicFile implementation in Android framework. 2) implements another type
 of extent cache in memory which keeps the per-block age in a file, so that block
 allocator could split the hot and cold data blocks more accurately.
 
 Enhancement:
  - introduce F2FS_IOC_START_ATOMIC_REPLACE
  - refactor extent_cache to add a new per-block-age-based extent cache support
  - introduce discard_urgent_util, gc_mode, max_ordered_discard sysfs knobs
  - add proc entry to show discard_plist info
  - optimize iteration over sparse directories
  - add barrier mount option
 
 Bug fix
  - avoid victim selection from previous victim section
  - fix to enable compress for newly created file if extension matches
  - set zstd compress level correctly
  - initialize locks early in f2fs_fill_super() to fix bugs reported by syzbot
  - correct i_size change for atomic writes
  - allow to read node block after shutdown
  - allow to set compression for inlined file
  - fix gc mode when gc_urgent_high_remaining is 1
  - should put a page when checking the summary info
 
 Minor fixes and various clean-ups in GC, discard, debugfs, sysfs, and doc.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEE00UqedjCtOrGVvQiQBSofoJIUNIFAmOaTNUACgkQQBSofoJI
 UNIQnw//V7Q8DUHw5YNj04jutwXH2DNMLAmn/NJh5S6dIzy/LiywlSzVg53/0/FP
 4K577urUkIhgilRO+yncUMSnSQk7BluQvGSx4ja2AV+dpDomjxM3GwIacGzSvr7D
 VfVf8Vig10UEFrrtEEKtv1VFlYHAmo8lLpubzrZHV8aZFLHHYO2fakQhPu8BYsaz
 eGCJwxjvTZcQUPkaeG9tWto3ChI3F6PzreiQ5TztHhLWSEgw/o0qijpsc+2SthaV
 my7uGjeBY8EGPeSYbeCxRtdx8g8Qu11K3ISuDj8zBybmjG3IWOGt1CVcrY6tZbal
 aL70CMtHkMqMn03VqbpCTqBtdWNMrrw5sYSL3qXIUdXlX/2yJBh9fLAeNxKNs5Nu
 6veSb2WgYMHqIsClkAAcP0xJ8g6kodGoG60wVr4ek0Vdt4osaQqwq+bnffpwwxtQ
 F+7aRuinv+rdrHJ4CuFXAmHPKh2lBe2lTTWZEKg2RptTxZ5DhD2Qn6x1khPD2GFA
 mG2Aeiq6PVxxEeIO+w/VBCuAgpGTFV2N/ZIF8VfjFNdWiN5OGLWQNHC2KGj2G2uV
 +fA+B91txQWtjY9h72YJb2+aGIixcnLY24ni4mDgDItqtpCB4PW56W8cbnbv9Pl+
 aXAWdADqJdDyllHoVB/JQ24gr2fATJGRIDeYDnw+vPP4f5ZT5vg=
 =f00t
 -----END PGP SIGNATURE-----

Merge tag 'f2fs-for-6.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs

Pull f2fs updates from Jaegeuk Kim:
 "In this round, we've added two features: F2FS_IOC_START_ATOMIC_REPLACE
  and a per-block age-based extent cache.

  F2FS_IOC_START_ATOMIC_REPLACE is a variant of the previous atomic
  write feature which guarantees a per-file atomicity. It would be more
  efficient than AtomicFile implementation in Android framework.

  The per-block age-based extent cache implements another type of extent
  cache in memory which keeps the per-block age in a file, so that block
  allocator could split the hot and cold data blocks more accurately.

  Enhancements:
   - introduce F2FS_IOC_START_ATOMIC_REPLACE
   - refactor extent_cache to add a new per-block-age-based extent cache support
   - introduce discard_urgent_util, gc_mode, max_ordered_discard sysfs knobs
   - add proc entry to show discard_plist info
   - optimize iteration over sparse directories
   - add barrier mount option

  Bug fixes:
   - avoid victim selection from previous victim section
   - fix to enable compress for newly created file if extension matches
   - set zstd compress level correctly
   - initialize locks early in f2fs_fill_super() to fix bugs reported by syzbot
   - correct i_size change for atomic writes
   - allow to read node block after shutdown
   - allow to set compression for inlined file
   - fix gc mode when gc_urgent_high_remaining is 1
   - should put a page when checking the summary info

  Minor fixes and various clean-ups in GC, discard, debugfs, sysfs, and
  doc"

* tag 'f2fs-for-6.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (63 commits)
  f2fs: reset wait_ms to default if any of the victims have been selected
  f2fs: fix some format WARNING in debug.c and sysfs.c
  f2fs: don't call f2fs_issue_discard_timeout() when discard_cmd_cnt is 0 in f2fs_put_super()
  f2fs: fix iostat parameter for discard
  f2fs: Fix spelling mistake in label: free_bio_enrty_cache -> free_bio_entry_cache
  f2fs: add block_age-based extent cache
  f2fs: allocate the extent_cache by default
  f2fs: refactor extent_cache to support for read and more
  f2fs: remove unnecessary __init_extent_tree
  f2fs: move internal functions into extent_cache.c
  f2fs: specify extent cache for read explicitly
  f2fs: introduce f2fs_is_readonly() for readability
  f2fs: remove F2FS_SET_FEATURE() and F2FS_CLEAR_FEATURE() macro
  f2fs: do some cleanup for f2fs module init
  MAINTAINERS: Add f2fs bug tracker link
  f2fs: remove the unused flush argument to change_curseg
  f2fs: open code allocate_segment_by_default
  f2fs: remove struct segment_allocation default_salloc_ops
  f2fs: introduce discard_urgent_util sysfs node
  f2fs: define MIN_DISCARD_GRANULARITY macro
  ...
2022-12-14 15:27:57 -08:00
Linus Torvalds eb67d239f3 RISC-V Patches for the 6.2 Merge Window, Part 1
* Support for the T-Head PMU via the perf subsystem.
 * ftrace support for rv32.
 * Support for non-volatile memory devices.
 * Various fixes and cleanups.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCAAxFiEEKzw3R0RoQ7JKlDp6LhMZ81+7GIkFAmOZ6WsTHHBhbG1lckBk
 YWJiZWx0LmNvbQAKCRAuExnzX7sYiWGcD/wLGiHq3ekQhl5D+CaA1WlJ5XzQFfY2
 bv1ZCZGdjuiv66jiMlmEsbpfUCk3bSAIjCO3MHQNDmTuPJztCHVJXOHbZFWItzzO
 soW4nXHKW1sGHa7hDLGQUPkltA48OdPoyqEDvlnpyEWFT+2xHwdFEURWE85FXGeq
 ZzFSKUQqX/V52n9TS4M4QtmNnQatR3TgIs8ttzD4JqwWFBbp4/iBfIGt6n3W24XH
 9lKWikO4YOYUPl0KVIakM4d8NmX7g+7vhCKWavLke1fF/IQOlyWwA0eM8ryj33OG
 L1nFkqfF3mCw9i72WHftlc0rAgVqcYS8ntnQkPNpt2zPp3xFjDwEy+XiZrRE+sAp
 m5Ma2Tkw7G3ueBtXwP1yo+EKa7PrVFbCRD/rEpLJAC6+9ktvc7cYs39E08O+wrwT
 qkYThDolovqMOqfOq6afEGy5lfIa5U00vxK+3MXiE3eLEjHSJhwTXadUbwyMjJWE
 zOwA6p5NfDFzklESSNTtIBY85Zlh/g2q6GWCy7yBQnlaSdbpDxcnAlSZipq66Iqm
 9ytdZiHid4BIRQxr5qyXTB184BvFnWNRs9NGhCj38uLEnuxwSChzwoh/WPDxLNte
 U9ouvwJO5U2qAZsMGJhY8W2s/9WvWpSqRhSMA/nnNV1Hh+URFz8rFXAln6kNn//v
 j+cYGCyjLnO1hg==
 =4Ak2
 -----END PGP SIGNATURE-----

Merge tag 'riscv-for-linus-6.2-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux

Pull RISC-V updates from Palmer Dabbelt:

 - Support for the T-Head PMU via the perf subsystem

 - ftrace support for rv32

 - Support for non-volatile memory devices

 - Various fixes and cleanups

* tag 'riscv-for-linus-6.2-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (52 commits)
  Documentation: RISC-V: patch-acceptance: s/implementor/implementer
  Documentation: RISC-V: Mention the UEFI Standards
  Documentation: RISC-V: Allow patches for non-standard behavior
  Documentation: RISC-V: Fix a typo in patch-acceptance
  riscv: Fixup compile error with !MMU
  riscv: Fix P4D_SHIFT definition for 3-level page table mode
  riscv: Apply a static assert to riscv_isa_ext_id
  RISC-V: Add some comments about the shadow and overflow stacks
  RISC-V: Align the shadow stack
  RISC-V: Ensure Zicbom has a valid block size
  RISC-V: Introduce riscv_isa_extension_check
  RISC-V: Improve use of isa2hwcap[]
  riscv: Don't duplicate _ALTERNATIVE_CFG* macros
  riscv: alternatives: Drop the underscores from the assembly macro names
  riscv: alternatives: Don't name unused macro parameters
  riscv: Don't duplicate __ALTERNATIVE_CFG in __ALTERNATIVE_CFG_2
  riscv: mm: call best_map_size many times during linear-mapping
  riscv: Move cast inside kernel_mapping_[pv]a_to_[vp]a
  riscv: Fix crash during early errata patching
  riscv: boot: add zstd support
  ...
2022-12-14 15:23:49 -08:00
Linus Torvalds 94a855111e - Add the call depth tracking mitigation for Retbleed which has
been long in the making. It is a lighterweight software-only fix for
 Skylake-based cores where enabling IBRS is a big hammer and causes a
 significant performance impact.
 
 What it basically does is, it aligns all kernel functions to 16 bytes
 boundary and adds a 16-byte padding before the function, objtool
 collects all functions' locations and when the mitigation gets applied,
 it patches a call accounting thunk which is used to track the call depth
 of the stack at any time.
 
 When that call depth reaches a magical, microarchitecture-specific value
 for the Return Stack Buffer, the code stuffs that RSB and avoids its
 underflow which could otherwise lead to the Intel variant of Retbleed.
 
 This software-only solution brings a lot of the lost performance back,
 as benchmarks suggest:
 
   https://lore.kernel.org/all/20220915111039.092790446@infradead.org/
 
 That page above also contains a lot more detailed explanation of the
 whole mechanism
 
 - Implement a new control flow integrity scheme called FineIBT which is
 based on the software kCFI implementation and uses hardware IBT support
 where present to annotate and track indirect branches using a hash to
 validate them
 
 - Other misc fixes and cleanups
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmOZp5EACgkQEsHwGGHe
 VUrZFxAAvi/+8L0IYSK4mKJvixGbTFjxN/Swo2JVOfs34LqGUT6JaBc+VUMwZxdb
 VMTFIZ3ttkKEodjhxGI7oGev6V8UfhI37SmO2lYKXpQVjXXnMlv/M+Vw3teE38CN
 gopi+xtGnT1IeWQ3tc/Tv18pleJ0mh5HKWiW+9KoqgXj0wgF9x4eRYDz1TDCDA/A
 iaBzs56j8m/FSykZHnrWZ/MvjKNPdGlfJASUCPeTM2dcrXQGJ93+X2hJctzDte0y
 Nuiw6Y0htfFBE7xoJn+sqm5Okr+McoUM18/CCprbgSKYk18iMYm3ZtAi6FUQZS1A
 ua4wQCf49loGp15PO61AS5d3OBf5D3q/WihQRbCaJvTVgPp9sWYnWwtcVUuhMllh
 ZQtBU9REcVJ/22bH09Q9CjBW0VpKpXHveqQdqRDViLJ6v/iI6EFGmD24SW/VxyRd
 73k9MBGrL/dOf1SbEzdsnvcSB3LGzp0Om8o/KzJWOomrVKjBCJy16bwTEsCZEJmP
 i406m92GPXeaN1GhTko7vmF0GnkEdJs1GVCZPluCAxxbhHukyxHnrjlQjI4vC80n
 Ylc0B3Kvitw7LGJsPqu+/jfNHADC/zhx1qz/30wb5cFmFbN1aRdp3pm8JYUkn+l/
 zri2Y6+O89gvE/9/xUhMohzHsWUO7xITiBavewKeTP9GSWybWUs=
 =cRy1
 -----END PGP SIGNATURE-----

Merge tag 'x86_core_for_v6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 core updates from Borislav Petkov:

 - Add the call depth tracking mitigation for Retbleed which has been
   long in the making. It is a lighterweight software-only fix for
   Skylake-based cores where enabling IBRS is a big hammer and causes a
   significant performance impact.

   What it basically does is, it aligns all kernel functions to 16 bytes
   boundary and adds a 16-byte padding before the function, objtool
   collects all functions' locations and when the mitigation gets
   applied, it patches a call accounting thunk which is used to track
   the call depth of the stack at any time.

   When that call depth reaches a magical, microarchitecture-specific
   value for the Return Stack Buffer, the code stuffs that RSB and
   avoids its underflow which could otherwise lead to the Intel variant
   of Retbleed.

   This software-only solution brings a lot of the lost performance
   back, as benchmarks suggest:

       https://lore.kernel.org/all/20220915111039.092790446@infradead.org/

   That page above also contains a lot more detailed explanation of the
   whole mechanism

 - Implement a new control flow integrity scheme called FineIBT which is
   based on the software kCFI implementation and uses hardware IBT
   support where present to annotate and track indirect branches using a
   hash to validate them

 - Other misc fixes and cleanups

* tag 'x86_core_for_v6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (80 commits)
  x86/paravirt: Use common macro for creating simple asm paravirt functions
  x86/paravirt: Remove clobber bitmask from .parainstructions
  x86/debug: Include percpu.h in debugreg.h to get DECLARE_PER_CPU() et al
  x86/cpufeatures: Move X86_FEATURE_CALL_DEPTH from bit 18 to bit 19 of word 11, to leave space for WIP X86_FEATURE_SGX_EDECCSSA bit
  x86/Kconfig: Enable kernel IBT by default
  x86,pm: Force out-of-line memcpy()
  objtool: Fix weak hole vs prefix symbol
  objtool: Optimize elf_dirty_reloc_sym()
  x86/cfi: Add boot time hash randomization
  x86/cfi: Boot time selection of CFI scheme
  x86/ibt: Implement FineIBT
  objtool: Add --cfi to generate the .cfi_sites section
  x86: Add prefix symbols for function padding
  objtool: Add option to generate prefix symbols
  objtool: Avoid O(bloody terrible) behaviour -- an ode to libelf
  objtool: Slice up elf_create_section_symbol()
  kallsyms: Revert "Take callthunks into account"
  x86: Unconfuse CONFIG_ and X86_FEATURE_ namespaces
  x86/retpoline: Fix crash printing warning
  x86/paravirt: Fix a !PARAVIRT build warning
  ...
2022-12-14 15:03:00 -08:00
Kees Cook 00dd027f72 docs: Fix path paste-o for /sys/kernel/warn_count
Running "make htmldocs" shows that "/sys/kernel/oops_count" was
duplicated. This should have been "warn_count":

  Warning: /sys/kernel/oops_count is defined 2 times:
  ./Documentation/ABI/testing/sysfs-kernel-warn_count:0
  ./Documentation/ABI/testing/sysfs-kernel-oops_count:0

Fix the typo.

Reported-by: kernel test robot <lkp@intel.com>
Link: https://lore.kernel.org/linux-doc/202212110529.A3Qav8aR-lkp@intel.com
Fixes: 8b05aa2633 ("panic: Expose "warn_count" to sysfs")
Cc: linux-hardening@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
2022-12-14 14:37:59 -08:00
Kees Cook 1a17e5b513 LoadPin: Ignore the "contents" argument of the LSM hooks
LoadPin only enforces the read-only origin of kernel file reads. Whether
or not it was a partial read isn't important. Remove the overly
conservative checks so that things like partial firmware reads will
succeed (i.e. reading a firmware header).

Fixes: 2039bda1fa ("LSM: Add "contents" flag to kernel_read_file hook")
Cc: Paul Moore <paul@paul-moore.com>
Cc: James Morris <jmorris@namei.org>
Cc: "Serge E. Hallyn" <serge@hallyn.com>
Cc: linux-security-module@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Acked-by: Serge Hallyn <serge@hallyn.com>
Tested-by: Ping-Ke Shih <pkshih@realtek.com>
Link: https://lore.kernel.org/r/20221209195453.never.494-kees@kernel.org
2022-12-14 14:34:18 -08:00
Christian König 4772222066 drm/amdgpu: revert "generally allow over-commit during BO allocation"
This reverts commit f9d00a4a8d.

This causes problem for KFD because when we overcommit we accidentially
bind the BO to GTT for moving it into VRAM. We also need to make sure
that this is done only as fallback after trying to evict first.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-12-14 16:48:00 -05:00
Luben Tuikov 3273f11675 drm/amdgpu: Remove unnecessary domain argument
Remove the "domain" argument to amdgpu_bo_create_kernel_at() since this
function takes an "offset" argument which is the offset off of VRAM, and as
such allocation always takes place in VRAM. Thus, the "domain" argument is
unnecessary.

Cc: Alex Deucher <Alexander.Deucher@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: AMD Graphics <amd-gfx@lists.freedesktop.org>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-12-14 16:48:00 -05:00
Luben Tuikov 7554886daa drm/amdgpu: Fix size validation for non-exclusive domains (v4)
Fix amdgpu_bo_validate_size() to check whether the TTM domain manager for the
requested memory exists, else we get a kernel oops when dereferencing "man".

v2: Make the patch standalone, i.e. not dependent on local patches.
v3: Preserve old behaviour and just check that the manager pointer is not
    NULL.
v4: Complain if GTT domain requested and it is uninitialized--most likely a
    bug.

Cc: Alex Deucher <Alexander.Deucher@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: AMD Graphics <amd-gfx@lists.freedesktop.org>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-12-14 16:48:00 -05:00
Luben Tuikov 28afcb0ad5 drm/amdgpu: Check if fru_addr is not NULL (v2)
Always check if fru_addr is not NULL. This commit also fixes a "smatch"
warning.

v2: Add a Fixes tag.

Cc: Alex Deucher <Alexander.Deucher@amd.com>
Cc: Dan Carpenter <error27@gmail.com>
Cc: kernel test robot <lkp@intel.com>
Cc: AMD Graphics <amd-gfx@lists.freedesktop.org>
Fixes: afbe5d1e4b ("drm/amdgpu: Bug-fix: Reading I2C FRU data on newer ASICs")
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Reviewed-by: Kent Russell <kent.russell@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-12-14 16:48:00 -05:00
Linus Torvalds 93761c93e9 + Features
- switch to zstd compression for profile raw data
 
 + Cleanups
   - Simplify obtain the newest label on a cred
   - remove useless static inline functions
   - compute permission conversion on policy unpack
   - refactor code to share common permissins
   - refactor unpack to group policy backwards compatiblity code
   - add __init annotation to aa_{setup/teardown}_dfa_engine()
 
 + Bug Fixes
   - fix a memleak in
     - multi_transaction_new()
     - free_ruleset()
     - unpack_profile()
     - alloc_ns()
   - fix lockdep warning when removing a namespace
   - fix regression in stacking due to label flags
   - fix loading of child before parent
   - fix kernel-doc comments that differ from fns
   - fix spelling errors in comments
   - store return value of unpack_perms_table() to signed variable
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEE7cSDD705q2rFEEf7BS82cBjVw9gFAmOZwywACgkQBS82cBjV
 w9jBjRAAmj4gyK0L3eGY4IV2BpvnkHwHY4lOObJulTwILOOj0Pz8CJqRCa/HDCGj
 aOlnwqksPsAjadzzfi58D6TnT+3fOuskbcMgTyvX5jraTXPrUl90+hXorbXKuLrw
 iaX6QxW8soNW/s3oJhrC2HxbIhGA9VpVnmQpVZpJMmz5bU2xmzL62FCN8x88kytr
 9CygaudPrvwYJf5pPd62p7ltj2S6lFwZ6dVCyiDQGTc+Gyng4G8p4MCfI1CwMMyo
 mAUeeRnoeeBwH3tSy/Wsr72jPKjsMASpcMHo3ns/dVSw/ug2FYYToZbfxT/uAa6O
 WVHfS1Kv/5afG9xxyfocWecd+Yp3lsXq9F+q36uOT9NeJmlej9aJr5sWMcvV3sru
 QVNN7tFZbHqCnLhpl6RDH/NiguweNYQXrl2lukXZe/FKu/KDasFIOzL+IAt2TqZE
 3mWrha7Q7j/gdBw8+fHHGtXCx0NSQlz1oFLo/y/mI7ztwUPJsBYbH5+108iP0ys/
 7Kd+jkYRucJB4upGH4meQbN6f/rrs3+m/b/j0Q8RCFHAs2f+mYZeN/JOHCo0T4YH
 KO1W60846fPs+7yZTVxWYFpR/kIuXksyxMWpEEZFFtF4MNoaeM1uypBWqm/JmKYr
 8oDtEyiOd/qmZnWRcuO3/bmdoJUZY1zTXWA0dlScYc8vR4KC+EE=
 =6GKy
 -----END PGP SIGNATURE-----

Merge tag 'apparmor-pr-2022-12-14' of git://git.kernel.org/pub/scm/linux/kernel/git/jj/linux-apparmor

Pull apparmor updates from John Johansen:
 "Features:
   - switch to zstd compression for profile raw data

  Cleanups:
   - simplify obtaining the newest label on a cred
   - remove useless static inline functions
   - compute permission conversion on policy unpack
   - refactor code to share common permissins
   - refactor unpack to group policy backwards compatiblity code
   - add __init annotation to aa_{setup/teardown}_dfa_engine()

  Bug Fixes:
   - fix a memleak in
       - multi_transaction_new()
       - free_ruleset()
       - unpack_profile()
       - alloc_ns()
   - fix lockdep warning when removing a namespace
   - fix regression in stacking due to label flags
   - fix loading of child before parent
   - fix kernel-doc comments that differ from fns
   - fix spelling errors in comments
   - store return value of unpack_perms_table() to signed variable"

* tag 'apparmor-pr-2022-12-14' of git://git.kernel.org/pub/scm/linux/kernel/git/jj/linux-apparmor: (64 commits)
  apparmor: Fix uninitialized symbol 'array_size' in policy_unpack_test.c
  apparmor: Add __init annotation to aa_{setup/teardown}_dfa_engine()
  apparmor: Fix memleak in alloc_ns()
  apparmor: Fix memleak issue in unpack_profile()
  apparmor: fix a memleak in free_ruleset()
  apparmor: Fix spelling of function name in comment block
  apparmor: Use pointer to struct aa_label for lbs_cred
  AppArmor: Fix kernel-doc
  LSM: Fix kernel-doc
  AppArmor: Fix kernel-doc
  apparmor: Fix loading of child before parent
  apparmor: refactor code that alloc null profiles
  apparmor: fix obsoleted comments for aa_getprocattr() and audit_resource()
  apparmor: remove useless static inline functions
  apparmor: Fix unpack_profile() warn: passing zero to 'ERR_PTR'
  apparmor: fix uninitialize table variable in error in unpack_trans_table
  apparmor: store return value of unpack_perms_table() to signed variable
  apparmor: Fix kunit test for out of bounds array
  apparmor: Fix decompression of rawdata for read back to userspace
  apparmor: Fix undefined references to zstd_ symbols
  ...
2022-12-14 13:42:09 -08:00
Linus Torvalds 64e7003c6b This update includes the following changes:
API:
 
 - Optimise away self-test overhead when they are disabled.
 - Support symmetric encryption via keyring keys in af_alg.
 - Flip hwrng default_quality, the default is now maximum entropy.
 
 Algorithms:
 
 - Add library version of aesgcm.
 - CFI fixes for assembly code.
 - Add arm/arm64 accelerated versions of sm3/sm4.
 
 Drivers:
 
 - Remove assumption on arm64 that kmalloc is DMA-aligned.
 - Fix selftest failures in rockchip.
 - Add support for RK3328/RK3399 in rockchip.
 - Add deflate support in qat.
 - Merge ux500 into stm32.
 - Add support for TEE for PCI ID 0x14CA in ccp.
 - Add mt7986 support in mtk.
 - Add MaxLinear platform support in inside-secure.
 - Add NPCM8XX support in npcm.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEn51F/lCuNhUwmDeSxycdCkmxi6cFAmOZhNQACgkQxycdCkmx
 i6edOQ/+IHYe2Z+fLsMGs0qgTVaEV33O0crTRl/PMkfBJai57grz6x/G9QrkwGHS
 084u4RmwhVrE7Z/pxvey48m0lHMw3H/ElLTRl5LV1zE2OtGgr4VV63wtqthu1QS1
 KblVnjb52DhFhvF1O1IrK9lxyX0lByOiARFVdyZR6+Rb66Xfq8rqk5t8U8mmTUFz
 ds9S2Un4HajgtjNEyI78DOX8o4wVST8tltQs0eVii6T9AeXgSgX37ytD7Xtg/zrz
 /p61KFgKBQkRT7EEGD6xgNrND0vNAp2w98ZTTRXTZI8+Y0aTUcTYya7cXOLBt9bQ
 rA7z9sNKvmwJijTMV6O9eqRGcYfzc2G4qfMhlQqj/P2pjLnEZXdvFNHTTbclR76h
 2UFlZXPDQVQukvnNNnB6bmIvv6DsM+jmGH0pK5BnBJXnD5SOZh1RqjJxw0Kj6QCM
 VxpKDvfStux2Guh6mz1lJna/S44qKy/sVYkWUawcmE4RF2+GfNayM1GUpEUofndE
 vz1yZdgLPETSh5QzKrjFkUAnqo/AsAdc5Qxroz9DRz1BCC0GCuIxjUG8ScTWgcth
 R/reQDczBckCNpPxrWPHHYoVXnAMwEFySfcjZyuCoMO6t6qVUvcjRShCyKwO/JPl
 9YREdRmq0swwIB9cFIrEoWrzc3wjjBtsltDFlkKsa9c92LXoW+g=
 =OpWt
 -----END PGP SIGNATURE-----

Merge tag 'v6.2-p1' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6

Pull crypto updates from Herbert Xu:
 "API:
   - Optimise away self-test overhead when they are disabled
   - Support symmetric encryption via keyring keys in af_alg
   - Flip hwrng default_quality, the default is now maximum entropy

  Algorithms:
   - Add library version of aesgcm
   - CFI fixes for assembly code
   - Add arm/arm64 accelerated versions of sm3/sm4

  Drivers:
   - Remove assumption on arm64 that kmalloc is DMA-aligned
   - Fix selftest failures in rockchip
   - Add support for RK3328/RK3399 in rockchip
   - Add deflate support in qat
   - Merge ux500 into stm32
   - Add support for TEE for PCI ID 0x14CA in ccp
   - Add mt7986 support in mtk
   - Add MaxLinear platform support in inside-secure
   - Add NPCM8XX support in npcm"

* tag 'v6.2-p1' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (184 commits)
  crypto: ux500/cryp - delete driver
  crypto: stm32/cryp - enable for use with Ux500
  crypto: stm32 - enable drivers to be used on Ux500
  dt-bindings: crypto: Let STM32 define Ux500 CRYP
  hwrng: geode - Fix PCI device refcount leak
  hwrng: amd - Fix PCI device refcount leak
  crypto: qce - Set DMA alignment explicitly
  crypto: octeontx2 - Set DMA alignment explicitly
  crypto: octeontx - Set DMA alignment explicitly
  crypto: keembay - Set DMA alignment explicitly
  crypto: safexcel - Set DMA alignment explicitly
  crypto: hisilicon/hpre - Set DMA alignment explicitly
  crypto: chelsio - Set DMA alignment explicitly
  crypto: ccree - Set DMA alignment explicitly
  crypto: ccp - Set DMA alignment explicitly
  crypto: cavium - Set DMA alignment explicitly
  crypto: img-hash - Fix variable dereferenced before check 'hdev->req'
  crypto: arm64/ghash-ce - use frame_push/pop macros consistently
  crypto: arm64/crct10dif - use frame_push/pop macros consistently
  crypto: arm64/aes-modes - use frame_push/pop macros consistently
  ...
2022-12-14 12:31:09 -08:00
Linus Torvalds 48ea09cdda hardening updates for v6.2-rc1
- Convert flexible array members, fix -Wstringop-overflow warnings,
   and fix KCFI function type mismatches that went ignored by
   maintainers (Gustavo A. R. Silva, Nathan Chancellor, Kees Cook).
 
 - Remove the remaining side-effect users of ksize() by converting
   dma-buf, btrfs, and coredump to using kmalloc_size_roundup(),
   add more __alloc_size attributes, and introduce full testing
   of all allocator functions. Finally remove the ksize() side-effect
   so that each allocation-aware checker can finally behave without
   exceptions.
 
 - Introduce oops_limit (default 10,000) and warn_limit (default off)
   to provide greater granularity of control for panic_on_oops and
   panic_on_warn (Jann Horn, Kees Cook).
 
 - Introduce overflows_type() and castable_to_type() helpers for
   cleaner overflow checking.
 
 - Improve code generation for strscpy() and update str*() kern-doc.
 
 - Convert strscpy and sigphash tests to KUnit, and expand memcpy
   tests.
 
 - Always use a non-NULL argument for prepare_kernel_cred().
 
 - Disable structleak plugin in FORTIFY KUnit test (Anders Roxell).
 
 - Adjust orphan linker section checking to respect CONFIG_WERROR
   (Xin Li).
 
 - Make sure siginfo is cleared for forced SIGKILL (haifeng.xu).
 
 - Fix um vs FORTIFY warnings for always-NULL arguments.
 -----BEGIN PGP SIGNATURE-----
 
 iQJKBAABCgA0FiEEpcP2jyKd1g9yPm4TiXL039xtwCYFAmOZSOoWHGtlZXNjb29r
 QGNocm9taXVtLm9yZwAKCRCJcvTf3G3AJjAAD/0YkvpU7f03f8hcQMJK6wv//24K
 AW41hEaBikq9RcmkuvkLLrJRibGgZ5O2xUkUkxRs/HxhkhrZ0kEw8sbwZe8MoWls
 F4Y9+TDjsrdHmjhfcBZdLnVxwcKK5wlaEcpjZXtbsfcdhx3TbgcDA23YELl5t0K+
 I11j4kYmf9SLl4CwIrSP5iACml8CBHARDh8oIMF7FT/LrjNbM8XkvBcVVT6hTbOV
 yjgA8WP2e9GXvj9GzKgqvd0uE/kwPkVAeXLNFWopPi4FQ8AWjlxbBZR0gamA6/EB
 d7TIs0ifpVU2JGQaTav4xO6SsFMj3ntoUI0qIrFaTxZAvV4KYGrPT/Kwz1O4SFaG
 rN5lcxseQbPQSBTFNG4zFjpywTkVCgD2tZqDwz5Rrmiraz0RyIokCN+i4CD9S0Ds
 oEd8JSyLBk1sRALczkuEKo0an5AyC9YWRcBXuRdIHpLo08PsbeUUSe//4pe303cw
 0ApQxYOXnrIk26MLElTzSMImlSvlzW6/5XXzL9ME16leSHOIfDeerPnc9FU9Eb3z
 ODv22z6tJZ9H/apSUIHZbMciMbbVTZ8zgpkfydr08o87b342N/ncYHZ5cSvQ6DWb
 jS5YOIuvl46/IhMPT16qWC8p0bP5YhxoPv5l6Xr0zq0ooEj0E7keiD/SzoLvW+Qs
 AHXcibguPRQBPAdiPQ==
 =yaaN
 -----END PGP SIGNATURE-----

Merge tag 'hardening-v6.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux

Pull kernel hardening updates from Kees Cook:

 - Convert flexible array members, fix -Wstringop-overflow warnings, and
   fix KCFI function type mismatches that went ignored by maintainers
   (Gustavo A. R. Silva, Nathan Chancellor, Kees Cook)

 - Remove the remaining side-effect users of ksize() by converting
   dma-buf, btrfs, and coredump to using kmalloc_size_roundup(), add
   more __alloc_size attributes, and introduce full testing of all
   allocator functions. Finally remove the ksize() side-effect so that
   each allocation-aware checker can finally behave without exceptions

 - Introduce oops_limit (default 10,000) and warn_limit (default off) to
   provide greater granularity of control for panic_on_oops and
   panic_on_warn (Jann Horn, Kees Cook)

 - Introduce overflows_type() and castable_to_type() helpers for cleaner
   overflow checking

 - Improve code generation for strscpy() and update str*() kern-doc

 - Convert strscpy and sigphash tests to KUnit, and expand memcpy tests

 - Always use a non-NULL argument for prepare_kernel_cred()

 - Disable structleak plugin in FORTIFY KUnit test (Anders Roxell)

 - Adjust orphan linker section checking to respect CONFIG_WERROR (Xin
   Li)

 - Make sure siginfo is cleared for forced SIGKILL (haifeng.xu)

 - Fix um vs FORTIFY warnings for always-NULL arguments

* tag 'hardening-v6.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: (31 commits)
  ksmbd: replace one-element arrays with flexible-array members
  hpet: Replace one-element array with flexible-array member
  um: virt-pci: Avoid GCC non-NULL warning
  signal: Initialize the info in ksignal
  lib: fortify_kunit: build without structleak plugin
  panic: Expose "warn_count" to sysfs
  panic: Introduce warn_limit
  panic: Consolidate open-coded panic_on_warn checks
  exit: Allow oops_limit to be disabled
  exit: Expose "oops_count" to sysfs
  exit: Put an upper limit on how often we can oops
  panic: Separate sysctl logic from CONFIG_SMP
  mm/pgtable: Fix multiple -Wstringop-overflow warnings
  mm: Make ksize() a reporting-only function
  kunit/fortify: Validate __alloc_size attribute results
  drm/sti: Fix return type of sti_{dvo,hda,hdmi}_connector_mode_valid()
  drm/fsl-dcu: Fix return type of fsl_dcu_drm_connector_mode_valid()
  driver core: Add __alloc_size hint to devm allocators
  overflow: Introduce overflows_type() and castable_to_type()
  coredump: Proactively round up to kmalloc bucket size
  ...
2022-12-14 12:20:00 -08:00
Linus Torvalds ad76bf1ff1 memblock: extend test coverage
* add tests that trigger reallocation of memblock structures from
   memblock itself via memblock_double_array()
 * add tests for memblock_alloc_exact_nid_raw() that verify that requested
   node and memory range constraints are respected.
 -----BEGIN PGP SIGNATURE-----
 
 iQFMBAABCAA2FiEEeOVYVaWZL5900a/pOQOGJssO/ZEFAmOYL14YHG1pa2UucmFw
 b3BvcnRAZ21haWwuY29tAAoJEDkDhibLDv2RZdcH/2AE447oXzVO2lzOgkqQH1EX
 xJdaa7hu00h2Euzv2lgcOHroHGXDP8wYjUV2cEyNZMP0WOMiO8i6rwIKmrzWufcm
 R+ZoKPQV/Nc+7rIycpW455yLxcgsVIpUILK2BQEkDCGYugSHKb7IYdcA9KDJwtmR
 xIG9j8nsuwWJtmtAuQqNOBmsc5FzKNYFa/RtDiJoMFmQNK3UqB8G8VCASdP0DYvH
 7MXPcyRmlwpmOsKoNKi2/wQBsiag8/PLgcZv5vYg+E6no1tMG6u7pgDS12Sn6ZvA
 I8gThJ8HNAo0d1O2SnbkicMx2CqrPFSub3QXaEFjCZF5mdBcirxHc/VBKj50TXU=
 =iXEA
 -----END PGP SIGNATURE-----

Merge tag 'memblock-v6.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rppt/memblock

Pull memblock updates from Mike Rapoport:
 "Extend test coverage:

   - add tests that trigger reallocation of memblock structures from
     memblock itself via memblock_double_array()

   - add tests for memblock_alloc_exact_nid_raw() that verify that
     requested node and memory range constraints are respected"

* tag 'memblock-v6.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rppt/memblock:
  memblock tests: remove completed TODO item
  memblock tests: add generic NUMA tests for memblock_alloc_exact_nid_raw
  memblock tests: add bottom-up NUMA tests for memblock_alloc_exact_nid_raw
  memblock tests: add top-down NUMA tests for memblock_alloc_exact_nid_raw
  memblock tests: introduce range tests for memblock_alloc_exact_nid_raw
  memblock test: Update TODO list
  memblock test: Add test to memblock_reserve() 129th region
  memblock test: Add test to memblock_add() 129th region
2022-12-14 12:17:57 -08:00
Jiri Olsa 4121d4481b bpf: Synchronize dispatcher update with bpf_dispatcher_xdp_func
Hao Sun reported crash in dispatcher image [1].

Currently we don't have any sync between bpf_dispatcher_update and
bpf_dispatcher_xdp_func, so following race is possible:

 cpu 0:                               cpu 1:

 bpf_prog_run_xdp
   ...
   bpf_dispatcher_xdp_func
     in image at offset 0x0

                                      bpf_dispatcher_update
                                        update image at offset 0x800
                                      bpf_dispatcher_update
                                        update image at offset 0x0

     in image at offset 0x0 -> crash

Fixing this by synchronizing dispatcher image update (which is done
in bpf_dispatcher_update function) with bpf_dispatcher_xdp_func that
reads and execute the dispatcher image.

Calling synchronize_rcu after updating and installing new image ensures
that readers leave old image before it's changed in the next dispatcher
update. The update itself is locked with dispatcher's mutex.

The bpf_prog_run_xdp is called under local_bh_disable and synchronize_rcu
will wait for it to leave [2].

[1] https://lore.kernel.org/bpf/Y5SFho7ZYXr9ifRn@krava/T/#m00c29ece654bc9f332a17df493bbca33e702896c
[2] https://lore.kernel.org/bpf/0B62D35A-E695-4B7A-A0D4-774767544C1A@gmail.com/T/#mff43e2c003ae99f4a38f353c7969be4c7162e877

Reported-by: Hao Sun <sunhao.th@gmail.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Acked-by: Paul E. McKenney <paulmck@kernel.org>
Link: https://lore.kernel.org/r/20221214123542.1389719-1-jolsa@kernel.org
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2022-12-14 12:02:14 -08:00
Tejun Heo 813e693023 blk-iolatency: Fix memory leak on add_disk() failures
When a gendisk is successfully initialized but add_disk() fails such as when
a loop device has invalid number of minor device numbers specified,
blkcg_init_disk() is called during init and then blkcg_exit_disk() during
error handling. Unfortunately, iolatency gets initialized in the former but
doesn't get cleaned up in the latter.

This is because, in non-error cases, the cleanup is performed by
del_gendisk() calling rq_qos_exit(), the assumption being that rq_qos
policies, iolatency being one of them, can only be activated once the disk
is fully registered and visible. That assumption is true for wbt and iocost,
but not so for iolatency as it gets initialized before add_disk() is called.

It is desirable to lazy-init rq_qos policies because they are optional
features and add to hot path overhead once initialized - each IO has to walk
all the registered rq_qos policies. So, we want to switch iolatency to lazy
init too. However, that's a bigger change. As a fix for the immediate
problem, let's just add an extra call to rq_qos_exit() in blkcg_exit_disk().
This is safe because duplicate calls to rq_qos_exit() become noop's.

Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: darklight2357@icloud.com
Cc: Josef Bacik <josef@toxicpanda.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Fixes: d706751215 ("block: introduce blk-iolatency io controller")
Cc: stable@vger.kernel.org # v4.19+
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/Y5TQ5gm3O4HXrXR3@slm.duckdns.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-12-14 12:42:55 -07:00
Linus Torvalds 6f1f5caed5 orangefs: four fixes from Zhang Xiaoxu and two from Colin Ian King
Zhang: fixed problems with memory leaks on exit in sysfs and debufs.
 fs/orangefs/orangefs-debugfs.c
 fs/orangefs/orangefs-sysfs.c
 fs/orangefs/orangefs-debugfs.c
 fs/orangefs/orangefs-mod.c
 
 Colin: removed an unused variable and an unneeded assignment.
 fs/orangefs/file.c
 fs/orangefs/inode.c
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEIGSFVdO6eop9nER2z0QOqevODb4FAmOQ+4sACgkQz0QOqevO
 Db4hLg//Vj5t1OBMxc/qbHGs6DwyIThQATss2jE2q0dkWwXxwsmxa23dwXzQGe6b
 hF+gLzh5AgJ4BnaoFYAgF/pJ7Su1iz3FLzh1J4dQ1zfPA/cr0vAhoxmpMH0X4+H8
 nycDNn/gZuZVn/2x1SVnDrut6R9mIQM6ESYdO/99TK8z2IIgqiaCZRi0timPp6Gs
 z3XYibdSbFib4xjbNLcdwLsLphzR4XupMdaXGai7MBriEc3lLN5dIn1hElbrmHD8
 k3xRVyWlyBknp1/xb8icnn05vim35PIbuY6K9RaJfhx94qdfu51c0rST1Ay4zY0Z
 FKTI6CocraXJWZ2557yXrLCxUa+/t7VYvxmF7oYgrqCD4xlzybEO79iGo6mvJ1cd
 47PERjJFDa5AvVYFC9Ggm5V4qJhX60fUxXubmwhQVtMNFNjo/P9dpxbmkHmFmTG3
 Op+N45MsVJNxQDtE1qwcoTU5nBEX/6ziQGaYkgXa8YArSn89xs4i89wQlACsJElM
 xXZCoKxQgT1wyq2DvoieVAShFSXXJ9OQjOYI3lGN57n+jHSweR0Jr5IfUGq3dfax
 nyyjcDr2p291w2weFbsQtdUTdGKKhEiky7aSIvFMtCXUHdsIIty+YSx3vjAk3A/p
 a0HFgZBuTPYwdbwW4EA+12Prf2xnsoHV/51dbklCAhKEGAPSMg0=
 =2ddP
 -----END PGP SIGNATURE-----

Merge tag 'for-linus-6.2-ofs1' of git://git.kernel.org/pub/scm/linux/kernel/git/hubcap/linux

Pull orangefs updates from Mike Marshall:

 - fix problems with memory leaks on exit in sysfs and debufs (Zhang)

 - remove an unused variable and an unneeded assignment (Colin)

* tag 'for-linus-6.2-ofs1' of git://git.kernel.org/pub/scm/linux/kernel/git/hubcap/linux:
  orangefs: Fix kmemleak in orangefs_{kernel,client}_debug_init()
  orangefs: Fix kmemleak in orangefs_sysfs_init()
  orangefs: Fix kmemleak in orangefs_prepare_debugfs_help_string()
  orangefs: Fix sysfs not cleanup when dev init failed
  orangefs: remove redundant assignment to variable buffer_index
  orangefs: remove variable i
2022-12-14 11:16:33 -08:00
Tetsuo Handa 3c3bfb8586 fbdev: fbcon: release buffer when fbcon_do_set_font() failed
syzbot is reporting memory leak at fbcon_do_set_font() [1], for
commit a5a923038d ("fbdev: fbcon: Properly revert changes when
vc_resize() failed") missed that the buffer might be newly allocated
by fbcon_set_font().

Link: https://syzkaller.appspot.com/bug?extid=25bdb7b1703639abd498 [1]
Reported-by: syzbot <syzbot+25bdb7b1703639abd498@syzkaller.appspotmail.com>
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Tested-by: syzbot <syzbot+25bdb7b1703639abd498@syzkaller.appspotmail.com>
Fixes: a5a923038d ("fbdev: fbcon: Properly revert changes when vc_resize() failed")
CC: stable@vger.kernel.org # 5.15+
Signed-off-by: Helge Deller <deller@gmx.de>
2022-12-14 20:01:51 +01:00
ye xingchen b20a558d37 fbdev: sh_mobile_lcdcfb: use sysfs_emit() to instead of scnprintf()
Follow the advice of the Documentation/filesystems/sysfs.rst and show()
should only use sysfs_emit() or sysfs_emit_at() when formatting the
value to be returned to user space.

Signed-off-by: ye xingchen <ye.xingchen@zte.com.cn>
Signed-off-by: Helge Deller <deller@gmx.de>
2022-12-14 20:01:51 +01:00
ye xingchen 522d5226ee fbdev: uvesafb: use sysfs_emit() to instead of scnprintf()
Follow the advice of the Documentation/filesystems/sysfs.rst and show()
should only use sysfs_emit() or sysfs_emit_at() when formatting the
value to be returned to user space.

Signed-off-by: ye xingchen <ye.xingchen@zte.com.cn>
Signed-off-by: Helge Deller <deller@gmx.de>
2022-12-14 20:01:51 +01:00
Christophe JAILLET f2ff0c430f fbdev: uvesafb: Simplify uvesafb_remove()
When the remove() function is called, we know that the probe() function has
successfully been executed. So 'info' is known to be not NULL.

Simplify the code accordingly.

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: Helge Deller <deller@gmx.de>
2022-12-14 20:01:51 +01:00
Christophe JAILLET a943710407 fbdev: uvesafb: Fixes an error handling path in uvesafb_probe()
If an error occurs after a successful uvesafb_init_mtrr() call, it must be
undone by a corresponding arch_phys_wc_del() call, as already done in the
remove function.

This has been added in the remove function in commit 63e28a7a5f
("uvesafb: Clean up MTRR code")

Fixes: 8bdb3a2d7d ("uvesafb: the driver core")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: Helge Deller <deller@gmx.de>
2022-12-14 20:01:51 +01:00
Randy Dunlap 35b4f4d4a7 fbdev: uvesafb: don't build on UML
The uvesafb fbdev driver uses memory management information that is not
available on ARCH=um, so don't allow this driver to be built on UML.

Prevents these build errors:

../drivers/video/fbdev/uvesafb.c: In function ‘uvesafb_vbe_init’:
../drivers/video/fbdev/uvesafb.c:807:21: error: ‘__supported_pte_mask’ undeclared (first use in this function)
  807 |                 if (__supported_pte_mask & _PAGE_NX) {
../drivers/video/fbdev/uvesafb.c:807:44: error: ‘_PAGE_NX’ undeclared (first use in this function)
  807 |                 if (__supported_pte_mask & _PAGE_NX) {

Fixes: 68f5d3f3b6 ("um: add PCI over virtio emulation driver")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Johannes Berg <johannes@sipsolutions.net>
Cc: Richard Weinberger <richard@nod.at>
Cc: linux-um@lists.infradead.org
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Helge Deller <deller@gmx.de>
Cc: linux-fbdev@vger.kernel.org
Cc: dri-devel@lists.freedesktop.org
Cc: Michal Januszewski <spock@gentoo.org>
Signed-off-by: Helge Deller <deller@gmx.de>
2022-12-14 20:01:51 +01:00
Randy Dunlap 71c53e1922 fbdev: geode: don't build on UML
The geode fbdev driver uses struct cpuinfo fields that are not present
on ARCH=um, so don't allow this driver to be built on UML.

Prevents these build errors:

In file included from ../arch/x86/include/asm/olpc.h:7:0,
                 from ../drivers/mfd/cs5535-mfd.c:17:
../arch/x86/include/asm/geode.h: In function ‘is_geode_gx’:
../arch/x86/include/asm/geode.h:16:24: error: ‘struct cpuinfo_um’ has no member named ‘x86_vendor’
  return ((boot_cpu_data.x86_vendor == X86_VENDOR_NSC) &&
../arch/x86/include/asm/geode.h:16:39: error: ‘X86_VENDOR_NSC’ undeclared (first use in this function); did you mean ‘X86_VENDOR_ANY’?
  return ((boot_cpu_data.x86_vendor == X86_VENDOR_NSC) &&
../arch/x86/include/asm/geode.h:17:17: error: ‘struct cpuinfo_um’ has no member named ‘x86’
   (boot_cpu_data.x86 == 5) &&
../arch/x86/include/asm/geode.h:18:17: error: ‘struct cpuinfo_um’ has no member named ‘x86_model’
   (boot_cpu_data.x86_model == 5));
../arch/x86/include/asm/geode.h: In function ‘is_geode_lx’:
../arch/x86/include/asm/geode.h:23:24: error: ‘struct cpuinfo_um’ has no member named ‘x86_vendor’
  return ((boot_cpu_data.x86_vendor == X86_VENDOR_AMD) &&
../arch/x86/include/asm/geode.h:23:39: error: ‘X86_VENDOR_AMD’ undeclared (first use in this function); did you mean ‘X86_VENDOR_ANY’?
  return ((boot_cpu_data.x86_vendor == X86_VENDOR_AMD) &&
../arch/x86/include/asm/geode.h:24:17: error: ‘struct cpuinfo_um’ has no member named ‘x86’
   (boot_cpu_data.x86 == 5) &&
../arch/x86/include/asm/geode.h:25:17: error: ‘struct cpuinfo_um’ has no member named ‘x86_model’
   (boot_cpu_data.x86_model == 10));

Fixes: 68f5d3f3b6 ("um: add PCI over virtio emulation driver")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Johannes Berg <johannes@sipsolutions.net>
Cc: Richard Weinberger <richard@nod.at>
Cc: linux-um@lists.infradead.org
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Helge Deller <deller@gmx.de>
Cc: linux-fbdev@vger.kernel.org
Cc: dri-devel@lists.freedesktop.org
Cc: Andres Salomon <dilinger@queued.net>
Cc: linux-geode@lists.infradead.org
Signed-off-by: Helge Deller <deller@gmx.de>
2022-12-14 20:01:51 +01:00