When checking for libc rseq support in the library constructor, don't
only depend on the symbols presence, check that the registration was
completed.
This targets a scenario where the libc has rseq support but it is not
wired for the current architecture in 'bits/rseq.h', we want to fallback
to our internal registration mechanism.
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Link: https://lore.kernel.org/r/20220614154830.1367382-4-mjeanson@efficios.com
This header is also used in librseq where it can be included in C++
code, add a space between literals and string macros.
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Link: https://lore.kernel.org/r/20220614154830.1367382-3-mjeanson@efficios.com
Make the RISC-V rseq selftests compatible with glibc-2.35 by using the
rseq_get_abi() helper.
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Link: https://lore.kernel.org/r/20220614154830.1367382-2-mjeanson@efficios.com
Wakelist can help avoid cache bouncing and offload the overhead of waker
cpu. So far, using wakelist within the same llc only happens on
WF_ON_CPU, and this limitation could be removed to further improve
wakeup performance.
The commit 518cd62341 ("sched: Only queue remote wakeups when
crossing cache boundaries") disabled queuing tasks on wakelist when
the cpus share llc. This is because, at that time, the scheduler must
send IPIs to do ttwu_queue_wakelist. Nowadays, ttwu_queue_wakelist also
supports TIF_POLLING, so this is not a problem now when the wakee cpu is
in idle polling.
Benefits:
Queuing the task on idle cpu can help improving performance on waker cpu
and utilization on wakee cpu, and further improve locality because
the wakee cpu can handle its own rq. This patch helps improving rt on
our real java workloads where wakeup happens frequently.
Consider the normal condition (CPU0 and CPU1 share same llc)
Before this patch:
CPU0 CPU1
select_task_rq() idle
rq_lock(CPU1->rq)
enqueue_task(CPU1->rq)
notify CPU1 (by sending IPI or CPU1 polling)
resched()
After this patch:
CPU0 CPU1
select_task_rq() idle
add to wakelist of CPU1
notify CPU1 (by sending IPI or CPU1 polling)
rq_lock(CPU1->rq)
enqueue_task(CPU1->rq)
resched()
We see CPU0 can finish its work earlier. It only needs to put task to
wakelist and return.
While CPU1 is idle, so let itself handle its own runqueue data.
This patch brings no difference about IPI.
This patch only takes effect when the wakee cpu is:
1) idle polling
2) idle not polling
For 1), there will be no IPI with or without this patch.
For 2), there will always be an IPI before or after this patch.
Before this patch: waker cpu will enqueue task and check preempt. Since
"idle" will be sure to be preempted, waker cpu must send a resched IPI.
After this patch: waker cpu will put the task to the wakelist of wakee
cpu, and send an IPI.
Benchmark:
We've tested schbench, unixbench, and hachbench on both x86 and arm64.
On x86 (Intel Xeon Platinum 8269CY):
schbench -m 2 -t 8
Latency percentiles (usec) before after
50.0000th: 8 6
75.0000th: 10 7
90.0000th: 11 8
95.0000th: 12 8
*99.0000th: 13 10
99.5000th: 15 11
99.9000th: 18 14
Unixbench with full threads (104)
before after
Dhrystone 2 using register variables 3011862938 3009935994 -0.06%
Double-Precision Whetstone 617119.3 617298.5 0.03%
Execl Throughput 27667.3 27627.3 -0.14%
File Copy 1024 bufsize 2000 maxblocks 785871.4 784906.2 -0.12%
File Copy 256 bufsize 500 maxblocks 210113.6 212635.4 1.20%
File Copy 4096 bufsize 8000 maxblocks 2328862.2 2320529.1 -0.36%
Pipe Throughput 145535622.8 145323033.2 -0.15%
Pipe-based Context Switching 3221686.4 3583975.4 11.25%
Process Creation 101347.1 103345.4 1.97%
Shell Scripts (1 concurrent) 120193.5 123977.8 3.15%
Shell Scripts (8 concurrent) 17233.4 17138.4 -0.55%
System Call Overhead 5300604.8 5312213.6 0.22%
hackbench -g 1 -l 100000
before after
Time 3.246 2.251
On arm64 (Ampere Altra):
schbench -m 2 -t 8
Latency percentiles (usec) before after
50.0000th: 14 10
75.0000th: 19 14
90.0000th: 22 16
95.0000th: 23 16
*99.0000th: 24 17
99.5000th: 24 17
99.9000th: 28 25
Unixbench with full threads (80)
before after
Dhrystone 2 using register variables 3536194249 3537019613 0.02%
Double-Precision Whetstone 629383.6 629431.6 0.01%
Execl Throughput 65920.5 65846.2 -0.11%
File Copy 1024 bufsize 2000 maxblocks 1063722.8 1064026.8 0.03%
File Copy 256 bufsize 500 maxblocks 322684.5 318724.5 -1.23%
File Copy 4096 bufsize 8000 maxblocks 2348285.3 2328804.8 -0.83%
Pipe Throughput 133542875.3 131619389.8 -1.44%
Pipe-based Context Switching 3215356.1 3576945.1 11.25%
Process Creation 108520.5 120184.6 10.75%
Shell Scripts (1 concurrent) 122636.3 121888 -0.61%
Shell Scripts (8 concurrent) 17462.1 17381.4 -0.46%
System Call Overhead 4429998.9 4435006.7 0.11%
hackbench -g 1 -l 100000
before after
Time 4.217 2.916
Our patch has improvement on schbench, hackbench
and Pipe-based Context Switching of unixbench
when there exists idle cpus,
and no obvious regression on other tests of unixbench.
This can help improve rt in scenes where wakeup happens frequently.
Signed-off-by: Tianchen Ding <dtcccc@linux.alibaba.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Valentin Schneider <vschneid@redhat.com>
Link: https://lore.kernel.org/r/20220608233412.327341-3-dtcccc@linux.alibaba.com
The commit 2ebb177175 ("sched/core: Offload wakee task activation if it
the wakee is descheduling") checked rq->nr_running <= 1 to avoid task
stacking when WF_ON_CPU.
Per the ordering of writes to p->on_rq and p->on_cpu, observing p->on_cpu
(WF_ON_CPU) in ttwu_queue_cond() implies !p->on_rq, IOW p has gone through
the deactivate_task() in __schedule(), thus p has been accounted out of
rq->nr_running. As such, the task being the only runnable task on the rq
implies reading rq->nr_running == 0 at that point.
The benchmark result is in [1].
[1] https://lore.kernel.org/all/e34de686-4e85-bde1-9f3c-9bbc86b38627@linux.alibaba.com/
Suggested-by: Valentin Schneider <vschneid@redhat.com>
Signed-off-by: Tianchen Ding <dtcccc@linux.alibaba.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Valentin Schneider <vschneid@redhat.com>
Link: https://lore.kernel.org/r/20220608233412.327341-2-dtcccc@linux.alibaba.com
While doing newidle load balancing, it is possible for new tasks to
arrive, such as with pending wakeups. newidle_balance() already accounts
for this by exiting the sched_domain load_balance() iteration if it
detects these cases. This is very important for minimizing wakeup
latency.
However, if we are already in load_balance(), we may stay there for a
while before returning back to newidle_balance(). This is most
exacerbated if we enter a 'goto redo' loop in the LBF_ALL_PINNED case. A
very straightforward workaround to this is to adjust should_we_balance()
to bail out if we're doing a CPU_NEWLY_IDLE balance and new tasks are
detected.
This was tested with the following reproduction:
- two threads that take turns sleeping and waking each other up are
affined to two cores
- a large number of threads with 100% utilization are pinned to all
other cores
Without this patch, wakeup latency was ~120us for the pair of threads,
almost entirely spent in load_balance(). With this patch, wakeup latency
is ~6us.
Signed-off-by: Josh Don <joshdon@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20220609025515.2086253-1-joshdon@google.com
sysctl_sched_dl_period_max and sysctl_sched_dl_period_min are unsigned
integer, but proc_dointvec() wouldn't return error even if we set a
negative number.
Use proc_douintvec_minmax() instead of proc_dointvec(). Add extra1 for
sysctl_sched_dl_period_max and extra2 for sysctl_sched_dl_period_min.
It's just an optimization for match data and proc_handler in struct
ctl_table. The 'if (period < min || period > max)' in __checkparam_dl()
will work fine even if there hasn't this patch.
Signed-off-by: Yajun Deng <yajun.deng@linux.dev>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Daniel Bristot de Oliveira <bristot@kernel.org>
Link: https://lore.kernel.org/r/20220607101807.249965-1-yajun.deng@linux.dev
We notice the rq leaf_cfs_rq_list has two problems when do bugfix
backports and some test profiling.
1. cfs_rqs under throttled subtree could be added to the list, and
make their fully decayed ancestors on the list, even though not needed.
2. #1 also make the leaf_cfs_rq_list management complex and error prone,
this is the list of related bugfix so far:
commit 31bc6aeaab ("sched/fair: Optimize update_blocked_averages()")
commit fe61468b2c ("sched/fair: Fix enqueue_task_fair warning")
commit b34cb07dde ("sched/fair: Fix enqueue_task_fair() warning some more")
commit 39f23ce07b ("sched/fair: Fix unthrottle_cfs_rq() for leaf_cfs_rq list")
commit 0258bdfaff ("sched/fair: Fix unfairness caused by missing load decay")
commit a7b359fc6a ("sched/fair: Correctly insert cfs_rq's to list on unthrottle")
commit fdaba61ef8 ("sched/fair: Ensure that the CFS parent is added after unthrottling")
commit 2630cde267 ("sched/fair: Add ancestors of unthrottled undecayed cfs_rq")
commit 31bc6aeaab ("sched/fair: Optimize update_blocked_averages()")
delete every cfs_rq under throttled subtree from rq->leaf_cfs_rq_list,
and delete the throttled_hierarchy() test in update_blocked_averages(),
which optimized update_blocked_averages().
But those later bugfix add cfs_rqs under throttled subtree back to
rq->leaf_cfs_rq_list again, with their fully decayed ancestors, for
the integrity of rq->leaf_cfs_rq_list.
This patch takes another method, skip all cfs_rqs under throttled
hierarchy when list_add_leaf_cfs_rq(), to completely make cfs_rqs
under throttled subtree off the leaf_cfs_rq_list.
So we don't need to consider throttled related things in
enqueue_entity(), unthrottle_cfs_rq() and enqueue_task_fair(),
which simplify the code a lot. Also optimize update_blocked_averages()
since cfs_rqs under throttled hierarchy and their ancestors
won't be on the leaf_cfs_rq_list.
Signed-off-by: Chengming Zhou <zhouchengming@bytedance.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org>
Link: https://lore.kernel.org/r/20220601021848.76943-1-zhouchengming@bytedance.com
In the case of systems containing multiple LLCs per socket, like
AMD Zen systems, users want to spread bandwidth hungry applications
across multiple LLCs. Stream is one such representative workload where
the best performance is obtained by limiting one stream thread per LLC.
To ensure this, users are known to pin the tasks to a specify a subset
of the CPUs consisting of one CPU per LLC while running such bandwidth
hungry tasks.
Suppose we kickstart a multi-threaded task like stream with 8 threads
using taskset or numactl to run on a subset of CPUs on a 2 socket Zen3
server where each socket contains 128 CPUs
(0-63,128-191 in one socket, 64-127,192-255 in another socket)
Eg: numactl -C 0,16,32,48,64,80,96,112 ./stream8
Here each CPU in the list is from a different LLC and 4 of those LLCs
are on one socket, while the other 4 are on another socket.
Ideally we would prefer that each stream thread runs on a different
CPU from the allowed list of CPUs. However, the current heuristics in
find_idlest_group() do not allow this during the initial placement.
Suppose the first socket (0-63,128-191) is our local group from which
we are kickstarting the stream tasks. The first four stream threads
will be placed in this socket. When it comes to placing the 5th
thread, all the allowed CPUs are from the local group (0,16,32,48)
would have been taken.
However, the current scheduler code simply checks if the number of
tasks in the local group is fewer than the allowed numa-imbalance
threshold. This threshold was previously 25% of the NUMA domain span
(in this case threshold = 32) but after the v6 of Mel's patchset
"Adjust NUMA imbalance for multiple LLCs", got merged in sched-tip,
Commit: e496132ebe ("sched/fair: Adjust the allowed NUMA imbalance
when SD_NUMA spans multiple LLCs") it is now equal to number of LLCs
in the NUMA domain, for processors with multiple LLCs.
(in this case threshold = 8).
For this example, the number of tasks will always be within threshold
and thus all the 8 stream threads will be woken up on the first socket
thereby resulting in sub-optimal performance.
The following sched_wakeup_new tracepoint output shows the initial
placement of tasks in the current tip/sched/core on the Zen3 machine:
stream-5313 [016] d..2. 627.005036: sched_wakeup_new: comm=stream pid=5315 prio=120 target_cpu=032
stream-5313 [016] d..2. 627.005086: sched_wakeup_new: comm=stream pid=5316 prio=120 target_cpu=048
stream-5313 [016] d..2. 627.005141: sched_wakeup_new: comm=stream pid=5317 prio=120 target_cpu=000
stream-5313 [016] d..2. 627.005183: sched_wakeup_new: comm=stream pid=5318 prio=120 target_cpu=016
stream-5313 [016] d..2. 627.005218: sched_wakeup_new: comm=stream pid=5319 prio=120 target_cpu=016
stream-5313 [016] d..2. 627.005256: sched_wakeup_new: comm=stream pid=5320 prio=120 target_cpu=016
stream-5313 [016] d..2. 627.005295: sched_wakeup_new: comm=stream pid=5321 prio=120 target_cpu=016
Once the first four threads are distributed among the allowed CPUs of
socket one, the rest of the treads start piling on these same CPUs
when clearly there are CPUs on the second socket that can be used.
Following the initial pile up on a small number of CPUs, though the
load-balancer eventually kicks in, it takes a while to get to {4}{4}
and even {4}{4} isn't stable as we observe a bunch of ping ponging
between {4}{4} to {5}{3} and back before a stable state is reached
much later (1 Stream thread per allowed CPU) and no more migration is
required.
We can detect this piling and avoid it by checking if the number of
allowed CPUs in the local group are fewer than the number of tasks
running in the local group and use this information to spread the
5th task out into the next socket (after all, the goal in this
slowpath is to find the idlest group and the idlest CPU during the
initial placement!).
The following sched_wakeup_new tracepoint output shows the initial
placement of tasks after adding this fix on the Zen3 machine:
stream-4485 [016] d..2. 230.784046: sched_wakeup_new: comm=stream pid=4487 prio=120 target_cpu=032
stream-4485 [016] d..2. 230.784123: sched_wakeup_new: comm=stream pid=4488 prio=120 target_cpu=048
stream-4485 [016] d..2. 230.784167: sched_wakeup_new: comm=stream pid=4489 prio=120 target_cpu=000
stream-4485 [016] d..2. 230.784222: sched_wakeup_new: comm=stream pid=4490 prio=120 target_cpu=112
stream-4485 [016] d..2. 230.784271: sched_wakeup_new: comm=stream pid=4491 prio=120 target_cpu=096
stream-4485 [016] d..2. 230.784322: sched_wakeup_new: comm=stream pid=4492 prio=120 target_cpu=080
stream-4485 [016] d..2. 230.784368: sched_wakeup_new: comm=stream pid=4493 prio=120 target_cpu=064
We see that threads are using all of the allowed CPUs and there is
no pileup.
No output is generated for tracepoint sched_migrate_task with this
patch due to a perfect initial placement which removes the need
for balancing later on - both across NUMA boundaries and within
NUMA boundaries for stream.
Following are the results from running 8 Stream threads with and
without pinning on a dual socket Zen3 Machine (2 x 64C/128T):
During the testing of this patch, the tip sched/core was at
commit: 089c02ae27 "ftrace: Use preemption model accessors for trace
header printout"
Pinning is done using: numactl -C 0,16,32,48,64,80,96,112 ./stream8
5.18.0-rc1 5.18.0-rc1 5.18.0-rc1
tip sched/core tip sched/core tip sched/core
(no pinning) + pinning + this-patch
+ pinning
Copy: 109364.74 (0.00 pct) 94220.50 (-13.84 pct) 158301.28 (44.74 pct)
Scale: 109670.26 (0.00 pct) 90210.59 (-17.74 pct) 149525.64 (36.34 pct)
Add: 129029.01 (0.00 pct) 101906.00 (-21.02 pct) 186658.17 (44.66 pct)
Triad: 127260.05 (0.00 pct) 106051.36 (-16.66 pct) 184327.30 (44.84 pct)
Pinning currently hurts the performance compared to unbound case on
tip/sched/core. With the addition of this patch, we are able to
outperform tip/sched/core by a good margin with pinning.
Following are the results from running 16 Stream threads with and
without pinning on a dual socket IceLake Machine (2 x 32C/64T):
NUMA Topology of Intel Skylake machine:
Node 1: 0,2,4,6 ... 126 (Even numbers)
Node 2: 1,3,5,7 ... 127 (Odd numbers)
Pinning is done using: numactl -C 0-15 ./stream16
5.18.0-rc1 5.18.0-rc1 5.18.0-rc1
tip sched/core tip sched/core tip sched/core
(no pinning) +pinning + this-patch
+ pinning
Copy: 85815.31 (0.00 pct) 149819.21 (74.58 pct) 156807.48 (82.72 pct)
Scale: 64795.60 (0.00 pct) 97595.07 (50.61 pct) 99871.96 (54.13 pct)
Add: 71340.68 (0.00 pct) 111549.10 (56.36 pct) 114598.33 (60.63 pct)
Triad: 68890.97 (0.00 pct) 111635.16 (62.04 pct) 114589.24 (66.33 pct)
In case of Icelake machine, with single LLC per socket, pinning across
the two sockets reduces cache contention, thus showing great
improvement in pinned case which is further benefited by this patch.
Signed-off-by: K Prateek Nayak <kprateek.nayak@amd.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org>
Reviewed-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Acked-by: Mel Gorman <mgorman@techsingularity.net>
Link: https://lkml.kernel.org/r/20220407111222.22649-1-kprateek.nayak@amd.com
For a single LLC per node, a NUMA imbalance is allowed up until 25%
of CPUs sharing a node could be active. One intent of the cut-off is
to avoid an imbalance of memory channels but there is no topological
information based on active memory channels. Furthermore, there can
be differences between nodes depending on the number of populated
DIMMs.
A cut-off of 25% was arbitrary but generally worked. It does have a severe
corner cases though when an parallel workload is using 25% of all available
CPUs over-saturates memory channels. This can happen due to the initial
forking of tasks that get pulled more to one node after early wakeups
(e.g. a barrier synchronisation) that is not quickly corrected by the
load balancer. The LB may fail to act quickly as the parallel tasks are
considered to be poor migrate candidates due to locality or cache hotness.
On a range of modern Intel CPUs, 12.5% appears to be a better cut-off
assuming all memory channels are populated and is used as the new cut-off
point. A minimum of 1 is specified to allow a communicating pair to
remain local even for CPUs with low numbers of cores. For modern AMDs,
there are multiple LLCs and are not affected.
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: K Prateek Nayak <kprateek.nayak@amd.com>
Link: https://lore.kernel.org/r/20220520103519.1863-5-mgorman@techsingularity.net
The imbalance limitations are applied inconsistently at fork time
and at runtime. At fork, a new task can remain local until there are
too many running tasks even if the degree of imbalance is larger than
NUMA_IMBALANCE_MIN which is different to runtime. Secondly, the imbalance
figure used during load balancing is different to the one used at NUMA
placement. Load balancing uses the number of tasks that must move to
restore imbalance where as NUMA balancing uses the total imbalance.
In combination, it is possible for a parallel workload that uses a small
number of CPUs without applying scheduler policies to have very variable
run-to-run performance.
[lkp@intel.com: Fix build breakage for arc-allyesconfig]
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: K Prateek Nayak <kprateek.nayak@amd.com>
Link: https://lore.kernel.org/r/20220520103519.1863-4-mgorman@techsingularity.net
If a destination node has spare capacity but there is an imbalance then
two tasks are selected for swapping. If the tasks have no numa group
or are within the same NUMA group, it's simply shuffling tasks around
without having any impact on the compute imbalance. Instead, it's just
punishing one task to help another.
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: K Prateek Nayak <kprateek.nayak@amd.com>
Link: https://lore.kernel.org/r/20220520103519.1863-3-mgorman@techsingularity.net
On clone, numa_migrate_retry is inherited from the parent which means
that the first NUMA placement of a task is non-deterministic. This
affects when load balancing recognises numa tasks and whether to
migrate "regular", "remote" or "all" tasks between NUMA scheduler
domains.
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: K Prateek Nayak <kprateek.nayak@amd.com>
Link: https://lore.kernel.org/r/20220520103519.1863-2-mgorman@techsingularity.net
Highlights:
- Fix hp-wmi regression on HP Omen laptops introduced in 5.18
- Several hardware-id additions
- A couple of other tiny fixes
The following is an automated git shortlog grouped by driver:
barco-p50-gpio:
- Add check for platform_driver_register
gigabyte-wmi:
- Add support for B450M DS3H-CF
- Add Z690M AORUS ELITE AX DDR4 support
hp-wmi:
- Use zero insize parameter only when supported
- Resolve WMI query failures on some devices
platform/mellanox:
- Add static in struct declaration.
- Spelling s/platfom/platform/
platform/x86/intel:
- hid: Add Surface Go to VGBS allow list
- pmc: Support Intel Raptorlake P
- Fix pmt_crashlog array reference
-----BEGIN PGP SIGNATURE-----
iQFIBAABCAAyFiEEuvA7XScYQRpenhd+kuxHeUQDJ9wFAmKl3+0UHGhkZWdvZWRl
QHJlZGhhdC5jb20ACgkQkuxHeUQDJ9zU3wf/YJUH3Obp0kAUssrOxT9QShHH6iFL
Gh6CN38DYLEbV7FNBH/iA+ZW7DSsymQm1M3jj1QYB0Sd2X/hSsUEh2q6F0XNjTKH
X8vVrSRAL+DStpgyKerBJHKpzO3ESUlipmIvR/JuGVHQAylR8ielHvAx+XhA1PKz
8/ezppIa9bxZ1+LwQNPE40jJv+wlNctGrx3TEwqPep3STWWFEjkeYpR+3XIJ4Ads
5P7FXxpM2cRFKBcI9jVIBDMIZulEfkXxBDcLCHOYYitxcpnrFFghn14ys5Exh9zL
pdaVHnlP6pEFpVhCWa/ICpSyrLhNYqTUJ92dyFeakbGBj9QlwnfeceMLAw==
=SaRl
-----END PGP SIGNATURE-----
Merge tag 'platform-drivers-x86-v5.19-2' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86
Pull x86 platform driver fixes from Hans de Goede:
"Highlights:
- Fix hp-wmi regression on HP Omen laptops introduced in 5.18
- Several hardware-id additions
- A couple of other tiny fixes"
* tag 'platform-drivers-x86-v5.19-2' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86:
platform/x86/intel: hid: Add Surface Go to VGBS allow list
platform/x86: hp-wmi: Use zero insize parameter only when supported
platform/x86: hp-wmi: Resolve WMI query failures on some devices
platform/x86: gigabyte-wmi: Add support for B450M DS3H-CF
platform/x86: gigabyte-wmi: Add Z690M AORUS ELITE AX DDR4 support
platform/x86: barco-p50-gpio: Add check for platform_driver_register
platform/x86/intel: pmc: Support Intel Raptorlake P
platform/x86/intel: Fix pmt_crashlog array reference
platform/mellanox: Add static in struct declaration.
platform/mellanox: Spelling s/platfom/platform/
Tetsuo's patch to trigger build warnings if system-wide wq's are flushed
along with a TP type update and trivial comment update.
-----BEGIN PGP SIGNATURE-----
iIQEABYIACwWIQTfIjM1kS57o3GsC/uxYfJx3gVYGQUCYqUyqQ4cdGpAa2VybmVs
Lm9yZwAKCRCxYfJx3gVYGQPtAQCQZuNFoWhCtdpjW/MWuGdY1pGGPMVl+60xwvew
Ad8gegD/eoAsXP1XAzJ9Z1BPqr/IxncfOgGGDGHbR1Ll39qLlwE=
=i3Sx
-----END PGP SIGNATURE-----
Merge tag 'wq-for-5.19-rc1-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq
Pull workqueue fixes from Tejun Heo:
"Tetsuo's patch to trigger build warnings if system-wide wq's are
flushed along with a TP type update and trivial comment update"
* tag 'wq-for-5.19-rc1-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
workqueue: Switch to new kerneldoc syntax for named variable macro argument
workqueue: Fix type of cpu in trace event
workqueue: Wrap flush_workqueue() using a macro
- Make the *.mod build rule portable for POSIX awk
- Fix regression of 'make nsdeps'
- Make scripts/check-local-export working for older bash versions
- Fix scripts/gdb to extract the .config data from vmlinux
-----BEGIN PGP SIGNATURE-----
iQJJBAABCgAzFiEEbmPs18K1szRHjPqEPYsBB53g2wYFAmKk7n0VHG1hc2FoaXJv
eUBrZXJuZWwub3JnAAoJED2LAQed4NsGFv0P/RNPUu+hhNkQB41nIyDkn977KtM4
QqEdXUaGVEXEMwcBYn7bo4LR/pxbRX73WJ2o3ONwSTnpmbwDkSwmJX0JADuUhFKh
je7E/+Sv9wJsyZ93Tow/nTt13xCcox9QSxx2wZrS06GQz6EmU34N+sXx82tffH8L
ONQq36p31JKCwTuaK9oiAz+FAH6ap9sStL6WBrS+HSpinO0s5P9xTtTrdZHTYmn2
KgFofMZvGAMEjGZsi2H3nobjH52/yZoUyfDanFS4DAS2ZMMQ5Vw8MpCxuKz8G2oN
/VCo3HgK9D+0uaAwPag3z+hugHoMBU5cjNhsvjiO6veB/VEHxm5KoEATo4xmU2OZ
W7GGp3FCsxrZkSkXtKO6FAJatN/DlgHYoJ647xo/yNOxULgzVCZgAs/cPUqUW9l2
bFNeDJ4YIiFDCwi65/KOAyLOpQT8j/wMEWmrrk57HBkrSaBrHhkNV31d3Y5HUnr2
XhjqGk4dX+YX8WAEN0LrjLzUQNntqxk2yvITi0m4OaIvblu0XU2B0F70WRqhXxaf
cmQFysAhmcaFSrlWxiNOnLPENNQjSy5jeKU/f6ZpGcYu7MU09ay3P+eoqFoPGq24
dLBqdLd52QDJQAo91y05cUJrj6IRYNsEK7HYlvokmvfmGHjfclxdM4+rT6dBwHpf
ih8aP2eTxRMIwqgY
=ka85
-----END PGP SIGNATURE-----
Merge tag 'kbuild-fixes-v5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild
Pull Kbuild fixes from Masahiro Yamada:
- Make the *.mod build rule portable for POSIX awk
- Fix regression of 'make nsdeps'
- Make scripts/check-local-export working for older bash versions
- Fix scripts/gdb to extract the .config data from vmlinux
* tag 'kbuild-fixes-v5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
scripts/gdb: change kernel config dumping method
scripts/check-local-export: avoid 'wait $!' for process substitution
scripts/nsdeps: adjust to the format change of *.mod files
kbuild: avoid regex RS for POSIX awk
-----BEGIN PGP SIGNATURE-----
iQGzBAABCgAdFiEE6fsu8pdIjtWE/DpLiiy9cAdyT1EFAmKkwyEACgkQiiy9cAdy
T1EqyAv/d8aDey0rQzGy918wzJd91gZrNFOJUpzVhIs3O5MakBgeoYn+S6rySl1+
xs6lXTQdSEyiL0edqTIq8iqA+iuhLPCBW2BWa/Zw089yHM/Ho3tjc5gBl5w38OcF
7NpFUInkg+yoBYWY9cCwjL83YaPxhcLKGY7S6WWptUxzf5Sg6eUqXCkMS7eUV6hb
YniMa5uWZSJtqY4F6qw/NOw90QekodEmfL4lLU/GXOnDxlJ8v5Ztf3aGHITWNwsd
ovhutUSai/tZz9fYHp6yOZYDcl4i0brOa3dIyU2tr52TdtzS73he8rE+Th4bu+uM
XTXvrDCTwsnOTiRFyyBJcaVDF+6LpqPEcURqLEbVOf0xXHyoEQ4zEwVFQJIBYOP4
Oy8XeXQePRxCnToI2cFZaw85IkLikoZ+4PggFkbsaFdJfkboR7b+XVhkfGVr5jnn
A6Unwrn3f6LS6MLbsDAjUpfzftwRyhbcvYCukeYKWz836xAx5tyOBIZA3m6keXah
LyhX4qSb
=mHWJ
-----END PGP SIGNATURE-----
Merge tag '5.19-rc1-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6
Pull cifs client fixes from Steve French:
"Three reconnect fixes, all for stable as well.
One of these three reconnect fixes does address a problem with
multichannel reconnect, but this does not include the additional
fix (still being tested) for dynamically detecting multichannel
adapter changes which will improve those reconnect scenarios even
more"
* tag '5.19-rc1-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6:
cifs: populate empty hostnames for extra channels
cifs: return errors during session setup during reconnects
cifs: fix reconnect on smb3 mount types
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEEq5lC5tSkz8NBJiCnSfxwEqXeA64FAmKktaEACgkQSfxwEqXe
A65Q+RAAkyLfX6kxQlTZbglTTiapbGwpk7+VhM0pJfc3guXhIe8kNZ5cX/mfkGeZ
f/ebUmWSJhqaNz1OIZeBZQ98ESwh8imfWaxWDkqsFh4c+hGsSp2xwIszMn3Hg+7L
Sm/0Q71eZaSnRBGxWRVBbz3tTppUBS4nJxvFj8iM3jWWUffZa0m/w1lMvqc8kNJu
kM1ONqb+CEuHOJyltUaH2qEXD6fQE3IOpPdC6PdsFqFX8qLN/pwZO6tpY8tYPt3j
AUubp8F3eR4Y7WcmMi1b7BmiRgg47jdsS18aqRSH8CuYvIBbXHNM+tuK54zh/888
d+s9Bj6ALR4z1/a8HXqtudCazYU+1VozWxVIELcpDWTX4wKgqVZ0HLEz7sAEfCgV
wSIcIY8TRJlOTL43KenbJqbyIOsvAidqWEYz5ogF9WYUlaD82s2j8pUMj4wQoD5w
VJqF2CaoewU0BOGK7rZFnElN5rPlfEJtNBZQOEo16BzA2tXilPgRFoOmKs9TMRQo
dGotfp62rUuS14b9x7zc9be0QmtnmwxQKO9U6SRVd5X2HXU/E5PsqTsbt0PKlSbA
qemw2CehPB+uFs0cqbbeI5VBpVnPQJkaflTfOVr04h7623KJ5Pnblv7/8iSedOOh
L4ACxPO5I5MrM8rJAR4g85SYVg3A/HaTufA5XtR5QB/qmEaErXg=
=1Lf0
-----END PGP SIGNATURE-----
Merge tag 'random-5.19-rc2-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random
Pull random number generator fixes from Jason Donenfeld:
- A fix for a 5.19 regression for a case in which early device tree
initializes the RNG, which flips a static branch.
On most plaforms, jump labels aren't initialized until much later, so
this caused splats. On a few mailing list threads, we cooked up easy
fixes for arm64, arm32, and risc-v. But then things looked slightly
more involved for xtensa, powerpc, arc, and mips. And at that point,
when we're patching 7 architectures in a place before the console is
even available, it seems like the cost/risk just wasn't worth it.
So random.c works around it now by checking the already exported
`static_key_initialized` boolean, as though somebody already ran into
this issue in the past. I'm not super jazzed about that; it'd be
prettier to not have to complicate downstream code. But I suppose
it's practical.
- A few small code nits and adding a missing __init annotation.
- A change to the default config values to use the cpu and bootloader's
seeds for initializing the RNG earlier.
This brings them into line with what all the distros do (Fedora/RHEL,
Debian, Ubuntu, Gentoo, Arch, NixOS, Alpine, SUSE, and Void... at
least), and moreover will now give us test coverage in various test
beds that might have caught the above device tree bug earlier.
- A change to WireGuard CI's configuration to increase test coverage
around the RNG.
- A documentation comment fix to unrelated maintainerless CRC code that
I was asked to take, I guess because it has to do with polynomials
(which the RNG thankfully no longer uses).
* tag 'random-5.19-rc2-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random:
wireguard: selftests: use maximum cpu features and allow rng seeding
random: remove rng_has_arch_random()
random: credit cpu and bootloader seeds by default
random: do not use jump labels before they are initialized
random: account for arch randomness in bits
random: mark bootloader randomness code as __init
random: avoid checking crng_ready() twice in random_init()
crc-itu-t: fix typo in CRC ITU-T polynomial comment
The Surface Go reports Chassis Type 9 (Laptop,) so the device needs to be
added to dmi_vgbs_allow_list to enable tablet mode when an attached Type
Cover is folded back.
BugLink: https://github.com/linux-surface/linux-surface/issues/837
Signed-off-by: Duke Lee <krnhotwings@gmail.com>
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Link: https://lore.kernel.org/r/20220607213654.5567-1-krnhotwings@gmail.com
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
commit be9d73e649 ("platform/x86: hp-wmi: Fix 0x05 error code reported by
several WMI calls") and commit 12b19f14a2 ("platform/x86: hp-wmi: Fix
hp_wmi_read_int() reporting error (0x05)") cause ACPI BIOS Error (bug):
Attempt to CreateField of length zero (20211217/dsopcode-133) because of
the ACPI method HWMC, which unconditionally creates a Field of
size (insize*8) bits:
CreateField (Arg1, 0x80, (Local5 * 0x08), DAIN)
In cases where args->insize = 0, the Field size is 0, resulting in
an error.
Fix this by using zero insize only if 0x5 error code is returned
Tested on Omen 15 AMD (2020) board ID: 8786.
Fixes: be9d73e649 ("platform/x86: hp-wmi: Fix 0x05 error code reported by several WMI calls")
Signed-off-by: Bedant Patnaik <bedant.patnaik@gmail.com>
Tested-by: Jorge Lopez <jorge.lopez2@hp.com>
Link: https://lore.kernel.org/r/41be46743d21c78741232a47bbb5f1cdbcc3d21e.camel@gmail.com
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
WMI queries fail on some devices where the ACPI method HWMC
unconditionally attempts to create Fields beyond the buffer
if the buffer is too small, this breaks essential features
such as power profiles:
CreateByteField (Arg1, 0x10, D008)
CreateByteField (Arg1, 0x11, D009)
CreateByteField (Arg1, 0x12, D010)
CreateDWordField (Arg1, 0x10, D032)
CreateField (Arg1, 0x80, 0x0400, D128)
In cases where args->data had zero length, ACPI BIOS Error
(bug): AE_AML_BUFFER_LIMIT, Field [D008] at bit
offset/length 128/8 exceeds size of target Buffer (128 bits)
(20211217/dsopcode-198) was obtained.
ACPI BIOS Error (bug): AE_AML_BUFFER_LIMIT, Field [D009] at bit
offset/length 136/8 exceeds size of target Buffer (136bits)
(20211217/dsopcode-198)
The original code created a buffer size of 128 bytes regardless if
the WMI call required a smaller buffer or not. This particular
behavior occurs in older BIOS and reproduced in OMEN laptops. Newer
BIOS handles buffer sizes properly and meets the latest specification
requirements. This is the reason why testing with a dynamically
allocated buffer did not uncover any failures with the test systems at
hand.
This patch was tested on several OMEN, Elite, and Zbooks. It was
confirmed the patch resolves HPWMI_FAN GET/SET calls in an OMEN
Laptop 15-ek0xxx. No problems were reported when testing on several Elite
and Zbooks notebooks.
Fixes: 4b4967cbd2 ("platform/x86: hp-wmi: Changing bios_args.data to be dynamically allocated")
Signed-off-by: Jorge Lopez <jorge.lopez2@hp.com>
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Link: https://lore.kernel.org/r/20220608212923.8585-2-jorge.lopez2@hp.com
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
The syntax without dots is available since commit 43756e347f
("scripts/kernel-doc: Add support for named variable macro arguments").
The same HTML output is produced with and without this patch.
Signed-off-by: Jonathan Neuschäfer <j.neuschaefer@gmx.net>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Tejun Heo <tj@kernel.org>
- make irq_chip structs immutable in several Diolan and intel drivers to
get rid of the new warning we emit when fiddling with irq chips
- don't print error messages on probe deferral in gpio-dwapb
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEFp3rbAvDxGAT0sefEacuoBRx13IFAmKk+XAACgkQEacuoBRx
13Krng//XvlVTqp0nRr7zl/9NaA9iwjTENEkIp0eL9PNA6LjEDz8v1WCW35wIrkW
h/Ve+udXHTbtNEU5dSSW7zrFPU3Ma7oPBHlddE4AzwMgzj+NYXjULhx6Fdvg2aRI
B/pDyArqW3ndJ+PLI01bFS92Sb62ZGBgMngfBV6z9Unc+L/JvK6Hlm/Vn1EQ+oOF
pzzGKgtxW1jE895SyB8TlVoCxtmxGohmOD8lpDlZetq/59q6epEgSKj6nR6k4fTB
E6dk/K0qn4/8N1AU/yZkitxea6Sk59AuS5l5bs+th2ZNaesdyfqMbJSvqqNMNbr1
NAEZ37MDhCdQPS0W0/77SX00E4lblObXnlv4JLM7klpGX78AiIOEY11ILfp5y4wp
n9m+pBiW/6DQNggn+uF/bly4I3wQPqEnL43NOpSmiTYtJVY3KBjiLS08sF48//f1
fJuLjJ4dRJJGWBCeR6xesfhXwkspYAkPEsP/xhqMNdehCMBlK7W6Vw8PH7AAzKGK
l3qubKVlS2U0wcu8ISAeHyt2WDhnY0Wo0G/27+jt6Nx+h87ChGNVhN2E3PFGM05i
W4iYI6tADjT8TGqpslyjEwAREVT3J90oEXx/t3hDsgkt7h5DPPgijMKxM7NXhM5Z
bBDjYgij3Q586L65O4Nwstgv+tolsM+i4nYuqwfGeRr9r0XeiQ8=
=oL8o
-----END PGP SIGNATURE-----
Merge tag 'gpio-fixes-for-v5.19-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux
Pull gpio fixes from Bartosz Golaszewski:
"A set of fixes. Most address the new warning we emit at build time
when irq chips are not immutable with some additional tweaks to
gpio-crystalcove from Andy and a small tweak to gpio-dwapd.
- make irq_chip structs immutable in several Diolan and intel drivers
to get rid of the new warning we emit when fiddling with irq chips
- don't print error messages on probe deferral in gpio-dwapb"
* tag 'gpio-fixes-for-v5.19-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux:
gpio: dwapb: Don't print error on -EPROBE_DEFER
gpio: dln2: make irq_chip immutable
gpio: sch: make irq_chip immutable
gpio: merrifield: make irq_chip immutable
gpio: wcove: make irq_chip immutable
gpio: crystalcove: Join function declarations and long lines
gpio: crystalcove: Use specific type and API for IRQ number
gpio: crystalcove: make irq_chip immutable
13 driver and 1 core patch. Nine of the driver patches are minor
fixes and reworks to lpfc and the rest are trivial and minor fixes.
Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com>
-----BEGIN PGP SIGNATURE-----
iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCYqSIMyYcamFtZXMuYm90
dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishf5hAP4rj6mu
qqCQmh7V+zMst2ijZ+BBJBHX5eURcOwMWxd48QD/dLp/hX7djKS8i8hju2AznOyx
F8g4PqPvnoJJ0adlpb0=
=GTnk
-----END PGP SIGNATURE-----
Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
"Driver fixes and and one core patch.
Nine of the driver patches are minor fixes and reworks to lpfc and the
rest are trivial and minor fixes elsewhere"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: pmcraid: Fix missing resource cleanup in error case
scsi: ipr: Fix missing/incorrect resource cleanup in error case
scsi: mpt3sas: Fix out-of-bounds compiler warning
scsi: lpfc: Update lpfc version to 14.2.0.4
scsi: lpfc: Allow reduced polling rate for nvme_admin_async_event cmd completion
scsi: lpfc: Add more logging of cmd and cqe information for aborted NVMe cmds
scsi: lpfc: Fix port stuck in bypassed state after LIP in PT2PT topology
scsi: lpfc: Resolve NULL ptr dereference after an ELS LOGO is aborted
scsi: lpfc: Address NULL pointer dereference after starget_to_rport()
scsi: lpfc: Resolve some cleanup issues following SLI path refactoring
scsi: lpfc: Resolve some cleanup issues following abort path refactoring
scsi: lpfc: Correct BDE type for XMIT_SEQ64_WQE in lpfc_ct_reject_event()
scsi: vmw_pvscsi: Expand vcpuHint to 16 bits
scsi: sd: Fix interpretation of VPD B9h length
Fixes all over the place, most notably fixes for latent
bugs in drivers that got exposed by suppressing
interrupts before DRIVER_OK, which in turn has been
done by 8b4ec69d7e ("virtio: harden vring IRQ").
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
-----BEGIN PGP SIGNATURE-----
iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmKkIwUPHG1zdEByZWRo
YXQuY29tAAoJECgfDbjSjVRpo/8H/3JwrGv0k5vj1wbHXymmBCRJrQ90Wg5/lwGj
8419DWHidPqf8X/ZAEHoWMr3FM7Jsj7KuTl3sQZODxQbQ4wERvEljTRosxImJ3iI
k2+T7HGf4pAIp8zzSp+1xL1feyctHT3atWKlpjXGqsDNCONexBIfG7/M63MkArOg
SzrbcTsZzabdI2eKDSipiywnrGlQH6mq7YXynaKWBdL7RtJC+PHAYLksJufPz4Lw
vsHpNEfWkoss+hsYKhH73lU4B++eQnHKbjiMktPkNLusxWW035unP97kqJKvxFgu
NDF9PDa/4e47nJqqcibvo6VEZ5q6hLjaDCqRPjzEdKl44V8VeNE=
=Q+zn
-----END PGP SIGNATURE-----
Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost
Pull virtio fixes from Michael Tsirkin:
"Fixes all over the place, most notably fixes for latent bugs in
drivers that got exposed by suppressing interrupts before DRIVER_OK,
which in turn has been done by 8b4ec69d7e ("virtio: harden vring
IRQ")"
* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
um: virt-pci: set device ready in probe()
vdpa: make get_vq_group and set_group_asid optional
virtio: Fix all occurences of the "the the" typo
vduse: Fix NULL pointer dereference on sysfs access
vringh: Fix loop descriptors check in the indirect cases
vdpa/mlx5: clean up indenting in handle_ctrl_vlan()
vdpa/mlx5: fix error code for deleting vlan
virtio-mmio: fix missing put_device() when vm_cmdline_parent registration failed
vdpa/mlx5: Fix syntax errors in comments
virtio-rng: make device ready before making request
Commit 6c77676645 ("iov_iter: Fix iter_xarray_get_pages{,_alloc}()")
introduced a problem on some 32-bit architectures (at least arm, xtensa,
csky,sparc and mips), that have a 'size_t' that is 'unsigned int'.
The reason is that we now do
min(nr * PAGE_SIZE - offset, maxsize);
where 'nr' and 'offset' and both 'unsigned int', and PAGE_SIZE is
'unsigned long'. As a result, the normal C type rules means that the
first argument to 'min()' ends up being 'unsigned long'.
In contrast, 'maxsize' is of type 'size_t'.
Now, 'size_t' and 'unsigned long' are always the same physical type in
the kernel, so you'd think this doesn't matter, and from an actual
arithmetic standpoint it doesn't.
But on 32-bit architectures 'size_t' is commonly 'unsigned int', even if
it could also be 'unsigned long'. In that situation, both are unsigned
32-bit types, but they are not the *same* type.
And as a result 'min()' will complain about the distinct types (ignore
the "pointer types" part of the error message: that's an artifact of the
way we have made 'min()' check types for being the same):
lib/iov_iter.c: In function 'iter_xarray_get_pages':
include/linux/minmax.h:20:35: error: comparison of distinct pointer types lacks a cast [-Werror]
20 | (!!(sizeof((typeof(x) *)1 == (typeof(y) *)1)))
| ^~
lib/iov_iter.c:1464:16: note: in expansion of macro 'min'
1464 | return min(nr * PAGE_SIZE - offset, maxsize);
| ^~~
This was not visible on 64-bit architectures (where we always define
'size_t' to be 'unsigned long').
Force these cases to use 'min_t(size_t, x, y)' to make the type explicit
and avoid the issue.
[ Nit-picky note: technically 'size_t' doesn't have to match 'unsigned
long' arithmetically. We've certainly historically seen environments
with 16-bit address spaces and 32-bit 'unsigned long'.
Similarly, even in 64-bit modern environments, 'size_t' could be its
own type distinct from 'unsigned long', even if it were arithmetically
identical.
So the above type commentary is only really descriptive of the kernel
environment, not some kind of universal truth for the kinds of wild
and crazy situations that are allowed by the C standard ]
Reported-by: Sudip Mukherjee <sudipm.mukherjee@gmail.com>
Link: https://lore.kernel.org/all/YqRyL2sIqQNDfky2@debian/
Cc: Jeff Layton <jlayton@kernel.org>
Cc: David Howells <dhowells@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
By forcing the maximum CPU that QEMU has available, we expose additional
capabilities, such as the RNDR instruction, which increases test
coverage. This then allows the CI to skip the fake seeding step in some
cases. Also enable STRICT_KERNEL_RWX to catch issues related to early
jump labels when the RNG is initialized at boot.
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
MAGIC_START("IKCFG_ST") and MAGIC_END("IKCFG_ED") are moved out
from the kernel_config_data variable.
Thus, we parse kernel_config_data directly instead of considering
offset of MAGIC_START and MAGIC_END.
Fixes: 13610aa908 ("kernel/configs: use .incbin directive to embed config_data.gz")
Signed-off-by: Kuan-Ying Lee <Kuan-Ying.Lee@mediatek.com>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Call virtio_device_ready() to make this driver work after commit
b4ec69d7e09 ("virtio: harden vring IRQ"), since the driver uses the
virtqueues in the probe function. (The virtio core sets the device
ready when probe returns.)
Fixes: 8b4ec69d7e ("virtio: harden vring IRQ")
Fixes: 68f5d3f3b6 ("um: add PCI over virtio emulation driver")
Signed-off-by: Vincent Whitchurch <vincent.whitchurch@axis.com>
Message-Id: <20220610151203.3492541-1-vincent.whitchurch@axis.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Tested-by: Johannes Berg <johannes@sipsolutions.net>
- There is now a backup maintainer for NFSD
Notable fixes:
- Prevent array overruns in svc_rdma_build_writes()
- Prevent buffer overruns when encoding NFSv3 READDIR results
- Fix a potential UAF in nfsd_file_put()
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEEKLLlsBKG3yQ88j7+M2qzM29mf5cFAmKjhbsACgkQM2qzM29m
f5cgig/6A9gC2c9v4lR2fH6ufiCWvJBfuVaBbToubwktJHaDLvqH56JcvS3s/gKL
PKGmbQTI/6lgmVgJqQSxJUnfe6wzHx8G1MdjlZEIwi3pUeiV+LpiprZz9TOhgYYV
YDnXRGhb4wKOm75w8rb6X8k106XdKwdBaRQwb88FDawffWoEY0XNYrlmNmmWi8To
ELlOlIwRCBbKJoJ6yEEWQRrBuVBXapbsn29tipZXbdo58g+vL0yDQq9s97b0mHhi
C2apAN2+k18FiBJsA7b7pW1l/P6k9FNEeetvgWyN8OSMpPNmt0vz1HvKaIstPgg1
BX6rgWe5eQBFEk2KNvSGHrV3R+wAp7jeuVpHUMjxXvzmfj4exJV/H8lu+qZJNDGN
ybCJatomR4APFxk+s1kptlzNo7zfyPz15L80HmWIngYJ/lrBOoKPIIi3bwPQcBwW
q2Rc+SlvpqbJvEcomgF/lqQN6inmx44J+KpOSA/S8qSIdSkz0iaZsDahFxgZNe82
h+X/i1maRtnSIvWdGMR7O6kEFT5jky35WlTv/VutTOsUwA4mUU9vZUnufBBHJH07
nOdLMi/QS/O5GOnlyegrODtN75wi+IeKt+WMNmnN+JB8Tsg0kZwjOsc/dbQfyJqP
PrQJ5AUP0TMm90B1873z8yKmhWeXtB71vgAI/d53aadBG4ZEstU=
=6ugk
-----END PGP SIGNATURE-----
Merge tag 'nfsd-5.19-1' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux
Pull nfsd fixes from Chuck Lever:
"Notable changes:
- There is now a backup maintainer for NFSD
Notable fixes:
- Prevent array overruns in svc_rdma_build_writes()
- Prevent buffer overruns when encoding NFSv3 READDIR results
- Fix a potential UAF in nfsd_file_put()"
* tag 'nfsd-5.19-1' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux:
SUNRPC: Remove pointer type casts from xdr_get_next_encode_buffer()
SUNRPC: Clean up xdr_get_next_encode_buffer()
SUNRPC: Clean up xdr_commit_encode()
SUNRPC: Optimize xdr_reserve_space()
SUNRPC: Fix the calculation of xdr->end in xdr_get_next_encode_buffer()
SUNRPC: Trap RDMA segment overflows
NFSD: Fix potential use-after-free in nfsd_file_put()
MAINTAINERS: reciprocal co-maintainership for file locking and nfsd
Currently, the secondary channels of a multichannel session
also get hostname populated based on the info in primary channel.
However, this will end up with a wrong resolution of hostname to
IP address during reconnect.
This change fixes this by not populating hostname info for all
secondary channels.
Fixes: 5112d80c16 ("cifs: populate server_hostname for extra channels")
Cc: stable@vger.kernel.org
Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
properly setup. Remove now unused bioset_init_from_src.
- Fix DM zoned hang from locking imbalance due to needless check in
clone_endio().
-----BEGIN PGP SIGNATURE-----
iQEzBAABCAAdFiEEJfWUX4UqZ4x1O2wixSPxCi2dA1oFAmKjmloACgkQxSPxCi2d
A1r8VggAyz5qvVintXmJ5SEnzz/nPfeXcu7UpyDWbabYaRQ+RSdD541TYayhlp0f
Pnk2R+aOYG2JvKviIg2Tbf+WqCkhAgStG26QyTWaakwiLva+xE2+lbxJus/8ZcWT
pZ8UvwpCFbg7nTpGNHYT2/fs2++6e2Y+bPE87rG534agHf6c48UHdoLiHm3bgNuh
iY1ro4UsseVDum48BCPj6mwOaAYfjpp6BlJgzlRol0PRkqKCgRthN0pksSn3fW3y
H9LS4gUvJIBVq5oA0T9zRWDeEdKdqXUGnATjWaet5kSjZCCuJmu23vyC707Nnx76
KN7jEmkR1eGuXQlw4URiQ2HuG941dw==
=O04+
-----END PGP SIGNATURE-----
Merge tag 'for-5.19/dm-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm
Pull device mapper fixes from Mike Snitzer:
- Fix DM core's bioset initialization so that blk integrity pool is
properly setup. Remove now unused bioset_init_from_src.
- Fix DM zoned hang from locking imbalance due to needless check in
clone_endio().
* tag 'for-5.19/dm-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
dm: fix zoned locking imbalance due to needless check in clone_endio
block: remove bioset_init_from_src
dm: fix bio_set allocation
Pull fscache cleanups from David Howells:
- fix checker complaint in afs
- two netfs cleanups:
- netfs_inode calling convention cleanup plus the requisite
documentation changes
- replace the ->cleanup op with a ->free_request op.
This is possible as the I/O request is now always available at
the cleanup point as the stuff to be cleaned up is no longer
passed into the API functions, but rather obtained by ->init_request.
* 'fscache-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs:
netfs: Rename the netfs_io_request cleanup op and give it an op pointer
netfs: Further cleanups after struct netfs_inode wrapper introduced
afs: Fix some checker issues
(and more similar to logics for other flavours)
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQQqUNBr3gm4hGXdBJlZ7Krx/gZQ6wUCYqOi+gAKCRBZ7Krx/gZQ
66sQAQC0axU6eieSzk0WkfqfuwNs4AmSGdUN2BQS5oEUdKIyAgD/US9asCVX1joL
HU+nRLNLGfA2f5IeVxjyByDmmYDNCQE=
=w2Z9
-----END PGP SIGNATURE-----
Merge tag 'pull-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull iov_iter fix from Al Viro:
"ITER_XARRAY get_pages fix; now the return value is a lot saner (and
more similar to logics for other flavours)"
* tag 'pull-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
iov_iter: Fix iter_xarray_get_pages{,_alloc}()
As platform_driver_register() could fail, it should be better
to deal with the return value in order to maintain the code
consisitency.
Fixes: 86af1d02d4 ("platform/x86: Support for EC-connected GPIOs for identify LED/button on Barco P50 board")
Signed-off-by: Jiasheng Jiang <jiasheng@iscas.ac.cn>
Acked-by: Peter Korsgaard <peter.korsgaard@barco.com>
Link: https://lore.kernel.org/r/20220526090345.1444172-1-jiasheng@iscas.ac.cn
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Add Raptorlake P to the list of the platforms that intel_pmc_core driver
supports for pmc_core device. Raptorlake P PCH is based on Alderlake P
PCH.
Signed-off-by: George D Sworo <george.d.sworo@intel.com>
Reviewed-by: David E. Box <david.e.box@linux.intel.com>
Link: https://lore.kernel.org/r/20220602012617.20100-1-george.d.sworo@intel.com
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
The probe function pmt_crashlog_probe() may incorrectly reference
the 'priv->entry array' as it uses 'i' to reference the array instead
of 'priv->num_entries' as it should. This is similar to the problem
that was addressed in pmt_telemetry_probe via commit 2cdfa0c20d
("platform/x86/intel: Fix 'rmmod pmt_telemetry' panic").
Cc: "David E. Box" <david.e.box@linux.intel.com>
Cc: Hans de Goede <hdegoede@redhat.com>
Cc: Mark Gross <markgross@kernel.org>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: David Arcari <darcari@redhat.com>
Reviewed-by: David E. Box <david.e.box@linux.intel.com>
Link: https://lore.kernel.org/r/20220526203140.339120-1-darcari@redhat.com
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Fix problem of missing static in struct declaration.
Fixes: 662f24826f ("platform/mellanox: Add support for new SN2201 system")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Michael Shych <michaelsh@nvidia.com>
Link: https://lore.kernel.org/r/20220602145103.11859-1-michaelsh@nvidia.com
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
The maths at the end of iter_xarray_get_pages() to calculate the actual
size doesn't work under some circumstances, such as when it's been asked to
extract a partial single page. Various terms of the equation cancel out
and you end up with actual == offset. The same issue exists in
iter_xarray_get_pages_alloc().
Fix these to just use min() to select the lesser amount from between the
amount of page content transcribed into the buffer, minus the offset, and
the size limit specified.
This doesn't appear to have caused a problem yet upstream because network
filesystems aren't getting the pages from an xarray iterator, but rather
passing it directly to the socket, which just iterates over it. Cachefiles
*does* do DIO from one to/from ext4/xfs/btrfs/etc. but it always asks for
whole pages to be written or read.
Fixes: 7ff5062079 ("iov_iter: Add ITER_XARRAY")
Reported-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Alexander Viro <viro@zeniv.linux.org.uk>
cc: Dominique Martinet <asmadeus@codewreck.org>
cc: Mike Marshall <hubcap@omnibond.com>
cc: Gao Xiang <xiang@kernel.org>
cc: linux-afs@lists.infradead.org
cc: v9fs-developer@lists.sourceforge.net
cc: devel@lists.orangefs.org
cc: linux-erofs@lists.ozlabs.org
cc: linux-cachefs@redhat.com
cc: linux-fsdevel@vger.kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
The netfs_io_request cleanup op is now always in a position to be given a
pointer to a netfs_io_request struct, so this can be passed in instead of
the mapping and private data arguments (both of which are included in the
struct).
So rename the ->cleanup op to ->free_request (to match ->init_request) and
pass in the I/O pointer.
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
cc: linux-cachefs@redhat.com
Change the signature of netfs helper functions to take a struct netfs_inode
pointer rather than a struct inode pointer where appropriate, thereby
relieving the need for the network filesystem to convert its internal inode
format down to the VFS inode only for netfslib to bounce it back up. For
type safety, it's better not to do that (and it's less typing too).
Give netfs_write_begin() an extra argument to pass in a pointer to the
netfs_inode struct rather than deriving it internally from the file
pointer. Note that the ->write_begin() and ->write_end() ops are intended
to be replaced in the future by netfslib code that manages this without the
need to call in twice for each page.
netfs_readpage() and similar are intended to be pointed at directly by the
address_space_operations table, so must stick to the signature dictated by
the function pointers there.
Changes
=======
- Updated the kerneldoc comments and documentation [DH].
Signed-off-by: David Howells <dhowells@redhat.com>
cc: linux-cachefs@redhat.com
Link: https://lore.kernel.org/r/CAHk-=wgkwKyNmNdKpQkqZ6DnmUL-x9hp0YBnUGjaPFEAdxDTbw@mail.gmail.com/
Remove an unused global variable and make another static as reported by
make C=1.
Signed-off-by: David Howells <dhowells@redhat.com>
cc: linux-afs@lists.infradead.org
- Don't release a folio while it's still locked
- Fix a use-after-free after dropping the mmap_lock
- Fix a memory leak when splitting a page
- Fix a kernel-doc warning for struct folio
-----BEGIN PGP SIGNATURE-----
iQEzBAABCgAdFiEEejHryeLBw/spnjHrDpNsjXcpgj4FAmKjmV8ACgkQDpNsjXcp
gj7f0Af+OeYLW8nMkqSe92OETzOVPlYCFBPlgE98kmwaD9nFOZlG65w7KggYyUbu
hCU5xgfyxo2rQWalO8CLf/a9w+v02UO9IbjtV3kdePpVxRPx+euYsScyoVOn9O6p
FI6BwEOONUc45rcOGMqDG8BCh75vdeeemu1Z8AYEGs1sIyl2AQYQvpZyZRu2JnBy
AFkjKpwNLHjrC3T1AjOHaJ6CmA2eDJX9z6yuk5yKwMVr7Mkq93PUwyJQb44CK3iD
jGgOrPEL1+JUUFMtSfE0Wzy8wUvMyFq7RDZ39zVooQSz2AcyvcQTbO076dPEKKSZ
tXwvO8J6TDv17s4/ekaoA/+ernwoyQ==
=yYpU
-----END PGP SIGNATURE-----
Merge tag 'folio-5.19a' of git://git.infradead.org/users/willy/pagecache
Pull folio fixes from Matthew Wilcox:
"Four folio-related fixes:
- Don't release a folio while it's still locked
- Fix a use-after-free after dropping the mmap_lock
- Fix a memory leak when splitting a page
- Fix a kernel-doc warning for struct folio"
* tag 'folio-5.19a' of git://git.infradead.org/users/willy/pagecache:
mm: Add kernel-doc for folio->mlock_count
mm/huge_memory: Fix xarray node memory leak
filemap: Cache the value of vm_flags
filemap: Don't release a locked folio
After the commit ca522482e3 ("dm: pass NULL bdev to bio_alloc_clone"),
clone_endio() only calls dm_zone_endio() when DM targets remap the
clone bio's bdev to something other than the md->disk->part0 default.
However, if a DM target (e.g. dm-crypt) stacked ontop of a dm-zoned
does not remap the clone bio using bio_set_dev() then dm_zone_endio()
is not called at completion of the bios and zone locks are not
properly unlocked. This triggers a hang, in dm_zone_map_bio(), when
blktests block/004 is run for dm-crypt on zoned block devices. To
avoid the hang, simply remove the clone_endio() check that verifies
the target remapped the clone bio to a device other than the default.
Fixes: ca522482e3 ("dm: pass NULL bdev to bio_alloc_clone")
Reported-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Signed-off-by: Mike Snitzer <snitzer@kernel.org>