WSL2-Linux-Kernel/include/linux/cpumask.h

1068 строки
29 KiB
C
Исходник Обычный вид История

License cleanup: add SPDX GPL-2.0 license identifier to files with no license Many source files in the tree are missing licensing information, which makes it harder for compliance tools to determine the correct license. By default all files without license information are under the default license of the kernel, which is GPL version 2. Update the files which contain no license information with the 'GPL-2.0' SPDX license identifier. The SPDX identifier is a legally binding shorthand, which can be used instead of the full boiler plate text. This patch is based on work done by Thomas Gleixner and Kate Stewart and Philippe Ombredanne. How this work was done: Patches were generated and checked against linux-4.14-rc6 for a subset of the use cases: - file had no licensing information it it. - file was a */uapi/* one with no licensing information in it, - file was a */uapi/* one with existing licensing information, Further patches will be generated in subsequent months to fix up cases where non-standard license headers were used, and references to license had to be inferred by heuristics based on keywords. The analysis to determine which SPDX License Identifier to be applied to a file was done in a spreadsheet of side by side results from of the output of two independent scanners (ScanCode & Windriver) producing SPDX tag:value files created by Philippe Ombredanne. Philippe prepared the base worksheet, and did an initial spot review of a few 1000 files. The 4.13 kernel was the starting point of the analysis with 60,537 files assessed. Kate Stewart did a file by file comparison of the scanner results in the spreadsheet to determine which SPDX license identifier(s) to be applied to the file. She confirmed any determination that was not immediately clear with lawyers working with the Linux Foundation. Criteria used to select files for SPDX license identifier tagging was: - Files considered eligible had to be source code files. - Make and config files were included as candidates if they contained >5 lines of source - File already had some variant of a license header in it (even if <5 lines). All documentation files were explicitly excluded. The following heuristics were used to determine which SPDX license identifiers to apply. - when both scanners couldn't find any license traces, file was considered to have no license information in it, and the top level COPYING file license applied. For non */uapi/* files that summary was: SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 11139 and resulted in the first patch in this series. If that file was a */uapi/* path one, it was "GPL-2.0 WITH Linux-syscall-note" otherwise it was "GPL-2.0". Results of that was: SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 WITH Linux-syscall-note 930 and resulted in the second patch in this series. - if a file had some form of licensing information in it, and was one of the */uapi/* ones, it was denoted with the Linux-syscall-note if any GPL family license was found in the file or had no licensing in it (per prior point). Results summary: SPDX license identifier # files ---------------------------------------------------|------ GPL-2.0 WITH Linux-syscall-note 270 GPL-2.0+ WITH Linux-syscall-note 169 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause) 21 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) 17 LGPL-2.1+ WITH Linux-syscall-note 15 GPL-1.0+ WITH Linux-syscall-note 14 ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause) 5 LGPL-2.0+ WITH Linux-syscall-note 4 LGPL-2.1 WITH Linux-syscall-note 3 ((GPL-2.0 WITH Linux-syscall-note) OR MIT) 3 ((GPL-2.0 WITH Linux-syscall-note) AND MIT) 1 and that resulted in the third patch in this series. - when the two scanners agreed on the detected license(s), that became the concluded license(s). - when there was disagreement between the two scanners (one detected a license but the other didn't, or they both detected different licenses) a manual inspection of the file occurred. - In most cases a manual inspection of the information in the file resulted in a clear resolution of the license that should apply (and which scanner probably needed to revisit its heuristics). - When it was not immediately clear, the license identifier was confirmed with lawyers working with the Linux Foundation. - If there was any question as to the appropriate license identifier, the file was flagged for further research and to be revisited later in time. In total, over 70 hours of logged manual review was done on the spreadsheet to determine the SPDX license identifiers to apply to the source files by Kate, Philippe, Thomas and, in some cases, confirmation by lawyers working with the Linux Foundation. Kate also obtained a third independent scan of the 4.13 code base from FOSSology, and compared selected files where the other two scanners disagreed against that SPDX file, to see if there was new insights. The Windriver scanner is based on an older version of FOSSology in part, so they are related. Thomas did random spot checks in about 500 files from the spreadsheets for the uapi headers and agreed with SPDX license identifier in the files he inspected. For the non-uapi files Thomas did random spot checks in about 15000 files. In initial set of patches against 4.14-rc6, 3 files were found to have copy/paste license identifier errors, and have been fixed to reflect the correct identifier. Additionally Philippe spent 10 hours this week doing a detailed manual inspection and review of the 12,461 patched files from the initial patch version early this week with: - a full scancode scan run, collecting the matched texts, detected license ids and scores - reviewing anything where there was a license detected (about 500+ files) to ensure that the applied SPDX license was correct - reviewing anything where there was no detection but the patch license was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied SPDX license was correct This produced a worksheet with 20 files needing minor correction. This worksheet was then exported into 3 different .csv files for the different types of files to be modified. These .csv files were then reviewed by Greg. Thomas wrote a script to parse the csv files and add the proper SPDX tag to the file, in the format that the file expected. This script was further refined by Greg based on the output to detect more types of files automatically and to distinguish between header and source .c files (which need different comment types.) Finally Greg ran the script using the .csv files to generate the patches. Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org> Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-01 17:07:57 +03:00
/* SPDX-License-Identifier: GPL-2.0 */
#ifndef __LINUX_CPUMASK_H
#define __LINUX_CPUMASK_H
/*
* Cpumasks provide a bitmap suitable for representing the
* set of CPU's in a system, one bit position per CPU number. In general,
* only nr_cpu_ids (<= NR_CPUS) bits are valid.
*/
#include <linux/kernel.h>
#include <linux/threads.h>
#include <linux/bitmap.h>
#include <linux/atomic.h>
#include <linux/bug.h>
/* Don't assign or return these: may not be this big! */
typedef struct cpumask { DECLARE_BITMAP(bits, NR_CPUS); } cpumask_t;
/**
* cpumask_bits - get the bits in a cpumask
* @maskp: the struct cpumask *
*
* You should only assume nr_cpu_ids bits of this mask are valid. This is
* a macro so it's const-correct.
*/
#define cpumask_bits(maskp) ((maskp)->bits)
mempolicy: add bitmap_onto() and bitmap_fold() operations The following adds two more bitmap operators, bitmap_onto() and bitmap_fold(), with the usual cpumask and nodemask wrappers. The bitmap_onto() operator computes one bitmap relative to another. If the n-th bit in the origin mask is set, then the m-th bit of the destination mask will be set, where m is the position of the n-th set bit in the relative mask. The bitmap_fold() operator folds a bitmap into a second that has bit m set iff the input bitmap has some bit n set, where m == n mod sz, for the specified sz value. There are two substantive changes between this patch and its predecessor bitmap_relative: 1) Renamed bitmap_relative() to be bitmap_onto(). 2) Added bitmap_fold(). The essential motivation for bitmap_onto() is to provide a mechanism for converting a cpuset-relative CPU or Node mask to an absolute mask. Cpuset relative masks are written as if the current task were in a cpuset whose CPUs or Nodes were just the consecutive ones numbered 0..N-1, for some N. The bitmap_onto() operator is provided in anticipation of adding support for the first such cpuset relative mask, by the mbind() and set_mempolicy() system calls, using a planned flag of MPOL_F_RELATIVE_NODES. These bitmap operators (and their nodemask wrappers, in particular) will be used in code that converts the user specified cpuset relative memory policy to a specific system node numbered policy, given the current mems_allowed of the tasks cpuset. Such cpuset relative mempolicies will address two deficiencies of the existing interface between cpusets and mempolicies: 1) A task cannot at present reliably establish a cpuset relative mempolicy because there is an essential race condition, in that the tasks cpuset may be changed in between the time the task can query its cpuset placement, and the time the task can issue the applicable mbind or set_memplicy system call. 2) A task cannot at present establish what cpuset relative mempolicy it would like to have, if it is in a smaller cpuset than it might have mempolicy preferences for, because the existing interface only allows specifying mempolicies for nodes currently allowed by the cpuset. Cpuset relative mempolicies are useful for tasks that don't distinguish particularly between one CPU or Node and another, but only between how many of each are allowed, and the proper placement of threads and memory pages on the various CPUs and Nodes available. The motivation for the added bitmap_fold() can be seen in the following example. Let's say an application has specified some mempolicies that presume 16 memory nodes, including say a mempolicy that specified MPOL_F_RELATIVE_NODES (cpuset relative) nodes 12-15. Then lets say that application is crammed into a cpuset that only has 8 memory nodes, 0-7. If one just uses bitmap_onto(), this mempolicy, mapped to that cpuset, would ignore the requested relative nodes above 7, leaving it empty of nodes. That's not good; better to fold the higher nodes down, so that some nodes are included in the resulting mapped mempolicy. In this case, the mempolicy nodes 12-15 are taken modulo 8 (the weight of the mems_allowed of the confining cpuset), resulting in a mempolicy specifying nodes 4-7. Signed-off-by: Paul Jackson <pj@sgi.com> Signed-off-by: David Rientjes <rientjes@google.com> Cc: Christoph Lameter <clameter@sgi.com> Cc: Andi Kleen <ak@suse.de> Cc: Mel Gorman <mel@csn.ul.ie> Cc: Lee Schermerhorn <lee.schermerhorn@hp.com> Cc: <kosaki.motohiro@jp.fujitsu.com> Cc: <ray-lk@madrabbit.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-28 13:12:29 +04:00
cpumask, nodemask: implement cpumask/nodemask_pr_args() printf family of functions can now format bitmaps using '%*pb[l]' and all cpumask and nodemask formatting will be converted to use it. To ease printing these masks with '%*pb[l]' which require two params - the number of bits and the actual bitmap, this patch implement cpumask_pr_args() and nodemask_pr_args() which can be used to provide arguments for '%*pb[l]' Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: "David S. Miller" <davem@davemloft.net> Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com> Cc: "John W. Linville" <linville@tuxdriver.com> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Chris Metcalf <cmetcalf@tilera.com> Cc: Chris Zankel <chris@zankel.net> Cc: Christoph Lameter <cl@linux.com> Cc: Dmitry Torokhov <dmitry.torokhov@gmail.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Li Zefan <lizefan@huawei.com> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Mike Travis <travis@sgi.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Russell King <linux@arm.linux.org.uk> Cc: Steffen Klassert <steffen.klassert@secunet.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-02-14 01:36:57 +03:00
/**
* cpumask_pr_args - printf args to output a cpumask
* @maskp: cpumask to be printed
*
* Can be used to provide arguments for '%*pb[l]' when printing a cpumask.
*/
#define cpumask_pr_args(maskp) nr_cpu_ids, cpumask_bits(maskp)
x86: Add performance variants of cpumask operators * Increase performance for systems with large count NR_CPUS by limiting the range of the cpumask operators that loop over the bits in a cpumask_t variable. This removes a large amount of wasted cpu cycles. * Add performance variants of the cpumask operators: int cpus_weight_nr(mask) Same using nr_cpu_ids instead of NR_CPUS int first_cpu_nr(mask) Number lowest set bit, or nr_cpu_ids int next_cpu_nr(cpu, mask) Next cpu past 'cpu', or nr_cpu_ids for_each_cpu_mask_nr(cpu, mask) for-loop cpu over mask using nr_cpu_ids * Modify following to use performance variants: #define num_online_cpus() cpus_weight_nr(cpu_online_map) #define num_possible_cpus() cpus_weight_nr(cpu_possible_map) #define num_present_cpus() cpus_weight_nr(cpu_present_map) #define for_each_possible_cpu(cpu) for_each_cpu_mask_nr((cpu), ...) #define for_each_online_cpu(cpu) for_each_cpu_mask_nr((cpu), ...) #define for_each_present_cpu(cpu) for_each_cpu_mask_nr((cpu), ...) * Comment added to include/linux/cpumask.h: Note: The alternate operations with the suffix "_nr" are used to limit the range of the loop to nr_cpu_ids instead of NR_CPUS when NR_CPUS > 64 for performance reasons. If NR_CPUS is <= 64 then most assembler bitmask operators execute faster with a constant range, so the operator will continue to use NR_CPUS. Another consideration is that nr_cpu_ids is initialized to NR_CPUS and isn't lowered until the possible cpus are discovered (including any disabled cpus). So early uses will span the entire range of NR_CPUS. (The net effect is that for systems with 64 or less CPU's there are no functional changes.) For inclusion into sched-devel/latest tree. Based on: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git + sched-devel/latest .../mingo/linux-2.6-sched-devel.git Cc: Paul Jackson <pj@sgi.com> Cc: Christoph Lameter <clameter@sgi.com> Reviewed-by: Paul Jackson <pj@sgi.com> Reviewed-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Mike Travis <travis@sgi.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-05-12 23:21:13 +04:00
#if NR_CPUS == 1
#define nr_cpu_ids 1U
#else
extern unsigned int nr_cpu_ids;
x86: Add performance variants of cpumask operators * Increase performance for systems with large count NR_CPUS by limiting the range of the cpumask operators that loop over the bits in a cpumask_t variable. This removes a large amount of wasted cpu cycles. * Add performance variants of the cpumask operators: int cpus_weight_nr(mask) Same using nr_cpu_ids instead of NR_CPUS int first_cpu_nr(mask) Number lowest set bit, or nr_cpu_ids int next_cpu_nr(cpu, mask) Next cpu past 'cpu', or nr_cpu_ids for_each_cpu_mask_nr(cpu, mask) for-loop cpu over mask using nr_cpu_ids * Modify following to use performance variants: #define num_online_cpus() cpus_weight_nr(cpu_online_map) #define num_possible_cpus() cpus_weight_nr(cpu_possible_map) #define num_present_cpus() cpus_weight_nr(cpu_present_map) #define for_each_possible_cpu(cpu) for_each_cpu_mask_nr((cpu), ...) #define for_each_online_cpu(cpu) for_each_cpu_mask_nr((cpu), ...) #define for_each_present_cpu(cpu) for_each_cpu_mask_nr((cpu), ...) * Comment added to include/linux/cpumask.h: Note: The alternate operations with the suffix "_nr" are used to limit the range of the loop to nr_cpu_ids instead of NR_CPUS when NR_CPUS > 64 for performance reasons. If NR_CPUS is <= 64 then most assembler bitmask operators execute faster with a constant range, so the operator will continue to use NR_CPUS. Another consideration is that nr_cpu_ids is initialized to NR_CPUS and isn't lowered until the possible cpus are discovered (including any disabled cpus). So early uses will span the entire range of NR_CPUS. (The net effect is that for systems with 64 or less CPU's there are no functional changes.) For inclusion into sched-devel/latest tree. Based on: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git + sched-devel/latest .../mingo/linux-2.6-sched-devel.git Cc: Paul Jackson <pj@sgi.com> Cc: Christoph Lameter <clameter@sgi.com> Reviewed-by: Paul Jackson <pj@sgi.com> Reviewed-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Mike Travis <travis@sgi.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-05-12 23:21:13 +04:00
#endif
#ifdef CONFIG_CPUMASK_OFFSTACK
/* Assuming NR_CPUS is huge, a runtime limit is more efficient. Also,
* not all bits may be allocated. */
#define nr_cpumask_bits nr_cpu_ids
#else
cpumask: make "nr_cpumask_bits" unsigned Bit searching functions accept "unsigned long" indices but "nr_cpumask_bits" is "int" which is signed, so inevitable sign extensions occur on x86_64. Those MOVSX are #1 MOVSX bloat by number of uses across whole kernel. Change "nr_cpumask_bits" to unsigned, this number can't be negative after all. It allows to do implicit zero-extension on x86_64 without MOVSX. Change signed comparisons into unsigned comparisons where necessary. Other uses looks fine because it is either argument passed to a function or comparison is already unsigned. Net win on allyesconfig type of kernel: ~2.8 KB (!) add/remove: 0/0 grow/shrink: 8/725 up/down: 93/-2926 (-2833) function old new delta xen_exit_mmap 691 735 +44 qstat_read 426 440 +14 __cpufreq_cooling_register 1678 1687 +9 trace_rb_cpu_prepare 447 455 +8 vermagic 54 60 +6 nfp_driver_version 54 60 +6 rcu_torture_stats_print 1147 1151 +4 find_next_push_cpu 267 269 +2 xen_irq_resume 961 960 -1 ... init_vp_index 946 906 -40 od_set_powersave_bias 328 281 -47 power_cpu_exit 193 139 -54 arch_show_interrupts 3538 3484 -54 select_idle_sibling 1558 1471 -87 Total: Before=158358910, After=158356077, chg -0.00% Same arguments apply to "nr_cpu_ids" but I haven't yet found enough courage to delve into this issue (and proper fix may require new type "cpu_t" which is whole separate story). Link: http://lkml.kernel.org/r/20170309205322.GA1728@avx2 Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-05-09 01:56:15 +03:00
#define nr_cpumask_bits ((unsigned int)NR_CPUS)
#endif
/*
* The following particular system cpumasks and operations manage
* possible, present, active and online cpus.
*
* cpu_possible_mask- has bit 'cpu' set iff cpu is populatable
* cpu_present_mask - has bit 'cpu' set iff cpu is populated
* cpu_online_mask - has bit 'cpu' set iff cpu available to scheduler
* cpu_active_mask - has bit 'cpu' set iff cpu available to migration
*
* If !CONFIG_HOTPLUG_CPU, present == possible, and active == online.
*
* The cpu_possible_mask is fixed at boot time, as the set of CPU id's
* that it is possible might ever be plugged in at anytime during the
* life of that system boot. The cpu_present_mask is dynamic(*),
* representing which CPUs are currently plugged in. And
* cpu_online_mask is the dynamic subset of cpu_present_mask,
* indicating those CPUs available for scheduling.
*
* If HOTPLUG is enabled, then cpu_possible_mask is forced to have
* all NR_CPUS bits set, otherwise it is just the set of CPUs that
* ACPI reports present at boot.
*
* If HOTPLUG is enabled, then cpu_present_mask varies dynamically,
* depending on what ACPI reports as currently plugged in, otherwise
* cpu_present_mask is just a copy of cpu_possible_mask.
*
* (*) Well, cpu_present_mask is dynamic in the hotplug case. If not
* hotplug, it's a copy of cpu_possible_mask, hence fixed at boot.
*
* Subtleties:
* 1) UP arch's (NR_CPUS == 1, CONFIG_SMP not defined) hardcode
* assumption that their single CPU is online. The UP
* cpu_{online,possible,present}_masks are placebos. Changing them
* will have no useful affect on the following num_*_cpus()
* and cpu_*() macros in the UP case. This ugliness is a UP
* optimization - don't waste any instructions or memory references
* asking if you're online or how many CPUs there are if there is
* only one CPU.
*/
extern struct cpumask __cpu_possible_mask;
extern struct cpumask __cpu_online_mask;
extern struct cpumask __cpu_present_mask;
extern struct cpumask __cpu_active_mask;
extern struct cpumask __cpu_dying_mask;
#define cpu_possible_mask ((const struct cpumask *)&__cpu_possible_mask)
#define cpu_online_mask ((const struct cpumask *)&__cpu_online_mask)
#define cpu_present_mask ((const struct cpumask *)&__cpu_present_mask)
#define cpu_active_mask ((const struct cpumask *)&__cpu_active_mask)
#define cpu_dying_mask ((const struct cpumask *)&__cpu_dying_mask)
extern atomic_t __num_online_cpus;
extern cpumask_t cpus_booted_once_mask;
static inline void cpu_max_bits_warn(unsigned int cpu, unsigned int bits)
{
#ifdef CONFIG_DEBUG_PER_CPU_MAPS
WARN_ON_ONCE(cpu >= bits);
#endif /* CONFIG_DEBUG_PER_CPU_MAPS */
}
/* verify cpu argument to cpumask_* operators */
static inline unsigned int cpumask_check(unsigned int cpu)
{
cpu_max_bits_warn(cpu, nr_cpumask_bits);
return cpu;
}
#if NR_CPUS == 1
/* Uniprocessor. Assume all masks are "1". */
static inline unsigned int cpumask_first(const struct cpumask *srcp)
{
return 0;
}
sched/isolcpus: Fix "isolcpus=" boot parameter handling when !CONFIG_CPUMASK_OFFSTACK cpulist_parse() uses nr_cpumask_bits as a limit to parse the passed buffer from kernel commandline. What nr_cpumask_bits represents varies depending upon the CONFIG_CPUMASK_OFFSTACK option: - If CONFIG_CPUMASK_OFFSTACK=n, then nr_cpumask_bits is the same as NR_CPUS, which might not represent the # of CPUs that really exist (default 64). So, there's a chance of a gap between nr_cpu_ids and NR_CPUS, which ultimately lead towards invalid cpulist_parse() operation. For example, if isolcpus=9 is passed on an 8 cpu system (CONFIG_CPUMASK_OFFSTACK=n) it doesn't show the error that it's supposed to. This patch fixes this bug by finding the last CPU of the passed isolcpus= list and checking it against nr_cpu_ids. It also fixes the error message where the nr_cpu_ids should be nr_cpu_ids-1, since CPU numbering starts from 0. Signed-off-by: Rakib Mullick <rakib.mullick@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: adobriyan@gmail.com Cc: akpm@linux-foundation.org Cc: longman@redhat.com Cc: mka@chromium.org Cc: tj@kernel.org Link: http://lkml.kernel.org/r/20171023130154.9050-1-rakib.mullick@gmail.com [ Enhanced the changelog and the kernel message. ] Signed-off-by: Ingo Molnar <mingo@kernel.org> include/linux/cpumask.h | 16 ++++++++++++++++ kernel/sched/topology.c | 4 ++-- 2 files changed, 18 insertions(+), 2 deletions(-)
2017-10-23 16:01:54 +03:00
static inline unsigned int cpumask_last(const struct cpumask *srcp)
{
return 0;
}
/* Valid inputs for n are -1 and 0. */
static inline unsigned int cpumask_next(int n, const struct cpumask *srcp)
{
return n+1;
}
static inline unsigned int cpumask_next_zero(int n, const struct cpumask *srcp)
{
return n+1;
}
static inline unsigned int cpumask_next_and(int n,
const struct cpumask *srcp,
const struct cpumask *andp)
{
return n+1;
}
static inline unsigned int cpumask_next_wrap(int n, const struct cpumask *mask,
int start, bool wrap)
{
/* cpu0 unless stop condition, wrap and at cpu0, then nr_cpumask_bits */
return (wrap && n == 0);
}
/* cpu must be a valid cpu, ie 0, so there's no other choice. */
static inline unsigned int cpumask_any_but(const struct cpumask *mask,
unsigned int cpu)
{
return 1;
}
static inline unsigned int cpumask_local_spread(unsigned int i, int node)
{
return 0;
}
static inline int cpumask_any_and_distribute(const struct cpumask *src1p,
const struct cpumask *src2p) {
return cpumask_next_and(-1, src1p, src2p);
}
static inline int cpumask_any_distribute(const struct cpumask *srcp)
{
return cpumask_first(srcp);
}
#define for_each_cpu(cpu, mask) \
for ((cpu) = 0; (cpu) < 1; (cpu)++, (void)mask)
rcu: Accelerate grace period if last non-dynticked CPU Currently, rcu_needs_cpu() simply checks whether the current CPU has an outstanding RCU callback, which means that the last CPU to go into dyntick-idle mode might wait a few ticks for the relevant grace periods to complete. However, if all the other CPUs are in dyntick-idle mode, and if this CPU is in a quiescent state (which it is for RCU-bh and RCU-sched any time that we are considering going into dyntick-idle mode), then the grace period is instantly complete. This patch therefore repeatedly invokes the RCU grace-period machinery in order to force any needed grace periods to complete quickly. It does so a limited number of times in order to prevent starvation by an RCU callback function that might pass itself to call_rcu(). However, if any CPU other than the current one is not in dyntick-idle mode, fall back to simply checking (with fix to bug noted by Lai Jiangshan). Also, take advantage of last grace-period forcing, the opportunity to do so noted by Steve Rostedt. And apply simplified #ifdef condition suggested by Frederic Weisbecker. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: laijs@cn.fujitsu.com Cc: dipankar@in.ibm.com Cc: mathieu.desnoyers@polymtl.ca Cc: josh@joshtriplett.org Cc: dvhltc@us.ibm.com Cc: niv@us.ibm.com Cc: peterz@infradead.org Cc: rostedt@goodmis.org Cc: Valdis.Kletnieks@vt.edu Cc: dhowells@redhat.com LKML-Reference: <1266887105-1528-15-git-send-email-paulmck@linux.vnet.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-02-23 04:04:59 +03:00
#define for_each_cpu_not(cpu, mask) \
for ((cpu) = 0; (cpu) < 1; (cpu)++, (void)mask)
#define for_each_cpu_wrap(cpu, mask, start) \
for ((cpu) = 0; (cpu) < 1; (cpu)++, (void)mask, (void)(start))
#define for_each_cpu_and(cpu, mask1, mask2) \
for ((cpu) = 0; (cpu) < 1; (cpu)++, (void)mask1, (void)mask2)
#else
/**
* cpumask_first - get the first cpu in a cpumask
* @srcp: the cpumask pointer
*
* Returns >= nr_cpu_ids if no cpus set.
*/
static inline unsigned int cpumask_first(const struct cpumask *srcp)
{
return find_first_bit(cpumask_bits(srcp), nr_cpumask_bits);
}
sched/isolcpus: Fix "isolcpus=" boot parameter handling when !CONFIG_CPUMASK_OFFSTACK cpulist_parse() uses nr_cpumask_bits as a limit to parse the passed buffer from kernel commandline. What nr_cpumask_bits represents varies depending upon the CONFIG_CPUMASK_OFFSTACK option: - If CONFIG_CPUMASK_OFFSTACK=n, then nr_cpumask_bits is the same as NR_CPUS, which might not represent the # of CPUs that really exist (default 64). So, there's a chance of a gap between nr_cpu_ids and NR_CPUS, which ultimately lead towards invalid cpulist_parse() operation. For example, if isolcpus=9 is passed on an 8 cpu system (CONFIG_CPUMASK_OFFSTACK=n) it doesn't show the error that it's supposed to. This patch fixes this bug by finding the last CPU of the passed isolcpus= list and checking it against nr_cpu_ids. It also fixes the error message where the nr_cpu_ids should be nr_cpu_ids-1, since CPU numbering starts from 0. Signed-off-by: Rakib Mullick <rakib.mullick@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: adobriyan@gmail.com Cc: akpm@linux-foundation.org Cc: longman@redhat.com Cc: mka@chromium.org Cc: tj@kernel.org Link: http://lkml.kernel.org/r/20171023130154.9050-1-rakib.mullick@gmail.com [ Enhanced the changelog and the kernel message. ] Signed-off-by: Ingo Molnar <mingo@kernel.org> include/linux/cpumask.h | 16 ++++++++++++++++ kernel/sched/topology.c | 4 ++-- 2 files changed, 18 insertions(+), 2 deletions(-)
2017-10-23 16:01:54 +03:00
/**
* cpumask_last - get the last CPU in a cpumask
* @srcp: - the cpumask pointer
*
* Returns >= nr_cpumask_bits if no CPUs set.
*/
static inline unsigned int cpumask_last(const struct cpumask *srcp)
{
return find_last_bit(cpumask_bits(srcp), nr_cpumask_bits);
}
unsigned int __pure cpumask_next(int n, const struct cpumask *srcp);
/**
* cpumask_next_zero - get the next unset cpu in a cpumask
* @n: the cpu prior to the place to search (ie. return will be > @n)
* @srcp: the cpumask pointer
*
* Returns >= nr_cpu_ids if no further cpus unset.
*/
static inline unsigned int cpumask_next_zero(int n, const struct cpumask *srcp)
{
/* -1 is a legal arg here. */
if (n != -1)
cpumask_check(n);
return find_next_zero_bit(cpumask_bits(srcp), nr_cpumask_bits, n+1);
}
int __pure cpumask_next_and(int n, const struct cpumask *, const struct cpumask *);
int __pure cpumask_any_but(const struct cpumask *mask, unsigned int cpu);
unsigned int cpumask_local_spread(unsigned int i, int node);
int cpumask_any_and_distribute(const struct cpumask *src1p,
const struct cpumask *src2p);
int cpumask_any_distribute(const struct cpumask *srcp);
/**
* for_each_cpu - iterate over every cpu in a mask
* @cpu: the (optionally unsigned) integer iterator
* @mask: the cpumask pointer
*
* After the loop, cpu is >= nr_cpu_ids.
*/
#define for_each_cpu(cpu, mask) \
for ((cpu) = -1; \
(cpu) = cpumask_next((cpu), (mask)), \
(cpu) < nr_cpu_ids;)
rcu: Accelerate grace period if last non-dynticked CPU Currently, rcu_needs_cpu() simply checks whether the current CPU has an outstanding RCU callback, which means that the last CPU to go into dyntick-idle mode might wait a few ticks for the relevant grace periods to complete. However, if all the other CPUs are in dyntick-idle mode, and if this CPU is in a quiescent state (which it is for RCU-bh and RCU-sched any time that we are considering going into dyntick-idle mode), then the grace period is instantly complete. This patch therefore repeatedly invokes the RCU grace-period machinery in order to force any needed grace periods to complete quickly. It does so a limited number of times in order to prevent starvation by an RCU callback function that might pass itself to call_rcu(). However, if any CPU other than the current one is not in dyntick-idle mode, fall back to simply checking (with fix to bug noted by Lai Jiangshan). Also, take advantage of last grace-period forcing, the opportunity to do so noted by Steve Rostedt. And apply simplified #ifdef condition suggested by Frederic Weisbecker. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: laijs@cn.fujitsu.com Cc: dipankar@in.ibm.com Cc: mathieu.desnoyers@polymtl.ca Cc: josh@joshtriplett.org Cc: dvhltc@us.ibm.com Cc: niv@us.ibm.com Cc: peterz@infradead.org Cc: rostedt@goodmis.org Cc: Valdis.Kletnieks@vt.edu Cc: dhowells@redhat.com LKML-Reference: <1266887105-1528-15-git-send-email-paulmck@linux.vnet.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-02-23 04:04:59 +03:00
/**
* for_each_cpu_not - iterate over every cpu in a complemented mask
* @cpu: the (optionally unsigned) integer iterator
* @mask: the cpumask pointer
*
* After the loop, cpu is >= nr_cpu_ids.
*/
#define for_each_cpu_not(cpu, mask) \
for ((cpu) = -1; \
(cpu) = cpumask_next_zero((cpu), (mask)), \
(cpu) < nr_cpu_ids;)
extern int cpumask_next_wrap(int n, const struct cpumask *mask, int start, bool wrap);
/**
* for_each_cpu_wrap - iterate over every cpu in a mask, starting at a specified location
* @cpu: the (optionally unsigned) integer iterator
* @mask: the cpumask pointer
* @start: the start location
*
* The implementation does not assume any bit in @mask is set (including @start).
*
* After the loop, cpu is >= nr_cpu_ids.
*/
#define for_each_cpu_wrap(cpu, mask, start) \
for ((cpu) = cpumask_next_wrap((start)-1, (mask), (start), false); \
(cpu) < nr_cpumask_bits; \
(cpu) = cpumask_next_wrap((cpu), (mask), (start), true))
/**
* for_each_cpu_and - iterate over every cpu in both masks
* @cpu: the (optionally unsigned) integer iterator
* @mask1: the first cpumask pointer
* @mask2: the second cpumask pointer
*
* This saves a temporary CPU mask in many places. It is equivalent to:
* struct cpumask tmp;
* cpumask_and(&tmp, &mask1, &mask2);
* for_each_cpu(cpu, &tmp)
* ...
*
* After the loop, cpu is >= nr_cpu_ids.
*/
#define for_each_cpu_and(cpu, mask1, mask2) \
for ((cpu) = -1; \
(cpu) = cpumask_next_and((cpu), (mask1), (mask2)), \
(cpu) < nr_cpu_ids;)
#endif /* SMP */
#define CPU_BITS_NONE \
{ \
[0 ... BITS_TO_LONGS(NR_CPUS)-1] = 0UL \
}
#define CPU_BITS_CPU0 \
{ \
[0] = 1UL \
}
/**
* cpumask_set_cpu - set a cpu in a cpumask
* @cpu: cpu number (< nr_cpu_ids)
* @dstp: the cpumask pointer
*/
static inline void cpumask_set_cpu(unsigned int cpu, struct cpumask *dstp)
{
set_bit(cpumask_check(cpu), cpumask_bits(dstp));
}
static inline void __cpumask_set_cpu(unsigned int cpu, struct cpumask *dstp)
{
__set_bit(cpumask_check(cpu), cpumask_bits(dstp));
}
/**
* cpumask_clear_cpu - clear a cpu in a cpumask
* @cpu: cpu number (< nr_cpu_ids)
* @dstp: the cpumask pointer
*/
static inline void cpumask_clear_cpu(int cpu, struct cpumask *dstp)
{
clear_bit(cpumask_check(cpu), cpumask_bits(dstp));
}
static inline void __cpumask_clear_cpu(int cpu, struct cpumask *dstp)
{
__clear_bit(cpumask_check(cpu), cpumask_bits(dstp));
}
/**
* cpumask_test_cpu - test for a cpu in a cpumask
* @cpu: cpu number (< nr_cpu_ids)
* @cpumask: the cpumask pointer
*
* Returns 1 if @cpu is set in @cpumask, else returns 0
*/
static inline int cpumask_test_cpu(int cpu, const struct cpumask *cpumask)
{
return test_bit(cpumask_check(cpu), cpumask_bits((cpumask)));
}
/**
* cpumask_test_and_set_cpu - atomically test and set a cpu in a cpumask
* @cpu: cpu number (< nr_cpu_ids)
* @cpumask: the cpumask pointer
*
* Returns 1 if @cpu is set in old bitmap of @cpumask, else returns 0
*
* test_and_set_bit wrapper for cpumasks.
*/
static inline int cpumask_test_and_set_cpu(int cpu, struct cpumask *cpumask)
{
return test_and_set_bit(cpumask_check(cpu), cpumask_bits(cpumask));
}
/**
* cpumask_test_and_clear_cpu - atomically test and clear a cpu in a cpumask
* @cpu: cpu number (< nr_cpu_ids)
* @cpumask: the cpumask pointer
*
* Returns 1 if @cpu is set in old bitmap of @cpumask, else returns 0
*
* test_and_clear_bit wrapper for cpumasks.
*/
static inline int cpumask_test_and_clear_cpu(int cpu, struct cpumask *cpumask)
{
return test_and_clear_bit(cpumask_check(cpu), cpumask_bits(cpumask));
}
/**
* cpumask_setall - set all cpus (< nr_cpu_ids) in a cpumask
* @dstp: the cpumask pointer
*/
static inline void cpumask_setall(struct cpumask *dstp)
{
bitmap_fill(cpumask_bits(dstp), nr_cpumask_bits);
}
/**
* cpumask_clear - clear all cpus (< nr_cpu_ids) in a cpumask
* @dstp: the cpumask pointer
*/
static inline void cpumask_clear(struct cpumask *dstp)
{
bitmap_zero(cpumask_bits(dstp), nr_cpumask_bits);
}
/**
* cpumask_and - *dstp = *src1p & *src2p
* @dstp: the cpumask result
* @src1p: the first input
* @src2p: the second input
*
* If *@dstp is empty, returns 0, else returns 1
*/
static inline int cpumask_and(struct cpumask *dstp,
const struct cpumask *src1p,
const struct cpumask *src2p)
{
return bitmap_and(cpumask_bits(dstp), cpumask_bits(src1p),
cpumask_bits(src2p), nr_cpumask_bits);
}
/**
* cpumask_or - *dstp = *src1p | *src2p
* @dstp: the cpumask result
* @src1p: the first input
* @src2p: the second input
*/
static inline void cpumask_or(struct cpumask *dstp, const struct cpumask *src1p,
const struct cpumask *src2p)
{
bitmap_or(cpumask_bits(dstp), cpumask_bits(src1p),
cpumask_bits(src2p), nr_cpumask_bits);
}
/**
* cpumask_xor - *dstp = *src1p ^ *src2p
* @dstp: the cpumask result
* @src1p: the first input
* @src2p: the second input
*/
static inline void cpumask_xor(struct cpumask *dstp,
const struct cpumask *src1p,
const struct cpumask *src2p)
{
bitmap_xor(cpumask_bits(dstp), cpumask_bits(src1p),
cpumask_bits(src2p), nr_cpumask_bits);
}
/**
* cpumask_andnot - *dstp = *src1p & ~*src2p
* @dstp: the cpumask result
* @src1p: the first input
* @src2p: the second input
*
* If *@dstp is empty, returns 0, else returns 1
*/
static inline int cpumask_andnot(struct cpumask *dstp,
const struct cpumask *src1p,
const struct cpumask *src2p)
{
return bitmap_andnot(cpumask_bits(dstp), cpumask_bits(src1p),
cpumask_bits(src2p), nr_cpumask_bits);
}
/**
* cpumask_complement - *dstp = ~*srcp
* @dstp: the cpumask result
* @srcp: the input to invert
*/
static inline void cpumask_complement(struct cpumask *dstp,
const struct cpumask *srcp)
{
bitmap_complement(cpumask_bits(dstp), cpumask_bits(srcp),
nr_cpumask_bits);
}
/**
* cpumask_equal - *src1p == *src2p
* @src1p: the first input
* @src2p: the second input
*/
static inline bool cpumask_equal(const struct cpumask *src1p,
const struct cpumask *src2p)
{
return bitmap_equal(cpumask_bits(src1p), cpumask_bits(src2p),
nr_cpumask_bits);
}
/**
* cpumask_or_equal - *src1p | *src2p == *src3p
* @src1p: the first input
* @src2p: the second input
* @src3p: the third input
*/
static inline bool cpumask_or_equal(const struct cpumask *src1p,
const struct cpumask *src2p,
const struct cpumask *src3p)
{
return bitmap_or_equal(cpumask_bits(src1p), cpumask_bits(src2p),
cpumask_bits(src3p), nr_cpumask_bits);
}
/**
* cpumask_intersects - (*src1p & *src2p) != 0
* @src1p: the first input
* @src2p: the second input
*/
static inline bool cpumask_intersects(const struct cpumask *src1p,
const struct cpumask *src2p)
{
return bitmap_intersects(cpumask_bits(src1p), cpumask_bits(src2p),
nr_cpumask_bits);
}
/**
* cpumask_subset - (*src1p & ~*src2p) == 0
* @src1p: the first input
* @src2p: the second input
*
* Returns 1 if *@src1p is a subset of *@src2p, else returns 0
*/
static inline int cpumask_subset(const struct cpumask *src1p,
const struct cpumask *src2p)
{
return bitmap_subset(cpumask_bits(src1p), cpumask_bits(src2p),
nr_cpumask_bits);
}
/**
* cpumask_empty - *srcp == 0
* @srcp: the cpumask to that all cpus < nr_cpu_ids are clear.
*/
static inline bool cpumask_empty(const struct cpumask *srcp)
{
return bitmap_empty(cpumask_bits(srcp), nr_cpumask_bits);
}
/**
* cpumask_full - *srcp == 0xFFFFFFFF...
* @srcp: the cpumask to that all cpus < nr_cpu_ids are set.
*/
static inline bool cpumask_full(const struct cpumask *srcp)
{
return bitmap_full(cpumask_bits(srcp), nr_cpumask_bits);
}
/**
* cpumask_weight - Count of bits in *srcp
* @srcp: the cpumask to count bits (< nr_cpu_ids) in.
*/
static inline unsigned int cpumask_weight(const struct cpumask *srcp)
{
return bitmap_weight(cpumask_bits(srcp), nr_cpumask_bits);
}
/**
* cpumask_shift_right - *dstp = *srcp >> n
* @dstp: the cpumask result
* @srcp: the input to shift
* @n: the number of bits to shift by
*/
static inline void cpumask_shift_right(struct cpumask *dstp,
const struct cpumask *srcp, int n)
{
bitmap_shift_right(cpumask_bits(dstp), cpumask_bits(srcp), n,
nr_cpumask_bits);
}
/**
* cpumask_shift_left - *dstp = *srcp << n
* @dstp: the cpumask result
* @srcp: the input to shift
* @n: the number of bits to shift by
*/
static inline void cpumask_shift_left(struct cpumask *dstp,
const struct cpumask *srcp, int n)
{
bitmap_shift_left(cpumask_bits(dstp), cpumask_bits(srcp), n,
nr_cpumask_bits);
}
/**
* cpumask_copy - *dstp = *srcp
* @dstp: the result
* @srcp: the input cpumask
*/
static inline void cpumask_copy(struct cpumask *dstp,
const struct cpumask *srcp)
{
bitmap_copy(cpumask_bits(dstp), cpumask_bits(srcp), nr_cpumask_bits);
}
/**
* cpumask_any - pick a "random" cpu from *srcp
* @srcp: the input cpumask
*
* Returns >= nr_cpu_ids if no cpus set.
*/
#define cpumask_any(srcp) cpumask_first(srcp)
/**
* cpumask_first_and - return the first cpu from *srcp1 & *srcp2
* @src1p: the first input
* @src2p: the second input
*
* Returns >= nr_cpu_ids if no cpus set in both. See also cpumask_next_and().
*/
#define cpumask_first_and(src1p, src2p) cpumask_next_and(-1, (src1p), (src2p))
/**
* cpumask_any_and - pick a "random" cpu from *mask1 & *mask2
* @mask1: the first input cpumask
* @mask2: the second input cpumask
*
* Returns >= nr_cpu_ids if no cpus set.
*/
#define cpumask_any_and(mask1, mask2) cpumask_first_and((mask1), (mask2))
/**
* cpumask_of - the cpumask containing just a given cpu
* @cpu: the cpu (<= nr_cpu_ids)
*/
#define cpumask_of(cpu) (get_cpu_mask(cpu))
/**
* cpumask_parse_user - extract a cpumask from a user string
* @buf: the buffer to extract from
* @len: the length of the buffer
* @dstp: the cpumask to set.
*
* Returns -errno, or 0 for success.
*/
static inline int cpumask_parse_user(const char __user *buf, int len,
struct cpumask *dstp)
{
cpumask: use nr_cpumask_bits for parsing functions Commit 513e3d2d11c9 ("cpumask: always use nr_cpu_ids in formatting and parsing functions") converted both cpumask printing and parsing functions to use nr_cpu_ids instead of nr_cpumask_bits. While this was okay for the printing functions as it just picked one of the two output formats that we were alternating between depending on a kernel config, doing the same for parsing wasn't okay. nr_cpumask_bits can be either nr_cpu_ids or NR_CPUS. We can always use nr_cpu_ids but that is a variable while NR_CPUS is a constant, so it can be more efficient to use NR_CPUS when we can get away with it. Converting the printing functions to nr_cpu_ids makes sense because it affects how the masks get presented to userspace and doesn't break anything; however, using nr_cpu_ids for parsing functions can incorrectly leave the higher bits uninitialized while reading in these masks from userland. As all testing and comparison functions use nr_cpumask_bits which can be larger than nr_cpu_ids, the parsed cpumasks can erroneously yield false negative results. This made the taskstats interface incorrectly return -EINVAL even when the inputs were correct. Fix it by restoring the parse functions to use nr_cpumask_bits instead of nr_cpu_ids. Link: http://lkml.kernel.org/r/20170206182442.GB31078@htj.duckdns.org Fixes: 513e3d2d11c9 ("cpumask: always use nr_cpu_ids in formatting and parsing functions") Signed-off-by: Tejun Heo <tj@kernel.org> Reported-by: Martin Steigerwald <martin.steigerwald@teamix.de> Debugged-by: Ben Hutchings <ben.hutchings@codethink.co.uk> Cc: <stable@vger.kernel.org> [4.0+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-02-09 01:30:56 +03:00
return bitmap_parse_user(buf, len, cpumask_bits(dstp), nr_cpumask_bits);
}
/**
* cpumask_parselist_user - extract a cpumask from a user string
* @buf: the buffer to extract from
* @len: the length of the buffer
* @dstp: the cpumask to set.
*
* Returns -errno, or 0 for success.
*/
static inline int cpumask_parselist_user(const char __user *buf, int len,
struct cpumask *dstp)
{
return bitmap_parselist_user(buf, len, cpumask_bits(dstp),
cpumask: use nr_cpumask_bits for parsing functions Commit 513e3d2d11c9 ("cpumask: always use nr_cpu_ids in formatting and parsing functions") converted both cpumask printing and parsing functions to use nr_cpu_ids instead of nr_cpumask_bits. While this was okay for the printing functions as it just picked one of the two output formats that we were alternating between depending on a kernel config, doing the same for parsing wasn't okay. nr_cpumask_bits can be either nr_cpu_ids or NR_CPUS. We can always use nr_cpu_ids but that is a variable while NR_CPUS is a constant, so it can be more efficient to use NR_CPUS when we can get away with it. Converting the printing functions to nr_cpu_ids makes sense because it affects how the masks get presented to userspace and doesn't break anything; however, using nr_cpu_ids for parsing functions can incorrectly leave the higher bits uninitialized while reading in these masks from userland. As all testing and comparison functions use nr_cpumask_bits which can be larger than nr_cpu_ids, the parsed cpumasks can erroneously yield false negative results. This made the taskstats interface incorrectly return -EINVAL even when the inputs were correct. Fix it by restoring the parse functions to use nr_cpumask_bits instead of nr_cpu_ids. Link: http://lkml.kernel.org/r/20170206182442.GB31078@htj.duckdns.org Fixes: 513e3d2d11c9 ("cpumask: always use nr_cpu_ids in formatting and parsing functions") Signed-off-by: Tejun Heo <tj@kernel.org> Reported-by: Martin Steigerwald <martin.steigerwald@teamix.de> Debugged-by: Ben Hutchings <ben.hutchings@codethink.co.uk> Cc: <stable@vger.kernel.org> [4.0+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-02-09 01:30:56 +03:00
nr_cpumask_bits);
}
/**
* cpumask_parse - extract a cpumask from a string
* @buf: the buffer to extract from
* @dstp: the cpumask to set.
*
* Returns -errno, or 0 for success.
*/
static inline int cpumask_parse(const char *buf, struct cpumask *dstp)
{
return bitmap_parse(buf, UINT_MAX, cpumask_bits(dstp), nr_cpumask_bits);
}
/**
* cpulist_parse - extract a cpumask from a user string of ranges
* @buf: the buffer to extract from
* @dstp: the cpumask to set.
*
* Returns -errno, or 0 for success.
*/
static inline int cpulist_parse(const char *buf, struct cpumask *dstp)
{
cpumask: use nr_cpumask_bits for parsing functions Commit 513e3d2d11c9 ("cpumask: always use nr_cpu_ids in formatting and parsing functions") converted both cpumask printing and parsing functions to use nr_cpu_ids instead of nr_cpumask_bits. While this was okay for the printing functions as it just picked one of the two output formats that we were alternating between depending on a kernel config, doing the same for parsing wasn't okay. nr_cpumask_bits can be either nr_cpu_ids or NR_CPUS. We can always use nr_cpu_ids but that is a variable while NR_CPUS is a constant, so it can be more efficient to use NR_CPUS when we can get away with it. Converting the printing functions to nr_cpu_ids makes sense because it affects how the masks get presented to userspace and doesn't break anything; however, using nr_cpu_ids for parsing functions can incorrectly leave the higher bits uninitialized while reading in these masks from userland. As all testing and comparison functions use nr_cpumask_bits which can be larger than nr_cpu_ids, the parsed cpumasks can erroneously yield false negative results. This made the taskstats interface incorrectly return -EINVAL even when the inputs were correct. Fix it by restoring the parse functions to use nr_cpumask_bits instead of nr_cpu_ids. Link: http://lkml.kernel.org/r/20170206182442.GB31078@htj.duckdns.org Fixes: 513e3d2d11c9 ("cpumask: always use nr_cpu_ids in formatting and parsing functions") Signed-off-by: Tejun Heo <tj@kernel.org> Reported-by: Martin Steigerwald <martin.steigerwald@teamix.de> Debugged-by: Ben Hutchings <ben.hutchings@codethink.co.uk> Cc: <stable@vger.kernel.org> [4.0+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-02-09 01:30:56 +03:00
return bitmap_parselist(buf, cpumask_bits(dstp), nr_cpumask_bits);
}
/**
* cpumask_size - size to allocate for a 'struct cpumask' in bytes
*/
static inline unsigned int cpumask_size(void)
{
return BITS_TO_LONGS(nr_cpumask_bits) * sizeof(long);
}
/*
* cpumask_var_t: struct cpumask for stack usage.
*
* Oh, the wicked games we play! In order to make kernel coding a
* little more difficult, we typedef cpumask_var_t to an array or a
* pointer: doing &mask on an array is a noop, so it still works.
*
* ie.
* cpumask_var_t tmpmask;
* if (!alloc_cpumask_var(&tmpmask, GFP_KERNEL))
* return -ENOMEM;
*
* ... use 'tmpmask' like a normal struct cpumask * ...
*
* free_cpumask_var(tmpmask);
*
*
* However, one notable exception is there. alloc_cpumask_var() allocates
* only nr_cpumask_bits bits (in the other hand, real cpumask_t always has
* NR_CPUS bits). Therefore you don't have to dereference cpumask_var_t.
*
* cpumask_var_t tmpmask;
* if (!alloc_cpumask_var(&tmpmask, GFP_KERNEL))
* return -ENOMEM;
*
* var = *tmpmask;
*
* This code makes NR_CPUS length memcopy and brings to a memory corruption.
* cpumask_copy() provide safe copy functionality.
*
* Note that there is another evil here: If you define a cpumask_var_t
* as a percpu variable then the way to obtain the address of the cpumask
* structure differently influences what this_cpu_* operation needs to be
* used. Please use this_cpu_cpumask_var_t in those cases. The direct use
* of this_cpu_ptr() or this_cpu_read() will lead to failures when the
* other type of cpumask_var_t implementation is configured.
*
* Please also note that __cpumask_var_read_mostly can be used to declare
* a cpumask_var_t variable itself (not its content) as read mostly.
*/
#ifdef CONFIG_CPUMASK_OFFSTACK
typedef struct cpumask *cpumask_var_t;
#define this_cpu_cpumask_var_ptr(x) this_cpu_read(x)
#define __cpumask_var_read_mostly __read_mostly
bool alloc_cpumask_var_node(cpumask_var_t *mask, gfp_t flags, int node);
bool alloc_cpumask_var(cpumask_var_t *mask, gfp_t flags);
bool zalloc_cpumask_var_node(cpumask_var_t *mask, gfp_t flags, int node);
bool zalloc_cpumask_var(cpumask_var_t *mask, gfp_t flags);
void alloc_bootmem_cpumask_var(cpumask_var_t *mask);
void free_cpumask_var(cpumask_var_t mask);
void free_bootmem_cpumask_var(cpumask_var_t mask);
static inline bool cpumask_available(cpumask_var_t mask)
{
return mask != NULL;
}
#else
typedef struct cpumask cpumask_var_t[1];
#define this_cpu_cpumask_var_ptr(x) this_cpu_ptr(x)
#define __cpumask_var_read_mostly
static inline bool alloc_cpumask_var(cpumask_var_t *mask, gfp_t flags)
{
return true;
}
static inline bool alloc_cpumask_var_node(cpumask_var_t *mask, gfp_t flags,
int node)
{
return true;
}
static inline bool zalloc_cpumask_var(cpumask_var_t *mask, gfp_t flags)
{
cpumask_clear(*mask);
return true;
}
static inline bool zalloc_cpumask_var_node(cpumask_var_t *mask, gfp_t flags,
int node)
{
cpumask_clear(*mask);
return true;
}
static inline void alloc_bootmem_cpumask_var(cpumask_var_t *mask)
{
}
static inline void free_cpumask_var(cpumask_var_t mask)
{
}
static inline void free_bootmem_cpumask_var(cpumask_var_t mask)
{
}
static inline bool cpumask_available(cpumask_var_t mask)
{
return true;
}
#endif /* CONFIG_CPUMASK_OFFSTACK */
/* It's common to want to use cpu_all_mask in struct member initializers,
* so it has to refer to an address rather than a pointer. */
extern const DECLARE_BITMAP(cpu_all_bits, NR_CPUS);
#define cpu_all_mask to_cpumask(cpu_all_bits)
/* First bits of cpu_bit_bitmap are in fact unset. */
#define cpu_none_mask to_cpumask(cpu_bit_bitmap[0])
#define for_each_possible_cpu(cpu) for_each_cpu((cpu), cpu_possible_mask)
#define for_each_online_cpu(cpu) for_each_cpu((cpu), cpu_online_mask)
#define for_each_present_cpu(cpu) for_each_cpu((cpu), cpu_present_mask)
/* Wrappers for arch boot code to manipulate normally-constant masks */
void init_cpu_present(const struct cpumask *src);
void init_cpu_possible(const struct cpumask *src);
void init_cpu_online(const struct cpumask *src);
static inline void reset_cpu_possible_mask(void)
{
bitmap_zero(cpumask_bits(&__cpu_possible_mask), NR_CPUS);
}
static inline void
set_cpu_possible(unsigned int cpu, bool possible)
{
if (possible)
cpumask_set_cpu(cpu, &__cpu_possible_mask);
else
cpumask_clear_cpu(cpu, &__cpu_possible_mask);
}
static inline void
set_cpu_present(unsigned int cpu, bool present)
{
if (present)
cpumask_set_cpu(cpu, &__cpu_present_mask);
else
cpumask_clear_cpu(cpu, &__cpu_present_mask);
}
void set_cpu_online(unsigned int cpu, bool online);
static inline void
set_cpu_active(unsigned int cpu, bool active)
{
if (active)
cpumask_set_cpu(cpu, &__cpu_active_mask);
else
cpumask_clear_cpu(cpu, &__cpu_active_mask);
}
static inline void
set_cpu_dying(unsigned int cpu, bool dying)
{
if (dying)
cpumask_set_cpu(cpu, &__cpu_dying_mask);
else
cpumask_clear_cpu(cpu, &__cpu_dying_mask);
}
/**
* to_cpumask - convert an NR_CPUS bitmap to a struct cpumask *
* @bitmap: the bitmap
*
* There are a few places where cpumask_var_t isn't appropriate and
* static cpumasks must be used (eg. very early boot), yet we don't
* expose the definition of 'struct cpumask'.
*
* This does the conversion, and can be used as a constant initializer.
*/
#define to_cpumask(bitmap) \
((struct cpumask *)(1 ? (bitmap) \
: (void *)sizeof(__check_is_bitmap(bitmap))))
static inline int __check_is_bitmap(const unsigned long *bitmap)
{
return 1;
}
/*
* Special-case data structure for "single bit set only" constant CPU masks.
*
* We pre-generate all the 64 (or 32) possible bit positions, with enough
* padding to the left and the right, and return the constant pointer
* appropriately offset.
*/
extern const unsigned long
cpu_bit_bitmap[BITS_PER_LONG+1][BITS_TO_LONGS(NR_CPUS)];
static inline const struct cpumask *get_cpu_mask(unsigned int cpu)
{
const unsigned long *p = cpu_bit_bitmap[1 + cpu % BITS_PER_LONG];
p -= cpu / BITS_PER_LONG;
return to_cpumask(p);
}
#if NR_CPUS > 1
/**
* num_online_cpus() - Read the number of online CPUs
*
* Despite the fact that __num_online_cpus is of type atomic_t, this
* interface gives only a momentary snapshot and is not protected against
* concurrent CPU hotplug operations unless invoked from a cpuhp_lock held
* region.
*/
static inline unsigned int num_online_cpus(void)
{
return atomic_read(&__num_online_cpus);
}
#define num_possible_cpus() cpumask_weight(cpu_possible_mask)
#define num_present_cpus() cpumask_weight(cpu_present_mask)
#define num_active_cpus() cpumask_weight(cpu_active_mask)
static inline bool cpu_online(unsigned int cpu)
{
return cpumask_test_cpu(cpu, cpu_online_mask);
}
static inline bool cpu_possible(unsigned int cpu)
{
return cpumask_test_cpu(cpu, cpu_possible_mask);
}
static inline bool cpu_present(unsigned int cpu)
{
return cpumask_test_cpu(cpu, cpu_present_mask);
}
static inline bool cpu_active(unsigned int cpu)
{
return cpumask_test_cpu(cpu, cpu_active_mask);
}
static inline bool cpu_dying(unsigned int cpu)
{
return cpumask_test_cpu(cpu, cpu_dying_mask);
}
#else
#define num_online_cpus() 1U
#define num_possible_cpus() 1U
#define num_present_cpus() 1U
#define num_active_cpus() 1U
static inline bool cpu_online(unsigned int cpu)
{
return cpu == 0;
}
static inline bool cpu_possible(unsigned int cpu)
{
return cpu == 0;
}
static inline bool cpu_present(unsigned int cpu)
{
return cpu == 0;
}
static inline bool cpu_active(unsigned int cpu)
{
return cpu == 0;
}
static inline bool cpu_dying(unsigned int cpu)
{
return false;
}
#endif /* NR_CPUS > 1 */
#define cpu_is_offline(cpu) unlikely(!cpu_online(cpu))
#if NR_CPUS <= BITS_PER_LONG
#define CPU_BITS_ALL \
{ \
[BITS_TO_LONGS(NR_CPUS)-1] = BITMAP_LAST_WORD_MASK(NR_CPUS) \
}
#else /* NR_CPUS > BITS_PER_LONG */
#define CPU_BITS_ALL \
{ \
[0 ... BITS_TO_LONGS(NR_CPUS)-2] = ~0UL, \
[BITS_TO_LONGS(NR_CPUS)-1] = BITMAP_LAST_WORD_MASK(NR_CPUS) \
}
#endif /* NR_CPUS > BITS_PER_LONG */
/**
* cpumap_print_to_pagebuf - copies the cpumask into the buffer either
* as comma-separated list of cpus or hex values of cpumask
* @list: indicates whether the cpumap must be list
* @mask: the cpumask to copy
* @buf: the buffer to copy into
*
* Returns the length of the (null-terminated) @buf string, zero if
* nothing is copied.
*/
static inline ssize_t
cpumap_print_to_pagebuf(bool list, char *buf, const struct cpumask *mask)
{
return bitmap_print_to_pagebuf(list, buf, cpumask_bits(mask),
cpumask: always use nr_cpu_ids in formatting and parsing functions bitmap implements two variants of scnprintf functions to format a bitmap into a string and cpumask and nodemask wrap them to provide equivalent interfaces. The scnprintf family of functions require a string buffer as an output target which complicates code paths which just want to print out the mask through printk for informational or debug purposes as they have to worry about how large the buffer should be and whether it's too large to allocate on stack. Neither cpumask or nodemask provides a guildeline on how large the target buffer should be forcing users come up with their own solutions - some allocate an arbitrarily sized buffer which is small enough to allocate on stack but may be too short in corner cases, other come up with a custom upper limit calculation considering the output format, some allocate the buffer dynamically while one resorted to using lock to synchronize access to a static buffer. This is an artificial problem which is being solved repeatedly for no benefit. In a lot of cases, the output area already exists and can be targeted directly making the intermediate buffer unnecessary. This patchset teaches printf family of functions how to format bitmaps and replace the dedicated formatting functions with it. Pointer formatting is extended to cover bitmap formatting. It uses the field width for the number of bits instead of precision. The format used is '%*pb[l]', with the optional trailing 'l' specifying list format instead of hex masks. For more details, please see 0002. This patch (of 31): Currently, the formatting and parsing functions in cpumask.h use nr_cpumask_bits like other cpumask functions; however, nr_cpumask_bits is either NR_CPUS or nr_cpu_ids depending on CONFIG_CPUMASK_OFFSTACK. This leads to inconsistent behaviors. With CONFIG_NR_CPUS=512 and !CONFIG_CPUMASK_OFFSTACK # cat /sys/devices/virtual/net/lo/queues/rx-0/rps_cpus 00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000 # cat /proc/self/status | grep Cpus_allowed: Cpus_allowed: f With CONFIG_NR_CPUS=1024 and CONFIG_CPUMASK_OFFSTACK (fedora default) # cat /sys/devices/virtual/net/lo/queues/rx-0/rps_cpus 0 # cat /proc/self/status | grep Cpus_allowed: Cpus_allowed: f Note that /proc/self/status is always using nr_cpu_ids regardless of config. This is because seq cpumask formattings functions always use nr_cpu_ids. Given that the same output fields may switch between the two forms, converging on nr_cpu_ids always isn't too likely to surprise userland. This patch updates the formatting and parsing functions in cpumask.h to always use nr_cpu_ids. There's no point in dealing with CPUs which aren't even possible on the machine. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com> Cc: "John W. Linville" <linville@tuxdriver.com> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Chris Metcalf <cmetcalf@tilera.com> Cc: Chris Zankel <chris@zankel.net> Cc: Christoph Lameter <cl@linux.com> Cc: Dmitry Torokhov <dmitry.torokhov@gmail.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Li Zefan <lizefan@huawei.com> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Mike Travis <travis@sgi.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Russell King <linux@arm.linux.org.uk> Acked-by: Rusty Russell <rusty@rustcorp.com.au> Cc: Steffen Klassert <steffen.klassert@secunet.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-02-14 01:36:50 +03:00
nr_cpu_ids);
}
cpumask: introduce cpumap_print_list/bitmask_to_buf to support large bitmask and list The existing cpumap_print_to_pagebuf() is used by cpu topology and other drivers to export hexadecimal bitmask and decimal list to userspace by sysfs ABI. Right now, those drivers are using a normal attribute for this kind of ABIs. A normal attribute typically has show entry as below: static ssize_t example_dev_show(struct device *dev, struct device_attribute *attr, char *buf) { ... return cpumap_print_to_pagebuf(true, buf, &pmu_mmdc->cpu); } show entry of attribute has no offset and count parameters and this means the file is limited to one page only. cpumap_print_to_pagebuf() API works terribly well for this kind of normal attribute with buf parameter and without offset, count: static inline ssize_t cpumap_print_to_pagebuf(bool list, char *buf, const struct cpumask *mask) { return bitmap_print_to_pagebuf(list, buf, cpumask_bits(mask), nr_cpu_ids); } The problem is once we have many cpus, we have a chance to make bitmask or list more than one page. Especially for list, it could be as complex as 0,3,5,7,9,...... We have no simple way to know it exact size. It turns out bin_attribute is a way to break this limit. bin_attribute has show entry as below: static ssize_t example_bin_attribute_show(struct file *filp, struct kobject *kobj, struct bin_attribute *attr, char *buf, loff_t offset, size_t count) { ... } With the new offset and count parameters, this makes sysfs ABI be able to support file size more than one page. For example, offset could be >= 4096. This patch introduces cpumap_print_bitmask/list_to_buf() and their bitmap infrastructure bitmap_print_bitmask/list_to_buf() so that those drivers can move to bin_attribute to support large bitmask and list. At the same time, we have to pass those corresponding parameters such as offset, count from bin_attribute to this new API. Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Stefano Brivio <sbrivio@redhat.com> Cc: Alexander Gordeev <agordeev@linux.ibm.com> Cc: "Ma, Jianpeng" <jianpeng.ma@intel.com> Cc: Yury Norov <yury.norov@gmail.com> Cc: Valentin Schneider <valentin.schneider@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Daniel Bristot de Oliveira <bristot@redhat.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Tian Tao <tiantao6@hisilicon.com> Signed-off-by: Barry Song <song.bao.hua@hisilicon.com> Link: https://lore.kernel.org/r/20210806110251.560-2-song.bao.hua@hisilicon.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-08-06 14:02:47 +03:00
/**
* cpumap_print_bitmask_to_buf - copies the cpumask into the buffer as
* hex values of cpumask
*
* @buf: the buffer to copy into
* @mask: the cpumask to copy
* @off: in the string from which we are copying, we copy to @buf
* @count: the maximum number of bytes to print
*
* The function prints the cpumask into the buffer as hex values of
* cpumask; Typically used by bin_attribute to export cpumask bitmask
* ABI.
*
* Returns the length of how many bytes have been copied, excluding
* terminating '\0'.
cpumask: introduce cpumap_print_list/bitmask_to_buf to support large bitmask and list The existing cpumap_print_to_pagebuf() is used by cpu topology and other drivers to export hexadecimal bitmask and decimal list to userspace by sysfs ABI. Right now, those drivers are using a normal attribute for this kind of ABIs. A normal attribute typically has show entry as below: static ssize_t example_dev_show(struct device *dev, struct device_attribute *attr, char *buf) { ... return cpumap_print_to_pagebuf(true, buf, &pmu_mmdc->cpu); } show entry of attribute has no offset and count parameters and this means the file is limited to one page only. cpumap_print_to_pagebuf() API works terribly well for this kind of normal attribute with buf parameter and without offset, count: static inline ssize_t cpumap_print_to_pagebuf(bool list, char *buf, const struct cpumask *mask) { return bitmap_print_to_pagebuf(list, buf, cpumask_bits(mask), nr_cpu_ids); } The problem is once we have many cpus, we have a chance to make bitmask or list more than one page. Especially for list, it could be as complex as 0,3,5,7,9,...... We have no simple way to know it exact size. It turns out bin_attribute is a way to break this limit. bin_attribute has show entry as below: static ssize_t example_bin_attribute_show(struct file *filp, struct kobject *kobj, struct bin_attribute *attr, char *buf, loff_t offset, size_t count) { ... } With the new offset and count parameters, this makes sysfs ABI be able to support file size more than one page. For example, offset could be >= 4096. This patch introduces cpumap_print_bitmask/list_to_buf() and their bitmap infrastructure bitmap_print_bitmask/list_to_buf() so that those drivers can move to bin_attribute to support large bitmask and list. At the same time, we have to pass those corresponding parameters such as offset, count from bin_attribute to this new API. Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Stefano Brivio <sbrivio@redhat.com> Cc: Alexander Gordeev <agordeev@linux.ibm.com> Cc: "Ma, Jianpeng" <jianpeng.ma@intel.com> Cc: Yury Norov <yury.norov@gmail.com> Cc: Valentin Schneider <valentin.schneider@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Daniel Bristot de Oliveira <bristot@redhat.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Tian Tao <tiantao6@hisilicon.com> Signed-off-by: Barry Song <song.bao.hua@hisilicon.com> Link: https://lore.kernel.org/r/20210806110251.560-2-song.bao.hua@hisilicon.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-08-06 14:02:47 +03:00
*/
static inline ssize_t
cpumap_print_bitmask_to_buf(char *buf, const struct cpumask *mask,
loff_t off, size_t count)
{
return bitmap_print_bitmask_to_buf(buf, cpumask_bits(mask),
nr_cpu_ids, off, count) - 1;
cpumask: introduce cpumap_print_list/bitmask_to_buf to support large bitmask and list The existing cpumap_print_to_pagebuf() is used by cpu topology and other drivers to export hexadecimal bitmask and decimal list to userspace by sysfs ABI. Right now, those drivers are using a normal attribute for this kind of ABIs. A normal attribute typically has show entry as below: static ssize_t example_dev_show(struct device *dev, struct device_attribute *attr, char *buf) { ... return cpumap_print_to_pagebuf(true, buf, &pmu_mmdc->cpu); } show entry of attribute has no offset and count parameters and this means the file is limited to one page only. cpumap_print_to_pagebuf() API works terribly well for this kind of normal attribute with buf parameter and without offset, count: static inline ssize_t cpumap_print_to_pagebuf(bool list, char *buf, const struct cpumask *mask) { return bitmap_print_to_pagebuf(list, buf, cpumask_bits(mask), nr_cpu_ids); } The problem is once we have many cpus, we have a chance to make bitmask or list more than one page. Especially for list, it could be as complex as 0,3,5,7,9,...... We have no simple way to know it exact size. It turns out bin_attribute is a way to break this limit. bin_attribute has show entry as below: static ssize_t example_bin_attribute_show(struct file *filp, struct kobject *kobj, struct bin_attribute *attr, char *buf, loff_t offset, size_t count) { ... } With the new offset and count parameters, this makes sysfs ABI be able to support file size more than one page. For example, offset could be >= 4096. This patch introduces cpumap_print_bitmask/list_to_buf() and their bitmap infrastructure bitmap_print_bitmask/list_to_buf() so that those drivers can move to bin_attribute to support large bitmask and list. At the same time, we have to pass those corresponding parameters such as offset, count from bin_attribute to this new API. Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Stefano Brivio <sbrivio@redhat.com> Cc: Alexander Gordeev <agordeev@linux.ibm.com> Cc: "Ma, Jianpeng" <jianpeng.ma@intel.com> Cc: Yury Norov <yury.norov@gmail.com> Cc: Valentin Schneider <valentin.schneider@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Daniel Bristot de Oliveira <bristot@redhat.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Tian Tao <tiantao6@hisilicon.com> Signed-off-by: Barry Song <song.bao.hua@hisilicon.com> Link: https://lore.kernel.org/r/20210806110251.560-2-song.bao.hua@hisilicon.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-08-06 14:02:47 +03:00
}
/**
* cpumap_print_list_to_buf - copies the cpumask into the buffer as
* comma-separated list of cpus
*
* Everything is same with the above cpumap_print_bitmask_to_buf()
* except the print format.
*/
static inline ssize_t
cpumap_print_list_to_buf(char *buf, const struct cpumask *mask,
loff_t off, size_t count)
{
return bitmap_print_list_to_buf(buf, cpumask_bits(mask),
nr_cpu_ids, off, count) - 1;
cpumask: introduce cpumap_print_list/bitmask_to_buf to support large bitmask and list The existing cpumap_print_to_pagebuf() is used by cpu topology and other drivers to export hexadecimal bitmask and decimal list to userspace by sysfs ABI. Right now, those drivers are using a normal attribute for this kind of ABIs. A normal attribute typically has show entry as below: static ssize_t example_dev_show(struct device *dev, struct device_attribute *attr, char *buf) { ... return cpumap_print_to_pagebuf(true, buf, &pmu_mmdc->cpu); } show entry of attribute has no offset and count parameters and this means the file is limited to one page only. cpumap_print_to_pagebuf() API works terribly well for this kind of normal attribute with buf parameter and without offset, count: static inline ssize_t cpumap_print_to_pagebuf(bool list, char *buf, const struct cpumask *mask) { return bitmap_print_to_pagebuf(list, buf, cpumask_bits(mask), nr_cpu_ids); } The problem is once we have many cpus, we have a chance to make bitmask or list more than one page. Especially for list, it could be as complex as 0,3,5,7,9,...... We have no simple way to know it exact size. It turns out bin_attribute is a way to break this limit. bin_attribute has show entry as below: static ssize_t example_bin_attribute_show(struct file *filp, struct kobject *kobj, struct bin_attribute *attr, char *buf, loff_t offset, size_t count) { ... } With the new offset and count parameters, this makes sysfs ABI be able to support file size more than one page. For example, offset could be >= 4096. This patch introduces cpumap_print_bitmask/list_to_buf() and their bitmap infrastructure bitmap_print_bitmask/list_to_buf() so that those drivers can move to bin_attribute to support large bitmask and list. At the same time, we have to pass those corresponding parameters such as offset, count from bin_attribute to this new API. Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Stefano Brivio <sbrivio@redhat.com> Cc: Alexander Gordeev <agordeev@linux.ibm.com> Cc: "Ma, Jianpeng" <jianpeng.ma@intel.com> Cc: Yury Norov <yury.norov@gmail.com> Cc: Valentin Schneider <valentin.schneider@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Daniel Bristot de Oliveira <bristot@redhat.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Tian Tao <tiantao6@hisilicon.com> Signed-off-by: Barry Song <song.bao.hua@hisilicon.com> Link: https://lore.kernel.org/r/20210806110251.560-2-song.bao.hua@hisilicon.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-08-06 14:02:47 +03:00
}
#if NR_CPUS <= BITS_PER_LONG
#define CPU_MASK_ALL \
(cpumask_t) { { \
[BITS_TO_LONGS(NR_CPUS)-1] = BITMAP_LAST_WORD_MASK(NR_CPUS) \
} }
#else
#define CPU_MASK_ALL \
(cpumask_t) { { \
[0 ... BITS_TO_LONGS(NR_CPUS)-2] = ~0UL, \
[BITS_TO_LONGS(NR_CPUS)-1] = BITMAP_LAST_WORD_MASK(NR_CPUS) \
} }
#endif /* NR_CPUS > BITS_PER_LONG */
#define CPU_MASK_NONE \
(cpumask_t) { { \
[0 ... BITS_TO_LONGS(NR_CPUS)-1] = 0UL \
} }
#define CPU_MASK_CPU0 \
(cpumask_t) { { \
[0] = 1UL \
} }
drivers/base: fix userspace break from using bin_attributes for cpumap and cpulist [ Upstream commit 7ee951acd31a88f941fd6535fbdee3a1567f1d63 ] Using bin_attributes with a 0 size causes fstat and friends to return that 0 size. This breaks userspace code that retrieves the size before reading the file. Rather than reverting 75bd50fa841 ("drivers/base/node.c: use bin_attribute to break the size limitation of cpumap ABI") let's put in a size value at compile time. For cpulist the maximum size is on the order of NR_CPUS * (ceil(log10(NR_CPUS)) + 1)/2 which for 8192 is 20480 (8192 * 5)/2. In order to get near that you'd need a system with every other CPU on one node. For example: (0,2,4,8, ... ). To simplify the math and support larger NR_CPUS in the future we are using (NR_CPUS * 7)/2. We also set it to a min of PAGE_SIZE to retain the older behavior for smaller NR_CPUS. The cpumap file the size works out to be NR_CPUS/4 + NR_CPUS/32 - 1 (or NR_CPUS * 9/32 - 1) including the ","s. Add a set of macros for these values to cpumask.h so they can be used in multiple places. Apply these to the handful of such files in drivers/base/topology.c as well as node.c. As an example, on an 80 cpu 4-node system (NR_CPUS == 8192): before: -r--r--r--. 1 root root 0 Jul 12 14:08 system/node/node0/cpulist -r--r--r--. 1 root root 0 Jul 11 17:25 system/node/node0/cpumap after: -r--r--r--. 1 root root 28672 Jul 13 11:32 system/node/node0/cpulist -r--r--r--. 1 root root 4096 Jul 13 11:31 system/node/node0/cpumap CONFIG_NR_CPUS = 16384 -r--r--r--. 1 root root 57344 Jul 13 14:03 system/node/node0/cpulist -r--r--r--. 1 root root 4607 Jul 13 14:02 system/node/node0/cpumap The actual number of cpus doesn't matter for the reported size since they are based on NR_CPUS. Fixes: 75bd50fa841d ("drivers/base/node.c: use bin_attribute to break the size limitation of cpumap ABI") Fixes: bb9ec13d156e ("topology: use bin_attribute to break the size limitation of cpumap ABI") Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: "Rafael J. Wysocki" <rafael@kernel.org> Cc: Yury Norov <yury.norov@gmail.com> Cc: stable@vger.kernel.org Acked-by: Yury Norov <yury.norov@gmail.com> (for include/linux/cpumask.h) Signed-off-by: Phil Auld <pauld@redhat.com> Link: https://lore.kernel.org/r/20220715134924.3466194-1-pauld@redhat.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-07-15 16:49:24 +03:00
/*
* Provide a valid theoretical max size for cpumap and cpulist sysfs files
* to avoid breaking userspace which may allocate a buffer based on the size
* reported by e.g. fstat.
*
* for cpumap NR_CPUS * 9/32 - 1 should be an exact length.
*
* For cpulist 7 is (ceil(log10(NR_CPUS)) + 1) allowing for NR_CPUS to be up
* to 2 orders of magnitude larger than 8192. And then we divide by 2 to
* cover a worst-case of every other cpu being on one of two nodes for a
* very large NR_CPUS.
*
drivers/base: Fix unsigned comparison to -1 in CPUMAP_FILE_MAX_BYTES commit d7f06bdd6ee87fbefa05af5f57361d85e7715b11 upstream. As PAGE_SIZE is unsigned long, -1 > PAGE_SIZE when NR_CPUS <= 3. This leads to very large file sizes: topology$ ls -l total 0 -r--r--r-- 1 root root 18446744073709551615 Sep 5 11:59 core_cpus -r--r--r-- 1 root root 4096 Sep 5 11:59 core_cpus_list -r--r--r-- 1 root root 4096 Sep 5 10:58 core_id -r--r--r-- 1 root root 18446744073709551615 Sep 5 10:10 core_siblings -r--r--r-- 1 root root 4096 Sep 5 11:59 core_siblings_list -r--r--r-- 1 root root 18446744073709551615 Sep 5 11:59 die_cpus -r--r--r-- 1 root root 4096 Sep 5 11:59 die_cpus_list -r--r--r-- 1 root root 4096 Sep 5 11:59 die_id -r--r--r-- 1 root root 18446744073709551615 Sep 5 11:59 package_cpus -r--r--r-- 1 root root 4096 Sep 5 11:59 package_cpus_list -r--r--r-- 1 root root 4096 Sep 5 10:58 physical_package_id -r--r--r-- 1 root root 18446744073709551615 Sep 5 10:10 thread_siblings -r--r--r-- 1 root root 4096 Sep 5 11:59 thread_siblings_list Adjust the inequality to catch the case when NR_CPUS is configured to a small value. Fixes: 7ee951acd31a ("drivers/base: fix userspace break from using bin_attributes for cpumap and cpulist") Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: "Rafael J. Wysocki" <rafael@kernel.org> Cc: Yury Norov <yury.norov@gmail.com> Cc: stable@vger.kernel.org Cc: feng xiangjun <fengxj325@gmail.com> Reported-by: feng xiangjun <fengxj325@gmail.com> Signed-off-by: Phil Auld <pauld@redhat.com> Signed-off-by: Yury Norov <yury.norov@gmail.com> Link: https://lore.kernel.org/r/20220906203542.1796629-1-pauld@redhat.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-09-06 23:35:42 +03:00
* Use PAGE_SIZE as a minimum for smaller configurations while avoiding
* unsigned comparison to -1.
drivers/base: fix userspace break from using bin_attributes for cpumap and cpulist [ Upstream commit 7ee951acd31a88f941fd6535fbdee3a1567f1d63 ] Using bin_attributes with a 0 size causes fstat and friends to return that 0 size. This breaks userspace code that retrieves the size before reading the file. Rather than reverting 75bd50fa841 ("drivers/base/node.c: use bin_attribute to break the size limitation of cpumap ABI") let's put in a size value at compile time. For cpulist the maximum size is on the order of NR_CPUS * (ceil(log10(NR_CPUS)) + 1)/2 which for 8192 is 20480 (8192 * 5)/2. In order to get near that you'd need a system with every other CPU on one node. For example: (0,2,4,8, ... ). To simplify the math and support larger NR_CPUS in the future we are using (NR_CPUS * 7)/2. We also set it to a min of PAGE_SIZE to retain the older behavior for smaller NR_CPUS. The cpumap file the size works out to be NR_CPUS/4 + NR_CPUS/32 - 1 (or NR_CPUS * 9/32 - 1) including the ","s. Add a set of macros for these values to cpumask.h so they can be used in multiple places. Apply these to the handful of such files in drivers/base/topology.c as well as node.c. As an example, on an 80 cpu 4-node system (NR_CPUS == 8192): before: -r--r--r--. 1 root root 0 Jul 12 14:08 system/node/node0/cpulist -r--r--r--. 1 root root 0 Jul 11 17:25 system/node/node0/cpumap after: -r--r--r--. 1 root root 28672 Jul 13 11:32 system/node/node0/cpulist -r--r--r--. 1 root root 4096 Jul 13 11:31 system/node/node0/cpumap CONFIG_NR_CPUS = 16384 -r--r--r--. 1 root root 57344 Jul 13 14:03 system/node/node0/cpulist -r--r--r--. 1 root root 4607 Jul 13 14:02 system/node/node0/cpumap The actual number of cpus doesn't matter for the reported size since they are based on NR_CPUS. Fixes: 75bd50fa841d ("drivers/base/node.c: use bin_attribute to break the size limitation of cpumap ABI") Fixes: bb9ec13d156e ("topology: use bin_attribute to break the size limitation of cpumap ABI") Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: "Rafael J. Wysocki" <rafael@kernel.org> Cc: Yury Norov <yury.norov@gmail.com> Cc: stable@vger.kernel.org Acked-by: Yury Norov <yury.norov@gmail.com> (for include/linux/cpumask.h) Signed-off-by: Phil Auld <pauld@redhat.com> Link: https://lore.kernel.org/r/20220715134924.3466194-1-pauld@redhat.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-07-15 16:49:24 +03:00
*/
drivers/base: Fix unsigned comparison to -1 in CPUMAP_FILE_MAX_BYTES commit d7f06bdd6ee87fbefa05af5f57361d85e7715b11 upstream. As PAGE_SIZE is unsigned long, -1 > PAGE_SIZE when NR_CPUS <= 3. This leads to very large file sizes: topology$ ls -l total 0 -r--r--r-- 1 root root 18446744073709551615 Sep 5 11:59 core_cpus -r--r--r-- 1 root root 4096 Sep 5 11:59 core_cpus_list -r--r--r-- 1 root root 4096 Sep 5 10:58 core_id -r--r--r-- 1 root root 18446744073709551615 Sep 5 10:10 core_siblings -r--r--r-- 1 root root 4096 Sep 5 11:59 core_siblings_list -r--r--r-- 1 root root 18446744073709551615 Sep 5 11:59 die_cpus -r--r--r-- 1 root root 4096 Sep 5 11:59 die_cpus_list -r--r--r-- 1 root root 4096 Sep 5 11:59 die_id -r--r--r-- 1 root root 18446744073709551615 Sep 5 11:59 package_cpus -r--r--r-- 1 root root 4096 Sep 5 11:59 package_cpus_list -r--r--r-- 1 root root 4096 Sep 5 10:58 physical_package_id -r--r--r-- 1 root root 18446744073709551615 Sep 5 10:10 thread_siblings -r--r--r-- 1 root root 4096 Sep 5 11:59 thread_siblings_list Adjust the inequality to catch the case when NR_CPUS is configured to a small value. Fixes: 7ee951acd31a ("drivers/base: fix userspace break from using bin_attributes for cpumap and cpulist") Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: "Rafael J. Wysocki" <rafael@kernel.org> Cc: Yury Norov <yury.norov@gmail.com> Cc: stable@vger.kernel.org Cc: feng xiangjun <fengxj325@gmail.com> Reported-by: feng xiangjun <fengxj325@gmail.com> Signed-off-by: Phil Auld <pauld@redhat.com> Signed-off-by: Yury Norov <yury.norov@gmail.com> Link: https://lore.kernel.org/r/20220906203542.1796629-1-pauld@redhat.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-09-06 23:35:42 +03:00
#define CPUMAP_FILE_MAX_BYTES (((NR_CPUS * 9)/32 > PAGE_SIZE) \
drivers/base: fix userspace break from using bin_attributes for cpumap and cpulist [ Upstream commit 7ee951acd31a88f941fd6535fbdee3a1567f1d63 ] Using bin_attributes with a 0 size causes fstat and friends to return that 0 size. This breaks userspace code that retrieves the size before reading the file. Rather than reverting 75bd50fa841 ("drivers/base/node.c: use bin_attribute to break the size limitation of cpumap ABI") let's put in a size value at compile time. For cpulist the maximum size is on the order of NR_CPUS * (ceil(log10(NR_CPUS)) + 1)/2 which for 8192 is 20480 (8192 * 5)/2. In order to get near that you'd need a system with every other CPU on one node. For example: (0,2,4,8, ... ). To simplify the math and support larger NR_CPUS in the future we are using (NR_CPUS * 7)/2. We also set it to a min of PAGE_SIZE to retain the older behavior for smaller NR_CPUS. The cpumap file the size works out to be NR_CPUS/4 + NR_CPUS/32 - 1 (or NR_CPUS * 9/32 - 1) including the ","s. Add a set of macros for these values to cpumask.h so they can be used in multiple places. Apply these to the handful of such files in drivers/base/topology.c as well as node.c. As an example, on an 80 cpu 4-node system (NR_CPUS == 8192): before: -r--r--r--. 1 root root 0 Jul 12 14:08 system/node/node0/cpulist -r--r--r--. 1 root root 0 Jul 11 17:25 system/node/node0/cpumap after: -r--r--r--. 1 root root 28672 Jul 13 11:32 system/node/node0/cpulist -r--r--r--. 1 root root 4096 Jul 13 11:31 system/node/node0/cpumap CONFIG_NR_CPUS = 16384 -r--r--r--. 1 root root 57344 Jul 13 14:03 system/node/node0/cpulist -r--r--r--. 1 root root 4607 Jul 13 14:02 system/node/node0/cpumap The actual number of cpus doesn't matter for the reported size since they are based on NR_CPUS. Fixes: 75bd50fa841d ("drivers/base/node.c: use bin_attribute to break the size limitation of cpumap ABI") Fixes: bb9ec13d156e ("topology: use bin_attribute to break the size limitation of cpumap ABI") Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: "Rafael J. Wysocki" <rafael@kernel.org> Cc: Yury Norov <yury.norov@gmail.com> Cc: stable@vger.kernel.org Acked-by: Yury Norov <yury.norov@gmail.com> (for include/linux/cpumask.h) Signed-off-by: Phil Auld <pauld@redhat.com> Link: https://lore.kernel.org/r/20220715134924.3466194-1-pauld@redhat.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-07-15 16:49:24 +03:00
? (NR_CPUS * 9)/32 - 1 : PAGE_SIZE)
#define CPULIST_FILE_MAX_BYTES (((NR_CPUS * 7)/2 > PAGE_SIZE) ? (NR_CPUS * 7)/2 : PAGE_SIZE)
#endif /* __LINUX_CPUMASK_H */