Merge branch 'akpm' (patches from Andrew)

Merge first patch-bomb from Andrew Morton:

 - some misc things

 - ocfs2 updates

 - about half of MM

 - checkpatch updates

 - autofs4 update

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (120 commits)
  autofs4: fix string.h include in auto_dev-ioctl.h
  autofs4: use pr_xxx() macros directly for logging
  autofs4: change log print macros to not insert newline
  autofs4: make autofs log prints consistent
  autofs4: fix some white space errors
  autofs4: fix invalid ioctl return in autofs4_root_ioctl_unlocked()
  autofs4: fix coding style line length in autofs4_wait()
  autofs4: fix coding style problem in autofs4_get_set_timeout()
  autofs4: coding style fixes
  autofs: show pipe inode in mount options
  kallsyms: add support for relative offsets in kallsyms address table
  kallsyms: don't overload absolute symbol type for percpu symbols
  x86: kallsyms: disable absolute percpu symbols on !SMP
  checkpatch: fix another left brace warning
  checkpatch: improve UNSPECIFIED_INT test for bare signed/unsigned uses
  checkpatch: warn on bare unsigned or signed declarations without int
  checkpatch: exclude asm volatile from complex macro check
  mm: memcontrol: drop unnecessary lru locking from mem_cgroup_migrate()
  mm: migrate: consolidate mem_cgroup_migrate() calls
  mm/compaction: speed up pageblock_pfn_to_page() when zone is contiguous
  ...
Linus Torvalds 2016-03-16 11:51:08 -07:00
Parents: aa6865d836 63c06227a2
Commit: 271ecc5253
119 changed files: 3156 additions and 1822 deletions


@ -1759,7 +1759,9 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
keepinitrd [HW,ARM] keepinitrd [HW,ARM]
kernelcore=nn[KMG] [KNL,X86,IA-64,PPC] This parameter kernelcore= [KNL,X86,IA-64,PPC]
Format: nn[KMGTPE] | "mirror"
This parameter
specifies the amount of memory usable by the kernel specifies the amount of memory usable by the kernel
for non-movable allocations. The requested amount is for non-movable allocations. The requested amount is
spread evenly throughout all nodes in the system. The spread evenly throughout all nodes in the system. The
@ -1775,6 +1777,14 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
use the HighMem zone if it exists, and the Normal use the HighMem zone if it exists, and the Normal
zone if it does not. zone if it does not.
Instead of specifying the amount of memory (nn[KMGTPE]),
you can specify "mirror" option. In case "mirror"
option is specified, mirrored (reliable) memory is used
for non-movable allocations and remaining memory is used
for Movable pages. nn[KMGTPE] and "mirror" are exclusive,
so you can NOT specify nn[KMGTPE] and "mirror" at the same
time.
kgdbdbgp= [KGDB,HW] kgdb over EHCI usb debug port. kgdbdbgp= [KGDB,HW] kgdb over EHCI usb debug port.
Format: <Controller#>[,poll interval] Format: <Controller#>[,poll interval]
The controller # is the number of the ehci usb debug The controller # is the number of the ehci usb debug
@ -2732,6 +2742,11 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
we can turn it on. we can turn it on.
on: enable the feature on: enable the feature
page_poison= [KNL] Boot-time parameter changing the state of
poisoning on the buddy allocator.
off: turn off poisoning
on: turn on poisoning
panic= [KNL] Kernel behaviour on panic: delay <timeout> panic= [KNL] Kernel behaviour on panic: delay <timeout>
timeout > 0: seconds before rebooting timeout > 0: seconds before rebooting
timeout = 0: wait forever timeout = 0: wait forever


@ -256,10 +256,27 @@ If the memory block is offline, you'll read "offline".
5.2. How to online memory 5.2. How to online memory
------------ ------------
Even if the memory is hot-added, it is not at ready-to-use state. When the memory is hot-added, the kernel decides whether or not to "online"
For using newly added memory, you have to "online" the memory block. it according to the policy which can be read from "auto_online_blocks" file:
For onlining, you have to write "online" to the memory block's state file as: % cat /sys/devices/system/memory/auto_online_blocks
The default is "offline" which means the newly added memory is not in a
ready-to-use state and you have to "online" the newly added memory blocks
manually. Automatic onlining can be requested by writing "online" to
"auto_online_blocks" file:
% echo online > /sys/devices/system/memory/auto_online_blocks
This sets a global policy and impacts all memory blocks that will subsequently
be hotplugged. Currently offline blocks keep their state. It is possible, under
certain circumstances, that some memory blocks will be added but will fail to
online. User space tools can check their "state" files
(/sys/devices/system/memory/memoryXXX/state) and try to online them manually.
If the automatic onlining wasn't requested, failed, or some memory block was
offlined it is possible to change the individual block's state by writing to the
"state" file:
% echo online > /sys/devices/system/memory/memoryXXX/state % echo online > /sys/devices/system/memory/memoryXXX/state
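As a hedged illustration of the interface described above (not part of the patch), a userspace helper might online a single block by writing to its "state" file; the sysfs path in the usage comment is an example only:

/* Hypothetical userspace sketch: online one memory block via sysfs. */
#include <stdio.h>

static int online_memory_block(const char *state_path)
{
        FILE *f = fopen(state_path, "w");

        if (!f)
                return -1;
        if (fputs("online", f) == EOF) {
                fclose(f);
                return -1;
        }
        return fclose(f);
}

/* Usage: online_memory_block("/sys/devices/system/memory/memory32/state"); */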


@ -298,6 +298,24 @@ bitmap and its derivatives such as cpumask and nodemask:
Passed by reference. Passed by reference.
Flags bitfields such as page flags, gfp_flags:
%pGp referenced|uptodate|lru|active|private
%pGg GFP_USER|GFP_DMA32|GFP_NOWARN
%pGv read|exec|mayread|maywrite|mayexec|denywrite
For printing flags bitfields as a collection of symbolic constants that
would construct the value. The type of flags is given by the third
character. Currently supported are [p]age flags, [v]ma_flags (both
expect unsigned long *) and [g]fp_flags (expects gfp_t *). The flag
names and print order depends on the particular type.
Note that this format should not be used directly in TP_printk() part
of a tracepoint. Instead, use the show_*_flags() functions from
<trace/events/mmflags.h>.
Passed by reference.
Network device features: Network device features:
%pNF 0x000000000000c000 %pNF 0x000000000000c000
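A hedged sketch (not part of this diff) of how the %pG* specifiers documented above might be used from kernel code; the function and its arguments are hypothetical, but the pointer types follow the rules stated above (unsigned long * for page flags, gfp_t * for gfp flags):

#include <linux/mm.h>
#include <linux/printk.h>

static void dump_alloc_context(struct page *page, gfp_t gfp_mask)
{
        /* Prints symbolic flag names, e.g. "referenced|uptodate|lru" */
        pr_info("page flags: %pGp\n", &page->flags);
        /* The gfp variant takes a pointer to the gfp_t value */
        pr_info("gfp flags:  %pGg\n", &gfp_mask);
}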


@ -28,10 +28,11 @@ with page owner and page owner is disabled in runtime due to no enabling
boot option, runtime overhead is marginal. If disabled in runtime, it boot option, runtime overhead is marginal. If disabled in runtime, it
doesn't require memory to store owner information, so there is no runtime doesn't require memory to store owner information, so there is no runtime
memory overhead. And, page owner inserts just two unlikely branches into memory overhead. And, page owner inserts just two unlikely branches into
the page allocator hotpath and if it returns false then allocation is the page allocator hotpath and if not enabled, then allocation is done
done like as the kernel without page owner. These two unlikely branches like as the kernel without page owner. These two unlikely branches should
would not affect to allocation performance. Following is the kernel's not affect to allocation performance, especially if the static keys jump
code size change due to this facility. label patching functionality is available. Following is the kernel's code
size change due to this facility.
- Without page owner - Without page owner
text data bss dec hex filename text data bss dec hex filename


@ -35,8 +35,8 @@ slub_debug=<Debug-Options>,<slab name>
Enable options only for select slabs Enable options only for select slabs
Possible debug options are Possible debug options are
F Sanity checks on (enables SLAB_DEBUG_FREE. Sorry F Sanity checks on (enables SLAB_DEBUG_CONSISTENCY_CHECKS
SLAB legacy issues) Sorry SLAB legacy issues)
Z Red zoning Z Red zoning
P Poisoning (object and padding) P Poisoning (object and padding)
U User tracking (free and alloc) U User tracking (free and alloc)


@ -97,6 +97,8 @@ extern unsigned long get_fb_unmapped_area(struct file *filp, unsigned long,
unsigned long); unsigned long);
#define HAVE_ARCH_FB_UNMAPPED_AREA #define HAVE_ARCH_FB_UNMAPPED_AREA
#define pgprot_writecombine pgprot_noncached
#include <asm-generic/pgtable.h> #include <asm-generic/pgtable.h>
#endif /* _BLACKFIN_PGTABLE_H */ #endif /* _BLACKFIN_PGTABLE_H */


@ -59,21 +59,24 @@ void free_initrd_mem(unsigned long, unsigned long);
void __init zone_sizes_init(void) void __init zone_sizes_init(void)
{ {
unsigned long zones_size[MAX_NR_ZONES] = {0, }; unsigned long zones_size[MAX_NR_ZONES] = {0, };
unsigned long max_dma;
unsigned long low;
unsigned long start_pfn; unsigned long start_pfn;
#ifdef CONFIG_MMU #ifdef CONFIG_MMU
start_pfn = START_PFN(0); {
max_dma = virt_to_phys((char *)MAX_DMA_ADDRESS) >> PAGE_SHIFT; unsigned long low;
low = MAX_LOW_PFN(0); unsigned long max_dma;
if (low < max_dma){ start_pfn = START_PFN(0);
zones_size[ZONE_DMA] = low - start_pfn; max_dma = virt_to_phys((char *)MAX_DMA_ADDRESS) >> PAGE_SHIFT;
zones_size[ZONE_NORMAL] = 0; low = MAX_LOW_PFN(0);
} else {
zones_size[ZONE_DMA] = low - start_pfn; if (low < max_dma) {
zones_size[ZONE_NORMAL] = low - max_dma; zones_size[ZONE_DMA] = low - start_pfn;
zones_size[ZONE_NORMAL] = 0;
} else {
zones_size[ZONE_DMA] = low - start_pfn;
zones_size[ZONE_NORMAL] = low - max_dma;
}
} }
#else #else
zones_size[ZONE_DMA] = 0 >> PAGE_SHIFT; zones_size[ZONE_DMA] = 0 >> PAGE_SHIFT;


@ -11,6 +11,7 @@
#include <linux/export.h> #include <linux/export.h>
#include <linux/kdebug.h> #include <linux/kdebug.h>
#include <linux/ptrace.h> #include <linux/ptrace.h>
#include <linux/mm.h>
#include <linux/module.h> #include <linux/module.h>
#include <linux/sched.h> #include <linux/sched.h>
#include <asm/processor.h> #include <asm/processor.h>
@ -189,9 +190,8 @@ void die(struct pt_regs *regs, const char *str)
#ifdef CONFIG_SMP #ifdef CONFIG_SMP
printk("SMP "); printk("SMP ");
#endif #endif
#ifdef CONFIG_DEBUG_PAGEALLOC if (debug_pagealloc_enabled())
printk("DEBUG_PAGEALLOC"); printk("DEBUG_PAGEALLOC");
#endif
printk("\n"); printk("\n");
notify_die(DIE_OOPS, str, regs, 0, regs->int_code & 0xffff, SIGSEGV); notify_die(DIE_OOPS, str, regs, 0, regs->int_code & 0xffff, SIGSEGV);
print_modules(); print_modules();


@ -94,16 +94,15 @@ static int vmem_add_mem(unsigned long start, unsigned long size, int ro)
pgd_populate(&init_mm, pg_dir, pu_dir); pgd_populate(&init_mm, pg_dir, pu_dir);
} }
pu_dir = pud_offset(pg_dir, address); pu_dir = pud_offset(pg_dir, address);
#ifndef CONFIG_DEBUG_PAGEALLOC
if (MACHINE_HAS_EDAT2 && pud_none(*pu_dir) && address && if (MACHINE_HAS_EDAT2 && pud_none(*pu_dir) && address &&
!(address & ~PUD_MASK) && (address + PUD_SIZE <= end)) { !(address & ~PUD_MASK) && (address + PUD_SIZE <= end) &&
!debug_pagealloc_enabled()) {
pud_val(*pu_dir) = __pa(address) | pud_val(*pu_dir) = __pa(address) |
_REGION_ENTRY_TYPE_R3 | _REGION3_ENTRY_LARGE | _REGION_ENTRY_TYPE_R3 | _REGION3_ENTRY_LARGE |
(ro ? _REGION_ENTRY_PROTECT : 0); (ro ? _REGION_ENTRY_PROTECT : 0);
address += PUD_SIZE; address += PUD_SIZE;
continue; continue;
} }
#endif
if (pud_none(*pu_dir)) { if (pud_none(*pu_dir)) {
pm_dir = vmem_pmd_alloc(); pm_dir = vmem_pmd_alloc();
if (!pm_dir) if (!pm_dir)
@ -111,9 +110,9 @@ static int vmem_add_mem(unsigned long start, unsigned long size, int ro)
pud_populate(&init_mm, pu_dir, pm_dir); pud_populate(&init_mm, pu_dir, pm_dir);
} }
pm_dir = pmd_offset(pu_dir, address); pm_dir = pmd_offset(pu_dir, address);
#ifndef CONFIG_DEBUG_PAGEALLOC
if (MACHINE_HAS_EDAT1 && pmd_none(*pm_dir) && address && if (MACHINE_HAS_EDAT1 && pmd_none(*pm_dir) && address &&
!(address & ~PMD_MASK) && (address + PMD_SIZE <= end)) { !(address & ~PMD_MASK) && (address + PMD_SIZE <= end) &&
!debug_pagealloc_enabled()) {
pmd_val(*pm_dir) = __pa(address) | pmd_val(*pm_dir) = __pa(address) |
_SEGMENT_ENTRY | _SEGMENT_ENTRY_LARGE | _SEGMENT_ENTRY | _SEGMENT_ENTRY_LARGE |
_SEGMENT_ENTRY_YOUNG | _SEGMENT_ENTRY_YOUNG |
@ -121,7 +120,6 @@ static int vmem_add_mem(unsigned long start, unsigned long size, int ro)
address += PMD_SIZE; address += PMD_SIZE;
continue; continue;
} }
#endif
if (pmd_none(*pm_dir)) { if (pmd_none(*pm_dir)) {
pt_dir = vmem_pte_alloc(address); pt_dir = vmem_pte_alloc(address);
if (!pt_dir) if (!pt_dir)


@ -265,9 +265,8 @@ int __die(const char *str, struct pt_regs *regs, long err)
#ifdef CONFIG_SMP #ifdef CONFIG_SMP
printk("SMP "); printk("SMP ");
#endif #endif
#ifdef CONFIG_DEBUG_PAGEALLOC if (debug_pagealloc_enabled())
printk("DEBUG_PAGEALLOC "); printk("DEBUG_PAGEALLOC ");
#endif
#ifdef CONFIG_KASAN #ifdef CONFIG_KASAN
printk("KASAN"); printk("KASAN");
#endif #endif


@ -150,13 +150,14 @@ static int page_size_mask;
static void __init probe_page_size_mask(void) static void __init probe_page_size_mask(void)
{ {
#if !defined(CONFIG_DEBUG_PAGEALLOC) && !defined(CONFIG_KMEMCHECK) #if !defined(CONFIG_KMEMCHECK)
/* /*
* For CONFIG_DEBUG_PAGEALLOC, identity mapping will use small pages. * For CONFIG_KMEMCHECK or pagealloc debugging, identity mapping will
* use small pages.
* This will simplify cpa(), which otherwise needs to support splitting * This will simplify cpa(), which otherwise needs to support splitting
* large pages into small in interrupt context, etc. * large pages into small in interrupt context, etc.
*/ */
if (cpu_has_pse) if (cpu_has_pse && !debug_pagealloc_enabled())
page_size_mask |= 1 << PG_LEVEL_2M; page_size_mask |= 1 << PG_LEVEL_2M;
#endif #endif
@ -666,21 +667,22 @@ void free_init_pages(char *what, unsigned long begin, unsigned long end)
* mark them not present - any buggy init-section access will * mark them not present - any buggy init-section access will
* create a kernel page fault: * create a kernel page fault:
*/ */
#ifdef CONFIG_DEBUG_PAGEALLOC if (debug_pagealloc_enabled()) {
printk(KERN_INFO "debug: unmapping init [mem %#010lx-%#010lx]\n", pr_info("debug: unmapping init [mem %#010lx-%#010lx]\n",
begin, end - 1); begin, end - 1);
set_memory_np(begin, (end - begin) >> PAGE_SHIFT); set_memory_np(begin, (end - begin) >> PAGE_SHIFT);
#else } else {
/* /*
* We just marked the kernel text read only above, now that * We just marked the kernel text read only above, now that
* we are going to free part of that, we need to make that * we are going to free part of that, we need to make that
* writeable and non-executable first. * writeable and non-executable first.
*/ */
set_memory_nx(begin, (end - begin) >> PAGE_SHIFT); set_memory_nx(begin, (end - begin) >> PAGE_SHIFT);
set_memory_rw(begin, (end - begin) >> PAGE_SHIFT); set_memory_rw(begin, (end - begin) >> PAGE_SHIFT);
free_reserved_area((void *)begin, (void *)end, POISON_FREE_INITMEM, what); free_reserved_area((void *)begin, (void *)end,
#endif POISON_FREE_INITMEM, what);
}
} }
void free_initmem(void) void free_initmem(void)
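Several hunks in this series (s390 dumpstack, the x86 init and pageattr changes above and below) convert compile-time #ifdef CONFIG_DEBUG_PAGEALLOC blocks into the runtime debug_pagealloc_enabled() check. A minimal sketch of the pattern, with a hypothetical caller:

#include <linux/mm.h>          /* debug_pagealloc_enabled() */
#include <linux/printk.h>

static void report_debug_pagealloc(void)
{
        /*
         * Compiled unconditionally; whether anything is printed is now
         * decided at boot by the debug_pagealloc= parameter rather than
         * at build time by CONFIG_DEBUG_PAGEALLOC.
         */
        if (debug_pagealloc_enabled())
                pr_cont("DEBUG_PAGEALLOC ");
}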


@ -106,12 +106,6 @@ static inline unsigned long highmap_end_pfn(void)
#endif #endif
#ifdef CONFIG_DEBUG_PAGEALLOC
# define debug_pagealloc 1
#else
# define debug_pagealloc 0
#endif
static inline int static inline int
within(unsigned long addr, unsigned long start, unsigned long end) within(unsigned long addr, unsigned long start, unsigned long end)
{ {
@ -714,10 +708,10 @@ static int split_large_page(struct cpa_data *cpa, pte_t *kpte,
{ {
struct page *base; struct page *base;
if (!debug_pagealloc) if (!debug_pagealloc_enabled())
spin_unlock(&cpa_lock); spin_unlock(&cpa_lock);
base = alloc_pages(GFP_KERNEL | __GFP_NOTRACK, 0); base = alloc_pages(GFP_KERNEL | __GFP_NOTRACK, 0);
if (!debug_pagealloc) if (!debug_pagealloc_enabled())
spin_lock(&cpa_lock); spin_lock(&cpa_lock);
if (!base) if (!base)
return -ENOMEM; return -ENOMEM;
@ -1339,10 +1333,10 @@ static int __change_page_attr_set_clr(struct cpa_data *cpa, int checkalias)
if (cpa->flags & (CPA_ARRAY | CPA_PAGES_ARRAY)) if (cpa->flags & (CPA_ARRAY | CPA_PAGES_ARRAY))
cpa->numpages = 1; cpa->numpages = 1;
if (!debug_pagealloc) if (!debug_pagealloc_enabled())
spin_lock(&cpa_lock); spin_lock(&cpa_lock);
ret = __change_page_attr(cpa, checkalias); ret = __change_page_attr(cpa, checkalias);
if (!debug_pagealloc) if (!debug_pagealloc_enabled())
spin_unlock(&cpa_lock); spin_unlock(&cpa_lock);
if (ret) if (ret)
return ret; return ret;


@ -217,10 +217,21 @@ static void part_release(struct device *dev)
kfree(p); kfree(p);
} }
static int part_uevent(struct device *dev, struct kobj_uevent_env *env)
{
struct hd_struct *part = dev_to_part(dev);
add_uevent_var(env, "PARTN=%u", part->partno);
if (part->info && part->info->volname[0])
add_uevent_var(env, "PARTNAME=%s", part->info->volname);
return 0;
}
struct device_type part_type = { struct device_type part_type = {
.name = "partition", .name = "partition",
.groups = part_attr_groups, .groups = part_attr_groups,
.release = part_release, .release = part_release,
.uevent = part_uevent,
}; };
static void delete_partition_rcu_cb(struct rcu_head *head) static void delete_partition_rcu_cb(struct rcu_head *head)


@ -61,8 +61,8 @@ module_param(latency_factor, uint, 0644);
static DEFINE_PER_CPU(struct cpuidle_device *, acpi_cpuidle_device); static DEFINE_PER_CPU(struct cpuidle_device *, acpi_cpuidle_device);
static DEFINE_PER_CPU(struct acpi_processor_cx * [CPUIDLE_STATE_MAX], static
acpi_cstate); DEFINE_PER_CPU(struct acpi_processor_cx * [CPUIDLE_STATE_MAX], acpi_cstate);
static int disabled_by_idle_boot_param(void) static int disabled_by_idle_boot_param(void)
{ {


@ -251,7 +251,7 @@ memory_block_action(unsigned long phys_index, unsigned long action, int online_t
return ret; return ret;
} }
static int memory_block_change_state(struct memory_block *mem, int memory_block_change_state(struct memory_block *mem,
unsigned long to_state, unsigned long from_state_req) unsigned long to_state, unsigned long from_state_req)
{ {
int ret = 0; int ret = 0;
@ -438,6 +438,37 @@ print_block_size(struct device *dev, struct device_attribute *attr,
static DEVICE_ATTR(block_size_bytes, 0444, print_block_size, NULL); static DEVICE_ATTR(block_size_bytes, 0444, print_block_size, NULL);
/*
* Memory auto online policy.
*/
static ssize_t
show_auto_online_blocks(struct device *dev, struct device_attribute *attr,
char *buf)
{
if (memhp_auto_online)
return sprintf(buf, "online\n");
else
return sprintf(buf, "offline\n");
}
static ssize_t
store_auto_online_blocks(struct device *dev, struct device_attribute *attr,
const char *buf, size_t count)
{
if (sysfs_streq(buf, "online"))
memhp_auto_online = true;
else if (sysfs_streq(buf, "offline"))
memhp_auto_online = false;
else
return -EINVAL;
return count;
}
static DEVICE_ATTR(auto_online_blocks, 0644, show_auto_online_blocks,
store_auto_online_blocks);
/* /*
* Some architectures will have custom drivers to do this, and * Some architectures will have custom drivers to do this, and
* will not need to do it from userspace. The fake hot-add code * will not need to do it from userspace. The fake hot-add code
@ -746,6 +777,7 @@ static struct attribute *memory_root_attrs[] = {
#endif #endif
&dev_attr_block_size_bytes.attr, &dev_attr_block_size_bytes.attr,
&dev_attr_auto_online_blocks.attr,
NULL NULL
}; };


@ -126,7 +126,7 @@
*/ */
#include <linux/types.h> #include <linux/types.h>
static bool verbose = 0; static int verbose = 0;
static int major = PD_MAJOR; static int major = PD_MAJOR;
static char *name = PD_NAME; static char *name = PD_NAME;
static int cluster = 64; static int cluster = 64;
@ -161,7 +161,7 @@ enum {D_PRT, D_PRO, D_UNI, D_MOD, D_GEO, D_SBY, D_DLY, D_SLV};
static DEFINE_MUTEX(pd_mutex); static DEFINE_MUTEX(pd_mutex);
static DEFINE_SPINLOCK(pd_lock); static DEFINE_SPINLOCK(pd_lock);
module_param(verbose, bool, 0); module_param(verbose, int, 0);
module_param(major, int, 0); module_param(major, int, 0);
module_param(name, charp, 0); module_param(name, charp, 0);
module_param(cluster, int, 0); module_param(cluster, int, 0);


@ -117,7 +117,7 @@
*/ */
static bool verbose = 0; static int verbose = 0;
static int major = PT_MAJOR; static int major = PT_MAJOR;
static char *name = PT_NAME; static char *name = PT_NAME;
static int disable = 0; static int disable = 0;
@ -152,7 +152,7 @@ static int (*drives[4])[6] = {&drive0, &drive1, &drive2, &drive3};
#include <asm/uaccess.h> #include <asm/uaccess.h>
module_param(verbose, bool, 0); module_param(verbose, int, 0);
module_param(major, int, 0); module_param(major, int, 0);
module_param(name, charp, 0); module_param(name, charp, 0);
module_param_array(drive0, int, NULL, 0); module_param_array(drive0, int, NULL, 0);


@ -37,24 +37,31 @@ config XEN_BALLOON_MEMORY_HOTPLUG
Memory could be hotplugged in following steps: Memory could be hotplugged in following steps:
1) dom0: xl mem-max <domU> <maxmem> 1) target domain: ensure that memory auto online policy is in
effect by checking /sys/devices/system/memory/auto_online_blocks
file (should be 'online').
2) control domain: xl mem-max <target-domain> <maxmem>
where <maxmem> is >= requested memory size, where <maxmem> is >= requested memory size,
2) dom0: xl mem-set <domU> <memory> 3) control domain: xl mem-set <target-domain> <memory>
where <memory> is requested memory size; alternatively memory where <memory> is requested memory size; alternatively memory
could be added by writing proper value to could be added by writing proper value to
/sys/devices/system/xen_memory/xen_memory0/target or /sys/devices/system/xen_memory/xen_memory0/target or
/sys/devices/system/xen_memory/xen_memory0/target_kb on dumU, /sys/devices/system/xen_memory/xen_memory0/target_kb on the
target domain.
3) domU: for i in /sys/devices/system/memory/memory*/state; do \ Alternatively, if memory auto onlining was not requested at step 1
[ "`cat "$i"`" = offline ] && echo online > "$i"; done the newly added memory can be manually onlined in the target domain
by doing the following:
Memory could be onlined automatically on domU by adding following line to udev rules: for i in /sys/devices/system/memory/memory*/state; do \
[ "`cat "$i"`" = offline ] && echo online > "$i"; done
or by adding the following line to udev rules:
SUBSYSTEM=="memory", ACTION=="add", RUN+="/bin/sh -c '[ -f /sys$devpath/state ] && echo online > /sys$devpath/state'" SUBSYSTEM=="memory", ACTION=="add", RUN+="/bin/sh -c '[ -f /sys$devpath/state ] && echo online > /sys$devpath/state'"
In that case step 3 should be omitted.
config XEN_BALLOON_MEMORY_HOTPLUG_LIMIT config XEN_BALLOON_MEMORY_HOTPLUG_LIMIT
int "Hotplugged memory limit (in GiB) for a PV guest" int "Hotplugged memory limit (in GiB) for a PV guest"
default 512 if X86_64 default 512 if X86_64


@ -338,7 +338,16 @@ static enum bp_state reserve_additional_memory(void)
} }
#endif #endif
rc = add_memory_resource(nid, resource); /*
* add_memory_resource() will call online_pages() which in its turn
* will call xen_online_page() callback causing deadlock if we don't
* release balloon_mutex here. Unlocking here is safe because the
* callers drop the mutex before trying again.
*/
mutex_unlock(&balloon_mutex);
rc = add_memory_resource(nid, resource, memhp_auto_online);
mutex_lock(&balloon_mutex);
if (rc) { if (rc) {
pr_warn("Cannot add additional memory (%i)\n", rc); pr_warn("Cannot add additional memory (%i)\n", rc);
goto err; goto err;


@ -38,8 +38,9 @@
/* Find the first set bit in a evtchn mask */ /* Find the first set bit in a evtchn mask */
#define EVTCHN_FIRST_BIT(w) find_first_bit(BM(&(w)), BITS_PER_EVTCHN_WORD) #define EVTCHN_FIRST_BIT(w) find_first_bit(BM(&(w)), BITS_PER_EVTCHN_WORD)
static DEFINE_PER_CPU(xen_ulong_t [EVTCHN_2L_NR_CHANNELS/BITS_PER_EVTCHN_WORD], #define EVTCHN_MASK_SIZE (EVTCHN_2L_NR_CHANNELS/BITS_PER_EVTCHN_WORD)
cpu_evtchn_mask);
static DEFINE_PER_CPU(xen_ulong_t [EVTCHN_MASK_SIZE], cpu_evtchn_mask);
static unsigned evtchn_2l_max_channels(void) static unsigned evtchn_2l_max_channels(void)
{ {


@ -1,15 +1,11 @@
/* -*- c -*- ------------------------------------------------------------- * /*
* * Copyright 1997-1998 Transmeta Corporation - All Rights Reserved
* linux/fs/autofs/autofs_i.h * Copyright 2005-2006 Ian Kent <raven@themaw.net>
*
* Copyright 1997-1998 Transmeta Corporation - All Rights Reserved
* Copyright 2005-2006 Ian Kent <raven@themaw.net>
* *
* This file is part of the Linux kernel and is made available under * This file is part of the Linux kernel and is made available under
* the terms of the GNU General Public License, version 2, or at your * the terms of the GNU General Public License, version 2, or at your
* option, any later version, incorporated herein by reference. * option, any later version, incorporated herein by reference.
* */
* ----------------------------------------------------------------------- */
/* Internal header file for autofs */ /* Internal header file for autofs */
@ -35,28 +31,23 @@
#include <linux/mount.h> #include <linux/mount.h>
#include <linux/namei.h> #include <linux/namei.h>
#include <asm/current.h> #include <asm/current.h>
#include <asm/uaccess.h> #include <linux/uaccess.h>
/* #define DEBUG */ /* #define DEBUG */
#define DPRINTK(fmt, ...) \ #ifdef pr_fmt
pr_debug("pid %d: %s: " fmt "\n", \ #undef pr_fmt
current->pid, __func__, ##__VA_ARGS__) #endif
#define pr_fmt(fmt) KBUILD_MODNAME ":pid:%d:%s: " fmt, current->pid, __func__
#define AUTOFS_WARN(fmt, ...) \ /*
printk(KERN_WARNING "pid %d: %s: " fmt "\n", \ * Unified info structure. This is pointed to by both the dentry and
current->pid, __func__, ##__VA_ARGS__) * inode structures. Each file in the filesystem has an instance of this
* structure. It holds a reference to the dentry, so dentries are never
#define AUTOFS_ERROR(fmt, ...) \ * flushed while the file exists. All name lookups are dealt with at the
printk(KERN_ERR "pid %d: %s: " fmt "\n", \ * dentry level, although the filesystem can interfere in the validation
current->pid, __func__, ##__VA_ARGS__) * process. Readdir is implemented by traversing the dentry lists.
*/
/* Unified info structure. This is pointed to by both the dentry and
inode structures. Each file in the filesystem has an instance of this
structure. It holds a reference to the dentry, so dentries are never
flushed while the file exists. All name lookups are dealt with at the
dentry level, although the filesystem can interfere in the validation
process. Readdir is implemented by traversing the dentry lists. */
struct autofs_info { struct autofs_info {
struct dentry *dentry; struct dentry *dentry;
struct inode *inode; struct inode *inode;
@ -78,7 +69,7 @@ struct autofs_info {
kgid_t gid; kgid_t gid;
}; };
#define AUTOFS_INF_EXPIRING (1<<0) /* dentry is in the process of expiring */ #define AUTOFS_INF_EXPIRING (1<<0) /* dentry in the process of expiring */
#define AUTOFS_INF_NO_RCU (1<<1) /* the dentry is being considered #define AUTOFS_INF_NO_RCU (1<<1) /* the dentry is being considered
* for expiry, so RCU_walk is * for expiry, so RCU_walk is
* not permitted * not permitted
@ -140,10 +131,11 @@ static inline struct autofs_info *autofs4_dentry_ino(struct dentry *dentry)
} }
/* autofs4_oz_mode(): do we see the man behind the curtain? (The /* autofs4_oz_mode(): do we see the man behind the curtain? (The
processes which do manipulations for us in user space sees the raw * processes which do manipulations for us in user space sees the raw
filesystem without "magic".) */ * filesystem without "magic".)
*/
static inline int autofs4_oz_mode(struct autofs_sb_info *sbi) { static inline int autofs4_oz_mode(struct autofs_sb_info *sbi)
{
return sbi->catatonic || task_pgrp(current) == sbi->oz_pgrp; return sbi->catatonic || task_pgrp(current) == sbi->oz_pgrp;
} }
@ -154,12 +146,12 @@ void autofs4_free_ino(struct autofs_info *);
int is_autofs4_dentry(struct dentry *); int is_autofs4_dentry(struct dentry *);
int autofs4_expire_wait(struct dentry *dentry, int rcu_walk); int autofs4_expire_wait(struct dentry *dentry, int rcu_walk);
int autofs4_expire_run(struct super_block *, struct vfsmount *, int autofs4_expire_run(struct super_block *, struct vfsmount *,
struct autofs_sb_info *, struct autofs_sb_info *,
struct autofs_packet_expire __user *); struct autofs_packet_expire __user *);
int autofs4_do_expire_multi(struct super_block *sb, struct vfsmount *mnt, int autofs4_do_expire_multi(struct super_block *sb, struct vfsmount *mnt,
struct autofs_sb_info *sbi, int when); struct autofs_sb_info *sbi, int when);
int autofs4_expire_multi(struct super_block *, struct vfsmount *, int autofs4_expire_multi(struct super_block *, struct vfsmount *,
struct autofs_sb_info *, int __user *); struct autofs_sb_info *, int __user *);
struct dentry *autofs4_expire_direct(struct super_block *sb, struct dentry *autofs4_expire_direct(struct super_block *sb,
struct vfsmount *mnt, struct vfsmount *mnt,
struct autofs_sb_info *sbi, int how); struct autofs_sb_info *sbi, int how);
@ -224,8 +216,8 @@ static inline int autofs_prepare_pipe(struct file *pipe)
/* Queue management functions */ /* Queue management functions */
int autofs4_wait(struct autofs_sb_info *,struct dentry *, enum autofs_notify); int autofs4_wait(struct autofs_sb_info *, struct dentry *, enum autofs_notify);
int autofs4_wait_release(struct autofs_sb_info *,autofs_wqt_t,int); int autofs4_wait_release(struct autofs_sb_info *, autofs_wqt_t, int);
void autofs4_catatonic_mode(struct autofs_sb_info *); void autofs4_catatonic_mode(struct autofs_sb_info *);
static inline u32 autofs4_get_dev(struct autofs_sb_info *sbi) static inline u32 autofs4_get_dev(struct autofs_sb_info *sbi)
@ -242,37 +234,37 @@ static inline void __autofs4_add_expiring(struct dentry *dentry)
{ {
struct autofs_sb_info *sbi = autofs4_sbi(dentry->d_sb); struct autofs_sb_info *sbi = autofs4_sbi(dentry->d_sb);
struct autofs_info *ino = autofs4_dentry_ino(dentry); struct autofs_info *ino = autofs4_dentry_ino(dentry);
if (ino) { if (ino) {
if (list_empty(&ino->expiring)) if (list_empty(&ino->expiring))
list_add(&ino->expiring, &sbi->expiring_list); list_add(&ino->expiring, &sbi->expiring_list);
} }
return;
} }
static inline void autofs4_add_expiring(struct dentry *dentry) static inline void autofs4_add_expiring(struct dentry *dentry)
{ {
struct autofs_sb_info *sbi = autofs4_sbi(dentry->d_sb); struct autofs_sb_info *sbi = autofs4_sbi(dentry->d_sb);
struct autofs_info *ino = autofs4_dentry_ino(dentry); struct autofs_info *ino = autofs4_dentry_ino(dentry);
if (ino) { if (ino) {
spin_lock(&sbi->lookup_lock); spin_lock(&sbi->lookup_lock);
if (list_empty(&ino->expiring)) if (list_empty(&ino->expiring))
list_add(&ino->expiring, &sbi->expiring_list); list_add(&ino->expiring, &sbi->expiring_list);
spin_unlock(&sbi->lookup_lock); spin_unlock(&sbi->lookup_lock);
} }
return;
} }
static inline void autofs4_del_expiring(struct dentry *dentry) static inline void autofs4_del_expiring(struct dentry *dentry)
{ {
struct autofs_sb_info *sbi = autofs4_sbi(dentry->d_sb); struct autofs_sb_info *sbi = autofs4_sbi(dentry->d_sb);
struct autofs_info *ino = autofs4_dentry_ino(dentry); struct autofs_info *ino = autofs4_dentry_ino(dentry);
if (ino) { if (ino) {
spin_lock(&sbi->lookup_lock); spin_lock(&sbi->lookup_lock);
if (!list_empty(&ino->expiring)) if (!list_empty(&ino->expiring))
list_del_init(&ino->expiring); list_del_init(&ino->expiring);
spin_unlock(&sbi->lookup_lock); spin_unlock(&sbi->lookup_lock);
} }
return;
} }
extern void autofs4_kill_sb(struct super_block *); extern void autofs4_kill_sb(struct super_block *);
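The autofs logging rework above hinges on overriding pr_fmt() so that every pr_*() call in the file is automatically prefixed with the module name, pid and function. A hedged, self-contained sketch of that mechanism (module name and message are invented):

/* Define pr_fmt before any printk helpers are pulled in. */
#define pr_fmt(fmt) KBUILD_MODNAME ": %s: " fmt, __func__

#include <linux/module.h>
#include <linux/printk.h>

static int __init demo_init(void)
{
        /* Emits e.g. "demo: demo_init: loaded" if the object is demo.o */
        pr_info("loaded\n");
        return 0;
}
module_init(demo_init);

MODULE_LICENSE("GPL");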


@ -72,13 +72,13 @@ static int check_dev_ioctl_version(int cmd, struct autofs_dev_ioctl *param)
{ {
int err = 0; int err = 0;
if ((AUTOFS_DEV_IOCTL_VERSION_MAJOR != param->ver_major) || if ((param->ver_major != AUTOFS_DEV_IOCTL_VERSION_MAJOR) ||
(AUTOFS_DEV_IOCTL_VERSION_MINOR < param->ver_minor)) { (param->ver_minor > AUTOFS_DEV_IOCTL_VERSION_MINOR)) {
AUTOFS_WARN("ioctl control interface version mismatch: " pr_warn("ioctl control interface version mismatch: "
"kernel(%u.%u), user(%u.%u), cmd(%d)", "kernel(%u.%u), user(%u.%u), cmd(%d)\n",
AUTOFS_DEV_IOCTL_VERSION_MAJOR, AUTOFS_DEV_IOCTL_VERSION_MAJOR,
AUTOFS_DEV_IOCTL_VERSION_MINOR, AUTOFS_DEV_IOCTL_VERSION_MINOR,
param->ver_major, param->ver_minor, cmd); param->ver_major, param->ver_minor, cmd);
err = -EINVAL; err = -EINVAL;
} }
@ -93,7 +93,8 @@ static int check_dev_ioctl_version(int cmd, struct autofs_dev_ioctl *param)
* Copy parameter control struct, including a possible path allocated * Copy parameter control struct, including a possible path allocated
* at the end of the struct. * at the end of the struct.
*/ */
static struct autofs_dev_ioctl *copy_dev_ioctl(struct autofs_dev_ioctl __user *in) static struct autofs_dev_ioctl *
copy_dev_ioctl(struct autofs_dev_ioctl __user *in)
{ {
struct autofs_dev_ioctl tmp, *res; struct autofs_dev_ioctl tmp, *res;
@ -116,7 +117,6 @@ static struct autofs_dev_ioctl *copy_dev_ioctl(struct autofs_dev_ioctl __user *i
static inline void free_dev_ioctl(struct autofs_dev_ioctl *param) static inline void free_dev_ioctl(struct autofs_dev_ioctl *param)
{ {
kfree(param); kfree(param);
return;
} }
/* /*
@ -129,24 +129,24 @@ static int validate_dev_ioctl(int cmd, struct autofs_dev_ioctl *param)
err = check_dev_ioctl_version(cmd, param); err = check_dev_ioctl_version(cmd, param);
if (err) { if (err) {
AUTOFS_WARN("invalid device control module version " pr_warn("invalid device control module version "
"supplied for cmd(0x%08x)", cmd); "supplied for cmd(0x%08x)\n", cmd);
goto out; goto out;
} }
if (param->size > sizeof(*param)) { if (param->size > sizeof(*param)) {
err = invalid_str(param->path, param->size - sizeof(*param)); err = invalid_str(param->path, param->size - sizeof(*param));
if (err) { if (err) {
AUTOFS_WARN( pr_warn(
"path string terminator missing for cmd(0x%08x)", "path string terminator missing for cmd(0x%08x)\n",
cmd); cmd);
goto out; goto out;
} }
err = check_name(param->path); err = check_name(param->path);
if (err) { if (err) {
AUTOFS_WARN("invalid path supplied for cmd(0x%08x)", pr_warn("invalid path supplied for cmd(0x%08x)\n",
cmd); cmd);
goto out; goto out;
} }
} }
@ -197,7 +197,9 @@ static int find_autofs_mount(const char *pathname,
void *data) void *data)
{ {
struct path path; struct path path;
int err = kern_path_mountpoint(AT_FDCWD, pathname, &path, 0); int err;
err = kern_path_mountpoint(AT_FDCWD, pathname, &path, 0);
if (err) if (err)
return err; return err;
err = -ENOENT; err = -ENOENT;
@ -225,6 +227,7 @@ static int test_by_dev(struct path *path, void *p)
static int test_by_type(struct path *path, void *p) static int test_by_type(struct path *path, void *p)
{ {
struct autofs_info *ino = autofs4_dentry_ino(path->dentry); struct autofs_info *ino = autofs4_dentry_ino(path->dentry);
return ino && ino->sbi->type & *(unsigned *)p; return ino && ino->sbi->type & *(unsigned *)p;
} }
@ -370,7 +373,7 @@ static int autofs_dev_ioctl_setpipefd(struct file *fp,
new_pid = get_task_pid(current, PIDTYPE_PGID); new_pid = get_task_pid(current, PIDTYPE_PGID);
if (ns_of_pid(new_pid) != ns_of_pid(sbi->oz_pgrp)) { if (ns_of_pid(new_pid) != ns_of_pid(sbi->oz_pgrp)) {
AUTOFS_WARN("Not allowed to change PID namespace"); pr_warn("not allowed to change PID namespace\n");
err = -EINVAL; err = -EINVAL;
goto out; goto out;
} }
@ -456,8 +459,10 @@ static int autofs_dev_ioctl_requester(struct file *fp,
err = 0; err = 0;
autofs4_expire_wait(path.dentry, 0); autofs4_expire_wait(path.dentry, 0);
spin_lock(&sbi->fs_lock); spin_lock(&sbi->fs_lock);
param->requester.uid = from_kuid_munged(current_user_ns(), ino->uid); param->requester.uid =
param->requester.gid = from_kgid_munged(current_user_ns(), ino->gid); from_kuid_munged(current_user_ns(), ino->uid);
param->requester.gid =
from_kgid_munged(current_user_ns(), ino->gid);
spin_unlock(&sbi->fs_lock); spin_unlock(&sbi->fs_lock);
} }
path_put(&path); path_put(&path);
@ -619,7 +624,8 @@ static ioctl_fn lookup_dev_ioctl(unsigned int cmd)
} }
/* ioctl dispatcher */ /* ioctl dispatcher */
static int _autofs_dev_ioctl(unsigned int command, struct autofs_dev_ioctl __user *user) static int _autofs_dev_ioctl(unsigned int command,
struct autofs_dev_ioctl __user *user)
{ {
struct autofs_dev_ioctl *param; struct autofs_dev_ioctl *param;
struct file *fp; struct file *fp;
@ -655,7 +661,7 @@ static int _autofs_dev_ioctl(unsigned int command, struct autofs_dev_ioctl __use
fn = lookup_dev_ioctl(cmd); fn = lookup_dev_ioctl(cmd);
if (!fn) { if (!fn) {
AUTOFS_WARN("unknown command 0x%08x", command); pr_warn("unknown command 0x%08x\n", command);
return -ENOTTY; return -ENOTTY;
} }
@ -711,6 +717,7 @@ out:
static long autofs_dev_ioctl(struct file *file, uint command, ulong u) static long autofs_dev_ioctl(struct file *file, uint command, ulong u)
{ {
int err; int err;
err = _autofs_dev_ioctl(command, (struct autofs_dev_ioctl __user *) u); err = _autofs_dev_ioctl(command, (struct autofs_dev_ioctl __user *) u);
return (long) err; return (long) err;
} }
@ -733,8 +740,8 @@ static const struct file_operations _dev_ioctl_fops = {
static struct miscdevice _autofs_dev_ioctl_misc = { static struct miscdevice _autofs_dev_ioctl_misc = {
.minor = AUTOFS_MINOR, .minor = AUTOFS_MINOR,
.name = AUTOFS_DEVICE_NAME, .name = AUTOFS_DEVICE_NAME,
.fops = &_dev_ioctl_fops .fops = &_dev_ioctl_fops
}; };
MODULE_ALIAS_MISCDEV(AUTOFS_MINOR); MODULE_ALIAS_MISCDEV(AUTOFS_MINOR);
@ -747,7 +754,7 @@ int __init autofs_dev_ioctl_init(void)
r = misc_register(&_autofs_dev_ioctl_misc); r = misc_register(&_autofs_dev_ioctl_misc);
if (r) { if (r) {
AUTOFS_ERROR("misc_register failed for control device"); pr_err("misc_register failed for control device\n");
return r; return r;
} }
@ -757,6 +764,4 @@ int __init autofs_dev_ioctl_init(void)
void autofs_dev_ioctl_exit(void) void autofs_dev_ioctl_exit(void)
{ {
misc_deregister(&_autofs_dev_ioctl_misc); misc_deregister(&_autofs_dev_ioctl_misc);
return;
} }


@ -1,16 +1,12 @@
/* -*- c -*- --------------------------------------------------------------- * /*
* * Copyright 1997-1998 Transmeta Corporation -- All Rights Reserved
* linux/fs/autofs/expire.c * Copyright 1999-2000 Jeremy Fitzhardinge <jeremy@goop.org>
* * Copyright 2001-2006 Ian Kent <raven@themaw.net>
* Copyright 1997-1998 Transmeta Corporation -- All Rights Reserved
* Copyright 1999-2000 Jeremy Fitzhardinge <jeremy@goop.org>
* Copyright 2001-2006 Ian Kent <raven@themaw.net>
* *
* This file is part of the Linux kernel and is made available under * This file is part of the Linux kernel and is made available under
* the terms of the GNU General Public License, version 2, or at your * the terms of the GNU General Public License, version 2, or at your
* option, any later version, incorporated herein by reference. * option, any later version, incorporated herein by reference.
* */
* ------------------------------------------------------------------------- */
#include "autofs_i.h" #include "autofs_i.h"
@ -18,7 +14,7 @@ static unsigned long now;
/* Check if a dentry can be expired */ /* Check if a dentry can be expired */
static inline int autofs4_can_expire(struct dentry *dentry, static inline int autofs4_can_expire(struct dentry *dentry,
unsigned long timeout, int do_now) unsigned long timeout, int do_now)
{ {
struct autofs_info *ino = autofs4_dentry_ino(dentry); struct autofs_info *ino = autofs4_dentry_ino(dentry);
@ -41,7 +37,7 @@ static int autofs4_mount_busy(struct vfsmount *mnt, struct dentry *dentry)
struct path path = {.mnt = mnt, .dentry = dentry}; struct path path = {.mnt = mnt, .dentry = dentry};
int status = 1; int status = 1;
DPRINTK("dentry %p %pd", dentry, dentry); pr_debug("dentry %p %pd\n", dentry, dentry);
path_get(&path); path_get(&path);
@ -58,14 +54,16 @@ static int autofs4_mount_busy(struct vfsmount *mnt, struct dentry *dentry)
/* Update the expiry counter if fs is busy */ /* Update the expiry counter if fs is busy */
if (!may_umount_tree(path.mnt)) { if (!may_umount_tree(path.mnt)) {
struct autofs_info *ino = autofs4_dentry_ino(top); struct autofs_info *ino;
ino = autofs4_dentry_ino(top);
ino->last_used = jiffies; ino->last_used = jiffies;
goto done; goto done;
} }
status = 0; status = 0;
done: done:
DPRINTK("returning = %d", status); pr_debug("returning = %d\n", status);
path_put(&path); path_put(&path);
return status; return status;
} }
@ -74,7 +72,7 @@ done:
* Calculate and dget next entry in the subdirs list under root. * Calculate and dget next entry in the subdirs list under root.
*/ */
static struct dentry *get_next_positive_subdir(struct dentry *prev, static struct dentry *get_next_positive_subdir(struct dentry *prev,
struct dentry *root) struct dentry *root)
{ {
struct autofs_sb_info *sbi = autofs4_sbi(root->d_sb); struct autofs_sb_info *sbi = autofs4_sbi(root->d_sb);
struct list_head *next; struct list_head *next;
@ -121,7 +119,7 @@ cont:
* Calculate and dget next entry in top down tree traversal. * Calculate and dget next entry in top down tree traversal.
*/ */
static struct dentry *get_next_positive_dentry(struct dentry *prev, static struct dentry *get_next_positive_dentry(struct dentry *prev,
struct dentry *root) struct dentry *root)
{ {
struct autofs_sb_info *sbi = autofs4_sbi(root->d_sb); struct autofs_sb_info *sbi = autofs4_sbi(root->d_sb);
struct list_head *next; struct list_head *next;
@ -187,15 +185,17 @@ again:
* autofs submounts. * autofs submounts.
*/ */
static int autofs4_direct_busy(struct vfsmount *mnt, static int autofs4_direct_busy(struct vfsmount *mnt,
struct dentry *top, struct dentry *top,
unsigned long timeout, unsigned long timeout,
int do_now) int do_now)
{ {
DPRINTK("top %p %pd", top, top); pr_debug("top %p %pd\n", top, top);
/* If it's busy update the expiry counters */ /* If it's busy update the expiry counters */
if (!may_umount_tree(mnt)) { if (!may_umount_tree(mnt)) {
struct autofs_info *ino = autofs4_dentry_ino(top); struct autofs_info *ino;
ino = autofs4_dentry_ino(top);
if (ino) if (ino)
ino->last_used = jiffies; ino->last_used = jiffies;
return 1; return 1;
@ -208,7 +208,8 @@ static int autofs4_direct_busy(struct vfsmount *mnt,
return 0; return 0;
} }
/* Check a directory tree of mount points for busyness /*
* Check a directory tree of mount points for busyness
* The tree is not busy iff no mountpoints are busy * The tree is not busy iff no mountpoints are busy
*/ */
static int autofs4_tree_busy(struct vfsmount *mnt, static int autofs4_tree_busy(struct vfsmount *mnt,
@ -219,7 +220,7 @@ static int autofs4_tree_busy(struct vfsmount *mnt,
struct autofs_info *top_ino = autofs4_dentry_ino(top); struct autofs_info *top_ino = autofs4_dentry_ino(top);
struct dentry *p; struct dentry *p;
DPRINTK("top %p %pd", top, top); pr_debug("top %p %pd\n", top, top);
/* Negative dentry - give up */ /* Negative dentry - give up */
if (!simple_positive(top)) if (!simple_positive(top))
@ -227,7 +228,7 @@ static int autofs4_tree_busy(struct vfsmount *mnt,
p = NULL; p = NULL;
while ((p = get_next_positive_dentry(p, top))) { while ((p = get_next_positive_dentry(p, top))) {
DPRINTK("dentry %p %pd", p, p); pr_debug("dentry %p %pd\n", p, p);
/* /*
* Is someone visiting anywhere in the subtree ? * Is someone visiting anywhere in the subtree ?
@ -273,11 +274,11 @@ static struct dentry *autofs4_check_leaves(struct vfsmount *mnt,
{ {
struct dentry *p; struct dentry *p;
DPRINTK("parent %p %pd", parent, parent); pr_debug("parent %p %pd\n", parent, parent);
p = NULL; p = NULL;
while ((p = get_next_positive_dentry(p, parent))) { while ((p = get_next_positive_dentry(p, parent))) {
DPRINTK("dentry %p %pd", p, p); pr_debug("dentry %p %pd\n", p, p);
if (d_mountpoint(p)) { if (d_mountpoint(p)) {
/* Can we umount this guy */ /* Can we umount this guy */
@ -362,7 +363,7 @@ static struct dentry *should_expire(struct dentry *dentry,
* offset (autofs-5.0+). * offset (autofs-5.0+).
*/ */
if (d_mountpoint(dentry)) { if (d_mountpoint(dentry)) {
DPRINTK("checking mountpoint %p %pd", dentry, dentry); pr_debug("checking mountpoint %p %pd\n", dentry, dentry);
/* Can we umount this guy */ /* Can we umount this guy */
if (autofs4_mount_busy(mnt, dentry)) if (autofs4_mount_busy(mnt, dentry))
@ -375,7 +376,7 @@ static struct dentry *should_expire(struct dentry *dentry,
} }
if (d_really_is_positive(dentry) && d_is_symlink(dentry)) { if (d_really_is_positive(dentry) && d_is_symlink(dentry)) {
DPRINTK("checking symlink %p %pd", dentry, dentry); pr_debug("checking symlink %p %pd\n", dentry, dentry);
/* /*
* A symlink can't be "busy" in the usual sense so * A symlink can't be "busy" in the usual sense so
* just check last used for expire timeout. * just check last used for expire timeout.
@ -404,6 +405,7 @@ static struct dentry *should_expire(struct dentry *dentry,
} else { } else {
/* Path walk currently on this dentry? */ /* Path walk currently on this dentry? */
struct dentry *expired; struct dentry *expired;
ino_count = atomic_read(&ino->count) + 1; ino_count = atomic_read(&ino->count) + 1;
if (d_count(dentry) > ino_count) if (d_count(dentry) > ino_count)
return NULL; return NULL;
@ -471,7 +473,7 @@ struct dentry *autofs4_expire_indirect(struct super_block *sb,
return NULL; return NULL;
found: found:
DPRINTK("returning %p %pd", expired, expired); pr_debug("returning %p %pd\n", expired, expired);
ino->flags |= AUTOFS_INF_EXPIRING; ino->flags |= AUTOFS_INF_EXPIRING;
smp_mb(); smp_mb();
ino->flags &= ~AUTOFS_INF_NO_RCU; ino->flags &= ~AUTOFS_INF_NO_RCU;
@ -503,12 +505,12 @@ int autofs4_expire_wait(struct dentry *dentry, int rcu_walk)
if (ino->flags & AUTOFS_INF_EXPIRING) { if (ino->flags & AUTOFS_INF_EXPIRING) {
spin_unlock(&sbi->fs_lock); spin_unlock(&sbi->fs_lock);
DPRINTK("waiting for expire %p name=%pd", dentry, dentry); pr_debug("waiting for expire %p name=%pd\n", dentry, dentry);
status = autofs4_wait(sbi, dentry, NFY_NONE); status = autofs4_wait(sbi, dentry, NFY_NONE);
wait_for_completion(&ino->expire_complete); wait_for_completion(&ino->expire_complete);
DPRINTK("expire done status=%d", status); pr_debug("expire done status=%d\n", status);
if (d_unhashed(dentry)) if (d_unhashed(dentry))
return -EAGAIN; return -EAGAIN;
@ -522,21 +524,22 @@ int autofs4_expire_wait(struct dentry *dentry, int rcu_walk)
/* Perform an expiry operation */ /* Perform an expiry operation */
int autofs4_expire_run(struct super_block *sb, int autofs4_expire_run(struct super_block *sb,
struct vfsmount *mnt, struct vfsmount *mnt,
struct autofs_sb_info *sbi, struct autofs_sb_info *sbi,
struct autofs_packet_expire __user *pkt_p) struct autofs_packet_expire __user *pkt_p)
{ {
struct autofs_packet_expire pkt; struct autofs_packet_expire pkt;
struct autofs_info *ino; struct autofs_info *ino;
struct dentry *dentry; struct dentry *dentry;
int ret = 0; int ret = 0;
memset(&pkt,0,sizeof pkt); memset(&pkt, 0, sizeof(pkt));
pkt.hdr.proto_version = sbi->version; pkt.hdr.proto_version = sbi->version;
pkt.hdr.type = autofs_ptype_expire; pkt.hdr.type = autofs_ptype_expire;
if ((dentry = autofs4_expire_indirect(sb, mnt, sbi, 0)) == NULL) dentry = autofs4_expire_indirect(sb, mnt, sbi, 0);
if (!dentry)
return -EAGAIN; return -EAGAIN;
pkt.len = dentry->d_name.len; pkt.len = dentry->d_name.len;
@ -544,7 +547,7 @@ int autofs4_expire_run(struct super_block *sb,
pkt.name[pkt.len] = '\0'; pkt.name[pkt.len] = '\0';
dput(dentry); dput(dentry);
if ( copy_to_user(pkt_p, &pkt, sizeof(struct autofs_packet_expire)) ) if (copy_to_user(pkt_p, &pkt, sizeof(struct autofs_packet_expire)))
ret = -EFAULT; ret = -EFAULT;
spin_lock(&sbi->fs_lock); spin_lock(&sbi->fs_lock);
@ -573,7 +576,8 @@ int autofs4_do_expire_multi(struct super_block *sb, struct vfsmount *mnt,
struct autofs_info *ino = autofs4_dentry_ino(dentry); struct autofs_info *ino = autofs4_dentry_ino(dentry);
/* This is synchronous because it makes the daemon a /* This is synchronous because it makes the daemon a
little easier */ * little easier
*/
ret = autofs4_wait(sbi, dentry, NFY_EXPIRE); ret = autofs4_wait(sbi, dentry, NFY_EXPIRE);
spin_lock(&sbi->fs_lock); spin_lock(&sbi->fs_lock);
@ -588,8 +592,10 @@ int autofs4_do_expire_multi(struct super_block *sb, struct vfsmount *mnt,
return ret; return ret;
} }
/* Call repeatedly until it returns -EAGAIN, meaning there's nothing /*
more to be done */ * Call repeatedly until it returns -EAGAIN, meaning there's nothing
* more to be done.
*/
int autofs4_expire_multi(struct super_block *sb, struct vfsmount *mnt, int autofs4_expire_multi(struct super_block *sb, struct vfsmount *mnt,
struct autofs_sb_info *sbi, int __user *arg) struct autofs_sb_info *sbi, int __user *arg)
{ {


@ -1,14 +1,10 @@
/* -*- c -*- --------------------------------------------------------------- * /*
* * Copyright 1997-1998 Transmeta Corporation -- All Rights Reserved
* linux/fs/autofs/init.c
*
* Copyright 1997-1998 Transmeta Corporation -- All Rights Reserved
* *
* This file is part of the Linux kernel and is made available under * This file is part of the Linux kernel and is made available under
* the terms of the GNU General Public License, version 2, or at your * the terms of the GNU General Public License, version 2, or at your
* option, any later version, incorporated herein by reference. * option, any later version, incorporated herein by reference.
* */
* ------------------------------------------------------------------------- */
#include <linux/module.h> #include <linux/module.h>
#include <linux/init.h> #include <linux/init.h>


@ -1,15 +1,11 @@
/* -*- c -*- --------------------------------------------------------------- * /*
* * Copyright 1997-1998 Transmeta Corporation -- All Rights Reserved
* linux/fs/autofs/inode.c * Copyright 2005-2006 Ian Kent <raven@themaw.net>
*
* Copyright 1997-1998 Transmeta Corporation -- All Rights Reserved
* Copyright 2005-2006 Ian Kent <raven@themaw.net>
* *
* This file is part of the Linux kernel and is made available under * This file is part of the Linux kernel and is made available under
* the terms of the GNU General Public License, version 2, or at your * the terms of the GNU General Public License, version 2, or at your
* option, any later version, incorporated herein by reference. * option, any later version, incorporated herein by reference.
* */
* ------------------------------------------------------------------------- */
#include <linux/kernel.h> #include <linux/kernel.h>
#include <linux/slab.h> #include <linux/slab.h>
@ -24,7 +20,9 @@
struct autofs_info *autofs4_new_ino(struct autofs_sb_info *sbi) struct autofs_info *autofs4_new_ino(struct autofs_sb_info *sbi)
{ {
struct autofs_info *ino = kzalloc(sizeof(*ino), GFP_KERNEL); struct autofs_info *ino;
ino = kzalloc(sizeof(*ino), GFP_KERNEL);
if (ino) { if (ino) {
INIT_LIST_HEAD(&ino->active); INIT_LIST_HEAD(&ino->active);
INIT_LIST_HEAD(&ino->expiring); INIT_LIST_HEAD(&ino->expiring);
@ -62,7 +60,7 @@ void autofs4_kill_sb(struct super_block *sb)
put_pid(sbi->oz_pgrp); put_pid(sbi->oz_pgrp);
} }
DPRINTK("shutting down"); pr_debug("shutting down\n");
kill_litter_super(sb); kill_litter_super(sb);
if (sbi) if (sbi)
kfree_rcu(sbi, rcu); kfree_rcu(sbi, rcu);
@ -94,7 +92,12 @@ static int autofs4_show_options(struct seq_file *m, struct dentry *root)
seq_printf(m, ",direct"); seq_printf(m, ",direct");
else else
seq_printf(m, ",indirect"); seq_printf(m, ",indirect");
#ifdef CONFIG_CHECKPOINT_RESTORE
if (sbi->pipe)
seq_printf(m, ",pipe_ino=%ld", sbi->pipe->f_inode->i_ino);
else
seq_printf(m, ",pipe_ino=-1");
#endif
return 0; return 0;
} }
@ -147,6 +150,7 @@ static int parse_options(char *options, int *pipefd, kuid_t *uid, kgid_t *gid,
while ((p = strsep(&options, ",")) != NULL) { while ((p = strsep(&options, ",")) != NULL) {
int token; int token;
if (!*p) if (!*p)
continue; continue;
@ -204,9 +208,9 @@ static int parse_options(char *options, int *pipefd, kuid_t *uid, kgid_t *gid,
int autofs4_fill_super(struct super_block *s, void *data, int silent) int autofs4_fill_super(struct super_block *s, void *data, int silent)
{ {
struct inode * root_inode; struct inode *root_inode;
struct dentry * root; struct dentry *root;
struct file * pipe; struct file *pipe;
int pipefd; int pipefd;
struct autofs_sb_info *sbi; struct autofs_sb_info *sbi;
struct autofs_info *ino; struct autofs_info *ino;
@ -217,7 +221,7 @@ int autofs4_fill_super(struct super_block *s, void *data, int silent)
sbi = kzalloc(sizeof(*sbi), GFP_KERNEL); sbi = kzalloc(sizeof(*sbi), GFP_KERNEL);
if (!sbi) if (!sbi)
return -ENOMEM; return -ENOMEM;
DPRINTK("starting up, sbi = %p",sbi); pr_debug("starting up, sbi = %p\n", sbi);
s->s_fs_info = sbi; s->s_fs_info = sbi;
sbi->magic = AUTOFS_SBI_MAGIC; sbi->magic = AUTOFS_SBI_MAGIC;
@ -266,14 +270,14 @@ int autofs4_fill_super(struct super_block *s, void *data, int silent)
if (parse_options(data, &pipefd, &root_inode->i_uid, &root_inode->i_gid, if (parse_options(data, &pipefd, &root_inode->i_uid, &root_inode->i_gid,
&pgrp, &pgrp_set, &sbi->type, &sbi->min_proto, &pgrp, &pgrp_set, &sbi->type, &sbi->min_proto,
&sbi->max_proto)) { &sbi->max_proto)) {
printk("autofs: called with bogus options\n"); pr_err("called with bogus options\n");
goto fail_dput; goto fail_dput;
} }
if (pgrp_set) { if (pgrp_set) {
sbi->oz_pgrp = find_get_pid(pgrp); sbi->oz_pgrp = find_get_pid(pgrp);
if (!sbi->oz_pgrp) { if (!sbi->oz_pgrp) {
pr_warn("autofs: could not find process group %d\n", pr_err("could not find process group %d\n",
pgrp); pgrp);
goto fail_dput; goto fail_dput;
} }
@ -290,10 +294,10 @@ int autofs4_fill_super(struct super_block *s, void *data, int silent)
/* Couldn't this be tested earlier? */ /* Couldn't this be tested earlier? */
if (sbi->max_proto < AUTOFS_MIN_PROTO_VERSION || if (sbi->max_proto < AUTOFS_MIN_PROTO_VERSION ||
sbi->min_proto > AUTOFS_MAX_PROTO_VERSION) { sbi->min_proto > AUTOFS_MAX_PROTO_VERSION) {
printk("autofs: kernel does not match daemon version " pr_err("kernel does not match daemon version "
"daemon (%d, %d) kernel (%d, %d)\n", "daemon (%d, %d) kernel (%d, %d)\n",
sbi->min_proto, sbi->max_proto, sbi->min_proto, sbi->max_proto,
AUTOFS_MIN_PROTO_VERSION, AUTOFS_MAX_PROTO_VERSION); AUTOFS_MIN_PROTO_VERSION, AUTOFS_MAX_PROTO_VERSION);
goto fail_dput; goto fail_dput;
} }
@ -304,11 +308,11 @@ int autofs4_fill_super(struct super_block *s, void *data, int silent)
sbi->version = sbi->max_proto; sbi->version = sbi->max_proto;
sbi->sub_version = AUTOFS_PROTO_SUBVERSION; sbi->sub_version = AUTOFS_PROTO_SUBVERSION;
DPRINTK("pipe fd = %d, pgrp = %u", pipefd, pid_nr(sbi->oz_pgrp)); pr_debug("pipe fd = %d, pgrp = %u\n", pipefd, pid_nr(sbi->oz_pgrp));
pipe = fget(pipefd); pipe = fget(pipefd);
if (!pipe) { if (!pipe) {
printk("autofs: could not open pipe file descriptor\n"); pr_err("could not open pipe file descriptor\n");
goto fail_dput; goto fail_dput;
} }
ret = autofs_prepare_pipe(pipe); ret = autofs_prepare_pipe(pipe);
@ -328,7 +332,7 @@ int autofs4_fill_super(struct super_block *s, void *data, int silent)
* Failure ... clean up. * Failure ... clean up.
*/ */
fail_fput: fail_fput:
printk("autofs: pipe file descriptor does not contain proper ops\n"); pr_err("pipe file descriptor does not contain proper ops\n");
fput(pipe); fput(pipe);
/* fall through */ /* fall through */
fail_dput: fail_dput:


@ -1,16 +1,12 @@
/* -*- c -*- --------------------------------------------------------------- * /*
* * Copyright 1997-1998 Transmeta Corporation -- All Rights Reserved
* linux/fs/autofs/root.c * Copyright 1999-2000 Jeremy Fitzhardinge <jeremy@goop.org>
* * Copyright 2001-2006 Ian Kent <raven@themaw.net>
* Copyright 1997-1998 Transmeta Corporation -- All Rights Reserved
* Copyright 1999-2000 Jeremy Fitzhardinge <jeremy@goop.org>
* Copyright 2001-2006 Ian Kent <raven@themaw.net>
* *
* This file is part of the Linux kernel and is made available under * This file is part of the Linux kernel and is made available under
* the terms of the GNU General Public License, version 2, or at your * the terms of the GNU General Public License, version 2, or at your
* option, any later version, incorporated herein by reference. * option, any later version, incorporated herein by reference.
* */
* ------------------------------------------------------------------------- */
#include <linux/capability.h> #include <linux/capability.h>
#include <linux/errno.h> #include <linux/errno.h>
@ -23,16 +19,18 @@
#include "autofs_i.h" #include "autofs_i.h"
static int autofs4_dir_symlink(struct inode *,struct dentry *,const char *); static int autofs4_dir_symlink(struct inode *, struct dentry *, const char *);
static int autofs4_dir_unlink(struct inode *,struct dentry *); static int autofs4_dir_unlink(struct inode *, struct dentry *);
static int autofs4_dir_rmdir(struct inode *,struct dentry *); static int autofs4_dir_rmdir(struct inode *, struct dentry *);
static int autofs4_dir_mkdir(struct inode *,struct dentry *,umode_t); static int autofs4_dir_mkdir(struct inode *, struct dentry *, umode_t);
static long autofs4_root_ioctl(struct file *,unsigned int,unsigned long); static long autofs4_root_ioctl(struct file *, unsigned int, unsigned long);
#ifdef CONFIG_COMPAT #ifdef CONFIG_COMPAT
static long autofs4_root_compat_ioctl(struct file *,unsigned int,unsigned long); static long autofs4_root_compat_ioctl(struct file *,
unsigned int, unsigned long);
#endif #endif
static int autofs4_dir_open(struct inode *inode, struct file *file); static int autofs4_dir_open(struct inode *inode, struct file *file);
static struct dentry *autofs4_lookup(struct inode *,struct dentry *, unsigned int); static struct dentry *autofs4_lookup(struct inode *,
struct dentry *, unsigned int);
static struct vfsmount *autofs4_d_automount(struct path *); static struct vfsmount *autofs4_d_automount(struct path *);
static int autofs4_d_manage(struct dentry *, bool); static int autofs4_d_manage(struct dentry *, bool);
static void autofs4_dentry_release(struct dentry *); static void autofs4_dentry_release(struct dentry *);
@ -74,7 +72,9 @@ const struct dentry_operations autofs4_dentry_operations = {
static void autofs4_add_active(struct dentry *dentry) static void autofs4_add_active(struct dentry *dentry)
{ {
struct autofs_sb_info *sbi = autofs4_sbi(dentry->d_sb); struct autofs_sb_info *sbi = autofs4_sbi(dentry->d_sb);
struct autofs_info *ino = autofs4_dentry_ino(dentry); struct autofs_info *ino;
ino = autofs4_dentry_ino(dentry);
if (ino) { if (ino) {
spin_lock(&sbi->lookup_lock); spin_lock(&sbi->lookup_lock);
if (!ino->active_count) { if (!ino->active_count) {
@ -84,13 +84,14 @@ static void autofs4_add_active(struct dentry *dentry)
ino->active_count++; ino->active_count++;
spin_unlock(&sbi->lookup_lock); spin_unlock(&sbi->lookup_lock);
} }
return;
} }
static void autofs4_del_active(struct dentry *dentry) static void autofs4_del_active(struct dentry *dentry)
{ {
struct autofs_sb_info *sbi = autofs4_sbi(dentry->d_sb); struct autofs_sb_info *sbi = autofs4_sbi(dentry->d_sb);
struct autofs_info *ino = autofs4_dentry_ino(dentry); struct autofs_info *ino;
ino = autofs4_dentry_ino(dentry);
if (ino) { if (ino) {
spin_lock(&sbi->lookup_lock); spin_lock(&sbi->lookup_lock);
ino->active_count--; ino->active_count--;
@ -100,7 +101,6 @@ static void autofs4_del_active(struct dentry *dentry)
} }
spin_unlock(&sbi->lookup_lock); spin_unlock(&sbi->lookup_lock);
} }
return;
} }
static int autofs4_dir_open(struct inode *inode, struct file *file) static int autofs4_dir_open(struct inode *inode, struct file *file)
@ -108,7 +108,7 @@ static int autofs4_dir_open(struct inode *inode, struct file *file)
struct dentry *dentry = file->f_path.dentry; struct dentry *dentry = file->f_path.dentry;
struct autofs_sb_info *sbi = autofs4_sbi(dentry->d_sb); struct autofs_sb_info *sbi = autofs4_sbi(dentry->d_sb);
DPRINTK("file=%p dentry=%p %pd", file, dentry, dentry); pr_debug("file=%p dentry=%p %pd\n", file, dentry, dentry);
if (autofs4_oz_mode(sbi)) if (autofs4_oz_mode(sbi))
goto out; goto out;
@ -138,7 +138,7 @@ static void autofs4_dentry_release(struct dentry *de)
struct autofs_info *ino = autofs4_dentry_ino(de); struct autofs_info *ino = autofs4_dentry_ino(de);
struct autofs_sb_info *sbi = autofs4_sbi(de->d_sb); struct autofs_sb_info *sbi = autofs4_sbi(de->d_sb);
DPRINTK("releasing %p", de); pr_debug("releasing %p\n", de);
if (!ino) if (!ino)
return; return;
@ -278,9 +278,9 @@ static int autofs4_mount_wait(struct dentry *dentry, bool rcu_walk)
if (ino->flags & AUTOFS_INF_PENDING) { if (ino->flags & AUTOFS_INF_PENDING) {
if (rcu_walk) if (rcu_walk)
return -ECHILD; return -ECHILD;
DPRINTK("waiting for mount name=%pd", dentry); pr_debug("waiting for mount name=%pd\n", dentry);
status = autofs4_wait(sbi, dentry, NFY_MOUNT); status = autofs4_wait(sbi, dentry, NFY_MOUNT);
DPRINTK("mount wait done status=%d", status); pr_debug("mount wait done status=%d\n", status);
} }
ino->last_used = jiffies; ino->last_used = jiffies;
return status; return status;
@ -320,7 +320,9 @@ static struct dentry *autofs4_mountpoint_changed(struct path *path)
if (autofs_type_indirect(sbi->type) && d_unhashed(dentry)) { if (autofs_type_indirect(sbi->type) && d_unhashed(dentry)) {
struct dentry *parent = dentry->d_parent; struct dentry *parent = dentry->d_parent;
struct autofs_info *ino; struct autofs_info *ino;
struct dentry *new = d_lookup(parent, &dentry->d_name); struct dentry *new;
new = d_lookup(parent, &dentry->d_name);
if (!new) if (!new)
return NULL; return NULL;
ino = autofs4_dentry_ino(new); ino = autofs4_dentry_ino(new);
@ -338,7 +340,7 @@ static struct vfsmount *autofs4_d_automount(struct path *path)
struct autofs_info *ino = autofs4_dentry_ino(dentry); struct autofs_info *ino = autofs4_dentry_ino(dentry);
int status; int status;
DPRINTK("dentry=%p %pd", dentry, dentry); pr_debug("dentry=%p %pd\n", dentry, dentry);
/* The daemon never triggers a mount. */ /* The daemon never triggers a mount. */
if (autofs4_oz_mode(sbi)) if (autofs4_oz_mode(sbi))
@ -425,7 +427,7 @@ static int autofs4_d_manage(struct dentry *dentry, bool rcu_walk)
struct autofs_info *ino = autofs4_dentry_ino(dentry); struct autofs_info *ino = autofs4_dentry_ino(dentry);
int status; int status;
DPRINTK("dentry=%p %pd", dentry, dentry); pr_debug("dentry=%p %pd\n", dentry, dentry);
/* The daemon never waits. */ /* The daemon never waits. */
if (autofs4_oz_mode(sbi)) { if (autofs4_oz_mode(sbi)) {
@ -455,6 +457,7 @@ static int autofs4_d_manage(struct dentry *dentry, bool rcu_walk)
* a mount-trap. * a mount-trap.
*/ */
struct inode *inode; struct inode *inode;
if (ino->flags & (AUTOFS_INF_EXPIRING | AUTOFS_INF_NO_RCU)) if (ino->flags & (AUTOFS_INF_EXPIRING | AUTOFS_INF_NO_RCU))
return 0; return 0;
if (d_mountpoint(dentry)) if (d_mountpoint(dentry))
@ -494,13 +497,14 @@ static int autofs4_d_manage(struct dentry *dentry, bool rcu_walk)
} }
/* Lookups in the root directory */ /* Lookups in the root directory */
static struct dentry *autofs4_lookup(struct inode *dir, struct dentry *dentry, unsigned int flags) static struct dentry *autofs4_lookup(struct inode *dir,
struct dentry *dentry, unsigned int flags)
{ {
struct autofs_sb_info *sbi; struct autofs_sb_info *sbi;
struct autofs_info *ino; struct autofs_info *ino;
struct dentry *active; struct dentry *active;
DPRINTK("name = %pd", dentry); pr_debug("name = %pd\n", dentry);
/* File name too long to exist */ /* File name too long to exist */
if (dentry->d_name.len > NAME_MAX) if (dentry->d_name.len > NAME_MAX)
@ -508,14 +512,14 @@ static struct dentry *autofs4_lookup(struct inode *dir, struct dentry *dentry, u
sbi = autofs4_sbi(dir->i_sb); sbi = autofs4_sbi(dir->i_sb);
DPRINTK("pid = %u, pgrp = %u, catatonic = %d, oz_mode = %d", pr_debug("pid = %u, pgrp = %u, catatonic = %d, oz_mode = %d\n",
current->pid, task_pgrp_nr(current), sbi->catatonic, current->pid, task_pgrp_nr(current), sbi->catatonic,
autofs4_oz_mode(sbi)); autofs4_oz_mode(sbi));
active = autofs4_lookup_active(dentry); active = autofs4_lookup_active(dentry);
if (active) { if (active)
return active; return active;
} else { else {
/* /*
* A dentry that is not within the root can never trigger a * A dentry that is not within the root can never trigger a
* mount operation, unless the directory already exists, so we * mount operation, unless the directory already exists, so we
@ -526,7 +530,8 @@ static struct dentry *autofs4_lookup(struct inode *dir, struct dentry *dentry, u
return ERR_PTR(-ENOENT); return ERR_PTR(-ENOENT);
/* Mark entries in the root as mount triggers */ /* Mark entries in the root as mount triggers */
if (autofs_type_indirect(sbi->type) && IS_ROOT(dentry->d_parent)) if (IS_ROOT(dentry->d_parent) &&
autofs_type_indirect(sbi->type))
__managed_dentry_set_managed(dentry); __managed_dentry_set_managed(dentry);
ino = autofs4_new_ino(sbi); ino = autofs4_new_ino(sbi);
@ -554,7 +559,7 @@ static int autofs4_dir_symlink(struct inode *dir,
size_t size = strlen(symname); size_t size = strlen(symname);
char *cp; char *cp;
DPRINTK("%s <- %pd", symname, dentry); pr_debug("%s <- %pd\n", symname, dentry);
if (!autofs4_oz_mode(sbi)) if (!autofs4_oz_mode(sbi))
return -EACCES; return -EACCES;
@ -664,7 +669,6 @@ static void autofs_set_leaf_automount_flags(struct dentry *dentry)
if (IS_ROOT(parent->d_parent)) if (IS_ROOT(parent->d_parent))
return; return;
managed_dentry_clear_managed(parent); managed_dentry_clear_managed(parent);
return;
} }
static void autofs_clear_leaf_automount_flags(struct dentry *dentry) static void autofs_clear_leaf_automount_flags(struct dentry *dentry)
@ -687,7 +691,6 @@ static void autofs_clear_leaf_automount_flags(struct dentry *dentry)
if (d_child->next == &parent->d_subdirs && if (d_child->next == &parent->d_subdirs &&
d_child->prev == &parent->d_subdirs) d_child->prev == &parent->d_subdirs)
managed_dentry_set_managed(parent); managed_dentry_set_managed(parent);
return;
} }
static int autofs4_dir_rmdir(struct inode *dir, struct dentry *dentry) static int autofs4_dir_rmdir(struct inode *dir, struct dentry *dentry)
@ -696,7 +699,7 @@ static int autofs4_dir_rmdir(struct inode *dir, struct dentry *dentry)
struct autofs_info *ino = autofs4_dentry_ino(dentry); struct autofs_info *ino = autofs4_dentry_ino(dentry);
struct autofs_info *p_ino; struct autofs_info *p_ino;
DPRINTK("dentry %p, removing %pd", dentry, dentry); pr_debug("dentry %p, removing %pd\n", dentry, dentry);
if (!autofs4_oz_mode(sbi)) if (!autofs4_oz_mode(sbi))
return -EACCES; return -EACCES;
@ -728,7 +731,8 @@ static int autofs4_dir_rmdir(struct inode *dir, struct dentry *dentry)
return 0; return 0;
} }
static int autofs4_dir_mkdir(struct inode *dir, struct dentry *dentry, umode_t mode) static int autofs4_dir_mkdir(struct inode *dir,
struct dentry *dentry, umode_t mode)
{ {
struct autofs_sb_info *sbi = autofs4_sbi(dir->i_sb); struct autofs_sb_info *sbi = autofs4_sbi(dir->i_sb);
struct autofs_info *ino = autofs4_dentry_ino(dentry); struct autofs_info *ino = autofs4_dentry_ino(dentry);
@ -738,7 +742,7 @@ static int autofs4_dir_mkdir(struct inode *dir, struct dentry *dentry, umode_t m
if (!autofs4_oz_mode(sbi)) if (!autofs4_oz_mode(sbi))
return -EACCES; return -EACCES;
DPRINTK("dentry %p, creating %pd", dentry, dentry); pr_debug("dentry %p, creating %pd\n", dentry, dentry);
BUG_ON(!ino); BUG_ON(!ino);
@ -768,14 +772,18 @@ static int autofs4_dir_mkdir(struct inode *dir, struct dentry *dentry, umode_t m
/* Get/set timeout ioctl() operation */ /* Get/set timeout ioctl() operation */
#ifdef CONFIG_COMPAT #ifdef CONFIG_COMPAT
static inline int autofs4_compat_get_set_timeout(struct autofs_sb_info *sbi, static inline int autofs4_compat_get_set_timeout(struct autofs_sb_info *sbi,
compat_ulong_t __user *p) compat_ulong_t __user *p)
{ {
int rv;
unsigned long ntimeout; unsigned long ntimeout;
int rv;
if ((rv = get_user(ntimeout, p)) || rv = get_user(ntimeout, p);
(rv = put_user(sbi->exp_timeout/HZ, p))) if (rv)
return rv; goto error;
rv = put_user(sbi->exp_timeout/HZ, p);
if (rv)
goto error;
if (ntimeout > UINT_MAX/HZ) if (ntimeout > UINT_MAX/HZ)
sbi->exp_timeout = 0; sbi->exp_timeout = 0;
@ -783,18 +791,24 @@ static inline int autofs4_compat_get_set_timeout(struct autofs_sb_info *sbi,
sbi->exp_timeout = ntimeout * HZ; sbi->exp_timeout = ntimeout * HZ;
return 0; return 0;
error:
return rv;
} }
#endif #endif
static inline int autofs4_get_set_timeout(struct autofs_sb_info *sbi, static inline int autofs4_get_set_timeout(struct autofs_sb_info *sbi,
unsigned long __user *p) unsigned long __user *p)
{ {
int rv;
unsigned long ntimeout; unsigned long ntimeout;
int rv;
if ((rv = get_user(ntimeout, p)) || rv = get_user(ntimeout, p);
(rv = put_user(sbi->exp_timeout/HZ, p))) if (rv)
return rv; goto error;
rv = put_user(sbi->exp_timeout/HZ, p);
if (rv)
goto error;
if (ntimeout > ULONG_MAX/HZ) if (ntimeout > ULONG_MAX/HZ)
sbi->exp_timeout = 0; sbi->exp_timeout = 0;
@ -802,16 +816,20 @@ static inline int autofs4_get_set_timeout(struct autofs_sb_info *sbi,
sbi->exp_timeout = ntimeout * HZ; sbi->exp_timeout = ntimeout * HZ;
return 0; return 0;
error:
return rv;
} }
/* Return protocol version */ /* Return protocol version */
static inline int autofs4_get_protover(struct autofs_sb_info *sbi, int __user *p) static inline int autofs4_get_protover(struct autofs_sb_info *sbi,
int __user *p)
{ {
return put_user(sbi->version, p); return put_user(sbi->version, p);
} }
/* Return protocol sub version */ /* Return protocol sub version */
static inline int autofs4_get_protosubver(struct autofs_sb_info *sbi, int __user *p) static inline int autofs4_get_protosubver(struct autofs_sb_info *sbi,
int __user *p)
{ {
return put_user(sbi->sub_version, p); return put_user(sbi->sub_version, p);
} }
@ -826,7 +844,7 @@ static inline int autofs4_ask_umount(struct vfsmount *mnt, int __user *p)
if (may_umount(mnt)) if (may_umount(mnt))
status = 1; status = 1;
DPRINTK("returning %d", status); pr_debug("returning %d\n", status);
status = put_user(status, p); status = put_user(status, p);
@ -834,9 +852,9 @@ static inline int autofs4_ask_umount(struct vfsmount *mnt, int __user *p)
} }
/* Identify autofs4_dentries - this is so we can tell if there's /* Identify autofs4_dentries - this is so we can tell if there's
an extra dentry refcount or not. We only hold a refcount on the * an extra dentry refcount or not. We only hold a refcount on the
dentry if its non-negative (ie, d_inode != NULL) * dentry if its non-negative (ie, d_inode != NULL)
*/ */
int is_autofs4_dentry(struct dentry *dentry) int is_autofs4_dentry(struct dentry *dentry)
{ {
return dentry && d_really_is_positive(dentry) && return dentry && d_really_is_positive(dentry) &&
@ -854,8 +872,8 @@ static int autofs4_root_ioctl_unlocked(struct inode *inode, struct file *filp,
struct autofs_sb_info *sbi = autofs4_sbi(inode->i_sb); struct autofs_sb_info *sbi = autofs4_sbi(inode->i_sb);
void __user *p = (void __user *)arg; void __user *p = (void __user *)arg;
DPRINTK("cmd = 0x%08x, arg = 0x%08lx, sbi = %p, pgrp = %u", pr_debug("cmd = 0x%08x, arg = 0x%08lx, sbi = %p, pgrp = %u\n",
cmd,arg,sbi,task_pgrp_nr(current)); cmd, arg, sbi, task_pgrp_nr(current));
if (_IOC_TYPE(cmd) != _IOC_TYPE(AUTOFS_IOC_FIRST) || if (_IOC_TYPE(cmd) != _IOC_TYPE(AUTOFS_IOC_FIRST) ||
_IOC_NR(cmd) - _IOC_NR(AUTOFS_IOC_FIRST) >= AUTOFS_IOC_COUNT) _IOC_NR(cmd) - _IOC_NR(AUTOFS_IOC_FIRST) >= AUTOFS_IOC_COUNT)
@ -864,11 +882,11 @@ static int autofs4_root_ioctl_unlocked(struct inode *inode, struct file *filp,
if (!autofs4_oz_mode(sbi) && !capable(CAP_SYS_ADMIN)) if (!autofs4_oz_mode(sbi) && !capable(CAP_SYS_ADMIN))
return -EPERM; return -EPERM;
switch(cmd) { switch (cmd) {
case AUTOFS_IOC_READY: /* Wait queue: go ahead and retry */ case AUTOFS_IOC_READY: /* Wait queue: go ahead and retry */
return autofs4_wait_release(sbi,(autofs_wqt_t)arg,0); return autofs4_wait_release(sbi, (autofs_wqt_t) arg, 0);
case AUTOFS_IOC_FAIL: /* Wait queue: fail with ENOENT */ case AUTOFS_IOC_FAIL: /* Wait queue: fail with ENOENT */
return autofs4_wait_release(sbi,(autofs_wqt_t)arg,-ENOENT); return autofs4_wait_release(sbi, (autofs_wqt_t) arg, -ENOENT);
case AUTOFS_IOC_CATATONIC: /* Enter catatonic mode (daemon shutdown) */ case AUTOFS_IOC_CATATONIC: /* Enter catatonic mode (daemon shutdown) */
autofs4_catatonic_mode(sbi); autofs4_catatonic_mode(sbi);
return 0; return 0;
@ -888,13 +906,15 @@ static int autofs4_root_ioctl_unlocked(struct inode *inode, struct file *filp,
/* return a single thing to expire */ /* return a single thing to expire */
case AUTOFS_IOC_EXPIRE: case AUTOFS_IOC_EXPIRE:
return autofs4_expire_run(inode->i_sb,filp->f_path.mnt,sbi, p); return autofs4_expire_run(inode->i_sb,
filp->f_path.mnt, sbi, p);
/* same as above, but can send multiple expires through pipe */ /* same as above, but can send multiple expires through pipe */
case AUTOFS_IOC_EXPIRE_MULTI: case AUTOFS_IOC_EXPIRE_MULTI:
return autofs4_expire_multi(inode->i_sb,filp->f_path.mnt,sbi, p); return autofs4_expire_multi(inode->i_sb,
filp->f_path.mnt, sbi, p);
default: default:
return -ENOSYS; return -EINVAL;
} }
} }
@ -902,12 +922,13 @@ static long autofs4_root_ioctl(struct file *filp,
unsigned int cmd, unsigned long arg) unsigned int cmd, unsigned long arg)
{ {
struct inode *inode = file_inode(filp); struct inode *inode = file_inode(filp);
return autofs4_root_ioctl_unlocked(inode, filp, cmd, arg); return autofs4_root_ioctl_unlocked(inode, filp, cmd, arg);
} }
#ifdef CONFIG_COMPAT #ifdef CONFIG_COMPAT
static long autofs4_root_compat_ioctl(struct file *filp, static long autofs4_root_compat_ioctl(struct file *filp,
unsigned int cmd, unsigned long arg) unsigned int cmd, unsigned long arg)
{ {
struct inode *inode = file_inode(filp); struct inode *inode = file_inode(filp);
int ret; int ret;
@ -916,7 +937,7 @@ static long autofs4_root_compat_ioctl(struct file *filp,
ret = autofs4_root_ioctl_unlocked(inode, filp, cmd, arg); ret = autofs4_root_ioctl_unlocked(inode, filp, cmd, arg);
else else
ret = autofs4_root_ioctl_unlocked(inode, filp, cmd, ret = autofs4_root_ioctl_unlocked(inode, filp, cmd,
(unsigned long)compat_ptr(arg)); (unsigned long) compat_ptr(arg));
return ret; return ret;
} }
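Several hunks above replace the old assignment-inside-a-condition idiom, if ((rv = get_user(...)) || (rv = put_user(...))), with sequential checks that bail out on the first failure. A standalone sketch of the resulting shape, using illustrative names rather than the autofs4 ones:

#include <linux/uaccess.h>

/* Read a new value from userspace and hand the previous one back,
 * stopping at the first copy failure (illustrative only). */
static int get_set_value(unsigned long __user *p, unsigned long *stored)
{
	unsigned long newval;
	int rv;

	rv = get_user(newval, p);
	if (rv)
		return rv;

	rv = put_user(*stored, p);
	if (rv)
		return rv;

	*stored = newval;
	return 0;
}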


@ -1,14 +1,10 @@
/* -*- c -*- --------------------------------------------------------------- * /*
* * Copyright 1997-1998 Transmeta Corporation -- All Rights Reserved
* linux/fs/autofs/symlink.c
*
* Copyright 1997-1998 Transmeta Corporation -- All Rights Reserved
* *
* This file is part of the Linux kernel and is made available under * This file is part of the Linux kernel and is made available under
* the terms of the GNU General Public License, version 2, or at your * the terms of the GNU General Public License, version 2, or at your
* option, any later version, incorporated herein by reference. * option, any later version, incorporated herein by reference.
* */
* ------------------------------------------------------------------------- */
#include "autofs_i.h" #include "autofs_i.h"
@ -18,6 +14,7 @@ static const char *autofs4_get_link(struct dentry *dentry,
{ {
struct autofs_sb_info *sbi; struct autofs_sb_info *sbi;
struct autofs_info *ino; struct autofs_info *ino;
if (!dentry) if (!dentry)
return ERR_PTR(-ECHILD); return ERR_PTR(-ECHILD);
sbi = autofs4_sbi(dentry->d_sb); sbi = autofs4_sbi(dentry->d_sb);


@ -1,15 +1,11 @@
/* -*- c -*- --------------------------------------------------------------- * /*
* * Copyright 1997-1998 Transmeta Corporation -- All Rights Reserved
* linux/fs/autofs/waitq.c * Copyright 2001-2006 Ian Kent <raven@themaw.net>
*
* Copyright 1997-1998 Transmeta Corporation -- All Rights Reserved
* Copyright 2001-2006 Ian Kent <raven@themaw.net>
* *
* This file is part of the Linux kernel and is made available under * This file is part of the Linux kernel and is made available under
* the terms of the GNU General Public License, version 2, or at your * the terms of the GNU General Public License, version 2, or at your
* option, any later version, incorporated herein by reference. * option, any later version, incorporated herein by reference.
* */
* ------------------------------------------------------------------------- */
#include <linux/slab.h> #include <linux/slab.h>
#include <linux/time.h> #include <linux/time.h>
@ -18,7 +14,8 @@
#include "autofs_i.h" #include "autofs_i.h"
/* We make this a static variable rather than a part of the superblock; it /* We make this a static variable rather than a part of the superblock; it
is better if we don't reassign numbers easily even across filesystems */ * is better if we don't reassign numbers easily even across filesystems
*/
static autofs_wqt_t autofs4_next_wait_queue = 1; static autofs_wqt_t autofs4_next_wait_queue = 1;
/* These are the signals we allow interrupting a pending mount */ /* These are the signals we allow interrupting a pending mount */
@ -34,7 +31,7 @@ void autofs4_catatonic_mode(struct autofs_sb_info *sbi)
return; return;
} }
DPRINTK("entering catatonic mode"); pr_debug("entering catatonic mode\n");
sbi->catatonic = 1; sbi->catatonic = 1;
wq = sbi->queues; wq = sbi->queues;
@ -69,17 +66,19 @@ static int autofs4_write(struct autofs_sb_info *sbi,
set_fs(KERNEL_DS); set_fs(KERNEL_DS);
mutex_lock(&sbi->pipe_mutex); mutex_lock(&sbi->pipe_mutex);
while (bytes && wr = __vfs_write(file, data, bytes, &file->f_pos);
(wr = __vfs_write(file,data,bytes,&file->f_pos)) > 0) { while (bytes && wr) {
data += wr; data += wr;
bytes -= wr; bytes -= wr;
wr = __vfs_write(file, data, bytes, &file->f_pos);
} }
mutex_unlock(&sbi->pipe_mutex); mutex_unlock(&sbi->pipe_mutex);
set_fs(fs); set_fs(fs);
/* Keep the currently executing process from receiving a /* Keep the currently executing process from receiving a
SIGPIPE unless it was already supposed to get one */ * SIGPIPE unless it was already supposed to get one
*/
if (wr == -EPIPE && !sigpipe) { if (wr == -EPIPE && !sigpipe) {
spin_lock_irqsave(&current->sighand->siglock, flags); spin_lock_irqsave(&current->sighand->siglock, flags);
sigdelset(&current->pending.signal, SIGPIPE); sigdelset(&current->pending.signal, SIGPIPE);
@ -102,10 +101,11 @@ static void autofs4_notify_daemon(struct autofs_sb_info *sbi,
struct file *pipe = NULL; struct file *pipe = NULL;
size_t pktsz; size_t pktsz;
DPRINTK("wait id = 0x%08lx, name = %.*s, type=%d", pr_debug("wait id = 0x%08lx, name = %.*s, type=%d\n",
(unsigned long) wq->wait_queue_token, wq->name.len, wq->name.name, type); (unsigned long) wq->wait_queue_token,
wq->name.len, wq->name.name, type);
memset(&pkt,0,sizeof pkt); /* For security reasons */ memset(&pkt, 0, sizeof(pkt)); /* For security reasons */
pkt.hdr.proto_version = sbi->version; pkt.hdr.proto_version = sbi->version;
pkt.hdr.type = type; pkt.hdr.type = type;
@ -126,7 +126,8 @@ static void autofs4_notify_daemon(struct autofs_sb_info *sbi,
} }
case autofs_ptype_expire_multi: case autofs_ptype_expire_multi:
{ {
struct autofs_packet_expire_multi *ep = &pkt.v4_pkt.expire_multi; struct autofs_packet_expire_multi *ep =
&pkt.v4_pkt.expire_multi;
pktsz = sizeof(*ep); pktsz = sizeof(*ep);
@ -163,7 +164,7 @@ static void autofs4_notify_daemon(struct autofs_sb_info *sbi,
break; break;
} }
default: default:
printk("autofs4_notify_daemon: bad type %d!\n", type); pr_warn("bad type %d!\n", type);
mutex_unlock(&sbi->wq_mutex); mutex_unlock(&sbi->wq_mutex);
return; return;
} }
@ -231,7 +232,7 @@ autofs4_find_wait(struct autofs_sb_info *sbi, struct qstr *qstr)
if (wq->name.hash == qstr->hash && if (wq->name.hash == qstr->hash &&
wq->name.len == qstr->len && wq->name.len == qstr->len &&
wq->name.name && wq->name.name &&
!memcmp(wq->name.name, qstr->name, qstr->len)) !memcmp(wq->name.name, qstr->name, qstr->len))
break; break;
} }
return wq; return wq;
@ -248,7 +249,7 @@ autofs4_find_wait(struct autofs_sb_info *sbi, struct qstr *qstr)
static int validate_request(struct autofs_wait_queue **wait, static int validate_request(struct autofs_wait_queue **wait,
struct autofs_sb_info *sbi, struct autofs_sb_info *sbi,
struct qstr *qstr, struct qstr *qstr,
struct dentry*dentry, enum autofs_notify notify) struct dentry *dentry, enum autofs_notify notify)
{ {
struct autofs_wait_queue *wq; struct autofs_wait_queue *wq;
struct autofs_info *ino; struct autofs_info *ino;
@ -322,8 +323,10 @@ static int validate_request(struct autofs_wait_queue **wait,
* continue on and create a new request. * continue on and create a new request.
*/ */
if (!IS_ROOT(dentry)) { if (!IS_ROOT(dentry)) {
if (d_really_is_positive(dentry) && d_unhashed(dentry)) { if (d_unhashed(dentry) &&
d_really_is_positive(dentry)) {
struct dentry *parent = dentry->d_parent; struct dentry *parent = dentry->d_parent;
new = d_lookup(parent, &dentry->d_name); new = d_lookup(parent, &dentry->d_name);
if (new) if (new)
dentry = new; dentry = new;
@ -340,8 +343,8 @@ static int validate_request(struct autofs_wait_queue **wait,
return 1; return 1;
} }
int autofs4_wait(struct autofs_sb_info *sbi, struct dentry *dentry, int autofs4_wait(struct autofs_sb_info *sbi,
enum autofs_notify notify) struct dentry *dentry, enum autofs_notify notify)
{ {
struct autofs_wait_queue *wq; struct autofs_wait_queue *wq;
struct qstr qstr; struct qstr qstr;
@ -411,7 +414,7 @@ int autofs4_wait(struct autofs_sb_info *sbi, struct dentry *dentry,
if (!wq) { if (!wq) {
/* Create a new wait queue */ /* Create a new wait queue */
wq = kmalloc(sizeof(struct autofs_wait_queue),GFP_KERNEL); wq = kmalloc(sizeof(struct autofs_wait_queue), GFP_KERNEL);
if (!wq) { if (!wq) {
kfree(qstr.name); kfree(qstr.name);
mutex_unlock(&sbi->wq_mutex); mutex_unlock(&sbi->wq_mutex);
@ -450,17 +453,19 @@ int autofs4_wait(struct autofs_sb_info *sbi, struct dentry *dentry,
autofs_ptype_expire_indirect; autofs_ptype_expire_indirect;
} }
DPRINTK("new wait id = 0x%08lx, name = %.*s, nfy=%d\n", pr_debug("new wait id = 0x%08lx, name = %.*s, nfy=%d\n",
(unsigned long) wq->wait_queue_token, wq->name.len, (unsigned long) wq->wait_queue_token, wq->name.len,
wq->name.name, notify); wq->name.name, notify);
/* autofs4_notify_daemon() may block; it will unlock ->wq_mutex */ /*
* autofs4_notify_daemon() may block; it will unlock ->wq_mutex
*/
autofs4_notify_daemon(sbi, wq, type); autofs4_notify_daemon(sbi, wq, type);
} else { } else {
wq->wait_ctr++; wq->wait_ctr++;
DPRINTK("existing wait id = 0x%08lx, name = %.*s, nfy=%d", pr_debug("existing wait id = 0x%08lx, name = %.*s, nfy=%d\n",
(unsigned long) wq->wait_queue_token, wq->name.len, (unsigned long) wq->wait_queue_token, wq->name.len,
wq->name.name, notify); wq->name.name, notify);
mutex_unlock(&sbi->wq_mutex); mutex_unlock(&sbi->wq_mutex);
kfree(qstr.name); kfree(qstr.name);
} }
@ -471,12 +476,14 @@ int autofs4_wait(struct autofs_sb_info *sbi, struct dentry *dentry,
*/ */
if (wq->name.name) { if (wq->name.name) {
/* Block all but "shutdown" signals while waiting */ /* Block all but "shutdown" signals while waiting */
sigset_t oldset; unsigned long shutdown_sigs_mask;
unsigned long irqflags; unsigned long irqflags;
sigset_t oldset;
spin_lock_irqsave(&current->sighand->siglock, irqflags); spin_lock_irqsave(&current->sighand->siglock, irqflags);
oldset = current->blocked; oldset = current->blocked;
siginitsetinv(&current->blocked, SHUTDOWN_SIGS & ~oldset.sig[0]); shutdown_sigs_mask = SHUTDOWN_SIGS & ~oldset.sig[0];
siginitsetinv(&current->blocked, shutdown_sigs_mask);
recalc_sigpending(); recalc_sigpending();
spin_unlock_irqrestore(&current->sighand->siglock, irqflags); spin_unlock_irqrestore(&current->sighand->siglock, irqflags);
@ -487,7 +494,7 @@ int autofs4_wait(struct autofs_sb_info *sbi, struct dentry *dentry,
recalc_sigpending(); recalc_sigpending();
spin_unlock_irqrestore(&current->sighand->siglock, irqflags); spin_unlock_irqrestore(&current->sighand->siglock, irqflags);
} else { } else {
DPRINTK("skipped sleeping"); pr_debug("skipped sleeping\n");
} }
status = wq->status; status = wq->status;
@ -562,4 +569,3 @@ int autofs4_wait_release(struct autofs_sb_info *sbi, autofs_wqt_t wait_queue_tok
return 0; return 0;
} }
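The autofs4_write() rework earlier in this file hoists the first __vfs_write() call out of the loop condition. Below is a hedged sketch of that structure, assuming the same set_fs(KERNEL_DS) context as the original so a kernel buffer can be passed; note that this sketch keeps an explicit wr > 0 test, whereas the rewritten loop above tests wr alone.

/* Illustrative only: write a whole kernel buffer to a pipe file. */
static ssize_t pipe_write_all(struct file *file, const char *data, size_t bytes)
{
	ssize_t wr;

	wr = __vfs_write(file, data, bytes, &file->f_pos);
	while (bytes && wr > 0) {
		data += wr;
		bytes -= wr;
		wr = __vfs_write(file, data, bytes, &file->f_pos);
	}
	return wr;	/* <= 0 on error, last chunk size otherwise */
}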


@ -621,17 +621,17 @@ EXPORT_SYMBOL(mark_buffer_dirty_inode);
* If warn is true, then emit a warning if the page is not uptodate and has * If warn is true, then emit a warning if the page is not uptodate and has
* not been truncated. * not been truncated.
* *
* The caller must hold mem_cgroup_begin_page_stat() lock. * The caller must hold lock_page_memcg().
*/ */
static void __set_page_dirty(struct page *page, struct address_space *mapping, static void __set_page_dirty(struct page *page, struct address_space *mapping,
struct mem_cgroup *memcg, int warn) int warn)
{ {
unsigned long flags; unsigned long flags;
spin_lock_irqsave(&mapping->tree_lock, flags); spin_lock_irqsave(&mapping->tree_lock, flags);
if (page->mapping) { /* Race with truncate? */ if (page->mapping) { /* Race with truncate? */
WARN_ON_ONCE(warn && !PageUptodate(page)); WARN_ON_ONCE(warn && !PageUptodate(page));
account_page_dirtied(page, mapping, memcg); account_page_dirtied(page, mapping);
radix_tree_tag_set(&mapping->page_tree, radix_tree_tag_set(&mapping->page_tree,
page_index(page), PAGECACHE_TAG_DIRTY); page_index(page), PAGECACHE_TAG_DIRTY);
} }
@ -666,7 +666,6 @@ static void __set_page_dirty(struct page *page, struct address_space *mapping,
int __set_page_dirty_buffers(struct page *page) int __set_page_dirty_buffers(struct page *page)
{ {
int newly_dirty; int newly_dirty;
struct mem_cgroup *memcg;
struct address_space *mapping = page_mapping(page); struct address_space *mapping = page_mapping(page);
if (unlikely(!mapping)) if (unlikely(!mapping))
@ -683,17 +682,17 @@ int __set_page_dirty_buffers(struct page *page)
} while (bh != head); } while (bh != head);
} }
/* /*
* Use mem_group_begin_page_stat() to keep PageDirty synchronized with * Lock out page->mem_cgroup migration to keep PageDirty
* per-memcg dirty page counters. * synchronized with per-memcg dirty page counters.
*/ */
memcg = mem_cgroup_begin_page_stat(page); lock_page_memcg(page);
newly_dirty = !TestSetPageDirty(page); newly_dirty = !TestSetPageDirty(page);
spin_unlock(&mapping->private_lock); spin_unlock(&mapping->private_lock);
if (newly_dirty) if (newly_dirty)
__set_page_dirty(page, mapping, memcg, 1); __set_page_dirty(page, mapping, 1);
mem_cgroup_end_page_stat(memcg); unlock_page_memcg(page);
if (newly_dirty) if (newly_dirty)
__mark_inode_dirty(mapping->host, I_DIRTY_PAGES); __mark_inode_dirty(mapping->host, I_DIRTY_PAGES);
@ -1167,15 +1166,14 @@ void mark_buffer_dirty(struct buffer_head *bh)
if (!test_set_buffer_dirty(bh)) { if (!test_set_buffer_dirty(bh)) {
struct page *page = bh->b_page; struct page *page = bh->b_page;
struct address_space *mapping = NULL; struct address_space *mapping = NULL;
struct mem_cgroup *memcg;
memcg = mem_cgroup_begin_page_stat(page); lock_page_memcg(page);
if (!TestSetPageDirty(page)) { if (!TestSetPageDirty(page)) {
mapping = page_mapping(page); mapping = page_mapping(page);
if (mapping) if (mapping)
__set_page_dirty(page, mapping, memcg, 0); __set_page_dirty(page, mapping, 0);
} }
mem_cgroup_end_page_stat(memcg); unlock_page_memcg(page);
if (mapping) if (mapping)
__mark_inode_dirty(mapping->host, I_DIRTY_PAGES); __mark_inode_dirty(mapping->host, I_DIRTY_PAGES);
} }
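The buffer.c hunks above (and the xfs_aops.c hunks further down) share one pattern: the memcg handle from mem_cgroup_begin_page_stat()/mem_cgroup_end_page_stat() is replaced by lock_page_memcg()/unlock_page_memcg(), which pin page->mem_cgroup across the dirty accounting. A condensed, illustrative sketch of the new shape; it deliberately omits the radix-tree tagging and per-memcg stat updates the real paths perform.

#include <linux/memcontrol.h>
#include <linux/mm.h>
#include <linux/fs.h>

/* Illustrative helper only, not one of the fs-specific set_page_dirty paths. */
static bool example_set_page_dirty(struct page *page)
{
	struct address_space *mapping = page_mapping(page);
	bool newly_dirty;

	lock_page_memcg(page);		/* blocks page->mem_cgroup migration */
	newly_dirty = !TestSetPageDirty(page);
	unlock_page_memcg(page);

	if (newly_dirty && mapping)
		__mark_inode_dirty(mapping->host, I_DIRTY_PAGES);
	return newly_dirty;
}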


@ -24,6 +24,7 @@
#include <linux/highmem.h> #include <linux/highmem.h>
#include <linux/prefetch.h> #include <linux/prefetch.h>
#include <linux/mpage.h> #include <linux/mpage.h>
#include <linux/mm_inline.h>
#include <linux/writeback.h> #include <linux/writeback.h>
#include <linux/backing-dev.h> #include <linux/backing-dev.h>
#include <linux/pagevec.h> #include <linux/pagevec.h>
@ -366,7 +367,7 @@ mpage_readpages(struct address_space *mapping, struct list_head *pages,
map_bh.b_state = 0; map_bh.b_state = 0;
map_bh.b_size = 0; map_bh.b_size = 0;
for (page_idx = 0; page_idx < nr_pages; page_idx++) { for (page_idx = 0; page_idx < nr_pages; page_idx++) {
struct page *page = list_entry(pages->prev, struct page, lru); struct page *page = lru_to_page(pages);
prefetchw(&page->flags); prefetchw(&page->flags);
list_del(&page->lru); list_del(&page->lru);
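lru_to_page(), made available here through <linux/mm_inline.h>, is purely a readability helper; as best I recall it is defined essentially as below, so the replaced list_entry(pages->prev, ...) expression is equivalent.

#define lru_to_page(head) (list_entry((head)->prev, struct page, lru))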


@ -287,7 +287,6 @@ struct o2hb_bio_wait_ctxt {
static void o2hb_write_timeout(struct work_struct *work) static void o2hb_write_timeout(struct work_struct *work)
{ {
int failed, quorum; int failed, quorum;
unsigned long flags;
struct o2hb_region *reg = struct o2hb_region *reg =
container_of(work, struct o2hb_region, container_of(work, struct o2hb_region,
hr_write_timeout_work.work); hr_write_timeout_work.work);
@ -297,14 +296,14 @@ static void o2hb_write_timeout(struct work_struct *work)
jiffies_to_msecs(jiffies - reg->hr_last_timeout_start)); jiffies_to_msecs(jiffies - reg->hr_last_timeout_start));
if (o2hb_global_heartbeat_active()) { if (o2hb_global_heartbeat_active()) {
spin_lock_irqsave(&o2hb_live_lock, flags); spin_lock(&o2hb_live_lock);
if (test_bit(reg->hr_region_num, o2hb_quorum_region_bitmap)) if (test_bit(reg->hr_region_num, o2hb_quorum_region_bitmap))
set_bit(reg->hr_region_num, o2hb_failed_region_bitmap); set_bit(reg->hr_region_num, o2hb_failed_region_bitmap);
failed = bitmap_weight(o2hb_failed_region_bitmap, failed = bitmap_weight(o2hb_failed_region_bitmap,
O2NM_MAX_REGIONS); O2NM_MAX_REGIONS);
quorum = bitmap_weight(o2hb_quorum_region_bitmap, quorum = bitmap_weight(o2hb_quorum_region_bitmap,
O2NM_MAX_REGIONS); O2NM_MAX_REGIONS);
spin_unlock_irqrestore(&o2hb_live_lock, flags); spin_unlock(&o2hb_live_lock);
mlog(ML_HEARTBEAT, "Number of regions %d, failed regions %d\n", mlog(ML_HEARTBEAT, "Number of regions %d, failed regions %d\n",
quorum, failed); quorum, failed);
@ -2425,11 +2424,10 @@ EXPORT_SYMBOL_GPL(o2hb_check_node_heartbeating);
int o2hb_check_node_heartbeating_no_sem(u8 node_num) int o2hb_check_node_heartbeating_no_sem(u8 node_num)
{ {
unsigned long testing_map[BITS_TO_LONGS(O2NM_MAX_NODES)]; unsigned long testing_map[BITS_TO_LONGS(O2NM_MAX_NODES)];
unsigned long flags;
spin_lock_irqsave(&o2hb_live_lock, flags); spin_lock(&o2hb_live_lock);
o2hb_fill_node_map_from_callback(testing_map, sizeof(testing_map)); o2hb_fill_node_map_from_callback(testing_map, sizeof(testing_map));
spin_unlock_irqrestore(&o2hb_live_lock, flags); spin_unlock(&o2hb_live_lock);
if (!test_bit(node_num, testing_map)) { if (!test_bit(node_num, testing_map)) {
mlog(ML_HEARTBEAT, mlog(ML_HEARTBEAT,
"node (%u) does not have heartbeating enabled.\n", "node (%u) does not have heartbeating enabled.\n",


@ -282,6 +282,7 @@ static inline void __dlm_set_joining_node(struct dlm_ctxt *dlm,
#define DLM_LOCK_RES_DROPPING_REF 0x00000040 #define DLM_LOCK_RES_DROPPING_REF 0x00000040
#define DLM_LOCK_RES_BLOCK_DIRTY 0x00001000 #define DLM_LOCK_RES_BLOCK_DIRTY 0x00001000
#define DLM_LOCK_RES_SETREF_INPROG 0x00002000 #define DLM_LOCK_RES_SETREF_INPROG 0x00002000
#define DLM_LOCK_RES_RECOVERY_WAITING 0x00004000
/* max milliseconds to wait to sync up a network failure with a node death */ /* max milliseconds to wait to sync up a network failure with a node death */
#define DLM_NODE_DEATH_WAIT_MAX (5 * 1000) #define DLM_NODE_DEATH_WAIT_MAX (5 * 1000)
@ -451,6 +452,7 @@ enum {
DLM_QUERY_REGION = 519, DLM_QUERY_REGION = 519,
DLM_QUERY_NODEINFO = 520, DLM_QUERY_NODEINFO = 520,
DLM_BEGIN_EXIT_DOMAIN_MSG = 521, DLM_BEGIN_EXIT_DOMAIN_MSG = 521,
DLM_DEREF_LOCKRES_DONE = 522,
}; };
struct dlm_reco_node_data struct dlm_reco_node_data
@ -545,7 +547,7 @@ struct dlm_master_requery
* }; * };
* *
* from ../cluster/tcp.h * from ../cluster/tcp.h
* NET_MAX_PAYLOAD_BYTES (4096 - sizeof(net_msg)) * O2NET_MAX_PAYLOAD_BYTES (4096 - sizeof(net_msg))
* (roughly 4080 bytes) * (roughly 4080 bytes)
* and sizeof(dlm_migratable_lockres) = 112 bytes * and sizeof(dlm_migratable_lockres) = 112 bytes
* and sizeof(dlm_migratable_lock) = 16 bytes * and sizeof(dlm_migratable_lock) = 16 bytes
@ -586,7 +588,7 @@ struct dlm_migratable_lockres
/* from above, 128 bytes /* from above, 128 bytes
* for some undetermined future use */ * for some undetermined future use */
#define DLM_MIG_LOCKRES_RESERVED (NET_MAX_PAYLOAD_BYTES - \ #define DLM_MIG_LOCKRES_RESERVED (O2NET_MAX_PAYLOAD_BYTES - \
DLM_MIG_LOCKRES_MAX_LEN) DLM_MIG_LOCKRES_MAX_LEN)
struct dlm_create_lock struct dlm_create_lock
@ -782,6 +784,20 @@ struct dlm_deref_lockres
u8 name[O2NM_MAX_NAME_LEN]; u8 name[O2NM_MAX_NAME_LEN];
}; };
enum {
DLM_DEREF_RESPONSE_DONE = 0,
DLM_DEREF_RESPONSE_INPROG = 1,
};
struct dlm_deref_lockres_done {
u32 pad1;
u16 pad2;
u8 node_idx;
u8 namelen;
u8 name[O2NM_MAX_NAME_LEN];
};
static inline enum dlm_status static inline enum dlm_status
__dlm_lockres_state_to_status(struct dlm_lock_resource *res) __dlm_lockres_state_to_status(struct dlm_lock_resource *res)
{ {
@ -789,7 +805,8 @@ __dlm_lockres_state_to_status(struct dlm_lock_resource *res)
assert_spin_locked(&res->spinlock); assert_spin_locked(&res->spinlock);
if (res->state & DLM_LOCK_RES_RECOVERING) if (res->state & (DLM_LOCK_RES_RECOVERING|
DLM_LOCK_RES_RECOVERY_WAITING))
status = DLM_RECOVERING; status = DLM_RECOVERING;
else if (res->state & DLM_LOCK_RES_MIGRATING) else if (res->state & DLM_LOCK_RES_MIGRATING)
status = DLM_MIGRATING; status = DLM_MIGRATING;
@ -968,6 +985,8 @@ int dlm_assert_master_handler(struct o2net_msg *msg, u32 len, void *data,
void dlm_assert_master_post_handler(int status, void *data, void *ret_data); void dlm_assert_master_post_handler(int status, void *data, void *ret_data);
int dlm_deref_lockres_handler(struct o2net_msg *msg, u32 len, void *data, int dlm_deref_lockres_handler(struct o2net_msg *msg, u32 len, void *data,
void **ret_data); void **ret_data);
int dlm_deref_lockres_done_handler(struct o2net_msg *msg, u32 len, void *data,
void **ret_data);
int dlm_migrate_request_handler(struct o2net_msg *msg, u32 len, void *data, int dlm_migrate_request_handler(struct o2net_msg *msg, u32 len, void *data,
void **ret_data); void **ret_data);
int dlm_mig_lockres_handler(struct o2net_msg *msg, u32 len, void *data, int dlm_mig_lockres_handler(struct o2net_msg *msg, u32 len, void *data,
@ -1009,6 +1028,7 @@ static inline void __dlm_wait_on_lockres(struct dlm_lock_resource *res)
{ {
__dlm_wait_on_lockres_flags(res, (DLM_LOCK_RES_IN_PROGRESS| __dlm_wait_on_lockres_flags(res, (DLM_LOCK_RES_IN_PROGRESS|
DLM_LOCK_RES_RECOVERING| DLM_LOCK_RES_RECOVERING|
DLM_LOCK_RES_RECOVERY_WAITING|
DLM_LOCK_RES_MIGRATING)); DLM_LOCK_RES_MIGRATING));
} }


@ -132,10 +132,13 @@ static DECLARE_WAIT_QUEUE_HEAD(dlm_domain_events);
* - Message DLM_QUERY_NODEINFO added to allow online node removes * - Message DLM_QUERY_NODEINFO added to allow online node removes
* New in version 1.2: * New in version 1.2:
* - Message DLM_BEGIN_EXIT_DOMAIN_MSG added to mark start of exit domain * - Message DLM_BEGIN_EXIT_DOMAIN_MSG added to mark start of exit domain
* New in version 1.3:
* - Message DLM_DEREF_LOCKRES_DONE added to inform non-master that the
* refmap is cleared
*/ */
static const struct dlm_protocol_version dlm_protocol = { static const struct dlm_protocol_version dlm_protocol = {
.pv_major = 1, .pv_major = 1,
.pv_minor = 2, .pv_minor = 3,
}; };
#define DLM_DOMAIN_BACKOFF_MS 200 #define DLM_DOMAIN_BACKOFF_MS 200
@ -1396,7 +1399,7 @@ static int dlm_send_join_cancels(struct dlm_ctxt *dlm,
unsigned int map_size) unsigned int map_size)
{ {
int status, tmpstat; int status, tmpstat;
unsigned int node; int node;
if (map_size != (BITS_TO_LONGS(O2NM_MAX_NODES) * if (map_size != (BITS_TO_LONGS(O2NM_MAX_NODES) *
sizeof(unsigned long))) { sizeof(unsigned long))) {
@ -1853,7 +1856,13 @@ static int dlm_register_domain_handlers(struct dlm_ctxt *dlm)
sizeof(struct dlm_exit_domain), sizeof(struct dlm_exit_domain),
dlm_begin_exit_domain_handler, dlm_begin_exit_domain_handler,
dlm, NULL, &dlm->dlm_domain_handlers); dlm, NULL, &dlm->dlm_domain_handlers);
if (status)
goto bail;
status = o2net_register_handler(DLM_DEREF_LOCKRES_DONE, dlm->key,
sizeof(struct dlm_deref_lockres_done),
dlm_deref_lockres_done_handler,
dlm, NULL, &dlm->dlm_domain_handlers);
bail: bail:
if (status) if (status)
dlm_unregister_domain_handlers(dlm); dlm_unregister_domain_handlers(dlm);


@ -2278,7 +2278,7 @@ int dlm_drop_lockres_ref(struct dlm_ctxt *dlm, struct dlm_lock_resource *res)
dlm_print_one_lock_resource(res); dlm_print_one_lock_resource(res);
BUG(); BUG();
} }
return ret; return ret ? ret : r;
} }
int dlm_deref_lockres_handler(struct o2net_msg *msg, u32 len, void *data, int dlm_deref_lockres_handler(struct o2net_msg *msg, u32 len, void *data,
@ -2345,7 +2345,7 @@ int dlm_deref_lockres_handler(struct o2net_msg *msg, u32 len, void *data,
res->lockname.len, res->lockname.name, node); res->lockname.len, res->lockname.name, node);
dlm_print_one_lock_resource(res); dlm_print_one_lock_resource(res);
} }
ret = 0; ret = DLM_DEREF_RESPONSE_DONE;
goto done; goto done;
} }
@ -2365,7 +2365,7 @@ int dlm_deref_lockres_handler(struct o2net_msg *msg, u32 len, void *data,
spin_unlock(&dlm->work_lock); spin_unlock(&dlm->work_lock);
queue_work(dlm->dlm_worker, &dlm->dispatched_work); queue_work(dlm->dlm_worker, &dlm->dispatched_work);
return 0; return DLM_DEREF_RESPONSE_INPROG;
done: done:
if (res) if (res)
@ -2375,6 +2375,122 @@ done:
return ret; return ret;
} }
int dlm_deref_lockres_done_handler(struct o2net_msg *msg, u32 len, void *data,
void **ret_data)
{
struct dlm_ctxt *dlm = data;
struct dlm_deref_lockres_done *deref
= (struct dlm_deref_lockres_done *)msg->buf;
struct dlm_lock_resource *res = NULL;
char *name;
unsigned int namelen;
int ret = -EINVAL;
u8 node;
unsigned int hash;
if (!dlm_grab(dlm))
return 0;
name = deref->name;
namelen = deref->namelen;
node = deref->node_idx;
if (namelen > DLM_LOCKID_NAME_MAX) {
mlog(ML_ERROR, "Invalid name length!");
goto done;
}
if (deref->node_idx >= O2NM_MAX_NODES) {
mlog(ML_ERROR, "Invalid node number: %u\n", node);
goto done;
}
hash = dlm_lockid_hash(name, namelen);
spin_lock(&dlm->spinlock);
res = __dlm_lookup_lockres_full(dlm, name, namelen, hash);
if (!res) {
spin_unlock(&dlm->spinlock);
mlog(ML_ERROR, "%s:%.*s: bad lockres name\n",
dlm->name, namelen, name);
goto done;
}
spin_lock(&res->spinlock);
BUG_ON(!(res->state & DLM_LOCK_RES_DROPPING_REF));
if (!list_empty(&res->purge)) {
mlog(0, "%s: Removing res %.*s from purgelist\n",
dlm->name, res->lockname.len, res->lockname.name);
list_del_init(&res->purge);
dlm_lockres_put(res);
dlm->purge_count--;
}
if (!__dlm_lockres_unused(res)) {
mlog(ML_ERROR, "%s: res %.*s in use after deref\n",
dlm->name, res->lockname.len, res->lockname.name);
__dlm_print_one_lock_resource(res);
BUG();
}
__dlm_unhash_lockres(dlm, res);
spin_lock(&dlm->track_lock);
if (!list_empty(&res->tracking))
list_del_init(&res->tracking);
else {
mlog(ML_ERROR, "%s: Resource %.*s not on the Tracking list\n",
dlm->name, res->lockname.len, res->lockname.name);
__dlm_print_one_lock_resource(res);
}
spin_unlock(&dlm->track_lock);
/* lockres is not in the hash now. drop the flag and wake up
* any processes waiting in dlm_get_lock_resource.
*/
res->state &= ~DLM_LOCK_RES_DROPPING_REF;
spin_unlock(&res->spinlock);
wake_up(&res->wq);
dlm_lockres_put(res);
spin_unlock(&dlm->spinlock);
done:
dlm_put(dlm);
return ret;
}
static void dlm_drop_lockres_ref_done(struct dlm_ctxt *dlm,
struct dlm_lock_resource *res, u8 node)
{
struct dlm_deref_lockres_done deref;
int ret = 0, r;
const char *lockname;
unsigned int namelen;
lockname = res->lockname.name;
namelen = res->lockname.len;
BUG_ON(namelen > O2NM_MAX_NAME_LEN);
memset(&deref, 0, sizeof(deref));
deref.node_idx = dlm->node_num;
deref.namelen = namelen;
memcpy(deref.name, lockname, namelen);
ret = o2net_send_message(DLM_DEREF_LOCKRES_DONE, dlm->key,
&deref, sizeof(deref), node, &r);
if (ret < 0) {
mlog(ML_ERROR, "%s: res %.*s, error %d send DEREF DONE "
" to node %u\n", dlm->name, namelen,
lockname, ret, node);
} else if (r < 0) {
/* ignore the error */
mlog(ML_ERROR, "%s: res %.*s, DEREF to node %u got %d\n",
dlm->name, namelen, lockname, node, r);
dlm_print_one_lock_resource(res);
}
}
static void dlm_deref_lockres_worker(struct dlm_work_item *item, void *data) static void dlm_deref_lockres_worker(struct dlm_work_item *item, void *data)
{ {
struct dlm_ctxt *dlm; struct dlm_ctxt *dlm;
@ -2395,6 +2511,8 @@ static void dlm_deref_lockres_worker(struct dlm_work_item *item, void *data)
} }
spin_unlock(&res->spinlock); spin_unlock(&res->spinlock);
dlm_drop_lockres_ref_done(dlm, res, node);
if (cleared) { if (cleared) {
mlog(0, "%s:%.*s node %u ref dropped in dispatch\n", mlog(0, "%s:%.*s node %u ref dropped in dispatch\n",
dlm->name, res->lockname.len, res->lockname.name, node); dlm->name, res->lockname.len, res->lockname.name, node);
@ -2432,7 +2550,8 @@ static int dlm_is_lockres_migrateable(struct dlm_ctxt *dlm,
return 0; return 0;
/* delay migration when the lockres is in RECOCERING state */ /* delay migration when the lockres is in RECOCERING state */
if (res->state & DLM_LOCK_RES_RECOVERING) if (res->state & (DLM_LOCK_RES_RECOVERING|
DLM_LOCK_RES_RECOVERY_WAITING))
return 0; return 0;
if (res->owner != dlm->node_num) if (res->owner != dlm->node_num)


@ -1403,12 +1403,24 @@ int dlm_mig_lockres_handler(struct o2net_msg *msg, u32 len, void *data,
* and RECOVERY flag changed when it completes. */ * and RECOVERY flag changed when it completes. */
hash = dlm_lockid_hash(mres->lockname, mres->lockname_len); hash = dlm_lockid_hash(mres->lockname, mres->lockname_len);
spin_lock(&dlm->spinlock); spin_lock(&dlm->spinlock);
res = __dlm_lookup_lockres(dlm, mres->lockname, mres->lockname_len, res = __dlm_lookup_lockres_full(dlm, mres->lockname, mres->lockname_len,
hash); hash);
if (res) { if (res) {
/* this will get a ref on res */ /* this will get a ref on res */
/* mark it as recovering/migrating and hash it */ /* mark it as recovering/migrating and hash it */
spin_lock(&res->spinlock); spin_lock(&res->spinlock);
if (res->state & DLM_LOCK_RES_DROPPING_REF) {
mlog(0, "%s: node is attempting to migrate "
"lockres %.*s, but marked as dropping "
" ref!\n", dlm->name,
mres->lockname_len, mres->lockname);
ret = -EINVAL;
spin_unlock(&res->spinlock);
spin_unlock(&dlm->spinlock);
dlm_lockres_put(res);
goto leave;
}
if (mres->flags & DLM_MRES_RECOVERY) { if (mres->flags & DLM_MRES_RECOVERY) {
res->state |= DLM_LOCK_RES_RECOVERING; res->state |= DLM_LOCK_RES_RECOVERING;
} else { } else {
@ -2163,6 +2175,13 @@ static void dlm_finish_local_lockres_recovery(struct dlm_ctxt *dlm,
for (i = 0; i < DLM_HASH_BUCKETS; i++) { for (i = 0; i < DLM_HASH_BUCKETS; i++) {
bucket = dlm_lockres_hash(dlm, i); bucket = dlm_lockres_hash(dlm, i);
hlist_for_each_entry(res, bucket, hash_node) { hlist_for_each_entry(res, bucket, hash_node) {
if (res->state & DLM_LOCK_RES_RECOVERY_WAITING) {
spin_lock(&res->spinlock);
res->state &= ~DLM_LOCK_RES_RECOVERY_WAITING;
spin_unlock(&res->spinlock);
wake_up(&res->wq);
}
if (!(res->state & DLM_LOCK_RES_RECOVERING)) if (!(res->state & DLM_LOCK_RES_RECOVERING))
continue; continue;
@ -2300,6 +2319,7 @@ static void dlm_free_dead_locks(struct dlm_ctxt *dlm,
res->lockname.len, res->lockname.name, freed, dead_node); res->lockname.len, res->lockname.name, freed, dead_node);
__dlm_print_one_lock_resource(res); __dlm_print_one_lock_resource(res);
} }
res->state |= DLM_LOCK_RES_RECOVERY_WAITING;
dlm_lockres_clear_refmap_bit(dlm, res, dead_node); dlm_lockres_clear_refmap_bit(dlm, res, dead_node);
} else if (test_bit(dead_node, res->refmap)) { } else if (test_bit(dead_node, res->refmap)) {
mlog(0, "%s:%.*s: dead node %u had a ref, but had " mlog(0, "%s:%.*s: dead node %u had a ref, but had "
@ -2377,14 +2397,16 @@ static void dlm_do_local_recovery_cleanup(struct dlm_ctxt *dlm, u8 dead_node)
dlm_revalidate_lvb(dlm, res, dead_node); dlm_revalidate_lvb(dlm, res, dead_node);
if (res->owner == dead_node) { if (res->owner == dead_node) {
if (res->state & DLM_LOCK_RES_DROPPING_REF) { if (res->state & DLM_LOCK_RES_DROPPING_REF) {
mlog(ML_NOTICE, "%s: res %.*s, Skip " mlog(0, "%s:%.*s: owned by "
"recovery as it is being freed\n", "dead node %u, this node was "
dlm->name, res->lockname.len, "dropping its ref when it died. "
res->lockname.name); "continue, dropping the flag.\n",
} else dlm->name, res->lockname.len,
dlm_move_lockres_to_recovery_list(dlm, res->lockname.name, dead_node);
res); }
res->state &= ~DLM_LOCK_RES_DROPPING_REF;
dlm_move_lockres_to_recovery_list(dlm,
res);
} else if (res->owner == dlm->node_num) { } else if (res->owner == dlm->node_num) {
dlm_free_dead_locks(dlm, res, dead_node); dlm_free_dead_locks(dlm, res, dead_node);
__dlm_lockres_calc_usage(dlm, res); __dlm_lockres_calc_usage(dlm, res);


@ -106,7 +106,8 @@ int __dlm_lockres_unused(struct dlm_lock_resource *res)
if (!list_empty(&res->dirty) || res->state & DLM_LOCK_RES_DIRTY) if (!list_empty(&res->dirty) || res->state & DLM_LOCK_RES_DIRTY)
return 0; return 0;
if (res->state & DLM_LOCK_RES_RECOVERING) if (res->state & (DLM_LOCK_RES_RECOVERING|
DLM_LOCK_RES_RECOVERY_WAITING))
return 0; return 0;
/* Another node has this resource with this node as the master */ /* Another node has this resource with this node as the master */
@ -202,6 +203,13 @@ static void dlm_purge_lockres(struct dlm_ctxt *dlm,
dlm->purge_count--; dlm->purge_count--;
} }
if (!master && ret != 0) {
mlog(0, "%s: deref %.*s in progress or master goes down\n",
dlm->name, res->lockname.len, res->lockname.name);
spin_unlock(&res->spinlock);
return;
}
if (!__dlm_lockres_unused(res)) { if (!__dlm_lockres_unused(res)) {
mlog(ML_ERROR, "%s: res %.*s in use after deref\n", mlog(ML_ERROR, "%s: res %.*s in use after deref\n",
dlm->name, res->lockname.len, res->lockname.name); dlm->name, res->lockname.len, res->lockname.name);
@ -700,7 +708,8 @@ static int dlm_thread(void *data)
* dirty for a short while. */ * dirty for a short while. */
BUG_ON(res->state & DLM_LOCK_RES_MIGRATING); BUG_ON(res->state & DLM_LOCK_RES_MIGRATING);
if (res->state & (DLM_LOCK_RES_IN_PROGRESS | if (res->state & (DLM_LOCK_RES_IN_PROGRESS |
DLM_LOCK_RES_RECOVERING)) { DLM_LOCK_RES_RECOVERING |
DLM_LOCK_RES_RECOVERY_WAITING)) {
/* move it to the tail and keep going */ /* move it to the tail and keep going */
res->state &= ~DLM_LOCK_RES_DIRTY; res->state &= ~DLM_LOCK_RES_DIRTY;
spin_unlock(&res->spinlock); spin_unlock(&res->spinlock);


@ -236,6 +236,7 @@ static int ocfs2_osb_dump(struct ocfs2_super *osb, char *buf, int len)
struct ocfs2_recovery_map *rm = osb->recovery_map; struct ocfs2_recovery_map *rm = osb->recovery_map;
struct ocfs2_orphan_scan *os = &osb->osb_orphan_scan; struct ocfs2_orphan_scan *os = &osb->osb_orphan_scan;
int i, out = 0; int i, out = 0;
unsigned long flags;
out += snprintf(buf + out, len - out, out += snprintf(buf + out, len - out,
"%10s => Id: %-s Uuid: %-s Gen: 0x%X Label: %-s\n", "%10s => Id: %-s Uuid: %-s Gen: 0x%X Label: %-s\n",
@ -271,14 +272,14 @@ static int ocfs2_osb_dump(struct ocfs2_super *osb, char *buf, int len)
cconn->cc_version.pv_minor); cconn->cc_version.pv_minor);
} }
spin_lock(&osb->dc_task_lock); spin_lock_irqsave(&osb->dc_task_lock, flags);
out += snprintf(buf + out, len - out, out += snprintf(buf + out, len - out,
"%10s => Pid: %d Count: %lu WakeSeq: %lu " "%10s => Pid: %d Count: %lu WakeSeq: %lu "
"WorkSeq: %lu\n", "DownCnvt", "WorkSeq: %lu\n", "DownCnvt",
(osb->dc_task ? task_pid_nr(osb->dc_task) : -1), (osb->dc_task ? task_pid_nr(osb->dc_task) : -1),
osb->blocked_lock_count, osb->dc_wake_sequence, osb->blocked_lock_count, osb->dc_wake_sequence,
osb->dc_work_sequence); osb->dc_work_sequence);
spin_unlock(&osb->dc_task_lock); spin_unlock_irqrestore(&osb->dc_task_lock, flags);
spin_lock(&osb->osb_lock); spin_lock(&osb->osb_lock);
out += snprintf(buf + out, len - out, "%10s => Pid: %d Nodes:", out += snprintf(buf + out, len - out, "%10s => Pid: %d Nodes:",


@ -1957,7 +1957,6 @@ xfs_vm_set_page_dirty(
loff_t end_offset; loff_t end_offset;
loff_t offset; loff_t offset;
int newly_dirty; int newly_dirty;
struct mem_cgroup *memcg;
if (unlikely(!mapping)) if (unlikely(!mapping))
return !TestSetPageDirty(page); return !TestSetPageDirty(page);
@ -1978,10 +1977,10 @@ xfs_vm_set_page_dirty(
} while (bh != head); } while (bh != head);
} }
/* /*
* Use mem_group_begin_page_stat() to keep PageDirty synchronized with * Lock out page->mem_cgroup migration to keep PageDirty
* per-memcg dirty page counters. * synchronized with per-memcg dirty page counters.
*/ */
memcg = mem_cgroup_begin_page_stat(page); lock_page_memcg(page);
newly_dirty = !TestSetPageDirty(page); newly_dirty = !TestSetPageDirty(page);
spin_unlock(&mapping->private_lock); spin_unlock(&mapping->private_lock);
@ -1992,13 +1991,13 @@ xfs_vm_set_page_dirty(
spin_lock_irqsave(&mapping->tree_lock, flags); spin_lock_irqsave(&mapping->tree_lock, flags);
if (page->mapping) { /* Race with truncate? */ if (page->mapping) { /* Race with truncate? */
WARN_ON_ONCE(!PageUptodate(page)); WARN_ON_ONCE(!PageUptodate(page));
account_page_dirtied(page, mapping, memcg); account_page_dirtied(page, mapping);
radix_tree_tag_set(&mapping->page_tree, radix_tree_tag_set(&mapping->page_tree,
page_index(page), PAGECACHE_TAG_DIRTY); page_index(page), PAGECACHE_TAG_DIRTY);
} }
spin_unlock_irqrestore(&mapping->tree_lock, flags); spin_unlock_irqrestore(&mapping->tree_lock, flags);
} }
mem_cgroup_end_page_stat(memcg); unlock_page_memcg(page);
if (newly_dirty) if (newly_dirty)
__mark_inode_dirty(mapping->host, I_DIRTY_PAGES); __mark_inode_dirty(mapping->host, I_DIRTY_PAGES);
return newly_dirty; return newly_dirty;


@ -11,12 +11,7 @@
#define _LINUX_AUTO_DEV_IOCTL_H #define _LINUX_AUTO_DEV_IOCTL_H
#include <linux/auto_fs.h> #include <linux/auto_fs.h>
#ifdef __KERNEL__
#include <linux/string.h> #include <linux/string.h>
#else
#include <string.h>
#endif /* __KERNEL__ */
#define AUTOFS_DEVICE_NAME "autofs" #define AUTOFS_DEVICE_NAME "autofs"
@ -125,7 +120,6 @@ static inline void init_autofs_dev_ioctl(struct autofs_dev_ioctl *in)
in->ver_minor = AUTOFS_DEV_IOCTL_VERSION_MINOR; in->ver_minor = AUTOFS_DEV_IOCTL_VERSION_MINOR;
in->size = sizeof(struct autofs_dev_ioctl); in->size = sizeof(struct autofs_dev_ioctl);
in->ioctlfd = -1; in->ioctlfd = -1;
return;
} }
/* /*


@ -1,14 +1,10 @@
/* -*- linux-c -*- ------------------------------------------------------- * /*
* * Copyright 1997 Transmeta Corporation - All Rights Reserved
* linux/include/linux/auto_fs.h
*
* Copyright 1997 Transmeta Corporation - All Rights Reserved
* *
* This file is part of the Linux kernel and is made available under * This file is part of the Linux kernel and is made available under
* the terms of the GNU General Public License, version 2, or at your * the terms of the GNU General Public License, version 2, or at your
* option, any later version, incorporated herein by reference. * option, any later version, incorporated herein by reference.
* */
* ----------------------------------------------------------------------- */
#ifndef _LINUX_AUTO_FS_H #ifndef _LINUX_AUTO_FS_H
#define _LINUX_AUTO_FS_H #define _LINUX_AUTO_FS_H


@ -62,10 +62,9 @@ static inline struct dentry *fault_create_debugfs_attr(const char *name,
#endif /* CONFIG_FAULT_INJECTION */ #endif /* CONFIG_FAULT_INJECTION */
#ifdef CONFIG_FAILSLAB #ifdef CONFIG_FAILSLAB
extern bool should_failslab(size_t size, gfp_t gfpflags, unsigned long flags); extern bool should_failslab(struct kmem_cache *s, gfp_t gfpflags);
#else #else
static inline bool should_failslab(size_t size, gfp_t gfpflags, static inline bool should_failslab(struct kmem_cache *s, gfp_t gfpflags)
unsigned long flags)
{ {
return false; return false;
} }
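should_failslab() now takes the kmem_cache itself rather than a size/flags pair. A hypothetical call site in the new style, purely to illustrate the changed contract (this is not the actual slab allocator hook):

#include <linux/fault-inject.h>
#include <linux/slab.h>

static void *checked_cache_alloc(struct kmem_cache *s, gfp_t flags)
{
	/* Fault injection may ask us to pretend this allocation failed. */
	if (should_failslab(s, flags))
		return NULL;

	return kmem_cache_alloc(s, flags);
}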


@ -9,6 +9,11 @@
struct vm_area_struct; struct vm_area_struct;
/*
* In case of changes, please don't forget to update
* include/trace/events/mmflags.h and tools/perf/builtin-kmem.c
*/
/* Plain integer GFP bitmasks. Do not use this directly. */ /* Plain integer GFP bitmasks. Do not use this directly. */
#define ___GFP_DMA 0x01u #define ___GFP_DMA 0x01u
#define ___GFP_HIGHMEM 0x02u #define ___GFP_HIGHMEM 0x02u
@ -48,7 +53,6 @@ struct vm_area_struct;
#define __GFP_DMA ((__force gfp_t)___GFP_DMA) #define __GFP_DMA ((__force gfp_t)___GFP_DMA)
#define __GFP_HIGHMEM ((__force gfp_t)___GFP_HIGHMEM) #define __GFP_HIGHMEM ((__force gfp_t)___GFP_HIGHMEM)
#define __GFP_DMA32 ((__force gfp_t)___GFP_DMA32) #define __GFP_DMA32 ((__force gfp_t)___GFP_DMA32)
#define __GFP_MOVABLE ((__force gfp_t)___GFP_MOVABLE) /* Page is movable */
#define __GFP_MOVABLE ((__force gfp_t)___GFP_MOVABLE) /* ZONE_MOVABLE allowed */ #define __GFP_MOVABLE ((__force gfp_t)___GFP_MOVABLE) /* ZONE_MOVABLE allowed */
#define GFP_ZONEMASK (__GFP_DMA|__GFP_HIGHMEM|__GFP_DMA32|__GFP_MOVABLE) #define GFP_ZONEMASK (__GFP_DMA|__GFP_HIGHMEM|__GFP_DMA32|__GFP_MOVABLE)
@ -515,13 +519,7 @@ void drain_zone_pages(struct zone *zone, struct per_cpu_pages *pcp);
void drain_all_pages(struct zone *zone); void drain_all_pages(struct zone *zone);
void drain_local_pages(struct zone *zone); void drain_local_pages(struct zone *zone);
#ifdef CONFIG_DEFERRED_STRUCT_PAGE_INIT
void page_alloc_init_late(void); void page_alloc_init_late(void);
#else
static inline void page_alloc_init_late(void)
{
}
#endif
/* /*
* gfp_allowed_mask is set to GFP_BOOT_MASK during early boot to restrict what * gfp_allowed_mask is set to GFP_BOOT_MASK during early boot to restrict what


@ -28,6 +28,7 @@
#include <linux/eventfd.h> #include <linux/eventfd.h>
#include <linux/mmzone.h> #include <linux/mmzone.h>
#include <linux/writeback.h> #include <linux/writeback.h>
#include <linux/page-flags.h>
struct mem_cgroup; struct mem_cgroup;
struct page; struct page;
@ -89,6 +90,10 @@ enum mem_cgroup_events_target {
}; };
#ifdef CONFIG_MEMCG #ifdef CONFIG_MEMCG
#define MEM_CGROUP_ID_SHIFT 16
#define MEM_CGROUP_ID_MAX USHRT_MAX
struct mem_cgroup_stat_cpu { struct mem_cgroup_stat_cpu {
long count[MEMCG_NR_STAT]; long count[MEMCG_NR_STAT];
unsigned long events[MEMCG_NR_EVENTS]; unsigned long events[MEMCG_NR_EVENTS];
@ -265,6 +270,11 @@ struct mem_cgroup {
extern struct mem_cgroup *root_mem_cgroup; extern struct mem_cgroup *root_mem_cgroup;
static inline bool mem_cgroup_disabled(void)
{
return !cgroup_subsys_enabled(memory_cgrp_subsys);
}
/** /**
* mem_cgroup_events - count memory events against a cgroup * mem_cgroup_events - count memory events against a cgroup
* @memcg: the memory cgroup * @memcg: the memory cgroup
@ -291,7 +301,7 @@ void mem_cgroup_cancel_charge(struct page *page, struct mem_cgroup *memcg,
void mem_cgroup_uncharge(struct page *page); void mem_cgroup_uncharge(struct page *page);
void mem_cgroup_uncharge_list(struct list_head *page_list); void mem_cgroup_uncharge_list(struct list_head *page_list);
void mem_cgroup_replace_page(struct page *oldpage, struct page *newpage); void mem_cgroup_migrate(struct page *oldpage, struct page *newpage);
struct lruvec *mem_cgroup_zone_lruvec(struct zone *, struct mem_cgroup *); struct lruvec *mem_cgroup_zone_lruvec(struct zone *, struct mem_cgroup *);
struct lruvec *mem_cgroup_page_lruvec(struct page *, struct zone *); struct lruvec *mem_cgroup_page_lruvec(struct page *, struct zone *);
@ -312,6 +322,28 @@ struct mem_cgroup *mem_cgroup_iter(struct mem_cgroup *,
struct mem_cgroup_reclaim_cookie *); struct mem_cgroup_reclaim_cookie *);
void mem_cgroup_iter_break(struct mem_cgroup *, struct mem_cgroup *); void mem_cgroup_iter_break(struct mem_cgroup *, struct mem_cgroup *);
static inline unsigned short mem_cgroup_id(struct mem_cgroup *memcg)
{
if (mem_cgroup_disabled())
return 0;
return memcg->css.id;
}
/**
* mem_cgroup_from_id - look up a memcg from an id
* @id: the id to look up
*
* Caller must hold rcu_read_lock() and use css_tryget() as necessary.
*/
static inline struct mem_cgroup *mem_cgroup_from_id(unsigned short id)
{
struct cgroup_subsys_state *css;
css = css_from_id(id, &memory_cgrp_subsys);
return mem_cgroup_from_css(css);
}
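
For orientation, here is a minimal, hypothetical sketch (not part of this patch) of how the new id helpers pair up: a memcg is recorded somewhere compact as a 16-bit id and later resolved back under RCU. The demo_* names are invented; only mem_cgroup_id(), mem_cgroup_from_id() and css_tryget_online() are real interfaces.

#include <linux/memcontrol.h>
#include <linux/cgroup.h>
#include <linux/rcupdate.h>

/* Hypothetical: remember which memcg owns a page, in 16 bits. */
static unsigned short demo_record_owner(struct page *page)
{
	return mem_cgroup_id(page->mem_cgroup);
}

/* Hypothetical: resolve the id again, taking a reference if still online. */
static struct mem_cgroup *demo_resolve_owner(unsigned short id)
{
	struct mem_cgroup *memcg;

	rcu_read_lock();
	memcg = mem_cgroup_from_id(id);
	if (memcg && !css_tryget_online(&memcg->css))
		memcg = NULL;
	rcu_read_unlock();

	return memcg;
}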
/** /**
* parent_mem_cgroup - find the accounting parent of a memcg * parent_mem_cgroup - find the accounting parent of a memcg
* @memcg: memcg whose parent to find * @memcg: memcg whose parent to find
@ -353,11 +385,6 @@ static inline bool mm_match_cgroup(struct mm_struct *mm,
struct cgroup_subsys_state *mem_cgroup_css_from_page(struct page *page); struct cgroup_subsys_state *mem_cgroup_css_from_page(struct page *page);
ino_t page_cgroup_ino(struct page *page); ino_t page_cgroup_ino(struct page *page);
static inline bool mem_cgroup_disabled(void)
{
return !cgroup_subsys_enabled(memory_cgrp_subsys);
}
static inline bool mem_cgroup_online(struct mem_cgroup *memcg) static inline bool mem_cgroup_online(struct mem_cgroup *memcg)
{ {
if (mem_cgroup_disabled()) if (mem_cgroup_disabled())
@ -429,36 +456,43 @@ bool mem_cgroup_oom_synchronize(bool wait);
extern int do_swap_account; extern int do_swap_account;
#endif #endif
struct mem_cgroup *mem_cgroup_begin_page_stat(struct page *page); void lock_page_memcg(struct page *page);
void mem_cgroup_end_page_stat(struct mem_cgroup *memcg); void unlock_page_memcg(struct page *page);
/** /**
* mem_cgroup_update_page_stat - update page state statistics * mem_cgroup_update_page_stat - update page state statistics
* @memcg: memcg to account against * @page: the page
* @idx: page state item to account * @idx: page state item to account
* @val: number of pages (positive or negative) * @val: number of pages (positive or negative)
* *
* See mem_cgroup_begin_page_stat() for locking requirements. * The @page must be locked or the caller must use lock_page_memcg()
* to prevent double accounting when the page is concurrently being
* moved to another memcg:
*
* lock_page(page) or lock_page_memcg(page)
* if (TestClearPageState(page))
* mem_cgroup_update_page_stat(page, state, -1);
* unlock_page(page) or unlock_page_memcg(page)
*/ */
static inline void mem_cgroup_update_page_stat(struct mem_cgroup *memcg, static inline void mem_cgroup_update_page_stat(struct page *page,
enum mem_cgroup_stat_index idx, int val) enum mem_cgroup_stat_index idx, int val)
{ {
VM_BUG_ON(!rcu_read_lock_held()); VM_BUG_ON(!(rcu_read_lock_held() || PageLocked(page)));
if (memcg) if (page->mem_cgroup)
this_cpu_add(memcg->stat->count[idx], val); this_cpu_add(page->mem_cgroup->stat->count[idx], val);
} }
static inline void mem_cgroup_inc_page_stat(struct mem_cgroup *memcg, static inline void mem_cgroup_inc_page_stat(struct page *page,
enum mem_cgroup_stat_index idx) enum mem_cgroup_stat_index idx)
{ {
mem_cgroup_update_page_stat(memcg, idx, 1); mem_cgroup_update_page_stat(page, idx, 1);
} }
static inline void mem_cgroup_dec_page_stat(struct mem_cgroup *memcg, static inline void mem_cgroup_dec_page_stat(struct page *page,
enum mem_cgroup_stat_index idx) enum mem_cgroup_stat_index idx)
{ {
mem_cgroup_update_page_stat(memcg, idx, -1); mem_cgroup_update_page_stat(page, idx, -1);
} }
unsigned long mem_cgroup_soft_limit_reclaim(struct zone *zone, int order, unsigned long mem_cgroup_soft_limit_reclaim(struct zone *zone, int order,
@ -496,8 +530,17 @@ void mem_cgroup_split_huge_fixup(struct page *head);
#endif #endif
#else /* CONFIG_MEMCG */ #else /* CONFIG_MEMCG */
#define MEM_CGROUP_ID_SHIFT 0
#define MEM_CGROUP_ID_MAX 0
struct mem_cgroup; struct mem_cgroup;
static inline bool mem_cgroup_disabled(void)
{
return true;
}
static inline void mem_cgroup_events(struct mem_cgroup *memcg, static inline void mem_cgroup_events(struct mem_cgroup *memcg,
enum mem_cgroup_events_index idx, enum mem_cgroup_events_index idx,
unsigned int nr) unsigned int nr)
@ -539,7 +582,7 @@ static inline void mem_cgroup_uncharge_list(struct list_head *page_list)
{ {
} }
static inline void mem_cgroup_replace_page(struct page *old, struct page *new) static inline void mem_cgroup_migrate(struct page *old, struct page *new)
{ {
} }
@ -580,9 +623,16 @@ static inline void mem_cgroup_iter_break(struct mem_cgroup *root,
{ {
} }
static inline bool mem_cgroup_disabled(void) static inline unsigned short mem_cgroup_id(struct mem_cgroup *memcg)
{ {
return true; return 0;
}
static inline struct mem_cgroup *mem_cgroup_from_id(unsigned short id)
{
WARN_ON_ONCE(id);
/* XXX: This should always return root_mem_cgroup */
return NULL;
} }
static inline bool mem_cgroup_online(struct mem_cgroup *memcg) static inline bool mem_cgroup_online(struct mem_cgroup *memcg)
@ -613,12 +663,11 @@ mem_cgroup_print_oom_info(struct mem_cgroup *memcg, struct task_struct *p)
{ {
} }
static inline struct mem_cgroup *mem_cgroup_begin_page_stat(struct page *page) static inline void lock_page_memcg(struct page *page)
{ {
return NULL;
} }
static inline void mem_cgroup_end_page_stat(struct mem_cgroup *memcg) static inline void unlock_page_memcg(struct page *page)
{ {
} }
@ -644,12 +693,12 @@ static inline bool mem_cgroup_oom_synchronize(bool wait)
return false; return false;
} }
static inline void mem_cgroup_inc_page_stat(struct mem_cgroup *memcg, static inline void mem_cgroup_inc_page_stat(struct page *page,
enum mem_cgroup_stat_index idx) enum mem_cgroup_stat_index idx)
{ {
} }
static inline void mem_cgroup_dec_page_stat(struct mem_cgroup *memcg, static inline void mem_cgroup_dec_page_stat(struct page *page,
enum mem_cgroup_stat_index idx) enum mem_cgroup_stat_index idx)
{ {
} }
@ -765,7 +814,7 @@ int __memcg_kmem_charge(struct page *page, gfp_t gfp, int order);
void __memcg_kmem_uncharge(struct page *page, int order); void __memcg_kmem_uncharge(struct page *page, int order);
/* /*
* helper for acessing a memcg's index. It will be used as an index in the * helper for accessing a memcg's index. It will be used as an index in the
* child cache array in kmem_cache, and also to derive its name. This function * child cache array in kmem_cache, and also to derive its name. This function
* will return -1 when this is not a kmem-limited memcg. * will return -1 when this is not a kmem-limited memcg.
*/ */

View file

@ -109,6 +109,9 @@ extern void unregister_memory_notifier(struct notifier_block *nb);
extern int register_memory_isolate_notifier(struct notifier_block *nb); extern int register_memory_isolate_notifier(struct notifier_block *nb);
extern void unregister_memory_isolate_notifier(struct notifier_block *nb); extern void unregister_memory_isolate_notifier(struct notifier_block *nb);
extern int register_new_memory(int, struct mem_section *); extern int register_new_memory(int, struct mem_section *);
extern int memory_block_change_state(struct memory_block *mem,
unsigned long to_state,
unsigned long from_state_req);
#ifdef CONFIG_MEMORY_HOTREMOVE #ifdef CONFIG_MEMORY_HOTREMOVE
extern int unregister_memory_section(struct mem_section *); extern int unregister_memory_section(struct mem_section *);
#endif #endif

View file

@ -99,6 +99,8 @@ extern void __online_page_free(struct page *page);
extern int try_online_node(int nid); extern int try_online_node(int nid);
extern bool memhp_auto_online;
#ifdef CONFIG_MEMORY_HOTREMOVE #ifdef CONFIG_MEMORY_HOTREMOVE
extern bool is_pageblock_removable_nolock(struct page *page); extern bool is_pageblock_removable_nolock(struct page *page);
extern int arch_remove_memory(u64 start, u64 size); extern int arch_remove_memory(u64 start, u64 size);
@ -196,6 +198,9 @@ void put_online_mems(void);
void mem_hotplug_begin(void); void mem_hotplug_begin(void);
void mem_hotplug_done(void); void mem_hotplug_done(void);
extern void set_zone_contiguous(struct zone *zone);
extern void clear_zone_contiguous(struct zone *zone);
#else /* ! CONFIG_MEMORY_HOTPLUG */ #else /* ! CONFIG_MEMORY_HOTPLUG */
/* /*
* Stub functions for when hotplug is off * Stub functions for when hotplug is off
@ -267,7 +272,7 @@ static inline void remove_memory(int nid, u64 start, u64 size) {}
extern int walk_memory_range(unsigned long start_pfn, unsigned long end_pfn, extern int walk_memory_range(unsigned long start_pfn, unsigned long end_pfn,
void *arg, int (*func)(struct memory_block *, void *)); void *arg, int (*func)(struct memory_block *, void *));
extern int add_memory(int nid, u64 start, u64 size); extern int add_memory(int nid, u64 start, u64 size);
extern int add_memory_resource(int nid, struct resource *resource); extern int add_memory_resource(int nid, struct resource *resource, bool online);
extern int zone_for_memory(int nid, u64 start, u64 size, int zone_default, extern int zone_for_memory(int nid, u64 start, u64 size, int zone_default,
bool for_device); bool for_device);
extern int arch_add_memory(int nid, u64 start, u64 size, bool for_device); extern int arch_add_memory(int nid, u64 start, u64 size, bool for_device);
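
As a rough illustration (not taken from this series), a hotplug path using the changed add_memory_resource() signature could pass the new global policy flag straight through; probe_new_dimm() is invented for the example.

#include <linux/memory_hotplug.h>
#include <linux/ioport.h>

/* Hypothetical driver hook: register a newly discovered memory resource. */
static int probe_new_dimm(int nid, struct resource *res)
{
	/* online the new block immediately only if the auto-online policy asks for it */
	return add_memory_resource(nid, res, memhp_auto_online);
}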

View file

@ -23,9 +23,13 @@ enum migrate_reason {
MR_SYSCALL, /* also applies to cpusets */ MR_SYSCALL, /* also applies to cpusets */
MR_MEMPOLICY_MBIND, MR_MEMPOLICY_MBIND,
MR_NUMA_MISPLACED, MR_NUMA_MISPLACED,
MR_CMA MR_CMA,
MR_TYPES
}; };
/* In mm/debug.c; also keep sync with include/trace/events/migrate.h */
extern char *migrate_reason_names[MR_TYPES];
#ifdef CONFIG_MIGRATION #ifdef CONFIG_MIGRATION
extern void putback_movable_pages(struct list_head *l); extern void putback_movable_pages(struct list_head *l);

View file

@ -905,20 +905,11 @@ static inline struct mem_cgroup *page_memcg(struct page *page)
{ {
return page->mem_cgroup; return page->mem_cgroup;
} }
static inline void set_page_memcg(struct page *page, struct mem_cgroup *memcg)
{
page->mem_cgroup = memcg;
}
#else #else
static inline struct mem_cgroup *page_memcg(struct page *page) static inline struct mem_cgroup *page_memcg(struct page *page)
{ {
return NULL; return NULL;
} }
static inline void set_page_memcg(struct page *page, struct mem_cgroup *memcg)
{
}
#endif #endif
/* /*
@ -1300,10 +1291,9 @@ int __set_page_dirty_nobuffers(struct page *page);
int __set_page_dirty_no_writeback(struct page *page); int __set_page_dirty_no_writeback(struct page *page);
int redirty_page_for_writepage(struct writeback_control *wbc, int redirty_page_for_writepage(struct writeback_control *wbc,
struct page *page); struct page *page);
void account_page_dirtied(struct page *page, struct address_space *mapping, void account_page_dirtied(struct page *page, struct address_space *mapping);
struct mem_cgroup *memcg);
void account_page_cleaned(struct page *page, struct address_space *mapping, void account_page_cleaned(struct page *page, struct address_space *mapping,
struct mem_cgroup *memcg, struct bdi_writeback *wb); struct bdi_writeback *wb);
int set_page_dirty(struct page *page); int set_page_dirty(struct page *page);
int set_page_dirty_lock(struct page *page); int set_page_dirty_lock(struct page *page);
void cancel_dirty_page(struct page *page); void cancel_dirty_page(struct page *page);
@ -2178,6 +2168,17 @@ extern int apply_to_page_range(struct mm_struct *mm, unsigned long address,
unsigned long size, pte_fn_t fn, void *data); unsigned long size, pte_fn_t fn, void *data);
#ifdef CONFIG_PAGE_POISONING
extern bool page_poisoning_enabled(void);
extern void kernel_poison_pages(struct page *page, int numpages, int enable);
extern bool page_is_poisoned(struct page *page);
#else
static inline bool page_poisoning_enabled(void) { return false; }
static inline void kernel_poison_pages(struct page *page, int numpages,
int enable) { }
static inline bool page_is_poisoned(struct page *page) { return false; }
#endif
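
A rough sketch of how these hooks are meant to be driven from an allocator path, assuming the convention that enable=1 means "unpoison and check on allocation" and enable=0 means "poison on free"; the demo_* wrappers are invented, the real callers live in mm/page_alloc.c.

/* Hypothetical wrappers around the poisoning hooks declared above. */
static void demo_poison_on_free(struct page *page, unsigned int order)
{
	if (page_poisoning_enabled())
		kernel_poison_pages(page, 1 << order, 0);	/* fill with PAGE_POISON */
}

static void demo_check_on_alloc(struct page *page, unsigned int order)
{
	if (page_poisoning_enabled())
		kernel_poison_pages(page, 1 << order, 1);	/* verify pattern, then clear */
}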
#ifdef CONFIG_DEBUG_PAGEALLOC #ifdef CONFIG_DEBUG_PAGEALLOC
extern bool _debug_pagealloc_enabled; extern bool _debug_pagealloc_enabled;
extern void __kernel_map_pages(struct page *page, int numpages, int enable); extern void __kernel_map_pages(struct page *page, int numpages, int enable);
@ -2197,14 +2198,18 @@ kernel_map_pages(struct page *page, int numpages, int enable)
} }
#ifdef CONFIG_HIBERNATION #ifdef CONFIG_HIBERNATION
extern bool kernel_page_present(struct page *page); extern bool kernel_page_present(struct page *page);
#endif /* CONFIG_HIBERNATION */ #endif /* CONFIG_HIBERNATION */
#else #else /* CONFIG_DEBUG_PAGEALLOC */
static inline void static inline void
kernel_map_pages(struct page *page, int numpages, int enable) {} kernel_map_pages(struct page *page, int numpages, int enable) {}
#ifdef CONFIG_HIBERNATION #ifdef CONFIG_HIBERNATION
static inline bool kernel_page_present(struct page *page) { return true; } static inline bool kernel_page_present(struct page *page) { return true; }
#endif /* CONFIG_HIBERNATION */ #endif /* CONFIG_HIBERNATION */
#endif static inline bool debug_pagealloc_enabled(void)
{
return false;
}
#endif /* CONFIG_DEBUG_PAGEALLOC */
#ifdef __HAVE_ARCH_GATE_AREA #ifdef __HAVE_ARCH_GATE_AREA
extern struct vm_area_struct *get_gate_vma(struct mm_struct *mm); extern struct vm_area_struct *get_gate_vma(struct mm_struct *mm);

View file

@ -9,8 +9,7 @@ struct vm_area_struct;
struct mm_struct; struct mm_struct;
extern void dump_page(struct page *page, const char *reason); extern void dump_page(struct page *page, const char *reason);
extern void dump_page_badflags(struct page *page, const char *reason, extern void __dump_page(struct page *page, const char *reason);
unsigned long badflags);
void dump_vma(const struct vm_area_struct *vma); void dump_vma(const struct vm_area_struct *vma);
void dump_mm(const struct mm_struct *mm); void dump_mm(const struct mm_struct *mm);

View file

@ -63,6 +63,9 @@ enum {
MIGRATE_TYPES MIGRATE_TYPES
}; };
/* In mm/page_alloc.c; keep in sync also with show_migration_types() there */
extern char * const migratetype_names[MIGRATE_TYPES];
#ifdef CONFIG_CMA #ifdef CONFIG_CMA
# define is_migrate_cma(migratetype) unlikely((migratetype) == MIGRATE_CMA) # define is_migrate_cma(migratetype) unlikely((migratetype) == MIGRATE_CMA)
#else #else
@ -209,10 +212,12 @@ struct zone_reclaim_stat {
}; };
struct lruvec { struct lruvec {
struct list_head lists[NR_LRU_LISTS]; struct list_head lists[NR_LRU_LISTS];
struct zone_reclaim_stat reclaim_stat; struct zone_reclaim_stat reclaim_stat;
/* Evictions & activations on the inactive file list */
atomic_long_t inactive_age;
#ifdef CONFIG_MEMCG #ifdef CONFIG_MEMCG
struct zone *zone; struct zone *zone;
#endif #endif
}; };
@ -487,9 +492,6 @@ struct zone {
spinlock_t lru_lock; spinlock_t lru_lock;
struct lruvec lruvec; struct lruvec lruvec;
/* Evictions & activations on the inactive file list */
atomic_long_t inactive_age;
/* /*
* When free pages are below this point, additional steps are taken * When free pages are below this point, additional steps are taken
* when reading the number of free pages to avoid per-cpu counter * when reading the number of free pages to avoid per-cpu counter
@ -520,6 +522,8 @@ struct zone {
bool compact_blockskip_flush; bool compact_blockskip_flush;
#endif #endif
bool contiguous;
ZONE_PADDING(_pad3_) ZONE_PADDING(_pad3_)
/* Zone statistics */ /* Zone statistics */
atomic_long_t vm_stat[NR_VM_ZONE_STAT_ITEMS]; atomic_long_t vm_stat[NR_VM_ZONE_STAT_ITEMS];
@ -758,6 +762,8 @@ static inline struct zone *lruvec_zone(struct lruvec *lruvec)
#endif #endif
} }
extern unsigned long lruvec_lru_size(struct lruvec *lruvec, enum lru_list lru);
#ifdef CONFIG_HAVE_MEMORY_PRESENT #ifdef CONFIG_HAVE_MEMORY_PRESENT
void memory_present(int nid, unsigned long start, unsigned long end); void memory_present(int nid, unsigned long start, unsigned long end);
#else #else

View file

@ -45,6 +45,7 @@ struct page_ext {
unsigned int order; unsigned int order;
gfp_t gfp_mask; gfp_t gfp_mask;
unsigned int nr_entries; unsigned int nr_entries;
int last_migrate_reason;
unsigned long trace_entries[8]; unsigned long trace_entries[8];
#endif #endif
}; };

View file

@ -1,38 +1,54 @@
#ifndef __LINUX_PAGE_OWNER_H #ifndef __LINUX_PAGE_OWNER_H
#define __LINUX_PAGE_OWNER_H #define __LINUX_PAGE_OWNER_H
#include <linux/jump_label.h>
#ifdef CONFIG_PAGE_OWNER #ifdef CONFIG_PAGE_OWNER
extern bool page_owner_inited; extern struct static_key_false page_owner_inited;
extern struct page_ext_operations page_owner_ops; extern struct page_ext_operations page_owner_ops;
extern void __reset_page_owner(struct page *page, unsigned int order); extern void __reset_page_owner(struct page *page, unsigned int order);
extern void __set_page_owner(struct page *page, extern void __set_page_owner(struct page *page,
unsigned int order, gfp_t gfp_mask); unsigned int order, gfp_t gfp_mask);
extern gfp_t __get_page_owner_gfp(struct page *page); extern gfp_t __get_page_owner_gfp(struct page *page);
extern void __copy_page_owner(struct page *oldpage, struct page *newpage);
extern void __set_page_owner_migrate_reason(struct page *page, int reason);
extern void __dump_page_owner(struct page *page);
static inline void reset_page_owner(struct page *page, unsigned int order) static inline void reset_page_owner(struct page *page, unsigned int order)
{ {
if (likely(!page_owner_inited)) if (static_branch_unlikely(&page_owner_inited))
return; __reset_page_owner(page, order);
__reset_page_owner(page, order);
} }
static inline void set_page_owner(struct page *page, static inline void set_page_owner(struct page *page,
unsigned int order, gfp_t gfp_mask) unsigned int order, gfp_t gfp_mask)
{ {
if (likely(!page_owner_inited)) if (static_branch_unlikely(&page_owner_inited))
return; __set_page_owner(page, order, gfp_mask);
__set_page_owner(page, order, gfp_mask);
} }
static inline gfp_t get_page_owner_gfp(struct page *page) static inline gfp_t get_page_owner_gfp(struct page *page)
{ {
if (likely(!page_owner_inited)) if (static_branch_unlikely(&page_owner_inited))
return __get_page_owner_gfp(page);
else
return 0; return 0;
}
return __get_page_owner_gfp(page); static inline void copy_page_owner(struct page *oldpage, struct page *newpage)
{
if (static_branch_unlikely(&page_owner_inited))
__copy_page_owner(oldpage, newpage);
}
static inline void set_page_owner_migrate_reason(struct page *page, int reason)
{
if (static_branch_unlikely(&page_owner_inited))
__set_page_owner_migrate_reason(page, reason);
}
static inline void dump_page_owner(struct page *page)
{
if (static_branch_unlikely(&page_owner_inited))
__dump_page_owner(page);
} }
#else #else
static inline void reset_page_owner(struct page *page, unsigned int order) static inline void reset_page_owner(struct page *page, unsigned int order)
@ -46,6 +62,14 @@ static inline gfp_t get_page_owner_gfp(struct page *page)
{ {
return 0; return 0;
} }
static inline void copy_page_owner(struct page *oldpage, struct page *newpage)
{
}
static inline void set_page_owner_migrate_reason(struct page *page, int reason)
{
}
static inline void dump_page_owner(struct page *page)
{
}
#endif /* CONFIG_PAGE_OWNER */ #endif /* CONFIG_PAGE_OWNER */
#endif /* __LINUX_PAGE_OWNER_H */ #endif /* __LINUX_PAGE_OWNER_H */
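
The wrappers above switch from testing a plain bool to the jump-label API; as a generic, hedged illustration of that pattern (names invented, this is not the page owner code itself):

#include <linux/jump_label.h>
#include <linux/printk.h>
#include <linux/init.h>

static DEFINE_STATIC_KEY_FALSE(demo_feature_key);

static inline void demo_hot_path(void)
{
	/* compiles to a patched NOP until the key is enabled, so the off case is nearly free */
	if (static_branch_unlikely(&demo_feature_key))
		pr_info("demo feature active\n");
}

static void __init demo_feature_init(void)
{
	static_branch_enable(&demo_feature_key);
}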

View file

@ -663,8 +663,7 @@ int add_to_page_cache_locked(struct page *page, struct address_space *mapping,
int add_to_page_cache_lru(struct page *page, struct address_space *mapping, int add_to_page_cache_lru(struct page *page, struct address_space *mapping,
pgoff_t index, gfp_t gfp_mask); pgoff_t index, gfp_t gfp_mask);
extern void delete_from_page_cache(struct page *page); extern void delete_from_page_cache(struct page *page);
extern void __delete_from_page_cache(struct page *page, void *shadow, extern void __delete_from_page_cache(struct page *page, void *shadow);
struct mem_cgroup *memcg);
int replace_page_cache_page(struct page *old, struct page *new, gfp_t gfp_mask); int replace_page_cache_page(struct page *old, struct page *new, gfp_t gfp_mask);
/* /*

View file

@ -30,7 +30,11 @@
#define TIMER_ENTRY_STATIC ((void *) 0x300 + POISON_POINTER_DELTA) #define TIMER_ENTRY_STATIC ((void *) 0x300 + POISON_POINTER_DELTA)
/********** mm/debug-pagealloc.c **********/ /********** mm/debug-pagealloc.c **********/
#ifdef CONFIG_PAGE_POISONING_ZERO
#define PAGE_POISON 0x00
#else
#define PAGE_POISON 0xaa #define PAGE_POISON 0xaa
#endif
/********** mm/page_alloc.c ************/ /********** mm/page_alloc.c ************/

View file

@ -20,7 +20,7 @@
* Flags to pass to kmem_cache_create(). * Flags to pass to kmem_cache_create().
* The ones marked DEBUG are only valid if CONFIG_DEBUG_SLAB is set. * The ones marked DEBUG are only valid if CONFIG_DEBUG_SLAB is set.
*/ */
#define SLAB_DEBUG_FREE 0x00000100UL /* DEBUG: Perform (expensive) checks on free */ #define SLAB_CONSISTENCY_CHECKS 0x00000100UL /* DEBUG: Perform (expensive) checks on alloc/free */
#define SLAB_RED_ZONE 0x00000400UL /* DEBUG: Red zone objs in a cache */ #define SLAB_RED_ZONE 0x00000400UL /* DEBUG: Red zone objs in a cache */
#define SLAB_POISON 0x00000800UL /* DEBUG: Poison objects */ #define SLAB_POISON 0x00000800UL /* DEBUG: Poison objects */
#define SLAB_HWCACHE_ALIGN 0x00002000UL /* Align objs on cache lines */ #define SLAB_HWCACHE_ALIGN 0x00002000UL /* Align objs on cache lines */
@ -314,7 +314,7 @@ void *kmem_cache_alloc(struct kmem_cache *, gfp_t flags) __assume_slab_alignment
void kmem_cache_free(struct kmem_cache *, void *); void kmem_cache_free(struct kmem_cache *, void *);
/* /*
* Bulk allocation and freeing operations. These are accellerated in an * Bulk allocation and freeing operations. These are accelerated in an
* allocator specific way to avoid taking locks repeatedly or building * allocator specific way to avoid taking locks repeatedly or building
* metadata structures unnecessarily. * metadata structures unnecessarily.
* *
@ -323,6 +323,15 @@ void kmem_cache_free(struct kmem_cache *, void *);
void kmem_cache_free_bulk(struct kmem_cache *, size_t, void **); void kmem_cache_free_bulk(struct kmem_cache *, size_t, void **);
int kmem_cache_alloc_bulk(struct kmem_cache *, gfp_t, size_t, void **); int kmem_cache_alloc_bulk(struct kmem_cache *, gfp_t, size_t, void **);
/*
* Caller must not use kfree_bulk() on memory not originally allocated
* by kmalloc(), because the SLOB allocator cannot handle this.
*/
static __always_inline void kfree_bulk(size_t size, void **p)
{
kmem_cache_free_bulk(NULL, size, p);
}
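
A small, hypothetical usage sketch of the new kfree_bulk() helper (demo function and sizes invented): allocate a handful of kmalloc() buffers and release them in a single call.

#include <linux/slab.h>
#include <linux/errno.h>

#define DEMO_NOBJ 16

static int demo_bulk_free(void)
{
	void *objs[DEMO_NOBJ];
	int i;

	for (i = 0; i < DEMO_NOBJ; i++) {
		objs[i] = kmalloc(64, GFP_KERNEL);
		if (!objs[i]) {
			kfree_bulk(i, objs);	/* release what was allocated so far */
			return -ENOMEM;
		}
	}

	/* ... use the buffers ... */

	kfree_bulk(DEMO_NOBJ, objs);	/* one call instead of DEMO_NOBJ kfree() calls */
	return 0;
}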
#ifdef CONFIG_NUMA #ifdef CONFIG_NUMA
void *__kmalloc_node(size_t size, gfp_t flags, int node) __assume_kmalloc_alignment; void *__kmalloc_node(size_t size, gfp_t flags, int node) __assume_kmalloc_alignment;
void *kmem_cache_alloc_node(struct kmem_cache *, gfp_t flags, int node) __assume_slab_alignment; void *kmem_cache_alloc_node(struct kmem_cache *, gfp_t flags, int node) __assume_slab_alignment;

View file

@ -60,6 +60,9 @@ struct kmem_cache {
atomic_t allocmiss; atomic_t allocmiss;
atomic_t freehit; atomic_t freehit;
atomic_t freemiss; atomic_t freemiss;
#ifdef CONFIG_DEBUG_SLAB_LEAK
atomic_t store_user_clean;
#endif
/* /*
* If debugging is enabled, then the allocator can add additional * If debugging is enabled, then the allocator can add additional

View file

@ -81,6 +81,7 @@ struct kmem_cache {
int reserved; /* Reserved bytes at the end of slabs */ int reserved; /* Reserved bytes at the end of slabs */
const char *name; /* Name (only for display!) */ const char *name; /* Name (only for display!) */
struct list_head list; /* List of slab caches */ struct list_head list; /* List of slab caches */
int red_left_pad; /* Left redzone padding size */
#ifdef CONFIG_SYSFS #ifdef CONFIG_SYSFS
struct kobject kobj; /* For sysfs */ struct kobject kobj; /* For sysfs */
#endif #endif

View file

@ -15,16 +15,6 @@ struct tracer;
struct dentry; struct dentry;
struct bpf_prog; struct bpf_prog;
struct trace_print_flags {
unsigned long mask;
const char *name;
};
struct trace_print_flags_u64 {
unsigned long long mask;
const char *name;
};
const char *trace_print_flags_seq(struct trace_seq *p, const char *delim, const char *trace_print_flags_seq(struct trace_seq *p, const char *delim,
unsigned long flags, unsigned long flags,
const struct trace_print_flags *flag_array); const struct trace_print_flags *flag_array);

View file

@ -3,13 +3,23 @@
/* /*
* File can be included directly by headers who only want to access * File can be included directly by headers who only want to access
* tracepoint->key to guard out of line trace calls. Otherwise * tracepoint->key to guard out of line trace calls, or the definition of
* linux/tracepoint.h should be used. * trace_print_flags{_u64}. Otherwise linux/tracepoint.h should be used.
*/ */
#include <linux/atomic.h> #include <linux/atomic.h>
#include <linux/static_key.h> #include <linux/static_key.h>
struct trace_print_flags {
unsigned long mask;
const char *name;
};
struct trace_print_flags_u64 {
unsigned long long mask;
const char *name;
};
struct tracepoint_func { struct tracepoint_func {
void *func; void *func;
void *data; void *data;

View file

@ -6,7 +6,7 @@
#include <linux/writeback.h> #include <linux/writeback.h>
#include <linux/tracepoint.h> #include <linux/tracepoint.h>
#include <trace/events/gfpflags.h> #include <trace/events/mmflags.h>
struct btrfs_root; struct btrfs_root;
struct btrfs_fs_info; struct btrfs_fs_info;

View file

@ -7,7 +7,7 @@
#include <linux/types.h> #include <linux/types.h>
#include <linux/list.h> #include <linux/list.h>
#include <linux/tracepoint.h> #include <linux/tracepoint.h>
#include <trace/events/gfpflags.h> #include <trace/events/mmflags.h>
#define COMPACTION_STATUS \ #define COMPACTION_STATUS \
EM( COMPACT_DEFERRED, "deferred") \ EM( COMPACT_DEFERRED, "deferred") \

View file

@ -1,43 +0,0 @@
/*
* The order of these masks is important. Matching masks will be seen
* first and the left over flags will end up showing by themselves.
*
* For example, if we have GFP_KERNEL before GFP_USER we will get:
*
* GFP_KERNEL|GFP_HARDWALL
*
* Thus most bits set go first.
*/
#define show_gfp_flags(flags) \
(flags) ? __print_flags(flags, "|", \
{(unsigned long)GFP_TRANSHUGE, "GFP_TRANSHUGE"}, \
{(unsigned long)GFP_HIGHUSER_MOVABLE, "GFP_HIGHUSER_MOVABLE"}, \
{(unsigned long)GFP_HIGHUSER, "GFP_HIGHUSER"}, \
{(unsigned long)GFP_USER, "GFP_USER"}, \
{(unsigned long)GFP_TEMPORARY, "GFP_TEMPORARY"}, \
{(unsigned long)GFP_KERNEL, "GFP_KERNEL"}, \
{(unsigned long)GFP_NOFS, "GFP_NOFS"}, \
{(unsigned long)GFP_ATOMIC, "GFP_ATOMIC"}, \
{(unsigned long)GFP_NOIO, "GFP_NOIO"}, \
{(unsigned long)__GFP_HIGH, "GFP_HIGH"}, \
{(unsigned long)__GFP_ATOMIC, "GFP_ATOMIC"}, \
{(unsigned long)__GFP_IO, "GFP_IO"}, \
{(unsigned long)__GFP_COLD, "GFP_COLD"}, \
{(unsigned long)__GFP_NOWARN, "GFP_NOWARN"}, \
{(unsigned long)__GFP_REPEAT, "GFP_REPEAT"}, \
{(unsigned long)__GFP_NOFAIL, "GFP_NOFAIL"}, \
{(unsigned long)__GFP_NORETRY, "GFP_NORETRY"}, \
{(unsigned long)__GFP_COMP, "GFP_COMP"}, \
{(unsigned long)__GFP_ZERO, "GFP_ZERO"}, \
{(unsigned long)__GFP_NOMEMALLOC, "GFP_NOMEMALLOC"}, \
{(unsigned long)__GFP_MEMALLOC, "GFP_MEMALLOC"}, \
{(unsigned long)__GFP_HARDWALL, "GFP_HARDWALL"}, \
{(unsigned long)__GFP_THISNODE, "GFP_THISNODE"}, \
{(unsigned long)__GFP_RECLAIMABLE, "GFP_RECLAIMABLE"}, \
{(unsigned long)__GFP_MOVABLE, "GFP_MOVABLE"}, \
{(unsigned long)__GFP_NOTRACK, "GFP_NOTRACK"}, \
{(unsigned long)__GFP_DIRECT_RECLAIM, "GFP_DIRECT_RECLAIM"}, \
{(unsigned long)__GFP_KSWAPD_RECLAIM, "GFP_KSWAPD_RECLAIM"}, \
{(unsigned long)__GFP_OTHER_NODE, "GFP_OTHER_NODE"} \
) : "GFP_NOWAIT"

View file

@ -6,8 +6,6 @@
#include <linux/tracepoint.h> #include <linux/tracepoint.h>
#include <trace/events/gfpflags.h>
#define SCAN_STATUS \ #define SCAN_STATUS \
EM( SCAN_FAIL, "failed") \ EM( SCAN_FAIL, "failed") \
EM( SCAN_SUCCEED, "succeeded") \ EM( SCAN_SUCCEED, "succeeded") \

View file

@ -6,7 +6,7 @@
#include <linux/types.h> #include <linux/types.h>
#include <linux/tracepoint.h> #include <linux/tracepoint.h>
#include <trace/events/gfpflags.h> #include <trace/events/mmflags.h>
DECLARE_EVENT_CLASS(kmem_alloc, DECLARE_EVENT_CLASS(kmem_alloc,

View file

@ -0,0 +1,164 @@
/*
* The order of these masks is important. Matching masks will be seen
* first and the left over flags will end up showing by themselves.
*
* For example, if we have GFP_KERNEL before GFP_USER we will get:
*
* GFP_KERNEL|GFP_HARDWALL
*
* Thus most bits set go first.
*/
#define __def_gfpflag_names \
{(unsigned long)GFP_TRANSHUGE, "GFP_TRANSHUGE"}, \
{(unsigned long)GFP_HIGHUSER_MOVABLE, "GFP_HIGHUSER_MOVABLE"},\
{(unsigned long)GFP_HIGHUSER, "GFP_HIGHUSER"}, \
{(unsigned long)GFP_USER, "GFP_USER"}, \
{(unsigned long)GFP_TEMPORARY, "GFP_TEMPORARY"}, \
{(unsigned long)GFP_KERNEL_ACCOUNT, "GFP_KERNEL_ACCOUNT"}, \
{(unsigned long)GFP_KERNEL, "GFP_KERNEL"}, \
{(unsigned long)GFP_NOFS, "GFP_NOFS"}, \
{(unsigned long)GFP_ATOMIC, "GFP_ATOMIC"}, \
{(unsigned long)GFP_NOIO, "GFP_NOIO"}, \
{(unsigned long)GFP_NOWAIT, "GFP_NOWAIT"}, \
{(unsigned long)GFP_DMA, "GFP_DMA"}, \
{(unsigned long)__GFP_HIGHMEM, "__GFP_HIGHMEM"}, \
{(unsigned long)GFP_DMA32, "GFP_DMA32"}, \
{(unsigned long)__GFP_HIGH, "__GFP_HIGH"}, \
{(unsigned long)__GFP_ATOMIC, "__GFP_ATOMIC"}, \
{(unsigned long)__GFP_IO, "__GFP_IO"}, \
{(unsigned long)__GFP_FS, "__GFP_FS"}, \
{(unsigned long)__GFP_COLD, "__GFP_COLD"}, \
{(unsigned long)__GFP_NOWARN, "__GFP_NOWARN"}, \
{(unsigned long)__GFP_REPEAT, "__GFP_REPEAT"}, \
{(unsigned long)__GFP_NOFAIL, "__GFP_NOFAIL"}, \
{(unsigned long)__GFP_NORETRY, "__GFP_NORETRY"}, \
{(unsigned long)__GFP_COMP, "__GFP_COMP"}, \
{(unsigned long)__GFP_ZERO, "__GFP_ZERO"}, \
{(unsigned long)__GFP_NOMEMALLOC, "__GFP_NOMEMALLOC"}, \
{(unsigned long)__GFP_MEMALLOC, "__GFP_MEMALLOC"}, \
{(unsigned long)__GFP_HARDWALL, "__GFP_HARDWALL"}, \
{(unsigned long)__GFP_THISNODE, "__GFP_THISNODE"}, \
{(unsigned long)__GFP_RECLAIMABLE, "__GFP_RECLAIMABLE"}, \
{(unsigned long)__GFP_MOVABLE, "__GFP_MOVABLE"}, \
{(unsigned long)__GFP_ACCOUNT, "__GFP_ACCOUNT"}, \
{(unsigned long)__GFP_NOTRACK, "__GFP_NOTRACK"}, \
{(unsigned long)__GFP_WRITE, "__GFP_WRITE"}, \
{(unsigned long)__GFP_RECLAIM, "__GFP_RECLAIM"}, \
{(unsigned long)__GFP_DIRECT_RECLAIM, "__GFP_DIRECT_RECLAIM"},\
{(unsigned long)__GFP_KSWAPD_RECLAIM, "__GFP_KSWAPD_RECLAIM"},\
{(unsigned long)__GFP_OTHER_NODE, "__GFP_OTHER_NODE"} \
#define show_gfp_flags(flags) \
(flags) ? __print_flags(flags, "|", \
__def_gfpflag_names \
) : "none"
#ifdef CONFIG_MMU
#define IF_HAVE_PG_MLOCK(flag,string) ,{1UL << flag, string}
#else
#define IF_HAVE_PG_MLOCK(flag,string)
#endif
#ifdef CONFIG_ARCH_USES_PG_UNCACHED
#define IF_HAVE_PG_UNCACHED(flag,string) ,{1UL << flag, string}
#else
#define IF_HAVE_PG_UNCACHED(flag,string)
#endif
#ifdef CONFIG_MEMORY_FAILURE
#define IF_HAVE_PG_HWPOISON(flag,string) ,{1UL << flag, string}
#else
#define IF_HAVE_PG_HWPOISON(flag,string)
#endif
#if defined(CONFIG_IDLE_PAGE_TRACKING) && defined(CONFIG_64BIT)
#define IF_HAVE_PG_IDLE(flag,string) ,{1UL << flag, string}
#else
#define IF_HAVE_PG_IDLE(flag,string)
#endif
#define __def_pageflag_names \
{1UL << PG_locked, "locked" }, \
{1UL << PG_error, "error" }, \
{1UL << PG_referenced, "referenced" }, \
{1UL << PG_uptodate, "uptodate" }, \
{1UL << PG_dirty, "dirty" }, \
{1UL << PG_lru, "lru" }, \
{1UL << PG_active, "active" }, \
{1UL << PG_slab, "slab" }, \
{1UL << PG_owner_priv_1, "owner_priv_1" }, \
{1UL << PG_arch_1, "arch_1" }, \
{1UL << PG_reserved, "reserved" }, \
{1UL << PG_private, "private" }, \
{1UL << PG_private_2, "private_2" }, \
{1UL << PG_writeback, "writeback" }, \
{1UL << PG_head, "head" }, \
{1UL << PG_swapcache, "swapcache" }, \
{1UL << PG_mappedtodisk, "mappedtodisk" }, \
{1UL << PG_reclaim, "reclaim" }, \
{1UL << PG_swapbacked, "swapbacked" }, \
{1UL << PG_unevictable, "unevictable" } \
IF_HAVE_PG_MLOCK(PG_mlocked, "mlocked" ) \
IF_HAVE_PG_UNCACHED(PG_uncached, "uncached" ) \
IF_HAVE_PG_HWPOISON(PG_hwpoison, "hwpoison" ) \
IF_HAVE_PG_IDLE(PG_young, "young" ) \
IF_HAVE_PG_IDLE(PG_idle, "idle" )
#define show_page_flags(flags) \
(flags) ? __print_flags(flags, "|", \
__def_pageflag_names \
) : "none"
#if defined(CONFIG_X86)
#define __VM_ARCH_SPECIFIC {VM_PAT, "pat" }
#elif defined(CONFIG_PPC)
#define __VM_ARCH_SPECIFIC {VM_SAO, "sao" }
#elif defined(CONFIG_PARISC) || defined(CONFIG_METAG) || defined(CONFIG_IA64)
#define __VM_ARCH_SPECIFIC {VM_GROWSUP, "growsup" }
#elif !defined(CONFIG_MMU)
#define __VM_ARCH_SPECIFIC {VM_MAPPED_COPY,"mappedcopy" }
#else
#define __VM_ARCH_SPECIFIC {VM_ARCH_1, "arch_1" }
#endif
#ifdef CONFIG_MEM_SOFT_DIRTY
#define IF_HAVE_VM_SOFTDIRTY(flag,name) {flag, name },
#else
#define IF_HAVE_VM_SOFTDIRTY(flag,name)
#endif
#define __def_vmaflag_names \
{VM_READ, "read" }, \
{VM_WRITE, "write" }, \
{VM_EXEC, "exec" }, \
{VM_SHARED, "shared" }, \
{VM_MAYREAD, "mayread" }, \
{VM_MAYWRITE, "maywrite" }, \
{VM_MAYEXEC, "mayexec" }, \
{VM_MAYSHARE, "mayshare" }, \
{VM_GROWSDOWN, "growsdown" }, \
{VM_PFNMAP, "pfnmap" }, \
{VM_DENYWRITE, "denywrite" }, \
{VM_LOCKONFAULT, "lockonfault" }, \
{VM_LOCKED, "locked" }, \
{VM_IO, "io" }, \
{VM_SEQ_READ, "seqread" }, \
{VM_RAND_READ, "randread" }, \
{VM_DONTCOPY, "dontcopy" }, \
{VM_DONTEXPAND, "dontexpand" }, \
{VM_ACCOUNT, "account" }, \
{VM_NORESERVE, "noreserve" }, \
{VM_HUGETLB, "hugetlb" }, \
__VM_ARCH_SPECIFIC , \
{VM_DONTDUMP, "dontdump" }, \
IF_HAVE_VM_SOFTDIRTY(VM_SOFTDIRTY, "softdirty" ) \
{VM_MIXEDMAP, "mixedmap" }, \
{VM_HUGEPAGE, "hugepage" }, \
{VM_NOHUGEPAGE, "nohugepage" }, \
{VM_MERGEABLE, "mergeable" } \
#define show_vma_flags(flags) \
(flags) ? __print_flags(flags, "|", \
__def_vmaflag_names \
) : "none"
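
For context, a hedged sketch of how one of these helpers is typically consumed from a tracepoint's TP_printk(); the event name is invented, but the shape mirrors the existing kmem, compaction and vmscan events that now include this header.

TRACE_EVENT(demo_alloc,

	TP_PROTO(gfp_t gfp_flags),

	TP_ARGS(gfp_flags),

	TP_STRUCT__entry(
		__field(gfp_t, gfp_flags)
	),

	TP_fast_assign(
		__entry->gfp_flags = gfp_flags;
	),

	/* show_gfp_flags() expands to __print_flags(..., __def_gfpflag_names) */
	TP_printk("gfp_flags=%s", show_gfp_flags(__entry->gfp_flags))
);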

View file

@ -8,7 +8,7 @@
#include <linux/tracepoint.h> #include <linux/tracepoint.h>
#include <linux/mm.h> #include <linux/mm.h>
#include <linux/memcontrol.h> #include <linux/memcontrol.h>
#include <trace/events/gfpflags.h> #include <trace/events/mmflags.h>
#define RECLAIM_WB_ANON 0x0001u #define RECLAIM_WB_ANON 0x0001u
#define RECLAIM_WB_FILE 0x0002u #define RECLAIM_WB_FILE 0x0002u

View file

@ -1,7 +1,4 @@
/* -*- linux-c -*- ------------------------------------------------------- * /*
*
* linux/include/linux/auto_fs.h
*
* Copyright 1997 Transmeta Corporation - All Rights Reserved * Copyright 1997 Transmeta Corporation - All Rights Reserved
* *
* This file is part of the Linux kernel and is made available under * This file is part of the Linux kernel and is made available under
@ -51,7 +48,7 @@ struct autofs_packet_hdr {
struct autofs_packet_missing { struct autofs_packet_missing {
struct autofs_packet_hdr hdr; struct autofs_packet_hdr hdr;
autofs_wqt_t wait_queue_token; autofs_wqt_t wait_queue_token;
int len; int len;
char name[NAME_MAX+1]; char name[NAME_MAX+1];
}; };
@ -63,12 +60,12 @@ struct autofs_packet_expire {
char name[NAME_MAX+1]; char name[NAME_MAX+1];
}; };
#define AUTOFS_IOC_READY _IO(0x93,0x60) #define AUTOFS_IOC_READY _IO(0x93, 0x60)
#define AUTOFS_IOC_FAIL _IO(0x93,0x61) #define AUTOFS_IOC_FAIL _IO(0x93, 0x61)
#define AUTOFS_IOC_CATATONIC _IO(0x93,0x62) #define AUTOFS_IOC_CATATONIC _IO(0x93, 0x62)
#define AUTOFS_IOC_PROTOVER _IOR(0x93,0x63,int) #define AUTOFS_IOC_PROTOVER _IOR(0x93, 0x63, int)
#define AUTOFS_IOC_SETTIMEOUT32 _IOWR(0x93,0x64,compat_ulong_t) #define AUTOFS_IOC_SETTIMEOUT32 _IOWR(0x93, 0x64, compat_ulong_t)
#define AUTOFS_IOC_SETTIMEOUT _IOWR(0x93,0x64,unsigned long) #define AUTOFS_IOC_SETTIMEOUT _IOWR(0x93, 0x64, unsigned long)
#define AUTOFS_IOC_EXPIRE _IOR(0x93,0x65,struct autofs_packet_expire) #define AUTOFS_IOC_EXPIRE _IOR(0x93, 0x65, struct autofs_packet_expire)
#endif /* _UAPI_LINUX_AUTO_FS_H */ #endif /* _UAPI_LINUX_AUTO_FS_H */

View file

@ -1,6 +1,4 @@
/* -*- c -*- /*
* linux/include/linux/auto_fs4.h
*
* Copyright 1999-2000 Jeremy Fitzhardinge <jeremy@goop.org> * Copyright 1999-2000 Jeremy Fitzhardinge <jeremy@goop.org>
* *
* This file is part of the Linux kernel and is made available under * This file is part of the Linux kernel and is made available under
@ -38,7 +36,6 @@
static inline void set_autofs_type_indirect(unsigned int *type) static inline void set_autofs_type_indirect(unsigned int *type)
{ {
*type = AUTOFS_TYPE_INDIRECT; *type = AUTOFS_TYPE_INDIRECT;
return;
} }
static inline unsigned int autofs_type_indirect(unsigned int type) static inline unsigned int autofs_type_indirect(unsigned int type)
@ -49,7 +46,6 @@ static inline unsigned int autofs_type_indirect(unsigned int type)
static inline void set_autofs_type_direct(unsigned int *type) static inline void set_autofs_type_direct(unsigned int *type)
{ {
*type = AUTOFS_TYPE_DIRECT; *type = AUTOFS_TYPE_DIRECT;
return;
} }
static inline unsigned int autofs_type_direct(unsigned int type) static inline unsigned int autofs_type_direct(unsigned int type)
@ -60,7 +56,6 @@ static inline unsigned int autofs_type_direct(unsigned int type)
static inline void set_autofs_type_offset(unsigned int *type) static inline void set_autofs_type_offset(unsigned int *type)
{ {
*type = AUTOFS_TYPE_OFFSET; *type = AUTOFS_TYPE_OFFSET;
return;
} }
static inline unsigned int autofs_type_offset(unsigned int type) static inline unsigned int autofs_type_offset(unsigned int type)
@ -81,7 +76,6 @@ static inline unsigned int autofs_type_trigger(unsigned int type)
static inline void set_autofs_type_any(unsigned int *type) static inline void set_autofs_type_any(unsigned int *type)
{ {
*type = AUTOFS_TYPE_ANY; *type = AUTOFS_TYPE_ANY;
return;
} }
static inline unsigned int autofs_type_any(unsigned int type) static inline unsigned int autofs_type_any(unsigned int type)
@ -114,7 +108,7 @@ enum autofs_notify {
/* v4 multi expire (via pipe) */ /* v4 multi expire (via pipe) */
struct autofs_packet_expire_multi { struct autofs_packet_expire_multi {
struct autofs_packet_hdr hdr; struct autofs_packet_hdr hdr;
autofs_wqt_t wait_queue_token; autofs_wqt_t wait_queue_token;
int len; int len;
char name[NAME_MAX+1]; char name[NAME_MAX+1];
}; };
@ -154,11 +148,10 @@ union autofs_v5_packet_union {
autofs_packet_expire_direct_t expire_direct; autofs_packet_expire_direct_t expire_direct;
}; };
#define AUTOFS_IOC_EXPIRE_MULTI _IOW(0x93,0x66,int) #define AUTOFS_IOC_EXPIRE_MULTI _IOW(0x93, 0x66, int)
#define AUTOFS_IOC_EXPIRE_INDIRECT AUTOFS_IOC_EXPIRE_MULTI #define AUTOFS_IOC_EXPIRE_INDIRECT AUTOFS_IOC_EXPIRE_MULTI
#define AUTOFS_IOC_EXPIRE_DIRECT AUTOFS_IOC_EXPIRE_MULTI #define AUTOFS_IOC_EXPIRE_DIRECT AUTOFS_IOC_EXPIRE_MULTI
#define AUTOFS_IOC_PROTOSUBVER _IOR(0x93,0x67,int) #define AUTOFS_IOC_PROTOSUBVER _IOR(0x93, 0x67, int)
#define AUTOFS_IOC_ASKUMOUNT _IOR(0x93,0x70,int) #define AUTOFS_IOC_ASKUMOUNT _IOR(0x93, 0x70, int)
#endif /* _LINUX_AUTO_FS4_H */ #endif /* _LINUX_AUTO_FS4_H */

View file

@ -1420,6 +1420,28 @@ config KALLSYMS_ALL
Say N unless you really need all symbols. Say N unless you really need all symbols.
config KALLSYMS_ABSOLUTE_PERCPU
bool
default X86_64 && SMP
config KALLSYMS_BASE_RELATIVE
bool
depends on KALLSYMS
default !IA64 && !(TILE && 64BIT)
help
Instead of emitting them as absolute values in the native word size,
emit the symbol references in the kallsyms table as 32-bit entries,
each containing a relative value in the range [base, base + U32_MAX]
or, when KALLSYMS_ABSOLUTE_PERCPU is in effect, each containing either
an absolute value in the range [0, S32_MAX] or a relative value in the
range [base, base + S32_MAX], where base is the lowest relative symbol
address encountered in the image.
On 64-bit builds, this reduces the size of the address table by 50%,
but more importantly, it results in entries whose values are build
time constants, and no relocation pass is required at runtime to fix
up the entries based on the runtime load address of the kernel.
config PRINTK config PRINTK
default y default y
bool "Enable support for printk" if EXPERT bool "Enable support for printk" if EXPERT

View file

@ -705,7 +705,6 @@ static int __init initcall_blacklist(char *str)
static bool __init_or_module initcall_blacklisted(initcall_t fn) static bool __init_or_module initcall_blacklisted(initcall_t fn)
{ {
struct list_head *tmp;
struct blacklist_entry *entry; struct blacklist_entry *entry;
char *fn_name; char *fn_name;
@ -713,8 +712,7 @@ static bool __init_or_module initcall_blacklisted(initcall_t fn)
if (!fn_name) if (!fn_name)
return false; return false;
list_for_each(tmp, &blacklisted_initcalls) { list_for_each_entry(entry, &blacklisted_initcalls, next) {
entry = list_entry(tmp, struct blacklist_entry, next);
if (!strcmp(fn_name, entry->buf)) { if (!strcmp(fn_name, entry->buf)) {
pr_debug("initcall %s blacklisted\n", fn_name); pr_debug("initcall %s blacklisted\n", fn_name);
kfree(fn_name); kfree(fn_name);
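
The conversion above is the standard replacement of list_for_each() plus list_entry() with list_for_each_entry(); a generic sketch of the resulting pattern, with invented names:

#include <linux/list.h>
#include <linux/string.h>

struct demo_item {
	char name[32];
	struct list_head next;
};

static LIST_HEAD(demo_items);

static struct demo_item *demo_find(const char *name)
{
	struct demo_item *entry;

	/* iterate over the containing structures directly, no list_entry() needed */
	list_for_each_entry(entry, &demo_items, next) {
		if (!strcmp(entry->name, name))
			return entry;
	}
	return NULL;
}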

View file

@ -38,6 +38,7 @@
* during the second link stage. * during the second link stage.
*/ */
extern const unsigned long kallsyms_addresses[] __weak; extern const unsigned long kallsyms_addresses[] __weak;
extern const int kallsyms_offsets[] __weak;
extern const u8 kallsyms_names[] __weak; extern const u8 kallsyms_names[] __weak;
/* /*
@ -47,6 +48,9 @@ extern const u8 kallsyms_names[] __weak;
extern const unsigned long kallsyms_num_syms extern const unsigned long kallsyms_num_syms
__attribute__((weak, section(".rodata"))); __attribute__((weak, section(".rodata")));
extern const unsigned long kallsyms_relative_base
__attribute__((weak, section(".rodata")));
extern const u8 kallsyms_token_table[] __weak; extern const u8 kallsyms_token_table[] __weak;
extern const u16 kallsyms_token_index[] __weak; extern const u16 kallsyms_token_index[] __weak;
@ -176,6 +180,23 @@ static unsigned int get_symbol_offset(unsigned long pos)
return name - kallsyms_names; return name - kallsyms_names;
} }
static unsigned long kallsyms_sym_address(int idx)
{
if (!IS_ENABLED(CONFIG_KALLSYMS_BASE_RELATIVE))
return kallsyms_addresses[idx];
/* values are unsigned offsets if --absolute-percpu is not in effect */
if (!IS_ENABLED(CONFIG_KALLSYMS_ABSOLUTE_PERCPU))
return kallsyms_relative_base + (u32)kallsyms_offsets[idx];
/* ...otherwise, positive offsets are absolute values */
if (kallsyms_offsets[idx] >= 0)
return kallsyms_offsets[idx];
/* ...and negative offsets are relative to kallsyms_relative_base - 1 */
return kallsyms_relative_base - 1 - kallsyms_offsets[idx];
}
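
A worked example, with illustrative numbers only (not taken from any real image), of how the relative offsets decode:

/*
 * Assuming CONFIG_KALLSYMS_BASE_RELATIVE and CONFIG_KALLSYMS_ABSOLUTE_PERCPU
 * are both enabled and kallsyms_relative_base == 0xffffffff81000000:
 *
 *   kallsyms_offsets[idx] == 0x14000  ->  absolute address 0x14000
 *                                         (a per-cpu symbol)
 *   kallsyms_offsets[idx] == -0x1233  ->  0xffffffff81000000 - 1 - (-0x1233)
 *                                         == 0xffffffff81001232
 */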
/* Lookup the address for this symbol. Returns 0 if not found. */ /* Lookup the address for this symbol. Returns 0 if not found. */
unsigned long kallsyms_lookup_name(const char *name) unsigned long kallsyms_lookup_name(const char *name)
{ {
@ -187,7 +208,7 @@ unsigned long kallsyms_lookup_name(const char *name)
off = kallsyms_expand_symbol(off, namebuf, ARRAY_SIZE(namebuf)); off = kallsyms_expand_symbol(off, namebuf, ARRAY_SIZE(namebuf));
if (strcmp(namebuf, name) == 0) if (strcmp(namebuf, name) == 0)
return kallsyms_addresses[i]; return kallsyms_sym_address(i);
} }
return module_kallsyms_lookup_name(name); return module_kallsyms_lookup_name(name);
} }
@ -204,7 +225,7 @@ int kallsyms_on_each_symbol(int (*fn)(void *, const char *, struct module *,
for (i = 0, off = 0; i < kallsyms_num_syms; i++) { for (i = 0, off = 0; i < kallsyms_num_syms; i++) {
off = kallsyms_expand_symbol(off, namebuf, ARRAY_SIZE(namebuf)); off = kallsyms_expand_symbol(off, namebuf, ARRAY_SIZE(namebuf));
ret = fn(data, namebuf, NULL, kallsyms_addresses[i]); ret = fn(data, namebuf, NULL, kallsyms_sym_address(i));
if (ret != 0) if (ret != 0)
return ret; return ret;
} }
@ -220,7 +241,10 @@ static unsigned long get_symbol_pos(unsigned long addr,
unsigned long i, low, high, mid; unsigned long i, low, high, mid;
/* This kernel should never had been booted. */ /* This kernel should never had been booted. */
BUG_ON(!kallsyms_addresses); if (!IS_ENABLED(CONFIG_KALLSYMS_BASE_RELATIVE))
BUG_ON(!kallsyms_addresses);
else
BUG_ON(!kallsyms_offsets);
/* Do a binary search on the sorted kallsyms_addresses array. */ /* Do a binary search on the sorted kallsyms_addresses array. */
low = 0; low = 0;
@ -228,7 +252,7 @@ static unsigned long get_symbol_pos(unsigned long addr,
while (high - low > 1) { while (high - low > 1) {
mid = low + (high - low) / 2; mid = low + (high - low) / 2;
if (kallsyms_addresses[mid] <= addr) if (kallsyms_sym_address(mid) <= addr)
low = mid; low = mid;
else else
high = mid; high = mid;
@ -238,15 +262,15 @@ static unsigned long get_symbol_pos(unsigned long addr,
* Search for the first aliased symbol. Aliased * Search for the first aliased symbol. Aliased
* symbols are symbols with the same address. * symbols are symbols with the same address.
*/ */
while (low && kallsyms_addresses[low-1] == kallsyms_addresses[low]) while (low && kallsyms_sym_address(low-1) == kallsyms_sym_address(low))
--low; --low;
symbol_start = kallsyms_addresses[low]; symbol_start = kallsyms_sym_address(low);
/* Search for next non-aliased symbol. */ /* Search for next non-aliased symbol. */
for (i = low + 1; i < kallsyms_num_syms; i++) { for (i = low + 1; i < kallsyms_num_syms; i++) {
if (kallsyms_addresses[i] > symbol_start) { if (kallsyms_sym_address(i) > symbol_start) {
symbol_end = kallsyms_addresses[i]; symbol_end = kallsyms_sym_address(i);
break; break;
} }
} }
@ -470,7 +494,7 @@ static unsigned long get_ksymbol_core(struct kallsym_iter *iter)
unsigned off = iter->nameoff; unsigned off = iter->nameoff;
iter->module_name[0] = '\0'; iter->module_name[0] = '\0';
iter->value = kallsyms_addresses[iter->pos]; iter->value = kallsyms_sym_address(iter->pos);
iter->type = kallsyms_get_symbol_type(off); iter->type = kallsyms_get_symbol_type(off);

View file

@ -148,8 +148,7 @@ static inline struct lock_class *hlock_class(struct held_lock *hlock)
} }
#ifdef CONFIG_LOCK_STAT #ifdef CONFIG_LOCK_STAT
static DEFINE_PER_CPU(struct lock_class_stats[MAX_LOCKDEP_KEYS], static DEFINE_PER_CPU(struct lock_class_stats[MAX_LOCKDEP_KEYS], cpu_lock_stats);
cpu_lock_stats);
static inline u64 lockstat_clock(void) static inline u64 lockstat_clock(void)
{ {

View file

@ -391,7 +391,7 @@ struct vmem_altmap *to_vmem_altmap(unsigned long memmap_start)
/* /*
* 'memmap_start' is the virtual address for the first "struct * 'memmap_start' is the virtual address for the first "struct
* page" in this range of the vmemmap array. In the case of * page" in this range of the vmemmap array. In the case of
* CONFIG_SPARSE_VMEMMAP a page_to_pfn conversion is simple * CONFIG_SPARSEMEM_VMEMMAP a page_to_pfn conversion is simple
* pointer arithmetic, so we can perform this to_vmem_altmap() * pointer arithmetic, so we can perform this to_vmem_altmap()
* conversion without concern for the initialization state of * conversion without concern for the initialization state of
* the struct page fields. * the struct page fields.
@ -400,7 +400,7 @@ struct vmem_altmap *to_vmem_altmap(unsigned long memmap_start)
struct dev_pagemap *pgmap; struct dev_pagemap *pgmap;
/* /*
* Uncoditionally retrieve a dev_pagemap associated with the * Unconditionally retrieve a dev_pagemap associated with the
* given physical address, this is only for use in the * given physical address, this is only for use in the
* arch_{add|remove}_memory() for setting up and tearing down * arch_{add|remove}_memory() for setting up and tearing down
* the memmap. * the memmap.

View file

@ -1158,6 +1158,22 @@ static int __init kaslr_nohibernate_setup(char *str)
return nohibernate_setup(str); return nohibernate_setup(str);
} }
static int __init page_poison_nohibernate_setup(char *str)
{
#ifdef CONFIG_PAGE_POISONING_ZERO
/*
* The zeroing option for page poison skips the checks on alloc.
* since hibernation doesn't save free pages there's no way to
* guarantee the pages will still be zeroed.
*/
if (!strcmp(str, "on")) {
pr_info("Disabling hibernation due to page poisoning\n");
return nohibernate_setup(str);
}
#endif
return 1;
}
__setup("noresume", noresume_setup); __setup("noresume", noresume_setup);
__setup("resume_offset=", resume_offset_setup); __setup("resume_offset=", resume_offset_setup);
__setup("resume=", resume_setup); __setup("resume=", resume_setup);
@ -1166,3 +1182,4 @@ __setup("resumewait", resumewait_setup);
__setup("resumedelay=", resumedelay_setup); __setup("resumedelay=", resumedelay_setup);
__setup("nohibernate", nohibernate_setup); __setup("nohibernate", nohibernate_setup);
__setup("kaslr", kaslr_nohibernate_setup); __setup("kaslr", kaslr_nohibernate_setup);
__setup("page_poison=", page_poison_nohibernate_setup);

View file

@ -130,10 +130,8 @@ static struct rcu_torture __rcu *rcu_torture_current;
static unsigned long rcu_torture_current_version; static unsigned long rcu_torture_current_version;
static struct rcu_torture rcu_tortures[10 * RCU_TORTURE_PIPE_LEN]; static struct rcu_torture rcu_tortures[10 * RCU_TORTURE_PIPE_LEN];
static DEFINE_SPINLOCK(rcu_torture_lock); static DEFINE_SPINLOCK(rcu_torture_lock);
static DEFINE_PER_CPU(long [RCU_TORTURE_PIPE_LEN + 1], static DEFINE_PER_CPU(long [RCU_TORTURE_PIPE_LEN + 1], rcu_torture_count) = { 0 };
rcu_torture_count) = { 0 }; static DEFINE_PER_CPU(long [RCU_TORTURE_PIPE_LEN + 1], rcu_torture_batch) = { 0 };
static DEFINE_PER_CPU(long [RCU_TORTURE_PIPE_LEN + 1],
rcu_torture_batch) = { 0 };
static atomic_t rcu_torture_wcount[RCU_TORTURE_PIPE_LEN + 1]; static atomic_t rcu_torture_wcount[RCU_TORTURE_PIPE_LEN + 1];
static atomic_t n_rcu_torture_alloc; static atomic_t n_rcu_torture_alloc;
static atomic_t n_rcu_torture_alloc_fail; static atomic_t n_rcu_torture_alloc_fail;

View file

@ -320,8 +320,7 @@ static bool wq_debug_force_rr_cpu = false;
module_param_named(debug_force_rr_cpu, wq_debug_force_rr_cpu, bool, 0644); module_param_named(debug_force_rr_cpu, wq_debug_force_rr_cpu, bool, 0644);
/* the per-cpu worker pools */ /* the per-cpu worker pools */
static DEFINE_PER_CPU_SHARED_ALIGNED(struct worker_pool [NR_STD_WORKER_POOLS], static DEFINE_PER_CPU_SHARED_ALIGNED(struct worker_pool [NR_STD_WORKER_POOLS], cpu_worker_pools);
cpu_worker_pools);
static DEFINE_IDR(worker_pool_idr); /* PR: idr of all pools */ static DEFINE_IDR(worker_pool_idr); /* PR: idr of all pools */

View file

@ -17,6 +17,9 @@
#include <linux/socket.h> #include <linux/socket.h>
#include <linux/in.h> #include <linux/in.h>
#include <linux/gfp.h>
#include <linux/mm.h>
#define BUF_SIZE 256 #define BUF_SIZE 256
#define PAD_SIZE 16 #define PAD_SIZE 16
#define FILL_CHAR '$' #define FILL_CHAR '$'
@ -410,6 +413,55 @@ netdev_features(void)
{ {
} }
static void __init
flags(void)
{
unsigned long flags;
gfp_t gfp;
char *cmp_buffer;
flags = 0;
test("", "%pGp", &flags);
/* Page flags should filter the zone id */
flags = 1UL << NR_PAGEFLAGS;
test("", "%pGp", &flags);
flags |= 1UL << PG_uptodate | 1UL << PG_dirty | 1UL << PG_lru
| 1UL << PG_active | 1UL << PG_swapbacked;
test("uptodate|dirty|lru|active|swapbacked", "%pGp", &flags);
flags = VM_READ | VM_EXEC | VM_MAYREAD | VM_MAYWRITE | VM_MAYEXEC
| VM_DENYWRITE;
test("read|exec|mayread|maywrite|mayexec|denywrite", "%pGv", &flags);
gfp = GFP_TRANSHUGE;
test("GFP_TRANSHUGE", "%pGg", &gfp);
gfp = GFP_ATOMIC|__GFP_DMA;
test("GFP_ATOMIC|GFP_DMA", "%pGg", &gfp);
gfp = __GFP_ATOMIC;
test("__GFP_ATOMIC", "%pGg", &gfp);
cmp_buffer = kmalloc(BUF_SIZE, GFP_KERNEL);
if (!cmp_buffer)
return;
/* Any flags not translated by the table should remain numeric */
gfp = ~__GFP_BITS_MASK;
snprintf(cmp_buffer, BUF_SIZE, "%#lx", (unsigned long) gfp);
test(cmp_buffer, "%pGg", &gfp);
snprintf(cmp_buffer, BUF_SIZE, "__GFP_ATOMIC|%#lx",
(unsigned long) gfp);
gfp |= __GFP_ATOMIC;
test(cmp_buffer, "%pGg", &gfp);
kfree(cmp_buffer);
}
static void __init static void __init
test_pointer(void) test_pointer(void)
{ {
@ -428,6 +480,7 @@ test_pointer(void)
struct_clk(); struct_clk();
bitmap(); bitmap();
netdev_features(); netdev_features();
flags();
} }
static int __init static int __init

View file

@ -35,6 +35,8 @@
#include <linux/blkdev.h> #include <linux/blkdev.h>
#endif #endif
#include "../mm/internal.h" /* For the trace_print_flags arrays */
#include <asm/page.h> /* for PAGE_SIZE */ #include <asm/page.h> /* for PAGE_SIZE */
#include <asm/sections.h> /* for dereference_function_descriptor() */ #include <asm/sections.h> /* for dereference_function_descriptor() */
#include <asm/byteorder.h> /* cpu_to_le16 */ #include <asm/byteorder.h> /* cpu_to_le16 */
@ -1407,6 +1409,72 @@ char *clock(char *buf, char *end, struct clk *clk, struct printf_spec spec,
} }
} }
static
char *format_flags(char *buf, char *end, unsigned long flags,
const struct trace_print_flags *names)
{
unsigned long mask;
const struct printf_spec strspec = {
.field_width = -1,
.precision = -1,
};
const struct printf_spec numspec = {
.flags = SPECIAL|SMALL,
.field_width = -1,
.precision = -1,
.base = 16,
};
for ( ; flags && names->name; names++) {
mask = names->mask;
if ((flags & mask) != mask)
continue;
buf = string(buf, end, names->name, strspec);
flags &= ~mask;
if (flags) {
if (buf < end)
*buf = '|';
buf++;
}
}
if (flags)
buf = number(buf, end, flags, numspec);
return buf;
}
static noinline_for_stack
char *flags_string(char *buf, char *end, void *flags_ptr, const char *fmt)
{
unsigned long flags;
const struct trace_print_flags *names;
switch (fmt[1]) {
case 'p':
flags = *(unsigned long *)flags_ptr;
/* Remove zone id */
flags &= (1UL << NR_PAGEFLAGS) - 1;
names = pageflag_names;
break;
case 'v':
flags = *(unsigned long *)flags_ptr;
names = vmaflag_names;
break;
case 'g':
flags = *(gfp_t *)flags_ptr;
names = gfpflag_names;
break;
default:
WARN_ONCE(1, "Unsupported flags modifier: %c\n", fmt[1]);
return buf;
}
return format_flags(buf, end, flags, names);
}
int kptr_restrict __read_mostly; int kptr_restrict __read_mostly;
/* /*
@ -1495,6 +1563,11 @@ int kptr_restrict __read_mostly;
* - 'Cn' For a clock, it prints the name (Common Clock Framework) or address * - 'Cn' For a clock, it prints the name (Common Clock Framework) or address
* (legacy clock framework) of the clock * (legacy clock framework) of the clock
* - 'Cr' For a clock, it prints the current rate of the clock * - 'Cr' For a clock, it prints the current rate of the clock
* - 'G' For flags to be printed as a collection of symbolic strings that would
* construct the specific value. Supported flags given by option:
* p page flags (see struct page) given as pointer to unsigned long
* g gfp flags (GFP_* and __GFP_*) given as pointer to gfp_t
* v vma flags (VM_*) given as pointer to unsigned long
* *
* ** Please update also Documentation/printk-formats.txt when making changes ** * ** Please update also Documentation/printk-formats.txt when making changes **
* *
@ -1648,6 +1721,8 @@ char *pointer(const char *fmt, char *buf, char *end, void *ptr,
return bdev_name(buf, end, ptr, spec, fmt); return bdev_name(buf, end, ptr, spec, fmt);
#endif #endif
case 'G':
return flags_string(buf, end, ptr, fmt);
} }
spec.flags |= SMALL; spec.flags |= SMALL;
if (spec.field_width == -1) { if (spec.field_width == -1) {
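
A short, hypothetical caller of the new %pG specifiers (the reporting function is invented); note that all three variants take a pointer to the flags, not the flags value itself:

#include <linux/printk.h>
#include <linux/mm.h>
#include <linux/gfp.h>

static void demo_report(struct page *page, struct vm_area_struct *vma,
			gfp_t gfp_mask)
{
	pr_info("page flags: %pGp\n", &page->flags);
	pr_info("vma flags:  %pGv\n", &vma->vm_flags);
	pr_info("gfp flags:  %pGg\n", &gfp_mask);
}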

View file

@ -16,8 +16,8 @@ config DEBUG_PAGEALLOC
select PAGE_POISONING if !ARCH_SUPPORTS_DEBUG_PAGEALLOC select PAGE_POISONING if !ARCH_SUPPORTS_DEBUG_PAGEALLOC
---help--- ---help---
Unmap pages from the kernel linear mapping after free_pages(). Unmap pages from the kernel linear mapping after free_pages().
This results in a large slowdown, but helps to find certain types Depending on runtime enablement, this results in a small or large
of memory corruption. slowdown, but helps to find certain types of memory corruption.
For architectures which don't enable ARCH_SUPPORTS_DEBUG_PAGEALLOC, For architectures which don't enable ARCH_SUPPORTS_DEBUG_PAGEALLOC,
fill the pages with poison patterns after free_pages() and verify fill the pages with poison patterns after free_pages() and verify
@ -26,5 +26,56 @@ config DEBUG_PAGEALLOC
that would result in incorrect warnings of memory corruption after that would result in incorrect warnings of memory corruption after
a resume because free pages are not saved to the suspend image. a resume because free pages are not saved to the suspend image.
By default this option will have a small overhead, e.g. by not
allowing the kernel mapping to be backed by large pages on some
architectures. Even bigger overhead comes when the debugging is
enabled by DEBUG_PAGEALLOC_ENABLE_DEFAULT or the debug_pagealloc
command line parameter.
config DEBUG_PAGEALLOC_ENABLE_DEFAULT
bool "Enable debug page memory allocations by default?"
default n
depends on DEBUG_PAGEALLOC
---help---
Enable debug page memory allocations by default? This value
can be overridden by debug_pagealloc=off|on.
config PAGE_POISONING config PAGE_POISONING
bool bool "Poison pages after freeing"
select PAGE_EXTENSION
select PAGE_POISONING_NO_SANITY if HIBERNATION
---help---
Fill the pages with poison patterns after free_pages() and verify
the patterns before alloc_pages. The filling of the memory helps
reduce the risk of information leaks from freed data. This does
have a potential performance impact.
Note that "poison" here is not the same thing as the "HWPoison"
for CONFIG_MEMORY_FAILURE. This is software poisoning only.
If unsure, say N
config PAGE_POISONING_NO_SANITY
depends on PAGE_POISONING
bool "Only poison, don't sanity check"
---help---
Skip the sanity checking on alloc, only fill the pages with
poison on free. This reduces some of the overhead of the
poisoning feature.
If you are only interested in sanitization, say Y. Otherwise
say N.
config PAGE_POISONING_ZERO
bool "Use zero for poisoning instead of random data"
depends on PAGE_POISONING
---help---
Instead of using the existing poison value, fill the pages with
zeros. This makes it harder to detect when errors are occurring
due to sanitization but the zeroing at free means that it is
no longer necessary to write zeros when GFP_ZERO is used on
allocation.
Enabling page poisoning with this option will disable hibernation
If unsure, say N
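Putting the new options together, one plausible (purely illustrative) configuration is to build with:

	CONFIG_DEBUG_PAGEALLOC=y
	CONFIG_DEBUG_PAGEALLOC_ENABLE_DEFAULT=n
	CONFIG_PAGE_POISONING=y

and then switch the checks on only when needed via the boot command line described by these help texts:

	debug_pagealloc=on page_poison=on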

@ -48,7 +48,7 @@ obj-$(CONFIG_SPARSEMEM_VMEMMAP) += sparse-vmemmap.o
obj-$(CONFIG_SLOB) += slob.o obj-$(CONFIG_SLOB) += slob.o
obj-$(CONFIG_MMU_NOTIFIER) += mmu_notifier.o obj-$(CONFIG_MMU_NOTIFIER) += mmu_notifier.o
obj-$(CONFIG_KSM) += ksm.o obj-$(CONFIG_KSM) += ksm.o
obj-$(CONFIG_PAGE_POISONING) += debug-pagealloc.o obj-$(CONFIG_PAGE_POISONING) += page_poison.o
obj-$(CONFIG_SLAB) += slab.o obj-$(CONFIG_SLAB) += slab.o
obj-$(CONFIG_SLUB) += slub.o obj-$(CONFIG_SLUB) += slub.o
obj-$(CONFIG_KMEMCHECK) += kmemcheck.o obj-$(CONFIG_KMEMCHECK) += kmemcheck.o

@ -71,49 +71,6 @@ static inline bool migrate_async_suitable(int migratetype)
return is_migrate_cma(migratetype) || migratetype == MIGRATE_MOVABLE; return is_migrate_cma(migratetype) || migratetype == MIGRATE_MOVABLE;
} }
/*
* Check that the whole (or subset of) a pageblock given by the interval of
* [start_pfn, end_pfn) is valid and within the same zone, before scanning it
* with the migration of free compaction scanner. The scanners then need to
* use only pfn_valid_within() check for arches that allow holes within
* pageblocks.
*
* Return struct page pointer of start_pfn, or NULL if checks were not passed.
*
* It's possible on some configurations to have a setup like node0 node1 node0
* i.e. it's possible that all pages within a zones range of pages do not
* belong to a single zone. We assume that a border between node0 and node1
* can occur within a single pageblock, but not a node0 node1 node0
* interleaving within a single pageblock. It is therefore sufficient to check
* the first and last page of a pageblock and avoid checking each individual
* page in a pageblock.
*/
static struct page *pageblock_pfn_to_page(unsigned long start_pfn,
unsigned long end_pfn, struct zone *zone)
{
struct page *start_page;
struct page *end_page;
/* end_pfn is one past the range we are checking */
end_pfn--;
if (!pfn_valid(start_pfn) || !pfn_valid(end_pfn))
return NULL;
start_page = pfn_to_page(start_pfn);
if (page_zone(start_page) != zone)
return NULL;
end_page = pfn_to_page(end_pfn);
/* This gives a shorter code than deriving page_zone(end_page) */
if (page_zone_id(start_page) != page_zone_id(end_page))
return NULL;
return start_page;
}
#ifdef CONFIG_COMPACTION #ifdef CONFIG_COMPACTION
/* Do not skip compaction more than 64 times */ /* Do not skip compaction more than 64 times */
@ -200,7 +157,8 @@ static void reset_cached_positions(struct zone *zone)
{ {
zone->compact_cached_migrate_pfn[0] = zone->zone_start_pfn; zone->compact_cached_migrate_pfn[0] = zone->zone_start_pfn;
zone->compact_cached_migrate_pfn[1] = zone->zone_start_pfn; zone->compact_cached_migrate_pfn[1] = zone->zone_start_pfn;
zone->compact_cached_free_pfn = zone_end_pfn(zone); zone->compact_cached_free_pfn =
round_down(zone_end_pfn(zone) - 1, pageblock_nr_pages);
} }
/* /*
@ -554,13 +512,17 @@ unsigned long
isolate_freepages_range(struct compact_control *cc, isolate_freepages_range(struct compact_control *cc,
unsigned long start_pfn, unsigned long end_pfn) unsigned long start_pfn, unsigned long end_pfn)
{ {
unsigned long isolated, pfn, block_end_pfn; unsigned long isolated, pfn, block_start_pfn, block_end_pfn;
LIST_HEAD(freelist); LIST_HEAD(freelist);
pfn = start_pfn; pfn = start_pfn;
block_start_pfn = pfn & ~(pageblock_nr_pages - 1);
if (block_start_pfn < cc->zone->zone_start_pfn)
block_start_pfn = cc->zone->zone_start_pfn;
block_end_pfn = ALIGN(pfn + 1, pageblock_nr_pages); block_end_pfn = ALIGN(pfn + 1, pageblock_nr_pages);
for (; pfn < end_pfn; pfn += isolated, for (; pfn < end_pfn; pfn += isolated,
block_start_pfn = block_end_pfn,
block_end_pfn += pageblock_nr_pages) { block_end_pfn += pageblock_nr_pages) {
/* Protect pfn from changing by isolate_freepages_block */ /* Protect pfn from changing by isolate_freepages_block */
unsigned long isolate_start_pfn = pfn; unsigned long isolate_start_pfn = pfn;
@ -573,11 +535,13 @@ isolate_freepages_range(struct compact_control *cc,
* scanning range to right one. * scanning range to right one.
*/ */
if (pfn >= block_end_pfn) { if (pfn >= block_end_pfn) {
block_start_pfn = pfn & ~(pageblock_nr_pages - 1);
block_end_pfn = ALIGN(pfn + 1, pageblock_nr_pages); block_end_pfn = ALIGN(pfn + 1, pageblock_nr_pages);
block_end_pfn = min(block_end_pfn, end_pfn); block_end_pfn = min(block_end_pfn, end_pfn);
} }
if (!pageblock_pfn_to_page(pfn, block_end_pfn, cc->zone)) if (!pageblock_pfn_to_page(block_start_pfn,
block_end_pfn, cc->zone))
break; break;
isolated = isolate_freepages_block(cc, &isolate_start_pfn, isolated = isolate_freepages_block(cc, &isolate_start_pfn,
@ -863,18 +827,23 @@ unsigned long
isolate_migratepages_range(struct compact_control *cc, unsigned long start_pfn, isolate_migratepages_range(struct compact_control *cc, unsigned long start_pfn,
unsigned long end_pfn) unsigned long end_pfn)
{ {
unsigned long pfn, block_end_pfn; unsigned long pfn, block_start_pfn, block_end_pfn;
/* Scan block by block. First and last block may be incomplete */ /* Scan block by block. First and last block may be incomplete */
pfn = start_pfn; pfn = start_pfn;
block_start_pfn = pfn & ~(pageblock_nr_pages - 1);
if (block_start_pfn < cc->zone->zone_start_pfn)
block_start_pfn = cc->zone->zone_start_pfn;
block_end_pfn = ALIGN(pfn + 1, pageblock_nr_pages); block_end_pfn = ALIGN(pfn + 1, pageblock_nr_pages);
for (; pfn < end_pfn; pfn = block_end_pfn, for (; pfn < end_pfn; pfn = block_end_pfn,
block_start_pfn = block_end_pfn,
block_end_pfn += pageblock_nr_pages) { block_end_pfn += pageblock_nr_pages) {
block_end_pfn = min(block_end_pfn, end_pfn); block_end_pfn = min(block_end_pfn, end_pfn);
if (!pageblock_pfn_to_page(pfn, block_end_pfn, cc->zone)) if (!pageblock_pfn_to_page(block_start_pfn,
block_end_pfn, cc->zone))
continue; continue;
pfn = isolate_migratepages_block(cc, pfn, block_end_pfn, pfn = isolate_migratepages_block(cc, pfn, block_end_pfn,
@ -1103,7 +1072,9 @@ int sysctl_compact_unevictable_allowed __read_mostly = 1;
static isolate_migrate_t isolate_migratepages(struct zone *zone, static isolate_migrate_t isolate_migratepages(struct zone *zone,
struct compact_control *cc) struct compact_control *cc)
{ {
unsigned long low_pfn, end_pfn; unsigned long block_start_pfn;
unsigned long block_end_pfn;
unsigned long low_pfn;
unsigned long isolate_start_pfn; unsigned long isolate_start_pfn;
struct page *page; struct page *page;
const isolate_mode_t isolate_mode = const isolate_mode_t isolate_mode =
@ -1115,16 +1086,21 @@ static isolate_migrate_t isolate_migratepages(struct zone *zone,
* initialized by compact_zone() * initialized by compact_zone()
*/ */
low_pfn = cc->migrate_pfn; low_pfn = cc->migrate_pfn;
block_start_pfn = cc->migrate_pfn & ~(pageblock_nr_pages - 1);
if (block_start_pfn < zone->zone_start_pfn)
block_start_pfn = zone->zone_start_pfn;
/* Only scan within a pageblock boundary */ /* Only scan within a pageblock boundary */
end_pfn = ALIGN(low_pfn + 1, pageblock_nr_pages); block_end_pfn = ALIGN(low_pfn + 1, pageblock_nr_pages);
/* /*
* Iterate over whole pageblocks until we find the first suitable. * Iterate over whole pageblocks until we find the first suitable.
* Do not cross the free scanner. * Do not cross the free scanner.
*/ */
for (; end_pfn <= cc->free_pfn; for (; block_end_pfn <= cc->free_pfn;
low_pfn = end_pfn, end_pfn += pageblock_nr_pages) { low_pfn = block_end_pfn,
block_start_pfn = block_end_pfn,
block_end_pfn += pageblock_nr_pages) {
/* /*
* This can potentially iterate a massively long zone with * This can potentially iterate a massively long zone with
@ -1135,7 +1111,8 @@ static isolate_migrate_t isolate_migratepages(struct zone *zone,
&& compact_should_abort(cc)) && compact_should_abort(cc))
break; break;
page = pageblock_pfn_to_page(low_pfn, end_pfn, zone); page = pageblock_pfn_to_page(block_start_pfn, block_end_pfn,
zone);
if (!page) if (!page)
continue; continue;
@ -1154,8 +1131,8 @@ static isolate_migrate_t isolate_migratepages(struct zone *zone,
/* Perform the isolation */ /* Perform the isolation */
isolate_start_pfn = low_pfn; isolate_start_pfn = low_pfn;
low_pfn = isolate_migratepages_block(cc, low_pfn, end_pfn, low_pfn = isolate_migratepages_block(cc, low_pfn,
isolate_mode); block_end_pfn, isolate_mode);
if (!low_pfn || cc->contended) { if (!low_pfn || cc->contended) {
acct_isolated(zone, cc); acct_isolated(zone, cc);
@ -1371,11 +1348,11 @@ static int compact_zone(struct zone *zone, struct compact_control *cc)
*/ */
cc->migrate_pfn = zone->compact_cached_migrate_pfn[sync]; cc->migrate_pfn = zone->compact_cached_migrate_pfn[sync];
cc->free_pfn = zone->compact_cached_free_pfn; cc->free_pfn = zone->compact_cached_free_pfn;
if (cc->free_pfn < start_pfn || cc->free_pfn > end_pfn) { if (cc->free_pfn < start_pfn || cc->free_pfn >= end_pfn) {
cc->free_pfn = end_pfn & ~(pageblock_nr_pages-1); cc->free_pfn = round_down(end_pfn - 1, pageblock_nr_pages);
zone->compact_cached_free_pfn = cc->free_pfn; zone->compact_cached_free_pfn = cc->free_pfn;
} }
if (cc->migrate_pfn < start_pfn || cc->migrate_pfn > end_pfn) { if (cc->migrate_pfn < start_pfn || cc->migrate_pfn >= end_pfn) {
cc->migrate_pfn = start_pfn; cc->migrate_pfn = start_pfn;
zone->compact_cached_migrate_pfn[0] = cc->migrate_pfn; zone->compact_cached_migrate_pfn[0] = cc->migrate_pfn;
zone->compact_cached_migrate_pfn[1] = cc->migrate_pfn; zone->compact_cached_migrate_pfn[1] = cc->migrate_pfn;

@ -9,75 +9,38 @@
#include <linux/mm.h> #include <linux/mm.h>
#include <linux/trace_events.h> #include <linux/trace_events.h>
#include <linux/memcontrol.h> #include <linux/memcontrol.h>
#include <trace/events/mmflags.h>
#include <linux/migrate.h>
#include <linux/page_owner.h>
static const struct trace_print_flags pageflag_names[] = { #include "internal.h"
{1UL << PG_locked, "locked" },
{1UL << PG_error, "error" }, char *migrate_reason_names[MR_TYPES] = {
{1UL << PG_referenced, "referenced" }, "compaction",
{1UL << PG_uptodate, "uptodate" }, "memory_failure",
{1UL << PG_dirty, "dirty" }, "memory_hotplug",
{1UL << PG_lru, "lru" }, "syscall_or_cpuset",
{1UL << PG_active, "active" }, "mempolicy_mbind",
{1UL << PG_slab, "slab" }, "numa_misplaced",
{1UL << PG_owner_priv_1, "owner_priv_1" }, "cma",
{1UL << PG_arch_1, "arch_1" },
{1UL << PG_reserved, "reserved" },
{1UL << PG_private, "private" },
{1UL << PG_private_2, "private_2" },
{1UL << PG_writeback, "writeback" },
{1UL << PG_head, "head" },
{1UL << PG_swapcache, "swapcache" },
{1UL << PG_mappedtodisk, "mappedtodisk" },
{1UL << PG_reclaim, "reclaim" },
{1UL << PG_swapbacked, "swapbacked" },
{1UL << PG_unevictable, "unevictable" },
#ifdef CONFIG_MMU
{1UL << PG_mlocked, "mlocked" },
#endif
#ifdef CONFIG_ARCH_USES_PG_UNCACHED
{1UL << PG_uncached, "uncached" },
#endif
#ifdef CONFIG_MEMORY_FAILURE
{1UL << PG_hwpoison, "hwpoison" },
#endif
#if defined(CONFIG_IDLE_PAGE_TRACKING) && defined(CONFIG_64BIT)
{1UL << PG_young, "young" },
{1UL << PG_idle, "idle" },
#endif
}; };
static void dump_flags(unsigned long flags, const struct trace_print_flags pageflag_names[] = {
const struct trace_print_flags *names, int count) __def_pageflag_names,
{ {0, NULL}
const char *delim = ""; };
unsigned long mask;
int i;
pr_emerg("flags: %#lx(", flags); const struct trace_print_flags gfpflag_names[] = {
__def_gfpflag_names,
{0, NULL}
};
/* remove zone id */ const struct trace_print_flags vmaflag_names[] = {
flags &= (1UL << NR_PAGEFLAGS) - 1; __def_vmaflag_names,
{0, NULL}
};
for (i = 0; i < count && flags; i++) { void __dump_page(struct page *page, const char *reason)
mask = names[i].mask;
if ((flags & mask) != mask)
continue;
flags &= ~mask;
pr_cont("%s%s", delim, names[i].name);
delim = "|";
}
/* check for left over flags */
if (flags)
pr_cont("%s%#lx", delim, flags);
pr_cont(")\n");
}
void dump_page_badflags(struct page *page, const char *reason,
unsigned long badflags)
{ {
pr_emerg("page:%p count:%d mapcount:%d mapping:%p index:%#lx", pr_emerg("page:%p count:%d mapcount:%d mapping:%p index:%#lx",
page, atomic_read(&page->_count), page_mapcount(page), page, atomic_read(&page->_count), page_mapcount(page),
@ -85,15 +48,13 @@ void dump_page_badflags(struct page *page, const char *reason,
if (PageCompound(page)) if (PageCompound(page))
pr_cont(" compound_mapcount: %d", compound_mapcount(page)); pr_cont(" compound_mapcount: %d", compound_mapcount(page));
pr_cont("\n"); pr_cont("\n");
BUILD_BUG_ON(ARRAY_SIZE(pageflag_names) != __NR_PAGEFLAGS); BUILD_BUG_ON(ARRAY_SIZE(pageflag_names) != __NR_PAGEFLAGS + 1);
dump_flags(page->flags, pageflag_names, ARRAY_SIZE(pageflag_names));
pr_emerg("flags: %#lx(%pGp)\n", page->flags, &page->flags);
if (reason) if (reason)
pr_alert("page dumped because: %s\n", reason); pr_alert("page dumped because: %s\n", reason);
if (page->flags & badflags) {
pr_alert("bad because of flags:\n");
dump_flags(page->flags & badflags,
pageflag_names, ARRAY_SIZE(pageflag_names));
}
#ifdef CONFIG_MEMCG #ifdef CONFIG_MEMCG
if (page->mem_cgroup) if (page->mem_cgroup)
pr_alert("page->mem_cgroup:%p\n", page->mem_cgroup); pr_alert("page->mem_cgroup:%p\n", page->mem_cgroup);
@ -102,67 +63,26 @@ void dump_page_badflags(struct page *page, const char *reason,
void dump_page(struct page *page, const char *reason) void dump_page(struct page *page, const char *reason)
{ {
dump_page_badflags(page, reason, 0); __dump_page(page, reason);
dump_page_owner(page);
} }
EXPORT_SYMBOL(dump_page); EXPORT_SYMBOL(dump_page);
#ifdef CONFIG_DEBUG_VM #ifdef CONFIG_DEBUG_VM
static const struct trace_print_flags vmaflags_names[] = {
{VM_READ, "read" },
{VM_WRITE, "write" },
{VM_EXEC, "exec" },
{VM_SHARED, "shared" },
{VM_MAYREAD, "mayread" },
{VM_MAYWRITE, "maywrite" },
{VM_MAYEXEC, "mayexec" },
{VM_MAYSHARE, "mayshare" },
{VM_GROWSDOWN, "growsdown" },
{VM_PFNMAP, "pfnmap" },
{VM_DENYWRITE, "denywrite" },
{VM_LOCKONFAULT, "lockonfault" },
{VM_LOCKED, "locked" },
{VM_IO, "io" },
{VM_SEQ_READ, "seqread" },
{VM_RAND_READ, "randread" },
{VM_DONTCOPY, "dontcopy" },
{VM_DONTEXPAND, "dontexpand" },
{VM_ACCOUNT, "account" },
{VM_NORESERVE, "noreserve" },
{VM_HUGETLB, "hugetlb" },
#if defined(CONFIG_X86)
{VM_PAT, "pat" },
#elif defined(CONFIG_PPC)
{VM_SAO, "sao" },
#elif defined(CONFIG_PARISC) || defined(CONFIG_METAG) || defined(CONFIG_IA64)
{VM_GROWSUP, "growsup" },
#elif !defined(CONFIG_MMU)
{VM_MAPPED_COPY, "mappedcopy" },
#else
{VM_ARCH_1, "arch_1" },
#endif
{VM_DONTDUMP, "dontdump" },
#ifdef CONFIG_MEM_SOFT_DIRTY
{VM_SOFTDIRTY, "softdirty" },
#endif
{VM_MIXEDMAP, "mixedmap" },
{VM_HUGEPAGE, "hugepage" },
{VM_NOHUGEPAGE, "nohugepage" },
{VM_MERGEABLE, "mergeable" },
};
void dump_vma(const struct vm_area_struct *vma) void dump_vma(const struct vm_area_struct *vma)
{ {
pr_emerg("vma %p start %p end %p\n" pr_emerg("vma %p start %p end %p\n"
"next %p prev %p mm %p\n" "next %p prev %p mm %p\n"
"prot %lx anon_vma %p vm_ops %p\n" "prot %lx anon_vma %p vm_ops %p\n"
"pgoff %lx file %p private_data %p\n", "pgoff %lx file %p private_data %p\n"
"flags: %#lx(%pGv)\n",
vma, (void *)vma->vm_start, (void *)vma->vm_end, vma->vm_next, vma, (void *)vma->vm_start, (void *)vma->vm_end, vma->vm_next,
vma->vm_prev, vma->vm_mm, vma->vm_prev, vma->vm_mm,
(unsigned long)pgprot_val(vma->vm_page_prot), (unsigned long)pgprot_val(vma->vm_page_prot),
vma->anon_vma, vma->vm_ops, vma->vm_pgoff, vma->anon_vma, vma->vm_ops, vma->vm_pgoff,
vma->vm_file, vma->vm_private_data); vma->vm_file, vma->vm_private_data,
dump_flags(vma->vm_flags, vmaflags_names, ARRAY_SIZE(vmaflags_names)); vma->vm_flags, &vma->vm_flags);
} }
EXPORT_SYMBOL(dump_vma); EXPORT_SYMBOL(dump_vma);
@ -196,7 +116,7 @@ void dump_mm(const struct mm_struct *mm)
#if defined(CONFIG_NUMA_BALANCING) || defined(CONFIG_COMPACTION) #if defined(CONFIG_NUMA_BALANCING) || defined(CONFIG_COMPACTION)
"tlb_flush_pending %d\n" "tlb_flush_pending %d\n"
#endif #endif
"%s", /* This is here to hold the comma */ "def_flags: %#lx(%pGv)\n",
mm, mm->mmap, mm->vmacache_seqnum, mm->task_size, mm, mm->mmap, mm->vmacache_seqnum, mm->task_size,
#ifdef CONFIG_MMU #ifdef CONFIG_MMU
@ -230,11 +150,8 @@ void dump_mm(const struct mm_struct *mm)
#if defined(CONFIG_NUMA_BALANCING) || defined(CONFIG_COMPACTION) #if defined(CONFIG_NUMA_BALANCING) || defined(CONFIG_COMPACTION)
mm->tlb_flush_pending, mm->tlb_flush_pending,
#endif #endif
"" /* This is here to not have a comma! */ mm->def_flags, &mm->def_flags
); );
dump_flags(mm->def_flags, vmaflags_names,
ARRAY_SIZE(vmaflags_names));
} }
#endif /* CONFIG_DEBUG_VM */ #endif /* CONFIG_DEBUG_VM */

@ -1,5 +1,7 @@
#include <linux/fault-inject.h> #include <linux/fault-inject.h>
#include <linux/slab.h> #include <linux/slab.h>
#include <linux/mm.h>
#include "slab.h"
static struct { static struct {
struct fault_attr attr; struct fault_attr attr;
@ -11,18 +13,22 @@ static struct {
.cache_filter = false, .cache_filter = false,
}; };
bool should_failslab(size_t size, gfp_t gfpflags, unsigned long cache_flags) bool should_failslab(struct kmem_cache *s, gfp_t gfpflags)
{ {
/* No fault-injection for bootstrap cache */
if (unlikely(s == kmem_cache))
return false;
if (gfpflags & __GFP_NOFAIL) if (gfpflags & __GFP_NOFAIL)
return false; return false;
if (failslab.ignore_gfp_reclaim && (gfpflags & __GFP_RECLAIM)) if (failslab.ignore_gfp_reclaim && (gfpflags & __GFP_RECLAIM))
return false; return false;
if (failslab.cache_filter && !(cache_flags & SLAB_FAILSLAB)) if (failslab.cache_filter && !(s->flags & SLAB_FAILSLAB))
return false; return false;
return should_fail(&failslab.attr, size); return should_fail(&failslab.attr, s->object_size);
} }
static int __init setup_failslab(char *str) static int __init setup_failslab(char *str)

@ -101,7 +101,7 @@
* ->tree_lock (page_remove_rmap->set_page_dirty) * ->tree_lock (page_remove_rmap->set_page_dirty)
* bdi.wb->list_lock (page_remove_rmap->set_page_dirty) * bdi.wb->list_lock (page_remove_rmap->set_page_dirty)
* ->inode->i_lock (page_remove_rmap->set_page_dirty) * ->inode->i_lock (page_remove_rmap->set_page_dirty)
* ->memcg->move_lock (page_remove_rmap->mem_cgroup_begin_page_stat) * ->memcg->move_lock (page_remove_rmap->lock_page_memcg)
* bdi.wb->list_lock (zap_pte_range->set_page_dirty) * bdi.wb->list_lock (zap_pte_range->set_page_dirty)
* ->inode->i_lock (zap_pte_range->set_page_dirty) * ->inode->i_lock (zap_pte_range->set_page_dirty)
* ->private_lock (zap_pte_range->__set_page_dirty_buffers) * ->private_lock (zap_pte_range->__set_page_dirty_buffers)
@ -176,11 +176,9 @@ static void page_cache_tree_delete(struct address_space *mapping,
/* /*
* Delete a page from the page cache and free it. Caller has to make * Delete a page from the page cache and free it. Caller has to make
* sure the page is locked and that nobody else uses it - or that usage * sure the page is locked and that nobody else uses it - or that usage
* is safe. The caller must hold the mapping's tree_lock and * is safe. The caller must hold the mapping's tree_lock.
* mem_cgroup_begin_page_stat().
*/ */
void __delete_from_page_cache(struct page *page, void *shadow, void __delete_from_page_cache(struct page *page, void *shadow)
struct mem_cgroup *memcg)
{ {
struct address_space *mapping = page->mapping; struct address_space *mapping = page->mapping;
@ -239,8 +237,7 @@ void __delete_from_page_cache(struct page *page, void *shadow,
* anyway will be cleared before returning page into buddy allocator. * anyway will be cleared before returning page into buddy allocator.
*/ */
if (WARN_ON_ONCE(PageDirty(page))) if (WARN_ON_ONCE(PageDirty(page)))
account_page_cleaned(page, mapping, memcg, account_page_cleaned(page, mapping, inode_to_wb(mapping->host));
inode_to_wb(mapping->host));
} }
/** /**
@ -254,7 +251,6 @@ void __delete_from_page_cache(struct page *page, void *shadow,
void delete_from_page_cache(struct page *page) void delete_from_page_cache(struct page *page)
{ {
struct address_space *mapping = page->mapping; struct address_space *mapping = page->mapping;
struct mem_cgroup *memcg;
unsigned long flags; unsigned long flags;
void (*freepage)(struct page *); void (*freepage)(struct page *);
@ -263,11 +259,9 @@ void delete_from_page_cache(struct page *page)
freepage = mapping->a_ops->freepage; freepage = mapping->a_ops->freepage;
memcg = mem_cgroup_begin_page_stat(page);
spin_lock_irqsave(&mapping->tree_lock, flags); spin_lock_irqsave(&mapping->tree_lock, flags);
__delete_from_page_cache(page, NULL, memcg); __delete_from_page_cache(page, NULL);
spin_unlock_irqrestore(&mapping->tree_lock, flags); spin_unlock_irqrestore(&mapping->tree_lock, flags);
mem_cgroup_end_page_stat(memcg);
if (freepage) if (freepage)
freepage(page); freepage(page);
@ -551,7 +545,6 @@ int replace_page_cache_page(struct page *old, struct page *new, gfp_t gfp_mask)
if (!error) { if (!error) {
struct address_space *mapping = old->mapping; struct address_space *mapping = old->mapping;
void (*freepage)(struct page *); void (*freepage)(struct page *);
struct mem_cgroup *memcg;
unsigned long flags; unsigned long flags;
pgoff_t offset = old->index; pgoff_t offset = old->index;
@ -561,9 +554,8 @@ int replace_page_cache_page(struct page *old, struct page *new, gfp_t gfp_mask)
new->mapping = mapping; new->mapping = mapping;
new->index = offset; new->index = offset;
memcg = mem_cgroup_begin_page_stat(old);
spin_lock_irqsave(&mapping->tree_lock, flags); spin_lock_irqsave(&mapping->tree_lock, flags);
__delete_from_page_cache(old, NULL, memcg); __delete_from_page_cache(old, NULL);
error = radix_tree_insert(&mapping->page_tree, offset, new); error = radix_tree_insert(&mapping->page_tree, offset, new);
BUG_ON(error); BUG_ON(error);
mapping->nrpages++; mapping->nrpages++;
@ -576,8 +568,7 @@ int replace_page_cache_page(struct page *old, struct page *new, gfp_t gfp_mask)
if (PageSwapBacked(new)) if (PageSwapBacked(new))
__inc_zone_page_state(new, NR_SHMEM); __inc_zone_page_state(new, NR_SHMEM);
spin_unlock_irqrestore(&mapping->tree_lock, flags); spin_unlock_irqrestore(&mapping->tree_lock, flags);
mem_cgroup_end_page_stat(memcg); mem_cgroup_migrate(old, new);
mem_cgroup_replace_page(old, new);
radix_tree_preload_end(); radix_tree_preload_end();
if (freepage) if (freepage)
freepage(old); freepage(old);
@ -1668,6 +1659,15 @@ find_page:
index, last_index - index); index, last_index - index);
} }
if (!PageUptodate(page)) { if (!PageUptodate(page)) {
/*
* See comment in do_read_cache_page on why
* wait_on_page_locked is used to avoid unnecessary
* serialisations and why it's safe.
*/
wait_on_page_locked_killable(page);
if (PageUptodate(page))
goto page_ok;
if (inode->i_blkbits == PAGE_CACHE_SHIFT || if (inode->i_blkbits == PAGE_CACHE_SHIFT ||
!mapping->a_ops->is_partially_uptodate) !mapping->a_ops->is_partially_uptodate)
goto page_not_up_to_date; goto page_not_up_to_date;
@ -2303,7 +2303,7 @@ static struct page *wait_on_page_read(struct page *page)
return page; return page;
} }
static struct page *__read_cache_page(struct address_space *mapping, static struct page *do_read_cache_page(struct address_space *mapping,
pgoff_t index, pgoff_t index,
int (*filler)(void *, struct page *), int (*filler)(void *, struct page *),
void *data, void *data,
@ -2325,53 +2325,74 @@ repeat:
/* Presumably ENOMEM for radix tree node */ /* Presumably ENOMEM for radix tree node */
return ERR_PTR(err); return ERR_PTR(err);
} }
filler:
err = filler(data, page); err = filler(data, page);
if (err < 0) { if (err < 0) {
page_cache_release(page); page_cache_release(page);
page = ERR_PTR(err); return ERR_PTR(err);
} else {
page = wait_on_page_read(page);
} }
page = wait_on_page_read(page);
if (IS_ERR(page))
return page;
goto out;
} }
return page;
}
static struct page *do_read_cache_page(struct address_space *mapping,
pgoff_t index,
int (*filler)(void *, struct page *),
void *data,
gfp_t gfp)
{
struct page *page;
int err;
retry:
page = __read_cache_page(mapping, index, filler, data, gfp);
if (IS_ERR(page))
return page;
if (PageUptodate(page)) if (PageUptodate(page))
goto out; goto out;
/*
* Page is not up to date and may be locked due to one of the following
* case a: Page is being filled and the page lock is held
* case b: Read/write error clearing the page uptodate status
* case c: Truncation in progress (page locked)
* case d: Reclaim in progress
*
* Case a, the page will be up to date when the page is unlocked.
* There is no need to serialise on the page lock here as the page
* is pinned so the lock gives no additional protection. Even if
* the page is truncated, the data is still valid if PageUptodate as
* it's a read vs truncate race.
* Case b, the page will not be up to date
* Case c, the page may be truncated but in itself, the data may still
* be valid after IO completes as it's a read vs truncate race. The
* operation must restart if the page is not uptodate on unlock but
* otherwise serialising on page lock to stabilise the mapping gives
* no additional guarantees to the caller as the page lock is
* released before return.
* Case d, similar to truncation. If reclaim holds the page lock, it
* will be a race with remove_mapping that determines if the mapping
* is valid on unlock but otherwise the data is valid and there is
* no need to serialise with page lock.
*
* As the page lock gives no additional guarantee, we optimistically
* wait on the page to be unlocked and check if it's up to date and
* use the page if it is. Otherwise, the page lock is required to
* distinguish between the different cases. The motivation is that we
* avoid spurious serialisations and wakeups when multiple processes
* wait on the same page for IO to complete.
*/
wait_on_page_locked(page);
if (PageUptodate(page))
goto out;
/* Distinguish between all the cases under the safety of the lock */
lock_page(page); lock_page(page);
/* Case c or d, restart the operation */
if (!page->mapping) { if (!page->mapping) {
unlock_page(page); unlock_page(page);
page_cache_release(page); page_cache_release(page);
goto retry; goto repeat;
} }
/* Someone else locked and filled the page in a very small window */
if (PageUptodate(page)) { if (PageUptodate(page)) {
unlock_page(page); unlock_page(page);
goto out; goto out;
} }
err = filler(data, page); goto filler;
if (err < 0) {
page_cache_release(page);
return ERR_PTR(err);
} else {
page = wait_on_page_read(page);
if (IS_ERR(page))
return page;
}
out: out:
mark_page_accessed(page); mark_page_accessed(page);
return page; return page;
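The entry points built on top of this helper keep their existing shape; a read_cache_page()-style caller still looks like the following sketch (illustrative fragment; mapping, index, filler and data are assumed to be set up by the filesystem):

	struct page *page;

	page = read_cache_page(mapping, index, filler, data);
	if (IS_ERR(page))
		return PTR_ERR(page);

	/* page is uptodate here; with this change the helper only takes the
	 * page lock when it genuinely needs to distinguish the racy cases. */
	page_cache_release(page);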

@ -3220,28 +3220,26 @@ static void unfreeze_page(struct anon_vma *anon_vma, struct page *page)
} }
} }
static int __split_huge_page_tail(struct page *head, int tail, static void __split_huge_page_tail(struct page *head, int tail,
struct lruvec *lruvec, struct list_head *list) struct lruvec *lruvec, struct list_head *list)
{ {
int mapcount;
struct page *page_tail = head + tail; struct page *page_tail = head + tail;
mapcount = atomic_read(&page_tail->_mapcount) + 1; VM_BUG_ON_PAGE(atomic_read(&page_tail->_mapcount) != -1, page_tail);
VM_BUG_ON_PAGE(atomic_read(&page_tail->_count) != 0, page_tail); VM_BUG_ON_PAGE(atomic_read(&page_tail->_count) != 0, page_tail);
/* /*
* tail_page->_count is zero and not changing from under us. But * tail_page->_count is zero and not changing from under us. But
* get_page_unless_zero() may be running from under us on the * get_page_unless_zero() may be running from under us on the
* tail_page. If we used atomic_set() below instead of atomic_add(), we * tail_page. If we used atomic_set() below instead of atomic_inc(), we
* would then run atomic_set() concurrently with * would then run atomic_set() concurrently with
* get_page_unless_zero(), and atomic_set() is implemented in C not * get_page_unless_zero(), and atomic_set() is implemented in C not
* using locked ops. spin_unlock on x86 sometime uses locked ops * using locked ops. spin_unlock on x86 sometime uses locked ops
* because of PPro errata 66, 92, so unless somebody can guarantee * because of PPro errata 66, 92, so unless somebody can guarantee
* atomic_set() here would be safe on all archs (and not only on x86), * atomic_set() here would be safe on all archs (and not only on x86),
* it's safer to use atomic_add(). * it's safer to use atomic_inc().
*/ */
atomic_add(mapcount + 1, &page_tail->_count); atomic_inc(&page_tail->_count);
page_tail->flags &= ~PAGE_FLAGS_CHECK_AT_PREP; page_tail->flags &= ~PAGE_FLAGS_CHECK_AT_PREP;
page_tail->flags |= (head->flags & page_tail->flags |= (head->flags &
@ -3275,8 +3273,6 @@ static int __split_huge_page_tail(struct page *head, int tail,
page_tail->index = head->index + tail; page_tail->index = head->index + tail;
page_cpupid_xchg_last(page_tail, page_cpupid_last(head)); page_cpupid_xchg_last(page_tail, page_cpupid_last(head));
lru_add_page_tail(head, page_tail, lruvec, list); lru_add_page_tail(head, page_tail, lruvec, list);
return mapcount;
} }
static void __split_huge_page(struct page *page, struct list_head *list) static void __split_huge_page(struct page *page, struct list_head *list)
@ -3284,7 +3280,7 @@ static void __split_huge_page(struct page *page, struct list_head *list)
struct page *head = compound_head(page); struct page *head = compound_head(page);
struct zone *zone = page_zone(head); struct zone *zone = page_zone(head);
struct lruvec *lruvec; struct lruvec *lruvec;
int i, tail_mapcount; int i;
/* prevent PageLRU to go away from under us, and freeze lru stats */ /* prevent PageLRU to go away from under us, and freeze lru stats */
spin_lock_irq(&zone->lru_lock); spin_lock_irq(&zone->lru_lock);
@ -3293,10 +3289,8 @@ static void __split_huge_page(struct page *page, struct list_head *list)
/* complete memcg works before add pages to LRU */ /* complete memcg works before add pages to LRU */
mem_cgroup_split_huge_fixup(head); mem_cgroup_split_huge_fixup(head);
tail_mapcount = 0;
for (i = HPAGE_PMD_NR - 1; i >= 1; i--) for (i = HPAGE_PMD_NR - 1; i >= 1; i--)
tail_mapcount += __split_huge_page_tail(head, i, lruvec, list); __split_huge_page_tail(head, i, lruvec, list);
atomic_sub(tail_mapcount, &head->_count);
ClearPageCompound(head); ClearPageCompound(head);
spin_unlock_irq(&zone->lru_lock); spin_unlock_irq(&zone->lru_lock);

@ -14,6 +14,7 @@
#include <linux/fs.h> #include <linux/fs.h>
#include <linux/mm.h> #include <linux/mm.h>
#include <linux/pagemap.h> #include <linux/pagemap.h>
#include <linux/tracepoint-defs.h>
/* /*
* The set of flags that only affect watermark checking and reclaim * The set of flags that only affect watermark checking and reclaim
@ -131,6 +132,18 @@ __find_buddy_index(unsigned long page_idx, unsigned int order)
return page_idx ^ (1 << order); return page_idx ^ (1 << order);
} }
extern struct page *__pageblock_pfn_to_page(unsigned long start_pfn,
unsigned long end_pfn, struct zone *zone);
static inline struct page *pageblock_pfn_to_page(unsigned long start_pfn,
unsigned long end_pfn, struct zone *zone)
{
if (zone->contiguous)
return pfn_to_page(start_pfn);
return __pageblock_pfn_to_page(start_pfn, end_pfn, zone);
}
extern int __isolate_free_page(struct page *page, unsigned int order); extern int __isolate_free_page(struct page *page, unsigned int order);
extern void __free_pages_bootmem(struct page *page, unsigned long pfn, extern void __free_pages_bootmem(struct page *page, unsigned long pfn,
unsigned int order); unsigned int order);
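The zone->contiguous fast path above is only valid once something has verified that every pageblock in the zone passes the slow-path check; set_zone_contiguous()/clear_zone_contiguous() in the memory-hotplug changes below maintain that state. A simplified sketch of such a scan (illustrative only, not the implementation added by this series):

	static bool zone_spans_only_valid_pageblocks(struct zone *zone)
	{
		unsigned long block_start_pfn = zone->zone_start_pfn;
		unsigned long block_end_pfn = ALIGN(block_start_pfn + 1,
						    pageblock_nr_pages);

		for (; block_start_pfn < zone_end_pfn(zone);
		     block_start_pfn = block_end_pfn,
		     block_end_pfn += pageblock_nr_pages) {

			block_end_pfn = min(block_end_pfn, zone_end_pfn(zone));

			/* every pageblock must pass the slow-path check */
			if (!__pageblock_pfn_to_page(block_start_pfn,
						     block_end_pfn, zone))
				return false;
		}
		return true;
	}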
@ -466,4 +479,9 @@ static inline void try_to_unmap_flush_dirty(void)
} }
#endif /* CONFIG_ARCH_WANT_BATCHED_UNMAP_TLB_FLUSH */ #endif /* CONFIG_ARCH_WANT_BATCHED_UNMAP_TLB_FLUSH */
extern const struct trace_print_flags pageflag_names[];
extern const struct trace_print_flags vmaflag_names[];
extern const struct trace_print_flags gfpflag_names[];
#endif /* __MM_INTERNAL_H */ #endif /* __MM_INTERNAL_H */

@ -60,6 +60,9 @@ void kmemcheck_free_shadow(struct page *page, int order)
void kmemcheck_slab_alloc(struct kmem_cache *s, gfp_t gfpflags, void *object, void kmemcheck_slab_alloc(struct kmem_cache *s, gfp_t gfpflags, void *object,
size_t size) size_t size)
{ {
if (unlikely(!object)) /* Skip object if allocation failed */
return;
/* /*
* Has already been memset(), which initializes the shadow for us * Has already been memset(), which initializes the shadow for us
* as well. * as well.

@ -555,8 +555,9 @@ static int madvise_hwpoison(int bhv, unsigned long start, unsigned long end)
} }
pr_info("Injecting memory failure for page %#lx at %#lx\n", pr_info("Injecting memory failure for page %#lx at %#lx\n",
page_to_pfn(p), start); page_to_pfn(p), start);
/* Ignore return value for now */ ret = memory_failure(page_to_pfn(p), 0, MF_COUNT_INCREASED);
memory_failure(page_to_pfn(p), 0, MF_COUNT_INCREASED); if (ret)
return ret;
} }
return 0; return 0;
} }
@ -638,14 +639,28 @@ madvise_behavior_valid(int behavior)
* some pages ahead. * some pages ahead.
* MADV_DONTNEED - the application is finished with the given range, * MADV_DONTNEED - the application is finished with the given range,
* so the kernel can free resources associated with it. * so the kernel can free resources associated with it.
* MADV_FREE - the application marks pages in the given range as lazy free,
* where actual purges are postponed until memory pressure happens.
* MADV_REMOVE - the application wants to free up the given range of * MADV_REMOVE - the application wants to free up the given range of
* pages and associated backing store. * pages and associated backing store.
* MADV_DONTFORK - omit this area from child's address space when forking: * MADV_DONTFORK - omit this area from child's address space when forking:
* typically, to avoid COWing pages pinned by get_user_pages(). * typically, to avoid COWing pages pinned by get_user_pages().
* MADV_DOFORK - cancel MADV_DONTFORK: no longer omit this area when forking. * MADV_DOFORK - cancel MADV_DONTFORK: no longer omit this area when forking.
* MADV_HWPOISON - trigger memory error handler as if the given memory range
* were corrupted by unrecoverable hardware memory failure.
* MADV_SOFT_OFFLINE - try to soft-offline the given range of memory.
* MADV_MERGEABLE - the application recommends that KSM try to merge pages in * MADV_MERGEABLE - the application recommends that KSM try to merge pages in
* this area with pages of identical content from other such areas. * this area with pages of identical content from other such areas.
* MADV_UNMERGEABLE- cancel MADV_MERGEABLE: no longer merge pages with others. * MADV_UNMERGEABLE- cancel MADV_MERGEABLE: no longer merge pages with others.
* MADV_HUGEPAGE - the application wants to back the given range by transparent
* huge pages in the future. Existing pages might be coalesced and
* new pages might be allocated as THP.
* MADV_NOHUGEPAGE - mark the given range as not worth being backed by
* transparent huge pages so the existing pages will not be
* coalesced into THP and new pages will not be allocated as THP.
* MADV_DONTDUMP - the application wants to prevent pages in the given range
* from being included in its core dump.
* MADV_DODUMP - cancel MADV_DONTDUMP: no longer exclude from core dump.
* *
* return values: * return values:
* zero - success * zero - success
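For the newly documented MADV_FREE semantics, a minimal userspace sketch looks like this (illustrative; assumes a libc/uapi header that exposes MADV_FREE, and error handling is omitted):

	#include <sys/mman.h>

	int main(void)
	{
		size_t len = 1 << 20;
		char *buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
				 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

		buf[0] = 1;	/* touch the mapping so a page is actually allocated */

		/* Mark the range lazily freeable: the kernel may purge the pages
		 * under memory pressure instead of freeing them immediately. */
		madvise(buf, len, MADV_FREE);
		return 0;
	}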

@ -612,14 +612,12 @@ static int __init_memblock memblock_add_region(phys_addr_t base,
int nid, int nid,
unsigned long flags) unsigned long flags)
{ {
struct memblock_type *type = &memblock.memory;
memblock_dbg("memblock_add: [%#016llx-%#016llx] flags %#02lx %pF\n", memblock_dbg("memblock_add: [%#016llx-%#016llx] flags %#02lx %pF\n",
(unsigned long long)base, (unsigned long long)base,
(unsigned long long)base + size - 1, (unsigned long long)base + size - 1,
flags, (void *)_RET_IP_); flags, (void *)_RET_IP_);
return memblock_add_range(type, base, size, nid, flags); return memblock_add_range(&memblock.memory, base, size, nid, flags);
} }
int __init_memblock memblock_add(phys_addr_t base, phys_addr_t size) int __init_memblock memblock_add(phys_addr_t base, phys_addr_t size)
@ -740,14 +738,12 @@ static int __init_memblock memblock_reserve_region(phys_addr_t base,
int nid, int nid,
unsigned long flags) unsigned long flags)
{ {
struct memblock_type *type = &memblock.reserved;
memblock_dbg("memblock_reserve: [%#016llx-%#016llx] flags %#02lx %pF\n", memblock_dbg("memblock_reserve: [%#016llx-%#016llx] flags %#02lx %pF\n",
(unsigned long long)base, (unsigned long long)base,
(unsigned long long)base + size - 1, (unsigned long long)base + size - 1,
flags, (void *)_RET_IP_); flags, (void *)_RET_IP_);
return memblock_add_range(type, base, size, nid, flags); return memblock_add_range(&memblock.reserved, base, size, nid, flags);
} }
int __init_memblock memblock_reserve(phys_addr_t base, phys_addr_t size) int __init_memblock memblock_reserve(phys_addr_t base, phys_addr_t size)

@ -268,31 +268,6 @@ static inline bool mem_cgroup_is_root(struct mem_cgroup *memcg)
return (memcg == root_mem_cgroup); return (memcg == root_mem_cgroup);
} }
/*
* We restrict the id in the range of [1, 65535], so it can fit into
* an unsigned short.
*/
#define MEM_CGROUP_ID_MAX USHRT_MAX
static inline unsigned short mem_cgroup_id(struct mem_cgroup *memcg)
{
return memcg->css.id;
}
/*
* A helper function to get mem_cgroup from ID. must be called under
* rcu_read_lock(). The caller is responsible for calling
* css_tryget_online() if the mem_cgroup is used for charging. (dropping
* refcnt from swap can be called against removed memcg.)
*/
static inline struct mem_cgroup *mem_cgroup_from_id(unsigned short id)
{
struct cgroup_subsys_state *css;
css = css_from_id(id, &memory_cgrp_subsys);
return mem_cgroup_from_css(css);
}
#ifndef CONFIG_SLOB #ifndef CONFIG_SLOB
/* /*
* This will be the memcg's index in each cache's ->memcg_params.memcg_caches. * This will be the memcg's index in each cache's ->memcg_params.memcg_caches.
@ -1709,19 +1684,13 @@ cleanup:
} }
/** /**
* mem_cgroup_begin_page_stat - begin a page state statistics transaction * lock_page_memcg - lock a page->mem_cgroup binding
* @page: page that is going to change accounted state * @page: the page
* *
* This function must mark the beginning of an accounted page state * This function protects unlocked LRU pages from being moved to
* change to prevent double accounting when the page is concurrently * another cgroup and stabilizes their page->mem_cgroup binding.
* being moved to another memcg:
*
* memcg = mem_cgroup_begin_page_stat(page);
* if (TestClearPageState(page))
* mem_cgroup_update_page_stat(memcg, state, -1);
* mem_cgroup_end_page_stat(memcg);
*/ */
struct mem_cgroup *mem_cgroup_begin_page_stat(struct page *page) void lock_page_memcg(struct page *page)
{ {
struct mem_cgroup *memcg; struct mem_cgroup *memcg;
unsigned long flags; unsigned long flags;
@ -1730,25 +1699,18 @@ struct mem_cgroup *mem_cgroup_begin_page_stat(struct page *page)
* The RCU lock is held throughout the transaction. The fast * The RCU lock is held throughout the transaction. The fast
* path can get away without acquiring the memcg->move_lock * path can get away without acquiring the memcg->move_lock
* because page moving starts with an RCU grace period. * because page moving starts with an RCU grace period.
*
* The RCU lock also protects the memcg from being freed when
* the page state that is going to change is the only thing
* preventing the page from being uncharged.
* E.g. end-writeback clearing PageWriteback(), which allows
* migration to go ahead and uncharge the page before the
* account transaction might be complete.
*/ */
rcu_read_lock(); rcu_read_lock();
if (mem_cgroup_disabled()) if (mem_cgroup_disabled())
return NULL; return;
again: again:
memcg = page->mem_cgroup; memcg = page->mem_cgroup;
if (unlikely(!memcg)) if (unlikely(!memcg))
return NULL; return;
if (atomic_read(&memcg->moving_account) <= 0) if (atomic_read(&memcg->moving_account) <= 0)
return memcg; return;
spin_lock_irqsave(&memcg->move_lock, flags); spin_lock_irqsave(&memcg->move_lock, flags);
if (memcg != page->mem_cgroup) { if (memcg != page->mem_cgroup) {
@ -1759,21 +1721,23 @@ again:
/* /*
* When charge migration first begins, we can have locked and * When charge migration first begins, we can have locked and
* unlocked page stat updates happening concurrently. Track * unlocked page stat updates happening concurrently. Track
* the task who has the lock for mem_cgroup_end_page_stat(). * the task who has the lock for unlock_page_memcg().
*/ */
memcg->move_lock_task = current; memcg->move_lock_task = current;
memcg->move_lock_flags = flags; memcg->move_lock_flags = flags;
return memcg; return;
} }
EXPORT_SYMBOL(mem_cgroup_begin_page_stat); EXPORT_SYMBOL(lock_page_memcg);
/** /**
* mem_cgroup_end_page_stat - finish a page state statistics transaction * unlock_page_memcg - unlock a page->mem_cgroup binding
* @memcg: the memcg that was accounted against * @page: the page
*/ */
void mem_cgroup_end_page_stat(struct mem_cgroup *memcg) void unlock_page_memcg(struct page *page)
{ {
struct mem_cgroup *memcg = page->mem_cgroup;
if (memcg && memcg->move_lock_task == current) { if (memcg && memcg->move_lock_task == current) {
unsigned long flags = memcg->move_lock_flags; unsigned long flags = memcg->move_lock_flags;
@ -1785,7 +1749,7 @@ void mem_cgroup_end_page_stat(struct mem_cgroup *memcg)
rcu_read_unlock(); rcu_read_unlock();
} }
EXPORT_SYMBOL(mem_cgroup_end_page_stat); EXPORT_SYMBOL(unlock_page_memcg);
/* /*
* size of first charge trial. "32" comes from vmscan.c's magic value. * size of first charge trial. "32" comes from vmscan.c's magic value.
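The renamed pair is used bracket-style around page-state updates; a caller sketch consistent with the accounting sites converted elsewhere in this series (illustrative, the particular statistic shown is just an example):

	lock_page_memcg(page);
	if (TestClearPageDirty(page))
		mem_cgroup_dec_page_stat(page, MEM_CGROUP_STAT_DIRTY);
	unlock_page_memcg(page);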
@ -4488,7 +4452,7 @@ static int mem_cgroup_move_account(struct page *page,
VM_BUG_ON(compound && !PageTransHuge(page)); VM_BUG_ON(compound && !PageTransHuge(page));
/* /*
* Prevent mem_cgroup_replace_page() from looking at * Prevent mem_cgroup_migrate() from looking at
* page->mem_cgroup of its source page while we change it. * page->mem_cgroup of its source page while we change it.
*/ */
ret = -EBUSY; ret = -EBUSY;
@ -4923,9 +4887,9 @@ static void mem_cgroup_move_charge(struct mm_struct *mm)
lru_add_drain_all(); lru_add_drain_all();
/* /*
* Signal mem_cgroup_begin_page_stat() to take the memcg's * Signal lock_page_memcg() to take the memcg's move_lock
* move_lock while we're moving its pages to another memcg. * while we're moving its pages to another memcg. Then wait
* Then wait for already started RCU-only updates to finish. * for already started RCU-only updates to finish.
*/ */
atomic_inc(&mc.from->moving_account); atomic_inc(&mc.from->moving_account);
synchronize_rcu(); synchronize_rcu();
@ -5517,16 +5481,16 @@ void mem_cgroup_uncharge_list(struct list_head *page_list)
} }
/** /**
* mem_cgroup_replace_page - migrate a charge to another page * mem_cgroup_migrate - charge a page's replacement
* @oldpage: currently charged page * @oldpage: currently circulating page
* @newpage: page to transfer the charge to * @newpage: replacement page
* *
* Migrate the charge from @oldpage to @newpage. * Charge @newpage as a replacement page for @oldpage. @oldpage will
* be uncharged upon free.
* *
* Both pages must be locked, @newpage->mapping must be set up. * Both pages must be locked, @newpage->mapping must be set up.
* Either or both pages might be on the LRU already.
*/ */
void mem_cgroup_replace_page(struct page *oldpage, struct page *newpage) void mem_cgroup_migrate(struct page *oldpage, struct page *newpage)
{ {
struct mem_cgroup *memcg; struct mem_cgroup *memcg;
unsigned int nr_pages; unsigned int nr_pages;
@ -5559,7 +5523,7 @@ void mem_cgroup_replace_page(struct page *oldpage, struct page *newpage)
page_counter_charge(&memcg->memsw, nr_pages); page_counter_charge(&memcg->memsw, nr_pages);
css_get_many(&memcg->css, nr_pages); css_get_many(&memcg->css, nr_pages);
commit_charge(newpage, memcg, true); commit_charge(newpage, memcg, false);
local_irq_disable(); local_irq_disable();
mem_cgroup_charge_statistics(memcg, newpage, compound, nr_pages); mem_cgroup_charge_statistics(memcg, newpage, compound, nr_pages);

@ -826,8 +826,6 @@ static struct page_state {
#undef lru #undef lru
#undef swapbacked #undef swapbacked
#undef head #undef head
#undef tail
#undef compound
#undef slab #undef slab
#undef reserved #undef reserved

@ -1897,7 +1897,9 @@ int apply_to_page_range(struct mm_struct *mm, unsigned long addr,
unsigned long end = addr + size; unsigned long end = addr + size;
int err; int err;
BUG_ON(addr >= end); if (WARN_ON(addr >= end))
return -EINVAL;
pgd = pgd_offset(mm, addr); pgd = pgd_offset(mm, addr);
do { do {
next = pgd_addr_end(addr, end); next = pgd_addr_end(addr, end);
@ -3143,8 +3145,7 @@ static int do_fault(struct mm_struct *mm, struct vm_area_struct *vma,
unsigned long address, pte_t *page_table, pmd_t *pmd, unsigned long address, pte_t *page_table, pmd_t *pmd,
unsigned int flags, pte_t orig_pte) unsigned int flags, pte_t orig_pte)
{ {
pgoff_t pgoff = (((address & PAGE_MASK) pgoff_t pgoff = linear_page_index(vma, address);
- vma->vm_start) >> PAGE_SHIFT) + vma->vm_pgoff;
pte_unmap(page_table); pte_unmap(page_table);
/* The VMA was not fully populated on mmap() or missing VM_DONTEXPAND */ /* The VMA was not fully populated on mmap() or missing VM_DONTEXPAND */

Просмотреть файл

@ -77,6 +77,9 @@ static struct {
#define memhp_lock_acquire() lock_map_acquire(&mem_hotplug.dep_map) #define memhp_lock_acquire() lock_map_acquire(&mem_hotplug.dep_map)
#define memhp_lock_release() lock_map_release(&mem_hotplug.dep_map) #define memhp_lock_release() lock_map_release(&mem_hotplug.dep_map)
bool memhp_auto_online;
EXPORT_SYMBOL_GPL(memhp_auto_online);
void get_online_mems(void) void get_online_mems(void)
{ {
might_sleep(); might_sleep();
@ -509,6 +512,8 @@ int __ref __add_pages(int nid, struct zone *zone, unsigned long phys_start_pfn,
int start_sec, end_sec; int start_sec, end_sec;
struct vmem_altmap *altmap; struct vmem_altmap *altmap;
clear_zone_contiguous(zone);
/* during initialize mem_map, align hot-added range to section */ /* during initialize mem_map, align hot-added range to section */
start_sec = pfn_to_section_nr(phys_start_pfn); start_sec = pfn_to_section_nr(phys_start_pfn);
end_sec = pfn_to_section_nr(phys_start_pfn + nr_pages - 1); end_sec = pfn_to_section_nr(phys_start_pfn + nr_pages - 1);
@ -521,7 +526,8 @@ int __ref __add_pages(int nid, struct zone *zone, unsigned long phys_start_pfn,
if (altmap->base_pfn != phys_start_pfn if (altmap->base_pfn != phys_start_pfn
|| vmem_altmap_offset(altmap) > nr_pages) { || vmem_altmap_offset(altmap) > nr_pages) {
pr_warn_once("memory add fail, invalid altmap\n"); pr_warn_once("memory add fail, invalid altmap\n");
return -EINVAL; err = -EINVAL;
goto out;
} }
altmap->alloc = 0; altmap->alloc = 0;
} }
@ -539,7 +545,8 @@ int __ref __add_pages(int nid, struct zone *zone, unsigned long phys_start_pfn,
err = 0; err = 0;
} }
vmemmap_populate_print_last(); vmemmap_populate_print_last();
out:
set_zone_contiguous(zone);
return err; return err;
} }
EXPORT_SYMBOL_GPL(__add_pages); EXPORT_SYMBOL_GPL(__add_pages);
@ -811,6 +818,8 @@ int __remove_pages(struct zone *zone, unsigned long phys_start_pfn,
} }
} }
clear_zone_contiguous(zone);
/* /*
* We can only remove entire sections * We can only remove entire sections
*/ */
@ -826,6 +835,9 @@ int __remove_pages(struct zone *zone, unsigned long phys_start_pfn,
if (ret) if (ret)
break; break;
} }
set_zone_contiguous(zone);
return ret; return ret;
} }
EXPORT_SYMBOL_GPL(__remove_pages); EXPORT_SYMBOL_GPL(__remove_pages);
@ -1261,8 +1273,13 @@ int zone_for_memory(int nid, u64 start, u64 size, int zone_default,
return zone_default; return zone_default;
} }
static int online_memory_block(struct memory_block *mem, void *arg)
{
return memory_block_change_state(mem, MEM_ONLINE, MEM_OFFLINE);
}
/* we are OK calling __meminit stuff here - we have CONFIG_MEMORY_HOTPLUG */ /* we are OK calling __meminit stuff here - we have CONFIG_MEMORY_HOTPLUG */
int __ref add_memory_resource(int nid, struct resource *res) int __ref add_memory_resource(int nid, struct resource *res, bool online)
{ {
u64 start, size; u64 start, size;
pg_data_t *pgdat = NULL; pg_data_t *pgdat = NULL;
@ -1322,6 +1339,11 @@ int __ref add_memory_resource(int nid, struct resource *res)
/* create new memmap entry */ /* create new memmap entry */
firmware_map_add_hotplug(start, start + size, "System RAM"); firmware_map_add_hotplug(start, start + size, "System RAM");
/* online pages if requested */
if (online)
walk_memory_range(PFN_DOWN(start), PFN_UP(start + size - 1),
NULL, online_memory_block);
goto out; goto out;
error: error:
@ -1345,7 +1367,7 @@ int __ref add_memory(int nid, u64 start, u64 size)
if (IS_ERR(res)) if (IS_ERR(res))
return PTR_ERR(res); return PTR_ERR(res);
ret = add_memory_resource(nid, res); ret = add_memory_resource(nid, res, memhp_auto_online);
if (ret < 0) if (ret < 0)
release_memory_resource(res); release_memory_resource(res);
return ret; return ret;
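Existing add_memory() callers pick up the auto-online behaviour without source changes, because the wrapper now forwards memhp_auto_online; a driver-side sketch with hypothetical values:

	/* Hot-add 1 GiB at a firmware-provided physical address on node 0.
	 * Whether it comes online immediately now depends on memhp_auto_online. */
	int ret = add_memory(0, 0x100000000ULL, SZ_1G);

	if (ret)
		pr_err("memory hot-add failed: %d\n", ret);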

@ -643,7 +643,9 @@ static int queue_pages_test_walk(unsigned long start, unsigned long end,
if (flags & MPOL_MF_LAZY) { if (flags & MPOL_MF_LAZY) {
/* Similar to task_numa_work, skip inaccessible VMAs */ /* Similar to task_numa_work, skip inaccessible VMAs */
if (vma->vm_flags & (VM_READ | VM_EXEC | VM_WRITE)) if (!is_vm_hugetlb_page(vma) &&
(vma->vm_flags & (VM_READ | VM_EXEC | VM_WRITE)) &&
!(vma->vm_flags & VM_MIXEDMAP))
change_prot_numa(vma, start, endvma); change_prot_numa(vma, start, endvma);
return 1; return 1;
} }

@ -38,6 +38,7 @@
#include <linux/balloon_compaction.h> #include <linux/balloon_compaction.h>
#include <linux/mmu_notifier.h> #include <linux/mmu_notifier.h>
#include <linux/page_idle.h> #include <linux/page_idle.h>
#include <linux/page_owner.h>
#include <asm/tlbflush.h> #include <asm/tlbflush.h>
@ -325,7 +326,6 @@ int migrate_page_move_mapping(struct address_space *mapping,
return -EAGAIN; return -EAGAIN;
/* No turning back from here */ /* No turning back from here */
set_page_memcg(newpage, page_memcg(page));
newpage->index = page->index; newpage->index = page->index;
newpage->mapping = page->mapping; newpage->mapping = page->mapping;
if (PageSwapBacked(page)) if (PageSwapBacked(page))
@ -372,7 +372,6 @@ int migrate_page_move_mapping(struct address_space *mapping,
* Now we know that no one else is looking at the page: * Now we know that no one else is looking at the page:
* no turning back from here. * no turning back from here.
*/ */
set_page_memcg(newpage, page_memcg(page));
newpage->index = page->index; newpage->index = page->index;
newpage->mapping = page->mapping; newpage->mapping = page->mapping;
if (PageSwapBacked(page)) if (PageSwapBacked(page))
@ -457,9 +456,9 @@ int migrate_huge_page_move_mapping(struct address_space *mapping,
return -EAGAIN; return -EAGAIN;
} }
set_page_memcg(newpage, page_memcg(page));
newpage->index = page->index; newpage->index = page->index;
newpage->mapping = page->mapping; newpage->mapping = page->mapping;
get_page(newpage); get_page(newpage);
radix_tree_replace_slot(pslot, newpage); radix_tree_replace_slot(pslot, newpage);
@ -467,6 +466,7 @@ int migrate_huge_page_move_mapping(struct address_space *mapping,
page_unfreeze_refs(page, expected_count - 1); page_unfreeze_refs(page, expected_count - 1);
spin_unlock_irq(&mapping->tree_lock); spin_unlock_irq(&mapping->tree_lock);
return MIGRATEPAGE_SUCCESS; return MIGRATEPAGE_SUCCESS;
} }
@ -578,6 +578,10 @@ void migrate_page_copy(struct page *newpage, struct page *page)
*/ */
if (PageWriteback(newpage)) if (PageWriteback(newpage))
end_page_writeback(newpage); end_page_writeback(newpage);
copy_page_owner(page, newpage);
mem_cgroup_migrate(page, newpage);
} }
/************************************************************ /************************************************************
@ -772,7 +776,6 @@ static int move_to_new_page(struct page *newpage, struct page *page,
* page is freed; but stats require that PageAnon be left as PageAnon. * page is freed; but stats require that PageAnon be left as PageAnon.
*/ */
if (rc == MIGRATEPAGE_SUCCESS) { if (rc == MIGRATEPAGE_SUCCESS) {
set_page_memcg(page, NULL);
if (!PageAnon(page)) if (!PageAnon(page))
page->mapping = NULL; page->mapping = NULL;
} }
@ -952,8 +955,10 @@ static ICE_noinline int unmap_and_move(new_page_t get_new_page,
} }
rc = __unmap_and_move(page, newpage, force, mode); rc = __unmap_and_move(page, newpage, force, mode);
if (rc == MIGRATEPAGE_SUCCESS) if (rc == MIGRATEPAGE_SUCCESS) {
put_new_page = NULL; put_new_page = NULL;
set_page_owner_migrate_reason(newpage, reason);
}
out: out:
if (rc != -EAGAIN) { if (rc != -EAGAIN) {
@ -1018,7 +1023,7 @@ out:
static int unmap_and_move_huge_page(new_page_t get_new_page, static int unmap_and_move_huge_page(new_page_t get_new_page,
free_page_t put_new_page, unsigned long private, free_page_t put_new_page, unsigned long private,
struct page *hpage, int force, struct page *hpage, int force,
enum migrate_mode mode) enum migrate_mode mode, int reason)
{ {
int rc = -EAGAIN; int rc = -EAGAIN;
int *result = NULL; int *result = NULL;
@ -1076,6 +1081,7 @@ put_anon:
if (rc == MIGRATEPAGE_SUCCESS) { if (rc == MIGRATEPAGE_SUCCESS) {
hugetlb_cgroup_migrate(hpage, new_hpage); hugetlb_cgroup_migrate(hpage, new_hpage);
put_new_page = NULL; put_new_page = NULL;
set_page_owner_migrate_reason(new_hpage, reason);
} }
unlock_page(hpage); unlock_page(hpage);
@ -1148,7 +1154,7 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page,
if (PageHuge(page)) if (PageHuge(page))
rc = unmap_and_move_huge_page(get_new_page, rc = unmap_and_move_huge_page(get_new_page,
put_new_page, private, page, put_new_page, private, page,
pass > 2, mode); pass > 2, mode, reason);
else else
rc = unmap_and_move(get_new_page, put_new_page, rc = unmap_and_move(get_new_page, put_new_page,
private, page, pass > 2, mode, private, page, pass > 2, mode,
@ -1836,9 +1842,8 @@ fail_putback:
} }
mlock_migrate_page(new_page, page); mlock_migrate_page(new_page, page);
set_page_memcg(new_page, page_memcg(page));
set_page_memcg(page, NULL);
page_remove_rmap(page, true); page_remove_rmap(page, true);
set_page_owner_migrate_reason(new_page, MR_NUMA_MISPLACED);
spin_unlock(ptl); spin_unlock(ptl);
mmu_notifier_invalidate_range_end(mm, mmun_start, mmun_end); mmu_notifier_invalidate_range_end(mm, mmun_start, mmun_end);

@ -386,10 +386,11 @@ static void dump_tasks(struct mem_cgroup *memcg, const nodemask_t *nodemask)
static void dump_header(struct oom_control *oc, struct task_struct *p, static void dump_header(struct oom_control *oc, struct task_struct *p,
struct mem_cgroup *memcg) struct mem_cgroup *memcg)
{ {
pr_warning("%s invoked oom-killer: gfp_mask=0x%x, order=%d, " pr_warn("%s invoked oom-killer: gfp_mask=%#x(%pGg), order=%d, "
"oom_score_adj=%hd\n", "oom_score_adj=%hd\n",
current->comm, oc->gfp_mask, oc->order, current->comm, oc->gfp_mask, &oc->gfp_mask, oc->order,
current->signal->oom_score_adj); current->signal->oom_score_adj);
cpuset_print_current_mems_allowed(); cpuset_print_current_mems_allowed();
dump_stack(); dump_stack();
if (memcg) if (memcg)


@ -1169,6 +1169,7 @@ static void wb_update_dirty_ratelimit(struct dirty_throttle_control *dtc,
unsigned long balanced_dirty_ratelimit; unsigned long balanced_dirty_ratelimit;
unsigned long step; unsigned long step;
unsigned long x; unsigned long x;
unsigned long shift;
/* /*
* The dirty rate will match the writeout rate in long term, except * The dirty rate will match the writeout rate in long term, except
@ -1293,11 +1294,11 @@ static void wb_update_dirty_ratelimit(struct dirty_throttle_control *dtc,
* rate itself is constantly fluctuating. So decrease the track speed * rate itself is constantly fluctuating. So decrease the track speed
* when it gets close to the target. Helps eliminate pointless tremors. * when it gets close to the target. Helps eliminate pointless tremors.
*/ */
step >>= dirty_ratelimit / (2 * step + 1); shift = dirty_ratelimit / (2 * step + 1);
/* if (shift < BITS_PER_LONG)
* Limit the tracking speed to avoid overshooting. step = DIV_ROUND_UP(step >> shift, 8);
*/ else
step = (step + 7) / 8; step = 0;
if (dirty_ratelimit < balanced_dirty_ratelimit) if (dirty_ratelimit < balanced_dirty_ratelimit)
dirty_ratelimit += step; dirty_ratelimit += step;
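
The writeback change above replaces the open-coded "step >>= ..." with an explicit shift-count check. Shifting an unsigned long by BITS_PER_LONG or more is undefined in C (and on x86 the count is silently masked), so a huge ratio could previously leave step unchanged instead of shrinking it toward zero. A minimal userspace sketch of the guarded calculation, with illustrative values rather than real writeback numbers:

#include <stdio.h>
#include <limits.h>

#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)
#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/* Damped tracking step, shaped like the patched wb_update_dirty_ratelimit(). */
static unsigned long damped_step(unsigned long step, unsigned long dirty_ratelimit)
{
	unsigned long shift = dirty_ratelimit / (2 * step + 1);

	/* Shifting by >= BITS_PER_LONG is undefined; treat it as "step decays to 0". */
	if (shift < BITS_PER_LONG)
		return DIV_ROUND_UP(step >> shift, 8);
	return 0;
}

int main(void)
{
	/* Hypothetical values: a tiny step and a large ratelimit give a huge shift. */
	printf("%lu\n", damped_step(3, 1UL << 40));   /* 0, not a stale value */
	printf("%lu\n", damped_step(4096, 8192));     /* ordinary case: 512 */
	return 0;
}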
@ -2409,12 +2410,11 @@ int __set_page_dirty_no_writeback(struct page *page)
/* /*
* Helper function for set_page_dirty family. * Helper function for set_page_dirty family.
* *
* Caller must hold mem_cgroup_begin_page_stat(). * Caller must hold lock_page_memcg().
* *
* NOTE: This relies on being atomic wrt interrupts. * NOTE: This relies on being atomic wrt interrupts.
*/ */
void account_page_dirtied(struct page *page, struct address_space *mapping, void account_page_dirtied(struct page *page, struct address_space *mapping)
struct mem_cgroup *memcg)
{ {
struct inode *inode = mapping->host; struct inode *inode = mapping->host;
@ -2426,7 +2426,7 @@ void account_page_dirtied(struct page *page, struct address_space *mapping,
inode_attach_wb(inode, page); inode_attach_wb(inode, page);
wb = inode_to_wb(inode); wb = inode_to_wb(inode);
mem_cgroup_inc_page_stat(memcg, MEM_CGROUP_STAT_DIRTY); mem_cgroup_inc_page_stat(page, MEM_CGROUP_STAT_DIRTY);
__inc_zone_page_state(page, NR_FILE_DIRTY); __inc_zone_page_state(page, NR_FILE_DIRTY);
__inc_zone_page_state(page, NR_DIRTIED); __inc_zone_page_state(page, NR_DIRTIED);
__inc_wb_stat(wb, WB_RECLAIMABLE); __inc_wb_stat(wb, WB_RECLAIMABLE);
@ -2441,13 +2441,13 @@ EXPORT_SYMBOL(account_page_dirtied);
/* /*
* Helper function for deaccounting dirty page without writeback. * Helper function for deaccounting dirty page without writeback.
* *
* Caller must hold mem_cgroup_begin_page_stat(). * Caller must hold lock_page_memcg().
*/ */
void account_page_cleaned(struct page *page, struct address_space *mapping, void account_page_cleaned(struct page *page, struct address_space *mapping,
struct mem_cgroup *memcg, struct bdi_writeback *wb) struct bdi_writeback *wb)
{ {
if (mapping_cap_account_dirty(mapping)) { if (mapping_cap_account_dirty(mapping)) {
mem_cgroup_dec_page_stat(memcg, MEM_CGROUP_STAT_DIRTY); mem_cgroup_dec_page_stat(page, MEM_CGROUP_STAT_DIRTY);
dec_zone_page_state(page, NR_FILE_DIRTY); dec_zone_page_state(page, NR_FILE_DIRTY);
dec_wb_stat(wb, WB_RECLAIMABLE); dec_wb_stat(wb, WB_RECLAIMABLE);
task_io_account_cancelled_write(PAGE_CACHE_SIZE); task_io_account_cancelled_write(PAGE_CACHE_SIZE);
@ -2468,26 +2468,24 @@ void account_page_cleaned(struct page *page, struct address_space *mapping,
*/ */
int __set_page_dirty_nobuffers(struct page *page) int __set_page_dirty_nobuffers(struct page *page)
{ {
struct mem_cgroup *memcg; lock_page_memcg(page);
memcg = mem_cgroup_begin_page_stat(page);
if (!TestSetPageDirty(page)) { if (!TestSetPageDirty(page)) {
struct address_space *mapping = page_mapping(page); struct address_space *mapping = page_mapping(page);
unsigned long flags; unsigned long flags;
if (!mapping) { if (!mapping) {
mem_cgroup_end_page_stat(memcg); unlock_page_memcg(page);
return 1; return 1;
} }
spin_lock_irqsave(&mapping->tree_lock, flags); spin_lock_irqsave(&mapping->tree_lock, flags);
BUG_ON(page_mapping(page) != mapping); BUG_ON(page_mapping(page) != mapping);
WARN_ON_ONCE(!PagePrivate(page) && !PageUptodate(page)); WARN_ON_ONCE(!PagePrivate(page) && !PageUptodate(page));
account_page_dirtied(page, mapping, memcg); account_page_dirtied(page, mapping);
radix_tree_tag_set(&mapping->page_tree, page_index(page), radix_tree_tag_set(&mapping->page_tree, page_index(page),
PAGECACHE_TAG_DIRTY); PAGECACHE_TAG_DIRTY);
spin_unlock_irqrestore(&mapping->tree_lock, flags); spin_unlock_irqrestore(&mapping->tree_lock, flags);
mem_cgroup_end_page_stat(memcg); unlock_page_memcg(page);
if (mapping->host) { if (mapping->host) {
/* !PageAnon && !swapper_space */ /* !PageAnon && !swapper_space */
@ -2495,7 +2493,7 @@ int __set_page_dirty_nobuffers(struct page *page)
} }
return 1; return 1;
} }
mem_cgroup_end_page_stat(memcg); unlock_page_memcg(page);
return 0; return 0;
} }
EXPORT_SYMBOL(__set_page_dirty_nobuffers); EXPORT_SYMBOL(__set_page_dirty_nobuffers);
@ -2625,17 +2623,16 @@ void cancel_dirty_page(struct page *page)
if (mapping_cap_account_dirty(mapping)) { if (mapping_cap_account_dirty(mapping)) {
struct inode *inode = mapping->host; struct inode *inode = mapping->host;
struct bdi_writeback *wb; struct bdi_writeback *wb;
struct mem_cgroup *memcg;
bool locked; bool locked;
memcg = mem_cgroup_begin_page_stat(page); lock_page_memcg(page);
wb = unlocked_inode_to_wb_begin(inode, &locked); wb = unlocked_inode_to_wb_begin(inode, &locked);
if (TestClearPageDirty(page)) if (TestClearPageDirty(page))
account_page_cleaned(page, mapping, memcg, wb); account_page_cleaned(page, mapping, wb);
unlocked_inode_to_wb_end(inode, locked); unlocked_inode_to_wb_end(inode, locked);
mem_cgroup_end_page_stat(memcg); unlock_page_memcg(page);
} else { } else {
ClearPageDirty(page); ClearPageDirty(page);
} }
@ -2666,7 +2663,6 @@ int clear_page_dirty_for_io(struct page *page)
if (mapping && mapping_cap_account_dirty(mapping)) { if (mapping && mapping_cap_account_dirty(mapping)) {
struct inode *inode = mapping->host; struct inode *inode = mapping->host;
struct bdi_writeback *wb; struct bdi_writeback *wb;
struct mem_cgroup *memcg;
bool locked; bool locked;
/* /*
@ -2704,16 +2700,14 @@ int clear_page_dirty_for_io(struct page *page)
* always locked coming in here, so we get the desired * always locked coming in here, so we get the desired
* exclusion. * exclusion.
*/ */
memcg = mem_cgroup_begin_page_stat(page);
wb = unlocked_inode_to_wb_begin(inode, &locked); wb = unlocked_inode_to_wb_begin(inode, &locked);
if (TestClearPageDirty(page)) { if (TestClearPageDirty(page)) {
mem_cgroup_dec_page_stat(memcg, MEM_CGROUP_STAT_DIRTY); mem_cgroup_dec_page_stat(page, MEM_CGROUP_STAT_DIRTY);
dec_zone_page_state(page, NR_FILE_DIRTY); dec_zone_page_state(page, NR_FILE_DIRTY);
dec_wb_stat(wb, WB_RECLAIMABLE); dec_wb_stat(wb, WB_RECLAIMABLE);
ret = 1; ret = 1;
} }
unlocked_inode_to_wb_end(inode, locked); unlocked_inode_to_wb_end(inode, locked);
mem_cgroup_end_page_stat(memcg);
return ret; return ret;
} }
return TestClearPageDirty(page); return TestClearPageDirty(page);
@ -2723,10 +2717,9 @@ EXPORT_SYMBOL(clear_page_dirty_for_io);
int test_clear_page_writeback(struct page *page) int test_clear_page_writeback(struct page *page)
{ {
struct address_space *mapping = page_mapping(page); struct address_space *mapping = page_mapping(page);
struct mem_cgroup *memcg;
int ret; int ret;
memcg = mem_cgroup_begin_page_stat(page); lock_page_memcg(page);
if (mapping) { if (mapping) {
struct inode *inode = mapping->host; struct inode *inode = mapping->host;
struct backing_dev_info *bdi = inode_to_bdi(inode); struct backing_dev_info *bdi = inode_to_bdi(inode);
@ -2750,21 +2743,20 @@ int test_clear_page_writeback(struct page *page)
ret = TestClearPageWriteback(page); ret = TestClearPageWriteback(page);
} }
if (ret) { if (ret) {
mem_cgroup_dec_page_stat(memcg, MEM_CGROUP_STAT_WRITEBACK); mem_cgroup_dec_page_stat(page, MEM_CGROUP_STAT_WRITEBACK);
dec_zone_page_state(page, NR_WRITEBACK); dec_zone_page_state(page, NR_WRITEBACK);
inc_zone_page_state(page, NR_WRITTEN); inc_zone_page_state(page, NR_WRITTEN);
} }
mem_cgroup_end_page_stat(memcg); unlock_page_memcg(page);
return ret; return ret;
} }
int __test_set_page_writeback(struct page *page, bool keep_write) int __test_set_page_writeback(struct page *page, bool keep_write)
{ {
struct address_space *mapping = page_mapping(page); struct address_space *mapping = page_mapping(page);
struct mem_cgroup *memcg;
int ret; int ret;
memcg = mem_cgroup_begin_page_stat(page); lock_page_memcg(page);
if (mapping) { if (mapping) {
struct inode *inode = mapping->host; struct inode *inode = mapping->host;
struct backing_dev_info *bdi = inode_to_bdi(inode); struct backing_dev_info *bdi = inode_to_bdi(inode);
@ -2792,10 +2784,10 @@ int __test_set_page_writeback(struct page *page, bool keep_write)
ret = TestSetPageWriteback(page); ret = TestSetPageWriteback(page);
} }
if (!ret) { if (!ret) {
mem_cgroup_inc_page_stat(memcg, MEM_CGROUP_STAT_WRITEBACK); mem_cgroup_inc_page_stat(page, MEM_CGROUP_STAT_WRITEBACK);
inc_zone_page_state(page, NR_WRITEBACK); inc_zone_page_state(page, NR_WRITEBACK);
} }
mem_cgroup_end_page_stat(memcg); unlock_page_memcg(page);
return ret; return ret;
} }


@ -223,6 +223,19 @@ static char * const zone_names[MAX_NR_ZONES] = {
#endif #endif
}; };
char * const migratetype_names[MIGRATE_TYPES] = {
"Unmovable",
"Movable",
"Reclaimable",
"HighAtomic",
#ifdef CONFIG_CMA
"CMA",
#endif
#ifdef CONFIG_MEMORY_ISOLATION
"Isolate",
#endif
};
compound_page_dtor * const compound_page_dtors[] = { compound_page_dtor * const compound_page_dtors[] = {
NULL, NULL,
free_compound_page, free_compound_page,
@ -247,6 +260,7 @@ static unsigned long __meminitdata arch_zone_highest_possible_pfn[MAX_NR_ZONES];
static unsigned long __initdata required_kernelcore; static unsigned long __initdata required_kernelcore;
static unsigned long __initdata required_movablecore; static unsigned long __initdata required_movablecore;
static unsigned long __meminitdata zone_movable_pfn[MAX_NUMNODES]; static unsigned long __meminitdata zone_movable_pfn[MAX_NUMNODES];
static bool mirrored_kernelcore;
/* movable_zone is the "real" zone pages in ZONE_MOVABLE are taken from */ /* movable_zone is the "real" zone pages in ZONE_MOVABLE are taken from */
int movable_zone; int movable_zone;
@ -416,7 +430,7 @@ static void bad_page(struct page *page, const char *reason,
goto out; goto out;
} }
if (nr_unshown) { if (nr_unshown) {
printk(KERN_ALERT pr_alert(
"BUG: Bad page state: %lu messages suppressed\n", "BUG: Bad page state: %lu messages suppressed\n",
nr_unshown); nr_unshown);
nr_unshown = 0; nr_unshown = 0;
@ -426,9 +440,14 @@ static void bad_page(struct page *page, const char *reason,
if (nr_shown++ == 0) if (nr_shown++ == 0)
resume = jiffies + 60 * HZ; resume = jiffies + 60 * HZ;
printk(KERN_ALERT "BUG: Bad page state in process %s pfn:%05lx\n", pr_alert("BUG: Bad page state in process %s pfn:%05lx\n",
current->comm, page_to_pfn(page)); current->comm, page_to_pfn(page));
dump_page_badflags(page, reason, bad_flags); __dump_page(page, reason);
bad_flags &= page->flags;
if (bad_flags)
pr_alert("bad because of flags: %#lx(%pGp)\n",
bad_flags, &bad_flags);
dump_page_owner(page);
print_modules(); print_modules();
dump_stack(); dump_stack();
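
The bad_page() rework keeps only the offending bits that are actually set on the page (bad_flags &= page->flags) and leans on the new %pGp printk specifier to print them by name. A rough userspace illustration of that style of symbolic flag decoding; the flag table here is made up and is not the kernel's pageflag_names[]:

#include <stddef.h>
#include <stdio.h>

/* Hypothetical subset of page flags; bit positions are illustrative only. */
static const struct { unsigned long mask; const char *name; } flag_names[] = {
	{ 1UL << 0,  "locked"   },
	{ 1UL << 4,  "dirty"    },
	{ 1UL << 5,  "lru"      },
	{ 1UL << 10, "reserved" },
};

static void dump_flags(unsigned long flags)
{
	int printed = 0;

	printf("flags: %#lx(", flags);
	for (size_t i = 0; i < sizeof(flag_names) / sizeof(flag_names[0]); i++) {
		if (flags & flag_names[i].mask)
			printf("%s%s", printed++ ? "|" : "", flag_names[i].name);
	}
	printf(")\n");
}

int main(void)
{
	unsigned long page_flags = (1UL << 4) | (1UL << 5);  /* dirty|lru */
	unsigned long bad_flags  = (1UL << 4) | (1UL << 10); /* what the caller flagged */

	bad_flags &= page_flags;   /* report only bits that are really set */
	dump_flags(bad_flags);     /* prints: flags: 0x10(dirty) */
	return 0;
}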
@ -477,7 +496,8 @@ void prep_compound_page(struct page *page, unsigned int order)
#ifdef CONFIG_DEBUG_PAGEALLOC #ifdef CONFIG_DEBUG_PAGEALLOC
unsigned int _debug_guardpage_minorder; unsigned int _debug_guardpage_minorder;
bool _debug_pagealloc_enabled __read_mostly; bool _debug_pagealloc_enabled __read_mostly
= IS_ENABLED(CONFIG_DEBUG_PAGEALLOC_ENABLE_DEFAULT);
bool _debug_guardpage_enabled __read_mostly; bool _debug_guardpage_enabled __read_mostly;
static int __init early_debug_pagealloc(char *buf) static int __init early_debug_pagealloc(char *buf)
@ -488,6 +508,9 @@ static int __init early_debug_pagealloc(char *buf)
if (strcmp(buf, "on") == 0) if (strcmp(buf, "on") == 0)
_debug_pagealloc_enabled = true; _debug_pagealloc_enabled = true;
if (strcmp(buf, "off") == 0)
_debug_pagealloc_enabled = false;
return 0; return 0;
} }
early_param("debug_pagealloc", early_debug_pagealloc); early_param("debug_pagealloc", early_debug_pagealloc);
@ -1002,6 +1025,7 @@ static bool free_pages_prepare(struct page *page, unsigned int order)
PAGE_SIZE << order); PAGE_SIZE << order);
} }
arch_free_page(page, order); arch_free_page(page, order);
kernel_poison_pages(page, 1 << order, 0);
kernel_map_pages(page, 1 << order, 0); kernel_map_pages(page, 1 << order, 0);
return true; return true;
@ -1104,6 +1128,75 @@ void __init __free_pages_bootmem(struct page *page, unsigned long pfn,
return __free_pages_boot_core(page, pfn, order); return __free_pages_boot_core(page, pfn, order);
} }
/*
* Check that the whole (or subset of) a pageblock given by the interval of
* [start_pfn, end_pfn) is valid and within the same zone, before scanning it
 * with the migration or free compaction scanner. The scanners then need to
* use only pfn_valid_within() check for arches that allow holes within
* pageblocks.
*
* Return struct page pointer of start_pfn, or NULL if checks were not passed.
*
* It's possible on some configurations to have a setup like node0 node1 node0
* i.e. it's possible that all pages within a zones range of pages do not
* belong to a single zone. We assume that a border between node0 and node1
* can occur within a single pageblock, but not a node0 node1 node0
* interleaving within a single pageblock. It is therefore sufficient to check
* the first and last page of a pageblock and avoid checking each individual
* page in a pageblock.
*/
struct page *__pageblock_pfn_to_page(unsigned long start_pfn,
unsigned long end_pfn, struct zone *zone)
{
struct page *start_page;
struct page *end_page;
/* end_pfn is one past the range we are checking */
end_pfn--;
if (!pfn_valid(start_pfn) || !pfn_valid(end_pfn))
return NULL;
start_page = pfn_to_page(start_pfn);
if (page_zone(start_page) != zone)
return NULL;
end_page = pfn_to_page(end_pfn);
/* This gives a shorter code than deriving page_zone(end_page) */
if (page_zone_id(start_page) != page_zone_id(end_page))
return NULL;
return start_page;
}
void set_zone_contiguous(struct zone *zone)
{
unsigned long block_start_pfn = zone->zone_start_pfn;
unsigned long block_end_pfn;
block_end_pfn = ALIGN(block_start_pfn + 1, pageblock_nr_pages);
for (; block_start_pfn < zone_end_pfn(zone);
block_start_pfn = block_end_pfn,
block_end_pfn += pageblock_nr_pages) {
block_end_pfn = min(block_end_pfn, zone_end_pfn(zone));
if (!__pageblock_pfn_to_page(block_start_pfn,
block_end_pfn, zone))
return;
}
/* We confirm that there is no hole */
zone->contiguous = true;
}
void clear_zone_contiguous(struct zone *zone)
{
zone->contiguous = false;
}
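
The comment above spells out the trick: a node boundary may cut through a zone but, by assumption, not interleave inside a single pageblock, so it is enough to validate the first and last pfn of each pageblock instead of every page. A small userspace sketch of that per-block check and of the loop that declares a zone contiguous; the pfn_valid()/zone lookups are faked and the pageblock size is illustrative:

#include <stdbool.h>
#include <stdio.h>

#define PAGEBLOCK_NR_PAGES 512UL   /* illustrative, not the kernel's value */

/* Stand-ins for pfn_valid()/page_zone(): pretend one pageblock is a hole. */
static bool pfn_valid(unsigned long pfn) { return pfn < 10000 && !(pfn >= 3072 && pfn < 3584); }
static int  zone_of(unsigned long pfn)   { (void)pfn; return 0; }

/* Mirrors __pageblock_pfn_to_page(): only the ends of the range are checked. */
static bool block_ok(unsigned long start_pfn, unsigned long end_pfn, int zone)
{
	end_pfn--;   /* end is one past the range */
	if (!pfn_valid(start_pfn) || !pfn_valid(end_pfn))
		return false;
	return zone_of(start_pfn) == zone && zone_of(end_pfn) == zone;
}

static bool zone_is_contiguous(unsigned long zone_start, unsigned long zone_end, int zone)
{
	unsigned long start = zone_start, end;

	for (end = start + PAGEBLOCK_NR_PAGES; start < zone_end;
	     start = end, end += PAGEBLOCK_NR_PAGES) {
		if (end > zone_end)
			end = zone_end;
		if (!block_ok(start, end, zone))
			return false;
	}
	return true;
}

int main(void)
{
	printf("contiguous: %d\n", zone_is_contiguous(0, 8192, 0));  /* 0: hole at pfn 3072 */
	return 0;
}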
#ifdef CONFIG_DEFERRED_STRUCT_PAGE_INIT #ifdef CONFIG_DEFERRED_STRUCT_PAGE_INIT
static void __init deferred_free_range(struct page *page, static void __init deferred_free_range(struct page *page,
unsigned long pfn, int nr_pages) unsigned long pfn, int nr_pages)
@ -1254,9 +1347,13 @@ free_range:
pgdat_init_report_one_done(); pgdat_init_report_one_done();
return 0; return 0;
} }
#endif /* CONFIG_DEFERRED_STRUCT_PAGE_INIT */
void __init page_alloc_init_late(void) void __init page_alloc_init_late(void)
{ {
struct zone *zone;
#ifdef CONFIG_DEFERRED_STRUCT_PAGE_INIT
int nid; int nid;
/* There will be num_node_state(N_MEMORY) threads */ /* There will be num_node_state(N_MEMORY) threads */
@ -1270,8 +1367,11 @@ void __init page_alloc_init_late(void)
/* Reinit limits that are based on free pages after the kernel is up */ /* Reinit limits that are based on free pages after the kernel is up */
files_maxfiles_init(); files_maxfiles_init();
#endif
for_each_populated_zone(zone)
set_zone_contiguous(zone);
} }
#endif /* CONFIG_DEFERRED_STRUCT_PAGE_INIT */
#ifdef CONFIG_CMA #ifdef CONFIG_CMA
/* Free whole pageblock and set its migration type to MIGRATE_CMA. */ /* Free whole pageblock and set its migration type to MIGRATE_CMA. */
@ -1381,15 +1481,24 @@ static inline int check_new_page(struct page *page)
return 0; return 0;
} }
static inline bool free_pages_prezeroed(bool poisoned)
{
return IS_ENABLED(CONFIG_PAGE_POISONING_ZERO) &&
page_poisoning_enabled() && poisoned;
}
static int prep_new_page(struct page *page, unsigned int order, gfp_t gfp_flags, static int prep_new_page(struct page *page, unsigned int order, gfp_t gfp_flags,
int alloc_flags) int alloc_flags)
{ {
int i; int i;
bool poisoned = true;
for (i = 0; i < (1 << order); i++) { for (i = 0; i < (1 << order); i++) {
struct page *p = page + i; struct page *p = page + i;
if (unlikely(check_new_page(p))) if (unlikely(check_new_page(p)))
return 1; return 1;
if (poisoned)
poisoned &= page_is_poisoned(p);
} }
set_page_private(page, 0); set_page_private(page, 0);
@ -1397,9 +1506,10 @@ static int prep_new_page(struct page *page, unsigned int order, gfp_t gfp_flags,
arch_alloc_page(page, order); arch_alloc_page(page, order);
kernel_map_pages(page, 1 << order, 1); kernel_map_pages(page, 1 << order, 1);
kernel_poison_pages(page, 1 << order, 1);
kasan_alloc_pages(page, order); kasan_alloc_pages(page, order);
if (gfp_flags & __GFP_ZERO) if (!free_pages_prezeroed(poisoned) && (gfp_flags & __GFP_ZERO))
for (i = 0; i < (1 << order); i++) for (i = 0; i < (1 << order); i++)
clear_highpage(page + i); clear_highpage(page + i);
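
prep_new_page() can now skip the __GFP_ZERO clearing pass when zero-poisoning already filled the freed pages with zeroes. A compact sketch of that decision, assuming a CONFIG_PAGE_POISONING_ZERO-style build; the symbols are stand-ins, not the kernel's:

#include <stdbool.h>
#include <stdio.h>

/* Illustrative config/runtime switches, not the kernel's real symbols. */
#define CONFIG_PAGE_POISONING_ZERO 1
static bool page_poisoning_enabled(void) { return true; }

/*
 * A freshly allocated page only needs an explicit clear_highpage() pass
 * when it was NOT already filled with zero-poison on the free path.
 */
static bool need_explicit_zeroing(bool gfp_zero, bool page_was_poisoned)
{
	bool prezeroed = CONFIG_PAGE_POISONING_ZERO &&
			 page_poisoning_enabled() && page_was_poisoned;

	return gfp_zero && !prezeroed;
}

int main(void)
{
	printf("%d\n", need_explicit_zeroing(true, true));   /* 0: poison already zeroed it */
	printf("%d\n", need_explicit_zeroing(true, false));  /* 1: must clear the page */
	return 0;
}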
@ -2690,9 +2800,8 @@ void warn_alloc_failed(gfp_t gfp_mask, unsigned int order, const char *fmt, ...)
va_end(args); va_end(args);
} }
pr_warn("%s: page allocation failure: order:%u, mode:0x%x\n", pr_warn("%s: page allocation failure: order:%u, mode:%#x(%pGg)\n",
current->comm, order, gfp_mask); current->comm, order, gfp_mask, &gfp_mask);
dump_stack(); dump_stack();
if (!should_suppress_show_mem()) if (!should_suppress_show_mem())
show_mem(filter); show_mem(filter);
@ -4491,6 +4600,9 @@ void __meminit memmap_init_zone(unsigned long size, int nid, unsigned long zone,
pg_data_t *pgdat = NODE_DATA(nid); pg_data_t *pgdat = NODE_DATA(nid);
unsigned long pfn; unsigned long pfn;
unsigned long nr_initialised = 0; unsigned long nr_initialised = 0;
#ifdef CONFIG_HAVE_MEMBLOCK_NODE_MAP
struct memblock_region *r = NULL, *tmp;
#endif
if (highest_memmap_pfn < end_pfn - 1) if (highest_memmap_pfn < end_pfn - 1)
highest_memmap_pfn = end_pfn - 1; highest_memmap_pfn = end_pfn - 1;
@ -4504,20 +4616,51 @@ void __meminit memmap_init_zone(unsigned long size, int nid, unsigned long zone,
for (pfn = start_pfn; pfn < end_pfn; pfn++) { for (pfn = start_pfn; pfn < end_pfn; pfn++) {
/* /*
* There can be holes in boot-time mem_map[]s * There can be holes in boot-time mem_map[]s handed to this
* handed to this function. They do not * function. They do not exist on hotplugged memory.
* exist on hotplugged memory.
*/ */
if (context == MEMMAP_EARLY) { if (context != MEMMAP_EARLY)
if (!early_pfn_valid(pfn)) goto not_early;
continue;
if (!early_pfn_in_nid(pfn, nid))
continue;
if (!update_defer_init(pgdat, pfn, end_pfn,
&nr_initialised))
break;
}
if (!early_pfn_valid(pfn))
continue;
if (!early_pfn_in_nid(pfn, nid))
continue;
if (!update_defer_init(pgdat, pfn, end_pfn, &nr_initialised))
break;
#ifdef CONFIG_HAVE_MEMBLOCK_NODE_MAP
/*
* If not mirrored_kernelcore and ZONE_MOVABLE exists, range
* from zone_movable_pfn[nid] to end of each node should be
* ZONE_MOVABLE not ZONE_NORMAL. skip it.
*/
if (!mirrored_kernelcore && zone_movable_pfn[nid])
if (zone == ZONE_NORMAL && pfn >= zone_movable_pfn[nid])
continue;
/*
* Check given memblock attribute by firmware which can affect
* kernel memory layout. If zone==ZONE_MOVABLE but memory is
* mirrored, it's an overlapped memmap init. skip it.
*/
if (mirrored_kernelcore && zone == ZONE_MOVABLE) {
if (!r || pfn >= memblock_region_memory_end_pfn(r)) {
for_each_memblock(memory, tmp)
if (pfn < memblock_region_memory_end_pfn(tmp))
break;
r = tmp;
}
if (pfn >= memblock_region_memory_base_pfn(r) &&
memblock_is_mirror(r)) {
/* already initialized as NORMAL */
pfn = memblock_region_memory_end_pfn(r);
continue;
}
}
#endif
not_early:
/* /*
* Mark the block movable so that blocks are reserved for * Mark the block movable so that blocks are reserved for
* movable at startup. This will force kernel allocations * movable at startup. This will force kernel allocations
@ -4934,11 +5077,6 @@ static void __meminit adjust_zone_range_for_zone_movable(int nid,
*zone_end_pfn = min(node_end_pfn, *zone_end_pfn = min(node_end_pfn,
arch_zone_highest_possible_pfn[movable_zone]); arch_zone_highest_possible_pfn[movable_zone]);
/* Adjust for ZONE_MOVABLE starting within this range */
} else if (*zone_start_pfn < zone_movable_pfn[nid] &&
*zone_end_pfn > zone_movable_pfn[nid]) {
*zone_end_pfn = zone_movable_pfn[nid];
/* Check if this whole range is within ZONE_MOVABLE */ /* Check if this whole range is within ZONE_MOVABLE */
} else if (*zone_start_pfn >= zone_movable_pfn[nid]) } else if (*zone_start_pfn >= zone_movable_pfn[nid])
*zone_start_pfn = *zone_end_pfn; *zone_start_pfn = *zone_end_pfn;
@ -4953,31 +5091,31 @@ static unsigned long __meminit zone_spanned_pages_in_node(int nid,
unsigned long zone_type, unsigned long zone_type,
unsigned long node_start_pfn, unsigned long node_start_pfn,
unsigned long node_end_pfn, unsigned long node_end_pfn,
unsigned long *zone_start_pfn,
unsigned long *zone_end_pfn,
unsigned long *ignored) unsigned long *ignored)
{ {
unsigned long zone_start_pfn, zone_end_pfn;
/* When hotadd a new node from cpu_up(), the node should be empty */ /* When hotadd a new node from cpu_up(), the node should be empty */
if (!node_start_pfn && !node_end_pfn) if (!node_start_pfn && !node_end_pfn)
return 0; return 0;
/* Get the start and end of the zone */ /* Get the start and end of the zone */
zone_start_pfn = arch_zone_lowest_possible_pfn[zone_type]; *zone_start_pfn = arch_zone_lowest_possible_pfn[zone_type];
zone_end_pfn = arch_zone_highest_possible_pfn[zone_type]; *zone_end_pfn = arch_zone_highest_possible_pfn[zone_type];
adjust_zone_range_for_zone_movable(nid, zone_type, adjust_zone_range_for_zone_movable(nid, zone_type,
node_start_pfn, node_end_pfn, node_start_pfn, node_end_pfn,
&zone_start_pfn, &zone_end_pfn); zone_start_pfn, zone_end_pfn);
/* Check that this node has pages within the zone's required range */ /* Check that this node has pages within the zone's required range */
if (zone_end_pfn < node_start_pfn || zone_start_pfn > node_end_pfn) if (*zone_end_pfn < node_start_pfn || *zone_start_pfn > node_end_pfn)
return 0; return 0;
/* Move the zone boundaries inside the node if necessary */ /* Move the zone boundaries inside the node if necessary */
zone_end_pfn = min(zone_end_pfn, node_end_pfn); *zone_end_pfn = min(*zone_end_pfn, node_end_pfn);
zone_start_pfn = max(zone_start_pfn, node_start_pfn); *zone_start_pfn = max(*zone_start_pfn, node_start_pfn);
/* Return the spanned pages */ /* Return the spanned pages */
return zone_end_pfn - zone_start_pfn; return *zone_end_pfn - *zone_start_pfn;
} }
/* /*
@ -5023,6 +5161,7 @@ static unsigned long __meminit zone_absent_pages_in_node(int nid,
unsigned long zone_low = arch_zone_lowest_possible_pfn[zone_type]; unsigned long zone_low = arch_zone_lowest_possible_pfn[zone_type];
unsigned long zone_high = arch_zone_highest_possible_pfn[zone_type]; unsigned long zone_high = arch_zone_highest_possible_pfn[zone_type];
unsigned long zone_start_pfn, zone_end_pfn; unsigned long zone_start_pfn, zone_end_pfn;
unsigned long nr_absent;
/* When hotadd a new node from cpu_up(), the node should be empty */ /* When hotadd a new node from cpu_up(), the node should be empty */
if (!node_start_pfn && !node_end_pfn) if (!node_start_pfn && !node_end_pfn)
@ -5034,7 +5173,39 @@ static unsigned long __meminit zone_absent_pages_in_node(int nid,
adjust_zone_range_for_zone_movable(nid, zone_type, adjust_zone_range_for_zone_movable(nid, zone_type,
node_start_pfn, node_end_pfn, node_start_pfn, node_end_pfn,
&zone_start_pfn, &zone_end_pfn); &zone_start_pfn, &zone_end_pfn);
return __absent_pages_in_range(nid, zone_start_pfn, zone_end_pfn); nr_absent = __absent_pages_in_range(nid, zone_start_pfn, zone_end_pfn);
/*
* ZONE_MOVABLE handling.
* Treat pages to be ZONE_MOVABLE in ZONE_NORMAL as absent pages
* and vice versa.
*/
if (zone_movable_pfn[nid]) {
if (mirrored_kernelcore) {
unsigned long start_pfn, end_pfn;
struct memblock_region *r;
for_each_memblock(memory, r) {
start_pfn = clamp(memblock_region_memory_base_pfn(r),
zone_start_pfn, zone_end_pfn);
end_pfn = clamp(memblock_region_memory_end_pfn(r),
zone_start_pfn, zone_end_pfn);
if (zone_type == ZONE_MOVABLE &&
memblock_is_mirror(r))
nr_absent += end_pfn - start_pfn;
if (zone_type == ZONE_NORMAL &&
!memblock_is_mirror(r))
nr_absent += end_pfn - start_pfn;
}
} else {
if (zone_type == ZONE_NORMAL)
nr_absent += node_end_pfn - zone_movable_pfn[nid];
}
}
return nr_absent;
} }
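
With kernelcore=mirror, mirrored regions stay in ZONE_NORMAL and everything else becomes ZONE_MOVABLE, so each zone treats the other kind of memory inside its span as absent. A toy userspace version of that accounting over a hypothetical memblock table:

#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

#define MAX(a, b) ((a) > (b) ? (a) : (b))
#define MIN(a, b) ((a) < (b) ? (a) : (b))

/* A toy memblock: three regions, only the first one is mirrored. */
struct region { unsigned long base, end; bool mirror; };
static const struct region memory[] = {
	{ 0,    1024, true  },
	{ 1024, 4096, false },
	{ 4096, 8192, false },
};

enum { ZONE_NORMAL, ZONE_MOVABLE };

/* Pages of the "other" kind inside this zone's span count as absent. */
static unsigned long mirror_absent(int zone_type, unsigned long zs, unsigned long ze)
{
	unsigned long absent = 0;

	for (size_t i = 0; i < sizeof(memory) / sizeof(memory[0]); i++) {
		unsigned long start = MAX(memory[i].base, zs);
		unsigned long end   = MIN(memory[i].end, ze);

		if (start >= end)
			continue;
		if (zone_type == ZONE_MOVABLE && memory[i].mirror)
			absent += end - start;   /* mirrored stays NORMAL */
		if (zone_type == ZONE_NORMAL && !memory[i].mirror)
			absent += end - start;   /* non-mirrored goes MOVABLE */
	}
	return absent;
}

int main(void)
{
	printf("absent in NORMAL:  %lu\n", mirror_absent(ZONE_NORMAL, 0, 8192));   /* 7168 */
	printf("absent in MOVABLE: %lu\n", mirror_absent(ZONE_MOVABLE, 0, 8192));  /* 1024 */
	return 0;
}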
#else /* CONFIG_HAVE_MEMBLOCK_NODE_MAP */ #else /* CONFIG_HAVE_MEMBLOCK_NODE_MAP */
@ -5042,8 +5213,18 @@ static inline unsigned long __meminit zone_spanned_pages_in_node(int nid,
unsigned long zone_type, unsigned long zone_type,
unsigned long node_start_pfn, unsigned long node_start_pfn,
unsigned long node_end_pfn, unsigned long node_end_pfn,
unsigned long *zone_start_pfn,
unsigned long *zone_end_pfn,
unsigned long *zones_size) unsigned long *zones_size)
{ {
unsigned int zone;
*zone_start_pfn = node_start_pfn;
for (zone = 0; zone < zone_type; zone++)
*zone_start_pfn += zones_size[zone];
*zone_end_pfn = *zone_start_pfn + zones_size[zone_type];
return zones_size[zone_type]; return zones_size[zone_type];
} }
@ -5072,15 +5253,22 @@ static void __meminit calculate_node_totalpages(struct pglist_data *pgdat,
for (i = 0; i < MAX_NR_ZONES; i++) { for (i = 0; i < MAX_NR_ZONES; i++) {
struct zone *zone = pgdat->node_zones + i; struct zone *zone = pgdat->node_zones + i;
unsigned long zone_start_pfn, zone_end_pfn;
unsigned long size, real_size; unsigned long size, real_size;
size = zone_spanned_pages_in_node(pgdat->node_id, i, size = zone_spanned_pages_in_node(pgdat->node_id, i,
node_start_pfn, node_start_pfn,
node_end_pfn, node_end_pfn,
&zone_start_pfn,
&zone_end_pfn,
zones_size); zones_size);
real_size = size - zone_absent_pages_in_node(pgdat->node_id, i, real_size = size - zone_absent_pages_in_node(pgdat->node_id, i,
node_start_pfn, node_end_pfn, node_start_pfn, node_end_pfn,
zholes_size); zholes_size);
if (size)
zone->zone_start_pfn = zone_start_pfn;
else
zone->zone_start_pfn = 0;
zone->spanned_pages = size; zone->spanned_pages = size;
zone->present_pages = real_size; zone->present_pages = real_size;
@ -5201,7 +5389,6 @@ static void __paginginit free_area_init_core(struct pglist_data *pgdat)
{ {
enum zone_type j; enum zone_type j;
int nid = pgdat->node_id; int nid = pgdat->node_id;
unsigned long zone_start_pfn = pgdat->node_start_pfn;
int ret; int ret;
pgdat_resize_init(pgdat); pgdat_resize_init(pgdat);
@ -5222,6 +5409,7 @@ static void __paginginit free_area_init_core(struct pglist_data *pgdat)
for (j = 0; j < MAX_NR_ZONES; j++) { for (j = 0; j < MAX_NR_ZONES; j++) {
struct zone *zone = pgdat->node_zones + j; struct zone *zone = pgdat->node_zones + j;
unsigned long size, realsize, freesize, memmap_pages; unsigned long size, realsize, freesize, memmap_pages;
unsigned long zone_start_pfn = zone->zone_start_pfn;
size = zone->spanned_pages; size = zone->spanned_pages;
realsize = freesize = zone->present_pages; realsize = freesize = zone->present_pages;
@ -5290,7 +5478,6 @@ static void __paginginit free_area_init_core(struct pglist_data *pgdat)
ret = init_currently_empty_zone(zone, zone_start_pfn, size); ret = init_currently_empty_zone(zone, zone_start_pfn, size);
BUG_ON(ret); BUG_ON(ret);
memmap_init(size, nid, j, zone_start_pfn); memmap_init(size, nid, j, zone_start_pfn);
zone_start_pfn += size;
} }
} }
@ -5358,6 +5545,8 @@ void __paginginit free_area_init_node(int nid, unsigned long *zones_size,
pr_info("Initmem setup node %d [mem %#018Lx-%#018Lx]\n", nid, pr_info("Initmem setup node %d [mem %#018Lx-%#018Lx]\n", nid,
(u64)start_pfn << PAGE_SHIFT, (u64)start_pfn << PAGE_SHIFT,
end_pfn ? ((u64)end_pfn << PAGE_SHIFT) - 1 : 0); end_pfn ? ((u64)end_pfn << PAGE_SHIFT) - 1 : 0);
#else
start_pfn = node_start_pfn;
#endif #endif
calculate_node_totalpages(pgdat, start_pfn, end_pfn, calculate_node_totalpages(pgdat, start_pfn, end_pfn,
zones_size, zholes_size); zones_size, zholes_size);
@ -5528,6 +5717,36 @@ static void __init find_zone_movable_pfns_for_nodes(void)
goto out2; goto out2;
} }
/*
* If kernelcore=mirror is specified, ignore movablecore option
*/
if (mirrored_kernelcore) {
bool mem_below_4gb_not_mirrored = false;
for_each_memblock(memory, r) {
if (memblock_is_mirror(r))
continue;
nid = r->nid;
usable_startpfn = memblock_region_memory_base_pfn(r);
if (usable_startpfn < 0x100000) {
mem_below_4gb_not_mirrored = true;
continue;
}
zone_movable_pfn[nid] = zone_movable_pfn[nid] ?
min(usable_startpfn, zone_movable_pfn[nid]) :
usable_startpfn;
}
if (mem_below_4gb_not_mirrored)
pr_warn("This configuration results in unmirrored kernel memory.");
goto out2;
}
/* /*
* If movablecore=nn[KMG] was specified, calculate what size of * If movablecore=nn[KMG] was specified, calculate what size of
* kernelcore that corresponds so that memory usable for * kernelcore that corresponds so that memory usable for
@ -5788,6 +6007,12 @@ static int __init cmdline_parse_core(char *p, unsigned long *core)
*/ */
static int __init cmdline_parse_kernelcore(char *p) static int __init cmdline_parse_kernelcore(char *p)
{ {
/* parse kernelcore=mirror */
if (parse_option_str(p, "mirror")) {
mirrored_kernelcore = true;
return 0;
}
return cmdline_parse_core(p, &required_kernelcore); return cmdline_parse_core(p, &required_kernelcore);
} }


@ -106,12 +106,15 @@ struct page_ext *lookup_page_ext(struct page *page)
struct page_ext *base; struct page_ext *base;
base = NODE_DATA(page_to_nid(page))->node_page_ext; base = NODE_DATA(page_to_nid(page))->node_page_ext;
#ifdef CONFIG_DEBUG_VM #if defined(CONFIG_DEBUG_VM) || defined(CONFIG_PAGE_POISONING)
/* /*
* The sanity checks the page allocator does upon freeing a * The sanity checks the page allocator does upon freeing a
* page can reach here before the page_ext arrays are * page can reach here before the page_ext arrays are
* allocated when feeding a range of pages to the allocator * allocated when feeding a range of pages to the allocator
* for the first time during bootup or memory hotplug. * for the first time during bootup or memory hotplug.
*
* This check is also necessary for ensuring page poisoning
* works as expected when enabled
*/ */
if (unlikely(!base)) if (unlikely(!base))
return NULL; return NULL;
@ -180,12 +183,15 @@ struct page_ext *lookup_page_ext(struct page *page)
{ {
unsigned long pfn = page_to_pfn(page); unsigned long pfn = page_to_pfn(page);
struct mem_section *section = __pfn_to_section(pfn); struct mem_section *section = __pfn_to_section(pfn);
#ifdef CONFIG_DEBUG_VM #if defined(CONFIG_DEBUG_VM) || defined(CONFIG_PAGE_POISONING)
/* /*
* The sanity checks the page allocator does upon freeing a * The sanity checks the page allocator does upon freeing a
* page can reach here before the page_ext arrays are * page can reach here before the page_ext arrays are
* allocated when feeding a range of pages to the allocator * allocated when feeding a range of pages to the allocator
* for the first time during bootup or memory hotplug. * for the first time during bootup or memory hotplug.
*
* This check is also necessary for ensuring page poisoning
* works as expected when enabled
*/ */
if (!section->page_ext) if (!section->page_ext)
return NULL; return NULL;


@ -5,10 +5,12 @@
#include <linux/bootmem.h> #include <linux/bootmem.h>
#include <linux/stacktrace.h> #include <linux/stacktrace.h>
#include <linux/page_owner.h> #include <linux/page_owner.h>
#include <linux/jump_label.h>
#include <linux/migrate.h>
#include "internal.h" #include "internal.h"
static bool page_owner_disabled = true; static bool page_owner_disabled = true;
bool page_owner_inited __read_mostly; DEFINE_STATIC_KEY_FALSE(page_owner_inited);
static void init_early_allocated_pages(void); static void init_early_allocated_pages(void);
@ -37,7 +39,7 @@ static void init_page_owner(void)
if (page_owner_disabled) if (page_owner_disabled)
return; return;
page_owner_inited = true; static_branch_enable(&page_owner_inited);
init_early_allocated_pages(); init_early_allocated_pages();
} }
@ -72,10 +74,18 @@ void __set_page_owner(struct page *page, unsigned int order, gfp_t gfp_mask)
page_ext->order = order; page_ext->order = order;
page_ext->gfp_mask = gfp_mask; page_ext->gfp_mask = gfp_mask;
page_ext->nr_entries = trace.nr_entries; page_ext->nr_entries = trace.nr_entries;
page_ext->last_migrate_reason = -1;
__set_bit(PAGE_EXT_OWNER, &page_ext->flags); __set_bit(PAGE_EXT_OWNER, &page_ext->flags);
} }
void __set_page_owner_migrate_reason(struct page *page, int reason)
{
struct page_ext *page_ext = lookup_page_ext(page);
page_ext->last_migrate_reason = reason;
}
gfp_t __get_page_owner_gfp(struct page *page) gfp_t __get_page_owner_gfp(struct page *page)
{ {
struct page_ext *page_ext = lookup_page_ext(page); struct page_ext *page_ext = lookup_page_ext(page);
@ -83,6 +93,31 @@ gfp_t __get_page_owner_gfp(struct page *page)
return page_ext->gfp_mask; return page_ext->gfp_mask;
} }
void __copy_page_owner(struct page *oldpage, struct page *newpage)
{
struct page_ext *old_ext = lookup_page_ext(oldpage);
struct page_ext *new_ext = lookup_page_ext(newpage);
int i;
new_ext->order = old_ext->order;
new_ext->gfp_mask = old_ext->gfp_mask;
new_ext->nr_entries = old_ext->nr_entries;
for (i = 0; i < ARRAY_SIZE(new_ext->trace_entries); i++)
new_ext->trace_entries[i] = old_ext->trace_entries[i];
/*
* We don't clear the bit on the oldpage as it's going to be freed
* after migration. Until then, the info can be useful in case of
 * a bug, and the overall stats will be off a bit only temporarily.
* Also, migrate_misplaced_transhuge_page() can still fail the
* migration and then we want the oldpage to retain the info. But
* in that case we also don't need to explicitly clear the info from
* the new page, which will be freed.
*/
__set_bit(PAGE_EXT_OWNER, &new_ext->flags);
}
static ssize_t static ssize_t
print_page_owner(char __user *buf, size_t count, unsigned long pfn, print_page_owner(char __user *buf, size_t count, unsigned long pfn,
struct page *page, struct page_ext *page_ext) struct page *page, struct page_ext *page_ext)
@ -100,8 +135,9 @@ print_page_owner(char __user *buf, size_t count, unsigned long pfn,
return -ENOMEM; return -ENOMEM;
ret = snprintf(kbuf, count, ret = snprintf(kbuf, count,
"Page allocated via order %u, mask 0x%x\n", "Page allocated via order %u, mask %#x(%pGg)\n",
page_ext->order, page_ext->gfp_mask); page_ext->order, page_ext->gfp_mask,
&page_ext->gfp_mask);
if (ret >= count) if (ret >= count)
goto err; goto err;
@ -110,23 +146,12 @@ print_page_owner(char __user *buf, size_t count, unsigned long pfn,
pageblock_mt = get_pfnblock_migratetype(page, pfn); pageblock_mt = get_pfnblock_migratetype(page, pfn);
page_mt = gfpflags_to_migratetype(page_ext->gfp_mask); page_mt = gfpflags_to_migratetype(page_ext->gfp_mask);
ret += snprintf(kbuf + ret, count - ret, ret += snprintf(kbuf + ret, count - ret,
"PFN %lu Block %lu type %d %s Flags %s%s%s%s%s%s%s%s%s%s%s%s\n", "PFN %lu type %s Block %lu type %s Flags %#lx(%pGp)\n",
pfn, pfn,
migratetype_names[page_mt],
pfn >> pageblock_order, pfn >> pageblock_order,
pageblock_mt, migratetype_names[pageblock_mt],
pageblock_mt != page_mt ? "Fallback" : " ", page->flags, &page->flags);
PageLocked(page) ? "K" : " ",
PageError(page) ? "E" : " ",
PageReferenced(page) ? "R" : " ",
PageUptodate(page) ? "U" : " ",
PageDirty(page) ? "D" : " ",
PageLRU(page) ? "L" : " ",
PageActive(page) ? "A" : " ",
PageSlab(page) ? "S" : " ",
PageWriteback(page) ? "W" : " ",
PageCompound(page) ? "C" : " ",
PageSwapCache(page) ? "B" : " ",
PageMappedToDisk(page) ? "M" : " ");
if (ret >= count) if (ret >= count)
goto err; goto err;
@ -135,6 +160,14 @@ print_page_owner(char __user *buf, size_t count, unsigned long pfn,
if (ret >= count) if (ret >= count)
goto err; goto err;
if (page_ext->last_migrate_reason != -1) {
ret += snprintf(kbuf + ret, count - ret,
"Page has been migrated, last migrate reason: %s\n",
migrate_reason_names[page_ext->last_migrate_reason]);
if (ret >= count)
goto err;
}
ret += snprintf(kbuf + ret, count - ret, "\n"); ret += snprintf(kbuf + ret, count - ret, "\n");
if (ret >= count) if (ret >= count)
goto err; goto err;
@ -150,6 +183,31 @@ err:
return -ENOMEM; return -ENOMEM;
} }
void __dump_page_owner(struct page *page)
{
struct page_ext *page_ext = lookup_page_ext(page);
struct stack_trace trace = {
.nr_entries = page_ext->nr_entries,
.entries = &page_ext->trace_entries[0],
};
gfp_t gfp_mask = page_ext->gfp_mask;
int mt = gfpflags_to_migratetype(gfp_mask);
if (!test_bit(PAGE_EXT_OWNER, &page_ext->flags)) {
pr_alert("page_owner info is not active (free page?)\n");
return;
}
pr_alert("page allocated via order %u, migratetype %s, "
"gfp_mask %#x(%pGg)\n", page_ext->order,
migratetype_names[mt], gfp_mask, &gfp_mask);
print_stack_trace(&trace, 0);
if (page_ext->last_migrate_reason != -1)
pr_alert("page has been migrated, last migrate reason: %s\n",
migrate_reason_names[page_ext->last_migrate_reason]);
}
static ssize_t static ssize_t
read_page_owner(struct file *file, char __user *buf, size_t count, loff_t *ppos) read_page_owner(struct file *file, char __user *buf, size_t count, loff_t *ppos)
{ {
@ -157,7 +215,7 @@ read_page_owner(struct file *file, char __user *buf, size_t count, loff_t *ppos)
struct page *page; struct page *page;
struct page_ext *page_ext; struct page_ext *page_ext;
if (!page_owner_inited) if (!static_branch_unlikely(&page_owner_inited))
return -EINVAL; return -EINVAL;
page = NULL; page = NULL;
@ -305,7 +363,7 @@ static int __init pageowner_init(void)
{ {
struct dentry *dentry; struct dentry *dentry;
if (!page_owner_inited) { if (!static_branch_unlikely(&page_owner_inited)) {
pr_info("page_owner is disabled\n"); pr_info("page_owner is disabled\n");
return 0; return 0;
} }


@ -6,22 +6,48 @@
#include <linux/poison.h> #include <linux/poison.h>
#include <linux/ratelimit.h> #include <linux/ratelimit.h>
static bool page_poisoning_enabled __read_mostly; static bool __page_poisoning_enabled __read_mostly;
static bool want_page_poisoning __read_mostly;
static int early_page_poison_param(char *buf)
{
if (!buf)
return -EINVAL;
if (strcmp(buf, "on") == 0)
want_page_poisoning = true;
else if (strcmp(buf, "off") == 0)
want_page_poisoning = false;
return 0;
}
early_param("page_poison", early_page_poison_param);
bool page_poisoning_enabled(void)
{
return __page_poisoning_enabled;
}
static bool need_page_poisoning(void) static bool need_page_poisoning(void)
{ {
if (!debug_pagealloc_enabled()) return want_page_poisoning;
return false;
return true;
} }
static void init_page_poisoning(void) static void init_page_poisoning(void)
{ {
if (!debug_pagealloc_enabled()) /*
return; * page poisoning is debug page alloc for some arches. If either
* of those options are enabled, enable poisoning
*/
if (!IS_ENABLED(CONFIG_ARCH_SUPPORTS_DEBUG_PAGEALLOC)) {
if (!want_page_poisoning && !debug_pagealloc_enabled())
return;
} else {
if (!want_page_poisoning)
return;
}
page_poisoning_enabled = true; __page_poisoning_enabled = true;
} }
struct page_ext_operations page_poisoning_ops = { struct page_ext_operations page_poisoning_ops = {
@ -45,11 +71,14 @@ static inline void clear_page_poison(struct page *page)
__clear_bit(PAGE_EXT_DEBUG_POISON, &page_ext->flags); __clear_bit(PAGE_EXT_DEBUG_POISON, &page_ext->flags);
} }
static inline bool page_poison(struct page *page) bool page_is_poisoned(struct page *page)
{ {
struct page_ext *page_ext; struct page_ext *page_ext;
page_ext = lookup_page_ext(page); page_ext = lookup_page_ext(page);
if (!page_ext)
return false;
return test_bit(PAGE_EXT_DEBUG_POISON, &page_ext->flags); return test_bit(PAGE_EXT_DEBUG_POISON, &page_ext->flags);
} }
@ -83,6 +112,9 @@ static void check_poison_mem(unsigned char *mem, size_t bytes)
unsigned char *start; unsigned char *start;
unsigned char *end; unsigned char *end;
if (IS_ENABLED(CONFIG_PAGE_POISONING_NO_SANITY))
return;
start = memchr_inv(mem, PAGE_POISON, bytes); start = memchr_inv(mem, PAGE_POISON, bytes);
if (!start) if (!start)
return; return;
@ -95,9 +127,9 @@ static void check_poison_mem(unsigned char *mem, size_t bytes)
if (!__ratelimit(&ratelimit)) if (!__ratelimit(&ratelimit))
return; return;
else if (start == end && single_bit_flip(*start, PAGE_POISON)) else if (start == end && single_bit_flip(*start, PAGE_POISON))
printk(KERN_ERR "pagealloc: single bit error\n"); pr_err("pagealloc: single bit error\n");
else else
printk(KERN_ERR "pagealloc: memory corruption\n"); pr_err("pagealloc: memory corruption\n");
print_hex_dump(KERN_ERR, "", DUMP_PREFIX_ADDRESS, 16, 1, start, print_hex_dump(KERN_ERR, "", DUMP_PREFIX_ADDRESS, 16, 1, start,
end - start + 1, 1); end - start + 1, 1);
@ -108,7 +140,7 @@ static void unpoison_page(struct page *page)
{ {
void *addr; void *addr;
if (!page_poison(page)) if (!page_is_poisoned(page))
return; return;
addr = kmap_atomic(page); addr = kmap_atomic(page);
@ -125,9 +157,9 @@ static void unpoison_pages(struct page *page, int n)
unpoison_page(page + i); unpoison_page(page + i);
} }
void __kernel_map_pages(struct page *page, int numpages, int enable) void kernel_poison_pages(struct page *page, int numpages, int enable)
{ {
if (!page_poisoning_enabled) if (!page_poisoning_enabled())
return; return;
if (enable) if (enable)
@ -135,3 +167,10 @@ void __kernel_map_pages(struct page *page, int numpages, int enable)
else else
poison_pages(page, numpages); poison_pages(page, numpages);
} }
#ifndef CONFIG_ARCH_SUPPORTS_DEBUG_PAGEALLOC
void __kernel_map_pages(struct page *page, int numpages, int enable)
{
/* This function does nothing, all work is done via poison pages */
}
#endif
