The stuff under sysctl describes /sys interface from userspace
point of view. So, add it to the admin-guide and remove the
:orphan: from its index file.
Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
The two files there describes a Kernel API feature, used to
support early userspace stuff. Prepare for moving them to
the kernel API book by converting to ReST format.
The conversion itself was quite trivial: just add/mark a few
titles as such, add a literal block markup, add a table markup
and a few blank lines, in order to make Sphinx to properly parse it.
At its new index.rst, let's add a :orphan: while this is not linked to
the main index.rst file, in order to avoid build warnings.
Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
In this round, we've introduced native swap file support which can exploit DIO,
enhanced existing checkpoint=disable feature with additional mount option to
tune the triggering condition, and allowed user to preallocate physical blocks
in a pinned file which will be useful to avoid f2fs fragmentation in append-only
workloads. In addition, we've fixed subtle quota corruption issue.
Enhancement:
- add swap file support which uses DIO
- allocate blocks for pinned file
- allow SSR and mount option to enhance checkpoint=disable
- enhance IPU IOs
- add more sanity checks such as memory boundary access
Bug fix:
- quota corruption in very corner case of error-injected SPO case
- fix root_reserved on remount and some wrong counts
- add missing fsck flag
Some patches were also introduced to clean up ambiguous i_flags and debugging
messages codes.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEE00UqedjCtOrGVvQiQBSofoJIUNIFAl0mlKIACgkQQBSofoJI
UNI2Ow//e4QxinKVdgA6F2wx0CkdSreqfzQbA1t+6pcWgzCgLfj4dpOuSp8Yu1NT
aG6YFfUxjtUNN8D85WqJ+6qKt0gFBoxjQXDvvbxLFB9Xoa2XqzFrHW8xenSRHppj
V63Yye5Z+Qgss65hTktgHMWSi4mUWRq76t1lFBprXm41uC036rIQCSYioztcLQCN
fFi2xfkFHf7vIIg6ZrCy22wNSCWL9X6dzKftIZ6LSz+jkPGEard1D/OUYLMMQ4YG
b5DS0LEWbudn1vwALPPXwTHgZuG12W581MsHUsu2FIyenGGTk7EIZfNBN6cGfIMk
NsEMnanFvXp7ZYP6HQnZlSkoBRIkD2JpYh7bFxTklw4H09GJxYFksed8uqNoDRog
GPcNZjKSm0wCUHe2awOhF9kRXMFnwIR7m4DNOQHH1MYmpp3ponGsbfYy3J/qLS5y
Smh8pcbsttDMQ0NaWZznby2bSEv9k9R9CoqE5sKCNDnh6Ky8WjK8x6xt3wrPG6h8
jI7venvHvJbFpxAmnKDzflZrlGj95pg0j0DAk07ql/9i8YPfRrlBKv8jOo+7aRWC
jIO70URGDO6R/o5XdRqx6u8DOE1JPVP+6XzsT6MdDtBlxE+TbcTPRt3RzEgmo1d3
CI0jq0IRDTDZJITmQoQKCS85OZDyaAe2iHUPwSjCEuDex5lynuY=
=S2ZS
-----END PGP SIGNATURE-----
Merge tag 'f2fs-for-5.3' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs
Pull f2fs updates from Jaegeuk Kim:
"In this round, we've introduced native swap file support which can
exploit DIO, enhanced existing checkpoint=disable feature with
additional mount option to tune the triggering condition, and allowed
user to preallocate physical blocks in a pinned file which will be
useful to avoid f2fs fragmentation in append-only workloads. In
addition, we've fixed subtle quota corruption issue.
Enhancements:
- add swap file support which uses DIO
- allocate blocks for pinned file
- allow SSR and mount option to enhance checkpoint=disable
- enhance IPU IOs
- add more sanity checks such as memory boundary access
Bug fixes:
- quota corruption in very corner case of error-injected SPO case
- fix root_reserved on remount and some wrong counts
- add missing fsck flag
Some patches were also introduced to clean up ambiguous i_flags and
debugging messages codes"
* tag 'f2fs-for-5.3' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (33 commits)
f2fs: improve print log in f2fs_sanity_check_ckpt()
f2fs: avoid out-of-range memory access
f2fs: fix to avoid long latency during umount
f2fs: allow all the users to pin a file
f2fs: support swap file w/ DIO
f2fs: allocate blocks for pinned file
f2fs: fix is_idle() check for discard type
f2fs: add a rw_sem to cover quota flag changes
f2fs: set SBI_NEED_FSCK for xattr corruption case
f2fs: use generic EFSBADCRC/EFSCORRUPTED
f2fs: Use DIV_ROUND_UP() instead of open-coding
f2fs: print kernel message if filesystem is inconsistent
f2fs: introduce f2fs_<level> macros to wrap f2fs_printk()
f2fs: avoid get_valid_blocks() for cleanup
f2fs: ioctl for removing a range from F2FS
f2fs: only set project inherit bit for directory
f2fs: separate f2fs i_flags from fs_flags and ext4 i_flags
f2fs: replace ktype default_attrs with default_groups
f2fs: Add option to limit required GC for checkpoint=disable
f2fs: Fix accounting for unusable blocks
...
- Refactor inode geometry calculation into a single structure instead of
open-coding pieces everywhere.
- Add online repair to build options.
- Remove unnecessary function call flags and functions.
- Claim maintainership of various loose xfs documentation and header
files.
- Use struct bio directly for log buffer IOs instead of struct xfs_buf.
- Reduce log item boilerplate code requirements.
- Merge log item code spread across too many files.
- Further distinguish between log item commits and cancellations.
- Various small cleanups to the ag small allocator.
- Support cgroup-aware writeback
- libxfs refactoring for mkfs cleanup
- Remove unneeded #includes
- Fix a memory allocation miscalculation in the new log bio code
- Fix bisection problems
- Fix a crash in ioend processing caused by tripping over freeing of
preallocated transactions
- Split out a generic inode walk mechanism from the bulkstat code, hook
up all the internal users to use the walking code, then clean up
bulkstat to serve only the bulkstat ioctls.
- Add a multithreaded iwalk implementation to speed up quotacheck on
fast storage with many CPUs.
- Remove unnecessary return values in logging teardown functions.
- Supplement the bstat and inogrp structures with new bulkstat and
inumbers structures that have all the fields we need for v5
filesystem features and none of the padding problems of their
predecessors.
- Wire up new ioctls that use the new structures with a much simpler
bulk_ireq structure at the head instead of the pointerhappy mess we
had before.
- Enable userspace to constrain bulkstat returns to a single AG or a
single special inode so that we can phase out a lot of geometry
guesswork in userspace.
- Reduce memory consumption and zeroing overhead in extended attribute
scrub code.
- Fix some behavioral regressions in the new bulkstat backend code.
- Fix some behavioral regressions in the new log bio code.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEUzaAxoMeQq6m2jMV+H93GTRKtOsFAl0mGqwACgkQ+H93GTRK
tOvUGA//d9+SVDaisbFzwxQVHX2/MPJExKA56kFE4PvSQhIpvPZeGNXUXCZfQXGt
Hkwm/uEVUdndH9ERo8Z4L2nmT1WQCt3ONUihHn/aE8NSh/g1Bb1J+/t2Wr+Cdqic
sgU5imhEoMTh0cknoFcGg9PdAJ7ZOv4mr2wzDrj3odq0etP63U+zabxxlX2Lg0VW
3EXu6YhY+bmq54jdyc4xIL930opfYC64FDVmG7iNK4xlX1M2wJjdEGEP5fEchxf8
XgsWgJqxRyxJWVH3KTbpXREQkrbj+uDM8MqqFF9uYzjUt2UHc9H206BxJCUfWo7/
HGhTQmWqXmYMlA+ngEhJ3lcwsgAmHnnLL4YczbprgrQcwIbxMkpw2leEWUGOiLhH
HdNCnJNVgPcAtmjjDxENqGUR19qslrexgSH2eUrFZFM15/3GHMYT3wdEHyqiZDya
LiTHllcLDvauUmv8df51jJ+l5dd2Zj9c5iLQUOV5nRGVUj7Fyi7kAuFL+4SVcIbI
VJVFgtHKsSwKYRPMqb7iTtDnTHkwZNTtQQ1Ptpe+TUH5ngTmqNn5EQVeoUtG6HUi
0xEaA2MrhoA3VkHk1zFfzStlR8g1YsX5ajBq7luotYY+i3cC4oHLlrvIx9zI7elI
WMc6Wnn7ozetwehYROVXvgsV+wiqC4D7yWGsnFyhqaS1QakwIM0=
=VEdn
-----END PGP SIGNATURE-----
Merge tag 'xfs-5.3-merge-12' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux
Pull xfs updates from Darrick Wong:
"In this release there are a significant amounts of consolidations and
cleanups in the log code; restructuring of the log to issue struct
bios directly; new bulkstat ioctls to return v5 fs inode information
(and fix all the padding problems of the old ioctl); the beginnings of
multithreaded inode walks (e.g. quotacheck); and a reduction in memory
usage in the online scrub code leading to reduced runtimes.
- Refactor inode geometry calculation into a single structure instead
of open-coding pieces everywhere.
- Add online repair to build options.
- Remove unnecessary function call flags and functions.
- Claim maintainership of various loose xfs documentation and header
files.
- Use struct bio directly for log buffer IOs instead of struct
xfs_buf.
- Reduce log item boilerplate code requirements.
- Merge log item code spread across too many files.
- Further distinguish between log item commits and cancellations.
- Various small cleanups to the ag small allocator.
- Support cgroup-aware writeback
- libxfs refactoring for mkfs cleanup
- Remove unneeded #includes
- Fix a memory allocation miscalculation in the new log bio code
- Fix bisection problems
- Fix a crash in ioend processing caused by tripping over freeing of
preallocated transactions
- Split out a generic inode walk mechanism from the bulkstat code,
hook up all the internal users to use the walking code, then clean
up bulkstat to serve only the bulkstat ioctls.
- Add a multithreaded iwalk implementation to speed up quotacheck on
fast storage with many CPUs.
- Remove unnecessary return values in logging teardown functions.
- Supplement the bstat and inogrp structures with new bulkstat and
inumbers structures that have all the fields we need for v5
filesystem features and none of the padding problems of their
predecessors.
- Wire up new ioctls that use the new structures with a much simpler
bulk_ireq structure at the head instead of the pointerhappy mess we
had before.
- Enable userspace to constrain bulkstat returns to a single AG or a
single special inode so that we can phase out a lot of geometry
guesswork in userspace.
- Reduce memory consumption and zeroing overhead in extended
attribute scrub code.
- Fix some behavioral regressions in the new bulkstat backend code.
- Fix some behavioral regressions in the new log bio code"
* tag 'xfs-5.3-merge-12' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: (100 commits)
xfs: chain bios the right way around in xfs_rw_bdev
xfs: bump INUMBERS cursor correctly in xfs_inumbers_walk
xfs: don't update lastino for FSBULKSTAT_SINGLE
xfs: online scrub needn't bother zeroing its temporary buffer
xfs: only allocate memory for scrubbing attributes when we need it
xfs: refactor attr scrub memory allocation function
xfs: refactor extended attribute buffer pointer functions
xfs: attribute scrub should use seen_enough to pass error values
xfs: allow single bulkstat of special inodes
xfs: specify AG in bulk req
xfs: wire up the v5 inumbers ioctl
xfs: wire up new v5 bulkstat ioctls
xfs: introduce v5 inode group structure
xfs: introduce new v5 bulkstat structure
xfs: rename bulkstat functions
xfs: remove various bulk request typedef usage
fs: xfs: xfs_log: Change return type from int to void
xfs: poll waiting for quotacheck
xfs: multithreaded iwalk implementation
xfs: refactor INUMBERS to use iwalk functions
...
Here is the "big" driver core and debugfs changes for 5.3-rc1
It's a lot of different patches, all across the tree due to some api
changes and lots of debugfs cleanups. Because of this, there is going
to be some merge issues with your tree at the moment, I'll follow up
with the expected resolutions to make it easier for you.
Other than the debugfs cleanups, in this set of changes we have:
- bus iteration function cleanups (will cause build warnings
with s390 and coresight drivers in your tree)
- scripts/get_abi.pl tool to display and parse Documentation/ABI
entries in a simple way
- cleanups to Documenatation/ABI/ entries to make them parse
easier due to typos and other minor things
- default_attrs use for some ktype users
- driver model documentation file conversions to .rst
- compressed firmware file loading
- deferred probe fixes
All of these have been in linux-next for a while, with a bunch of merge
issues that Stephen has been patient with me for. Other than the merge
issues, functionality is working properly in linux-next :)
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCXSgpnQ8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+ykcwgCfS30OR4JmwZydWGJ7zK/cHqk+KjsAnjOxjC1K
LpRyb3zX29oChFaZkc5a
=XrEZ
-----END PGP SIGNATURE-----
Merge tag 'driver-core-5.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
Pull driver core and debugfs updates from Greg KH:
"Here is the "big" driver core and debugfs changes for 5.3-rc1
It's a lot of different patches, all across the tree due to some api
changes and lots of debugfs cleanups.
Other than the debugfs cleanups, in this set of changes we have:
- bus iteration function cleanups
- scripts/get_abi.pl tool to display and parse Documentation/ABI
entries in a simple way
- cleanups to Documenatation/ABI/ entries to make them parse easier
due to typos and other minor things
- default_attrs use for some ktype users
- driver model documentation file conversions to .rst
- compressed firmware file loading
- deferred probe fixes
All of these have been in linux-next for a while, with a bunch of
merge issues that Stephen has been patient with me for"
* tag 'driver-core-5.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (102 commits)
debugfs: make error message a bit more verbose
orangefs: fix build warning from debugfs cleanup patch
ubifs: fix build warning after debugfs cleanup patch
driver: core: Allow subsystems to continue deferring probe
drivers: base: cacheinfo: Ensure cpu hotplug work is done before Intel RDT
arch_topology: Remove error messages on out-of-memory conditions
lib: notifier-error-inject: no need to check return value of debugfs_create functions
swiotlb: no need to check return value of debugfs_create functions
ceph: no need to check return value of debugfs_create functions
sunrpc: no need to check return value of debugfs_create functions
ubifs: no need to check return value of debugfs_create functions
orangefs: no need to check return value of debugfs_create functions
nfsd: no need to check return value of debugfs_create functions
lib: 842: no need to check return value of debugfs_create functions
debugfs: provide pr_fmt() macro
debugfs: log errors when something goes wrong
drivers: s390/cio: Fix compilation warning about const qualifiers
drivers: Add generic helper to match by of_node
driver_find_device: Unify the match function with class_find_device()
bus_find_device: Unify the match callback with class_find_device
...
Report separate components (anon, file, and shmem) for PSS in
smaps_rollup.
This helps understand and tune the memory manager behavior in consumer
devices, particularly mobile devices. Many of them (e.g. chromebooks and
Android-based devices) use zram for anon memory, and perform disk reads
for discarded file pages. The difference in latency is large (e.g.
reading a single page from SSD is 30 times slower than decompressing a
zram page on one popular device), thus it is useful to know how much of
the PSS is anon vs. file.
All the information is already present in /proc/pid/smaps, but much more
expensive to obtain because of the large size of that procfs entry.
This patch also removes a small code duplication in smaps_account, which
would have gotten worse otherwise.
Also updated Documentation/filesystems/proc.txt (the smaps section was a
bit stale, and I added a smaps_rollup section) and
Documentation/ABI/testing/procfs-smaps_rollup.
[semenzato@chromium.org: v5]
Link: http://lkml.kernel.org/r/20190626234333.44608-1-semenzato@chromium.org
Link: http://lkml.kernel.org/r/20190626180429.174569-1-semenzato@chromium.org
Signed-off-by: Luigi Semenzato <semenzato@chromium.org>
Acked-by: Yu Zhao <yuzhao@chromium.org>
Cc: Sonny Rao <sonnyrao@chromium.org>
Cc: Yu Zhao <yuzhao@chromium.org>
Cc: Brian Geffon <bgeffon@chromium.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
- Add a new /proc/fs/nfsd/clients/ directory which exposes some
long-requested information about NFSv4 clients (like open files) and
allows forced revocation of client state.
- Replace the global duplicate reply cache by a cache per network
namespace; previously, a request in one network namespace could
incorrectly match an entry from another, though we haven't seen this
in production. This is the last remaining container bug that I'm
aware of; at this point you should be able to run separate nfsd's in
each network namespace, each with their own set of exports, and
everything should work.
- Cleanup and modify lock code to show the pid of lockd as the owner of
NLM locks. This is the correct version of the bugfix originally
attempted in b8eee0e90f "lockd: Show pid of lockd for remote locks".
-----BEGIN PGP SIGNATURE-----
iQJJBAABCAAzFiEEYtFWavXG9hZotryuJ5vNeUKO4b4FAl0mX+YVHGJmaWVsZHNA
ZmllbGRzZXMub3JnAAoJECebzXlCjuG+EoYQAIbNV7tqpnWRk19ulxveif9zRLMV
ImW99rNhzjfLoIBBTclncCrU1+b2VHqlVGYvml+rdsl+fUCESj2m9/P+D70WHDsl
tk2NJoXkSe1tW4G3YltRfSNNQIsUsEGRa88/4gAT0vYA2OCFDpYzrMleENISQFTp
QQ+p1ct5tofTZbelx5KqdFnLRnQlUeykJbW68/YKIdtNF+nhq07LlvpVKjy4f3MB
rK93qn9YUtnNKldkrP2tWjiPAnzJFiX9XFRPLo2JCv13G28XhhuNp2PmWqsVoY+/
8YMfXY9C028YbrHG9ebwH197XcY1p6ROBZhRxGczEmiSrAHLap8rNGjyYk6+4eO9
5HAFUQJcFEA1NUD84kpUKNZs9PIi818IgI5FhuJrcCKt8OAeyNJaOo0YU3EhzND2
/iPt+FCBlJwEwXI9WSjZiyW3OFKuvCZZk99iN2s33X0dNqMSrkQVe4AmHm7vYlzF
KD0pthVaOwAA9sHua5MSTpi5LHH/IBdWU49NoCgzK277w8xi05oI6ZkYFJQ9hncV
PIWtmmW1b3uHF95s6Ko7mSU7GLEWB9Ux6B1sfOVNgMETK4i2z0ezUDJ+Hp9RSDcJ
iHrU3kaGZ60uq3HPwunlhOYuSDt5sew5GIpNdheGoLOjuhySK7ZBwFuvupqZKC7H
4nxqlrHVI4B8FOAH
=pAAs
-----END PGP SIGNATURE-----
Merge tag 'nfsd-5.3' of git://linux-nfs.org/~bfields/linux
Pull nfsd updates from Bruce Fields:
"Highlights:
- Add a new /proc/fs/nfsd/clients/ directory which exposes some
long-requested information about NFSv4 clients (like open files)
and allows forced revocation of client state.
- Replace the global duplicate reply cache by a cache per network
namespace; previously, a request in one network namespace could
incorrectly match an entry from another, though we haven't seen
this in production. This is the last remaining container bug that
I'm aware of; at this point you should be able to run separate
nfsd's in each network namespace, each with their own set of
exports, and everything should work.
- Cleanup and modify lock code to show the pid of lockd as the owner
of NLM locks. This is the correct version of the bugfix originally
attempted in b8eee0e90f ("lockd: Show pid of lockd for remote
locks")"
* tag 'nfsd-5.3' of git://linux-nfs.org/~bfields/linux: (34 commits)
nfsd: Make __get_nfsdfs_client() static
nfsd: Make two functions static
nfsd: Fix misuse of strlcpy
sunrpc/cache: remove the exporting of cache_seq_next
nfsd: decode implementation id
nfsd: create xdr_netobj_dup helper
nfsd: allow forced expiration of NFSv4 clients
nfsd: create get_nfsdfs_clp helper
nfsd4: show layout stateids
nfsd: show lock and deleg stateids
nfsd4: add file to display list of client's opens
nfsd: add more information to client info file
nfsd: escape high characters in binary data
nfsd: copy client's address including port number to cl_addr
nfsd4: add a client info file
nfsd: make client/ directory names small ints
nfsd: add nfsd/clients directory
nfsd4: use reference count to free client
nfsd: rename cl_refcount
nfsd: persist nfsd filesystem across mounts
...
- Preparations for supporting encryption on ext4 filesystems where the
filesystem block size is smaller than PAGE_SIZE.
- Don't allow setting encryption policies on dead directories.
- Various cleanups.
-----BEGIN PGP SIGNATURE-----
iIoEABYIADIWIQSacvsUNc7UX4ntmEPzXCl4vpKOKwUCXSNh5xQcZWJpZ2dlcnNA
Z29vZ2xlLmNvbQAKCRDzXCl4vpKOK2GPAQDIGnAJ557jCpI23QCWLbCuF1cfEk8J
i+sGdLjx/7SVewD9F1AbNPkm1u6sF0XN7NpGXMk6nU4HDYUXxAYr2RGDYQQ=
=2vas
-----END PGP SIGNATURE-----
Merge tag 'fscrypt-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/fscrypt
Pull fscrypt updates from Eric Biggers:
- Preparations for supporting encryption on ext4 filesystems where the
filesystem block size is smaller than PAGE_SIZE.
- Don't allow setting encryption policies on dead directories.
- Various cleanups.
* tag 'fscrypt-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/fscrypt:
fscrypt: document testing with xfstests
fscrypt: remove selection of CONFIG_CRYPTO_SHA256
fscrypt: remove unnecessary includes of ratelimit.h
fscrypt: don't set policy for a dead directory
ext4: encrypt only up to last block in ext4_bio_write_page()
ext4: decrypt only the needed block in __ext4_block_zero_page_range()
ext4: decrypt only the needed blocks in ext4_block_write_begin()
ext4: clear BH_Uptodate flag on decryption error
fscrypt: decrypt only the needed blocks in __fscrypt_decrypt_bio()
fscrypt: support decrypting multiple filesystem blocks per page
fscrypt: introduce fscrypt_decrypt_block_inplace()
fscrypt: handle blocksize < PAGE_SIZE in fscrypt_zeroout_range()
fscrypt: support encrypting multiple filesystem blocks per page
fscrypt: introduce fscrypt_encrypt_block_inplace()
fscrypt: clean up some BUG_ON()s in block encryption/decryption
fscrypt: rename fscrypt_do_page_crypto() to fscrypt_crypt_block()
fscrypt: remove the "write" part of struct fscrypt_ctx
fscrypt: simplify bounce page handling
-----BEGIN PGP SIGNATURE-----
iQEzBAABCAAdFiEEq1nRK9aeMoq1VSgcnJ2qBz9kQNkFAl0mED8ACgkQnJ2qBz9k
QNnZ9wgAmC+eP8m6jB38HM7gZ+fWGEX3+FvnjbMbnXmNJTsnYWYC1VIRZhwKZb4b
42OGinfLq5tZMY/whrFBdB/c4UbVhAMhd1aFTpM2n5A6FR12YZxaLZEC+MLy3T7z
VU8m4uWDn80OvlUByo4Bylh+Icj78m8tLgj8SHSWxoh/DlGVKSLj9OKufV9Laens
YxubcUxE5sEEu8IVQen84283oJoizmeQf+f9yogAKIaskDLBzxqBIZwEACEUUchz
kEWRiHwS+Ou8EUHuwXqdKKksQgoLHEdxz2szYK1xSQ1wPmxMKPG5DqbQZv2QUBD0
Ek5T5YP4Tmph4s14n+jKDhakAJcqIQ==
=HWaa
-----END PGP SIGNATURE-----
Merge tag 'for_v5.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs
Pull ext2, udf and quota updates from Jan Kara:
- some ext2 fixes and cleanups
- a fix of udf bug when extending files
- a fix of quota Q_XGETQSTAT[V] handling
* tag 'for_v5.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
udf: Fix incorrect final NOT_ALLOCATED (hole) extent length
ext2: Use kmemdup rather than duplicating its implementation
quota: honor quota type in Q_XGETQSTAT[V] calls
ext2: Always brelse bh on failure in ext2_iget()
ext2: add missing brelse() in ext2_iget()
ext2: Fix a typo in ext2_getattr argument
ext2: fix a typo in comment
ext2: add missing brelse() in ext2_new_inode()
ext2: optimize ext2_xattr_get()
ext2: introduce new helper for xattr entry comparison
ext2: merge xattr next entry check to ext2_xattr_entry_valid()
ext2: code cleanup for ext2_preread_inode()
ext2: code cleanup by using test_opt() and clear_opt()
doc: ext2: update description of quota options for ext2
ext2: Strengthen xattr block checks
ext2: Merge loops in ext2_xattr_set()
ext2: introduce helper for xattr entry validation
ext2: introduce helper for xattr header validation
quota: add dqi_dirty_list description to comment of Dquot List Management
- A fair pile of RST conversions, many from Mauro. These create more
than the usual number of simple but annoying merge conflicts with other
trees, unfortunately. He has a lot more of these waiting on the wings
that, I think, will go to you directly later on.
- A new document on how to use merges and rebases in kernel repos, and one
on Spectre vulnerabilities.
- Various improvements to the build system, including automatic markup of
function() references because some people, for reasons I will never
understand, were of the opinion that :c:func:``function()`` is
unattractive and not fun to type.
- We now recommend using sphinx 1.7, but still support back to 1.4.
- Lots of smaller improvements, warning fixes, typo fixes, etc.
-----BEGIN PGP SIGNATURE-----
iQFDBAABCAAtFiEEIw+MvkEiF49krdp9F0NaE2wMflgFAl0krAEPHGNvcmJldEBs
d24ubmV0AAoJEBdDWhNsDH5Yg98H/AuLqO9LpOgUjF4LhyjxGPdzJkY9RExSJ7km
gznyreLCZgFaJR+AY6YDsd4Jw6OJlPbu1YM/Qo3C3WrZVFVhgL/s2ebvBgCo50A8
raAFd8jTf4/mGCHnAqRotAPQ3mETJUk315B66lBJ6Oc+YdpRhwXWq8ZW2bJxInFF
3HDvoFgMf0KhLuMHUkkL0u3fxH1iA+KvDu8diPbJYFjOdOWENz/CV8wqdVkXRSEW
DJxIq89h/7d+hIG3d1I7Nw+gibGsAdjSjKv4eRKauZs4Aoxd1Gpl62z0JNk6aT3m
dtq4joLdwScydonXROD/Twn2jsu4xYTrPwVzChomElMowW/ZBBY=
=D0eO
-----END PGP SIGNATURE-----
Merge tag 'docs-5.3' of git://git.lwn.net/linux
Pull Documentation updates from Jonathan Corbet:
"It's been a relatively busy cycle for docs:
- A fair pile of RST conversions, many from Mauro. These create more
than the usual number of simple but annoying merge conflicts with
other trees, unfortunately. He has a lot more of these waiting on
the wings that, I think, will go to you directly later on.
- A new document on how to use merges and rebases in kernel repos,
and one on Spectre vulnerabilities.
- Various improvements to the build system, including automatic
markup of function() references because some people, for reasons I
will never understand, were of the opinion that
:c:func:``function()`` is unattractive and not fun to type.
- We now recommend using sphinx 1.7, but still support back to 1.4.
- Lots of smaller improvements, warning fixes, typo fixes, etc"
* tag 'docs-5.3' of git://git.lwn.net/linux: (129 commits)
docs: automarkup.py: ignore exceptions when seeking for xrefs
docs: Move binderfs to admin-guide
Disable Sphinx SmartyPants in HTML output
doc: RCU callback locks need only _bh, not necessarily _irq
docs: format kernel-parameters -- as code
Doc : doc-guide : Fix a typo
platform: x86: get rid of a non-existent document
Add the RCU docs to the core-api manual
Documentation: RCU: Add TOC tree hooks
Documentation: RCU: Rename txt files to rst
Documentation: RCU: Convert RCU UP systems to reST
Documentation: RCU: Convert RCU linked list to reST
Documentation: RCU: Convert RCU basic concepts to reST
docs: filesystems: Remove uneeded .rst extension on toctables
scripts/sphinx-pre-install: fix out-of-tree build
docs: zh_CN: submitting-drivers.rst: Remove a duplicated Documentation/
Documentation: PGP: update for newer HW devices
Documentation: Add section about CPU vulnerabilities for Spectre
Documentation: platform: Delete x86-laptop-drivers.txt
docs: Note that :c:func: should no longer be used
...
Pull cgroup updates from Tejun Heo:
"Documentation updates and the addition of cgroup_parse_float() which
will be used by new controllers including blk-iocost"
* 'for-5.3' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
docs: cgroup-v1: convert docs to ReST and rename to *.rst
cgroup: Move cgroup_parse_float() implementation out of CONFIG_SYSFS
cgroup: add cgroup_parse_float()
The documentation is more appropriate for the administrator than for
the internal kernel API section it is currently in.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Acked-by: Christian Brauner <christian@brauner.io>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
After the update to use nlm_lockowners for the NLM server, there are no
more users of lm_compare_owner and lm_owner_key.
Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
This patch allows fallocate to allocate physical blocks for pinned file.
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
We need to derive the mount pointer from a buffer in a lot of place.
Add a direct pointer to short cut the pointer chasing.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Document how to test ext4, f2fs, and ubifs encryption with xfstests.
Reviewed-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Eric Biggers <ebiggers@google.com>
fscrypt only uses SHA-256 for AES-128-CBC-ESSIV, which isn't the default
and is only recommended on platforms that have hardware accelerated
AES-CBC but not AES-XTS. There's no link-time dependency, since SHA-256
is requested via the crypto API on first use.
To reduce bloat, we should limit FS_ENCRYPTION to selecting the default
algorithms only. SHA-256 by itself isn't that much bloat, but it's
being discussed to move ESSIV into a crypto API template, which would
incidentally bring in other things like "authenc" support, which would
all end up being built-in since FS_ENCRYPTION is now a bool.
For Adiantum encryption we already just document that users who want to
use it have to enable CONFIG_CRYPTO_ADIANTUM themselves. So, let's do
the same for AES-128-CBC-ESSIV and CONFIG_CRYPTO_SHA256.
Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Clearify d_make_root() usage, error handling and cleanup
requirements.
Signed-off-by: Ian Kent <raven@themaw.net>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
There's no need to use a .rst on Sphinx toc tables. As most of
the Documentation don't use, remove the remaing occurrences.
Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Convert the cgroup-v1 files to ReST format, in order to
allow a later addition to the admin-guide.
The conversion is actually:
- add blank lines and identation in order to identify paragraphs;
- fix tables markups;
- add some lists markups;
- mark literal blocks;
- adjust title markups.
At its new index.rst, let's add a :orphan: while this is not linked to
the main index.rst file, in order to avoid build warnings.
Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Tejun Heo <tj@kernel.org>
The conversion is actually:
- add blank lines and indentation in order to identify paragraphs;
- fix tables markups;
- add some lists markups;
- mark literal blocks;
- adjust title markups.
At its new index.rst, let's add a :orphan: while this is not linked to
the main index.rst file, in order to avoid build warnings.
Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
-----BEGIN PGP SIGNATURE-----
iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAlz8fAYeHHRvcnZhbGRz
QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiG1asH/3ySguxqtqL1MCBa
4/SZ37PHeWKMerfX6ZyJdgEqK3B+PWlmuLiOMNK5h2bPLzeQQQAmHU/mfKmpXqgB
dHwUbG9yNnyUtTfsfRqAnCA6vpuw9Yb1oIzTCVQrgJLSWD0j7scBBvmzYqguOkto
ThwigLUq3AILr8EfR4rh+GM+5Dn9OTEFAxwil9fPHQo7QoczwZxpURhScT6Co9TB
DqLA3fvXbBvLs/CZy/S5vKM9hKzC+p39ApFTURvFPrelUVnythAM0dPDJg3pIn5u
g+/+gDxDFa+7ANxvxO2ng1sJPDqJMeY/xmjJYlYyLpA33B7zLNk2vDHhAP06VTtr
XCMhQ9s=
=cb80
-----END PGP SIGNATURE-----
Merge tag 'v5.2-rc4' into mauro
We need to pick up post-rc1 changes to various document files so they don't
get lost in Mauro's massive RST conversion push.
A recent documentation conversion renamed this file but forgot
to update the links.
Fixes: af96c1e304 ("docs: filesystems: vfs: Convert vfs.txt to RST")
Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
In "Y+P" of this line, there are two non-ASCII characters(0xd9 0x8d)
following behind the 'Y'. Shown as a small '=' under the '+' in VIM
and a '賺' in webpage[1].
I think it's a mistake and remove these strange characters.
[1]: https://www.kernel.org/doc/Documentation/filesystems/xfs-delayed-logging-design.txt
Signed-off-by: Shiyang Ruan <ruansy.fnst@cn.fujitsu.com>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Currently vfs.rst does not render well into HTML the method descriptions
for VFS data structures. We can improve the HTML output by putting the
description string on a new line following the method name.
Suggested-by: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Tobin C. Harding <tobin@kernel.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
This extends the checkpoint option to allow checkpoint=disable:%u[%]
This allows you to specify what how much of the disk you are willing
to lose access to while mounting with checkpoint=disable. If the amount
lost would be higher, the mount will return -EAGAIN. This can be given
as a percent of total space, or in blocks.
Currently, we need to run garbage collection until the amount of holes
is smaller than the OVP space. With the new option, f2fs can mark
space as unusable up front instead of requiring garbage collection until
the number of holes is small enough.
Signed-off-by: Daniel Rosenberg <drosen@google.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
The single user of debugfs_create_u32_array() does not care about the
return value of it, so make it return void as there is no need to do
anything with the return value.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This patch cleans up documentation to cover missing sysfs entries.
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
vfs.txt is currently stale. If we convert it to RST this is a good
first step in the process of getting the VFS documentation up to date.
This patch does the following (all as a single patch so as not to
introduce any new SPHINX build warnings)
- Use '.. code-block:: c' for C code blocks and indent the code blocks.
- Use double backticks for struct member descriptions.
- Fix a couple of build warnings by guarding pointers (*) with double
backticks .e.g ``*ptr``.
- Add vfs to Documentation/filesystems/index.rst
The member descriptions paragraph indentation was not touched. It is
not pretty but these do not cause build warnings. These descriptions
all need updating anyways so leave it as it is for now.
Signed-off-by: Tobin C. Harding <tobin@kernel.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
There are bunch of places with 8 spaces, in preparation for correctly
indenting all code snippets (during conversion to RST) change these to
use tabspaces.
This patch is whitespace only.
Convert instances of 8 consecutive spaces to a single tabspace.
Signed-off-by: Tobin C. Harding <tobin@kernel.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Currently file pre-amble contains custom indentation. RST is not going
to like this, lets left-align the text. Put the copyright notices in a
list in preparation for converting document to RST.
Tested-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Tobin C. Harding <tobin@kernel.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Currently the licence is indicated via a custom string. We have SPDX
license identifiers now for this task.
Use SPDX license identifier matching current license string.
Tested-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Tobin C. Harding <tobin@kernel.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Kernel RST has a preferred heading adornment scheme. Currently all the
heading adornments follow this scheme except the document heading.
Use correct heading adornment for initial heading.
Tested-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Tobin C. Harding <tobin@kernel.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Currently spacing before and after headings is non-uniform. Use two
blank lines before a heading and one after the heading.
Use uniform spacing around headings.
Tested-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Tobin C. Harding <tobin@kernel.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
In preparation for conversion to RST format use the kernels favoured
documentation column width. If we are going to do this we might as well
do it thoroughly. Just do the paragraphs (not the indented stuff), the
rest will be done during indentation fix up patch.
This patch is whitespace only, no textual changes.
Use 72 character column width for all paragraph sections.
Tested-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Tobin C. Harding <tobin@kernel.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Currently sometimes document has a single space after a period and
sometimes it has double. Whichever we use it should be uniform.
Use double space after period, be uniform.
Tested-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Tobin C. Harding <tobin@kernel.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Currently the file has a bunch of spaces before tabspaces. This is a
nuisance when patching the file because they show up whenever we touch
these lines. Let's just fix them all now in preparation for doing the
RST conversion.
Remove spaces before tabspaces.
Tested-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Tobin C. Harding <tobin@kernel.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
ext2 support user/group disk quota by specifying
usrquota/grpquota option on mount, so fix the
description in the doc properly.
Signed-off-by: Chengguang Xu <cgxu519@zoho.com.cn>
Signed-off-by: Jan Kara <jack@suse.cz>
Add a description of the "ignore" pseudo mount option that can be used
to provide a generic indicator to applications that the mount entry
should be ignored when displaying mount information.
Link: http://lkml.kernel.org/r/155287084617.12593.812733161112154904.stgit@pluto.themaw.net
Signed-off-by: Ian Kent <raven@themaw.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Describe AUTOFS_EXP_FORCED in addition to AUTOFS_EXP_IMMEDIATE in the
description of the AUTOFS_DEV_IOCTL_EXPIRE_CMD ioctl.
Link: http://lkml.kernel.org/r/155287084078.12593.15000931045413195778.stgit@pluto.themaw.net
Signed-off-by: Ian Kent <raven@themaw.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Update the description of AUTOFS_EXP_LEAVES to cover its possible future
use with amd format mount maps.
Link: http://lkml.kernel.org/r/155287083538.12593.18163159677020718048.stgit@pluto.themaw.net
Signed-off-by: Ian Kent <raven@themaw.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
A "strictexpire" mount option has been added to the autofs file system.
It is meant to be used in cases where a GUI continually accesses or an
application frquently scans an automount directory tree causing an
accumulation of otherwise unused mounts.
Link: http://lkml.kernel.org/r/155287083000.12593.2722713092537666885.stgit@pluto.themaw.net
Signed-off-by: Ian Kent <raven@themaw.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Alter a few word usages in Documentation/filesystems/autofs.txt and
correct some spelling mistakes.
Link: http://lkml.kernel.org/r/155287082394.12593.6506084453911662450.stgit@pluto.themaw.net
Signed-off-by: Ian Kent <raven@themaw.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull misc vfs updates from Al Viro:
"Assorted stuff, with no common topic whatsoever..."
* 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
libfs: document simple_get_link()
Documentation/filesystems/Locking: fix ->get_link() prototype
Documentation/filesystems/vfs.txt: document how ->i_link works
Documentation/filesystems/vfs.txt: remove bogus "Last updated" date
fs: use timespec64 in relatime_need_update
fs/block_dev.c: remove unused include
Pull misc dcache updates from Al Viro:
"Most of this pile is putting name length into struct name_snapshot and
making use of it.
The beginning of this series ("ovl_lookup_real_one(): don't bother
with strlen()") ought to have been split in two (separate switch of
name_snapshot to struct qstr from overlayfs reaping the trivial
benefits of that), but I wanted to avoid a rebase - by the time I'd
spotted that it was (a) in -next and (b) close to 5.1-final ;-/"
* 'work.dcache' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
audit_compare_dname_path(): switch to const struct qstr *
audit_update_watch(): switch to const struct qstr *
inotify_handle_event(): don't bother with strlen()
fsnotify: switch send_to_group() and ->handle_event to const struct qstr *
fsnotify(): switch to passing const struct qstr * for file_name
switch fsnotify_move() to passing const struct qstr * for old_name
ovl_lookup_real_one(): don't bother with strlen()
sysv: bury the broken "quietly truncate the long filenames" logics
nsfs: unobfuscate
unexport d_alloc_pseudo()
Here is the "big" set of driver core patches for 5.2-rc1
There are a number of ACPI patches in here as well, as Rafael said they
should go through this tree due to the driver core changes they
required. They have all been acked by the ACPI developers.
There are also a number of small subsystem-specific changes in here, due
to some changes to the kobject core code. Those too have all been acked
by the various subsystem maintainers.
As for content, it's pretty boring outside of the ACPI changes:
- spdx cleanups
- kobject documentation updates
- default attribute groups for kobjects
- other minor kobject/driver core fixes
All have been in linux-next for a while with no reported issues.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCXNHDbw8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+ynDAgCfbb4LBR6I50wFXb8JM/R6cAS7qrsAn1unshKV
8XCYcif2RxjtdJWXbjdm
=/rLh
-----END PGP SIGNATURE-----
Merge tag 'driver-core-5.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
Pull driver core/kobject updates from Greg KH:
"Here is the "big" set of driver core patches for 5.2-rc1
There are a number of ACPI patches in here as well, as Rafael said
they should go through this tree due to the driver core changes they
required. They have all been acked by the ACPI developers.
There are also a number of small subsystem-specific changes in here,
due to some changes to the kobject core code. Those too have all been
acked by the various subsystem maintainers.
As for content, it's pretty boring outside of the ACPI changes:
- spdx cleanups
- kobject documentation updates
- default attribute groups for kobjects
- other minor kobject/driver core fixes
All have been in linux-next for a while with no reported issues"
* tag 'driver-core-5.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (47 commits)
kobject: clean up the kobject add documentation a bit more
kobject: Fix kernel-doc comment first line
kobject: Remove docstring reference to kset
firmware_loader: Fix a typo ("syfs" -> "sysfs")
kobject: fix dereference before null check on kobj
Revert "driver core: platform: Fix the usage of platform device name(pdev->name)"
init/config: Do not select BUILD_BIN2C for IKCONFIG
Provide in-kernel headers to make extending kernel easier
kobject: Improve doc clarity kobject_init_and_add()
kobject: Improve docs for kobject_add/del
driver core: platform: Fix the usage of platform device name(pdev->name)
livepatch: Replace klp_ktype_patch's default_attrs with groups
cpufreq: schedutil: Replace default_attrs field with groups
padata: Replace padata_attr_type default_attrs field with groups
irqdesc: Replace irq_kobj_type's default_attrs field with groups
net-sysfs: Replace ktype default_attrs field with groups
block: Replace all ktype default_attrs with groups
samples/kobject: Replace foo_ktype's default_attrs field with groups
kobject: Add support for default attribute groups to kobj_type
driver core: Postpone DMA tear-down until after devres release for probe failure
...
Pull vfs stable fodder fixes from Al Viro:
- acct_on() fix for deadlock caught by overlayfs folks
- autofs RCU use-after-free SNAFU (->d_manage() can be called
locklessly, so we need to RCU-delay freeing the objects it looks at)
- (hopefully) the end of "do we need freeing this dentry RCU-delayed"
whack-a-mole.
* 'stable-fodder' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
autofs: fix use-after-free in lockless ->d_manage()
dcache: sort the freeing-without-RCU-delay mess for good.
acct_on(): don't mess with freeze protection
A lot of ->destroy_inode() instances end with call_rcu() of a callback
that does RCU-delayed part of freeing. Introduce a new method for
doing just that, with saner signature.
Rules:
->destroy_inode ->free_inode
f g immediate call of f(),
RCU-delayed call of g()
f NULL immediate call of f(),
no RCU-delayed calls
NULL g RCU-delayed call of g()
NULL NULL RCU-delayed default freeing
IOW, NULL ->free_inode gives the same behaviour as now.
Note that NULL, NULL is equivalent to NULL, free_inode_nonrcu; we could
mandate the latter form, but that would have very little benefit beyond
making rules a bit more symmetric. It would break backwards compatibility,
require extra boilerplate and expected semantics for (NULL, NULL) pair
would have no use whatsoever...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
This file has actually been updated over 100 times since the claimed
"Last updated" date.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Since commit ff9fb72bc0 ("debugfs: return error values, not NULL")
these helper functions do not return NULL anymore (with the exception
of debugfs_create_u32_array()).
Fixes: ff9fb72bc0 ("debugfs: return error values, not NULL")
Signed-off-by: Ronald Tschalär <ronald@innovation.ch>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
No modular uses since introducion of alloc_file_pseudo(),
and the only non-modular user not in alloc_file_pseudo()
had actually been wrong - should've been d_alloc_anon().
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
For lockless accesses to dentries we don't have pinned we rely
(among other things) upon having an RCU delay between dropping
the last reference and actually freeing the memory.
On the other hand, for things like pipes and sockets we neither
do that kind of lockless access, nor want to deal with the
overhead of an RCU delay every time a socket gets closed.
So delay was made optional - setting DCACHE_RCUACCESS in ->d_flags
made sure it would happen. We tried to avoid setting it unless
we knew we need it. Unfortunately, that had led to recurring
class of bugs, in which we missed the need to set it.
We only really need it for dentries that are created by
d_alloc_pseudo(), so let's not bother with trying to be smart -
just make having an RCU delay the default. The ones that do
*not* get it set the replacement flag (DCACHE_NORCU) and we'd
better use that sparingly. d_alloc_pseudo() is the only
such user right now.
FWIW, the race that finally prompted that switch had been
between __lock_parent() of immediate subdirectory of what's
currently the root of a disconnected tree (e.g. from
open-by-handle in progress) racing with d_splice_alias()
elsewhere picking another alias for the same inode, either
on outright corrupted fs image, or (in case of open-by-handle
on NFS) that subdirectory having been just moved on server.
It's not easy to hit, so the sky is not falling, but that's
not the first race on similar missed cases and the logics
for settinf DCACHE_RCUACCESS has gotten ridiculously
convoluted.
Cc: stable@vger.kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Update the mount API docs to reflect recent changes to the code.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-----BEGIN PGP SIGNATURE-----
iQGzBAABCgAdFiEE6fsu8pdIjtWE/DpLiiy9cAdyT1EFAlyMMnkACgkQiiy9cAdy
T1ElsAv/YV7vKbDgJOQfb925LbHqaythYQf8Z9CLwJdjW96k0pNP0bB8KPgw/4dE
t0Z1rzEoS7X7A1mh52tUUWEa1ygeOekMankJZtXzkMe2rl9m846jO/ynUDB0CFlE
5OuRdFpjSMlTdHIRw8F5GTBwO8PM/MYWvoNyO9+foJp+Z/rFtTtrPuAcJvr3NP/O
vyOXXVZ+xbqWYe1s/WGzk04Fzm6gB5V0BQyUZmmf3jZen+5vmDKRa2QMlqk0tt5O
DDZYj8utkgSGtEapWPWzgWU9gIWNSN5GdeKprIGLwESKxMrGrZiZDErpHDzwPKJX
MMPlZVvpU7BYtnMQCe82EQ74Nu/YDcMCCQjnaQDWcbQVEM/bt7Z4RXVEFcVsFO9s
aXwK3iRYYjLcIxuBxM3NWeZMPa5C4u6rCMjDNp91oKm5OZtJrZmB4JOHGwoeVYEF
pJZhT/txmuws828qLmuVCh9IOKouzRH3UxZ/PBKMEtnix9rX7juqSaHCh8pxlW+1
3vQdxnx2
=dG+z
-----END PGP SIGNATURE-----
Merge tag '5.1-rc-smb3' of git://git.samba.org/sfrench/cifs-2.6
Pull more smb3 updates from Steve French:
"Various tracing and debugging improvements, crediting fixes, some
cleanup, and important fallocate fix (fixes three xfstests) and lock
fix.
Summary:
- Various additional dynamic tracing tracepoints
- Debugging improvements (including ability to query the server via
SMB3 fsctl from userspace tools which can help with stats and
debugging)
- One minor performance improvement (root directory inode caching)
- Crediting (SMB3 flow control) fixes
- Some cleanup (docs and to mknod)
- Important fixes: one to smb3 implementation of fallocate zero range
(which fixes three xfstests) and a POSIX lock fix"
* tag '5.1-rc-smb3' of git://git.samba.org/sfrench/cifs-2.6: (22 commits)
CIFS: fix POSIX lock leak and invalid ptr deref
SMB3: Allow SMB3 FSCTL queries to be sent to server from tools
cifs: fix incorrect handling of smb2_set_sparse() return in smb3_simple_falloc
smb2: fix typo in definition of a few error flags
CIFS: make mknod() an smb_version_op
cifs: minor documentation updates
cifs: remove unused value pointed out by Coverity
SMB3: passthru query info doesn't check for SMB3 FSCTL passthru
smb3: add dynamic tracepoints for simple fallocate and zero range
cifs: fix smb3_zero_range so it can expand the file-size when required
cifs: add SMB2_ioctl_init/free helpers to be used with compounding
smb3: Add dynamic trace points for various compounded smb3 ops
cifs: cache FILE_ALL_INFO for the shared root handle
smb3: display volume serial number for shares in /proc/fs/cifs/DebugData
cifs: simplify how we handle credits in compound_send_recv()
smb3: add dynamic tracepoint for timeout waiting for credits
smb3: display security information in /proc/fs/cifs/DebugData more accurately
cifs: add a timeout argument to wait_for_free_credits
cifs: prevent starvation in wait_for_free_credits for multi-credit requests
cifs: wait_for_free_credits() make it possible to wait for >=1 credits
...
We've continued mainly to fix bugs in this round, as f2fs has been shipped
in more devices. Especially, we've focused on stabilizing checkpoint=disable
feature, and provided some interfaces for QA.
Enhancement:
- expose FS_NOCOW_FL for pin_file
- run discard jobs at unmount time with timeout
- tune discarding thread to avoid idling which consumes power
- some checking codes to address vulnerabilities
- give random value to i_generation
- shutdown with more flags for QA
Bug fix:
- clean up stale objects when mount is failed along with checkpoint=disable
- fix system being stuck due to wrong count by atomic writes
- handle some corrupted disk cases
- fix a deadlock in f2fs_read_inline_dir
We've also added some minor build errors and clean-up patches.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEE00UqedjCtOrGVvQiQBSofoJIUNIFAlyKk4YACgkQQBSofoJI
UNIMVw//Rb3nmbQkMW/86DxtHDxuS8GEJmle0DiHeFMHgwy0ET0uZs9/AEfmuejC
95cXnF44QfVaFwkOXCK6aKXJXwN0+ZS0YvV/gPE8lgU6sdQhJBox5DC+rx+OwFq5
rZiF8qvE8iyM9Xt+RfMBGufzUb+LKBz0ozQFZpKJiNTBBf5vpeqMYASEEfxiEmZz
GvvUNSBRw39OB5zTl5l2hnoNqkoFu6XHnf4f9+DnraVi8SuQzj6hdqsx0nYTHfLi
Rax8kA4HUwoVgjhaLLXFbbhWIQ83bcZ0cj6wq7Lr7NbbIi7bKYP6sxtKjbe2Fuql
m9Chm2LIvD1BfJnjdTk2krqY7Z4bX/4gmXukno/8X/cjWkpBV6HFWS73iTgrJjU2
d8kBFXwlIn+JlATSjsTtdfvKkTwxUhaGw1bBA96Am4c5tLQyOqyYWcfQA/tam/v4
dM9EQX5ZeRb6NXDeIxkXNfTSpDRnqlhJsTV5aK8qporyF1RkKVbyCpSt1P4q3KO5
UwsGZLFAVMzFaUVfyIS7dR5QVczQUTCH4g0yFNpBMvF8epOA4+jbYxQeGZfqFK3H
mTC/Ba+VWWdYW2pZRNc9TnBsHg/xadMJq7EQb/ykGBe6JZJfB0wREj4LSr1lGK9a
cU8JFGyqg1Rt/uRP0bb5IIec1YVton3Lq8ND9VZPNcV/mS5Gehg=
=9BoH
-----END PGP SIGNATURE-----
Merge tag 'f2fs-for-5.1' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs
Pull f2fs updates from Jaegeuk Kim:
"We've continued mainly to fix bugs in this round, as f2fs has been
shipped in more devices. Especially, we've focused on stabilizing
checkpoint=disable feature, and provided some interfaces for QA.
Enhancements:
- expose FS_NOCOW_FL for pin_file
- run discard jobs at unmount time with timeout
- tune discarding thread to avoid idling which consumes power
- some checking codes to address vulnerabilities
- give random value to i_generation
- shutdown with more flags for QA
Bug fixes:
- clean up stale objects when mount is failed along with
checkpoint=disable
- fix system being stuck due to wrong count by atomic writes
- handle some corrupted disk cases
- fix a deadlock in f2fs_read_inline_dir
We've also added some minor build error fixes and clean-up patches"
* tag 'f2fs-for-5.1' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (53 commits)
f2fs: set pin_file under CAP_SYS_ADMIN
f2fs: fix to avoid deadlock in f2fs_read_inline_dir()
f2fs: fix to adapt small inline xattr space in __find_inline_xattr()
f2fs: fix to do sanity check with inode.i_inline_xattr_size
f2fs: give some messages for inline_xattr_size
f2fs: don't trigger read IO for beyond EOF page
f2fs: fix to add refcount once page is tagged PG_private
f2fs: remove wrong comment in f2fs_invalidate_page()
f2fs: fix to use kvfree instead of kzfree
f2fs: print more parameters in trace_f2fs_map_blocks
f2fs: trace f2fs_ioc_shutdown
f2fs: fix to avoid deadlock of atomic file operations
f2fs: fix to dirty inode for i_mode recovery
f2fs: give random value to i_generation
f2fs: no need to take page lock in readdir
f2fs: fix to update iostat correctly in IPU path
f2fs: fix encrypted page memory leak
f2fs: make fault injection covering __submit_flush_wait()
f2fs: fix to retry fill_super only if recovery failed
f2fs: silence VM_WARN_ON_ONCE in mempool_alloc
...
- rbd will now ignore discards that aren't aligned and big enough to
actually free up some space (myself). This is controlled by the new
alloc_size map option and can be disabled if needed.
- support for rbd deep-flatten feature (myself). Deep-flatten allows
"rbd flatten" to fully disconnect the clone image and its snapshots
from the parent and make the parent snapshot removable.
- a new round of cap handling improvements (Zheng Yan). The kernel
client should now be much more prompt about releasing its caps and
it is possible to put a limit on the number of caps held.
- support for getting ceph.dir.pin extended attribute (Zheng Yan)
-----BEGIN PGP SIGNATURE-----
iQFHBAABCAAxFiEEydHwtzie9C7TfviiSn/eOAIR84sFAlyH5LUTHGlkcnlvbW92
QGdtYWlsLmNvbQAKCRBKf944AhHzi9cCCACb8PiX+PZWuwboAmO66TIQGT8VgEer
/K3zU6UsmnKHldk/gyjK+ESIxX64zP9HrNGTDxlDKZTB52GDiAYbhcBnskMtrtgl
EFLweTRs6XiHI1yV3qmElyPz0eLnWBXLUW6RDoyHxGUPWuGk9Mp4Of+PSkl2aO/9
j4eBQj7FYB6XAuzwFKltFq3uKb+jODDrW7VRDDTMEYGPHZOU6EXXUEUOrAtAreiU
j9wHF2AZ61WdVjzzXF/tBHJIwGGZj8102Af4ra/UMuHmtGZag6n0eY6uzGXluY2o
uGPuhFHMExsqjhCCPHtayWJW7WG0pQKKuwT8Ucw/KPBJ6Ok3Z2tG27/8
=sQNQ
-----END PGP SIGNATURE-----
Merge tag 'ceph-for-5.1-rc1' of git://github.com/ceph/ceph-client
Pull ceph updates from Ilya Dryomov:
"The highlights are:
- rbd will now ignore discards that aren't aligned and big enough to
actually free up some space (myself). This is controlled by the new
alloc_size map option and can be disabled if needed.
- support for rbd deep-flatten feature (myself). Deep-flatten allows
"rbd flatten" to fully disconnect the clone image and its snapshots
from the parent and make the parent snapshot removable.
- a new round of cap handling improvements (Zheng Yan). The kernel
client should now be much more prompt about releasing its caps and
it is possible to put a limit on the number of caps held.
- support for getting ceph.dir.pin extended attribute (Zheng Yan)"
* tag 'ceph-for-5.1-rc1' of git://github.com/ceph/ceph-client: (26 commits)
Documentation: modern versions of ceph are not backed by btrfs
rbd: advertise support for RBD_FEATURE_DEEP_FLATTEN
rbd: whole-object write and zeroout should copyup when snapshots exist
rbd: copyup with an empty snapshot context (aka deep-copyup)
rbd: introduce rbd_obj_issue_copyup_ops()
rbd: stop copying num_osd_ops in rbd_obj_issue_copyup()
rbd: factor out __rbd_osd_req_create()
rbd: clear ->xferred on error from rbd_obj_issue_copyup()
rbd: remove experimental designation from kernel layering
ceph: add mount option to limit caps count
ceph: periodically trim stale dentries
ceph: delete stale dentry when last reference is dropped
ceph: remove dentry_lru file from debugfs
ceph: touch existing cap when handling reply
ceph: pass inclusive lend parameter to filemap_write_and_wait_range()
rbd: round off and ignore discards that are too small
rbd: handle DISCARD and WRITE_ZEROES separately
rbd: get rid of obj_req->obj_request_count
libceph: use struct_size() for kmalloc() in crush_decode()
ceph: send cap releases more aggressively
...
Pull vfs mount infrastructure updates from Al Viro:
"The rest of core infrastructure; no new syscalls in that pile, but the
old parts are switched to new infrastructure. At that point
conversions of individual filesystems can happen independently; some
are done here (afs, cgroup, procfs, etc.), there's also a large series
outside of that pile dealing with NFS (quite a bit of option-parsing
stuff is getting used there - it's one of the most convoluted
filesystems in terms of mount-related logics), but NFS bits are the
next cycle fodder.
It got seriously simplified since the last cycle; documentation is
probably the weakest bit at the moment - I considered dropping the
commit introducing Documentation/filesystems/mount_api.txt (cutting
the size increase by quarter ;-), but decided that it would be better
to fix it up after -rc1 instead.
That pile allows to do followup work in independent branches, which
should make life much easier for the next cycle. fs/super.c size
increase is unpleasant; there's a followup series that allows to
shrink it considerably, but I decided to leave that until the next
cycle"
* 'work.mount' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (41 commits)
afs: Use fs_context to pass parameters over automount
afs: Add fs_context support
vfs: Add some logging to the core users of the fs_context log
vfs: Implement logging through fs_context
vfs: Provide documentation for new mount API
vfs: Remove kern_mount_data()
hugetlbfs: Convert to fs_context
cpuset: Use fs_context
kernfs, sysfs, cgroup, intel_rdt: Support fs_context
cgroup: store a reference to cgroup_ns into cgroup_fs_context
cgroup1_get_tree(): separate "get cgroup_root to use" into a separate helper
cgroup_do_mount(): massage calling conventions
cgroup: stash cgroup_root reference into cgroup_fs_context
cgroup2: switch to option-by-option parsing
cgroup1: switch to option-by-option parsing
cgroup: take options parsing into ->parse_monolithic()
cgroup: fold cgroup1_mount() into cgroup1_get_tree()
cgroup: start switching to fs_context
ipc: Convert mqueue fs to fs_context
proc: Add fs_context support to procfs
...
This is mostly update of the usual drivers: arcmsr, qla2xxx, lpfc,
hisi_sas, target/iscsi and target/core. Additionally Christoph
refactored gdth as part of the dma changes. The major mid-layer
change this time is the removal of bidi commands and with them the
whole of the osd/exofs driver and filesystem.
Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com>
-----BEGIN PGP SIGNATURE-----
iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCXIC54SYcamFtZXMuYm90
dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishT1GAPwJEV23
ExPiPsnuVgKj49nLTagZ3rILRQcYNbL+MNYqxQEA0cT8FHzSDBfWY5OKPNE+RQ8z
f69LpXGmMpuagKGvvd4=
=Fhy1
-----END PGP SIGNATURE-----
Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI updates from James Bottomley:
"This is mostly update of the usual drivers: arcmsr, qla2xxx, lpfc,
hisi_sas, target/iscsi and target/core.
Additionally Christoph refactored gdth as part of the dma changes. The
major mid-layer change this time is the removal of bidi commands and
with them the whole of the osd/exofs driver and filesystem. This is a
major simplification for block and mq in particular"
* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (240 commits)
scsi: cxgb4i: validate tcp sequence number only if chip version <= T5
scsi: cxgb4i: get pf number from lldi->pf
scsi: core: replace GFP_ATOMIC with GFP_KERNEL in scsi_scan.c
scsi: mpt3sas: Add missing breaks in switch statements
scsi: aacraid: Fix missing break in switch statement
scsi: kill command serial number
scsi: csiostor: drop serial_number usage
scsi: mvumi: use request tag instead of serial_number
scsi: dpt_i2o: remove serial number usage
scsi: st: osst: Remove negative constant left-shifts
scsi: ufs-bsg: Allow reading descriptors
scsi: ufs: Allow reading descriptor via raw upiu
scsi: ufs-bsg: Change the calling convention for write descriptor
scsi: ufs: Remove unused device quirks
Revert "scsi: ufs: disable vccq if it's not needed by UFS device"
scsi: megaraid_sas: Remove a bunch of set but not used variables
scsi: clean obsolete return values of eh_timed_out
scsi: sd: Optimal I/O size should be a multiple of physical block size
scsi: MAINTAINERS: SCSI initiator and target tweaks
scsi: fcoe: make use of fip_mode enum complete
...
First: Ted, Jaegeuk, and I have decided to add me as a co-maintainer for
fscrypt, and we're now using a shared git tree. So we've updated
MAINTAINERS accordingly, and I'm doing the pull request this time.
The actual changes for v5.1 are:
- Remove the fs-specific kconfig options like CONFIG_EXT4_ENCRYPTION and
make fscrypt support for all fscrypt-capable filesystems be controlled
by CONFIG_FS_ENCRYPTION, similar to how CONFIG_QUOTA works.
- Improve error code for rename() and link() into encrypted directories.
- Various cleanups.
-----BEGIN PGP SIGNATURE-----
iIoEABYIADIWIQSacvsUNc7UX4ntmEPzXCl4vpKOKwUCXH2YDRQcZWJpZ2dlcnNA
Z29vZ2xlLmNvbQAKCRDzXCl4vpKOK+SAAQCWYOTwYko8uE8Ze8i2fiUm0vr91NOg
zj5DGmK7Izxy/gEAsNDOVA7zWrDg/f5600/7aLpDQQTGHA38YVsgiyd7DgY=
=S3tT
-----END PGP SIGNATURE-----
Merge tag 'fscrypt-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/fscrypt
Pull fscrypt updates from Eric Biggers:
"First: Ted, Jaegeuk, and I have decided to add me as a co-maintainer
for fscrypt, and we're now using a shared git tree. So we've updated
MAINTAINERS accordingly, and I'm doing the pull request this time.
The actual changes for v5.1 are:
- Remove the fs-specific kconfig options like CONFIG_EXT4_ENCRYPTION
and make fscrypt support for all fscrypt-capable filesystems be
controlled by CONFIG_FS_ENCRYPTION, similar to how CONFIG_QUOTA
works.
- Improve error code for rename() and link() into encrypted
directories.
- Various cleanups"
* tag 'fscrypt-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/fscrypt:
MAINTAINERS: add Eric Biggers as an fscrypt maintainer
fscrypt: return -EXDEV for incompatible rename or link into encrypted dir
fscrypt: remove filesystem specific build config option
f2fs: use IS_ENCRYPTED() to check encryption status
ext4: use IS_ENCRYPTED() to check encryption status
fscrypt: remove CRYPTO_CTR dependency
and more translations. There's also some LICENSES adjustments from
Thomas.
-----BEGIN PGP SIGNATURE-----
iQFDBAABCAAtFiEEIw+MvkEiF49krdp9F0NaE2wMflgFAlyBl54PHGNvcmJldEBs
d24ubmV0AAoJEBdDWhNsDH5YxoYH/3OcInUSk17Cb+wNpnJX66dXyVvzZcuAh5aU
HW5YWIIlp60jwsM0z+sVqNR51tfC+eMjw2HOWj0hOEUju7UGm7aDtB+WkEeJ7GUk
e/FX+GXD/OygQtpwXRQraWU/RO3RPSB9JKodF5tQ6aihOzsQGB9c11I0/f3Qp7+U
vaLBOdAlpQYemlzLKbskRZ2YpokELfpgwSb6O7mpI9i3mJeZA/lpyYSmHQxqwvG7
sqrmm7vHB7b0tZGqQISQaZNdUmSSD1lRfOX3brFw2DOIj2V2M1+O/8smBtRuAGf5
B03C7LjkNFn55tn1OHYlWEv8RpG5kH3VNc896jiWPDOXNpMSgl8=
=bOsl
-----END PGP SIGNATURE-----
Merge tag 'docs-5.1' of git://git.lwn.net/linux
Pull documentation updates from Jonathan Corbet:
"A fairly routine cycle for docs - lots of typo fixes, some new
documents, and more translations. There's also some LICENSES
adjustments from Thomas"
* tag 'docs-5.1' of git://git.lwn.net/linux: (74 commits)
docs: Bring some order to filesystem documentation
Documentation/locking/lockdep: Drop last two chars of sample states
doc: rcu: Suspicious RCU usage is a warning
docs: driver-api: iio: fix errors in documentation
Documentation/process/howto: Update for 4.x -> 5.x versioning
docs: Explicitly state that the 'Fixes:' tag shouldn't split lines
doc: security: Add kern-doc for lsm_hooks.h
doc: sctp: Merge and clean up rst files
Docs: Correct /proc/stat path
scripts/spdxcheck.py: fix C++ comment style detection
doc: fix typos in license-rules.rst
Documentation: fix admin-guide/README.rst minimum gcc version requirement
doc: process: complete removal of info about -git patches
doc: translations: sync translations 'remove info about -git patches'
perf-security: wrap paragraphs on 72 columns
perf-security: elaborate on perf_events/Perf privileged users
perf-security: document collected perf_events/Perf data categories
perf-security: document perf_events/Perf resource control
sysfs.txt: add note on available attribute macros
docs: kernel-doc: typo "if ... if" -> "if ... is"
...
-----BEGIN PGP SIGNATURE-----
iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAlx63XIQHGF4Ym9lQGtl
cm5lbC5kawAKCRD301j7KXHgpp2vEACfrrQsap7R+Av28mmXpmXi2FPa3g5Tev1t
yYjK2qHvhlMZjPTYw3hCmbYdDDczlF7PEgSE2x2DjdcsYapb8Fy1lZ2X16c7ztBR
HD/t9b5AVSQsczZzKgv3RqsNtTnjzS5V0A8XH8FAP2QRgiwDMwSN6G0FP0JBLbE/
ZgxQrH1Iy1F33Wz4hI3Z7dEghKPZrH1IlegkZCEu47q9SlWS76qUetSy2GEtchOl
3Lgu54mQZyVdI5/QZf9DyMDLF6dIz3tYU2qhuo01AHjGRCC72v86p8sIiXcUr94Q
8pbegJhJ/g8KBol9Qhv3+pWG/QUAZwi/ZwasTkK+MJ4klRXfOrznxPubW1z6t9Vn
QRo39Po5SqqP0QWAscDxCFjESIQlWlKa+LZurJL7DJDCUGrSgzTpnVwFqKwc5zTP
HJa5MT2tEeL2TfUYRYCfh0ZV0elINdHA1y1klDBh38drh4EWr2gW8xdseGYXqRjh
fLgEpoF7VQ8kTvxKN+E4jZXkcZmoLmefp0ZyAbblS6IawpPVC7kXM9Fdn2OU8f2c
fjVjvSiqxfeN6dnpfeLDRbbN9894HwgP/LPropJOQ7KmjCorQq5zMDkAvoh3tElq
qwluRqdBJpWT/F05KweY+XVW8OawIycmUWqt6JrVNoIDAK31auHQv47kR0VA4OvE
DRVVhYpocw==
=VBaU
-----END PGP SIGNATURE-----
Merge tag 'for-5.1/block-20190302' of git://git.kernel.dk/linux-block
Pull block layer updates from Jens Axboe:
"Not a huge amount of changes in this round, the biggest one is that we
finally have Mings multi-page bvec support merged. Apart from that,
this pull request contains:
- Small series that avoids quiescing the queue for sysfs changes that
match what we currently have (Aleksei)
- Series of bcache fixes (via Coly)
- Series of lightnvm fixes (via Mathias)
- NVMe pull request from Christoph. Nothing major, just SPDX/license
cleanups, RR mp policy (Hannes), and little fixes (Bart,
Chaitanya).
- BFQ series (Paolo)
- Save blk-mq cpu -> hw queue mapping, removing a pointer indirection
for the fast path (Jianchao)
- fops->iopoll() added for async IO polling, this is a feature that
the upcoming io_uring interface will use (Christoph, me)
- Partition scan loop fixes (Dongli)
- mtip32xx conversion from managed resource API (Christoph)
- cdrom registration race fix (Guenter)
- MD pull from Song, two minor fixes.
- Various documentation fixes (Marcos)
- Multi-page bvec feature. This brings a lot of nice improvements
with it, like more efficient splitting, larger IOs can be supported
without growing the bvec table size, and so on. (Ming)
- Various little fixes to core and drivers"
* tag 'for-5.1/block-20190302' of git://git.kernel.dk/linux-block: (117 commits)
block: fix updating bio's front segment size
block: Replace function name in string with __func__
nbd: propagate genlmsg_reply return code
floppy: remove set but not used variable 'q'
null_blk: fix checking for REQ_FUA
block: fix NULL pointer dereference in register_disk
fs: fix guard_bio_eod to check for real EOD errors
blk-mq: use HCTX_TYPE_DEFAULT but not 0 to index blk_mq_tag_set->map
block: optimize bvec iteration in bvec_iter_advance
block: introduce mp_bvec_for_each_page() for iterating over page
block: optimize blk_bio_segment_split for single-page bvec
block: optimize __blk_segment_map_sg() for single-page bvec
block: introduce bvec_nth_page()
iomap: wire up the iopoll method
block: add bio_set_polled() helper
block: wire up block device iopoll method
fs: add an iopoll method to struct file_operations
loop: set GENHD_FL_NO_PART_SCAN after blkdev_reread_part()
loop: do not print warn message if partition scan is successful
block: bounce: make sure that bvec table is updated
...
Documentation/filesystems is, like much of the rest of the kernel's
documentation, a jumble of unorganized information. Split the
documentation into categories and try to bring some order to the top-level
index.rst files. No text changes other than a few section-introductory
blurbs; this is all just moving stuff around.
Cc: linux-fsdevel@vger.kernel.org
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
We missed to add document for inline_xattr_size mount option in f2fs.txt,
add it.
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
If number of caps exceed the limit, ceph_trim_dentires() also trim
dentries with valid leases. Trimming dentry releases references to
associated inode, which may evict inode and release caps.
By default, there is no limit for caps count.
Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
This new methods is used to explicitly poll for I/O completion for an
iocb. It must be called for any iocb submitted asynchronously (that
is with a non-null ki_complete) which has the IOCB_HIPRI flag set.
The method is assisted by a new ki_cookie field in struct iocb to store
the polling cookie.
Reviewed-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
The common cases of attributes wrappers should probably be using the
__ATTR_XXX macros to make code more concise and readable but the current
sysfs.txt does not point developers to those convenience macros. Further
there is no note in sysfs.txt currently explaining why trying to set a
sysfs file to mode 0666 will fail respectively revert to 0664.
Signed-off-by: Nicholas Mc Guire <hofrat@osadl.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Currently we have a few PTAGs in place allowing us to transform a filesystem
error in a BUG() call. However, we don't have a panic tag for corrupt
metadata, so introduce XFS_PTAG_VERIFIER_ERROR so that the administrator can
use the fs.xfs.panic_mask sysctl knob to convert any error detected by buffer
verifiers into a kernel panic.
Signed-off-by: Marco Benatto <mbenatto@redhat.com>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
[darrick: light editing of commit message]
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
This was an example for using the SCSI OSD protocol, which we're trying
to remove.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Currently, trying to rename or link a regular file, directory, or
symlink into an encrypted directory fails with EPERM when the source
file is unencrypted or is encrypted with a different encryption policy,
and is on the same mountpoint. It is correct for the operation to fail,
but the choice of EPERM breaks tools like 'mv' that know to copy rather
than rename if they see EXDEV, but don't know what to do with EPERM.
Our original motivation for EPERM was to encourage users to securely
handle their data. Encrypting files by "moving" them into an encrypted
directory can be insecure because the unencrypted data may remain in
free space on disk, where it can later be recovered by an attacker.
It's much better to encrypt the data from the start, or at least try to
securely delete the source data e.g. using the 'shred' program.
However, the current behavior hasn't been effective at achieving its
goal because users tend to be confused, hack around it, and complain;
see e.g. https://github.com/google/fscrypt/issues/76. And in some cases
it's actually inconsistent or unnecessary. For example, 'mv'-ing files
between differently encrypted directories doesn't work even in cases
where it can be secure, such as when in userspace the same passphrase
protects both directories. Yet, you *can* already 'mv' unencrypted
files into an encrypted directory if the source files are on a different
mountpoint, even though doing so is often insecure.
There are probably better ways to teach users to securely handle their
files. For example, the 'fscrypt' userspace tool could provide a
command that migrates unencrypted files into an encrypted directory,
acting like 'shred' on the source files and providing appropriate
warnings depending on the type of the source filesystem and disk.
Receiving errors on unimportant files might also force some users to
disable encryption, thus making the behavior counterproductive. It's
desirable to make encryption as unobtrusive as possible.
Therefore, change the error code from EPERM to EXDEV so that tools
looking for EXDEV will fall back to a copy.
This, of course, doesn't prevent users from still doing the right things
to securely manage their files. Note that this also matches the
behavior when a file is renamed between two project quota hierarchies;
so there's precedent for using EXDEV for things other than mountpoints.
xfstests generic/398 will require an update with this change.
[Rewritten from an earlier patch series by Michael Halcrow.]
Cc: Michael Halcrow <mhalcrow@google.com>
Cc: Joe Richey <joerichey@google.com>
Signed-off-by: Eric Biggers <ebiggers@google.com>
In order to have a common code base for fscrypt "post read" processing
for all filesystems which support encryption, this commit removes
filesystem specific build config option (e.g. CONFIG_EXT4_FS_ENCRYPTION)
and replaces it with a build option (i.e. CONFIG_FS_ENCRYPTION) whose
value affects all the filesystems making use of fscrypt.
Reviewed-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Chandan Rajendra <chandan@linux.vnet.ibm.com>
Signed-off-by: Eric Biggers <ebiggers@google.com>
This documents the Android binderfs filesystem used to dynamically add and
remove binder devices that are private to each instance.
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
[jc: tweaked markup and added to filesystems/index.rst]
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
We are getting rid of the "raw" BUS_ATTR() macro, so fix up the
documentation to not refer to it anymore.
Cc: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Fix Sphinx warnings in path-lookup.rst:
Documentation/filesystems/path-lookup.rst:347: WARNING: Title underline too short.
Documentation/filesystems/path-lookup.rst:358: WARNING: Title underline too short.
[...]
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Acked-by: NeilBrown <neilb@suse.com>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
-----BEGIN PGP SIGNATURE-----
iQEzBAABCAAdFiEEK2m5VNv+CHkogTfJ8vlZVpUNgaMFAlwyBbEACgkQ8vlZVpUN
gaNrawgAhYWrPwsEFM17dziRWRm8Ub9QgQUK6JRt+vE5KCRRVdXgJSLVH4esW9rJ
X+QQ0diT8ZMKjdbsyz0cVmwP7nqQ5EKzjxts6J8vtbWDB6+nvaDLNdicJgUOprcT
jIi8/45XKmyGUVO9Au6Wdda/zZi4dQBkXd+zUFGWYQRYL0LgmboWHKlaWueu7Qha
xVtavYPSKUSMH8+r1F+HU6P41+1IBiuK4tCwfKfAqJ367Ushzk9xVKHNGrGDAQNi
BTbn4NOOFaYvmVudJbQjD3tHtuQu2JsxlclB5KAtLBm1r3+vb3fMGsNyPBUmNp6Y
YE/xKhACP4kYlk9xCG7vWcWGyTu90g==
=HR7f
-----END PGP SIGNATURE-----
Merge tag 'fscrypt_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/fscrypt
Pull fscrypt updates from Ted Ts'o:
"Add Adiantum support for fscrypt"
* tag 'fscrypt_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/fscrypt:
fscrypt: add Adiantum support
Add support for the Adiantum encryption mode to fscrypt. Adiantum is a
tweakable, length-preserving encryption mode with security provably
reducible to that of XChaCha12 and AES-256, subject to a security bound.
It's also a true wide-block mode, unlike XTS. See the paper
"Adiantum: length-preserving encryption for entry-level processors"
(https://eprint.iacr.org/2018/720.pdf) for more details. Also see
commit 059c2a4d8e ("crypto: adiantum - add Adiantum support").
On sufficiently long messages, Adiantum's bottlenecks are XChaCha12 and
the NH hash function. These algorithms are fast even on processors
without dedicated crypto instructions. Adiantum makes it feasible to
enable storage encryption on low-end mobile devices that lack AES
instructions; currently such devices are unencrypted. On ARM Cortex-A7,
on 4096-byte messages Adiantum encryption is about 4 times faster than
AES-256-XTS encryption; decryption is about 5 times faster.
In fscrypt, Adiantum is suitable for encrypting both file contents and
names. With filenames, it fixes a known weakness: when two filenames in
a directory share a common prefix of >= 16 bytes, with CTS-CBC their
encrypted filenames share a common prefix too, leaking information.
Adiantum does not have this problem.
Since Adiantum also accepts long tweaks (IVs), it's also safe to use the
master key directly for Adiantum encryption rather than deriving
per-file keys, provided that the per-file nonce is included in the IVs
and the master key isn't used for any other encryption mode. This
configuration saves memory and improves performance. A new fscrypt
policy flag is added to allow users to opt-in to this configuration.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
-----BEGIN PGP SIGNATURE-----
iQFDBAABCAAtFiEEIw+MvkEiF49krdp9F0NaE2wMflgFAlwv15sPHGNvcmJldEBs
d24ubmV0AAoJEBdDWhNsDH5YxksH/2kdPM4ltyUfb7Nl3ioX6UQdiNf8zzYWXG+6
TllwzGWpI1nK5H+hOGRVLeF/CPNdij/9ScdMhRWTb7Di2mlp3py+5bebZgkTA4KJ
1wy+wnonbtNkHenAjP/e14PL8/JSsyTugADnLwxb4PiURiHiAhvM4jTuxsYAhAQf
LlBoGyfowzI/laNRoh8RonHFtPI3U2oMkhtdx5OIySMlMJNgEIID63KkJsdsIujz
CDUijaFX226s9PiobMNX09Y99fSfOly4yBASabePwrUtVKKL7AJ/vBTgqgdgVTBk
ixTaooEYyLWaPSjMFNYlWH9hCu+N7MZAhrdNNPhHjgGJjTjaFXQ=
=VfF6
-----END PGP SIGNATURE-----
Merge tag 'docs-5.0-fixes' of git://git.lwn.net/linux
Pull documentation fixes from Jonathan Corbet:
"A handful of late-arriving documentation fixes"
* tag 'docs-5.0-fixes' of git://git.lwn.net/linux:
doc: filesystems: fix bad references to nonexistent ext4.rst file
Documentation/admin-guide: update URL of LKML information link
Docs/kernel-api.rst: Remove blk-tag.c reference
The ext4.rst file does not exist anymore. This patch changes all references
to point to the whole ext4 directory.
Fixes: d309121592 ("docs: move ext4 administrative docs to admin-guide/")
Signed-off-by: Otto Sabart <ottosabart@seberm.com>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
-----BEGIN PGP SIGNATURE-----
iQGzBAABCgAdFiEE6fsu8pdIjtWE/DpLiiy9cAdyT1EFAlwqtf4ACgkQiiy9cAdy
T1GLwAv+I4MaCe5oq/IHDZnr09Mb/sIRLqLXnMWJciRHedHFIa/x2egb+584M+bf
Lrb3UjDyS4aXV8cjrm4XO8zzzvQkTRLtaJrlxo/b1oDZJ8JkH2M6EeNr5gAB6qso
dbmUX59YMX8KSpmQMhigcv+ilOQdokDWVdxqZ2ezbEMeVMotkQOnhrcSiJPx05QS
CRktWjSn7JKD87cj8i0dTX+txBPX9iIpYQJGWdbJa2n6V8mQkx9JPgyQCC/FwKF2
TzCXl7wfn1gTnFSxCa/sq7lnYAr6xCngbFi+pgVU+O/Aw0dyW3AoKfF7hBOo+gAH
ZJALnvhb8pJmKolXFt7OKQKuOoJSq8MInsjKSKgSe0Xt1yHEtm7IJPy6Kbj3zKVy
TuDq1KXstB5m3uwO3QBmzGxZ7rCB4B1w1cGjn8MFcpK4+tOxtmSvIeYuzEj9Vxet
5JFZzMICFyzedyuBaRxyEX8SKH7CxOXCiDajxLsp7GI8KN1i0skzjgpbmZ/tdRbB
kHaPnRdU
=rYS/
-----END PGP SIGNATURE-----
Merge tag '4.21-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6
Pull cifs updates from Steve French:
- four fixes for stable
- improvements to DFS including allowing failover to alternate targets
- some small performance improvements
* tag '4.21-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6: (39 commits)
cifs: update internal module version number
cifs: we can not use small padding iovs together with encryption
cifs: Minor Kconfig clarification
cifs: Always resolve hostname before reconnecting
cifs: Add support for failover in cifs_reconnect_tcon()
cifs: Add support for failover in smb2_reconnect()
cifs: Only free DFS target list if we actually got one
cifs: start DFS cache refresher in cifs_mount()
cifs: Use GFP_ATOMIC when a lock is held in cifs_mount()
cifs: Add support for failover in cifs_reconnect()
cifs: Add support for failover in cifs_mount()
cifs: remove set but not used variable 'sep'
cifs: Make use of DFS cache to get new DFS referrals
cifs: minor updates to documentation
cifs: check kzalloc return
cifs: remove set but not used variable 'server'
cifs: Use kzfree() to free password
cifs: Fix to use kmem_cache_free() instead of kfree()
cifs: update for current_kernel_time64() removal
cifs: Add DFS cache routines
...
document on perf security, more Italian translations, more
improvements to the memory-management docs, improvements to the
pathname lookup documentation, and the usual array of smaller
fixes.
-----BEGIN PGP SIGNATURE-----
iQFDBAABCAAtFiEEIw+MvkEiF49krdp9F0NaE2wMflgFAlwmSPkPHGNvcmJldEBs
d24ubmV0AAoJEBdDWhNsDH5Y9ZoH/joPnMFykOxS0SmdfI7Z+F4EiJct/ZwF9bHx
T673T0RC30IgnUXGmBl5OtktfWqVh9aGqHOGwgh65ybp2QvzemdP0k6Lu6RtwNk9
6LfkpvuUb8FzaQmCHnSMzMSDmXtZUw3Z/mOjCBcQtfGAsUULNT08xl+Dr+gwWIWt
H+gPEEP+MCXTOQO1jm2dHOHW8NGm6XOijMTpOxp/pkoEY5tUxkVB1T//8EeX7LVh
c1QHzFrufE3bmmubCLtIuyVqZbm/V5l6rHREDQ46fnH/G9fM4gojzsrAL/Y2m4bt
E4y0XJHycjLMRDimAnYhbPm1ryTFAX1lNzHP3M/EF6Heqx8YHAk=
=vtwu
-----END PGP SIGNATURE-----
Merge tag 'docs-5.0' of git://git.lwn.net/linux
Pull documentation update from Jonathan Corbet:
"A fairly normal cycle for documentation stuff. We have a new document
on perf security, more Italian translations, more improvements to the
memory-management docs, improvements to the pathname lookup
documentation, and the usual array of smaller fixes.
As is often the case, there are a few reaches outside of
Documentation/ to adjust kerneldoc comments"
* tag 'docs-5.0' of git://git.lwn.net/linux: (38 commits)
docs: improve pathname-lookup document structure
configfs: fix wrong name of struct in documentation
docs/mm-api: link slab_common.c to "The Slab Cache" section
slab: make kmem_cache_create{_usercopy} description proper kernel-doc
doc:process: add links where missing
docs/core-api: make mm-api.rst more structured
x86, boot: documentation whitespace fixup
Documentation: devres: note checking needs when converting
doc🇮🇹 add some process/* translations
doc🇮🇹 fixes in process/1.Intro
Documentation: convert path-lookup from markdown to resturctured text
Documentation/admin-guide: update admin-guide index.rst
Documentation/admin-guide: introduce perf-security.rst file
scripts/kernel-doc: Fix struct and struct field attribute processing
Documentation: dev-tools: Fix typos in index.rst
Correct gen_init_cpio tool's documentation
Document /proc/pid PID reuse behavior
Documentation: update path-lookup.md for parallel lookups
Documentation: Use "while" instead of "whilst"
dmaengine: Add mailing list address to the documentation
...
David Rientjes has reported that commit 1860033237 ("mm: make
PR_SET_THP_DISABLE immediately active") has changed the way how we
report THPable VMAs to the userspace. Their monitoring tool is
triggering false alarms on PR_SET_THP_DISABLE tasks because it considers
an insufficient THP usage as a memory fragmentation resp. memory
pressure issue.
Before the said commit each newly created VMA inherited VM_NOHUGEPAGE
flag and that got exposed to the userspace via /proc/<pid>/smaps file.
This implementation had its downsides as explained in the commit message
but it is true that the userspace doesn't have any means to query for
the process wide THP enabled/disabled status.
PR_SET_THP_DISABLE is a process wide flag so it makes a lot of sense to
export in the process wide context rather than per-vma. Introduce a new
field to /proc/<pid>/status which export this status. If
PR_SET_THP_DISABLE is used then it reports false same as when the THP is
not compiled in. It doesn't consider the global THP status because we
already export that information via sysfs
Link: http://lkml.kernel.org/r/20181211143641.3503-4-mhocko@kernel.org
Fixes: 1860033237 ("mm: make PR_SET_THP_DISABLE immediately active")
Signed-off-by: Michal Hocko <mhocko@suse.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Reported-by: David Rientjes <rientjes@google.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Mike Rapoport <rppt@linux.ibm.com>
Cc: Paul Oppenheimer <bepvte@gmail.com>
Cc: William Kucharski <william.kucharski@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Userspace falls short when trying to find out whether a specific memory
range is eligible for THP. There are usecases that would like to know
that
http://lkml.kernel.org/r/alpine.DEB.2.21.1809251248450.50347@chino.kir.corp.google.com
: This is used to identify heap mappings that should be able to fault thp
: but do not, and they normally point to a low-on-memory or fragmentation
: issue.
The only way to deduce this now is to query for hg resp. nh flags and
confronting the state with the global setting. Except that there is also
PR_SET_THP_DISABLE that might change the picture. So the final logic is
not trivial. Moreover the eligibility of the vma depends on the type of
VMA as well. In the past we have supported only anononymous memory VMAs
but things have changed and shmem based vmas are supported as well these
days and the query logic gets even more complicated because the
eligibility depends on the mount option and another global configuration
knob.
Simplify the current state and report the THP eligibility in
/proc/<pid>/smaps for each existing vma. Reuse
transparent_hugepage_enabled for this purpose. The original
implementation of this function assumes that the caller knows that the vma
itself is supported for THP so make the core checks into
__transparent_hugepage_enabled and use it for existing callers.
__show_smap just use the new transparent_hugepage_enabled which also
checks the vma support status (please note that this one has to be out of
line due to include dependency issues).
[mhocko@kernel.org: fix oops with NULL ->f_mapping]
Link: http://lkml.kernel.org/r/20181224185106.GC16738@dhcp22.suse.cz
Link: http://lkml.kernel.org/r/20181211143641.3503-3-mhocko@kernel.org
Signed-off-by: Michal Hocko <mhocko@suse.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Mike Rapoport <rppt@linux.ibm.com>
Cc: Paul Oppenheimer <bepvte@gmail.com>
Cc: William Kucharski <william.kucharski@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Patch series "THP eligibility reporting via proc".
This series of three patches aims at making THP eligibility reporting much
more robust and long term sustainable. The trigger for the change is a
regression report [2] and the long follow up discussion. In short the
specific application didn't have good API to query whether a particular
mapping can be backed by THP so it has used VMA flags to workaround that.
These flags represent a deep internal state of VMAs and as such they
should be used by userspace with a great deal of caution.
A similar has happened for [3] when users complained that VM_MIXEDMAP is
no longer set on DAX mappings. Again a lack of a proper API led to an
abuse.
The first patch in the series tries to emphasise that that the semantic of
flags might change and any application consuming those should be really
careful.
The remaining two patches provide a more suitable interface to address [2]
and provide a consistent API to query the THP status both for each VMA and
process wide as well. [1]
http://lkml.kernel.org/r/20181120103515.25280-1-mhocko@kernel.org [2]
http://lkml.kernel.org/r/http://lkml.kernel.org/r/alpine.DEB.2.21.1809241054050.224429@chino.kir.corp.google.com
[3] http://lkml.kernel.org/r/20181002100531.GC4135@quack2.suse.cz
This patch (of 3):
Even though vma flags exported via /proc/<pid>/smaps are explicitly
documented to be not guaranteed for future compatibility the warning
doesn't go far enough because it doesn't mention semantic changes to those
flags. And they are important as well because these flags are a deep
implementation internal to the MM code and the semantic might change at
any time.
Let's consider two recent examples:
http://lkml.kernel.org/r/20181002100531.GC4135@quack2.suse.cz
: commit e1fb4a0864 "dax: remove VM_MIXEDMAP for fsdax and device dax" has
: removed VM_MIXEDMAP flag from DAX VMAs. Now our testing shows that in the
: mean time certain customer of ours started poking into /proc/<pid>/smaps
: and looks at VMA flags there and if VM_MIXEDMAP is missing among the VMA
: flags, the application just fails to start complaining that DAX support is
: missing in the kernel.
http://lkml.kernel.org/r/alpine.DEB.2.21.1809241054050.224429@chino.kir.corp.google.com
: Commit 1860033237 ("mm: make PR_SET_THP_DISABLE immediately active")
: introduced a regression in that userspace cannot always determine the set
: of vmas where thp is ineligible.
: Userspace relies on the "nh" flag being emitted as part of /proc/pid/smaps
: to determine if a vma is eligible to be backed by hugepages.
: Previous to this commit, prctl(PR_SET_THP_DISABLE, 1) would cause thp to
: be disabled and emit "nh" as a flag for the corresponding vmas as part of
: /proc/pid/smaps. After the commit, thp is disabled by means of an mm
: flag and "nh" is not emitted.
: This causes smaps parsing libraries to assume a vma is eligible for thp
: and ends up puzzling the user on why its memory is not backed by thp.
In both cases userspace was relying on a semantic of a specific VMA flag.
The primary reason why that happened is a lack of a proper interface.
While this has been worked on and it will be fixed properly, it seems that
our wording could see some refinement and be more vocal about semantic
aspect of these flags as well.
Link: http://lkml.kernel.org/r/20181211143641.3503-2-mhocko@kernel.org
Signed-off-by: Michal Hocko <mhocko@suse.com>
Acked-by: Jan Kara <jack@suse.cz>
Acked-by: Dan Williams <dan.j.williams@intel.com>
Acked-by: David Rientjes <rientjes@google.com>
Acked-by: Mike Rapoport <rppt@linux.ibm.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Paul Oppenheimer <bepvte@gmail.com>
Cc: William Kucharski <william.kucharski@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Get rid of some unneeded structural elements around the new (to RST)
pathname-lookup document.
Signed-off-by: NeilBrown <neilb@suse.com>
[ jc: grabbed from email and changelog added ]
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
The name of the struct is configfs_bin_attribute instead of
configfs_attribute
Signed-off-by: Helen Koike <helen.koike@collabora.com>
Fixes: 03607ace80 ("configfs: implement binary attributes")
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
This allows the document to be integrated with the main documentation
tree.
Changes include:
- rename from .md to .rst
- use `` for code, not single `
- use correct sub-section marking
- fix indented blocks, both code and non-code
- fix external-link markup
Signed-off-by: NeilBrown <neilb@suse.com>
[jc: changed the toctree organization a bit]
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
State explicitly that holding a /proc/pid file descriptor open does
not reserve the PID. Also note that in the event of PID reuse, these
open file descriptors refer to the old, now-dead process, and not the
new one that happens to be named the same numeric PID.
Signed-off-by: Daniel Colascione <dancol@google.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Reviewed-by: Mike Rapoport <rppt@linux.ibm.com>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Since this document was written, i_mutex has been replace with
i_rwsem, and shared locks are utilized to allow lookups in the one
directory to happen in parallel.
So replace i_mutex with i_rwsem, and explain how this is used for
parallel lookups.
Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Whilst making an unrelated change to some Documentation, Linus sayeth:
| Afaik, even in Britain, "whilst" is unusual and considered more
| formal, and "while" is the common word.
|
| [...]
|
| Can we just admit that we work with computers, and we don't need to
| use þe eald Englisc spelling of words that most of the world never
| uses?
dictionary.com refers to the word as "Chiefly British", which is
probably an undesirable attribute for technical documentation.
Replace all occurrences under Documentation/ with "while".
Cc: David Howells <dhowells@redhat.com>
Cc: Liam Girdwood <lgirdwood@gmail.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Michael Halcrow <mhalcrow@google.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Trivial fix to a spelling mistake of the error access name EACCESS,
rename to EACCES
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
It was found that two of the fields in the /proc/<pid>/status file were
missing - CapAmb & Speculation_Store_Bypass. They are now added to the
proc.txt documentation file.
v2: Update the example as well.
Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>