aboutsummaryrefslogtreecommitdiff
path: root/fs
AgeCommit message (Collapse)Author
2022-08-02Merge tag 'v5.20-p1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 Pull crypto updates from Herbert Xu: "API: - Make proc files report fips module name and version Algorithms: - Move generic SHA1 code into lib/crypto - Implement Chinese Remainder Theorem for RSA - Remove blake2s - Add XCTR with x86/arm64 acceleration - Add POLYVAL with x86/arm64 acceleration - Add HCTR2 - Add ARIA Drivers: - Add support for new CCP/PSP device ID in ccp" * tag 'v5.20-p1' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (89 commits) crypto: tcrypt - Remove the static variable initialisations to NULL crypto: arm64/poly1305 - fix a read out-of-bound crypto: hisilicon/zip - Use the bitmap API to allocate bitmaps crypto: hisilicon/sec - fix auth key size error crypto: ccree - Remove a useless dma_supported() call crypto: ccp - Add support for new CCP/PSP device ID crypto: inside-secure - Add missing MODULE_DEVICE_TABLE for of crypto: hisilicon/hpre - don't use GFP_KERNEL to alloc mem during softirq crypto: testmgr - some more fixes to RSA test vectors cyrpto: powerpc/aes - delete the rebundant word "block" in comments hwrng: via - Fix comment typo crypto: twofish - Fix comment typo crypto: rmd160 - fix Kconfig "its" grammar crypto: keembay-ocs-ecc - Drop if with an always false condition Documentation: qat: rewrite description Documentation: qat: Use code block for qat sysfs example crypto: lib - add module license to libsha1 crypto: lib - make the sha1 library optional crypto: lib - move lib/sha1.c into lib/crypto/ crypto: fips - make proc files report fips module name and version ...
2022-08-02Merge tag 'fsverity-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/fs/fscrypt/fscrypt Pull fsverity update from Eric Biggers: "Just a small documentation update to mention the btrfs support" * tag 'fsverity-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/fscrypt: fs-verity: mention btrfs support
2022-08-02Merge tag 'execve-v5.20-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull execve updates from Kees Cook: - Allow unsharing time namespace on vfork+exec (Andrei Vagin) - Replace usage of deprecated kmap APIs (Fabio M. De Francesco) - Fix spelling mistake (Zhang Jiaming) * tag 'execve-v5.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: exec: Call kmap_local_page() in copy_string_kernel() exec: Fix a spelling mistake selftests/timens: add a test for vfork+exit fs/exec: allow to unshare a time namespace on vfork+exec
2022-08-02Merge tag 'pstore-v5.20-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull pstore updates from Kees Cook: - Migrate to modern acomp crypto interface (Ard Biesheuvel) - Use better return type for "rcnt" (Dan Carpenter) * tag 'pstore-v5.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: pstore/zone: cleanup "rcnt" type pstore: migrate to crypto acomp interface
2022-08-02Merge tag 'for-5.20/block-2022-07-29' of git://git.kernel.dk/linux-blockLinus Torvalds
Pull block updates from Jens Axboe: - Improve the type checking of request flags (Bart) - Ensure queue mapping for a single queues always picks the right queue (Bart) - Sanitize the io priority handling (Jan) - rq-qos race fix (Jinke) - Reserved tags handling improvements (John) - Separate memory alignment from file/disk offset aligment for O_DIRECT (Keith) - Add new ublk driver, userspace block driver using io_uring for communication with the userspace backend (Ming) - Use try_cmpxchg() to cleanup the code in various spots (Uros) - Finally remove bdevname() (Christoph) - Clean up the zoned device handling (Christoph) - Clean up independent access range support (Christoph) - Clean up and improve block sysfs handling (Christoph) - Clean up and improve teardown of block devices. This turns the usual two step process into something that is simpler to implement and handle in block drivers (Christoph) - Clean up chunk size handling (Christoph) - Misc cleanups and fixes (Bart, Bo, Dan, GuoYong, Jason, Keith, Liu, Ming, Sebastian, Yang, Ying) * tag 'for-5.20/block-2022-07-29' of git://git.kernel.dk/linux-block: (178 commits) ublk_drv: fix double shift bug ublk_drv: make sure that correct flags(features) returned to userspace ublk_drv: fix error handling of ublk_add_dev ublk_drv: fix lockdep warning block: remove __blk_get_queue block: call blk_mq_exit_queue from disk_release for never added disks blk-mq: fix error handling in __blk_mq_alloc_disk ublk: defer disk allocation ublk: rewrite ublk_ctrl_get_queue_affinity to not rely on hctx->cpumask ublk: fold __ublk_create_dev into ublk_ctrl_add_dev ublk: cleanup ublk_ctrl_uring_cmd ublk: simplify ublk_ch_open and ublk_ch_release ublk: remove the empty open and release block device operations ublk: remove UBLK_IO_F_PREFLUSH ublk: add a MAINTAINERS entry block: don't allow the same type rq_qos add more than once mmc: fix disk/queue leak in case of adding disk failure ublk_drv: fix an IS_ERR() vs NULL check ublk: remove UBLK_IO_F_INTEGRITY ublk_drv: remove unneeded semicolon ...
2022-08-02Merge tag 'for-5.20/io_uring-buffered-writes-2022-07-29' of ↵Linus Torvalds
git://git.kernel.dk/linux-block Pull io_uring buffered writes support from Jens Axboe: "This contains support for buffered writes, specifically for XFS. btrfs is in progress, will be coming in the next release. io_uring does support buffered writes on any file type, but since the buffered write path just always -EAGAIN (or -EOPNOTSUPP) any attempt to do so if IOCB_NOWAIT is set, any buffered write will effectively be handled by io-wq offload. This isn't very efficient, and we even have specific code in io-wq to serialize buffered writes to the same inode to avoid further inefficiencies with thread offload. This is particularly sad since most buffered writes don't block, they simply copy data to a page and dirty it. With this pull request, we can handle buffered writes a lot more effiently. If balance_dirty_pages() needs to block, we back off on writes as indicated. This improves buffered write support by 2-3x. Jan Kara helped with the mm bits for this, and Stefan handled the fs/iomap/xfs/io_uring parts of it" * tag 'for-5.20/io_uring-buffered-writes-2022-07-29' of git://git.kernel.dk/linux-block: mm: honor FGP_NOWAIT for page cache page allocation xfs: Add async buffered write support xfs: Specify lockmode when calling xfs_ilock_for_iomap() io_uring: Add tracepoint for short writes io_uring: fix issue with io_write() not always undoing sb_start_write() io_uring: Add support for async buffered writes fs: Add async write file modification handling. fs: Split off inode_needs_update_time and __file_update_time fs: add __remove_file_privs() with flags parameter fs: add a FMODE_BUF_WASYNC flags for f_mode iomap: Return -EAGAIN from iomap_write_iter() iomap: Add async buffered write support iomap: Add flags parameter to iomap_page_create() mm: Add balance_dirty_pages_ratelimited_flags() function mm: Move updates of dirty_exceeded into one place mm: Move starting of background writeback into the main balancing loop
2022-08-02Merge tag 'for-5.20/io_uring-2022-07-29' of git://git.kernel.dk/linux-blockLinus Torvalds
Pull io_uring updates from Jens Axboe: - As per (valid) complaint in the last merge window, fs/io_uring.c has grown quite large these days. io_uring isn't really tied to fs either, as it supports a wide variety of functionality outside of that. Move the code to io_uring/ and split it into files that either implement a specific request type, and split some code into helpers as well. The code is organized a lot better like this, and io_uring.c is now < 4K LOC (me). - Deprecate the epoll_ctl opcode. It'll still work, just trigger a warning once if used. If we don't get any complaints on this, and I don't expect any, then we can fully remove it in a future release (me). - Improve the cancel hash locking (Hao) - kbuf cleanups (Hao) - Efficiency improvements to the task_work handling (Dylan, Pavel) - Provided buffer improvements (Dylan) - Add support for recv/recvmsg multishot support. This is similar to the accept (or poll) support for have for multishot, where a single SQE can trigger everytime data is received. For applications that expect to do more than a few receives on an instantiated socket, this greatly improves efficiency (Dylan). - Efficiency improvements for poll handling (Pavel) - Poll cancelation improvements (Pavel) - Allow specifiying a range for direct descriptor allocations (Pavel) - Cleanup the cqe32 handling (Pavel) - Move io_uring types to greatly cleanup the tracing (Pavel) - Tons of great code cleanups and improvements (Pavel) - Add a way to do sync cancelations rather than through the sqe -> cqe interface, as that's a lot easier to use for some use cases (me). - Add support to IORING_OP_MSG_RING for sending direct descriptors to a different ring. This avoids the usually problematic SCM case, as we disallow those. (me) - Make the per-command alloc cache we use for apoll generic, place limits on it, and use it for netmsg as well (me). - Various cleanups (me, Michal, Gustavo, Uros) * tag 'for-5.20/io_uring-2022-07-29' of git://git.kernel.dk/linux-block: (172 commits) io_uring: ensure REQ_F_ISREG is set async offload net: fix compat pointer in get_compat_msghdr() io_uring: Don't require reinitable percpu_ref io_uring: fix types in io_recvmsg_multishot_overflow io_uring: Use atomic_long_try_cmpxchg in __io_account_mem io_uring: support multishot in recvmsg net: copy from user before calling __get_compat_msghdr net: copy from user before calling __copy_msghdr io_uring: support 0 length iov in buffer select in compat io_uring: fix multishot ending when not polled io_uring: add netmsg cache io_uring: impose max limit on apoll cache io_uring: add abstraction around apoll cache io_uring: move apoll cache to poll.c io_uring: consolidate hash_locked io-wq handling io_uring: clear REQ_F_HASH_LOCKED on hash removal io_uring: don't race double poll setting REQ_F_ASYNC_DATA io_uring: don't miss setting REQ_F_DOUBLE_POLL io_uring: disable multishot recvmsg io_uring: only trace one of complete or overflow ...
2022-08-01Merge tag 'fs.idmapped.overlay.acl.v5.20' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux Pull acl updates from Christian Brauner: "Last cycle we introduced support for mounting overlayfs on top of idmapped mounts. While looking into additional testing we realized that posix acls don't really work correctly with stacking filesystems on top of idmapped layers. We already knew what the fix were but it would require work that is more suitable for the merge window so we turned off posix acls for v5.19 for overlayfs on top of idmapped layers with Miklos routing my patch upstream in 72a8e05d4f66 ("Merge tag 'ovl-fixes-5.19-rc7' [..]"). This contains the work to support posix acls for overlayfs on top of idmapped layers. Since the posix acl fixes should use the new vfs{g,u}id_t work the associated branch has been merged in. (We sent a pull request for this earlier.) We've also pulled in Miklos pull request containing my patch to turn of posix acls on top of idmapped layers. This allowed us to avoid rebasing the branch which we didn't like because we were already at rc7 by then. Merging it in allows this branch to first fix posix acls and then to cleanly revert the temporary fix it brought in by commit 4a47c6385bb4 ("ovl: turn of SB_POSIXACL with idmapped layers temporarily"). The last patch in this series adds Seth Forshee as a co-maintainer for idmapped mounts. Seth has been integral to all of this work and is also the main architect behind the filesystem idmapping work which ultimately made filesystems such as FUSE and overlayfs available in containers. He continues to be active in both development and review. I'm very happy he decided to help and he has my full trust. This increases the bus factor which is always great for work like this. I'm honestly very excited about this because I think in general we don't do great in the bringing on new maintainers department" For more explanations of the ACL issues, see https://lore.kernel.org/all/20220801145520.1532837-1-brauner@kernel.org/ * tag 'fs.idmapped.overlay.acl.v5.20' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux: Add Seth Forshee as co-maintainer for idmapped mounts Revert "ovl: turn of SB_POSIXACL with idmapped layers temporarily" ovl: handle idmappings in ovl_get_acl() acl: make posix_acl_clone() available to overlayfs acl: port to vfs{g,u}id_t acl: move idmapped mount fixup into vfs_{g,s}etxattr() mnt_idmapping: add vfs[g,u]id_into_k[g,u]id()
2022-08-01Merge tag 'fs.idmapped.vfsuid.v5.20' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux Pull fs idmapping updates from Christian Brauner: "This introduces the new vfs{g,u}id_t types we agreed on. Similar to k{g,u}id_t the new types are just simple wrapper structs around regular {g,u}id_t types. They allow to establish a type safety boundary in the VFS for idmapped mounts preventing confusion betwen {g,u}ids mapped into an idmapped mount and {g,u}ids mapped into the caller's or the filesystem's idmapping. An initial set of helpers is introduced that allows to operate on vfs{g,u}id_t types. We will remove all references to non-type safe idmapped mounts helpers in the very near future. The patches do already exist. This converts the core attribute changing codepaths which become significantly easier to reason about because of this change. Just a few highlights here as the patches give detailed overviews of what is happening in the commit messages: - The kernel internal struct iattr contains type safe vfs{g,u}id_t values clearly communicating that these values have to take a given mount's idmapping into account. - The ownership values placed in struct iattr to change ownership are identical for idmapped and non-idmapped mounts going forward. This also allows to simplify stacking filesystems such as overlayfs that change attributes In other words, they always represent the values. - Instead of open coding checks for whether ownership changes have been requested and an actual update of the inode is required we now have small static inline wrappers that abstract this logic away removing a lot of code duplication from individual filesystems that all open-coded the same checks" * tag 'fs.idmapped.vfsuid.v5.20' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux: mnt_idmapping: align kernel doc and parameter order mnt_idmapping: use new helpers in mapped_fs{g,u}id() fs: port HAS_UNMAPPED_ID() to vfs{g,u}id_t mnt_idmapping: return false when comparing two invalid ids attr: fix kernel doc attr: port attribute changes to new types security: pass down mount idmapping to setattr hook quota: port quota helpers mount ids fs: port to iattr ownership update helpers fs: introduce tiny iattr ownership update helpers fs: use mount types in iattr fs: add two type safe mapping helpers mnt_idmapping: add vfs{g,u}id_t
2022-08-01Merge tag 'filelock-v6.0' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jlayton/linux Pull file locking updates from Jeff Layton: "Just a couple of flock() patches from Kuniyuki Iwashima. The main change is that this moves a file_lock allocation from the slab to the stack" * tag 'filelock-v6.0' of git://git.kernel.org/pub/scm/linux/kernel/git/jlayton/linux: fs/lock: Rearrange ops in flock syscall. fs/lock: Don't allocate file_lock in flock_make_lock().
2022-08-01Merge tag 'erofs-for-5.20-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs Pull erofs updates from Gao Xiang: "First of all, we'd like to add Yue Hu and Jeffle Xu as two new reviewers. Thank them for spending time working on EROFS! There is no major feature outstanding in this cycle, mainly a patchset I worked on to prepare for rolling hash deduplication and folios for compressed data as the next big features. It kills the unneeded PG_error flag dependency as well. Apart from that, there are bugfixes and cleanups as always. Details are listed below: - Add Yue Hu and Jeffle Xu as reviewers - Add the missing wake_up when updating lzma streams - Avoid consecutive detection for Highmem memory - Prepare for multi-reference pclusters and get rid of PG_error - Fix ctx->pos update for NFS export - minor cleanups" * tag 'erofs-for-5.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs: (23 commits) erofs: update ctx->pos for every emitted dirent erofs: get rid of the leftover PAGE_SIZE in dir.c erofs: get rid of erofs_prepare_dio() helper erofs: introduce multi-reference pclusters (fully-referenced) erofs: record the longest decompressed size in this round erofs: introduce z_erofs_do_decompressed_bvec() erofs: try to leave (de)compressed_pages on stack if possible erofs: introduce struct z_erofs_decompress_backend erofs: get rid of `z_pagemap_global' erofs: clean up `enum z_erofs_collectmode' erofs: get rid of `enum z_erofs_page_type' erofs: rework online page handling erofs: switch compressed_pages[] to bufvec erofs: introduce `z_erofs_parse_in_bvecs' erofs: drop the old pagevec approach erofs: introduce bufvec to store decompressed buffers erofs: introduce `z_erofs_parse_out_bvecs()' erofs: clean up z_erofs_collector_begin() erofs: get rid of unneeded `inode', `map' and `sb' erofs: avoid consecutive detection for Highmem memory ...
2022-08-01Merge tag 'fsnotify_for_v5.20-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs Pull fsnotify updates from Jan Kara: - support for FAN_MARK_IGNORE which untangles some of the not well defined corner cases with fanotify ignore masks - small cleanups * tag 'fsnotify_for_v5.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs: fsnotify: Fix comment typo fanotify: introduce FAN_MARK_IGNORE fanotify: cleanups for fanotify_mark() input validations fanotify: prepare for setting event flags in ignore mask fs: inotify: Fix typo in inotify comment
2022-08-01Merge tag 'fs_for_v5.20-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs Pull ext2 and reiserfs updates from Jan Kara: "A fix for ext2 handling of a corrupted fs image and cleanups in ext2 and reiserfs" * tag 'fs_for_v5.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs: ext2: Add more validity checks for inode counts fs/reiserfs/inode: remove dead code in _get_block_create_0() fs/ext2: replace ternary operator with min_t()
2022-08-01Merge tag 'dlm-6.0' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlm Pull dlm updates from David Teigland: - Delay the cleanup of interrupted posix lock requests until the user space result arrives. Previously, the immediate cleanup would lead to extraneous warnings when the result arrived. - Tracepoint improvements, e.g. adding the lock resource name. - Delay the completion of lockspace creation until one full recovery cycle has completed. This allows more error cases to be returned to the caller. - Remove warnings from the locking layer about delayed network replies. The recently added midcomms warnings are much more useful. - Begin the process of deprecating two unused lock-timeout-related features. These features now require enabling via a Kconfig option, and enabling them triggers deprecation warnings. We expect to remove the code in v6.2. * tag 'dlm-6.0' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlm: fs: dlm: move kref_put assert for lkb structs fs: dlm: don't use deprecated timeout features by default fs: dlm: add deprecation Kconfig and warnings for timeouts fs: dlm: remove timeout from dlm_user_adopt_orphan fs: dlm: remove waiter warnings fs: dlm: fix grammar in lowcomms output fs: dlm: add comment about lkb IFL flags fs: dlm: handle recovery result outside of ls_recover fs: dlm: make new_lockspace() wait until recovery completes fs: dlm: call dlm_lsop_recover_prep once fs: dlm: update comments about recovery and membership handling fs: dlm: add resource name to tracepoints fs: dlm: remove additional dereference of lksb fs: dlm: change ast and bast trace order fs: dlm: change posix lock sigint handling fs: dlm: use dlm_plock_info for do_unlock_close fs: dlm: change plock interrupted message to debug again fs: dlm: add pid to debug log fs: dlm: plock use list_first_entry
2022-08-01fs: dlm: move kref_put assert for lkb structsAlexander Aring
The unhold_lkb() function decrements the lock's kref, and asserts that the ref count was not the final one. Use the kref_put release function (which should not be called) to call the assert, rather than doing the assert based on the kref_put return value. Using kill_lkb() as the release function doesn't make sense if we only want to assert. Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>
2022-08-01fs: dlm: don't use deprecated timeout features by defaultAlexander Aring
This patch will disable use of deprecated timeout features if CONFIG_DLM_DEPRECATED_API is not set. The deprecated features will be removed in upcoming kernel release v6.2. Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>
2022-08-01fs: dlm: add deprecation Kconfig and warnings for timeoutsAlexander Aring
This patch adds a CONFIG_DLM_DEPRECATED_API Kconfig option that must be enabled to use two timeout-related features that we intend to remove in kernel v6.2. Warnings are printed if either is enabled and used. Neither has ever been used as far as we know. . The DLM_LSFL_TIMEWARN lockspace creation flag will be removed, along with the associated configfs entry for setting the timeout. Setting the flag and configfs file would cause dlm to track how long locks were waiting for reply messages. After a timeout, a kernel message would be logged, and a netlink message would be sent to userspace. Recently, midcomms messages have been added that produce much better logging about actual problems with messages. No use has ever been found for the netlink messages. . The userspace libdlm API has allowed the DLM_LKF_TIMEOUT flag with a timeout value to be set in lock requests. The lock request would be cancelled after the timeout. Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>
2022-07-31erofs: update ctx->pos for every emitted direntHongnan Li
erofs_readdir update ctx->pos after filling a batch of dentries and it may cause dir/files duplication for NFS readdirplus which depends on ctx->pos to fill dir correctly. So update ctx->pos for every emitted dirent in erofs_fill_dentries to fix it. Also fix the update of ctx->pos when the initial file position has exceeded nameoff. Fixes: 3e917cc305c6 ("erofs: make filesystem exportable") Signed-off-by: Hongnan Li <hongnan.li@linux.alibaba.com> Signed-off-by: Jeffle Xu <jefflexu@linux.alibaba.com> Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com> Reviewed-by: Chao Yu <chao@kernel.org> Link: https://lore.kernel.org/r/20220722082732.30935-1-jefflexu@linux.alibaba.com Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
2022-07-27exec: Call kmap_local_page() in copy_string_kernel()Fabio M. De Francesco
The use of kmap_atomic() is being deprecated in favor of kmap_local_page(). With kmap_local_page(), the mappings are per thread, CPU local and not globally visible. Furthermore, the mappings can be acquired from any context (including interrupts). Therefore, replace kmap_atomic() with kmap_local_page() in copy_string_kernel(). Instead of open-coding local mapping + memcpy(), use memcpy_to_page(). Delete a redundant call to flush_dcache_page(). Tested with xfstests on a QEMU/ KVM x86_32 VM, 6GB RAM, booting a kernel with HIGHMEM64GB enabled. Suggested-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Fabio M. De Francesco <fmdefrancesco@gmail.com> Signed-off-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20220724212523.13317-1-fmdefrancesco@gmail.com
2022-07-26Merge tag 'mm-hotfixes-stable-2022-07-26' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull misc fixes from Andrew Morton: "Thirteen hotfixes. Eight are cc:stable and the remainder are for post-5.18 issues or are too minor to warrant backporting" * tag 'mm-hotfixes-stable-2022-07-26' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: mailmap: update Gao Xiang's email addresses userfaultfd: provide properly masked address for huge-pages Revert "ocfs2: mount shared volume without ha stack" hugetlb: fix memoryleak in hugetlb_mcopy_atomic_pte fs: sendfile handles O_NONBLOCK of out_fd ntfs: fix use-after-free in ntfs_ucsncmp() secretmem: fix unhandled fault in truncate mm/hugetlb: separate path for hwpoison entry in copy_hugetlb_page_range() mm: fix missing wake-up event for FSDAX pages mm: fix page leak with multiple threads mapping the same page mailmap: update Seth Forshee's email address tmpfs: fix the issue that the mount and remount results are inconsistent. mm: kfence: apply kmemleak_ignore_phys on early allocated pool
2022-07-26userfaultfd: provide properly masked address for huge-pagesNadav Amit
Commit 824ddc601adc ("userfaultfd: provide unmasked address on page-fault") was introduced to fix an old bug, in which the offset in the address of a page-fault was masked. Concerns were raised - although were never backed by actual code - that some userspace code might break because the bug has been around for quite a while. To address these concerns a new flag was introduced, and only when this flag is set by the user, userfaultfd provides the exact address of the page-fault. The commit however had a bug, and if the flag is unset, the offset was always masked based on a base-page granularity. Yet, for huge-pages, the behavior prior to the commit was that the address is masked to the huge-page granulrity. While there are no reports on real breakage, fix this issue. If the flag is unset, use the address with the masking that was done before. Link: https://lkml.kernel.org/r/20220711165906.2682-1-namit@vmware.com Fixes: 824ddc601adc ("userfaultfd: provide unmasked address on page-fault") Signed-off-by: Nadav Amit <namit@vmware.com> Reported-by: James Houghton <jthoughton@google.com> Reviewed-by: Mike Rapoport <rppt@linux.ibm.com> Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: James Houghton <jthoughton@google.com> Cc: David Hildenbrand <david@redhat.com> Cc: Jan Kara <jack@suse.cz> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-07-26fsnotify: Fix comment typoXin Gao
The double `if' is duplicated in line 104, remove one. Signed-off-by: Xin Gao <gaoxin@cdjrlc.com> Signed-off-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20220722194639.18545-1-gaoxin@cdjrlc.com
2022-07-26ext2: Add more validity checks for inode countsJan Kara
Add checks verifying number of inodes stored in the superblock matches the number computed from number of inodes per group. Also verify we have at least one block worth of inodes per group. This prevents crashes on corrupted filesystems. Reported-by: syzbot+d273f7d7f58afd93be48@syzkaller.appspotmail.com Signed-off-by: Jan Kara <jack@suse.cz>
2022-07-26fs/reiserfs/inode: remove dead code in _get_block_create_0()Zeng Jingxiang
Since commit 27b3a5c51b50 ("kill-the-bkl/reiserfs: drop the fs race watchdog from _get_block_create_0()"), which removed a label that may have the pointer 'p' touched in its control flow, related if statements now eval to constant value now. Just remove them. Assigning value NULL to p here 293 char *p = NULL; In the following conditional expression, the value of p is always NULL, As a result, the kunmap() cannot be executed. 308 if (p) 309 kunmap(bh_result->b_page); 355 if (p) 356 kunmap(bh_result->b_page); 366 if (p) 367 kunmap(bh_result->b_page); Also, the kmap() cannot be executed. 399 if (!p) 400 p = (char *)kmap(bh_result->b_page); [JK: Removed unnecessary initialization of 'p' to NULL] Signed-off-by: Zeng Jingxiang <linuszeng@tencent.com> Signed-off-by: Kairui Song <kasong@tencent.com> Signed-off-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20220720083029.1065578-1-zengjx95@gmail.com
2022-07-24xfs: Add async buffered write supportStefan Roesch
This adds the async buffered write support to XFS. For async buffered write requests, the request will return -EAGAIN if the ilock cannot be obtained immediately. Signed-off-by: Stefan Roesch <shr@fb.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Link: https://lore.kernel.org/r/20220623175157.1715274-15-shr@fb.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-07-24xfs: Specify lockmode when calling xfs_ilock_for_iomap()Stefan Roesch
This patch changes the helper function xfs_ilock_for_iomap such that the lock mode must be passed in. Signed-off-by: Stefan Roesch <shr@fb.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Link: https://lore.kernel.org/r/20220623175157.1715274-14-shr@fb.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-07-24fs: Add async write file modification handling.Stefan Roesch
This adds a file_modified_async() function to return -EAGAIN if the request either requires to remove privileges or needs to update the file modification time. This is required for async buffered writes, so the request gets handled in the io worker of io-uring. Signed-off-by: Stefan Roesch <shr@fb.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Christian Brauner (Microsoft) <brauner@kernel.org> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Link: https://lore.kernel.org/r/20220623175157.1715274-11-shr@fb.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-07-24fs: Split off inode_needs_update_time and __file_update_timeStefan Roesch
This splits off the functions inode_needs_update_time() and __file_update_time() from the function file_update_time(). This is required to support async buffered writes. No intended functional changes in this patch. Signed-off-by: Stefan Roesch <shr@fb.com> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Christian Brauner (Microsoft) <brauner@kernel.org> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Link: https://lore.kernel.org/r/20220623175157.1715274-10-shr@fb.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-07-24fs: add __remove_file_privs() with flags parameterStefan Roesch
This adds the function __remove_file_privs, which allows the caller to pass the kiocb flags parameter. No intended functional changes in this patch. Signed-off-by: Stefan Roesch <shr@fb.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Christian Brauner (Microsoft) <brauner@kernel.org> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Link: https://lore.kernel.org/r/20220623175157.1715274-9-shr@fb.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-07-24fs: add a FMODE_BUF_WASYNC flags for f_modeStefan Roesch
This introduces the flag FMODE_BUF_WASYNC. If devices support async buffered writes, this flag can be set. It also modifies the check in generic_write_checks to take async buffered writes into consideration. Signed-off-by: Stefan Roesch <shr@fb.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Christian Brauner (Microsoft) <brauner@kernel.org> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Link: https://lore.kernel.org/r/20220623175157.1715274-8-shr@fb.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-07-24iomap: Return -EAGAIN from iomap_write_iter()Stefan Roesch
If iomap_write_iter() encounters -EAGAIN, return -EAGAIN to the caller. Signed-off-by: Stefan Roesch <shr@fb.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Link: https://lore.kernel.org/r/20220623175157.1715274-7-shr@fb.com Reviewed-by: Christoph Hellwig <hch@lst.de> [axboe: make the suggested ternary edit] Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-07-24iomap: Add async buffered write supportStefan Roesch
This adds async buffered write support to iomap. This replaces the call to balance_dirty_pages_ratelimited() with the call to balance_dirty_pages_ratelimited_flags. This allows to specify if the write request is async or not. In addition this also moves the above function call to the beginning of the function. If the function call is at the end of the function and the decision is made to throttle writes, then there is no request that io-uring can wait on. By moving it to the beginning of the function, the write request is not issued, but returns -EAGAIN instead. io-uring will punt the request and process it in the io-worker. By moving the function call to the beginning of the function, the write throttling will happen one page later. Signed-off-by: Stefan Roesch <shr@fb.com> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Link: https://lore.kernel.org/r/20220623175157.1715274-6-shr@fb.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-07-24iomap: Add flags parameter to iomap_page_create()Stefan Roesch
Add the kiocb flags parameter to the function iomap_page_create(). Depending on the value of the flags parameter it enables different gfp flags. No intended functional changes in this patch. Signed-off-by: Stefan Roesch <shr@fb.com> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Link: https://lore.kernel.org/r/20220623175157.1715274-5-shr@fb.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-07-24io_uring: move to separate directoryJens Axboe
In preparation for splitting io_uring up a bit, move it into its own top level directory. It didn't really belong in fs/ anyway, as it's not a file system only API. This adds io_uring/ and moves the core files in there, and updates the MAINTAINERS file for the new location. Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-07-24io_uring: define a 'prep' and 'issue' handler for each opcodeJens Axboe
Rather than have two giant switches for doing request preparation and then for doing request issue, add a prep and issue handler for each of them in the io_op_defs[] request definition. Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-07-22Merge tag 'io_uring-5.19-2022-07-21' of git://git.kernel.dk/linux-blockLinus Torvalds
Pull io_uring fixes from Jens Axboe: "Fix for a bad kfree() introduced in this cycle, and a quick fix for disabling buffer recycling for IORING_OP_READV. The latter will get reworked for 5.20, but it gets the job done for 5.19" * tag 'io_uring-5.19-2022-07-21' of git://git.kernel.dk/linux-block: io_uring: do not recycle buffer in READV io_uring: fix free of unallocated buffer list
2022-07-22erofs: get rid of the leftover PAGE_SIZE in dir.cGao Xiang
Convert the last hardcoded PAGE_SIZEs of uncompressed cases. Reviewed-by: Jeffle Xu <jefflexu@linux.alibaba.com> Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com> Link: https://lore.kernel.org/r/20220619150940.121005-1-hsiangkao@linux.alibaba.com
2022-07-22erofs: get rid of erofs_prepare_dio() helperGao Xiang
Fold in erofs_prepare_dio() in order to simplify the code. Reviewed-by: Jeffle Xu <jefflexu@linux.alibaba.com> Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com> Link: https://lore.kernel.org/r/20220720082229.12172-1-hsiangkao@linux.alibaba.com
2022-07-22erofs: introduce multi-reference pclusters (fully-referenced)Gao Xiang
Let's introduce multi-reference pclusters at runtime. In details, if one pcluster is requested by multiple extents at almost the same time (even belong to different files), the longest extent will be decompressed as representative and the other extents are actually copied from the longest one in one round. After this patch, fully-referenced extents can be correctly handled and the full decoding check needs to be bypassed for partial-referenced extents. Acked-by: Chao Yu <chao@kernel.org> Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com> Link: https://lore.kernel.org/r/20220715154203.48093-17-hsiangkao@linux.alibaba.com
2022-07-21erofs: record the longest decompressed size in this roundGao Xiang
Currently, `pcl->length' records the longest decompressed length as long as the pcluster itself isn't reclaimed. However, such number is unneeded for the general cases since it doesn't indicate the exact decompressed size in this round. Instead, let's record the decompressed size for this round instead, thus `pcl->nr_pages' can be completely dropped and pageofs_out is also designed to be kept in sync with `pcl->length'. Acked-by: Chao Yu <chao@kernel.org> Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com> Link: https://lore.kernel.org/r/20220715154203.48093-16-hsiangkao@linux.alibaba.com
2022-07-21erofs: introduce z_erofs_do_decompressed_bvec()Gao Xiang
Both out_bvecs and in_bvecs share the common logic for decompressed buffers. So let's make a helper for this. Acked-by: Chao Yu <chao@kernel.org> Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com> Link: https://lore.kernel.org/r/20220715154203.48093-15-hsiangkao@linux.alibaba.com
2022-07-21erofs: try to leave (de)compressed_pages on stack if possibleGao Xiang
For the most cases, small pclusters can be decompressed with page arrays on stack. Try to leave both (de)compressed_pages on stack if possible as before. Acked-by: Chao Yu <chao@kernel.org> Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com> Link: https://lore.kernel.org/r/20220715154203.48093-14-hsiangkao@linux.alibaba.com
2022-07-21erofs: introduce struct z_erofs_decompress_backendGao Xiang
Let's introduce struct z_erofs_decompress_backend in order to pass on the decompression backend context between helper functions more easier and avoid too many arguments. Acked-by: Chao Yu <chao@kernel.org> Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com> Link: https://lore.kernel.org/r/20220715154203.48093-13-hsiangkao@linux.alibaba.com
2022-07-21erofs: get rid of `z_pagemap_global'Gao Xiang
In order to introduce multi-reference pclusters for compressed data deduplication, let's get rid of the global page array for now since it needs to be re-designed then at least. Acked-by: Chao Yu <chao@kernel.org> Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com> Link: https://lore.kernel.org/r/20220715154203.48093-12-hsiangkao@linux.alibaba.com
2022-07-21erofs: clean up `enum z_erofs_collectmode'Gao Xiang
`enum z_erofs_collectmode' is really ambiguous, but I'm not quite sure if there are better naming, basically it's used to judge whether inplace I/O can be used due to the current status of pclusters in the chain. Rename it as `enum z_erofs_pclustermode' instead. Acked-by: Chao Yu <chao@kernel.org> Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com> Link: https://lore.kernel.org/r/20220715154203.48093-11-hsiangkao@linux.alibaba.com
2022-07-21erofs: get rid of `enum z_erofs_page_type'Gao Xiang
Remove it since pagevec[] is no longer used. Acked-by: Chao Yu <chao@kernel.org> Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com> Link: https://lore.kernel.org/r/20220715154203.48093-10-hsiangkao@linux.alibaba.com
2022-07-21erofs: rework online page handlingGao Xiang
Since all decompressed offsets have been integrated to bvecs[], this patch avoids all sub-indexes so that page->private only includes a part count and an eio flag, thus in the future folio->private can have the same meaning. In addition, PG_error will not be used anymore after this patch and we're heading to use page->private (later folio->private) and page->mapping (later folio->mapping) only. Acked-by: Chao Yu <chao@kernel.org> Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com> Link: https://lore.kernel.org/r/20220715154203.48093-9-hsiangkao@linux.alibaba.com
2022-07-21erofs: switch compressed_pages[] to bufvecGao Xiang
Convert compressed_pages[] to bufvec in order to avoid using page->private to keep onlinepage_index (decompressed offset) for inplace I/O pages. In the future, we only rely on folio->private to keep a countdown to unlock folios and set folio_uptodate. Acked-by: Chao Yu <chao@kernel.org> Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com> Link: https://lore.kernel.org/r/20220715154203.48093-8-hsiangkao@linux.alibaba.com
2022-07-21erofs: introduce `z_erofs_parse_in_bvecs'Gao Xiang
`z_erofs_decompress_pcluster()' is too long therefore it'd be better to introduce another helper to parse compressed pages (or laterly, compressed bvecs.) BTW, since `compressed_bvecs' is too long as a part of the function name, `in_bvecs' is used here instead. Acked-by: Chao Yu <chao@kernel.org> Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com> Link: https://lore.kernel.org/r/20220715154203.48093-7-hsiangkao@linux.alibaba.com
2022-07-21erofs: drop the old pagevec approachGao Xiang
Remove the old pagevec approach but keep z_erofs_page_type for now. It will be reworked in the following commits as well. Also rename Z_EROFS_NR_INLINE_PAGEVECS as Z_EROFS_INLINE_BVECS with the new value 2 since it's actually enough to bootstrap. Acked-by: Chao Yu <chao@kernel.org> Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com> Link: https://lore.kernel.org/r/20220715154203.48093-6-hsiangkao@linux.alibaba.com