350 Commits

Author SHA1 Message Date
Danny Lin
34a20308c3 f2fs: Add support for reporting a fake kernel version to fsck
fsck.f2fs forces a filesystem fix on boot if it detects that the current
kernel version differs from the one saved in the superblock, which results in
fsck blocking boot for a long time (~35 seconds). This commit provides a
way to report a constant fake kernel version to fsck to avoid triggering
the version check, which is useful if you boot new kernel builds
frequently.

Signed-off-by: Danny Lin <danny@kdrag0n.dev>
Signed-off-by: UtsavBalar1231 <utsavbalar1231@gmail.com>
2020-08-06 20:22:28 +05:30
Park Ju Hyung
1c2e9e5cc0 kernel: fake system calls on seccomp to succeed
Signed-off-by: Park Ju Hyung <qkrwngud825@gmail.com>
Signed-off-by: UtsavBalar1231 <utsavbalar1231@gmail.com>
2020-06-15 20:42:47 +05:30
Blagovest Kolenichev
8ad87c80a2 Merge android-4.14.151 (2bb70f4) into msm-4.14
* refs/heads/tmp-2bb70f4:
  ANDROID: virtio: virtio_input: Set the amount of multitouch slots in virtio input
  ANDROID: dummy_cpufreq: Implement get()
  rtlwifi: Fix potential overflow on P2P code
  ANDROID: cpufreq: create dummy cpufreq driver
  ANDROID: Allow DRM_IOCTL_MODE_*_DUMB for render clients.
  ANDROID: sdcardfs: evict dentries on fscrypt key removal
  ANDROID: fscrypt: add key removal notifier chain
  ANDROID: Move from clang r353983c to r365631c
  ANDROID: move up spin_unlock_bh() ahead of remove_proc_entry()
  BACKPORT: arm64: tags: Preserve tags for addresses translated via TTBR1
  UPSTREAM: arm64: memory: Implement __tag_set() as common function
  UPSTREAM: arm64/mm: fix variable 'tag' set but not used
  UPSTREAM: arm64: avoid clang warning about self-assignment
  ANDROID: refactor build.config files to remove duplication
  UPSTREAM: mm: vmalloc: show number of vmalloc pages in /proc/meminfo
  BACKPORT: PM/sleep: Expose suspend stats in sysfs
  UPSTREAM: power: supply: Init device wakeup after device_add()
  UPSTREAM: PM / wakeup: Unexport wakeup_source_sysfs_{add,remove}()
  UPSTREAM: PM / wakeup: Register wakeup class kobj after device is added
  BACKPORT: PM / wakeup: Fix sysfs registration error path
  BACKPORT: PM / wakeup: Show wakeup sources stats in sysfs
  UPSTREAM: PM / wakeup: Print warn if device gets enabled as wakeup source during sleep
  UPSTREAM: PM / wakeup: Use wakeup_source_register() in wakelock.c
  UPSTREAM: PM / wakeup: Only update last time for active wakeup sources
  UPSTREAM: PM / core: Add support to skip power management in device/driver model
  cuttlefish-4.14: Enable CONFIG_DM_SNAPSHOT
  ANDROID: cuttlefish_defconfig: Enable BPF_JIT and BPF_JIT_ALWAYS_ON
  UPSTREAM: netfilter: xt_IDLETIMER: fix sysfs callback function type
  UPSTREAM: mm: untag user pointers in mmap/munmap/mremap/brk
  UPSTREAM: vfio/type1: untag user pointers in vaddr_get_pfn
  UPSTREAM: media/v4l2-core: untag user pointers in videobuf_dma_contig_user_get
  UPSTREAM: drm/radeon: untag user pointers in radeon_gem_userptr_ioctl
  BACKPORT: drm/amdgpu: untag user pointers
  UPSTREAM: userfaultfd: untag user pointers
  UPSTREAM: fs/namespace: untag user pointers in copy_mount_options
  UPSTREAM: mm: untag user pointers in get_vaddr_frames
  UPSTREAM: mm: untag user pointers in mm/gup.c
  BACKPORT: mm: untag user pointers passed to memory syscalls
  BACKPORT: lib: untag user pointers in strn*_user
  UPSTREAM: arm64: Fix reference to docs for ARM64_TAGGED_ADDR_ABI
  UPSTREAM: selftests, arm64: add kernel headers path for tags_test
  BACKPORT: arm64: Relax Documentation/arm64/tagged-pointers.rst
  UPSTREAM: arm64: Define Documentation/arm64/tagged-address-abi.rst
  UPSTREAM: arm64: Change the tagged_addr sysctl control semantics to only prevent the opt-in
  UPSTREAM: arm64: Tighten the PR_{SET, GET}_TAGGED_ADDR_CTRL prctl() unused arguments
  UPSTREAM: selftests, arm64: fix uninitialized symbol in tags_test.c
  UPSTREAM: arm64: mm: Really fix sparse warning in untagged_addr()
  UPSTREAM: selftests, arm64: add a selftest for passing tagged pointers to kernel
  BACKPORT: arm64: Introduce prctl() options to control the tagged user addresses ABI
  UPSTREAM: thread_info: Add update_thread_flag() helpers
  UPSTREAM: arm64: untag user pointers in access_ok and __uaccess_mask_ptr
  UPSTREAM: uaccess: add noop untagged_addr definition
  BACKPORT: block: annotate refault stalls from IO submission
  ext4: add verity flag check for dax
  ANDROID: usb: gadget: Fix dependency for f_accessory
  ANDROID: sched: fair: balance for single core cluster
  UPSTREAM: mm/kasan: fix false positive invalid-free reports with CONFIG_KASAN_SW_TAGS=y
  f2fs: add a condition to detect overflow in f2fs_ioc_gc_range()
  f2fs: fix to add missing F2FS_IO_ALIGNED() condition
  f2fs: fix to fallback to buffered IO in IO aligned mode
  f2fs: fix to handle error path correctly in f2fs_map_blocks
  f2fs: fix extent corrupotion during directIO in LFS mode
  f2fs: check all the data segments against all node ones
  f2fs: Add a small clarification to CONFIG_FS_F2FS_FS_SECURITY
  f2fs: fix inode rwsem regression
  f2fs: fix to avoid accessing uninitialized field of inode page in is_alive()
  f2fs: avoid infinite GC loop due to stale atomic files
  f2fs: Fix indefinite loop in f2fs_gc()
  f2fs: convert inline_data in prior to i_size_write
  f2fs: fix error path of f2fs_convert_inline_page()
  f2fs: add missing documents of reserve_root/resuid/resgid
  f2fs: fix flushing node pages when checkpoint is disabled
  f2fs: enhance f2fs_is_checkpoint_ready()'s readability
  f2fs: clean up __bio_alloc()'s parameter
  f2fs: fix wrong error injection path in inc_valid_block_count()
  f2fs: fix to writeout dirty inode during node flush
  f2fs: optimize case-insensitive lookups
  f2fs: introduce f2fs_match_name() for cleanup
  f2fs: Fix indefinite loop in f2fs_gc()
  f2fs: allocate memory in batch in build_sit_info()
  f2fs: fix to avoid data corruption by forbidding SSR overwrite
  f2fs: Fix build error while CONFIG_NLS=m
  Revert "f2fs: avoid out-of-range memory access"
  f2fs: cleanup the code in build_sit_entries.
  f2fs: fix wrong available node count calculation
  f2fs: remove duplicate code in f2fs_file_write_iter
  f2fs: fix to migrate blocks correctly during defragment
  f2fs: use wrapped f2fs_cp_error()
  f2fs: fix to use more generic EOPNOTSUPP
  f2fs: use wrapped IS_SWAPFILE()
  f2fs: Support case-insensitive file name lookups
  f2fs: include charset encoding information in the superblock
  fs: Reserve flag for casefolding
  f2fs: fix to avoid call kvfree under spinlock
  fs: f2fs: Remove unnecessary checks of SM_I(sbi) in update_general_status()
  f2fs: disallow direct IO in atomic write
  f2fs: fix to handle quota_{on,off} correctly
  f2fs: fix to detect cp error in f2fs_setxattr()
  f2fs: fix to spread f2fs_is_checkpoint_ready()
  f2fs: support fiemap() for directory inode
  f2fs: fix to avoid discard command leak
  f2fs: fix to avoid tagging SBI_QUOTA_NEED_REPAIR incorrectly
  f2fs: fix to drop meta/node pages during umount
  f2fs: disallow switching io_bits option during remount
  f2fs: fix panic of IO alignment feature
  f2fs: introduce {page,io}_is_mergeable() for readability
  f2fs: fix livelock in swapfile writes
  f2fs: add fs-verity support
  ext4: update on-disk format documentation for fs-verity
  ext4: add fs-verity read support
  ext4: add basic fs-verity support
  fs-verity: support builtin file signatures
  fs-verity: add SHA-512 support
  fs-verity: implement FS_IOC_MEASURE_VERITY ioctl
  fs-verity: implement FS_IOC_ENABLE_VERITY ioctl
  fs-verity: add data verification hooks for ->readpages()
  fs-verity: add the hook for file ->setattr()
  fs-verity: add the hook for file ->open()
  fs-verity: add inode and superblock fields
  fs-verity: add Kconfig and the helper functions for hashing
  fs: uapi: define verity bit for FS_IOC_GETFLAGS
  fs-verity: add UAPI header
  fs-verity: add MAINTAINERS file entry
  fs-verity: add a documentation file
  ext4: fix kernel oops caused by spurious casefold flag
  ext4: fix coverity warning on error path of filename setup
  ext4: optimize case-insensitive lookups
  ext4: fix dcache lookup of !casefolded directories
  unicode: update to Unicode 12.1.0 final
  unicode: add missing check for an error return from utf8lookup()
  ext4: export /sys/fs/ext4/feature/casefold if Unicode support is present
  unicode: refactor the rule for regenerating utf8data.h
  ext4: Support case-insensitive file name lookups
  ext4: include charset encoding information in the superblock
  unicode: update unicode database unicode version 12.1.0
  unicode: introduce test module for normalized utf8 implementation
  unicode: implement higher level API for string handling
  unicode: reduce the size of utf8data[]
  unicode: introduce code for UTF-8 normalization
  unicode: introduce UTF-8 character database
  ext4 crypto: fix to check feature status before get policy
  fscrypt: document the new ioctls and policy version
  ubifs: wire up new fscrypt ioctls
  f2fs: wire up new fscrypt ioctls
  ext4: wire up new fscrypt ioctls
  fscrypt: require that key be added when setting a v2 encryption policy
  fscrypt: add FS_IOC_REMOVE_ENCRYPTION_KEY_ALL_USERS ioctl
  fscrypt: allow unprivileged users to add/remove keys for v2 policies
  fscrypt: v2 encryption policy support
  fscrypt: add an HKDF-SHA512 implementation
  fscrypt: add FS_IOC_GET_ENCRYPTION_KEY_STATUS ioctl
  fscrypt: add FS_IOC_REMOVE_ENCRYPTION_KEY ioctl
  fscrypt: add FS_IOC_ADD_ENCRYPTION_KEY ioctl
  fscrypt: rename keyinfo.c to keysetup.c
  fscrypt: move v1 policy key setup to keysetup_v1.c
  fscrypt: refactor key setup code in preparation for v2 policies
  fscrypt: rename fscrypt_master_key to fscrypt_direct_key
  fscrypt: add ->ci_inode to fscrypt_info
  fscrypt: use FSCRYPT_* definitions, not FS_*
  fscrypt: use FSCRYPT_ prefix for uapi constants
  fs, fscrypt: move uapi definitions to new header <linux/fscrypt.h>
  fscrypt: use ENOPKG when crypto API support missing
  fscrypt: improve warnings for missing crypto API support
  fscrypt: improve warning messages for unsupported encryption contexts
  fscrypt: make fscrypt_msg() take inode instead of super_block
  fscrypt: clean up base64 encoding/decoding
  fscrypt: remove loadable module related code
  ANDROID: arm64: bpf: implement arch_bpf_jit_check_func
  ANDROID: bpf: validate bpf_func when BPF_JIT is enabled with CFI
  UPSTREAM: kcm: use BPF_PROG_RUN
  UPSTREAM: psi: get poll_work to run when calling poll syscall next time
  UPSTREAM: sched/psi: Do not require setsched permission from the trigger creator
  UPSTREAM: sched/psi: Reduce psimon FIFO priority
  BACKPORT: arm64: Add support for relocating the kernel with RELR relocations
  ANDROID: Log which device failed to suspend in dpm_suspend_start()
  ANDROID: Revert "ANDROID: sched: Disallow WALT with CFS bandwidth control"
  ANDROID: sched: WALT: Add support for CFS_BANDWIDTH
  ANDROID: sched: WALT: Refactor cumulative runnable average fixup
  ANDROID: sched: Disallow WALT with CFS bandwidth control
  fscrypt: document testing with xfstests
  fscrypt: remove selection of CONFIG_CRYPTO_SHA256
  fscrypt: remove unnecessary includes of ratelimit.h
  fscrypt: don't set policy for a dead directory
  fscrypt: decrypt only the needed blocks in __fscrypt_decrypt_bio()
  fscrypt: support decrypting multiple filesystem blocks per page
  fscrypt: introduce fscrypt_decrypt_block_inplace()
  fscrypt: handle blocksize < PAGE_SIZE in fscrypt_zeroout_range()
  fscrypt: support encrypting multiple filesystem blocks per page
  fscrypt: introduce fscrypt_encrypt_block_inplace()
  fscrypt: clean up some BUG_ON()s in block encryption/decryption
  fscrypt: rename fscrypt_do_page_crypto() to fscrypt_crypt_block()
  fscrypt: remove the "write" part of struct fscrypt_ctx
  fscrypt: simplify bounce page handling
  ANDROID: fiq_debugger: remove
  UPSTREAM: lib/test_meminit.c: use GFP_ATOMIC in RCU critical section
  UPSTREAM: mm: slub: Fix slab walking for init_on_free
  UPSTREAM: lib/test_meminit.c: minor test fixes
  UPSTREAM: lib/test_meminit.c: fix -Wmaybe-uninitialized false positive
  UPSTREAM: lib: introduce test_meminit module
  UPSTREAM: mm: init: report memory auto-initialization features at boot time
  BACKPORT: mm: security: introduce init_on_alloc=1 and init_on_free=1 boot options
  UPSTREAM: arm64: move jump_label_init() before parse_early_param()
  ANDROID: Add a tracepoint for mapping inode to full path
  BACKPORT: arch: add pidfd and io_uring syscalls everywhere
  UPSTREAM: dma-buf: add show_fdinfo handler
  UPSTREAM: dma-buf: add DMA_BUF_SET_NAME ioctls
  BACKPORT: dma-buf: give each buffer a full-fledged inode
  ANDROID: fix kernelci build-break
  UPSTREAM: drm/virtio: Fix cache entry creation race.
  UPSTREAM: drm/virtio: Wake up all waiters when capset response comes in.
  UPSTREAM: drm/virtio: Ensure cached capset entries are valid before copying.
  UPSTREAM: drm/virtio: use u64_to_user_ptr macro
  UPSTREAM: drm/virtio: remove irrelevant DRM_UNLOCKED flag
  UPSTREAM: drm/virtio: Remove redundant return type
  UPSTREAM: drm/virtio: allocate fences with GFP_KERNEL
  UPSTREAM: drm/virtio: add trace events for commands
  UPSTREAM: drm/virtio: trace drm_fence_emit
  BACKPORT: drm/virtio: set seqno for dma-fence
  BACKPORT: drm/virtio: move drm_connector_update_edid_property() call
  UPSTREAM: drm/virtio: add missing drm_atomic_helper_shutdown() call.
  BACKPORT: drm/virtio: rework resource creation workflow.
  UPSTREAM: drm/virtio: params struct for virtio_gpu_cmd_create_resource_3d()
  BACKPORT: drm/virtio: params struct for virtio_gpu_cmd_create_resource()
  BACKPORT: drm/virtio: use struct to pass params to virtio_gpu_object_create()
  UPSTREAM: drm/virtio: add virtio-gpu-features debugfs file.
  UPSTREAM: drm/virtio: remove set but not used variable 'vgdev'
  BACKPORT: drm/virtio: implement prime export
  UPSTREAM: drm/virtio: remove prime pin/unpin callbacks.
  UPSTREAM: drm/virtio: implement prime mmap
  UPSTREAM: drm/virtio: drop virtio_gpu_fence_cleanup()
  UPSTREAM: drm/virtio: fix pageflip flush
  UPSTREAM: drm/virtio: log error responses
  UPSTREAM: drm/virtio: Add missing virtqueue reset
  UPSTREAM: drm/virtio: Remove incorrect kfree()
  UPSTREAM: drm/virtio: virtio_gpu_cmd_resource_create_3d: drop unused fence arg
  UPSTREAM: drm/virtio: fence: pass plain pointer
  BACKPORT: drm/virtio: add edid support
  UPSTREAM: virtio-gpu: add VIRTIO_GPU_F_EDID feature
  BACKPORT: drm/virtio: fix memory leak of vfpriv on error return path
  UPSTREAM: drm/virtio: bump driver version after explicit synchronization addition
  UPSTREAM: drm/virtio: add in/out fence support for explicit synchronization
  UPSTREAM: drm/virtio: add uapi for in and out explicit fences
  UPSTREAM: drm/virtio: add virtio_gpu_alloc_fence()
  UPSTREAM: drm/virtio: Handle error from virtio_gpu_resource_id_get
  UPSTREAM: gpu/drm/virtio/virtgpu_vq.c: Use kmem_cache_zalloc
  UPSTREAM: drm/virtio: fix resource id handling
  UPSTREAM: drm/virtio: drop resource_id argument.
  UPSTREAM: drm/virtio: use virtio_gpu_object->hw_res_handle in virtio_gpu_resource_create_ioctl()
  UPSTREAM: drm/virtio: use virtio_gpu_object->hw_res_handle in virtio_gpu_mode_dumb_create()
  UPSTREAM: drm/virtio: use virtio_gpu_object->hw_res_handle in virtio_gpufb_create()
  BACKPORT: drm/virtio: track created object state
  UPSTREAM: drm/virtio: document drm_dev_set_unique workaround
  UPSTREAM: virtio: Support prime objects vmap/vunmap
  UPSTREAM: virtio: Rework virtio_gpu_object_kmap()
  UPSTREAM: virtio: Add virtio_gpu_object_kunmap()
  UPSTREAM: drm/virtio: pass virtio_gpu_object to virtio_gpu_cmd_transfer_to_host_{2d, 3d}
  UPSTREAM: drm/virtio: add dma sync for dma mapped virtio gpu framebuffer pages
  UPSTREAM: drm/virtio: Remove set but not used variable 'bo'
  UPSTREAM: drm/virtio: add iommu support.
  UPSTREAM: drm/virtio: add virtio_gpu_object_detach() function
  UPSTREAM: drm/virtio: track virtual output state
  UPSTREAM: drm/virtio: fix bounds check in virtio_gpu_cmd_get_capset()
  UPSTREAM: gpu: drm: virtio: code cleanup
  UPSTREAM: drm/virtio: Place GEM BOs in drm_framebuffer
  UPSTREAM: drm/virtio: fix mode_valid's return type
  UPSTREAM: drm/virtio: Add spaces around operators
  UPSTREAM: drm/virtio: Remove multiple blank lines
  UPSTREAM: drm/virtio: Replace 'unsigned' for 'unsigned int'
  UPSTREAM: drm/virtio: Remove return from void function
  UPSTREAM: drm/virtio: Add */ in block comments to separate line
  UPSTREAM: drm/virtio: Add blank line after variable declarations
  UPSTREAM: drm/virtio: Add tabs at the start of a line
  UPSTREAM: drm/virtio: Don't return invalid caps on timeout
  UPSTREAM: virtgpu: remove redundant task_comm copying
  UPSTREAM: drm/virtio: add create_handle support.
  UPSTREAM: drm: virtio: replace reference/unreference with get/put
  UPSTREAM: drm/virtio: Replace instances of reference/unreference with get/put
  UPSTREAM: drm: byteorder: add DRM_FORMAT_HOST_*
  UPSTREAM: drm: add drm_connector_attach_edid_property()
  BACKPORT: drm/prime: Add drm_gem_prime_mmap()
  f2fs: fix build error on android tracepoints
  ANDROID: cuttlefish_defconfig: Enable CAN/VCAN
  UPSTREAM: pidfd: fix a poll race when setting exit_state
  BACKPORT: arch: wire-up pidfd_open()
  BACKPORT: pid: add pidfd_open()
  UPSTREAM: pidfd: add polling support
  UPSTREAM: signal: improve comments
  UPSTREAM: fork: do not release lock that wasn't taken
  BACKPORT: signal: support CLONE_PIDFD with pidfd_send_signal
  BACKPORT: clone: add CLONE_PIDFD
  UPSTREAM: Make anon_inodes unconditional
  UPSTREAM: signal: use fdget() since we don't allow O_PATH
  UPSTREAM: signal: don't silently convert SI_USER signals to non-current pidfd
  BACKPORT: signal: add pidfd_send_signal() syscall
  UPSTREAM: net-ipv6-ndisc: add support for RFC7710 RA Captive Portal Identifier
  ANDROID: fix up 9p filesystem due to CFI non-upstream patches
  f2fs: use EINVAL for superblock with invalid magic
  f2fs: fix to read source block before invalidating it
  f2fs: remove redundant check from f2fs_setflags_common()
  f2fs: use generic checking function for FS_IOC_FSSETXATTR
  f2fs: use generic checking and prep function for FS_IOC_SETFLAGS
  ubifs, fscrypt: cache decrypted symlink target in ->i_link
  vfs: use READ_ONCE() to access ->i_link
  fs, fscrypt: clear DCACHE_ENCRYPTED_NAME when unaliasing directory
  ANDROID: (arm64) cuttlefish_defconfig: enable CONFIG_CPU_FREQ_TIMES
  ANDROID: xfrm: remove in_compat_syscall() checks
  ANDROID: enable CONFIG_RTC_DRV_TEST on cuttlefish
  UPSTREAM: binder: Set end of SG buffer area properly.
  ANDROID: x86_64_cuttlefish_defconfig: enable CONFIG_CPU_FREQ_TIMES
  ANDROID: f2fs: add android fsync tracepoint
  ANDROID: f2fs: fix wrong android tracepoint
  fscrypt: cache decrypted symlink target in ->i_link
  fscrypt: fix race where ->lookup() marks plaintext dentry as ciphertext
  fscrypt: only set dentry_operations on ciphertext dentries
  fscrypt: fix race allowing rename() and link() of ciphertext dentries
  fscrypt: clean up and improve dentry revalidation
  fscrypt: use READ_ONCE() to access ->i_crypt_info
  fscrypt: remove WARN_ON_ONCE() when decryption fails
  fscrypt: drop inode argument from fscrypt_get_ctx()
  f2fs: improve print log in f2fs_sanity_check_ckpt()
  f2fs: avoid out-of-range memory access
  f2fs: fix to avoid long latency during umount
  f2fs: allow all the users to pin a file
  f2fs: support swap file w/ DIO
  f2fs: allocate blocks for pinned file
  f2fs: fix is_idle() check for discard type
  f2fs: add a rw_sem to cover quota flag changes
  f2fs: set SBI_NEED_FSCK for xattr corruption case
  f2fs: use generic EFSBADCRC/EFSCORRUPTED
  f2fs: Use DIV_ROUND_UP() instead of open-coding
  f2fs: print kernel message if filesystem is inconsistent
  f2fs: introduce f2fs_<level> macros to wrap f2fs_printk()
  f2fs: avoid get_valid_blocks() for cleanup
  f2fs: ioctl for removing a range from F2FS
  f2fs: only set project inherit bit for directory
  f2fs: separate f2fs i_flags from fs_flags and ext4 i_flags
  UPSTREAM: kasan: initialize tag to 0xff in __kasan_kmalloc
  UPSTREAM: x86/boot: Provide KASAN compatible aliases for string routines
  UPSTREAM: mm/kasan: Remove the ULONG_MAX stack trace hackery
  UPSTREAM: x86/uaccess, kasan: Fix KASAN vs SMAP
  UPSTREAM: x86/uaccess: Introduce user_access_{save,restore}()
  UPSTREAM: kasan: fix variable 'tag' set but not used warning
  UPSTREAM: Revert "x86_64: Increase stack size for KASAN_EXTRA"
  UPSTREAM: kasan: fix coccinelle warnings in kasan_p*_table
  UPSTREAM: kasan: fix kasan_check_read/write definitions
  BACKPORT: kasan: remove use after scope bugs detection.
  BACKPORT: kasan: turn off asan-stack for clang-8 and earlier
  UPSTREAM: slub: fix a crash with SLUB_DEBUG + KASAN_SW_TAGS
  UPSTREAM: kasan, slab: remove redundant kasan_slab_alloc hooks
  UPSTREAM: kasan, slab: make freelist stored without tags
  UPSTREAM: kasan, slab: fix conflicts with CONFIG_HARDENED_USERCOPY
  UPSTREAM: kasan: prevent tracing of tags.c
  UPSTREAM: kasan: fix random seed generation for tag-based mode
  UPSTREAM: slub: fix SLAB_CONSISTENCY_CHECKS + KASAN_SW_TAGS
  UPSTREAM: kasan, slub: fix more conflicts with CONFIG_SLAB_FREELIST_HARDENED
  UPSTREAM: kasan, slub: fix conflicts with CONFIG_SLAB_FREELIST_HARDENED
  UPSTREAM: kasan, slub: move kasan_poison_slab hook before page_address
  UPSTREAM: kasan, kmemleak: pass tagged pointers to kmemleak
  UPSTREAM: kasan: fix assigning tags twice
  UPSTREAM: kasan: mark file common so ftrace doesn't trace it
  UPSTREAM: kasan, arm64: remove redundant ARCH_SLAB_MINALIGN define
  UPSTREAM: kasan: fix krealloc handling for tag-based mode
  UPSTREAM: kasan: make tag based mode work with CONFIG_HARDENED_USERCOPY
  UPSTREAM: kasan, arm64: use ARCH_SLAB_MINALIGN instead of manual aligning
  BACKPORT: mm/memblock.c: skip kmemleak for kasan_init()
  UPSTREAM: kasan: add SPDX-License-Identifier mark to source files
  BACKPORT: kasan: update documentation
  UPSTREAM: kasan, arm64: select HAVE_ARCH_KASAN_SW_TAGS
  UPSTREAM: kasan: add __must_check annotations to kasan hooks
  BACKPORT: kasan, mm, arm64: tag non slab memory allocated via pagealloc
  UPSTREAM: kasan, arm64: add brk handler for inline instrumentation
  UPSTREAM: kasan: add hooks implementation for tag-based mode
  UPSTREAM: mm: move obj_to_index to include/linux/slab_def.h
  UPSTREAM: kasan: add bug reporting routines for tag-based mode
  UPSTREAM: kasan: split out generic_report.c from report.c
  UPSTREAM: kasan, mm: perform untagged pointers comparison in krealloc
  BACKPORT: kasan, arm64: enable top byte ignore for the kernel
  BACKPORT: kasan, arm64: fix up fault handling logic
  UPSTREAM: kasan: preassign tags to objects with ctors or SLAB_TYPESAFE_BY_RCU
  UPSTREAM: kasan, arm64: untag address in _virt_addr_is_linear
  UPSTREAM: kasan: add tag related helper functions
  BACKPORT: arm64: move untagged_addr macro from uaccess.h to memory.h
  BACKPORT: kasan: initialize shadow to 0xff for tag-based mode
  BACKPORT: kasan: rename kasan_zero_page to kasan_early_shadow_page
  BACKPORT: kasan, arm64: adjust shadow size for tag-based mode
  BACKPORT: kasan: add CONFIG_KASAN_GENERIC and CONFIG_KASAN_SW_TAGS
  UPSTREAM: kasan: rename source files to reflect the new naming scheme
  BACKPORT: kasan: move common generic and tag-based code to common.c
  UPSTREAM: kasan, slub: handle pointer tags in early_kmem_cache_node_alloc
  UPSTREAM: kasan, mm: change hooks signatures
  UPSTREAM: arm64: add EXPORT_SYMBOL_NOKASAN()
  BACKPORT: compiler: remove __no_sanitize_address_or_inline again
  UPSTREAM: mm/kasan/quarantine.c: make quarantine_lock a raw_spinlock_t
  UPSTREAM: lib/test_kasan.c: add tests for several string/memory API functions
  UPSTREAM: arm64: lib: use C string functions with KASAN enabled
  UPSTREAM: compiler: introduce __no_sanitize_address_or_inline
  UPSTREAM: arm64: Fix typo in a comment in arch/arm64/mm/kasan_init.c
  BACKPORT: kernel/memremap, kasan: make ZONE_DEVICE with work with KASAN
  BACKPORT: mm/mempool.c: remove unused argument in kasan_unpoison_element() and remove_element()
  UPSTREAM: kasan: only select SLUB_DEBUG with SYSFS=y
  UPSTREAM: kasan: depend on CONFIG_SLUB_DEBUG
  UPSTREAM: KASAN: prohibit KASAN+STRUCTLEAK combination
  UPSTREAM: arm64: kasan: avoid pfn_to_nid() before page array is initialized
  UPSTREAM: kasan: fix invalid-free test crashing the kernel
  UPSTREAM: kasan, slub: fix handling of kasan_slab_free hook
  UPSTREAM: slab, slub: skip unnecessary kasan_cache_shutdown()
  BACKPORT: kasan: make kasan_cache_create() work with 32-bit slab cache sizes
  UPSTREAM: locking/atomics: Instrument cmpxchg_double*()
  UPSTREAM: locking/atomics: Instrument xchg()
  UPSTREAM: locking/atomics: Simplify cmpxchg() instrumentation
  UPSTREAM: locking/atomics/x86: Reduce arch_cmpxchg64*() instrumentation
  UPSTREAM: locking/atomic, asm-generic, x86: Add comments for atomic instrumentation
  UPSTREAM: locking/atomic, asm-generic: Add KASAN instrumentation to atomic operations
  UPSTREAM: locking/atomic/x86: Switch atomic.h to use atomic-instrumented.h
  UPSTREAM: locking/atomic, asm-generic: Add asm-generic/atomic-instrumented.h
  BACKPORT: kasan, arm64: clean up KASAN_SHADOW_SCALE_SHIFT usage
  UPSTREAM: kasan: clean up KASAN_SHADOW_SCALE_SHIFT usage
  UPSTREAM: kasan: fix prototype author email address
  UPSTREAM: kasan: detect invalid frees
  UPSTREAM: kasan: unify code between kasan_slab_free() and kasan_poison_kfree()
  UPSTREAM: kasan: detect invalid frees for large mempool objects
  UPSTREAM: kasan: don't use __builtin_return_address(1)
  UPSTREAM: kasan: detect invalid frees for large objects
  UPSTREAM: kasan: add functions for unpoisoning stack variables
  UPSTREAM: kasan: add tests for alloca poisoning
  UPSTREAM: kasan: support alloca() poisoning
  UPSTREAM: kasan/Makefile: support LLVM style asan parameters
  BACKPORT: kasan: add compiler support for clang
  BACKPORT: fs: dcache: Revert "manually unpoison dname after allocation to shut up kasan's reports"
  UPSTREAM: fs/dcache: Use read_word_at_a_time() in dentry_string_cmp()
  UPSTREAM: lib/strscpy: Shut up KASAN false-positives in strscpy()
  UPSTREAM: compiler.h: Add read_word_at_a_time() function.
  UPSTREAM: compiler.h, kasan: Avoid duplicating __read_once_size_nocheck()
  UPSTREAM: arm64/mm/kasan: don't use vmemmap_populate() to initialize shadow
  UPSTREAM: Documentation/features/KASAN: mark KASAN as supported only on 64-bit on x86
  f2fs: Add option to limit required GC for checkpoint=disable
  f2fs: Fix accounting for unusable blocks
  f2fs: Fix root reserved on remount
  f2fs: Lower threshold for disable_cp_again
  f2fs: fix sparse warning
  f2fs: fix f2fs_show_options to show nodiscard mount option
  f2fs: add error prints for debugging mount failure
  f2fs: fix to do sanity check on segment bitmap of LFS curseg
  f2fs: add missing sysfs entries in documentation
  f2fs: fix to avoid deadloop if data_flush is on
  f2fs: always assume that the device is idle under gc_urgent
  f2fs: add bio cache for IPU
  f2fs: allow ssr block allocation during checkpoint=disable period
  f2fs: fix to check layout on last valid checkpoint park

Conflicts:
	arch/arm64/configs/cuttlefish_defconfig
	arch/arm64/include/asm/memory.h
	arch/arm64/include/asm/thread_info.h
	arch/x86/configs/x86_64_cuttlefish_defconfig
	build.config.common
	drivers/dma-buf/dma-buf.c
	fs/crypto/Makefile
	fs/crypto/bio.c
	fs/crypto/fscrypt_private.h
	fs/crypto/keyinfo.c
	fs/ext4/page-io.c
	fs/f2fs/data.c
	fs/f2fs/f2fs.h
	fs/f2fs/inode.c
	fs/f2fs/segment.c
	fs/userfaultfd.c
	include/linux/dma-buf.h
	include/linux/fscrypt.h
	include/linux/kasan.h
	include/linux/platform_data/ds2482.h
	include/uapi/linux/fs.h
	kernel/sched/deadline.c
	kernel/sched/fair.c
	kernel/sched/rt.c
	kernel/sched/sched.h
	kernel/sched/stop_task.c
	kernel/sched/walt.c
	kernel/sched/walt.h
	lib/test_kasan.c
	mm/kasan/common.c
	mm/kasan/kasan.h
	mm/kasan/report.c
	mm/slub.c
	mm/vmalloc.c
	scripts/Makefile.kasan

Changed below files to fix build errors:

	drivers/char/diag/diagchar_core.c
	drivers/power/supply/qcom/battery.c
	drivers/power/supply/qcom/smb1390-charger-psy.c
	drivers/power/supply/qcom/smb1390-charger.c
	drivers/power/supply/qcom/step-chg-jeita.c
	fs/crypto/fscrypt_ice.c
	fs/crypto/fscrypt_private.h
	fs/f2fs/inode.c
	include/uapi/linux/fscrypt.h
	net/qrtr/qrtr.c
	gen_headers_arm.bp
	gen_headers_arm64.bp

Extra added fixes in fs/f2fs/data.c for FBE:

  * Fix FBE regression with 9937c21ce1 ("f2fs: add bio cache
    for IPU"). The above commit is not setting the DUN for
    bio, due to which the bio's could get corrupted when FBE
    is enabled.

  * The f2fs_merge_page_bio() incorrectly uses the bio after
    it is submitted for IO when fscrypt_mergeable_bio()
    returns false. Fix it by making the submitted bio NULL
    so that a new bio gets allocated for the next/new page.

Ignored the below scheduler patches as they are already present:

  ANDROID: sched: WALT: Add support for CFS_BANDWIDTH
  ANDROID: sched: WALT: Refactor cumulative runnable average fixup

picked below patches from 4.14.159 and 4.14.172 versions to fix issues
  0e39aa9d5 "UPSTREAM: arm64: Validate tagged addresses in access_ok() called from kernel threads"
  352902650 "fscrypt: support passing a keyring key to FS_IOC_ADD_ENCRYPTION_KEY"

Change-Id: I205b796ee125fa6e9d27fa30f881e4e8fe8bea29
Signed-off-by: Srinivasarao P <spathi@codeaurora.org>
Signed-off-by: Blagovest Kolenichev <bkolenichev@codeaurora.org>
2020-04-07 15:22:27 +05:30
Catalin Marinas
cde9b8ce92 UPSTREAM: arm64: Tighten the PR_{SET, GET}_TAGGED_ADDR_CTRL prctl() unused arguments
(Upstream commit 3e91ec89f527b9870fe42dcbdb74fd389d123a95).

Require that arg{3,4,5} of the PR_{SET,GET}_TAGGED_ADDR_CTRL prctl and
arg2 of the PR_GET_TAGGED_ADDR_CTRL prctl() are zero rather than ignored
for future extensions.

Acked-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Bug: 135692346
Change-Id: I8bb5c3eb4728440880c971d77904f7e45b571ddc
2019-10-07 14:56:02 -04:00
Catalin Marinas
aa5a358032 BACKPORT: arm64: Introduce prctl() options to control the tagged user addresses ABI
(Upstream commit 63f0c60379650d82250f22e4cf4137ef3dc4f43d).

It is not desirable to relax the ABI to allow tagged user addresses into
the kernel indiscriminately. This patch introduces a prctl() interface
for enabling or disabling the tagged ABI with a global sysctl control
for preventing applications from enabling the relaxed ABI (meant for
testing user-space prctl() return error checking without reconfiguring
the kernel). The ABI properties are inherited by threads of the same
application and fork()'ed children but cleared on execve(). A Kconfig
option allows the overall disabling of the relaxed ABI.

The PR_SET_TAGGED_ADDR_CTRL will be expanded in the future to handle
MTE-specific settings like imprecise vs precise exceptions.

Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Will Deacon <will@kernel.org>
Change-Id: I2d52c5589b05415faab315c116245f1058d64750
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Bug: 135692346
2019-10-07 14:56:02 -04:00
Blagovest Kolenichev
80e9b1ab39 Merge android-4.14.126 (cfee25d) into msm-4.14
* refs/heads/tmp-cfee25d:
  Linux 4.14.126
  ALSA: seq: Cover unsubscribe_port() in list_mutex
  drm: don't block fb changes for async plane updates
  Revert "drm/nouveau: add kconfig option to turn off nouveau legacy contexts. (v3)"
  Revert "Bluetooth: Align minimum encryption key size for LE and BR/EDR connections"
  percpu: do not search past bitmap when allocating an area
  gpio: vf610: Do not share irq_chip
  usb: typec: fusb302: Check vconn is off when we start toggling
  ARM: exynos: Fix undefined instruction during Exynos5422 resume
  pwm: Fix deadlock warning when removing PWM device
  ARM: dts: exynos: Always enable necessary APIO_1V8 and ABB_1V8 regulators on Arndale Octa
  pwm: tiehrpwm: Update shadow register for disabling PWMs
  dmaengine: idma64: Use actual device for DMA transfers
  gpio: gpio-omap: add check for off wake capable gpios
  PCI: xilinx: Check for __get_free_pages() failure
  block, bfq: increase idling for weight-raised queues
  video: imsttfb: fix potential NULL pointer dereferences
  video: hgafb: fix potential NULL pointer dereference
  PCI: rcar: Fix 64bit MSI message address handling
  PCI: rcar: Fix a potential NULL pointer dereference
  power: supply: max14656: fix potential use-before-alloc
  platform/x86: intel_pmc_ipc: adding error handling
  PCI: rpadlpar: Fix leaked device_node references in add/remove paths
  ARM: dts: imx6qdl: Specify IMX6QDL_CLK_IPG as "ipg" clock to SDMA
  ARM: dts: imx6sx: Specify IMX6SX_CLK_IPG as "ipg" clock to SDMA
  ARM: dts: imx6ul: Specify IMX6UL_CLK_IPG as "ipg" clock to SDMA
  ARM: dts: imx7d: Specify IMX7D_CLK_IPG as "ipg" clock to SDMA
  ARM: dts: imx6sx: Specify IMX6SX_CLK_IPG as "ahb" clock to SDMA
  ARM: dts: imx53: Specify IMX5_CLK_IPG as "ahb" clock to SDMA
  ARM: dts: imx50: Specify IMX5_CLK_IPG as "ahb" clock to SDMA
  ARM: dts: imx51: Specify IMX5_CLK_IPG as "ahb" clock to SDMA
  soc: rockchip: Set the proper PWM for rk3288
  clk: rockchip: Turn on "aclk_dmac1" for suspend on rk3288
  soc: mediatek: pwrap: Zero initialize rdata in pwrap_init_cipher
  PCI: keystone: Prevent ARM32 specific code to be compiled for ARM64
  platform/chrome: cros_ec_proto: check for NULL transfer function
  x86/PCI: Fix PCI IRQ routing table memory leak
  vfio: Fix WARNING "do not call blocking ops when !TASK_RUNNING"
  nfsd: allow fh_want_write to be called twice
  fuse: retrieve: cap requested size to negotiated max_write
  nvmem: core: fix read buffer in place
  ALSA: hda - Register irq handler after the chip initialization
  nvme-pci: unquiesce admin queue on shutdown
  misc: pci_endpoint_test: Fix test_reg_bar to be updated in pci_endpoint_test
  iommu/vt-d: Set intel_iommu_gfx_mapped correctly
  blk-mq: move cancel of requeue_work into blk_mq_release
  watchdog: fix compile time error of pretimeout governors
  watchdog: imx2_wdt: Fix set_timeout for big timeout values
  mmc: mmci: Prevent polling for busy detection in IRQ context
  uml: fix a boot splat wrt use of cpu_all_mask
  configfs: fix possible use-after-free in configfs_register_group
  percpu: remove spurious lock dependency between percpu and sched
  f2fs: fix to do sanity check on valid block count of segment
  f2fs: fix to avoid panic in dec_valid_block_count()
  f2fs: fix to clear dirty inode in error path of f2fs_iget()
  f2fs: fix to avoid panic in do_recover_data()
  ntp: Allow TAI-UTC offset to be set to zero
  pwm: meson: Use the spin-lock only to protect register modifications
  EDAC/mpc85xx: Prevent building as a module
  objtool: Don't use ignore flag for fake jumps
  drm/bridge: adv7511: Fix low refresh rate selection
  perf/x86/intel: Allow PEBS multi-entry in watermark mode
  mfd: twl6040: Fix device init errors for ACCCTL register
  drm/nouveau/disp/dp: respect sink limits when selecting failsafe link configuration
  mfd: intel-lpss: Set the device in reset state when init
  mfd: tps65912-spi: Add missing of table registration
  drivers: thermal: tsens: Don't print error message on -EPROBE_DEFER
  thermal: rcar_gen3_thermal: disable interrupt in .remove
  kernel/sys.c: prctl: fix false positive in validate_prctl_map()
  mm/slab.c: fix an infinite loop in leaks_show()
  mm/cma_debug.c: fix the break condition in cma_maxchunk_get()
  mm/cma.c: fix the bitmap status to show failed allocation reason
  mm/cma.c: fix crash on CMA allocation if bitmap allocation fails
  mem-hotplug: fix node spanned pages when we have a node with only ZONE_MOVABLE
  hugetlbfs: on restore reserve error path retain subpool reservation
  mm/hmm: select mmu notifier when selecting HMM
  ARM: prevent tracing IPI_CPU_BACKTRACE
  ipc: prevent lockup on alloc_msg and free_msg
  sysctl: return -EINVAL if val violates minmax
  fs/fat/file.c: issue flush after the writeback of FAT
  rapidio: fix a NULL pointer dereference when create_workqueue() fails
  BACKPORT: kheaders: Do not regenerate archive if config is not changed
  BACKPORT: kheaders: Move from proc to sysfs
  BACKPORT: Provide in-kernel headers to make extending kernel easier
  UPSTREAM: binder: check for overflow when alloc for security context

Conflicts:
	arch/arm/kernel/smp.c

Change-Id: I93aaeb09806bd05b4bb216aa12d7ba8007fbc3e9
Signed-off-by: Blagovest Kolenichev <bkolenichev@codeaurora.org>
2019-07-23 11:00:29 -07:00
Greg Kroah-Hartman
cfee25d274 Merge 4.14.126 into android-4.14
Changes in 4.14.126
	rapidio: fix a NULL pointer dereference when create_workqueue() fails
	fs/fat/file.c: issue flush after the writeback of FAT
	sysctl: return -EINVAL if val violates minmax
	ipc: prevent lockup on alloc_msg and free_msg
	ARM: prevent tracing IPI_CPU_BACKTRACE
	mm/hmm: select mmu notifier when selecting HMM
	hugetlbfs: on restore reserve error path retain subpool reservation
	mem-hotplug: fix node spanned pages when we have a node with only ZONE_MOVABLE
	mm/cma.c: fix crash on CMA allocation if bitmap allocation fails
	mm/cma.c: fix the bitmap status to show failed allocation reason
	mm/cma_debug.c: fix the break condition in cma_maxchunk_get()
	mm/slab.c: fix an infinite loop in leaks_show()
	kernel/sys.c: prctl: fix false positive in validate_prctl_map()
	thermal: rcar_gen3_thermal: disable interrupt in .remove
	drivers: thermal: tsens: Don't print error message on -EPROBE_DEFER
	mfd: tps65912-spi: Add missing of table registration
	mfd: intel-lpss: Set the device in reset state when init
	drm/nouveau/disp/dp: respect sink limits when selecting failsafe link configuration
	mfd: twl6040: Fix device init errors for ACCCTL register
	perf/x86/intel: Allow PEBS multi-entry in watermark mode
	drm/bridge: adv7511: Fix low refresh rate selection
	objtool: Don't use ignore flag for fake jumps
	EDAC/mpc85xx: Prevent building as a module
	pwm: meson: Use the spin-lock only to protect register modifications
	ntp: Allow TAI-UTC offset to be set to zero
	f2fs: fix to avoid panic in do_recover_data()
	f2fs: fix to clear dirty inode in error path of f2fs_iget()
	f2fs: fix to avoid panic in dec_valid_block_count()
	f2fs: fix to do sanity check on valid block count of segment
	percpu: remove spurious lock dependency between percpu and sched
	configfs: fix possible use-after-free in configfs_register_group
	uml: fix a boot splat wrt use of cpu_all_mask
	mmc: mmci: Prevent polling for busy detection in IRQ context
	watchdog: imx2_wdt: Fix set_timeout for big timeout values
	watchdog: fix compile time error of pretimeout governors
	blk-mq: move cancel of requeue_work into blk_mq_release
	iommu/vt-d: Set intel_iommu_gfx_mapped correctly
	misc: pci_endpoint_test: Fix test_reg_bar to be updated in pci_endpoint_test
	nvme-pci: unquiesce admin queue on shutdown
	ALSA: hda - Register irq handler after the chip initialization
	nvmem: core: fix read buffer in place
	fuse: retrieve: cap requested size to negotiated max_write
	nfsd: allow fh_want_write to be called twice
	vfio: Fix WARNING "do not call blocking ops when !TASK_RUNNING"
	x86/PCI: Fix PCI IRQ routing table memory leak
	platform/chrome: cros_ec_proto: check for NULL transfer function
	PCI: keystone: Prevent ARM32 specific code to be compiled for ARM64
	soc: mediatek: pwrap: Zero initialize rdata in pwrap_init_cipher
	clk: rockchip: Turn on "aclk_dmac1" for suspend on rk3288
	soc: rockchip: Set the proper PWM for rk3288
	ARM: dts: imx51: Specify IMX5_CLK_IPG as "ahb" clock to SDMA
	ARM: dts: imx50: Specify IMX5_CLK_IPG as "ahb" clock to SDMA
	ARM: dts: imx53: Specify IMX5_CLK_IPG as "ahb" clock to SDMA
	ARM: dts: imx6sx: Specify IMX6SX_CLK_IPG as "ahb" clock to SDMA
	ARM: dts: imx7d: Specify IMX7D_CLK_IPG as "ipg" clock to SDMA
	ARM: dts: imx6ul: Specify IMX6UL_CLK_IPG as "ipg" clock to SDMA
	ARM: dts: imx6sx: Specify IMX6SX_CLK_IPG as "ipg" clock to SDMA
	ARM: dts: imx6qdl: Specify IMX6QDL_CLK_IPG as "ipg" clock to SDMA
	PCI: rpadlpar: Fix leaked device_node references in add/remove paths
	platform/x86: intel_pmc_ipc: adding error handling
	power: supply: max14656: fix potential use-before-alloc
	PCI: rcar: Fix a potential NULL pointer dereference
	PCI: rcar: Fix 64bit MSI message address handling
	video: hgafb: fix potential NULL pointer dereference
	video: imsttfb: fix potential NULL pointer dereferences
	block, bfq: increase idling for weight-raised queues
	PCI: xilinx: Check for __get_free_pages() failure
	gpio: gpio-omap: add check for off wake capable gpios
	dmaengine: idma64: Use actual device for DMA transfers
	pwm: tiehrpwm: Update shadow register for disabling PWMs
	ARM: dts: exynos: Always enable necessary APIO_1V8 and ABB_1V8 regulators on Arndale Octa
	pwm: Fix deadlock warning when removing PWM device
	ARM: exynos: Fix undefined instruction during Exynos5422 resume
	usb: typec: fusb302: Check vconn is off when we start toggling
	gpio: vf610: Do not share irq_chip
	percpu: do not search past bitmap when allocating an area
	Revert "Bluetooth: Align minimum encryption key size for LE and BR/EDR connections"
	Revert "drm/nouveau: add kconfig option to turn off nouveau legacy contexts. (v3)"
	drm: don't block fb changes for async plane updates
	ALSA: seq: Cover unsubscribe_port() in list_mutex
	Linux 4.14.126

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2019-06-15 15:56:56 +02:00
Cyrill Gorcunov
a84bd98db8 kernel/sys.c: prctl: fix false positive in validate_prctl_map()
[ Upstream commit a9e73998f9d705c94a8dca9687633adc0f24a19a ]

While validating new map we require the @start_data to be strictly less
than @end_data, which is fine for regular applications (this is why this
nit didn't trigger for that long).  These members are set from executable
loaders such as elf handers, still it is pretty valid to have a loadable
data section with zero size in file, in such case the start_data is equal
to end_data once kernel loader finishes.

As a result when we're trying to restore such programs the procedure fails
and the kernel returns -EINVAL.  From the image dump of a program:

 | "mm_start_code": "0x400000",
 | "mm_end_code": "0x8f5fb4",
 | "mm_start_data": "0xf1bfb0",
 | "mm_end_data": "0xf1bfb0",

Thus we need to change validate_prctl_map from strictly less to less or
equal operator use.

Link: http://lkml.kernel.org/r/20190408143554.GY1421@uranus.lan
Fixes: f606b77f1a ("prctl: PR_SET_MM -- introduce PR_SET_MM_MAP operation")
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
Cc: Andrey Vagin <avagin@gmail.com>
Cc: Dmitry Safonov <0x7f454c46@gmail.com>
Cc: Pavel Emelyanov <xemul@virtuozzo.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-06-15 11:54:52 +02:00
Yang Shi
1022f84ec6 mm: introduce arg_lock to protect arg_start|end and env_start|end in mm_struct
mmap_sem is on the hot path of kernel, and it very contended, but it is
abused too.  It is used to protect arg_start|end and evn_start|end when
reading /proc/$PID/cmdline and /proc/$PID/environ, but it doesn't make
sense since those proc files just expect to read 4 values atomically and
not related to VM, they could be set to arbitrary values by C/R.

And, the mmap_sem contention may cause unexpected issue like below:

INFO: task ps:14018 blocked for more than 120 seconds.
       Tainted: G            E 4.9.79-009.ali3000.alios7.x86_64 #1
 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this
message.
 ps              D    0 14018      1 0x00000004
 Call Trace:
   schedule+0x36/0x80
   rwsem_down_read_failed+0xf0/0x150
   call_rwsem_down_read_failed+0x18/0x30
   down_read+0x20/0x40
   proc_pid_cmdline_read+0xd9/0x4e0
   __vfs_read+0x37/0x150
   vfs_read+0x96/0x130
   SyS_read+0x55/0xc0
   entry_SYSCALL_64_fastpath+0x1a/0xc5

Both Alexey Dobriyan and Michal Hocko suggested to use dedicated lock
for them to mitigate the abuse of mmap_sem.

So, introduce a new spinlock in mm_struct to protect the concurrent
access to arg_start|end, env_start|end and others, as well as replace
write map_sem to read to protect the race condition between prctl and
sys_brk which might break check_data_rlimit(), and makes prctl more
friendly to other VM operations.

This patch just eliminates the abuse of mmap_sem, but it can't resolve
the above hung task warning completely since the later
access_remote_vm() call needs acquire mmap_sem.  The mmap_sem
scalability issue will be solved in the future.

Change-Id: Ifa8f001ee2fc4f0ce60c18e771cebcf8a1f0943e
[yang.shi@linux.alibaba.com: add comment about mmap_sem and arg_lock]
  Link: http://lkml.kernel.org/r/1524077799-80690-1-git-send-email-yang.shi@linux.alibaba.com
Link: http://lkml.kernel.org/r/1523730291-109696-1-git-send-email-yang.shi@linux.alibaba.com
Signed-off-by: Yang Shi <yang.shi@linux.alibaba.com>
Reviewed-by: Cyrill Gorcunov <gorcunov@openvz.org>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Mateusz Guzik <mguzik@redhat.com>
Cc: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Git-commit: 88aa7cc688d48ddd84558b41d5905a0db9535c4b
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
Signed-off-by: Srinivas Ramana <sramana@codeaurora.org>
2019-01-24 13:34:04 -08:00
Greg Kroah-Hartman
c535ee76cd Merge 4.14.69 into android-4.14-p
Changes in 4.14.69
	net: 6lowpan: fix reserved space for single frames
	net: mac802154: tx: expand tailroom if necessary
	9p/net: Fix zero-copy path in the 9p virtio transport
	spi: davinci: fix a NULL pointer dereference
	spi: pxa2xx: Add support for Intel Ice Lake
	spi: spi-fsl-dspi: Fix imprecise abort on VF500 during probe
	spi: cadence: Change usleep_range() to udelay(), for atomic context
	mmc: renesas_sdhi_internal_dmac: fix #define RST_RESERVED_BITS
	readahead: stricter check for bdi io_pages
	block: blk_init_allocated_queue() set q->fq as NULL in the fail case
	block: really disable runtime-pm for blk-mq
	drm/i915/userptr: reject zero user_size
	libertas: fix suspend and resume for SDIO connected cards
	media: Revert "[media] tvp5150: fix pad format frame height"
	mailbox: xgene-slimpro: Fix potential NULL pointer dereference
	Replace magic for trusting the secondary keyring with #define
	Fix kexec forbidding kernels signed with keys in the secondary keyring to boot
	powerpc/fadump: handle crash memory ranges array index overflow
	powerpc/pseries: Fix endianness while restoring of r3 in MCE handler.
	PCI: Add wrappers for dev_printk()
	powerpc/powernv/pci: Work around races in PCI bridge enabling
	cxl: Fix wrong comparison in cxl_adapter_context_get()
	ib_srpt: Fix a use-after-free in srpt_close_ch()
	RDMA/rxe: Set wqe->status correctly if an unexpected response is received
	9p: fix multiple NULL-pointer-dereferences
	fs/9p/xattr.c: catch the error of p9_client_clunk when setting xattr failed
	9p/virtio: fix off-by-one error in sg list bounds check
	net/9p/client.c: version pointer uninitialized
	net/9p/trans_fd.c: fix race-condition by flushing workqueue before the kfree()
	dm integrity: change 'suspending' variable from bool to int
	dm thin: stop no_space_timeout worker when switching to write-mode
	dm cache metadata: save in-core policy_hint_size to on-disk superblock
	dm cache metadata: set dirty on all cache blocks after a crash
	dm crypt: don't decrease device limits
	uart: fix race between uart_put_char() and uart_shutdown()
	Drivers: hv: vmbus: Reset the channel callback in vmbus_onoffer_rescind()
	iio: sca3000: Fix missing return in switch
	iio: ad9523: Fix displayed phase
	iio: ad9523: Fix return value for ad952x_store()
	extcon: Release locking when sending the notification of connector state
	vmw_balloon: fix inflation of 64-bit GFNs
	vmw_balloon: do not use 2MB without batching
	vmw_balloon: VMCI_DOORBELL_SET does not check status
	vmw_balloon: fix VMCI use when balloon built into kernel
	rtc: omap: fix potential crash on power off
	tracing: Do not call start/stop() functions when tracing_on does not change
	tracing/blktrace: Fix to allow setting same value
	printk/tracing: Do not trace printk_nmi_enter()
	livepatch: Validate module/old func name length
	uprobes: Use synchronize_rcu() not synchronize_sched()
	mfd: hi655x: Fix regmap area declared size for hi655x
	ovl: fix wrong use of impure dir cache in ovl_iterate()
	drivers/block/zram/zram_drv.c: fix bug storing backing_dev
	cpufreq: governor: Avoid accessing invalid governor_data
	PM / sleep: wakeup: Fix build error caused by missing SRCU support
	KVM: VMX: fixes for vmentry_l1d_flush module parameter
	KVM: PPC: Book3S: Fix guest DMA when guest partially backed by THP pages
	xtensa: limit offsets in __loop_cache_{all,page}
	xtensa: increase ranges in ___invalidate_{i,d}cache_all
	block, bfq: return nbytes and not zero from struct cftype .write() method
	pnfs/blocklayout: off by one in bl_map_stripe()
	NFSv4 client live hangs after live data migration recovery
	NFSv4: Fix locking in pnfs_generic_recover_commit_reqs
	NFSv4: Fix a sleep in atomic context in nfs4_callback_sequence()
	ARM: tegra: Fix Tegra30 Cardhu PCA954x reset
	mm/tlb: Remove tlb_remove_table() non-concurrent condition
	iommu/vt-d: Add definitions for PFSID
	iommu/vt-d: Fix dev iotlb pfsid use
	sys: don't hold uts_sem while accessing userspace memory
	userns: move user access out of the mutex
	ubifs: Fix memory leak in lprobs self-check
	Revert "UBIFS: Fix potential integer overflow in allocation"
	ubifs: Check data node size before truncate
	ubifs: xattr: Don't operate on deleted inodes
	ubifs: Fix synced_i_size calculation for xattr inodes
	pwm: tiehrpwm: Don't use emulation mode bits to control PWM output
	pwm: tiehrpwm: Fix disabling of output of PWMs
	fb: fix lost console when the user unplugs a USB adapter
	udlfb: set optimal write delay
	getxattr: use correct xattr length
	libnvdimm: fix ars_status output length calculation
	bcache: release dc->writeback_lock properly in bch_writeback_thread()
	cap_inode_getsecurity: use d_find_any_alias() instead of d_find_alias()
	perf auxtrace: Fix queue resize
	crypto: vmx - Fix sleep-in-atomic bugs
	crypto: caam - fix DMA mapping direction for RSA forms 2 & 3
	crypto: caam/jr - fix descriptor DMA unmapping
	crypto: caam/qi - fix error path in xts setkey
	fs/quota: Fix spectre gadget in do_quotactl
	arm64: mm: always enable CONFIG_HOLES_IN_ZONE
	Linux 4.14.69

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2018-09-10 09:24:54 +02:00
Greg Kroah-Hartman
fc59235394 Merge 4.14.69 into android-4.14
Changes in 4.14.69
	net: 6lowpan: fix reserved space for single frames
	net: mac802154: tx: expand tailroom if necessary
	9p/net: Fix zero-copy path in the 9p virtio transport
	spi: davinci: fix a NULL pointer dereference
	spi: pxa2xx: Add support for Intel Ice Lake
	spi: spi-fsl-dspi: Fix imprecise abort on VF500 during probe
	spi: cadence: Change usleep_range() to udelay(), for atomic context
	mmc: renesas_sdhi_internal_dmac: fix #define RST_RESERVED_BITS
	readahead: stricter check for bdi io_pages
	block: blk_init_allocated_queue() set q->fq as NULL in the fail case
	block: really disable runtime-pm for blk-mq
	drm/i915/userptr: reject zero user_size
	libertas: fix suspend and resume for SDIO connected cards
	media: Revert "[media] tvp5150: fix pad format frame height"
	mailbox: xgene-slimpro: Fix potential NULL pointer dereference
	Replace magic for trusting the secondary keyring with #define
	Fix kexec forbidding kernels signed with keys in the secondary keyring to boot
	powerpc/fadump: handle crash memory ranges array index overflow
	powerpc/pseries: Fix endianness while restoring of r3 in MCE handler.
	PCI: Add wrappers for dev_printk()
	powerpc/powernv/pci: Work around races in PCI bridge enabling
	cxl: Fix wrong comparison in cxl_adapter_context_get()
	ib_srpt: Fix a use-after-free in srpt_close_ch()
	RDMA/rxe: Set wqe->status correctly if an unexpected response is received
	9p: fix multiple NULL-pointer-dereferences
	fs/9p/xattr.c: catch the error of p9_client_clunk when setting xattr failed
	9p/virtio: fix off-by-one error in sg list bounds check
	net/9p/client.c: version pointer uninitialized
	net/9p/trans_fd.c: fix race-condition by flushing workqueue before the kfree()
	dm integrity: change 'suspending' variable from bool to int
	dm thin: stop no_space_timeout worker when switching to write-mode
	dm cache metadata: save in-core policy_hint_size to on-disk superblock
	dm cache metadata: set dirty on all cache blocks after a crash
	dm crypt: don't decrease device limits
	uart: fix race between uart_put_char() and uart_shutdown()
	Drivers: hv: vmbus: Reset the channel callback in vmbus_onoffer_rescind()
	iio: sca3000: Fix missing return in switch
	iio: ad9523: Fix displayed phase
	iio: ad9523: Fix return value for ad952x_store()
	extcon: Release locking when sending the notification of connector state
	vmw_balloon: fix inflation of 64-bit GFNs
	vmw_balloon: do not use 2MB without batching
	vmw_balloon: VMCI_DOORBELL_SET does not check status
	vmw_balloon: fix VMCI use when balloon built into kernel
	rtc: omap: fix potential crash on power off
	tracing: Do not call start/stop() functions when tracing_on does not change
	tracing/blktrace: Fix to allow setting same value
	printk/tracing: Do not trace printk_nmi_enter()
	livepatch: Validate module/old func name length
	uprobes: Use synchronize_rcu() not synchronize_sched()
	mfd: hi655x: Fix regmap area declared size for hi655x
	ovl: fix wrong use of impure dir cache in ovl_iterate()
	drivers/block/zram/zram_drv.c: fix bug storing backing_dev
	cpufreq: governor: Avoid accessing invalid governor_data
	PM / sleep: wakeup: Fix build error caused by missing SRCU support
	KVM: VMX: fixes for vmentry_l1d_flush module parameter
	KVM: PPC: Book3S: Fix guest DMA when guest partially backed by THP pages
	xtensa: limit offsets in __loop_cache_{all,page}
	xtensa: increase ranges in ___invalidate_{i,d}cache_all
	block, bfq: return nbytes and not zero from struct cftype .write() method
	pnfs/blocklayout: off by one in bl_map_stripe()
	NFSv4 client live hangs after live data migration recovery
	NFSv4: Fix locking in pnfs_generic_recover_commit_reqs
	NFSv4: Fix a sleep in atomic context in nfs4_callback_sequence()
	ARM: tegra: Fix Tegra30 Cardhu PCA954x reset
	mm/tlb: Remove tlb_remove_table() non-concurrent condition
	iommu/vt-d: Add definitions for PFSID
	iommu/vt-d: Fix dev iotlb pfsid use
	sys: don't hold uts_sem while accessing userspace memory
	userns: move user access out of the mutex
	ubifs: Fix memory leak in lprobs self-check
	Revert "UBIFS: Fix potential integer overflow in allocation"
	ubifs: Check data node size before truncate
	ubifs: xattr: Don't operate on deleted inodes
	ubifs: Fix synced_i_size calculation for xattr inodes
	pwm: tiehrpwm: Don't use emulation mode bits to control PWM output
	pwm: tiehrpwm: Fix disabling of output of PWMs
	fb: fix lost console when the user unplugs a USB adapter
	udlfb: set optimal write delay
	getxattr: use correct xattr length
	libnvdimm: fix ars_status output length calculation
	bcache: release dc->writeback_lock properly in bch_writeback_thread()
	cap_inode_getsecurity: use d_find_any_alias() instead of d_find_alias()
	perf auxtrace: Fix queue resize
	crypto: vmx - Fix sleep-in-atomic bugs
	crypto: caam - fix DMA mapping direction for RSA forms 2 & 3
	crypto: caam/jr - fix descriptor DMA unmapping
	crypto: caam/qi - fix error path in xts setkey
	fs/quota: Fix spectre gadget in do_quotactl
	arm64: mm: always enable CONFIG_HOLES_IN_ZONE
	Linux 4.14.69

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2018-09-10 09:21:07 +02:00
Jann Horn
b692c405a1 sys: don't hold uts_sem while accessing userspace memory
commit 42a0cc3478584d4d63f68f2f5af021ddbea771fa upstream.

Holding uts_sem as a writer while accessing userspace memory allows a
namespace admin to stall all processes that attempt to take uts_sem.
Instead, move data through stack buffers and don't access userspace memory
while uts_sem is held.

Cc: stable@vger.kernel.org
Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Signed-off-by: Jann Horn <jannh@google.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-09-09 19:56:00 +02:00
Greg Kroah-Hartman
503f6fecb8 Merge 4.14.45 into android-4.14
Changes in 4.14.45
	MIPS: c-r4k: Fix data corruption related to cache coherence
	MIPS: ptrace: Expose FIR register through FP regset
	MIPS: Fix ptrace(2) PTRACE_PEEKUSR and PTRACE_POKEUSR accesses to o32 FGRs
	KVM: Fix spelling mistake: "cop_unsuable" -> "cop_unusable"
	affs_lookup(): close a race with affs_remove_link()
	fs: don't scan the inode cache before SB_BORN is set
	aio: fix io_destroy(2) vs. lookup_ioctx() race
	ALSA: timer: Fix pause event notification
	do d_instantiate/unlock_new_inode combinations safely
	mmc: sdhci-iproc: remove hard coded mmc cap 1.8v
	mmc: sdhci-iproc: fix 32bit writes for TRANSFER_MODE register
	mmc: sdhci-iproc: add SDHCI_QUIRK2_HOST_OFF_CARD_ON for cygnus
	libata: Blacklist some Sandisk SSDs for NCQ
	libata: blacklist Micron 500IT SSD with MU01 firmware
	xen-swiotlb: fix the check condition for xen_swiotlb_free_coherent
	drm/vmwgfx: Fix 32-bit VMW_PORT_HB_[IN|OUT] macros
	arm64: lse: Add early clobbers to some input/output asm operands
	powerpc/64s: Clear PCR on boot
	IB/hfi1: Use after free race condition in send context error path
	IB/umem: Use the correct mm during ib_umem_release
	sr: pass down correctly sized SCSI sense buffer
	idr: fix invalid ptr dereference on item delete
	Revert "ipc/shm: Fix shmat mmap nil-page protection"
	ipc/shm: fix shmat() nil address after round-down when remapping
	mm/kasan: don't vfree() nonexistent vm_area
	kasan: free allocated shadow memory on MEM_CANCEL_ONLINE
	kasan: fix memory hotplug during boot
	kernel/sys.c: fix potential Spectre v1 issue
	KVM/VMX: Expose SSBD properly to guests
	KVM: s390: vsie: fix < 8k check for the itdba
	KVM: x86: Update cpuid properly when CR4.OSXAVE or CR4.PKE is changed
	kvm: x86: IA32_ARCH_CAPABILITIES is always supported
	x86/kvm: fix LAPIC timer drift when guest uses periodic mode
	powerpc/64s: Improve RFI L1-D cache flush fallback
	powerpc/pseries: Support firmware disable of RFI flush
	powerpc/powernv: Support firmware disable of RFI flush
	powerpc/rfi-flush: Move the logic to avoid a redo into the debugfs code
	powerpc/rfi-flush: Make it possible to call setup_rfi_flush() again
	powerpc/rfi-flush: Always enable fallback flush on pseries
	powerpc/rfi-flush: Differentiate enabled and patched flush types
	powerpc/rfi-flush: Call setup_rfi_flush() after LPM migration
	powerpc/pseries: Add new H_GET_CPU_CHARACTERISTICS flags
	powerpc: Add security feature flags for Spectre/Meltdown
	powerpc/pseries: Set or clear security feature flags
	powerpc/powernv: Set or clear security feature flags
	powerpc/64s: Move cpu_show_meltdown()
	powerpc/64s: Enhance the information in cpu_show_meltdown()
	powerpc/powernv: Use the security flags in pnv_setup_rfi_flush()
	powerpc/pseries: Use the security flags in pseries_setup_rfi_flush()
	powerpc/64s: Wire up cpu_show_spectre_v1()
	powerpc/64s: Wire up cpu_show_spectre_v2()
	powerpc/pseries: Fix clearing of security feature flags
	powerpc: Move default security feature flags
	powerpc/pseries: Restore default security feature flags on setup
	powerpc/64s: Fix section mismatch warnings from setup_rfi_flush()
	powerpc/64s: Add support for a store forwarding barrier at kernel entry/exit
	MIPS: generic: Fix machine compatible matching
	mac80211: mesh: fix wrong mesh TTL offset calculation
	ARC: Fix malformed ARC_EMUL_UNALIGNED default
	ptr_ring: prevent integer overflow when calculating size
	arm64: dts: rockchip: fix rock64 gmac2io stability issues
	arm64: dts: rockchip: correct ep-gpios for rk3399-sapphire
	libata: Fix compile warning with ATA_DEBUG enabled
	selftests: sync: missing CFLAGS while compiling
	selftest/vDSO: fix O=
	selftests: pstore: Adding config fragment CONFIG_PSTORE_RAM=m
	selftests: memfd: add config fragment for fuse
	ARM: OMAP2+: timer: fix a kmemleak caused in omap_get_timer_dt
	ARM: OMAP3: Fix prm wake interrupt for resume
	ARM: OMAP2+: Fix sar_base inititalization for HS omaps
	ARM: OMAP1: clock: Fix debugfs_create_*() usage
	ibmvnic: Wait until reset is complete to set carrier on
	ibmvnic: Free RX socket buffer in case of adapter error
	ibmvnic: Clean RX pool buffers during device close
	tls: retrun the correct IV in getsockopt
	xhci: workaround for AMD Promontory disabled ports wakeup
	IB/uverbs: Fix method merging in uverbs_ioctl_merge
	IB/uverbs: Fix possible oops with duplicate ioctl attributes
	IB/uverbs: Fix unbalanced unlock on error path for rdma_explicit_destroy
	arm64: dts: rockchip: Fix DWMMC clocks
	ARM: dts: rockchip: Fix DWMMC clocks
	iwlwifi: mvm: fix security bug in PN checking
	iwlwifi: mvm: fix IBSS for devices that support station type API
	iwlwifi: mvm: always init rs with 20mhz bandwidth rates
	NFC: llcp: Limit size of SDP URI
	rxrpc: Work around usercopy check
	MD: Free bioset when md_run fails
	md: fix md_write_start() deadlock w/o metadata devices
	s390/dasd: fix handling of internal requests
	xfrm: do not call rcu_read_unlock when afinfo is NULL in xfrm_get_tos
	mac80211: round IEEE80211_TX_STATUS_HEADROOM up to multiple of 4
	mac80211: fix a possible leak of station stats
	mac80211: fix calling sleeping function in atomic context
	cfg80211: clear wep keys after disconnection
	mac80211: Do not disconnect on invalid operating class
	mac80211: Fix sending ADDBA response for an ongoing session
	gpu: ipu-v3: pre: fix device node leak in ipu_pre_lookup_by_phandle
	gpu: ipu-v3: prg: fix device node leak in ipu_prg_lookup_by_phandle
	md raid10: fix NULL deference in handle_write_completed()
	drm/exynos: g2d: use monotonic timestamps
	drm/exynos: fix comparison to bitshift when dealing with a mask
	drm/meson: fix vsync buffer update
	arm64: perf: correct PMUVer probing
	RDMA/bnxt_re: Unpin SQ and RQ memory if QP create fails
	RDMA/bnxt_re: Fix system crash during load/unload
	ibmvnic: Check for NULL skb's in NAPI poll routine
	net/mlx5e: Return error if prio is specified when offloading eswitch vlan push
	locking/xchg/alpha: Add unconditional memory barrier to cmpxchg()
	md: raid5: avoid string overflow warning
	virtio_net: fix XDP code path in receive_small()
	kernel/relay.c: limit kmalloc size to KMALLOC_MAX_SIZE
	bug.h: work around GCC PR82365 in BUG()
	selftests/memfd: add run_fuse_test.sh to TEST_FILES
	seccomp: add a selftest for get_metadata
	soc: imx: gpc: de-register power domains only if initialized
	powerpc/bpf/jit: Fix 32-bit JIT for seccomp_data access
	s390/cio: fix ccw_device_start_timeout API
	s390/cio: fix return code after missing interrupt
	s390/cio: clear timer when terminating driver I/O
	selftests/bpf/test_maps: exit child process without error in ENOMEM case
	PKCS#7: fix direct verification of SignerInfo signature
	arm64: dts: cavium: fix PCI bus dtc warnings
	nfs: system crashes after NFS4ERR_MOVED recovery
	ARM: OMAP: Fix dmtimer init for omap1
	smsc75xx: fix smsc75xx_set_features()
	regulatory: add NUL to request alpha2
	integrity/security: fix digsig.c build error with header file
	x86/intel_rdt: Fix incorrect returned value when creating rdgroup sub-directory in resctrl file system
	locking/xchg/alpha: Fix xchg() and cmpxchg() memory ordering bugs
	x86/topology: Update the 'cpu cores' field in /proc/cpuinfo correctly across CPU hotplug operations
	mac80211: drop frames with unexpected DS bits from fast-rx to slow path
	arm64: fix unwind_frame() for filtered out fn for function graph tracing
	macvlan: fix use-after-free in macvlan_common_newlink()
	KVM: nVMX: Don't halt vcpu when L1 is injecting events to L2
	kvm: fix warning for CONFIG_HAVE_KVM_EVENTFD builds
	ARM: dts: imx6dl: Include correct dtsi file for Engicam i.CoreM6 DualLite/Solo RQS
	fs: dcache: Avoid livelock between d_alloc_parallel and __d_add
	fs: dcache: Use READ_ONCE when accessing i_dir_seq
	md: fix a potential deadlock of raid5/raid10 reshape
	md/raid1: fix NULL pointer dereference
	batman-adv: fix packet checksum in receive path
	batman-adv: invalidate checksum on fragment reassembly
	netfilter: ipt_CLUSTERIP: put config struct if we can't increment ct refcount
	netfilter: ipt_CLUSTERIP: put config instead of freeing it
	netfilter: ebtables: convert BUG_ONs to WARN_ONs
	batman-adv: Ignore invalid batadv_iv_gw during netlink send
	batman-adv: Ignore invalid batadv_v_gw during netlink send
	batman-adv: Fix netlink dumping of BLA claims
	batman-adv: Fix netlink dumping of BLA backbones
	nvme-pci: Fix nvme queue cleanup if IRQ setup fails
	clocksource/drivers/fsl_ftm_timer: Fix error return checking
	libceph, ceph: avoid memory leak when specifying same option several times
	ceph: fix dentry leak when failing to init debugfs
	xen/pvcalls: fix null pointer dereference on map->sock
	ARM: orion5x: Revert commit 4904dbda41.
	qrtr: add MODULE_ALIAS macro to smd
	selftests/futex: Fix line continuation in Makefile
	r8152: fix tx packets accounting
	virtio-gpu: fix ioctl and expose the fixed status to userspace.
	dmaengine: rcar-dmac: fix max_chunk_size for R-Car Gen3
	bcache: fix kcrashes with fio in RAID5 backend dev
	ip_gre: fix IFLA_MTU ignored on NEWLINK
	ip6_tunnel: fix IFLA_MTU ignored on NEWLINK
	sit: fix IFLA_MTU ignored on NEWLINK
	nbd: fix return value in error handling path
	ARM: dts: NSP: Fix amount of RAM on BCM958625HR
	ARM: dts: bcm283x: Fix unit address of local_intc
	powerpc/boot: Fix random libfdt related build errors
	clocksource/drivers/mips-gic-timer: Use correct shift count to extract data
	gianfar: Fix Rx byte accounting for ndev stats
	net/tcp/illinois: replace broken algorithm reference link
	nvmet: fix PSDT field check in command format
	net/smc: use link_id of server in confirm link reply
	mlxsw: core: Fix flex keys scratchpad offset conflict
	mlxsw: spectrum: Treat IPv6 unregistered multicast as broadcast
	spectrum: Reference count VLAN entries
	ARC: mcip: halt GFRC counter when ARC cores halt
	ARC: mcip: update MCIP debug mask when the new cpu came online
	ARC: setup cpu possible mask according to possible-cpus dts property
	ipvs: remove IPS_NAT_MASK check to fix passive FTP
	IB/mlx: Set slid to zero in Ethernet completion struct
	RDMA/bnxt_re: Unconditionly fence non wire memory operations
	RDMA/bnxt_re: Fix incorrect DB offset calculation
	RDMA/bnxt_re: Fix the ib_reg failure cleanup
	xen/pirq: fix error path cleanup when binding MSIs
	drm/amd/amdgpu: Correct VRAM width for APUs with GMC9
	xfrm: Fix ESN sequence number handling for IPsec GSO packets.
	arm64: dts: rockchip: Fix rk3399-gru-* s2r (pinctrl hogs, wifi reset)
	drm/sun4i: Fix dclk_set_phase
	btrfs: use kvzalloc to allocate btrfs_fs_info
	Btrfs: send, fix issuing write op when processing hole in no data mode
	Btrfs: fix log replay failure after linking special file and fsync
	ceph: fix potential memory leak in init_caches()
	block: display the correct diskname for bio
	nvme-pci: Fix EEH failure on ppc
	nvme: pci: pass max vectors as num_possible_cpus() to pci_alloc_irq_vectors
	selftests/powerpc: Skip the subpage_prot tests if the syscall is unavailable
	net: ethtool: don't ignore return from driver get_fecparam method
	iwlwifi: mvm: fix TX of CCMP 256
	iwlwifi: mvm: Fix channel switch for count 0 and 1
	iwlwifi: mvm: fix assert 0x2B00 on older FWs
	iwlwifi: avoid collecting firmware dump if not loaded
	iwlwifi: mvm: fix "failed to remove key" message
	iwlwifi: mvm: Direct multicast frames to the correct station
	iwlwifi: mvm: Correctly set the tid for mcast queue
	rds: Incorrect reference counting in TCP socket creation
	watchdog: f71808e_wdt: Fix magic close handling
	watchdog: sbsa: use 32-bit read for WCV
	batman-adv: Fix multicast packet loss with a single WANT_ALL_IPV4/6 flag
	hv_netvsc: use napi_schedule_irqoff
	hv_netvsc: filter multicast/broadcast
	hv_netvsc: propagate rx filters to VF
	ARM: dts: rockchip: Add missing #sound-dai-cells on rk3288
	perf record: Fix crash in pipe mode
	e1000e: Fix check_for_link return value with autoneg off
	e1000e: allocate ring descriptors with dma_zalloc_coherent
	ia64/err-inject: Use get_user_pages_fast()
	RDMA/qedr: Fix kernel panic when running fio over NFSoRDMA
	RDMA/qedr: Fix iWARP write and send with immediate
	IB/mlx4: Fix corruption of RoCEv2 IPv4 GIDs
	IB/mlx4: Include GID type when deleting GIDs from HW table under RoCE
	IB/mlx5: Fix an error code in __mlx5_ib_modify_qp()
	fbdev: Fixing arbitrary kernel leak in case FBIOGETCMAP_SPARC in sbusfb_ioctl_helper().
	fsl/fman: avoid sleeping in atomic context while adding an address
	qed: Free RoCE ILT Memory on rmmod qedr
	net: qcom/emac: Use proper free methods during TX
	net: smsc911x: Fix unload crash when link is up
	IB/core: Fix possible crash to access NULL netdev
	cxgb4: do not set needs_free_netdev for mgmt dev's
	xen-blkfront: move negotiate_mq to cover all cases of new VBDs
	xen: xenbus: use put_device() instead of kfree()
	hv_netvsc: fix filter flags
	hv_netvsc: fix locking for rx_mode
	hv_netvsc: fix locking during VF setup
	ARM: davinci: fix the GPIO lookup for omapl138-hawk
	arm64: Relax ARM_SMCCC_ARCH_WORKAROUND_1 discovery
	selftests/vm/run_vmtests: adjust hugetlb size according to nr_cpus
	lib/test_kmod.c: fix limit check on number of test devices created
	dmaengine: mv_xor_v2: Fix clock resource by adding a register clock
	netfilter: ebtables: fix erroneous reject of last rule
	can: m_can: change comparison to bitshift when dealing with a mask
	can: m_can: select pinctrl state in each suspend/resume function
	bnxt_en: Check valid VNIC ID in bnxt_hwrm_vnic_set_tpa().
	workqueue: use put_device() instead of kfree()
	ipv4: lock mtu in fnhe when received PMTU < net.ipv4.route.min_pmtu
	sunvnet: does not support GSO for sctp
	KVM: arm/arm64: vgic: Add missing irq_lock to vgic_mmio_read_pending
	gpu: ipu-v3: prg: avoid possible array underflow
	drm/imx: move arming of the vblank event to atomic_flush
	drm/nouveau/bl: fix backlight regression
	xfrm: fix rcu_read_unlock usage in xfrm_local_error
	iwlwifi: mvm: set the correct tid when we flush the MCAST sta
	iwlwifi: mvm: Correctly set IGTK for AP
	iwlwifi: mvm: fix error checking for multi/broadcast sta
	net: Fix vlan untag for bridge and vlan_dev with reorder_hdr off
	vlan: Fix out of order vlan headers with reorder header off
	batman-adv: fix header size check in batadv_dbg_arp()
	net/sched: fix NULL dereference in the error path of tcf_sample_init()
	batman-adv: Fix skbuff rcsum on packet reroute
	vti4: Don't count header length twice on tunnel setup
	ip_tunnel: Clamp MTU to bounds on new link
	vti4: Don't override MTU passed on link creation via IFLA_MTU
	vti6: Fix dev->max_mtu setting
	iwlwifi: mvm: Increase session protection time after CS
	iwlwifi: mvm: clear tx queue id when unreserving aggregation queue
	iwlwifi: mvm: make sure internal station has a valid id
	iwlwifi: mvm: fix array out of bounds reference
	drm/tegra: Shutdown on driver unbind
	perf/cgroup: Fix child event counting bug
	brcmfmac: Fix check for ISO3166 code
	kbuild: make scripts/adjust_autoksyms.sh robust against timestamp races
	RDMA/ucma: Correct option size check using optlen
	RDMA/qedr: fix QP's ack timeout configuration
	RDMA/qedr: Fix rc initialization on CNQ allocation failure
	RDMA/qedr: Fix QP state initialization race
	net/sched: fix idr leak on the error path of tcf_bpf_init()
	net/sched: fix idr leak in the error path of tcf_simp_init()
	net/sched: fix idr leak in the error path of tcf_act_police_init()
	net/sched: fix idr leak in the error path of tcp_pedit_init()
	net/sched: fix idr leak in the error path of __tcf_ipt_init()
	net/sched: fix idr leak in the error path of tcf_skbmod_init()
	net: dsa: Fix functional dsa-loop dependency on FIXED_PHY
	drm/ast: Fixed 1280x800 Display Issue
	mm/mempolicy.c: avoid use uninitialized preferred_node
	mm, thp: do not cause memcg oom for thp
	xfrm: Fix transport mode skb control buffer usage.
	selftests: ftrace: Add probe event argument syntax testcase
	selftests: ftrace: Add a testcase for string type with kprobe_event
	selftests: ftrace: Add a testcase for probepoint
	drm/amdkfd: Fix scratch memory with HWS enabled
	batman-adv: fix multicast-via-unicast transmission with AP isolation
	batman-adv: fix packet loss for broadcasted DHCP packets to a server
	ARM: 8748/1: mm: Define vdso_start, vdso_end as array
	lan78xx: Set ASD in MAC_CR when EEE is enabled.
	net: qmi_wwan: add BroadMobi BM806U 2020:2033
	bonding: fix the err path for dev hwaddr sync in bond_enslave
	net: dsa: mt7530: fix module autoloading for OF platform drivers
	net/mlx5: Make eswitch support to depend on switchdev
	perf/x86/intel: Fix linear IP of PEBS real_ip on Haswell and later CPUs
	x86/alternatives: Fixup alternative_call_2
	llc: properly handle dev_queue_xmit() return value
	builddeb: Fix header package regarding dtc source links
	qede: Fix barrier usage after tx doorbell write.
	mm, slab: memcg_link the SLAB's kmem_cache
	mm/page_owner: fix recursion bug after changing skip entries
	mm/vmstat.c: fix vmstat_update() preemption BUG
	mm/kmemleak.c: wait for scan completion before disabling free
	hv_netvsc: enable multicast if necessary
	qede: Do not drop rx-checksum invalidated packets.
	net: Fix untag for vlan packets without ethernet header
	vlan: Fix vlan insertion for packets without ethernet header
	net: mvneta: fix enable of all initialized RXQs
	sh: fix debug trap failure to process signals before return to user
	firmware: dmi_scan: Fix UUID length safety check
	nvme: don't send keep-alives to the discovery controller
	Btrfs: clean up resources during umount after trans is aborted
	Btrfs: fix loss of prealloc extents past i_size after fsync log replay
	x86/pgtable: Don't set huge PUD/PMD on non-leaf entries
	x86/mm: Do not forbid _PAGE_RW before init for __ro_after_init
	fs/proc/proc_sysctl.c: fix potential page fault while unregistering sysctl table
	swap: divide-by-zero when zero length swap file on ssd
	z3fold: fix memory leak
	sr: get/drop reference to device in revalidate and check_events
	Force log to disk before reading the AGF during a fstrim
	cpufreq: CPPC: Initialize shared perf capabilities of CPUs
	powerpc/fscr: Enable interrupts earlier before calling get_user()
	perf tools: Fix perf builds with clang support
	perf clang: Add support for recent clang versions
	dp83640: Ensure against premature access to PHY registers after reset
	ibmvnic: Zero used TX descriptor counter on reset
	mm/ksm: fix interaction with THP
	mm: fix races between address_space dereference and free in page_evicatable
	mm: thp: fix potential clearing to referenced flag in page_idle_clear_pte_refs_one()
	Btrfs: bail out on error during replay_dir_deletes
	Btrfs: fix NULL pointer dereference in log_dir_items
	btrfs: Fix possible softlock on single core machines
	IB/rxe: Fix for oops in rxe_register_device on ppc64le arch
	ocfs2/dlm: don't handle migrate lockres if already in shutdown
	powerpc/64s/idle: Fix restore of AMOR on POWER9 after deep sleep
	sched/rt: Fix rq->clock_update_flags < RQCF_ACT_SKIP warning
	x86/mm: Fix bogus warning during EFI bootup, use boot_cpu_has() instead of this_cpu_has() in build_cr3_noflush()
	KVM: VMX: raise internal error for exception during invalid protected mode state
	lan78xx: Connect phy early
	fscache: Fix hanging wait on page discarded by writeback
	sparc64: Make atomic_xchg() an inline function rather than a macro.
	net: bgmac: Fix endian access in bgmac_dma_tx_ring_free()
	net: bgmac: Correctly annotate register space
	powerpc/64s: sreset panic if there is no debugger or crash dump handlers
	btrfs: tests/qgroup: Fix wrong tree backref level
	Btrfs: fix copy_items() return value when logging an inode
	btrfs: fix lockdep splat in btrfs_alloc_subvolume_writers
	btrfs: qgroup: Fix root item corruption when multiple same source snapshots are created with quota enabled
	rxrpc: Fix Tx ring annotation after initial Tx failure
	rxrpc: Don't treat call aborts as conn aborts
	xen/acpi: off by one in read_acpi_id()
	drivers: macintosh: rack-meter: really fix bogus memsets
	ACPI: acpi_pad: Fix memory leak in power saving threads
	powerpc/mpic: Check if cpu_possible() in mpic_physmask()
	ieee802154: ca8210: fix uninitialised data read
	ath10k: advertize beacon_int_min_gcd
	iommu/amd: Take into account that alloc_dev_data() may return NULL
	intel_th: Use correct method of finding hub
	m68k: set dma and coherent masks for platform FEC ethernets
	iwlwifi: mvm: check if mac80211_queue is valid in iwl_mvm_disable_txq
	parisc/pci: Switch LBA PCI bus from Hard Fail to Soft Fail mode
	hwmon: (nct6775) Fix writing pwmX_mode
	powerpc/perf: Prevent kernel address leak to userspace via BHRB buffer
	powerpc/perf: Fix kernel address leak via sampling registers
	rsi: fix kernel panic observed on 64bit machine
	tools/thermal: tmon: fix for segfault
	selftests: Print the test we're running to /dev/kmsg
	net/mlx5: Protect from command bit overflow
	watchdog: davinci_wdt: fix error handling in davinci_wdt_probe()
	ath10k: Fix kernel panic while using worker (ath10k_sta_rc_update_wk)
	nvme-pci: disable APST for Samsung NVMe SSD 960 EVO + ASUS PRIME Z370-A
	ath9k: fix crash in spectral scan
	cxgb4: Setup FW queues before registering netdev
	ima: Fix Kconfig to select TPM 2.0 CRB interface
	ima: Fallback to the builtin hash algorithm
	watchdog: aspeed: Allow configuring for alternate boot
	virtio-net: Fix operstate for virtio when no VIRTIO_NET_F_STATUS
	arm: dts: socfpga: fix GIC PPI warning
	ext4: don't complain about incorrect features when probing
	drm/vmwgfx: Unpin the screen object backup buffer when not used
	iommu/mediatek: Fix protect memory setting
	cpufreq: cppc_cpufreq: Fix cppc_cpufreq_init() failure path
	IB/mlx5: Set the default active rate and width to QDR and 4X
	zorro: Set up z->dev.dma_mask for the DMA API
	bcache: quit dc->writeback_thread when BCACHE_DEV_DETACHING is set
	remoteproc: imx_rproc: Fix an error handling path in 'imx_rproc_probe()'
	dt-bindings: add device tree binding for Allwinner H6 main CCU
	ACPICA: Events: add a return on failure from acpi_hw_register_read
	ACPICA: Fix memory leak on unusual memory leak
	ACPICA: acpi: acpica: fix acpi operand cache leak in nseval.c
	cxgb4: Fix queue free path of ULD drivers
	i2c: mv64xxx: Apply errata delay only in standard mode
	KVM: lapic: stop advertising DIRECTED_EOI when in-kernel IOAPIC is in use
	perf top: Fix top.call-graph config option reading
	perf stat: Fix core dump when flag T is used
	IB/core: Honor port_num while resolving GID for IB link layer
	drm/amdkfd: add missing include of mm.h
	coresight: Use %px to print pcsr instead of %p
	regulator: gpio: Fix some error handling paths in 'gpio_regulator_probe()'
	spi: bcm-qspi: fIX some error handling paths
	net/smc: pay attention to MAX_ORDER for CQ entries
	MIPS: ath79: Fix AR724X_PLL_REG_PCIE_CONFIG offset
	PCI: Restore config space on runtime resume despite being unbound
	watchdog: dw: RMW the control register
	watchdog: aspeed: Fix translation of reset mode to ctrl register
	ipmi_ssif: Fix kernel panic at msg_done_handler
	drm/meson: Fix some error handling paths in 'meson_drv_bind_master()'
	drm/meson: Fix an un-handled error path in 'meson_drv_bind_master()'
	powerpc: Add missing prototype for arch_irq_work_raise()
	powerpc/powernv/npu: Fix deadlock in mmio_invalidate()
	cxl: Check if PSL data-cache is available before issue flush request
	f2fs: fix to set KEEP_SIZE bit in f2fs_zero_range
	f2fs: fix to clear CP_TRIMMED_FLAG
	f2fs: fix to check extent cache in f2fs_drop_extent_tree
	perf/core: Fix installing cgroup events on CPU
	max17042: propagate of_node to power supply device
	perf/core: Fix perf_output_read_group()
	drm/panel: simple: Fix the bus format for the Ontat panel
	hwmon: (pmbus/max8688) Accept negative page register values
	hwmon: (pmbus/adm1275) Accept negative page register values
	perf/x86/intel: Properly save/restore the PMU state in the NMI handler
	cdrom: do not call check_disk_change() inside cdrom_open()
	efi/arm*: Only register page tables when they exist
	perf/x86/intel: Fix large period handling on Broadwell CPUs
	perf/x86/intel: Fix event update for auto-reload
	arm64: dts: qcom: Fix SPI5 config on MSM8996
	soc: qcom: wcnss_ctrl: Fix increment in NV upload
	gfs2: Fix fallocate chunk size
	x86/devicetree: Initialize device tree before using it
	x86/devicetree: Fix device IRQ settings in DT
	phy: rockchip-emmc: retry calpad busy trimming
	ALSA: vmaster: Propagate slave error
	phy: qcom-qmp: Fix phy pipe clock gating
	drm/bridge: sii902x: Retry status read after DDI I2C
	tools: hv: fix compiler warnings about major/target_fname
	block: null_blk: fix 'Invalid parameters' when loading module
	dmaengine: pl330: fix a race condition in case of threaded irqs
	dmaengine: rcar-dmac: Check the done lists in rcar_dmac_chan_get_residue()
	enic: enable rq before updating rq descriptors
	watchdog: asm9260_wdt: fix error handling in asm9260_wdt_probe()
	hwrng: stm32 - add reset during probe
	pinctrl: devicetree: Fix dt_to_map_one_config handling of hogs
	pinctrl: artpec6: dt: add missing pin group uart5nocts
	vfio-ccw: fence off transport mode
	dmaengine: qcom: bam_dma: get num-channels and num-ees from dt
	drm: omapdrm: dss: Move initialization code from component bind to probe
	ARM: dts: dra71-evm: Correct evm_sd regulator max voltage
	drm/amdgpu: disable GFX ring and disable PQ wptr in hw_fini
	drm/amdgpu: adjust timeout for ib_ring_tests(v2)
	net: stmmac: ensure that the device has released ownership before reading data
	net: stmmac: ensure that the MSS desc is the last desc to set the own bit
	cpufreq: Reorder cpufreq_online() error code path
	dpaa_eth: fix SG mapping
	PCI: Add function 1 DMA alias quirk for Marvell 88SE9220
	udf: Provide saner default for invalid uid / gid
	ixgbe: prevent ptp_rx_hang from running when in FILTER_ALL mode
	sh_eth: fix TSU init on SH7734/R8A7740
	power: supply: ltc2941-battery-gauge: Fix temperature units
	ARM: dts: bcm283x: Fix probing of bcm2835-i2s
	ARM: dts: bcm283x: Fix pin function of JTAG pins
	PCMCIA / PM: Avoid noirq suspend aborts during suspend-to-idle
	audit: return on memory error to avoid null pointer dereference
	net: stmmac: call correct function in stmmac_mac_config_rx_queues_routing()
	rcu: Call touch_nmi_watchdog() while printing stall warnings
	pinctrl: sh-pfc: r8a7796: Fix MOD_SEL register pin assignment for SSI pins group
	dpaa_eth: fix pause capability advertisement logic
	MIPS: Octeon: Fix logging messages with spurious periods after newlines
	drm/rockchip: Respect page offset for PRIME mmap calls
	x86/apic: Set up through-local-APIC mode on the boot CPU if 'noapic' specified
	perf test: Fix test case inet_pton to accept inlines.
	perf report: Fix wrong jump arrow
	perf tests: Use arch__compare_symbol_names to compare symbols
	perf report: Fix memory corruption in --branch-history mode --branch-history
	perf tests: Fix dwarf unwind for stripped binaries
	selftests/net: fixes psock_fanout eBPF test case
	netlabel: If PF_INET6, check sk_buff ip header version
	drm: rcar-du: lvds: Fix LVDS startup on R-Car Gen3
	drm: rcar-du: lvds: Fix LVDS startup on R-Car Gen2
	ARM: dts: at91: tse850: use the correct compatible for the eeprom
	regmap: Correct comparison in regmap_cached
	i40e: Add delay after EMP reset for firmware to recover
	ARM: dts: imx7d: cl-som-imx7: fix pinctrl_enet
	ARM: dts: porter: Fix HDMI output routing
	regulator: of: Add a missing 'of_node_put()' in an error handling path of 'of_regulator_match()'
	pinctrl: msm: Use dynamic GPIO numbering
	pinctrl: mcp23s08: spi: Fix regmap debugfs entries
	kdb: make "mdr" command repeat
	drm/vmwgfx: Set dmabuf_size when vmw_dmabuf_init is successful
	Linux 4.14.45

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2018-05-30 13:17:17 +02:00
Gustavo A. R. Silva
058dfcf9c2 kernel/sys.c: fix potential Spectre v1 issue
commit 23d6aef74da86a33fa6bb75f79565e0a16ee97c2 upstream.

`resource' can be controlled by user-space, hence leading to a potential
exploitation of the Spectre variant 1 vulnerability.

This issue was detected with the help of Smatch:

  kernel/sys.c:1474 __do_compat_sys_old_getrlimit() warn: potential spectre issue 'get_current()->signal->rlim' (local cap)
  kernel/sys.c:1455 __do_sys_old_getrlimit() warn: potential spectre issue 'get_current()->signal->rlim' (local cap)

Fix this by sanitizing *resource* before using it to index
current->signal->rlim

Notice that given that speculation windows are large, the policy is to
kill the speculation on the first load and not worry if it can be
completed with a dependent load/store [1].

[1] https://marc.info/?l=linux-kernel&m=152449131114778&w=2

Link: http://lkml.kernel.org/r/20180515030038.GA11822@embeddedor.com
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-05-30 07:51:50 +02:00
Greg Kroah-Hartman
4c9e0a9b25 Merge 4.14.43 into android-4.14
Changes in 4.14.43
	usbip: usbip_host: refine probe and disconnect debug msgs to be useful
	usbip: usbip_host: delete device from busid_table after rebind
	usbip: usbip_host: run rebind from exit when module is removed
	usbip: usbip_host: fix NULL-ptr deref and use-after-free errors
	usbip: usbip_host: fix bad unlock balance during stub_probe()
	ALSA: usb: mixer: volume quirk for CM102-A+/102S+
	ALSA: hda: Add Lenovo C50 All in one to the power_save blacklist
	ALSA: control: fix a redundant-copy issue
	spi: pxa2xx: Allow 64-bit DMA
	spi: bcm-qspi: Avoid setting MSPI_CDRAM_PCS for spi-nor master
	spi: bcm-qspi: Always read and set BSPI_MAST_N_BOOT_CTRL
	KVM: arm/arm64: VGIC/ITS save/restore: protect kvm_read_guest() calls
	KVM: arm/arm64: VGIC/ITS: protect kvm_read_guest() calls with SRCU lock
	powerpc: Don't preempt_disable() in show_cpuinfo()
	vfio: ccw: fix cleanup if cp_prefetch fails
	tracing/x86/xen: Remove zero data size trace events trace_xen_mmu_flush_tlb{_all}
	tee: shm: fix use-after-free via temporarily dropped reference
	netfilter: nf_tables: free set name in error path
	netfilter: nf_tables: can't fail after linking rule into active rule list
	netfilter: nf_socket: Fix out of bounds access in nf_sk_lookup_slow_v{4,6}
	i2c: designware: fix poll-after-enable regression
	powerpc/powernv: Fix NVRAM sleep in invalid context when crashing
	drm: Match sysfs name in link removal to link creation
	lib/test_bitmap.c: fix bitmap optimisation tests to report errors correctly
	radix tree: fix multi-order iteration race
	mm: don't allow deferred pages with NEED_PER_CPU_KM
	drm/i915/gen9: Add WaClearHIZ_WM_CHICKEN3 for bxt and glk
	s390/qdio: fix access to uninitialized qdio_q fields
	s390/cpum_sf: ensure sample frequency of perf event attributes is non-zero
	s390/qdio: don't release memory in qdio_setup_irq()
	s390: remove indirect branch from do_softirq_own_stack
	x86/pkeys: Override pkey when moving away from PROT_EXEC
	x86/pkeys: Do not special case protection key 0
	efi: Avoid potential crashes, fix the 'struct efi_pci_io_protocol_32' definition for mixed mode
	ARM: 8771/1: kprobes: Prohibit kprobes on do_undefinstr
	x86/mm: Drop TS_COMPAT on 64-bit exec() syscall
	tick/broadcast: Use for_each_cpu() specially on UP kernels
	ARM: 8769/1: kprobes: Fix to use get_kprobe_ctlblk after irq-disabed
	ARM: 8770/1: kprobes: Prohibit probing on optimized_callback
	ARM: 8772/1: kprobes: Prohibit kprobes on get_user functions
	Btrfs: fix xattr loss after power failure
	Btrfs: send, fix invalid access to commit roots due to concurrent snapshotting
	btrfs: property: Set incompat flag if lzo/zstd compression is set
	btrfs: fix crash when trying to resume balance without the resume flag
	btrfs: Split btrfs_del_delalloc_inode into 2 functions
	btrfs: Fix delalloc inodes invalidation during transaction abort
	btrfs: fix reading stale metadata blocks after degraded raid1 mounts
	x86/nospec: Simplify alternative_msr_write()
	x86/bugs: Concentrate bug detection into a separate function
	x86/bugs: Concentrate bug reporting into a separate function
	x86/bugs: Read SPEC_CTRL MSR during boot and re-use reserved bits
	x86/bugs, KVM: Support the combination of guest and host IBRS
	x86/bugs: Expose /sys/../spec_store_bypass
	x86/cpufeatures: Add X86_FEATURE_RDS
	x86/bugs: Provide boot parameters for the spec_store_bypass_disable mitigation
	x86/bugs/intel: Set proper CPU features and setup RDS
	x86/bugs: Whitelist allowed SPEC_CTRL MSR values
	x86/bugs/AMD: Add support to disable RDS on Fam[15,16,17]h if requested
	x86/KVM/VMX: Expose SPEC_CTRL Bit(2) to the guest
	x86/speculation: Create spec-ctrl.h to avoid include hell
	prctl: Add speculation control prctls
	x86/process: Allow runtime control of Speculative Store Bypass
	x86/speculation: Add prctl for Speculative Store Bypass mitigation
	nospec: Allow getting/setting on non-current task
	proc: Provide details on speculation flaw mitigations
	seccomp: Enable speculation flaw mitigations
	x86/bugs: Make boot modes __ro_after_init
	prctl: Add force disable speculation
	seccomp: Use PR_SPEC_FORCE_DISABLE
	seccomp: Add filter flag to opt-out of SSB mitigation
	seccomp: Move speculation migitation control to arch code
	x86/speculation: Make "seccomp" the default mode for Speculative Store Bypass
	x86/bugs: Rename _RDS to _SSBD
	proc: Use underscores for SSBD in 'status'
	Documentation/spec_ctrl: Do some minor cleanups
	x86/bugs: Fix __ssb_select_mitigation() return type
	x86/bugs: Make cpu_show_common() static
	x86/bugs: Fix the parameters alignment and missing void
	x86/cpu: Make alternative_msr_write work for 32-bit code
	KVM: SVM: Move spec control call after restore of GS
	x86/speculation: Use synthetic bits for IBRS/IBPB/STIBP
	x86/cpufeatures: Disentangle MSR_SPEC_CTRL enumeration from IBRS
	x86/cpufeatures: Disentangle SSBD enumeration
	x86/cpufeatures: Add FEATURE_ZEN
	x86/speculation: Handle HT correctly on AMD
	x86/bugs, KVM: Extend speculation control for VIRT_SPEC_CTRL
	x86/speculation: Add virtualized speculative store bypass disable support
	x86/speculation: Rework speculative_store_bypass_update()
	x86/bugs: Unify x86_spec_ctrl_{set_guest,restore_host}
	x86/bugs: Expose x86_spec_ctrl_base directly
	x86/bugs: Remove x86_spec_ctrl_set()
	x86/bugs: Rework spec_ctrl base and mask logic
	x86/speculation, KVM: Implement support for VIRT_SPEC_CTRL/LS_CFG
	KVM: SVM: Implement VIRT_SPEC_CTRL support for SSBD
	x86/bugs: Rename SSBD_NO to SSB_NO
	Linux 4.14.43

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2018-05-22 20:17:10 +02:00
Kees Cook
7d1254a148 nospec: Allow getting/setting on non-current task
commit 7bbf1373e228840bb0295a2ca26d548ef37f448e upstream

Adjust arch_prctl_get/set_spec_ctrl() to operate on tasks other than
current.

This is needed both for /proc/$pid/status queries and for seccomp (since
thread-syncing can trigger seccomp in non-current threads).

Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-05-22 18:54:03 +02:00
Thomas Gleixner
33f6a06810 prctl: Add speculation control prctls
commit b617cfc858161140d69cc0b5cc211996b557a1c7 upstream

Add two new prctls to control aspects of speculation related vulnerabilites
and their mitigations to provide finer grained control over performance
impacting mitigations.

PR_GET_SPECULATION_CTRL returns the state of the speculation misfeature
which is selected with arg2 of prctl(2). The return value uses bit 0-2 with
the following meaning:

Bit  Define           Description
0    PR_SPEC_PRCTL    Mitigation can be controlled per task by
                      PR_SET_SPECULATION_CTRL
1    PR_SPEC_ENABLE   The speculation feature is enabled, mitigation is
                      disabled
2    PR_SPEC_DISABLE  The speculation feature is disabled, mitigation is
                      enabled

If all bits are 0 the CPU is not affected by the speculation misfeature.

If PR_SPEC_PRCTL is set, then the per task control of the mitigation is
available. If not set, prctl(PR_SET_SPECULATION_CTRL) for the speculation
misfeature will fail.

PR_SET_SPECULATION_CTRL allows to control the speculation misfeature, which
is selected by arg2 of prctl(2) per task. arg3 is used to hand in the
control value, i.e. either PR_SPEC_ENABLE or PR_SPEC_DISABLE.

The common return values are:

EINVAL  prctl is not implemented by the architecture or the unused prctl()
        arguments are not 0
ENODEV  arg2 is selecting a not supported speculation misfeature

PR_SET_SPECULATION_CTRL has these additional return values:

ERANGE  arg3 is incorrect, i.e. it's not either PR_SPEC_ENABLE or PR_SPEC_DISABLE
ENXIO   prctl control of the selected speculation misfeature is disabled

The first supported controlable speculation misfeature is
PR_SPEC_STORE_BYPASS. Add the define so this can be shared between
architectures.

Based on an initial patch from Tim Chen and mostly rewritten.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-05-22 18:54:03 +02:00
Colin Cross
8392add7be ANDROID: mm: add a field to store names for private anonymous memory
Userspace processes often have multiple allocators that each do
anonymous mmaps to get memory.  When examining memory usage of
individual processes or systems as a whole, it is useful to be
able to break down the various heaps that were allocated by
each layer and examine their size, RSS, and physical memory
usage.

This patch adds a user pointer to the shared union in
vm_area_struct that points to a null terminated string inside
the user process containing a name for the vma.  vmas that
point to the same address will be merged, but vmas that
point to equivalent strings at different addresses will
not be merged.

Userspace can set the name for a region of memory by calling
prctl(PR_SET_VMA, PR_SET_VMA_ANON_NAME, start, len, (unsigned long)name);
Setting the name to NULL clears it.

The names of named anonymous vmas are shown in /proc/pid/maps
as [anon:<name>] and in /proc/pid/smaps in a new "Name" field
that is only present for named vmas.  If the userspace pointer
is no longer valid all or part of the name will be replaced
with "<fault>".

The idea to store a userspace pointer to reduce the complexity
within mm (at the expense of the complexity of reading
/proc/pid/mem) came from Dave Hansen.  This results in no
runtime overhead in the mm subsystem other than comparing
the anon_name pointers when considering vma merging.  The pointer
is stored in a union with fieds that are only used on file-backed
mappings, so it does not increase memory usage.

Includes fix from Jed Davis <jld@mozilla.com> for typo in
prctl_set_vma_anon_name, which could attempt to set the name
across two vmas at the same time due to a typo, which might
corrupt the vma list.  Fix it to use tmp instead of end to limit
the name setting to a single vma at a time.

Change-Id: I9aa7b6b5ef536cd780599ba4e2fba8ceebe8b59f
Signed-off-by: Dmitry Shmidt <dimitrysh@google.com>

[AmitP: Fix get_user_pages_remote() call to align with upstream commit
        5b56d49fc3 ("mm: add locked parameter to get_user_pages_remote()")]
Signed-off-by: Amit Pundir <amit.pundir@linaro.org>
2017-12-18 21:11:22 +05:30
Greg Kroah-Hartman
b24413180f License cleanup: add SPDX GPL-2.0 license identifier to files with no license
Many source files in the tree are missing licensing information, which
makes it harder for compliance tools to determine the correct license.

By default all files without license information are under the default
license of the kernel, which is GPL version 2.

Update the files which contain no license information with the 'GPL-2.0'
SPDX license identifier.  The SPDX identifier is a legally binding
shorthand, which can be used instead of the full boiler plate text.

This patch is based on work done by Thomas Gleixner and Kate Stewart and
Philippe Ombredanne.

How this work was done:

Patches were generated and checked against linux-4.14-rc6 for a subset of
the use cases:
 - file had no licensing information it it.
 - file was a */uapi/* one with no licensing information in it,
 - file was a */uapi/* one with existing licensing information,

Further patches will be generated in subsequent months to fix up cases
where non-standard license headers were used, and references to license
had to be inferred by heuristics based on keywords.

The analysis to determine which SPDX License Identifier to be applied to
a file was done in a spreadsheet of side by side results from of the
output of two independent scanners (ScanCode & Windriver) producing SPDX
tag:value files created by Philippe Ombredanne.  Philippe prepared the
base worksheet, and did an initial spot review of a few 1000 files.

The 4.13 kernel was the starting point of the analysis with 60,537 files
assessed.  Kate Stewart did a file by file comparison of the scanner
results in the spreadsheet to determine which SPDX license identifier(s)
to be applied to the file. She confirmed any determination that was not
immediately clear with lawyers working with the Linux Foundation.

Criteria used to select files for SPDX license identifier tagging was:
 - Files considered eligible had to be source code files.
 - Make and config files were included as candidates if they contained >5
   lines of source
 - File already had some variant of a license header in it (even if <5
   lines).

All documentation files were explicitly excluded.

The following heuristics were used to determine which SPDX license
identifiers to apply.

 - when both scanners couldn't find any license traces, file was
   considered to have no license information in it, and the top level
   COPYING file license applied.

   For non */uapi/* files that summary was:

   SPDX license identifier                            # files
   ---------------------------------------------------|-------
   GPL-2.0                                              11139

   and resulted in the first patch in this series.

   If that file was a */uapi/* path one, it was "GPL-2.0 WITH
   Linux-syscall-note" otherwise it was "GPL-2.0".  Results of that was:

   SPDX license identifier                            # files
   ---------------------------------------------------|-------
   GPL-2.0 WITH Linux-syscall-note                        930

   and resulted in the second patch in this series.

 - if a file had some form of licensing information in it, and was one
   of the */uapi/* ones, it was denoted with the Linux-syscall-note if
   any GPL family license was found in the file or had no licensing in
   it (per prior point).  Results summary:

   SPDX license identifier                            # files
   ---------------------------------------------------|------
   GPL-2.0 WITH Linux-syscall-note                       270
   GPL-2.0+ WITH Linux-syscall-note                      169
   ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause)    21
   ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause)    17
   LGPL-2.1+ WITH Linux-syscall-note                      15
   GPL-1.0+ WITH Linux-syscall-note                       14
   ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause)    5
   LGPL-2.0+ WITH Linux-syscall-note                       4
   LGPL-2.1 WITH Linux-syscall-note                        3
   ((GPL-2.0 WITH Linux-syscall-note) OR MIT)              3
   ((GPL-2.0 WITH Linux-syscall-note) AND MIT)             1

   and that resulted in the third patch in this series.

 - when the two scanners agreed on the detected license(s), that became
   the concluded license(s).

 - when there was disagreement between the two scanners (one detected a
   license but the other didn't, or they both detected different
   licenses) a manual inspection of the file occurred.

 - In most cases a manual inspection of the information in the file
   resulted in a clear resolution of the license that should apply (and
   which scanner probably needed to revisit its heuristics).

 - When it was not immediately clear, the license identifier was
   confirmed with lawyers working with the Linux Foundation.

 - If there was any question as to the appropriate license identifier,
   the file was flagged for further research and to be revisited later
   in time.

In total, over 70 hours of logged manual review was done on the
spreadsheet to determine the SPDX license identifiers to apply to the
source files by Kate, Philippe, Thomas and, in some cases, confirmation
by lawyers working with the Linux Foundation.

Kate also obtained a third independent scan of the 4.13 code base from
FOSSology, and compared selected files where the other two scanners
disagreed against that SPDX file, to see if there was new insights.  The
Windriver scanner is based on an older version of FOSSology in part, so
they are related.

Thomas did random spot checks in about 500 files from the spreadsheets
for the uapi headers and agreed with SPDX license identifier in the
files he inspected. For the non-uapi files Thomas did random spot checks
in about 15000 files.

In initial set of patches against 4.14-rc6, 3 files were found to have
copy/paste license identifier errors, and have been fixed to reflect the
correct identifier.

Additionally Philippe spent 10 hours this week doing a detailed manual
inspection and review of the 12,461 patched files from the initial patch
version early this week with:
 - a full scancode scan run, collecting the matched texts, detected
   license ids and scores
 - reviewing anything where there was a license detected (about 500+
   files) to ensure that the applied SPDX license was correct
 - reviewing anything where there was no detection but the patch license
   was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied
   SPDX license was correct

This produced a worksheet with 20 files needing minor correction.  This
worksheet was then exported into 3 different .csv files for the
different types of files to be modified.

These .csv files were then reviewed by Greg.  Thomas wrote a script to
parse the csv files and add the proper SPDX tag to the file, in the
format that the file expected.  This script was further refined by Greg
based on the output to detect more types of files automatically and to
distinguish between header and source .c files (which need different
comment types.)  Finally Greg ran the script using the .csv files to
generate the patches.

Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-02 11:10:55 +01:00
Kirill Tkhai
4d28df6152 prctl: Allow local CAP_SYS_ADMIN changing exe_file
During checkpointing and restore of userspace tasks
we bumped into the situation, that it's not possible
to restore the tasks, which user namespace does not
have uid 0 or gid 0 mapped.

People create user namespace mappings like they want,
and there is no a limitation on obligatory uid and gid
"must be mapped". So, if there is no uid 0 or gid 0
in the mapping, it's impossible to restore mm->exe_file
of the processes belonging to this user namespace.

Also, there is no a workaround. It's impossible
to create a temporary uid/gid mapping, because
only one write to /proc/[pid]/uid_map and gid_map
is allowed during a namespace lifetime.
If there is an entry, then no more mapings can't be
written. If there isn't an entry, we can't write
there too, otherwise user task won't be able
to do that in the future.

The patch changes the check, and looks for CAP_SYS_ADMIN
instead of zero uid and gid. This allows to restore
a task independently of its user namespace mappings.

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
CC: Andrew Morton <akpm@linux-foundation.org>
CC: Serge Hallyn <serge@hallyn.com>
CC: "Eric W. Biederman" <ebiederm@xmission.com>
CC: Oleg Nesterov <oleg@redhat.com>
CC: Michal Hocko <mhocko@suse.com>
CC: Andrei Vagin <avagin@openvz.org>
CC: Cyrill Gorcunov <gorcunov@openvz.org>
CC: Stanislav Kinsburskiy <skinsbursky@virtuozzo.com>
CC: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
Reviewed-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
2017-07-20 07:46:07 -05:00
Al Viro
58c7ffc074 fix a braino in compat_sys_getrlimit()
Reported-and-tested-by: Meelis Roos <mroos@linux.ee>
Fixes: commit d9e968cb9f "getrlimit()/setrlimit(): move compat to native"
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-07-12 09:15:00 -07:00
Michal Hocko
1860033237 mm: make PR_SET_THP_DISABLE immediately active
PR_SET_THP_DISABLE has a rather subtle semantic.  It doesn't affect any
existing mapping because it only updated mm->def_flags which is a
template for new mappings.

The mappings created after prctl(PR_SET_THP_DISABLE) have VM_NOHUGEPAGE
flag set.  This can be quite surprising for all those applications which
do not do prctl(); fork() & exec() and want to control their own THP
behavior.

Another usecase when the immediate semantic of the prctl might be useful
is a combination of pre- and post-copy migration of containers with
CRIU.  In this case CRIU populates a part of a memory region with data
that was saved during the pre-copy stage.  Afterwards, the region is
registered with userfaultfd and CRIU expects to get page faults for the
parts of the region that were not yet populated.  However, khugepaged
collapses the pages and the expected page faults do not occur.

In more general case, the prctl(PR_SET_THP_DISABLE) could be used as a
temporary mechanism for enabling/disabling THP process wide.

Implementation wise, a new MMF_DISABLE_THP flag is added.  This flag is
tested when decision whether to use huge pages is taken either during
page fault of at the time of THP collapse.

It should be noted, that the new implementation makes PR_SET_THP_DISABLE
master override to any per-VMA setting, which was not the case
previously.

Fixes: a0715cc226 ("mm, thp: add VM_INIT_DEF_MASK and PRCTL_THP_DISABLE")
Link: http://lkml.kernel.org/r/1496415802-30944-1-git-send-email-rppt@linux.vnet.ibm.com
Signed-off-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Pavel Emelyanov <xemul@virtuozzo.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-07-10 16:32:31 -07:00
Linus Torvalds
c856863988 Merge branch 'misc.compat' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull misc compat stuff updates from Al Viro:
 "This part is basically untangling various compat stuff. Compat
  syscalls moved to their native counterparts, getting rid of quite a
  bit of double-copying and/or set_fs() uses. A lot of field-by-field
  copyin/copyout killed off.

   - kernel/compat.c is much closer to containing just the
     copyin/copyout of compat structs. Not all compat syscalls are gone
     from it yet, but it's getting there.

   - ipc/compat_mq.c killed off completely.

   - block/compat_ioctl.c cleaned up; floppy compat ioctls moved to
     drivers/block/floppy.c where they belong. Yes, there are several
     drivers that implement some of the same ioctls. Some are m68k and
     one is 32bit-only pmac. drivers/block/floppy.c is the only one in
     that bunch that can be built on biarch"

* 'misc.compat' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  mqueue: move compat syscalls to native ones
  usbdevfs: get rid of field-by-field copyin
  compat_hdio_ioctl: get rid of set_fs()
  take floppy compat ioctls to sodding floppy.c
  ipmi: get rid of field-by-field __get_user()
  ipmi: get COMPAT_IPMICTL_RECEIVE_MSG in sync with the native one
  rt_sigtimedwait(): move compat to native
  select: switch compat_{get,put}_fd_set() to compat_{get,put}_bitmap()
  put_compat_rusage(): switch to copy_to_user()
  sigpending(): move compat to native
  getrlimit()/setrlimit(): move compat to native
  times(2): move compat to native
  compat_{get,put}_bitmap(): use unsafe_{get,put}_user()
  fb_get_fscreeninfo(): don't bother with do_fb_ioctl()
  do_sigaltstack(): lift copying to/from userland into callers
  take compat_sys_old_getrlimit() to native syscall
  trim __ARCH_WANT_SYS_OLD_GETRLIMIT
2017-07-06 20:57:13 -07:00
Al Viro
d9e968cb9f getrlimit()/setrlimit(): move compat to native
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2017-06-09 23:51:33 -04:00
Al Viro
ca2406ed58 times(2): move compat to native
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2017-06-09 23:51:28 -04:00
Al Viro
613763a1f0 take compat_sys_old_getrlimit() to native syscall
... and sanitize the ifdefs in there

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2017-05-27 15:38:06 -04:00
Al Viro
ce72a16fa7 wait4(2)/waitid(2): separate copying rusage to userland
New helpers: kernel_waitid() and kernel_wait4().  sys_waitid(),
sys_wait4() and their compat variants switched to those.  Copying
struct rusage to userland is left to syscall itself.  For
compat_sys_wait4() that eliminates the use of set_fs() completely.
For compat_sys_waitid() it's still needed (for siginfo handling);
that will change shortly.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2017-05-21 13:11:00 -04:00
Linus Torvalds
e579dde654 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace
Pull namespace updates from Eric Biederman:
 "This is a set of small fixes that were mostly stumbled over during
  more significant development. This proc fix and the fix to
  posix-timers are the most significant of the lot.

  There is a lot of good development going on but unfortunately it
  didn't quite make the merge window"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
  proc: Fix unbalanced hard link numbers
  signal: Make kill_proc_info static
  rlimit: Properly call security_task_setrlimit
  signal: Remove unused definition of sig_user_definied
  ia64: Remove unused IA64_TASK_SIGHAND_OFFSET and IA64_SIGHAND_SIGLOCK_OFFSET
  ipc: Remove unused declaration of recompute_msgmni
  posix-timers: Correct sanity check in posix_cpu_nsleep
  sysctl: Remove dead register_sysctl_root
2017-05-05 11:08:43 -07:00
Eric W. Biederman
cad4ea546b rlimit: Properly call security_task_setrlimit
Modify do_prlimit to call security_task_setrlimit passing the task
whose rlimit we are changing not the tsk->group_leader.

In general this should not matter as the lsms implementing
security_task_setrlimit apparmor and selinux both examine the
task->cred to see what should be allowed on the destination task.

That task->cred is shared between tasks created with CLONE_THREAD
unless thread keyrings are in play, in which case both apparmor and
selinux create duplicate security contexts.

So the only time when it will matter which thread is passed to
security_task_setrlimit is if one of the threads of a process performs
an operation that changes only it's credentials.  At which point if a
thread has done that we don't want to hide that information from the
lsms.

So fix the call of security_task_setrlimit.  With the removal
of tsk->group_leader this makes the code slightly faster,
more comprehensible and maintainable.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2017-04-21 16:08:19 -05:00
Stephen Smalley
791ec491c3 prlimit,security,selinux: add a security hook for prlimit
When SELinux was first added to the kernel, a process could only get
and set its own resource limits via getrlimit(2) and setrlimit(2), so no
MAC checks were required for those operations, and thus no security hooks
were defined for them. Later, SELinux introduced a hook for setlimit(2)
with a check if the hard limit was being changed in order to be able to
rely on the hard limit value as a safe reset point upon context
transitions.

Later on, when prlimit(2) was added to the kernel with the ability to get
or set resource limits (hard or soft) of another process, LSM/SELinux was
not updated other than to pass the target process to the setrlimit hook.
This resulted in incomplete control over both getting and setting the
resource limits of another process.

Add a new security_task_prlimit() hook to the check_prlimit_permission()
function to provide complete mediation.  The hook is only called when
acting on another task, and only if the existing DAC/capability checks
would allow access.  Pass flags down to the hook to indicate whether the
prlimit(2) call will read, write, or both read and write the resource
limits of the target process.

The existing security_task_setrlimit() hook is left alone; it continues
to serve a purpose in supporting the ability to make decisions based on
the old and/or new resource limit values when setting limits.  This
is consistent with the DAC/capability logic, where
check_prlimit_permission() performs generic DAC/capability checks for
acting on another task, while do_prlimit() performs a capability check
based on a comparison of the old and new resource limits.  Fix the
inline documentation for the hook to match the code.

Implement the new hook for SELinux.  For setting resource limits, we
reuse the existing setrlimit permission.  Note that this does overload
the setrlimit permission to mean the ability to set the resource limit
(soft or hard) of another process or the ability to change one's own
hard limit.  For getting resource limits, a new getrlimit permission
is defined.  This was not originally defined since getrlimit(2) could
only be used to obtain a process' own limits.

Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <james.l.morris@oracle.com>
2017-03-06 10:43:47 +11:00
Ingo Molnar
32ef5517c2 sched/headers: Prepare to move cputime functionality from <linux/sched.h> into <linux/sched/cputime.h>
Introduce a trivial, mostly empty <linux/sched/cputime.h> header
to prepare for the moving of cputime functionality out of sched.h.

Update all code that relies on these facilities.

Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-03-02 08:42:39 +01:00
Ingo Molnar
299300258d sched/headers: Prepare for new header dependencies before moving code to <linux/sched/task.h>
We are going to split <linux/sched/task.h> out of <linux/sched.h>, which
will have to be picked up from other headers and a couple of .c files.

Create a trivial placeholder <linux/sched/task.h> file that just
maps to <linux/sched.h> to make this patch obviously correct and
bisectable.

Include the new header in the files that are going to need it.

Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-03-02 08:42:35 +01:00
Ingo Molnar
03441a3482 sched/headers: Prepare for new header dependencies before moving code to <linux/sched/stat.h>
We are going to split <linux/sched/stat.h> out of <linux/sched.h>, which
will have to be picked up from other headers and a couple of .c files.

Create a trivial placeholder <linux/sched/stat.h> file that just
maps to <linux/sched.h> to make this patch obviously correct and
bisectable.

Include the new header in the files that are going to need it.

Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-03-02 08:42:34 +01:00
Ingo Molnar
f7ccbae45c sched/headers: Prepare for new header dependencies before moving code to <linux/sched/coredump.h>
We are going to split <linux/sched/coredump.h> out of <linux/sched.h>, which
will have to be picked up from other headers and a couple of .c files.

Create a trivial placeholder <linux/sched/coredump.h> file that just
maps to <linux/sched.h> to make this patch obviously correct and
bisectable.

Include the new header in the files that are going to need it.

Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-03-02 08:42:28 +01:00
Ingo Molnar
6e84f31522 sched/headers: Prepare for new header dependencies before moving code to <linux/sched/mm.h>
We are going to split <linux/sched/mm.h> out of <linux/sched.h>, which
will have to be picked up from other headers and a couple of .c files.

Create a trivial placeholder <linux/sched/mm.h> file that just
maps to <linux/sched.h> to make this patch obviously correct and
bisectable.

The APIs that are going to be moved first are:

   mm_alloc()
   __mmdrop()
   mmdrop()
   mmdrop_async_fn()
   mmdrop_async()
   mmget_not_zero()
   mmput()
   mmput_async()
   get_task_mm()
   mm_access()
   mm_release()

Include the new header in the files that are going to need it.

Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-03-02 08:42:28 +01:00
Ingo Molnar
4eb5aaa3af sched/headers: Prepare for new header dependencies before moving code to <linux/sched/autogroup.h>
We are going to split <linux/sched/autogroup.h> out of <linux/sched.h>, which
will have to be picked up from other headers and a couple of .c files.

Create a trivial placeholder <linux/sched/autogroup.h> file that just
maps to <linux/sched.h> to make this patch obviously correct and
bisectable.

Include the new header in the files that are going to need it.

Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-03-02 08:42:28 +01:00
Ingo Molnar
4f17722c72 sched/headers: Prepare for new header dependencies before moving code to <linux/sched/loadavg.h>
We are going to split <linux/sched/loadavg.h> out of <linux/sched.h>, which
will have to be picked up from a couple of .c files.

Create a trivial placeholder <linux/sched/topology.h> file that just
maps to <linux/sched.h> to make this patch obviously correct and
bisectable.

Include the new header in the files that are going to need it.

Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-03-02 08:42:27 +01:00
Linus Torvalds
f1ef09fde1 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace
Pull namespace updates from Eric Biederman:
 "There is a lot here. A lot of these changes result in subtle user
  visible differences in kernel behavior. I don't expect anything will
  care but I will revert/fix things immediately if any regressions show
  up.

  From Seth Forshee there is a continuation of the work to make the vfs
  ready for unpriviled mounts. We had thought the previous changes
  prevented the creation of files outside of s_user_ns of a filesystem,
  but it turns we missed the O_CREAT path. Ooops.

  Pavel Tikhomirov and Oleg Nesterov worked together to fix a long
  standing bug in the implemenation of PR_SET_CHILD_SUBREAPER where only
  children that are forked after the prctl are considered and not
  children forked before the prctl. The only known user of this prctl
  systemd forks all children after the prctl. So no userspace
  regressions will occur. Holding earlier forked children to the same
  rules as later forked children creates a semantic that is sane enough
  to allow checkpoing of processes that use this feature.

  There is a long delayed change by Nikolay Borisov to limit inotify
  instances inside a user namespace.

  Michael Kerrisk extends the API for files used to maniuplate
  namespaces with two new trivial ioctls to allow discovery of the
  hierachy and properties of namespaces.

  Konstantin Khlebnikov with the help of Al Viro adds code that when a
  network namespace exits purges it's sysctl entries from the dcache. As
  in some circumstances this could use a lot of memory.

  Vivek Goyal fixed a bug with stacked filesystems where the permissions
  on the wrong inode were being checked.

  I continue previous work on ptracing across exec. Allowing a file to
  be setuid across exec while being ptraced if the tracer has enough
  credentials in the user namespace, and if the process has CAP_SETUID
  in it's own namespace. Proc files for setuid or otherwise undumpable
  executables are now owned by the root in the user namespace of their
  mm. Allowing debugging of setuid applications in containers to work
  better.

  A bug I introduced with permission checking and automount is now
  fixed. The big change is to mark the mounts that the kernel initiates
  as a result of an automount. This allows the permission checks in sget
  to be safely suppressed for this kind of mount. As the permission
  check happened when the original filesystem was mounted.

  Finally a special case in the mount namespace is removed preventing
  unbounded chains in the mount hash table, and making the semantics
  simpler which benefits CRIU.

  The vfs fix along with related work in ima and evm I believe makes us
  ready to finish developing and merge fully unprivileged mounts of the
  fuse filesystem. The cleanups of the mount namespace makes discussing
  how to fix the worst case complexity of umount. The stacked filesystem
  fixes pave the way for adding multiple mappings for the filesystem
  uids so that efficient and safer containers can be implemented"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
  proc/sysctl: Don't grab i_lock under sysctl_lock.
  vfs: Use upper filesystem inode in bprm_fill_uid()
  proc/sysctl: prune stale dentries during unregistering
  mnt: Tuck mounts under others instead of creating shadow/side mounts.
  prctl: propagate has_child_subreaper flag to every descendant
  introduce the walk_process_tree() helper
  nsfs: Add an ioctl() to return owner UID of a userns
  fs: Better permission checking for submounts
  exit: fix the setns() && PR_SET_CHILD_SUBREAPER interaction
  vfs: open() with O_CREAT should not create inodes with unknown ids
  nsfs: Add an ioctl() to return the namespace type
  proc: Better ownership of files for non-dumpable tasks in user namespaces
  exec: Remove LSM_UNSAFE_PTRACE_CAP
  exec: Test the ptracer's saved cred to see if the tracee can gain caps
  exec: Don't reset euid and egid when the tracee has CAP_SETUID
  inotify: Convert to using per-namespace limits
2017-02-23 20:33:51 -08:00
Pavel Tikhomirov
749860ce24 prctl: propagate has_child_subreaper flag to every descendant
If process forks some children when it has is_child_subreaper
flag enabled they will inherit has_child_subreaper flag - first
group, when is_child_subreaper is disabled forked children will
not inherit it - second group. So child-subreaper does not reparent
all his descendants when their parents die. Having these two
differently behaving groups can lead to confusion. Also it is
a problem for CRIU, as when we restore process tree we need to
somehow determine which descendants belong to which group and
much harder - to put them exactly to these group.

To simplify these we can add a propagation of has_child_subreaper
flag on PR_SET_CHILD_SUBREAPER, walking all descendants of child-
subreaper to setup has_child_subreaper flag.

In common cases when process like systemd first sets itself to
be a child-subreaper and only after that forks its services, we will
have zero-length list of descendants to walk. Testing with binary
subtree of 2^15 processes prctl took < 0.007 sec and has shown close
to linear dependency(~0.2 * n * usec) on lower numbers of processes.

Moreover, I doubt someone intentionaly pre-forks the children whitch
should reparent to init before becoming subreaper, because some our
ancestor migh have had is_child_subreaper flag while forking our
sub-tree and our childs will all inherit has_child_subreaper flag,
and we have no way to influence it. And only way to check if we have
no has_child_subreaper flag is to create some childs, kill them and
see where they will reparent to.

Using walk_process_tree helper to walk subtree, thanks to Oleg! Timing
seems to be the same.

Optimize:

a) When descendant already has has_child_subreaper flag all his subtree
has it too already.

* for a) to be true need to move has_child_subreaper inheritance under
the same tasklist_lock with adding task to its ->real_parent->children
as without it process can inherit zero has_child_subreaper, then we
set 1 to it's parent flag, check that parent has no more children, and
only after child with wrong flag is added to the tree.

* Also make these inheritance more clear by using real_parent instead of
current, as on clone(CLONE_PARENT) if current has is_child_subreaper
and real_parent has no is_child_subreaper or has_child_subreaper, child
will have has_child_subreaper flag set without actually having a
subreaper in it's ancestors.

b) When some descendant is child_reaper, it's subtree is in different
pidns from us(original child-subreaper) and processes from other pidns
will never reparent to us.

So we can skip their(a,b) subtree from walk.

v2: switch to walk_process_tree() general helper, move
has_child_subreaper inheritance
v3: remove csr_descendant leftover, change current to real_parent
in has_child_subreaper inheritance
v4: small commit message fix

Fixes: ebec18a6d3 ("prctl: add PR_{SET,GET}_CHILD_SUBREAPER to allow simple process supervision")
Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
2017-02-03 16:55:15 +13:00
Frederic Weisbecker
5613fda9a5 sched/cputime: Convert task/group cputime to nsecs
Now that most cputime readers use the transition API which return the
task cputime in old style cputime_t, we can safely store the cputime in
nsecs. This will eventually make cputime statistics less opaque and more
granular. Back and forth convertions between cputime_t and nsecs in order
to deal with cputime_t random granularity won't be needed anymore.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Stanislaw Gruszka <sgruszka@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Wanpeng Li <wanpeng.li@hotmail.com>
Link: http://lkml.kernel.org/r/1485832191-26889-8-git-send-email-fweisbec@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-02-01 09:13:49 +01:00
Linus Torvalds
7c0f6ba682 Replace <asm/uaccess.h> with <linux/uaccess.h> globally
This was entirely automated, using the script by Al:

  PATT='^[[:blank:]]*#[[:blank:]]*include[[:blank:]]*<asm/uaccess.h>'
  sed -i -e "s!$PATT!#include <linux/uaccess.h>!" \
        $(git grep -l "$PATT"|grep -v ^include/linux/uaccess.h)

to do the replacement at the end of the merge window.

Requested-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-24 11:46:01 -08:00
Linus Torvalds
e34bac726d Merge branch 'akpm' (patches from Andrew)
Merge updates from Andrew Morton:

 - various misc bits

 - most of MM (quite a lot of MM material is awaiting the merge of
   linux-next dependencies)

 - kasan

 - printk updates

 - procfs updates

 - MAINTAINERS

 - /lib updates

 - checkpatch updates

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (123 commits)
  init: reduce rootwait polling interval time to 5ms
  binfmt_elf: use vmalloc() for allocation of vma_filesz
  checkpatch: don't emit unified-diff error for rename-only patches
  checkpatch: don't check c99 types like uint8_t under tools
  checkpatch: avoid multiple line dereferences
  checkpatch: don't check .pl files, improve absolute path commit log test
  scripts/checkpatch.pl: fix spelling
  checkpatch: don't try to get maintained status when --no-tree is given
  lib/ida: document locking requirements a bit better
  lib/rbtree.c: fix typo in comment of ____rb_erase_color
  lib/Kconfig.debug: make CONFIG_STRICT_DEVMEM depend on CONFIG_DEVMEM
  MAINTAINERS: add drm and drm/i915 irc channels
  MAINTAINERS: add "C:" for URI for chat where developers hang out
  MAINTAINERS: add drm and drm/i915 bug filing info
  MAINTAINERS: add "B:" for URI where to file bugs
  get_maintainer: look for arbitrary letter prefixes in sections
  printk: add Kconfig option to set default console loglevel
  printk/sound: handle more message headers
  printk/btrfs: handle more message headers
  printk/kdb: handle more message headers
  ...
2016-12-12 20:50:02 -08:00
Stanislav Kinsburskiy
3fb4afd9a5 prctl: remove one-shot limitation for changing exe link
This limitation came with the reason to remove "another way for
malicious code to obscure a compromised program and masquerade as a
benign process" by allowing "security-concious program can use this
prctl once during its early initialization to ensure the prctl cannot
later be abused for this purpose":

    http://marc.info/?l=linux-kernel&m=133160684517468&w=2

This explanation doesn't look sufficient.  The only thing "exe" link is
indicating is the file, used to execve, which is basically nothing and
not reliable immediately after process has returned from execve system
call.

Moreover, to use this feture, all the mappings to previous exe file have
to be unmapped and all the new exe file permissions must be satisfied.

Which means, that changing exe link is very similar to calling execve on
the binary.

The need to remove this limitations comes from migration of NFS mount
point, which is not accessible during restore and replaced by other file
system.  Because of this exe link has to be changed twice.

[akpm@linux-foundation.org: fix up comment]
Link: http://lkml.kernel.org/r/20160927153755.9337.69650.stgit@localhost.localdomain
Signed-off-by: Stanislav Kinsburskiy <skinsbursky@virtuozzo.com>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Matt Helsley <matthltc@us.ibm.com>
Cc: Pavel Emelyanov <xemul@virtuozzo.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-12 18:55:06 -08:00
Nicolas Pitre
baa73d9e47 posix-timers: Make them configurable
Some embedded systems have no use for them.  This removes about
25KB from the kernel binary size when configured out.

Corresponding syscalls are routed to a stub logging the attempt to
use those syscalls which should be enough of a clue if they were
disabled without proper consideration. They are: timer_create,
timer_gettime: timer_getoverrun, timer_settime, timer_delete,
clock_adjtime, setitimer, getitimer, alarm.

The clock_settime, clock_gettime, clock_getres and clock_nanosleep
syscalls are replaced by simple wrappers compatible with CLOCK_REALTIME,
CLOCK_MONOTONIC and CLOCK_BOOTTIME only which should cover the vast
majority of use cases with very little code.

Signed-off-by: Nicolas Pitre <nico@linaro.org>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: John Stultz <john.stultz@linaro.org>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Cc: Paul Bolle <pebolle@tiscali.nl>
Cc: linux-kbuild@vger.kernel.org
Cc: netdev@vger.kernel.org
Cc: Michal Marek <mmarek@suse.com>
Cc: Edward Cree <ecree@solarflare.com>
Link: http://lkml.kernel.org/r/1478841010-28605-7-git-send-email-nicolas.pitre@linaro.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-11-16 09:26:35 +01:00
Michal Hocko
17b0573d77 prctl: make PR_SET_THP_DISABLE wait for mmap_sem killable
PR_SET_THP_DISABLE requires mmap_sem for write.  If the waiting task
gets killed by the oom killer it would block oom_reaper from
asynchronous address space reclaim and reduce the chances of timely OOM
resolving.  Wait for the lock in the killable mode and return with EINTR
if the task got killed while waiting.

Signed-off-by: Michal Hocko <mhocko@suse.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Alex Thorlton <athorlton@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-05-23 17:04:14 -07:00
John Stultz
da8b44d5a9 timer: convert timer_slack_ns from unsigned long to u64
This patchset introduces a /proc/<pid>/timerslack_ns interface which
would allow controlling processes to be able to set the timerslack value
on other processes in order to save power by avoiding wakeups (Something
Android currently does via out-of-tree patches).

The first patch tries to fix the internal timer_slack_ns usage which was
defined as a long, which limits the slack range to ~4 seconds on 32bit
systems.  It converts it to a u64, which provides the same basically
unlimited slack (500 years) on both 32bit and 64bit machines.

The second patch introduces the /proc/<pid>/timerslack_ns interface
which allows the full 64bit slack range for a task to be read or set on
both 32bit and 64bit machines.

With these two patches, on a 32bit machine, after setting the slack on
bash to 10 seconds:

$ time sleep 1

real    0m10.747s
user    0m0.001s
sys     0m0.005s

The first patch is a little ugly, since I had to chase the slack delta
arguments through a number of functions converting them to u64s.  Let me
know if it makes sense to break that up more or not.

Other than that things are fairly straightforward.

This patch (of 2):

The timer_slack_ns value in the task struct is currently a unsigned
long.  This means that on 32bit applications, the maximum slack is just
over 4 seconds.  However, on 64bit machines, its much much larger (~500
years).

This disparity could make application development a little (as well as
the default_slack) to a u64.  This means both 32bit and 64bit systems
have the same effective internal slack range.

Now the existing ABI via PR_GET_TIMERSLACK and PR_SET_TIMERSLACK specify
the interface as a unsigned long, so we preserve that limitation on
32bit systems, where SET_TIMERSLACK can only set the slack to a unsigned
long value, and GET_TIMERSLACK will return ULONG_MAX if the slack is
actually larger then what can be stored by an unsigned long.

This patch also modifies hrtimer functions which specified the slack
delta as a unsigned long.

Signed-off-by: John Stultz <john.stultz@linaro.org>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Oren Laadan <orenl@cellrox.com>
Cc: Ruchi Kandoi <kandoiruchi@google.com>
Cc: Rom Lemarchand <romlem@android.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Android Kernel Team <kernel-team@android.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-17 15:09:34 -07:00
Mateusz Guzik
ddf1d398e5 prctl: take mmap sem for writing to protect against others
An unprivileged user can trigger an oops on a kernel with
CONFIG_CHECKPOINT_RESTORE.

proc_pid_cmdline_read takes mmap_sem for reading and obtains args + env
start/end values. These get sanity checked as follows:
        BUG_ON(arg_start > arg_end);
        BUG_ON(env_start > env_end);

These can be changed by prctl_set_mm. Turns out also takes the semaphore for
reading, effectively rendering it useless. This results in:

  kernel BUG at fs/proc/base.c:240!
  invalid opcode: 0000 [#1] SMP
  Modules linked in: virtio_net
  CPU: 0 PID: 925 Comm: a.out Not tainted 4.4.0-rc8-next-20160105dupa+ #71
  Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
  task: ffff880077a68000 ti: ffff8800784d0000 task.ti: ffff8800784d0000
  RIP: proc_pid_cmdline_read+0x520/0x530
  RSP: 0018:ffff8800784d3db8  EFLAGS: 00010206
  RAX: ffff880077c5b6b0 RBX: ffff8800784d3f18 RCX: 0000000000000000
  RDX: 0000000000000002 RSI: 00007f78e8857000 RDI: 0000000000000246
  RBP: ffff8800784d3e40 R08: 0000000000000008 R09: 0000000000000001
  R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000050
  R13: 00007f78e8857800 R14: ffff88006fcef000 R15: ffff880077c5b600
  FS:  00007f78e884a740(0000) GS:ffff88007b200000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
  CR2: 00007f78e8361770 CR3: 00000000790a5000 CR4: 00000000000006f0
  Call Trace:
    __vfs_read+0x37/0x100
    vfs_read+0x82/0x130
    SyS_read+0x58/0xd0
    entry_SYSCALL_64_fastpath+0x12/0x76
  Code: 4c 8b 7d a8 eb e9 48 8b 9d 78 ff ff ff 4c 8b 7d 90 48 8b 03 48 39 45 a8 0f 87 f0 fe ff ff e9 d1 fe ff ff 4c 8b 7d 90 eb c6 0f 0b <0f> 0b 0f 0b 66 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00
  RIP   proc_pid_cmdline_read+0x520/0x530
  ---[ end trace 97882617ae9c6818 ]---

Turns out there are instances where the code just reads aformentioned
values without locking whatsoever - namely environ_read and get_cmdline.

Interestingly these functions look quite resilient against bogus values,
but I don't believe this should be relied upon.

The first patch gets rid of the oops bug by grabbing mmap_sem for
writing.

The second patch is optional and puts locking around aformentioned
consumers for safety.  Consumers of other fields don't seem to benefit
from similar treatment and are left untouched.

This patch (of 2):

The code was taking the semaphore for reading, which does not protect
against readers nor concurrent modifications.

The problem could cause a sanity checks to fail in procfs's cmdline
reader, resulting in an OOPS.

Note that some functions perform an unlocked read of various mm fields,
but they seem to be fine despite possible modificaton.

Signed-off-by: Mateusz Guzik <mguzik@redhat.com>
Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Jarod Wilson <jarod@redhat.com>
Cc: Jan Stancek <jstancek@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Anshuman Khandual <anshuman.linux@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-20 17:09:18 -08:00
Ben Segall
8639b46139 pidns: fix set/getpriority and ioprio_set/get in PRIO_USER mode
setpriority(PRIO_USER, 0, x) will change the priority of tasks outside of
the current pid namespace.  This is in contrast to both the other modes of
setpriority and the example of kill(-1).  Fix this.  getpriority and
ioprio have the same failure mode, fix them too.

Eric said:

: After some more thinking about it this patch sounds justifiable.
:
: My goal with namespaces is not to build perfect isolation mechanisms
: as that can get into ill defined territory, but to build well defined
: mechanisms.  And to handle the corner cases so you can use only
: a single namespace with well defined results.
:
: In this case you have found the two interfaces I am aware of that
: identify processes by uid instead of by pid.  Which quite frankly is
: weird.  Unfortunately the weird unexpected cases are hard to handle
: in the usual way.
:
: I was hoping for a little more information.  Changes like this one we
: have to be careful of because someone might be depending on the current
: behavior.  I don't think they are and I do think this make sense as part
: of the pid namespace.

Signed-off-by: Ben Segall <bsegall@google.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Ambrose Feinstein <ambrose@google.com>
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-11-06 17:50:42 -08:00
Eric W. Biederman
90f8572b0f vfs: Commit to never having exectuables on proc and sysfs.
Today proc and sysfs do not contain any executable files.  Several
applications today mount proc or sysfs without noexec and nosuid and
then depend on there being no exectuables files on proc or sysfs.
Having any executable files show on proc or sysfs would cause
a user space visible regression, and most likely security problems.

Therefore commit to never allowing executables on proc and sysfs by
adding a new flag to mark them as filesystems without executables and
enforce that flag.

Test the flag where MNT_NOEXEC is tested today, so that the only user
visible effect will be that exectuables will be treated as if the
execute bit is cleared.

The filesystems proc and sysfs do not currently incoporate any
executable files so this does not result in any user visible effects.

This makes it unnecessary to vet changes to proc and sysfs tightly for
adding exectuable files or changes to chattr that would modify
existing files, as no matter what the individual file say they will
not be treated as exectuable files by the vfs.

Not having to vet changes to closely is important as without this we
are only one proc_create call (or another goof up in the
implementation of notify_change) from having problematic executables
on proc.  Those mistakes are all too easy to make and would create
a situation where there are security issues or the assumptions of
some program having to be broken (and cause userspace regressions).

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2015-07-10 10:39:25 -05:00
Alexey Dobriyan
4a00e9df29 prctl: more prctl(PR_SET_MM_*) checks
Individual prctl(PR_SET_MM_*) calls do some checking to maintain a
consistent view of mm->arg_start et al fields, but not enough.  In
particular PR_SET_MM_ARG_START/PR_SET_MM_ARG_END/ R_SET_MM_ENV_START/
PR_SET_MM_ENV_END only check that the address lies in an existing VMA,
but don't check that the start address is lower than the end address _at
all_.

Consolidate all consistency checks, so there will be no difference in
the future between PR_SET_MM_MAP and individual PR_SET_MM_* calls.

The program below makes both ARGV and ENVP areas be reversed.  It makes
/proc/$PID/cmdline show garbage (it doesn't oops by luck).

#include <sys/mman.h>
#include <sys/prctl.h>
#include <unistd.h>

enum {PAGE_SIZE=4096};

int main(void)
{
	void *p;

	p = mmap(NULL, PAGE_SIZE, PROT_NONE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0);

#define PR_SET_MM               35
#define PR_SET_MM_ARG_START     8
#define PR_SET_MM_ARG_END       9
#define PR_SET_MM_ENV_START     10
#define PR_SET_MM_ENV_END       11
	prctl(PR_SET_MM, PR_SET_MM_ARG_START, (unsigned long)p + PAGE_SIZE - 1, 0, 0);
	prctl(PR_SET_MM, PR_SET_MM_ARG_END,   (unsigned long)p, 0, 0);
	prctl(PR_SET_MM, PR_SET_MM_ENV_START, (unsigned long)p + PAGE_SIZE - 1, 0, 0);
	prctl(PR_SET_MM, PR_SET_MM_ENV_END,   (unsigned long)p, 0, 0);

	pause();
	return 0;
}

[akpm@linux-foundation.org: tidy code, tweak comment]
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Jarod Wilson <jarod@redhat.com>
Cc: Jan Stancek <jstancek@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-06-25 17:00:37 -07:00