150 Commits

Author SHA1 Message Date
Eric W. Biederman
cd02cea58b vfs: Pass data, ns, and ns->userns to mount_ns
Today what is normally called data (the mount options) is not passed
to fill_super through mount_ns.

Pass the mount options and the namespace separately to mount_ns so
that filesystems that have mount options, such as proc, can use
mount_ns.

Pass the user namespace to mount_ns so that the standard permission
check that verifies the mounter has permissions over the namespace can
be performed in mount_ns instead of in each filesystem's .mount method.
This removes the duplicated permission checks in mqueuefs and proc.
The extra permission check does not currently affect the rpc_pipefs
filesystem and the nfsd filesystem, as those filesystems do not
currently allow unprivileged mounts.  Without unprivileged mounts it is
guaranteed that the caller has already passed capable(CAP_SYS_ADMIN),
which guarantees the extra permission check will pass.

Update rpc_pipefs and the nfsd filesystem to ensure that the network
namespace reference is always taken in fill_super and always put in kill_sb
so that the logic is simpler and so that errors originating inside of
fill_super do not cause a network namespace leak.
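
The centralized permission check described above can be sketched in
user space.  This is a simplified model, not the kernel code: struct
user_ns, struct caller, mount_ns_model() and demo_fill_super() are
hypothetical stand-ins, and "has permissions over the namespace" is
reduced to a pointer comparison.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a user namespace. */
struct user_ns { int id; };

/* Hypothetical caller: records the one user namespace it administers. */
struct caller { const struct user_ns *admin_ns; };

/* fill_super receives the mount options (data) and the namespace (ns)
 * as two separate arguments. */
typedef int (*fill_super_fn)(void *data, void *ns);

/* Model of the shared helper: the permission check is done once here,
 * so individual filesystems no longer repeat it in their own .mount. */
static int mount_ns_model(const struct caller *c, void *data, void *ns,
                          const struct user_ns *user_ns,
                          fill_super_fn fill_super)
{
    assert(fill_super != NULL);

    if (c->admin_ns != user_ns)
        return -EPERM;          /* mounter has no rights over this namespace */

    return fill_super(data, ns);
}

/* Example callback: succeeds when both options and namespace are present. */
static int demo_fill_super(void *data, void *ns)
{
    return (data && ns) ? 0 : -EINVAL;
}
```

A caller that administers the owning namespace reaches fill_super with
both arguments; any other caller is rejected before fill_super runs.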

Acked-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Chatur27 <jasonbright2709@gmail.com>
2022-03-04 20:16:56 +01:00
Nathan Chancellor
b51a1faf1a Merge 4.4.188 into android-msm-wahoo-4.4
Changes in 4.4.188: (23 commits)
        ARM: riscpc: fix DMA
        ARM: dts: rockchip: Mark that the rk3288 timer might stop in suspend
        kernel/module.c: Only return -EEXIST for modules that have finished loading
        MIPS: lantiq: Fix bitfield masking
        dmaengine: rcar-dmac: Reject zero-length slave DMA requests
        fs/adfs: super: fix use-after-free bug
        btrfs: fix minimum number of chunk errors for DUP
        ceph: fix improper use of smp_mb__before_atomic()
        scsi: zfcp: fix GCC compiler warning emitted with -Wmaybe-uninitialized
        ACPI: fix false-positive -Wuninitialized warning
        be2net: Signal that the device cannot transmit during reconfiguration
        x86/apic: Silence -Wtype-limits compiler warnings
        x86: math-emu: Hide clang warnings for 16-bit overflow
        mm/cma.c: fail if fixed declaration can't be honored
        coda: add error handling for fget
        coda: fix build using bare-metal toolchain
        uapi linux/coda_psdev.h: move upc_req definition from uapi to kernel side headers
        ipc/mqueue.c: only perform resource calculation if user valid
        x86/kvm: Don't call kvm_spurious_fault() from .fixup
        selinux: fix memory leak in policydb_init()
        s390/dasd: fix endless loop after read unit address configuration
        xen/swiotlb: fix condition for calling xen_destroy_contiguous_region()
        Linux 4.4.188

Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
2019-08-06 09:51:28 -07:00
Kees Cook
793cbeb948 ipc/mqueue.c: only perform resource calculation if user valid
[ Upstream commit a318f12ed8843cfac53198390c74a565c632f417 ]

Andreas Christoforou reported:

  UBSAN: Undefined behaviour in ipc/mqueue.c:414:49 signed integer overflow:
  9 * 2305843009213693951 cannot be represented in type 'long int'
  ...
  Call Trace:
    mqueue_evict_inode+0x8e7/0xa10 ipc/mqueue.c:414
    evict+0x472/0x8c0 fs/inode.c:558
    iput_final fs/inode.c:1547 [inline]
    iput+0x51d/0x8c0 fs/inode.c:1573
    mqueue_get_inode+0x8eb/0x1070 ipc/mqueue.c:320
    mqueue_create_attr+0x198/0x440 ipc/mqueue.c:459
    vfs_mkobj+0x39e/0x580 fs/namei.c:2892
    prepare_open ipc/mqueue.c:731 [inline]
    do_mq_open+0x6da/0x8e0 ipc/mqueue.c:771

Which could be triggered by:

        struct mq_attr attr = {
                .mq_flags = 0,
                .mq_maxmsg = 9,
                .mq_msgsize = 0x1fffffffffffffff,
                .mq_curmsgs = 0,
        };

        if (mq_open("/testing", 0x40, 3, &attr) == (mqd_t) -1)
                perror("mq_open");

mqueue_get_inode() was correctly rejecting the giant mq_msgsize, and
preparing to return -EINVAL.  During the cleanup, it calls
mqueue_evict_inode() which performed resource usage tracking math for
updating "user", before checking if there was a valid "user" at all
(which would indicate that the calculations would be sane).  Instead,
delay the calculation until after seeing a valid "user".

The overflow was real, but the results went unused.  The flaw is
harmless but noisy for kernel fuzzers, so fix it by moving the
calculation under the non-NULL "user" check, where the result is
actually used.
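
The reordering can be illustrated with a small user-space sketch.  It is
an assumption-laden model rather than the actual patch: struct
user_acct, struct mq_info and mq_uncharge() are hypothetical names, and
the size arithmetic is simplified to a single multiplication.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-user accounting record. */
struct user_acct {
    unsigned long mq_bytes;     /* bytes currently charged to this user */
};

/* Hypothetical stand-in for the per-queue state. */
struct mq_info {
    struct user_acct *user;     /* NULL if creation failed before charging */
    long mq_maxmsg;
    long mq_msgsize;
};

/* Release the bytes charged for a queue.  The size arithmetic runs only
 * when a user was actually charged, so attributes that were rejected at
 * creation time are never multiplied together. */
static void mq_uncharge(struct mq_info *info)
{
    struct user_acct *user = info->user;
    unsigned long mq_bytes;

    if (!user)
        return;                 /* nothing was charged: skip the math */

    mq_bytes = (unsigned long)info->mq_maxmsg *
               (unsigned long)info->mq_msgsize;
    assert(user->mq_bytes >= mq_bytes);
    user->mq_bytes -= mq_bytes;
    info->user = NULL;
}
```

With a NULL user the function returns before touching mq_maxmsg or
mq_msgsize, which is the property the fix relies on.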

Link: http://lkml.kernel.org/r/201906072207.ECB65450@keescook
Signed-off-by: Kees Cook <keescook@chromium.org>
Reported-by: Andreas Christoforou <andreaschristofo@gmail.com>
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Manfred Spraul <manfred@colorfullife.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-08-06 18:28:28 +02:00
Nathan Chancellor
2458b36258 Merge 4.4.183 into android-msm-wahoo-4.4
Changes in 4.4.183: (85 commits)
        fs/fat/file.c: issue flush after the writeback of FAT
        sysctl: return -EINVAL if val violates minmax
        ipc: prevent lockup on alloc_msg and free_msg
        hugetlbfs: on restore reserve error path retain subpool reservation
        mm/cma.c: fix crash on CMA allocation if bitmap allocation fails
        mm/cma_debug.c: fix the break condition in cma_maxchunk_get()
        kernel/sys.c: prctl: fix false positive in validate_prctl_map()
        mfd: intel-lpss: Set the device in reset state when init
        mfd: twl6040: Fix device init errors for ACCCTL register
        perf/x86/intel: Allow PEBS multi-entry in watermark mode
        drm/bridge: adv7511: Fix low refresh rate selection
        ntp: Allow TAI-UTC offset to be set to zero
        f2fs: fix to avoid panic in do_recover_data()
        f2fs: fix to do sanity check on valid block count of segment
        iommu/vt-d: Set intel_iommu_gfx_mapped correctly
        ALSA: hda - Register irq handler after the chip initialization
        nvmem: core: fix read buffer in place
        fuse: retrieve: cap requested size to negotiated max_write
        nfsd: allow fh_want_write to be called twice
        x86/PCI: Fix PCI IRQ routing table memory leak
        platform/chrome: cros_ec_proto: check for NULL transfer function
        soc: mediatek: pwrap: Zero initialize rdata in pwrap_init_cipher
        clk: rockchip: Turn on "aclk_dmac1" for suspend on rk3288
        ARM: dts: imx6sx: Specify IMX6SX_CLK_IPG as "ahb" clock to SDMA
        ARM: dts: imx6sx: Specify IMX6SX_CLK_IPG as "ipg" clock to SDMA
        ARM: dts: imx6qdl: Specify IMX6QDL_CLK_IPG as "ipg" clock to SDMA
        PCI: rpadlpar: Fix leaked device_node references in add/remove paths
        PCI: rcar: Fix a potential NULL pointer dereference
        video: hgafb: fix potential NULL pointer dereference
        video: imsttfb: fix potential NULL pointer dereferences
        PCI: xilinx: Check for __get_free_pages() failure
        gpio: gpio-omap: add check for off wake capable gpios
        dmaengine: idma64: Use actual device for DMA transfers
        pwm: tiehrpwm: Update shadow register for disabling PWMs
        ARM: dts: exynos: Always enable necessary APIO_1V8 and ABB_1V8 regulators on Arndale Octa
        pwm: Fix deadlock warning when removing PWM device
        ARM: exynos: Fix undefined instruction during Exynos5422 resume
        futex: Fix futex lock the wrong page
        Revert "Bluetooth: Align minimum encryption key size for LE and BR/EDR connections"
        ALSA: seq: Cover unsubscribe_port() in list_mutex
        libata: Extend quirks for the ST1000LM024 drives with NOLPM quirk
        mm/list_lru.c: fix memory leak in __memcg_init_list_lru_node
        fs/ocfs2: fix race in ocfs2_dentry_attach_lock()
        signal/ptrace: Don't leak unitialized kernel memory with PTRACE_PEEK_SIGINFO
        ptrace: restore smp_rmb() in __ptrace_may_access()
        i2c: acorn: fix i2c warning
        bcache: fix stack corruption by PRECEDING_KEY()
        cgroup: Use css_tryget() instead of css_tryget_online() in task_get_css()
        ASoC: cs42xx8: Add regcache mask dirty
        Drivers: misc: fix out-of-bounds access in function param_set_kgdbts_var
        scsi: lpfc: add check for loss of ndlp when sending RRQ
        scsi: bnx2fc: fix incorrect cast to u64 on shift operation
        usbnet: ipheth: fix racing condition
        KVM: x86/pmu: do not mask the value that is written to fixed PMUs
        KVM: s390: fix memory slot handling for KVM_SET_USER_MEMORY_REGION
        drm/vmwgfx: integer underflow in vmw_cmd_dx_set_shader() leading to an invalid read
        drm/vmwgfx: NULL pointer dereference from vmw_cmd_dx_view_define()
        USB: Fix chipmunk-like voice when using Logitech C270 for recording audio.
        USB: usb-storage: Add new ID to ums-realtek
        USB: serial: pl2303: add Allied Telesis VT-Kit3
        USB: serial: option: add support for Simcom SIM7500/SIM7600 RNDIS mode
        USB: serial: option: add Telit 0x1260 and 0x1261 compositions
        ax25: fix inconsistent lock state in ax25_destroy_timer
        be2net: Fix number of Rx queues used for flow hashing
        ipv6: flowlabel: fl6_sock_lookup() must use atomic_inc_not_zero
        lapb: fixed leak of control-blocks.
        neigh: fix use-after-free read in pneigh_get_next
        sunhv: Fix device naming inconsistency between sunhv_console and sunhv_reg
        mISDN: make sure device name is NUL terminated
        x86/CPU/AMD: Don't force the CPB cap when running under a hypervisor
        perf/ring_buffer: Fix exposing a temporarily decreased data_head
        perf/ring_buffer: Add ordering to rb->nest increment
        gpio: fix gpio-adp5588 build errors
        net: tulip: de4x5: Drop redundant MODULE_DEVICE_TABLE()
        i2c: dev: fix potential memory leak in i2cdev_ioctl_rdwr
        configfs: Fix use-after-free when accessing sd->s_dentry
        ia64: fix build errors by exporting paddr_to_nid()
        KVM: PPC: Book3S: Use new mutex to synchronize access to rtas token list
        net: sh_eth: fix mdio access in sh_eth_close() for R-Car Gen2 and RZ/A1 SoCs
        scsi: libcxgbi: add a check for NULL pointer in cxgbi_check_route()
        scsi: libsas: delete sas port if expander discover failed
        Revert "crypto: crypto4xx - properly set IV after de- and encrypt"
        coredump: fix race condition between mmget_not_zero()/get_task_mm() and core dumping
        Abort file_remove_privs() for non-reg. files
        Linux 4.4.183

Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>

Conflicts:
	drivers/android/binder.c
2019-06-22 12:48:42 -07:00
Li Rongqing
d8129a5d7a ipc: prevent lockup on alloc_msg and free_msg
[ Upstream commit d6a2946a88f524a47cc9b79279667137899db807 ]

msgctl10 of ltp triggers the following lockup.  When CONFIG_KASAN is
enabled on large-memory SMP systems, page initialization can take a
long time; if msgctl10 requests a huge block of memory, this blocks
the RCU scheduler, so release the CPU actively.

After adding schedule() in free_msg, free_msg can no longer be called
while holding a spinlock, so add each msg to a tmp list and free it
outside of the spinlock.

  rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
  rcu:     Tasks blocked on level-1 rcu_node (CPUs 16-31): P32505
  rcu:     Tasks blocked on level-1 rcu_node (CPUs 48-63): P34978
  rcu:     (detected by 11, t=35024 jiffies, g=44237529, q=16542267)
  msgctl10        R  running task    21608 32505   2794 0x00000082
  Call Trace:
   preempt_schedule_irq+0x4c/0xb0
   retint_kernel+0x1b/0x2d
  RIP: 0010:__is_insn_slot_addr+0xfb/0x250
  Code: 82 1d 00 48 8b 9b 90 00 00 00 4c 89 f7 49 c1 ee 03 e8 59 83 1d 00 48 b8 00 00 00 00 00 fc ff df 4c 39 eb 48 89 9d 58 ff ff ff <41> c6 04 06 f8 74 66 4c 8d 75 98 4c 89 f1 48 c1 e9 03 48 01 c8 48
  RSP: 0018:ffff88bce041f758 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff13
  RAX: dffffc0000000000 RBX: ffffffff8471bc50 RCX: ffffffff828a2a57
  RDX: dffffc0000000000 RSI: dffffc0000000000 RDI: ffff88bce041f780
  RBP: ffff88bce041f828 R08: ffffed15f3f4c5b3 R09: ffffed15f3f4c5b3
  R10: 0000000000000001 R11: ffffed15f3f4c5b2 R12: 000000318aee9b73
  R13: ffffffff8471bc50 R14: 1ffff1179c083ef0 R15: 1ffff1179c083eec
   kernel_text_address+0xc1/0x100
   __kernel_text_address+0xe/0x30
   unwind_get_return_address+0x2f/0x50
   __save_stack_trace+0x92/0x100
   create_object+0x380/0x650
   __kmalloc+0x14c/0x2b0
   load_msg+0x38/0x1a0
   do_msgsnd+0x19e/0xcf0
   do_syscall_64+0x117/0x400
   entry_SYSCALL_64_after_hwframe+0x49/0xbe

  rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
  rcu:     Tasks blocked on level-1 rcu_node (CPUs 0-15): P32170
  rcu:     (detected by 14, t=35016 jiffies, g=44237525, q=12423063)
  msgctl10        R  running task    21608 32170  32155 0x00000082
  Call Trace:
   preempt_schedule_irq+0x4c/0xb0
   retint_kernel+0x1b/0x2d
  RIP: 0010:lock_acquire+0x4d/0x340
  Code: 48 81 ec c0 00 00 00 45 89 c6 4d 89 cf 48 8d 6c 24 20 48 89 3c 24 48 8d bb e4 0c 00 00 89 74 24 0c 48 c7 44 24 20 b3 8a b5 41 <48> c1 ed 03 48 c7 44 24 28 b4 25 18 84 48 c7 44 24 30 d0 54 7a 82
  RSP: 0018:ffff88af83417738 EFLAGS: 00000282 ORIG_RAX: ffffffffffffff13
  RAX: dffffc0000000000 RBX: ffff88bd335f3080 RCX: 0000000000000002
  RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff88bd335f3d64
  RBP: ffff88af83417758 R08: 0000000000000000 R09: 0000000000000000
  R10: 0000000000000001 R11: ffffed13f3f745b2 R12: 0000000000000000
  R13: 0000000000000002 R14: 0000000000000000 R15: 0000000000000000
   is_bpf_text_address+0x32/0xe0
   kernel_text_address+0xec/0x100
   __kernel_text_address+0xe/0x30
   unwind_get_return_address+0x2f/0x50
   __save_stack_trace+0x92/0x100
   save_stack+0x32/0xb0
   __kasan_slab_free+0x130/0x180
   kfree+0xfa/0x2d0
   free_msg+0x24/0x50
   do_msgrcv+0x508/0xe60
   do_syscall_64+0x117/0x400
   entry_SYSCALL_64_after_hwframe+0x49/0xbe

Davidlohr said:
 "So after releasing the lock, the msg rbtree/list is empty and new
  calls will not see those in the newly populated tmp_msg list, and
  therefore they cannot access the delayed msg freeing pointers, which
  is good. Also the fact that the node_cache is now freed before the
  actual messages seems to be harmless as this is wanted for
  msg_insert() avoiding GFP_ATOMIC allocations, and after releasing the
  info->lock the thing is freed anyway so it should not change things"
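
The "move to a tmp list, free outside the lock" pattern can be sketched
in user space as follows.  This is a minimal model under stated
assumptions, not the kernel change: a pthread mutex stands in for the
spinlock, and struct msg_queue, queue_add(), queue_destroy_msgs() and
the locks_held counter are hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdlib.h>

struct msg { struct msg *next; };

struct msg_queue {
    pthread_mutex_t lock;       /* stands in for the spinlock */
    struct msg *head;
};

/* Per-thread count of held locks, used only to model "must not sleep". */
static _Thread_local int locks_held;

static void queue_lock(struct msg_queue *q)
{
    pthread_mutex_lock(&q->lock);
    locks_held++;
}

static void queue_unlock(struct msg_queue *q)
{
    locks_held--;
    pthread_mutex_unlock(&q->lock);
}

/* Models a free routine that may yield the CPU: it must never run
 * while the calling thread holds the queue lock. */
static void free_msg_may_sleep(struct msg *m)
{
    assert(locks_held == 0);
    free(m);
}

/* Allocate one message and link it into the queue. */
static int queue_add(struct msg_queue *q)
{
    struct msg *m = malloc(sizeof(*m));

    if (!m)
        return -1;
    queue_lock(q);
    m->next = q->head;
    q->head = m;
    queue_unlock(q);
    return 0;
}

/* Detach every message under the lock, then free them after unlocking.
 * Returns the number of messages freed. */
static size_t queue_destroy_msgs(struct msg_queue *q)
{
    struct msg *tmp;
    size_t freed = 0;

    queue_lock(q);
    tmp = q->head;              /* the whole list moves to the tmp list */
    q->head = NULL;             /* later lockers see an empty queue */
    queue_unlock(q);

    while (tmp) {
        struct msg *next = tmp->next;

        free_msg_may_sleep(tmp);
        tmp = next;
        freed++;
    }
    return freed;
}
```

Because the queue head is cleared before the lock is dropped, nothing
that takes the lock afterwards can reach a message that is waiting to
be freed, which matches the reasoning in the quoted review.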

Link: http://lkml.kernel.org/r/1552029161-4957-1-git-send-email-lirongqing@baidu.com
Signed-off-by: Li RongQing <lirongqing@baidu.com>
Signed-off-by: Zhang Yu <zhangyu31@baidu.com>
Reviewed-by: Davidlohr Bueso <dbueso@suse.de>
Cc: Manfred Spraul <manfred@colorfullife.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-06-22 08:18:17 +02:00
Thierry Strudel
75c8bc7183 Merged linux-4.4.80 into android-msm-wahoo-4.4
Linux 4.4.80
    ASoC: dpcm: Avoid putting stream state to STOP when FE stream is paused
    scsi: snic: Return error code on memory allocation failure
    scsi: fnic: Avoid sending reset to firmware when another reset is in progress
    HID: ignore Petzl USB headlamp
    ALSA: usb-audio: test EP_FLAG_RUNNING at urb completion
    sh_eth: enable RX descriptor word 0 shift on SH7734
    nvmem: imx-ocotp: Fix wrong register size
    arm64: mm: fix show_pte KERN_CONT fallout
    vfio-pci: Handle error from pci_iomap
    video: fbdev: cobalt_lcdfb: Handle return NULL error from devm_ioremap
    perf symbols: Robustify reading of build-id from sysfs
    perf tools: Install tools/lib/traceevent plugins with install-bin
    xfrm: Don't use sk_family for socket policy lookups
    tools lib traceevent: Fix prev/next_prio for deadline tasks
    Btrfs: adjust outstanding_extents counter properly when dio write is split
    usb: gadget: Fix copy/pasted error message
    ACPI / scan: Prefer devices without _HID/_CID for _ADR matching
    ARM: s3c2410_defconfig: Fix invalid values for NF_CT_PROTO_*
    ARM64: zynqmp: Fix i2c node's compatible string
    ARM64: zynqmp: Fix W=1 dtc 1.4 warnings
    dmaengine: ti-dma-crossbar: Add some 'of_node_put()' in error path.
    dmaengine: ioatdma: workaround SKX ioatdma version
    dmaengine: ioatdma: Add Skylake PCI Dev ID
    openrisc: Add _text symbol to fix ksym build error
    irqchip/mxs: Enable SKIP_SET_WAKE and MASK_ON_SUSPEND
    ASoC: nau8825: fix invalid configuration in Pre-Scalar of FLL
    spi: dw: Make debugfs name unique between instances
    ASoC: tlv320aic3x: Mark the RESET register as volatile
    irqchip/keystone: Fix "scheduling while atomic" on rt
    vfio-pci: use 32-bit comparisons for register address for gcc-4.5
    drm/msm: Verify that MSM_SUBMIT_BO_FLAGS are set
    drm/msm: Ensure that the hardware write pointer is valid
    net/mlx4: Remove BUG_ON from ICM allocation routine
    ipv6: Should use consistent conditional judgement for ip6 fragment between __ip6_append_data and ip6_finish_output
    ARM: dts: n900: Mark eMMC slot with no-sdio and no-sd flags
    r8169: add support for RTL8168 series add-on card.
    x86/mce/AMD: Make the init code more robust
    tpm: Replace device number bitmap with IDR
    tpm: fix a kernel memory leak in tpm-sysfs.c
    xen/blkback: don't use xen_blkif_get() in xen-blkback kthread
    xen/blkback: don't free be structure too early
    sched/cputime: Fix prev steal time accouting during CPU hotplug
    net: skb_needs_check() accepts CHECKSUM_NONE for tx
    pstore: Use dynamic spinlock initializer
    pstore: Correctly initialize spinlock and flags
    pstore: Allow prz to control need for locking
    vlan: Propagate MAC address to VLANs
    /proc/iomem: only expose physical resource addresses to privileged users
    Make file credentials available to the seqfile interfaces
    v4l: s5c73m3: fix negation operator
    dentry name snapshots
    ipmi/watchdog: fix watchdog timeout set on reboot
    libnvdimm, btt: fix btt_rw_page not returning errors
    RDMA/uverbs: Fix the check for port number
    PM / Domains: defer dev_pm_domain_set() until genpd->attach_dev succeeds if present
    sched/cgroup: Move sched_online_group() back into css_online() to fix crash
    kaweth: fix oops upon failed memory allocation
    kaweth: fix firmware download
    mpt3sas: Don't overreach ioc->reply_post[] during initialization
    mailbox: handle empty message in tx_tick
    mailbox: skip complete wait event if timer expired
    mailbox: always wait in mbox_send_message for blocking Tx mode
    wil6210: fix deadlock when using fw_no_recovery option
    ath10k: fix null deref on wmi-tlv when trying spectral scan
    isdn/i4l: fix buffer overflow
    isdn: Fix a sleep-in-atomic bug
    net: phy: Do not perform software reset for Generic PHY
    nfc: fdp: fix NULL pointer dereference
    xfs: don't BUG() on mixed direct and mapped I/O
    perf intel-pt: Ensure never to set 'last_ip' when packet 'count' is zero
    perf intel-pt: Use FUP always when scanning for an IP
    perf intel-pt: Fix last_ip usage
    perf intel-pt: Fix ip compression
    drm: rcar-du: Simplify and fix probe error handling
    drm: rcar-du: Perform initialization/cleanup at probe/remove time
    drm/rcar: Nuke preclose hook
    Staging: comedi: comedi_fops: Avoid orphaned proc entry
    Revert "powerpc/numa: Fix percpu allocations to be NUMA aware"
    KVM: PPC: Book3S HV: Save/restore host values of debug registers
    KVM: PPC: Book3S HV: Reload HTM registers explicitly
    KVM: PPC: Book3S HV: Restore critical SPRs to host values on guest exit
    KVM: PPC: Book3S HV: Context-switch EBB registers properly
    drm/nouveau/bar/gf100: fix access to upper half of BAR2
    drm/vmwgfx: Fix gcc-7.1.1 warning
    md/raid5: add thread_group worker async_tx_issue_pending_all
    crypto: authencesn - Fix digest_null crash
    powerpc/pseries: Fix of_node_put() underflow during reconfig remove
    net: reduce skb_warn_bad_offload() noise
    pstore: Make spinlock per zone instead of global
    af_key: Add lock to key dump
Linux 4.4.79
    alarmtimer: don't rate limit one-shot timers
    tracing: Fix kmemleak in instance_rmdir
    spmi: Include OF based modalias in device uevent
    of: device: Export of_device_{get_modalias, uvent_modalias} to modules
    drm/mst: Avoid processing partially received up/down message transactions
    drm/mst: Avoid dereferencing a NULL mstb in drm_dp_mst_handle_up_req()
    drm/mst: Fix error handling during MST sideband message reception
    RDMA/core: Initialize port_num in qp_attr
    ceph: fix race in concurrent readdir
    staging: rtl8188eu: add TL-WN722N v2 support
    Revert "perf/core: Drop kernel samples even though :u is specified"
    perf annotate: Fix broken arrow at row 0 connecting jmp instruction to its target
    target: Fix COMPARE_AND_WRITE caw_sem leak during se_cmd quiesce
    udf: Fix deadlock between writeback and udf_setsize()
    NFS: only invalidate dentrys that are clearly invalid.
    Input: i8042 - fix crash at boot time
    MIPS: Fix a typo: s/preset/present/ in r2-to-r6 emulation error message
    MIPS: Send SIGILL for linked branches in `__compute_return_epc_for_insn'
    MIPS: Rename `sigill_r6' to `sigill_r2r6' in `__compute_return_epc_for_insn'
    MIPS: Send SIGILL for BPOSGE32 in `__compute_return_epc_for_insn'
    MIPS: math-emu: Prevent wrong ISA mode instruction emulation
    MIPS: Fix unaligned PC interpretation in `compute_return_epc'
    MIPS: Actually decode JALX in `__compute_return_epc_for_insn'
    MIPS: Save static registers before sysmips
    MIPS: Fix MIPS I ISA /proc/cpuinfo reporting
    x86/ioapic: Pass the correct data to unmask_ioapic_irq()
    x86/acpi: Prevent out of bound access caused by broken ACPI tables
    MIPS: Negate error syscall return in trace
    MIPS: Fix mips_atomic_set() with EVA
    MIPS: Fix mips_atomic_set() retry condition
    ftrace: Fix uninitialized variable in match_records()
    vfio: New external user group/file match
    vfio: Fix group release deadlock
    f2fs: Don't clear SGID when inheriting ACLs
    ipmi:ssif: Add missing unlock in error branch
    ipmi: use rcu lock around call to intf->handlers->sender()
    drm/radeon: Fix eDP for single-display iMac10,1 (v2)
    drm/radeon/ci: disable mclk switching for high refresh rates (v2)
    drm/amd/amdgpu: Return error if initiating read out of range on vram
    s390/syscalls: Fix out of bounds arguments access
    Raid5 should update rdev->sectors after reshape
    cx88: Fix regression in initial video standard setting
    x86/xen: allow userspace access during hypercalls
    md: don't use flush_signals in userspace processes
    usb: renesas_usbhs: gadget: disable all eps when the driver stops
    usb: renesas_usbhs: fix usbhsc_resume() for !USBHSF_RUNTIME_PWCTRL
    USB: cdc-acm: add device-id for quirky printer
    usb: storage: return on error to avoid a null pointer dereference
    xhci: Fix NULL pointer dereference when cleaning up streams for removed host
    xhci: fix 20000ms port resume timeout
    ipvs: SNAT packet replies only for NATed connections
    PCI/PM: Restore the status of PCI devices across hibernation
    af_key: Fix sadb_x_ipsecrequest parsing
    powerpc/asm: Mark cr0 as clobbered in mftb()
    powerpc: Fix emulation of mfocrf in emulate_step()
    powerpc: Fix emulation of mcrf in emulate_step()
    powerpc/64: Fix atomic64_inc_not_zero() to return an int
    iscsi-target: Add login_keys_workaround attribute for non RFC initiators
    scsi: ses: do not add a device to an enclosure if enclosure_add_links() fails.
    PM / Domains: Fix unsafe iteration over modified list of domain providers
    PM / Domains: Fix unsafe iteration over modified list of device links
    ASoC: compress: Derive substream from stream based on direction
    wlcore: fix 64K page support
    Bluetooth: use constant time memory comparison for secret values
    perf intel-pt: Clear FUP flag on error
    perf intel-pt: Ensure IP is zero when state is INTEL_PT_STATE_NO_IP
    perf intel-pt: Fix missing stack clear
    perf intel-pt: Improve sample timestamp
    perf intel-pt: Move decoder error setting into one condition
    NFC: Add sockaddr length checks before accessing sa_family in bind handlers
    nfc: Fix the sockaddr length sanitization in llcp_sock_connect
    nfc: Ensure presence of required attributes in the activate_target handler
    NFC: nfcmrvl: fix firmware-management initialisation
    NFC: nfcmrvl: use nfc-device for firmware download
    NFC: nfcmrvl: do not use device-managed resources
    NFC: nfcmrvl_uart: add missing tty-device sanity check
    NFC: fix broken device allocation
    ath9k: fix tx99 bus error
    ath9k: fix tx99 use after free
    thermal: cpu_cooling: Avoid accessing potentially freed structures
    s5p-jpeg: don't return a random width/height
    ir-core: fix gcc-7 warning on bool arithmetic
    disable new gcc-7.1.1 warnings for now
Linux 4.4.78
    kvm: vmx: allow host to access guest MSR_IA32_BNDCFGS
    kvm: vmx: Check value written to IA32_BNDCFGS
    kvm: x86: Guest BNDCFGS requires guest MPX support
    kvm: vmx: Do not disable intercepts for BNDCFGS
    KVM: x86: disable MPX if host did not enable MPX XSAVE features
    tracing: Use SOFTIRQ_OFFSET for softirq dectection for more accurate results
    PM / QoS: return -EINVAL for bogus strings
    PM / wakeirq: Convert to SRCU
    sched/topology: Optimize build_group_mask()
    sched/topology: Fix overlapping sched_group_mask
    crypto: caam - fix signals handling
    crypto: sha1-ssse3 - Disable avx2
    crypto: atmel - only treat EBUSY as transient if backlog
    crypto: talitos - Extend max key length for SHA384/512-HMAC and AEAD
    mm: fix overflow check in expand_upwards()
    tpm: Issue a TPM2_Shutdown for TPM2 devices.
    Add "shutdown" to "struct class".
    tpm: Provide strong locking for device removal
    tpm: Get rid of chip->pdev
    selftests/capabilities: Fix the test_execve test
    mnt: Make propagate_umount less slow for overlapping mount propagation trees
    mnt: In propgate_umount handle visiting mounts in any order
    mnt: In umount propagation reparent in a separate pass
    vt: fix unchecked __put_user() in tioclinux ioctls
    exec: Limit arg stack to at most 75% of _STK_LIM
    s390: reduce ELF_ET_DYN_BASE
    powerpc: move ELF_ET_DYN_BASE to 4GB / 4MB
    arm64: move ELF_ET_DYN_BASE to 4GB / 4MB
    arm: move ELF_ET_DYN_BASE to 4MB
    binfmt_elf: use ELF_ET_DYN_BASE only for PIE
    checkpatch: silence perl 5.26.0 unescaped left brace warnings
    fs/dcache.c: fix spin lockup issue on nlru->lock
    mm/list_lru.c: fix list_lru_count_node() to be race free
    kernel/extable.c: mark core_kernel_text notrace
    tools/lib/lockdep: Reduce MAX_LOCK_DEPTH to avoid overflowing lock_chain/: Depth
    parisc/mm: Ensure IRQs are off in switch_mm()
    parisc: DMA API: return error instead of BUG_ON for dma ops on non dma devs
    parisc: use compat_sys_keyctl()
    parisc: Report SIGSEGV instead of SIGBUS when running out of stack
    irqchip/gic-v3: Fix out-of-bound access in gic_set_affinity
    cfg80211: Check if PMKID attribute is of expected size
    cfg80211: Validate frequencies nested in NL80211_ATTR_SCAN_FREQUENCIES
    cfg80211: Define nla_policy for NL80211_ATTR_LOCAL_MESH_POWER_MODE
    brcmfmac: fix possible buffer overflow in brcmf_cfg80211_mgmt_tx()
    rds: tcp: use sock_create_lite() to create the accept socket
    vrf: fix bug_on triggered by rx when destroying a vrf
    net: ipv6: Compare lwstate in detecting duplicate nexthops
    ipv6: dad: don't remove dynamic addresses if link is down
    net: handle NAPI_GRO_FREE_STOLEN_HEAD case also in napi_frags_finish()
    bpf: prevent leaking pointer via xadd on unpriviledged
    net: prevent sign extension in dev_get_stats()
    tcp: reset sk_rx_dst in tcp_disconnect()
    net: dp83640: Avoid NULL pointer dereference.
    ipv6: avoid unregistering inet6_dev for loopback
    net/phy: micrel: configure intterupts after autoneg workaround
    net: sched: Fix one possible panic when no destroy callback
    net_sched: fix error recovery at qdisc creation
Linux 4.4.77
    saa7134: fix warm Medion 7134 EEPROM read
    x86/mm/pat: Don't report PAT on CPUs that don't support it
    ext4: check return value of kstrtoull correctly in reserved_clusters_store
    staging: comedi: fix clean-up of comedi_class in comedi_init()
    staging: vt6556: vnt_start Fix missing call to vnt_key_init_table.
    tcp: fix tcp_mark_head_lost to check skb len before fragmenting
    md: fix super_offset endianness in super_1_rdev_size_change
    md: fix incorrect use of lexx_to_cpu in does_sb_need_changing
    perf tools: Use readdir() instead of deprecated readdir_r() again
    perf tests: Remove wrong semicolon in while loop in CQM test
    perf trace: Do not process PERF_RECORD_LOST twice
    perf dwarf: Guard !x86_64 definitions under #ifdef else clause
    perf pmu: Fix misleadingly indented assignment (whitespace)
    perf annotate browser: Fix behaviour of Shift-Tab with nothing focussed
    perf tools: Remove duplicate const qualifier
    perf script: Use readdir() instead of deprecated readdir_r()
    perf thread_map: Use readdir() instead of deprecated readdir_r()
    perf tools: Use readdir() instead of deprecated readdir_r()
    perf bench numa: Avoid possible truncation when using snprintf()
    perf tests: Avoid possible truncation with dirent->d_name + snprintf
    perf scripting perl: Fix compile error with some perl5 versions
    perf thread_map: Correctly size buffer used with dirent->dt_name
    perf intel-pt: Use __fallthrough
    perf top: Use __fallthrough
    tools strfilter: Use __fallthrough
    tools string: Use __fallthrough in perf_atoll()
    tools include: Add a __fallthrough statement
    mqueue: fix a use-after-free in sys_mq_notify()
    RDMA/uverbs: Check port number supplied by user verbs cmds
    KEYS: Fix an error code in request_master_key()
    ath10k: override CE5 config for QCA9377
    x86/uaccess: Optimize copy_user_enhanced_fast_string() for short strings
    x86/tools: Fix gcc-7 warning in relocs.c
    gfs2: Fix glock rhashtable rcu bug
    USB: serial: qcserial: new Sierra Wireless EM7305 device ID
    USB: serial: option: add two Longcheer device ids
    pinctrl: sh-pfc: Update info pointer after SoC-specific init
    pinctrl: mxs: atomically switch mux and drive strength config
    pinctrl: sunxi: Fix SPDIF function name for A83T
    pinctrl: meson: meson8b: fix the NAND DQS pins
    pinctrl: sh-pfc: r8a7791: Fix SCIF2 pinmux data
    sysctl: report EINVAL if value is larger than UINT_MAX for proc_douintvec
    sysctl: don't print negative flag for proc_douintvec
    mac80211_hwsim: Replace bogus hrtimer clockid
    usb: Fix typo in the definition of Endpoint[out]Request
    usb: usbip: set buffer pointers to NULL after free
    Add USB quirk for HVR-950q to avoid intermittent device resets
    USB: serial: cp210x: add ID for CEL EM3588 USB ZigBee stick
    usb: dwc3: replace %p with %pK
    drm/virtio: don't leak bo on drm_gem_object_init failure
    tracing/kprobes: Allow to create probe with a module name starting with a digit
    mm: fix classzone_idx underflow in shrink_zones()
    bgmac: reset & enable Ethernet core before using it
    driver core: platform: fix race condition with driver_override
    fs: completely ignore unknown open flags
    fs: add a VALID_OPEN_FLAGS
Linux 4.4.76
    KVM: nVMX: Fix exception injection
    KVM: x86: zero base3 of unusable segments
    KVM: x86/vPMU: fix undefined shift in intel_pmu_refresh()
    KVM: x86: fix emulation of RSM and IRET instructions
    cpufreq: s3c2416: double free on driver init error path
    iommu/amd: Fix incorrect error handling in amd_iommu_bind_pasid()
    iommu: Handle default domain attach failure
    iommu/vt-d: Don't over-free page table directories
    ocfs2: o2hb: revert hb threshold to keep compatible
    x86/mm: Fix flush_tlb_page() on Xen
    x86/mpx: Correctly report do_mpx_bt_fault() failures to user-space
    ARM: 8685/1: ensure memblock-limit is pmd-aligned
    ARM64/ACPI: Fix BAD_MADT_GICC_ENTRY() macro implementation
    sched/loadavg: Avoid loadavg spikes caused by delayed NO_HZ accounting
    watchdog: bcm281xx: Fix use of uninitialized spinlock.
    xfrm: Oops on error in pfkey_msg2xfrm_state()
    xfrm: NULL dereference on allocation failure
    xfrm: fix stack access out of bounds with CONFIG_XFRM_SUB_POLICY
    jump label: fix passing kbuild_cflags when checking for asm goto support
    ravb: Fix use-after-free on `ifconfig eth0 down`
    sctp: check af before verify address in sctp_addr_id2transport
    net/mlx4_core: Eliminate warning messages for SRQ_LIMIT under SRIOV
    perf probe: Fix to show correct locations for events on modules
    be2net: fix status check in be_cmd_pmac_add()
    s390/ctl_reg: make __ctl_load a full memory barrier
    swiotlb: ensure that page-sized mappings are page-aligned
    coredump: Ensure proper size of sparse core files
    x86/mpx: Use compatible types in comparison to fix sparse error
    mac80211: initialize SMPS field in HT capabilities
    spi: davinci: use dma_mapping_error()
    scsi: lpfc: avoid double free of resource identifiers
    HID: i2c-hid: Add sleep between POWER ON and RESET
    kernel/panic.c: add missing \n
    ibmveth: Add a proper check for the availability of the checksum features
    vxlan: do not age static remote mac entries
    virtio_net: fix PAGE_SIZE > 64k
    vfio/spapr: fail tce_iommu_attach_group() when iommu_data is null
    drm/amdgpu: check ring being ready before using
    net: dsa: Check return value of phy_connect_direct()
    amd-xgbe: Check xgbe_init() return code
    platform/x86: ideapad-laptop: handle ACPI event 1
    scsi: virtio_scsi: Reject commands when virtqueue is broken
    xen-netfront: Fix Rx stall during network stress and OOM
    swiotlb-xen: update dev_addr after swapping pages
    virtio_console: fix a crash in config_work_handler
    Btrfs: fix truncate down when no_holes feature is enabled
    gianfar: Do not reuse pages from emergency reserve
    powerpc/eeh: Enable IO path on permanent error
    net: bgmac: Remove superflous netif_carrier_on()
    net: bgmac: Start transmit queue in bgmac_open
    net: bgmac: Fix SOF bit checking
    bgmac: Fix reversed test of build_skb() return value.
    mtd: bcm47xxpart: don't fail because of bit-flips
    bgmac: fix a missing check for build_skb
    mtd: bcm47xxpart: limit scanned flash area on BCM47XX (MIPS) only
    MIPS: ralink: fix MT7628 wled_an pinmux gpio
    MIPS: ralink: fix MT7628 pinmux typos
    MIPS: ralink: Fix invalid assignment of SoC type
    MIPS: ralink: fix USB frequency scaling
    MIPS: ralink: MT7688 pinmux fixes
    net: korina: Fix NAPI versus resources freeing
    MIPS: ath79: fix regression in PCI window initialization
    net: mvneta: Fix for_each_present_cpu usage
    ARM: dts: BCM5301X: Correct GIC_PPI interrupt flags
    qla2xxx: Fix erroneous invalid handle message
    scsi: lpfc: Set elsiocb contexts to NULL after freeing it
    scsi: sd: Fix wrong DPOFUA disable in sd_read_cache_type
    KVM: x86: fix fixing of hypercalls
    mm: numa: avoid waiting on freed migrated pages
    block: fix module reference leak on put_disk() call for cgroups throttle
    sysctl: enable strict writes
    usb: gadget: f_fs: Fix possibe deadlock
    drm/vmwgfx: Free hash table allocated by cmdbuf managed res mgr
    ALSA: hda - set input_path bitmap to zero after moving it to new place
    ALSA: hda - Fix endless loop of codec configure
    MIPS: Fix IRQ tracing & lockdep when rescheduling
    MIPS: pm-cps: Drop manual cache-line alignment of ready_count
    MIPS: Avoid accidental raw backtrace
    mm, swap_cgroup: reschedule when neeed in swap_cgroup_swapoff()
    drm/ast: Handle configuration without P2A bridge
    NFSv4: fix a reference leak caused WARNING messages
    netfilter: synproxy: fix conntrackd interaction
    netfilter: xt_TCPMSS: add more sanity tests on tcph->doff
    rtnetlink: add IFLA_GROUP to ifla_policy
    ipv6: Do not leak throw route references
    sfc: provide dummy definitions of vswitch functions
    net: 8021q: Fix one possible panic caused by BUG_ON in free_netdev
    decnet: always not take dst->__refcnt when inserting dst into hash table
    net/mlx5: Wait for FW readiness before initializing command interface
    ipv6: fix calling in6_ifa_hold incorrectly for dad work
    igmp: add a missing spin_lock_init()
    igmp: acquire pmc lock for ip_mc_clear_src()
    net: caif: Fix a sleep-in-atomic bug in cfpkt_create_pfx
    Fix an intermittent pr_emerg warning about lo becoming free.
    af_unix: Add sockaddr length checks before accessing sa_family in bind and connect handlers
    net: Zero ifla_vf_info in rtnl_fill_vfinfo()
    decnet: dn_rtmsg: Improve input length sanitization in dnrmg_receive_user_skb
    net: don't call strlen on non-terminated string in dev_set_alias()
    ipv6: release dst on error in ip6_dst_lookup_tail
Linux 4.4.75
    nvme: apply DELAY_BEFORE_CHK_RDY quirk at probe time too
    nvme/quirk: Add a delay before checking for adapter readiness
    net: phy: fix marvell phy status reading
    net: phy: Initialize mdio clock at probe function
    usb: gadget: f_fs: avoid out of bounds access on comp_desc
    powerpc/slb: Force a full SLB flush when we insert for a bad EA
    mtd: spi-nor: fix spansion quad enable
    of: Add check to of_scan_flat_dt() before accessing initial_boot_params
    rxrpc: Fix several cases where a padded len isn't checked in ticket decode
    USB: usbip: fix nonconforming hub descriptor
    drm/amdgpu: adjust default display clock
    drm/amdgpu/atom: fix ps allocation size for EnableDispPowerGating
    drm/radeon: add a quirk for Toshiba Satellite L20-183
    drm/radeon: add a PX quirk for another K53TK variant
    iscsi-target: Reject immediate data underflow larger than SCSI transfer length
    target: Fix kref->refcount underflow in transport_cmd_finish_abort
    time: Fix clock->read(clock) race around clocksource changes
    Input: i8042 - add Fujitsu Lifebook AH544 to notimeout list
    powerpc/kprobes: Pause function_graph tracing during jprobes handling
    signal: Only reschedule timers on signals timers have sent
    HID: Add quirk for Dell PIXART OEM mouse
    CIFS: Improve readdir verbosity
    KVM: PPC: Book3S HV: Preserve userspace HTM state properly
    lib/cmdline.c: fix get_options() overflow while parsing ranges
    autofs: sanity check status reported with AUTOFS_DEV_IOCTL_FAIL
    fs/exec.c: account for argv/envp pointers
Linux 4.4.74
    mm: fix new crash in unmapped_area_topdown()
    Allow stack to grow up to address space limit
    mm: larger stack guard gap, between vmas
    alarmtimer: Rate limit periodic intervals
    MIPS: Fix bnezc/jialc return address calculation
    usb: dwc3: exynos fix axius clock error path to do cleanup
    alarmtimer: Prevent overflow of relative timers
    genirq: Release resources in __setup_irq() error path
    swap: cond_resched in swap_cgroup_prepare()
    mm/memory-failure.c: use compound_head() flags for huge pages
    USB: gadgetfs, dummy-hcd, net2280: fix locking for callbacks
    usb: xhci: ASMedia ASM1042A chipset need shorts TX quirk
    drivers/misc/c2port/c2port-duramar2150.c: checking for NULL instead of IS_ERR()
    usb: r8a66597-hcd: decrease timeout
    usb: r8a66597-hcd: select a different endpoint on timeout
    USB: gadget: dummy_hcd: fix hub-descriptor removable fields
    pvrusb2: reduce stack usage pvr2_eeprom_analyze()
    usb: core: fix potential memory leak in error path during hcd creation
    USB: hub: fix SS max number of ports
    iio: proximity: as3935: recalibrate RCO after resume
    staging: rtl8188eu: prevent an underflow in rtw_check_beacon_data()
    mfd: omap-usb-tll: Fix inverted bit use for USB TLL mode
    x86/mm/32: Set the '__vmalloc_start_set' flag in initmem_init()
    serial: efm32: Fix parity management in 'efm32_uart_console_get_options()'
    mac80211: fix IBSS presp allocation size
    mac80211: fix CSA in IBSS mode
    mac80211/wpa: use constant time memory comparison for MACs
    mac80211: don't look at the PM bit of BAR frames
    vb2: Fix an off by one error in 'vb2_plane_vaddr'
    cpufreq: conservative: Allow down_threshold to take values from 1 to 10
    can: gs_usb: fix memory leak in gs_cmd_reset()
    configfs: Fix race between create_link and configfs_rmdir
Linux 4.4.73
    sparc64: make string buffers large enough
    s390/kvm: do not rely on the ILC on kvm host protection fauls
    xtensa: don't use linux IRQ #0
    tipc: ignore requests when the connection state is not CONNECTED
    proc: add a schedule point in proc_pid_readdir()
    romfs: use different way to generate fsid for BLOCK or MTD
    sctp: sctp_addr_id2transport should verify the addr before looking up assoc
    r8152: avoid start_xmit to schedule napi when napi is disabled
    r8152: fix rtl8152_post_reset function
    r8152: re-schedule napi for tx
    nfs: Fix "Don't increment lock sequence ID after NFS4ERR_MOVED"
    ravb: unmap descriptors when freeing rings
    drm/ast: Fixed system hanged if disable P2A
    drm/nouveau: Don't enabling polling twice on runtime resume
    parisc, parport_gsc: Fixes for printk continuation lines
    net: adaptec: starfire: add checks for dma mapping errors
    pinctrl: berlin-bg4ct: fix the value for "sd1a" of pin SCRD0_CRD_PRES
    gianfar: synchronize DMA API usage by free_skb_rx_queue w/ gfar_new_page
    net/mlx4_core: Avoid command timeouts during VF driver device shutdown
    drm/nouveau/fence/g84-: protect against concurrent access to semaphore buffers
    drm/nouveau: prevent userspace from deleting client object
    ipv6: fix flow labels when the traffic class is non-0
    FS-Cache: Initialise stores_lock in netfs cookie
    fscache: Clear outstanding writes when disabling a cookie
    fscache: Fix dead object requeue
    ethtool: do not vzalloc(0) on registers dump
    log2: make order_base_2() behave correctly on const input value zero
    kasan: respect /proc/sys/kernel/traceoff_on_warning
    jump label: pass kbuild_cflags when checking for asm goto support
    PM / runtime: Avoid false-positive warnings from might_sleep_if()
    ipv6: Fix IPv6 packet loss in scenarios involving roaming + snooping switches
    i2c: piix4: Fix request_region size
    sierra_net: Add support for IPv6 and Dual-Stack Link Sense Indications
    sierra_net: Skip validating irrelevant fields for IDLE LSIs
    net: hns: Fix the device being used for dma mapping during TX
    NET: mkiss: Fix panic
    NET: Fix /proc/net/arp for AX.25
    ipv6: Inhibit IPv4-mapped src address on the wire.
    ipv6: Handle IPv4-mapped src to in6addr_any dst.
    net: xilinx_emaclite: fix receive buffer overflow
    net: xilinx_emaclite: fix freezes due to unordered I/O
    Call echo service immediately after socket reconnect
    staging: rtl8192e: rtl92e_fill_tx_desc fix write to mapped out memory.
    ARM: dts: imx6dl: Fix the VDD_ARM_CAP voltage for 396MHz operation
    partitions/msdos: FreeBSD UFS2 file systems are not recognized
    s390/vmem: fix identity mapping
Linux 4.4.72
    arm64: ensure extension of smp_store_release value
    arm64: armv8_deprecated: ensure extension of addr
    usercopy: Adjust tests to deal with SMAP/PAN
    RDMA/qib,hfi1: Fix MR reference count leak on write with immediate
    arm64: entry: improve data abort handling of tagged pointers
    arm64: hw_breakpoint: fix watchpoint matching for tagged pointers
    Make __xfs_xattr_put_listen preperly report errors.
    NFSv4: Don't perform cached access checks before we've OPENed the file
    NFS: Ensure we revalidate attributes before using execute_ok()
    mm: consider memblock reservations for deferred memory initialization sizing
    net: better skb->sender_cpu and skb->napi_id cohabitation
    serial: sh-sci: Fix panic when serial console and DMA are enabled
    tty: Drop krefs for interrupted tty lock
    drivers: char: mem: Fix wraparound check to allow mappings up to the end
    ASoC: Fix use-after-free at card unregistration
    ALSA: timer: Fix missing queue indices reset at SNDRV_TIMER_IOCTL_SELECT
    ALSA: timer: Fix race between read and ioctl
    drm/nouveau/tmr: fully separate alarm execution/pending lists
    drm/vmwgfx: Make sure backup_handle is always valid
    drm/vmwgfx: limit the number of mip levels in vmw_gb_surface_define_ioctl()
    drm/vmwgfx: Handle vmalloc() failure in vmw_local_fifo_reserve()
    perf/core: Drop kernel samples even though :u is specified
    powerpc/hotplug-mem: Fix missing endian conversion of aa_index
    powerpc/numa: Fix percpu allocations to be NUMA aware
    powerpc/eeh: Avoid use after free in eeh_handle_special_event()
    scsi: qla2xxx: don't disable a not previously enabled PCI device
    KVM: arm/arm64: Handle possible NULL stage2 pud when ageing pages
    btrfs: fix memory leak in update_space_info failure path
    btrfs: use correct types for page indices in btrfs_page_exists_in_range
    cxl: Fix error path on bad ioctl
    ufs_getfrag_block(): we only grab ->truncate_mutex on block creation path
    ufs_extend_tail(): fix the braino in calling conventions of ufs_new_fragments()
    ufs: set correct ->s_maxsize
    ufs: restore maintaining ->i_blocks
    fix ufs_isblockset()
    ufs: restore proper tail allocation
    fs: add i_blocksize()
    cpuset: consider dying css as offline
    Input: elantech - add Fujitsu Lifebook E546/E557 to force crc_enabled
    drm/msm: Expose our reservation object when exporting a dmabuf.
    target: Re-add check to reject control WRITEs with overflow data
    cpufreq: cpufreq_register_driver() should return -ENODEV if init fails
    stackprotector: Increase the per-task stack canary's random range from 32 bits to 64 bits on 64-bit platforms
    random: properly align get_random_int_hash
    drivers: char: random: add get_random_long()
    iio: proximity: as3935: fix AS3935_INT mask
    iio: light: ltr501 Fix interchanged als/ps register field
    staging/lustre/lov: remove set_fs() call from lov_getstripe()
    usb: chipidea: debug: check before accessing ci_role
    usb: chipidea: udc: fix NULL pointer dereference if udc_start failed
    usb: gadget: f_mass_storage: Serialize wake and sleep execution
    ext4: fix fdatasync(2) after extent manipulation operations
    ext4: keep existing extra fields when inode expands
    ext4: fix SEEK_HOLE
    xen-netfront: cast grant table reference first to type int
    xen-netfront: do not cast grant table reference to signed short
    xen/privcmd: Support correctly 64KB page granularity when mapping memory
    dmaengine: ep93xx: Always start from BASE0
    dmaengine: usb-dmac: Fix DMAOR AE bit definition
    KVM: async_pf: avoid async pf injection when in guest mode
    arm: KVM: Allow unaligned accesses at HYP
    KVM: cpuid: Fix read/write out-of-bounds vulnerability in cpuid emulation
    kvm: async_pf: fix rcu_irq_enter() with irqs enabled
    nfsd: Fix up the "supattr_exclcreat" attributes
    nfsd4: fix null dereference on replay
    drm/amdgpu/ci: disable mclk switching for high refresh rates (v2)
    crypto: gcm - wait for crypto op not signal safe
    KEYS: fix freeing uninitialized memory in key_update()
    KEYS: fix dereferencing NULL payload with nonzero length
    ptrace: Properly initialize ptracer_cred on fork
    serial: ifx6x60: fix use-after-free on module unload
    arch/sparc: support NR_CPUS = 4096
    sparc64: delete old wrap code
    sparc64: new context wrap
    sparc64: add per-cpu mm of secondary contexts
    sparc64: redefine first version
    sparc64: combine activate_mm and switch_mm
    sparc64: reset mm cpumask after wrap
    sparc: Machine description indices can vary
    sparc64: mm: fix copy_tsb to correctly copy huge page TSBs
    net: bridge: start hello timer only if device is up
    net: ethoc: enable NAPI before poll may be scheduled
    net: ping: do not abuse udp_poll()
    ipv6: Fix leak in ipv6_gso_segment().
    vxlan: fix use-after-free on deletion
    tcp: disallow cwnd undo when switching congestion control
    cxgb4: avoid enabling napi twice to the same queue
    ipv6: xfrm: Handle errors reported by xfrm6_find_1stfragopt()
    bnx2x: Fix Multi-Cos
Linux 4.4.71
    xfs: only return -errno or success from attr ->put_listent
    xfs: in _attrlist_by_handle, copy the cursor back to userspace
    xfs: fix unaligned access in xfs_btree_visit_blocks
    xfs: bad assertion for delalloc an extent that start at i_size
    xfs: fix indlen accounting error on partial delalloc conversion
    xfs: wait on new inodes during quotaoff dquot release
    xfs: update ag iterator to support wait on new inodes
    xfs: support ability to wait on new inodes
    xfs: fix up quotacheck buffer list error handling
    xfs: prevent multi-fsb dir readahead from reading random blocks
    xfs: handle array index overrun in xfs_dir2_leaf_readbuf()
    xfs: fix over-copying of getbmap parameters from userspace
    xfs: fix off-by-one on max nr_pages in xfs_find_get_desired_pgoff()
    xfs: Fix missed holes in SEEK_HOLE implementation
    mlock: fix mlock count can not decrease in race condition
    mm/migrate: fix refcount handling when !hugepage_migration_supported()
    drm/gma500/psb: Actually use VBT mode when it is found
    slub/memcg: cure the brainless abuse of sysfs attributes
    ALSA: hda - apply STAC_9200_DELL_M22 quirk for Dell Latitude D430
    pcmcia: remove left-over %Z format
    drm/radeon: Unbreak HPD handling for r600+
    drm/radeon/ci: disable mclk switching for high refresh rates (v2)
    scsi: mpt3sas: Force request partial completion alignment
    HID: wacom: Have wacom_tpc_irq guard against possible NULL dereference
    mmc: sdhci-iproc: suppress spurious interrupt with Multiblock read
    i2c: i2c-tiny-usb: fix buffer not being DMA capable
    vlan: Fix tcp checksum offloads in Q-in-Q vlans
    net: phy: marvell: Limit errata to 88m1101
    netem: fix skb_orphan_partial()
    ipv4: add reference counting to metrics
    sctp: fix ICMP processing if skb is non-linear
    tcp: avoid fastopen API to be used on AF_UNSPEC
    virtio-net: enable TSO/checksum offloads for Q-in-Q vlans
    be2net: Fix offload features for Q-in-Q packets
    ipv6: fix out of bound writes in __ip6_append_data()
    bridge: start hello_timer when enabling KERNEL_STP in br_stp_start
    qmi_wwan: add another Lenovo EM74xx device ID
    bridge: netlink: check vlan_default_pvid range
    ipv6: Check ip6_find_1stfragopt() return value properly.
    ipv6: Prevent overrun when parsing v6 header options
    net: Improve handling of failures on link and route dumps
    tcp: eliminate negative reordering in tcp_clean_rtx_queue
    sctp: do not inherit ipv6_{mc|ac|fl}_list from parent
    sctp: fix src address selection if using secondary addresses for ipv6
    tcp: avoid fragmenting peculiar skbs in SACK
    s390/qeth: avoid null pointer dereference on OSN
    s390/qeth: unbreak OSM and OSN support
    s390/qeth: handle sysfs error during initialization
    ipv6/dccp: do not inherit ipv6_mc_list from parent
    dccp/tcp: do not inherit mc_list from parent
    sparc: Fix -Wstringop-overflow warning

Bug: 62730977
Change-Id: Ifca755d82f9e4b11016f6660298c2c1b073bfb3a
Signed-off-by: Thierry Strudel <tstrudel@google.com>
2017-09-20 16:42:37 -07:00
Cong Wang
034e10b4f8 mqueue: fix a use-after-free in sys_mq_notify()
commit f991af3daabaecff34684fd51fac80319d1baad1 upstream.

The retry logic for netlink_attachskb() inside sys_mq_notify()
is nasty and vulnerable:

1) The sock refcnt is already released when retry is needed
2) The fd is controllable by user-space because we already
   release the file refcnt

so when we retry but the fd has just been closed by user-space
during this small window, we end up calling netlink_detachskb()
on the error path which releases the sock again, later when
the user-space closes this socket a use-after-free could be
triggered.

Setting 'sock' to NULL here should be sufficient to fix it.
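
The double release described above can be modelled in user space. What follows is a minimal sketch and not the kernel code: `fake_sock`, `fake_attach()` and `notify_model()` are hypothetical stand-ins for the socket, for netlink_attachskb() and for the retry path in sys_mq_notify(), and the reference count is simplified to a plain integer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a reference-counted socket. */
struct fake_sock {
	int refcnt;
};

static void fake_sock_put(struct fake_sock *s)
{
	assert(s->refcnt > 0);
	s->refcnt--;
}

/*
 * Models an attach helper that, when it asks the caller to retry
 * (return value 1), has already dropped the caller's reference.
 */
static int fake_attach(struct fake_sock *s, int ask_retry)
{
	if (ask_retry) {
		fake_sock_put(s);
		return 1;
	}
	return 0;
}

/*
 * Models one pass through the retry path where the second lookup fails
 * because user space closed the fd in the meantime.  The object starts
 * with one reference held elsewhere; the final count is returned.
 */
int notify_model(struct fake_sock *s, int clear_on_retry)
{
	struct fake_sock *sock;

	s->refcnt++;			/* the lookup takes a reference */
	sock = s;

	if (fake_attach(sock, 1) == 1) {
		if (clear_on_retry)
			sock = NULL;	/* the fix: forget the stale pointer */
		/* retry: the fd is gone, so we fall through to the error path */
	}

	if (sock)			/* error path cleanup */
		fake_sock_put(sock);

	return s->refcnt;
}
```

Starting from a count of 1, calling `notify_model()` with `clear_on_retry` set to 0 ends at 0 even though another holder still expects its reference, which is the precondition for the use-after-free. With `clear_on_retry` set to 1 the count stays at 1.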

Reported-by: GeneBlue <geneblue.mail@gmail.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Manfred Spraul <manfred@colorfullife.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-07-15 11:57:47 +02:00
Daniel Rosenberg
7fc825be6b ANDROID: vfs: Add permission2 for filesystems with per mount permissions
This allows filesystems to use their mount private data to
influence the permissions they return in permission2. It has
been separated into a new call to avoid disrupting current
permission users.

Change-Id: I9d416e3b8b6eca84ef3e336bd2af89ddd51df6ca
Signed-off-by: Daniel Rosenberg <drosen@google.com>
2017-01-10 10:43:29 -08:00
Marcus Gelderie
de54b9ac25 ipc: modify message queue accounting to not take kernel data structures into account
A while back, the message queue implementation in the kernel was
improved to use btrees to speed up retrieval of messages, in commit
d6629859b3 ("ipc/mqueue: improve performance of send/recv").

That patch introducing the improved kernel handling of message queues
(using btrees) has, as a by-product, changed the meaning of the QSIZE
field in the pseudo-file created for the queue.  Before, this field
reflected the size of the user-data in the queue.  Since, it also takes
kernel data structures into account.  For example, if 13 bytes of user
data are in the queue, on my machine the file reports a size of 61
bytes.

There was some discussion on this topic before (for example
https://lkml.org/lkml/2014/10/1/115).  Commenting on the lkml thread, Michael
Kerrisk gave the following background
(https://lkml.org/lkml/2015/6/16/74):

    The pseudofiles in the mqueue filesystem (usually mounted at
    /dev/mqueue) expose fields with metadata describing a message
    queue. One of these fields, QSIZE, as originally implemented,
    showed the total number of bytes of user data in all messages in
    the message queue, and this feature was documented from the
    beginning in the mq_overview(7) page. In 3.5, some other (useful)
    work happened to break the user-space API in a couple of places,
    including the value exposed via QSIZE, which now includes a measure
    of kernel overhead bytes for the queue, a figure that renders QSIZE
    useless for its original purpose, since there's no way to deduce
    the number of overhead bytes consumed by the implementation.
    (The other user-space breakage was subsequently fixed.)

This patch removes the accounting of kernel data structures in the
queue.  Reporting the size of these data-structures in the QSIZE field
was a breaking change (see Michael's comment above).  Without the QSIZE
field reporting the total size of user-data in the queue, there is no
way to deduce this number.

It should be noted that the resource limit RLIMIT_MSGQUEUE is counted
against the worst-case size of the queue (in both the old and the new
implementation).  Therefore, the kernel overhead accounting in QSIZE is
not necessary to help the user understand the limitations RLIMIT imposes
on the processes.
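
The difference between the two meanings of QSIZE can be sketched as plain arithmetic. This is an illustration only, not the ipc/mqueue.c code: the function names are hypothetical, and `overhead_bytes` is a made-up lump sum standing in for whatever kernel bookkeeping was being counted. The value 48 used in the note below is simply 61 minus 13 from the example above, not a documented constant.

```c
#include <assert.h>
#include <stddef.h>

/* QSIZE as originally documented: bytes of user data only. */
size_t qsize_user_data(const size_t *msg_len, size_t nmsgs)
{
	size_t total = 0;

	assert(msg_len != NULL || nmsgs == 0);
	for (size_t i = 0; i < nmsgs; i++)
		total += msg_len[i];
	return total;
}

/*
 * QSIZE when kernel bookkeeping is counted as well.  overhead_bytes is
 * a hypothetical lump sum; the real figure depends on kernel internals
 * and cannot be deduced from user space.
 */
size_t qsize_with_overhead(const size_t *msg_len, size_t nmsgs,
			   size_t overhead_bytes)
{
	return qsize_user_data(msg_len, nmsgs) + overhead_bytes;
}
```

For a queue holding a single 13-byte message, `qsize_user_data()` yields 13, while `qsize_with_overhead()` with an assumed overhead of 48 yields 61. A reader of the second value cannot recover the first without knowing the overhead, which is the problem the patch addresses.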

Signed-off-by: Marcus Gelderie <redmnic@gmail.com>
Acked-by: Doug Ledford <dledford@redhat.com>
Acked-by: Michael Kerrisk <mtk.manpages@gmail.com>
Acked-by: Davidlohr Bueso <dbueso@suse.de>
Cc: David Howells <dhowells@redhat.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: John Duffy <jb_duffy@btinternet.com>
Cc: Arto Bendiken <arto@bendiken.net>
Cc: Manfred Spraul <manfred@colorfullife.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-08-07 04:39:39 +03:00
Davidlohr Bueso
fa6004ad45 ipc/mqueue: Implement lockless pipelined wakeups
This patch moves the wake_up_process() invocation so it is not done under
the info->lock by making use of a lockless wake_q. With this change, the
waiter is woken up once it is STATE_READY and it does not need to loop
on SMP if it is still in STATE_PENDING. In the timeout case we still need
to grab the info->lock to verify the state.

This change should also avoid the introduction of preempt_disable() in -rt
which avoids a busy-loop which polls for the STATE_PENDING -> STATE_READY
change if the waiter has a higher priority compared to the waker.

Additionally, this patch micro-optimizes wq_sleep by using the cheaper
cousin of set_current_state(TASK_INTERRUPTIBLE) as we will block no
matter what, thus get rid of the implied barrier.

Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: George Spelvin <linux@horizon.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Chris Mason <clm@fb.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Manfred Spraul <manfred@colorfullife.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: dave@stgolabs.net
Link: http://lkml.kernel.org/r/1430748166.1940.17.camel@stgolabs.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-05-08 12:23:07 +02:00
David Howells
75c3cfa855 VFS: assorted weird filesystems: d_inode() annotations
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-04-15 15:06:58 -04:00
Al Viro
9f45f5bf30 new helper: audit_file()
... for situations when we don't have any candidate in pathnames - basically,
in descriptor-based syscalls.

[Folded the build fix for !CONFIG_AUDITSYSCALL configs from Chen Gang]

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-11-19 13:01:26 -05:00
Davidlohr Bueso
6d08a2567c ipc: use device_initcall
... since __initcall is now deprecated.

Signed-off-by: Davidlohr Bueso <davidlohr@hp.com>
Cc: Manfred Spraul <manfred@colorfullife.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-04-07 16:36:11 -07:00
Davidlohr Bueso
f3713fd9cf ipc,mqueue: remove limits for the amount of system-wide queues
Commit 93e6f119c0 ("ipc/mqueue: cleanup definition names and
locations") added global hardcoded limits to the amount of message
queues that can be created.  While these limits are per-namespace,
reality is that it ends up breaking userspace applications.
Historically users have, at least in theory, been able to create up to
INT_MAX queues, and limiting it to just 1024 is way too low and dramatic
for some workloads and use cases.  For instance, Madars reports:

 "This update imposes bad limits on our multi-process application.  As
  our app uses approaches that each process opens its own set of queues
  (usually something about 3-5 queues per process).  In some scenarios
  we might run up to 3000 processes or more (which of-course for linux
  is not a problem).  Thus we might need up to 9000 queues or more.  All
  processes run under one user."

Other affected users can be found in launchpad bug #1155695:
  https://bugs.launchpad.net/ubuntu/+source/manpages/+bug/1155695

Instead of increasing this limit, revert it entirely and fallback to the
original way of dealing with queue limits -- where once a user's resource
limit is reached, and all memory is used, new queues cannot be created.

Signed-off-by: Davidlohr Bueso <davidlohr@hp.com>
Reported-by: Madars Vitolins <m@silodev.com>
Acked-by: Doug Ledford <dledford@redhat.com>
Cc: Manfred Spraul <manfred@colorfullife.com>
Cc: <stable@vger.kernel.org>	[3.5+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-02-25 15:25:45 -08:00
Davidlohr Bueso
3ab08fe204 ipc: remove braces for single statements
Deal with checkpatch messages:
     WARNING: braces {} are not necessary for single statement blocks

Signed-off-by: Davidlohr Bueso <davidlohr@hp.com>
Cc: Aswin Chandramouleeswaran <aswin@hp.com>
Cc: Rik van Riel <riel@redhat.com>
Acked-by: Manfred Spraul <manfred@colorfullife.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-27 21:02:39 -08:00
Manfred Spraul
239521f31d ipc: whitespace cleanup
The ipc code does not adhere to the typical Linux coding style.
This patch fixes lots of simple whitespace errors.

- mostly autogenerated by
  scripts/checkpatch.pl -f --fix \
	--types=pointer_location,spacing,space_before_tab
- one manual fixup (keep structure members tab-aligned)
- removal of additional space_before_tab that were not found by --fix

Tested with some of my msg and sem test apps.

Andrew: Could you include it in -mm and move it towards Linus' tree?

Signed-off-by: Manfred Spraul <manfred@colorfullife.com>
Suggested-by: Li Bin <huawei.libin@huawei.com>
Cc: Joe Perches <joe@perches.com>
Acked-by: Rafael Aquini <aquini@redhat.com>
Cc: Davidlohr Bueso <davidlohr@hp.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-27 21:02:39 -08:00
J. Bruce Fields
b21996e36c locks: break delegations on unlink
We need to break delegations on any operation that changes the set of
links pointing to an inode.  Start with unlink.

Such operations also hold the i_mutex on a parent directory.  Breaking a
delegation may require waiting for a timeout (by default 90 seconds) in
the case of an unresponsive NFS client.  To avoid blocking all directory
operations, we therefore drop locks before waiting for the delegation.
The logic then looks like:

	acquire locks
	...
	test for delegation; if found:
		take reference on inode
		release locks
		wait for delegation break
		drop reference on inode
		retry

It is possible this could never terminate.  (Even if we take precautions
to prevent another delegation being acquired on the same inode, we could
get a different inode on each retry.)  But this seems very unlikely.

The initial test for a delegation happens after the lock on the target
inode is acquired, but the directory inode may have been acquired
further up the call stack.  We therefore add a "struct inode **"
argument to any intervening functions, which we use to pass the inode
back up to the caller in the case it needs a delegation synchronously
broken.
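
The drop-locks-and-retry logic, including the "struct inode **" out-parameter, can be sketched in user space as follows. This is a simplified model under stated assumptions, not the VFS implementation: `fake_inode`, `FAKE_WOULD_BLOCK`, `unlink_locked()`, `wait_for_break()` and `unlink_with_retry()` are hypothetical names, locking is reduced to comments, and waiting for the delegation break is modelled by clearing a flag.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an inode with a delegation flag. */
struct fake_inode {
	int refcnt;
	int delegated;
};

#define FAKE_WOULD_BLOCK (-1)	/* hypothetical "delegation must be broken" code */

/*
 * Runs with locks held.  If the target has a delegation, take a
 * reference and hand the inode back through *delegated_inode instead
 * of waiting here.
 */
static int unlink_locked(struct fake_inode *target,
			 struct fake_inode **delegated_inode)
{
	if (target->delegated) {
		target->refcnt++;
		*delegated_inode = target;
		return FAKE_WOULD_BLOCK;
	}
	return 0;
}

/* Stands in for waiting until the delegation holder gives it up. */
static void wait_for_break(struct fake_inode *inode)
{
	inode->delegated = 0;
}

/* Returns the number of attempts needed to complete the unlink. */
int unlink_with_retry(struct fake_inode *target)
{
	int attempts = 0;

	for (;;) {
		struct fake_inode *delegated_inode = NULL;
		int err;

		attempts++;
		/* acquire locks */
		err = unlink_locked(target, &delegated_inode);
		/* release locks */

		if (!delegated_inode) {
			assert(err == 0);
			return attempts;
		}

		wait_for_break(delegated_inode);	/* no locks held here */
		delegated_inode->refcnt--;		/* drop reference on inode */
		/* retry */
	}
}
```

In this model an inode that starts with a delegation takes two attempts and ends with its reference count unchanged, while an inode without one completes on the first attempt. The model always terminates because `wait_for_break()` clears the flag; the non-termination caveat in the message applies to the real code, where a new delegation or a different inode can appear on each retry.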

Cc: David Howells <dhowells@redhat.com>
Cc: Tyler Hicks <tyhicks@canonical.com>
Cc: Dustin Kirkland <dustin.kirkland@gazzang.com>
Acked-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-11-09 00:16:42 -05:00
Jeff Layton
79f6530cb5 audit: fix mq_open and mq_unlink to add the MQ root as a hidden parent audit_names record
The old audit PATH records for mq_open looked like this:

  type=PATH msg=audit(1366282323.982:869): item=1 name=(null) inode=6777
  dev=00:0c mode=041777 ouid=0 ogid=0 rdev=00:00
  obj=system_u:object_r:tmpfs_t:s15:c0.c1023
  type=PATH msg=audit(1366282323.982:869): item=0 name="test_mq" inode=26732
  dev=00:0c mode=0100700 ouid=0 ogid=0 rdev=00:00
  obj=staff_u:object_r:user_tmpfs_t:s15:c0.c1023

...with the audit related changes that went into 3.7, they now look like this:

  type=PATH msg=audit(1366282236.776:3606): item=2 name=(null) inode=66655
  dev=00:0c mode=0100700 ouid=0 ogid=0 rdev=00:00
  obj=staff_u:object_r:user_tmpfs_t:s15:c0.c1023
  type=PATH msg=audit(1366282236.776:3606): item=1 name=(null) inode=6926
  dev=00:0c mode=041777 ouid=0 ogid=0 rdev=00:00
  obj=system_u:object_r:tmpfs_t:s15:c0.c1023
  type=PATH msg=audit(1366282236.776:3606): item=0 name="test_mq"

Both of these look wrong to me.  As Steve Grubb pointed out:

 "What we need is 1 PATH record that identifies the MQ.  The other PATH
  records probably should not be there."

Fix it to record the mq root as a parent, and flag it such that it
should be hidden from view when the names are logged, since the root of
the mq filesystem isn't terribly interesting.  With this change, we get
a single PATH record that looks more like this:

  type=PATH msg=audit(1368021604.836:484): item=0 name="test_mq" inode=16914
  dev=00:0c mode=0100644 ouid=0 ogid=0 rdev=00:00
  obj=unconfined_u:object_r:user_tmpfs_t:s0

In order to do this, a new audit_inode_parent_hidden() function is
added.  If we do it this way, then we avoid having the existing callers
of audit_inode needing to do any sort of flag conversion if auditing is
inactive.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Reported-by: Jiri Jaburek <jjaburek@redhat.com>
Cc: Steve Grubb <sgrubb@redhat.com>
Cc: Eric Paris <eparis@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-07-09 10:33:19 -07:00
Linus Torvalds
2c3de1c2d7 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace
Pull userns fixes from Eric W Biederman:
 "The bulk of the changes are fixing the worst consequences of the user
  namespace design oversight in not considering what happens when one
  namespace starts off as a clone of another namespace, as happens with
  the mount namespace.

  The rest of the changes are just plain bug fixes.

  Many thanks to Andy Lutomirski for pointing out many of these issues."

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
  userns: Restrict when proc and sysfs can be mounted
  ipc: Restrict mounting the mqueue filesystem
  vfs: Carefully propogate mounts across user namespaces
  vfs: Add a mount flag to lock read only bind mounts
  userns:  Don't allow creation if the user is chrooted
  yama:  Better permission check for ptraceme
  pid: Handle the exit of a multi-threaded init.
  scm: Require CAP_SYS_ADMIN over the current pidns to spoof pids.
2013-03-28 13:43:46 -07:00
Eric W. Biederman
a636b702ed ipc: Restrict mounting the mqueue filesystem
Only allow mounting the mqueue filesystem if the caller has CAP_SYS_ADMIN
rights over the ipc namespace.   The principle here is if you create
or have capabilities over it you can mount it, otherwise you get to live
with what other people have mounted.

This information is not particularly sensitive and mqueue essentially
only reports which POSIX message queues exist.  Still, when creating a
restricted environment for an application to live in, any extra
information may be of use to someone with sufficient creativity.  The
historical, if imperfect, way this information has been restricted has
been not to allow mounts, and restricting this to ipc namespace
creators maintains the spirit of the historical restriction.

Cc: stable@vger.kernel.org
Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2013-03-27 07:50:06 -07:00
Vladimir Davydov
38d78e587d mqueue: sys_mq_open: do not call mnt_drop_write() if read-only
mnt_drop_write() must be called only if mnt_want_write() succeeded,
otherwise the mnt_writers counter will diverge.

mnt_writers counters are used to check if remounting FS as read-only is
OK, so after an extra mnt_drop_write() call, it would be impossible to
remount mqueue FS as read-only.  Besides, on umount a warning would be
printed like this one:

  =====================================
  [ BUG: bad unlock balance detected! ]
  3.9.0-rc3 #5 Not tainted
  -------------------------------------
  a.out/12486 is trying to release lock (sb_writers) at:
  mnt_drop_write+0x1f/0x30
  but there are no more locks to release!
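
The balancing rule can be illustrated with a small userspace model. This is
a sketch under assumptions, not the kernel code: toy_mount, toy_want_write(),
toy_drop_write() and toy_open() are hypothetical stand-ins, and the writer
count is a plain integer. The point is that the drop happens only when the
matching want succeeded.

```c
#include <assert.h>
#include <errno.h>

/* Toy mount: "writers" plays the role of the mnt_writers counter. */
struct toy_mount {
	int readonly;
	int writers;
};

/* Fails on a read-only mount; otherwise records one more writer. */
static int toy_want_write(struct toy_mount *m)
{
	if (m->readonly)
		return -EROFS;
	m->writers++;
	return 0;
}

/* Must be paired with a successful toy_want_write(). */
static void toy_drop_write(struct toy_mount *m)
{
	assert(m->writers > 0);	/* an unbalanced drop would trip here */
	m->writers--;
}

/* Open path: remember whether the want succeeded and drop only then. */
static int toy_open(struct toy_mount *m)
{
	int ro = toy_want_write(m);

	/* ... look up or create the queue; creation would fail with ro ... */

	if (!ro)
		toy_drop_write(m);
	return ro;
}
```

With this pairing the counter returns to zero after every open, whether or
not the mount is read-only.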

Signed-off-by: Vladimir Davydov <vdavydov@parallels.com>
Cc: Doug Ledford <dledford@redhat.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-03-22 16:41:21 -07:00
Linus Torvalds
d895cb1af1 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull vfs pile (part one) from Al Viro:
 "Assorted stuff - cleaning namei.c up a bit, fixing ->d_name/->d_parent
  locking violations, etc.

  The most visible changes here are death of FS_REVAL_DOT (replaced with
  "has ->d_weak_revalidate()") and a new helper getting from struct file
  to inode.  Some bits of preparation to xattr method interface changes.

  Misc patches by various people sent this cycle *and* ocfs2 fixes from
  several cycles ago that should've been upstream right then.

  PS: the next vfs pile will be xattr stuff."

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (46 commits)
  saner proc_get_inode() calling conventions
  proc: avoid extra pde_put() in proc_fill_super()
  fs: change return values from -EACCES to -EPERM
  fs/exec.c: make bprm_mm_init() static
  ocfs2/dlm: use GFP_ATOMIC inside a spin_lock
  ocfs2: fix possible use-after-free with AIO
  ocfs2: Fix oops in ocfs2_fast_symlink_readpage() code path
  get_empty_filp()/alloc_file() leave both ->f_pos and ->f_version zero
  target: writev() on single-element vector is pointless
  export kernel_write(), convert open-coded instances
  fs: encode_fh: return FILEID_INVALID if invalid fid_type
  kill f_vfsmnt
  vfs: kill FS_REVAL_DOT by adding a d_weak_revalidate dentry op
  nfsd: handle vfs_getattr errors in acl protocol
  switch vfs_getattr() to struct path
  default SET_PERSONALITY() in linux/elf.h
  ceph: prepopulate inodes only when request is aborted
  d_hash_and_lookup(): export, switch open-coded instances
  9p: switch v9fs_set_create_acl() to inode+fid, do it before d_instantiate()
  9p: split dropping the acls from v9fs_set_create_acl()
  ...
2013-02-26 20:16:07 -08:00
Al Viro
496ad9aa8e new helper: file_inode(file)
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-02-22 23:31:31 -05:00
Gao feng
bc1b69ed22 userns: Allow the unprivileged users to mount mqueue fs
This patch allows an unprivileged user to mount mqueuefs in a
user namespace.

If two user namespaces share the same ipcns, the files in the mqueue fs
should be visible in both of them.

If a user namespace has its own ipcns, it has its own mqueue fs too.
The ipcns code already handles this correctly.

Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
2013-01-27 19:25:50 -08:00
Jeff Layton
adb5c2473d audit: make audit_inode take struct filename
Keep a pointer to the audit_names "slot" in struct filename.

Have all of the audit_inode callers pass a struct filename pointer to
audit_inode instead of a string pointer. If the aname field is already
populated, then we can skip walking the list altogether and just use it
directly.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-10-12 20:15:09 -04:00
Jeff Layton
91a27b2a75 vfs: define struct filename and have getname() return it
getname() is intended to copy pathname strings from userspace into a
kernel buffer. The result is just a string in kernel space. It would
however be quite helpful to be able to attach some ancillary info to
the string.

For instance, we could attach some audit-related info to reduce the
amount of audit-related processing needed. When auditing is enabled,
we could also call getname() on the string more than once and not
need to recopy it from userspace.

This patchset converts the getname()/putname() interfaces to return
a struct instead of a string. For now, the struct just tracks the
string in kernel space and the original userland pointer for it.

Later, we'll add other information to the struct as it becomes
convenient.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-10-12 20:14:55 -04:00
Jeff Layton
bfcec70874 audit: set the name_len in audit_inode for parent lookups
Currently, this gets set mostly by happenstance when we call into
audit_inode_child. While that might be a little more efficient, it seems
wrong. If the syscall ends up failing before audit_inode_child ever gets
called, then you'll have an audit_names record that shows the full path
but has the parent inode info attached.

Fix this by passing in a parent flag when we call audit_inode that gets
set to the value of LOOKUP_PARENT. We can then fix up the pathname for
the audit entry correctly from the get-go.

While we're at it, clean up the no-op macro for audit_inode in the
!CONFIG_AUDITSYSCALL case.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-10-12 00:32:01 -04:00
Michel Lespinasse
1638113d9d ipc/mqueue: remove unnecessary rb_init_node() calls
Commit d6629859b3 ("ipc/mqueue: improve performance of send/recv") and
ce2d52cc ("ipc/mqueue: add rbtree node caching support") introduced an
rbtree of message priorities, and usage of rb_init_node() to initialize
the corresponding nodes.  As it turns out, rb_init_node() is unnecessary
here, as the nodes are fully initialized on insertion by rb_link_node()
and the code doesn't access nodes that aren't inserted on the rbtree.

Removing the rb_init_node() calls as I removed that function during
rbtree API cleanups (the only other use of it was in a place that
similarly didn't require it).

Signed-off-by: Michel Lespinasse <walken@google.com>
Acked-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-10-09 16:22:31 +09:00
Al Viro
2903ff019b switch simple cases of fget_light to fdget
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-09-26 22:20:08 -04:00
Al Viro
515e0d6634 switch mqueue syscalls to fget_light()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-09-26 21:10:09 -04:00
Al Viro
312b90fbed mqueue: lift mnt_want_write() outside ->i_mutex, clean up a bit
the way it abuses ->d_fsdata still needs to be killed, but that's
a separate story.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-08-18 16:51:26 -04:00
Al Viro
765927b2d5 switch dentry_open() to struct path, make it grab references itself
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-07-23 00:01:29 +04:00
Al Viro
312b63fba9 don't pass nameidata * to vfs_create()
all we want is a boolean flag, same as the method gets now

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-07-14 16:34:50 +04:00
Al Viro
ebfc3b49a7 don't pass nameidata to ->create()
boolean "does it have to be exclusive?" flag is passed instead;
Local filesystem should just ignore it - the object is guaranteed
not to be there yet.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-07-14 16:34:47 +04:00
Doug Ledford
ce2d52cc13 ipc/mqueue: add rbtree node caching support
When I wrote the first patch that added the rbtree support for message
queue insertion, it sped up the case where the queue was very full
drastically from the original code.  It, however, slowed down the case
where the queue was empty (not drastically though).

This patch caches the last freed rbtree node struct so we can quickly
reuse it when we get a new message.  This is the common path for any queue
that very frequently goes from 0 to 1 then back to 0 messages in queue.

Andrew Morton didn't like that we were doing a GFP_ATOMIC allocation in
msg_insert, so this patch attempts to speculatively allocate a new node
struct outside of the spin lock when we know we need it, but will still
fall back to a GFP_ATOMIC allocation if it has to.

Once I added the caching, the necessary various ret = ; spin_unlock
gyrations in mq_timedsend were getting pretty ugly, so this also slightly
refactors that function to streamline the flow of the code and the
function exit.

Finally, while working on getting performance back I made sure that all of
the node structs were always fully initialized when they were first used,
rendering the use of kzalloc unnecessary and a waste of CPU cycles.

The net result of all of this is:

1) We will avoid a GFP_ATOMIC allocation when possible, but fall back
   on it when necessary.

2) We will speculatively allocate a node struct using GFP_KERNEL if our
   cache is empty (and save the struct to our cache if it's still empty
   after we have obtained the spin lock).

3) The performance of the common queue empty case has significantly
   improved and is now much more in line with the older performance for
   this case.
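
The caching scheme in points 1) and 2) can be sketched in userspace C. This
is a toy model with assumptions stated up front: toy_node, toy_queue,
toy_get_node() and toy_put_node() are hypothetical names, malloc() stands in
for both the GFP_KERNEL and the GFP_ATOMIC allocation, and the spin lock is
reduced to comments.

```c
#include <assert.h>
#include <stdlib.h>

struct toy_node {
	int priority;
};

/* One cached node per queue, as in the "last freed node" cache above. */
struct toy_queue {
	struct toy_node *node_cache;
};

static struct toy_node *toy_get_node(struct toy_queue *q)
{
	struct toy_node *spec = NULL;
	struct toy_node *node;

	/* Outside the lock: allocate speculatively if the cache looks empty. */
	if (!q->node_cache)
		spec = malloc(sizeof(*spec));

	/* lock taken here */
	if (!q->node_cache && spec) {
		q->node_cache = spec;	/* save the speculative node */
		spec = NULL;
	}
	node = q->node_cache;		/* common path: reuse the cached node */
	q->node_cache = NULL;
	if (!node)
		node = malloc(sizeof(*node));	/* fallback (atomic in the kernel) */
	/* lock released here */

	free(spec);			/* unused speculative node, if any */
	return node;
}

/* Return a node: keep it as the cache if the cache is empty, else free it. */
static void toy_put_node(struct toy_queue *q, struct toy_node *node)
{
	assert(node != NULL);
	if (!q->node_cache)
		q->node_cache = node;
	else
		free(node);
}
```

A queue that alternates between 0 and 1 messages keeps reusing the same node
in this model, which is the case the cache is meant to speed up.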

The performance changes are:

            Old mqueue      new mqueue      new mqueue + caching
queue empty
send/recv   305/288ns       349/318ns       310/322ns

I don't think we'll ever be able to get the recv performance back, but
that's because the old recv performance was a direct result and
consequence of the old method's abysmal send performance.  The recv path
simply must do more so that the send path does not incur such a penalty
under higher queue depths.

As it turns out, the new caching code also sped up the various queue full
cases relative to my last patch.  That could be because of the difference
between the syscall path in 3.3.4-rc5 and 3.3.4-rc6, or because of the
change in code flow in the mq_timedsend routine.  Regardless, I'll take
it.  It wasn't huge, and I *would* say it was within the margin for error,
but after many repeated runs what I'm seeing is that the old numbers trend
slightly higher (about 10 to 20ns depending on which test is the one
running).

[akpm@linux-foundation.org: checkpatch fixes]
Signed-off-by: Doug Ledford <dledford@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Manfred Spraul <manfred@colorfullife.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-05-31 17:49:31 -07:00
Doug Ledford
113289cc08 ipc/mqueue: strengthen checks on mqueue creation
We already check the mq attr struct if it's passed in, but now that the
admin can set system wide defaults separate from maximums, it's actually
possible to set the defaults to something that would overflow.  So, if
there is no attr struct passed in to the open call, check the default
values.

While we are at it, simplify mq_attr_ok() by making it return 0 or an
error condition, so that way if we add more tests to it later, we have the
option of what error should be returned instead of the calling location
having to pick a possibly inaccurate error code.

[akpm@linux-foundation.org: s/ENOMEM/EOVERFLOW/]
Signed-off-by: Doug Ledford <dledford@redhat.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Manfred Spraul <manfred@colorfullife.com>
Acked-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-05-31 17:49:31 -07:00
Doug Ledford
2c12ea498f ipc/mqueue: correct mq_attr_ok test
While working on the other parts of the mqueue stuff, I noticed that the
calculation for overflow in mq_attr_ok didn't actually match reality (this
is especially true since my last patch which changed how we account memory
slightly).

In particular, we used to test for overflow using:
  msgs * msgsize + msgs * sizeof(struct msg_msg *)

That was never really correct because each message we allocate via
load_msg() is actually a struct msg_msg followed by the data for the
message (and if struct msg_msg + data exceeds PAGE_SIZE we end up
allocating struct msg_msgseg structs too, but accounting for them would
get really tedious, so let's ignore those...they're only a pointer in size
anyway).  This patch updates the calculation to be more accurate in
regards to maximum possible memory consumption by the mqueue.
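
As a rough illustration of an overflow-checked size calculation along these
lines, here is a userspace sketch. It is not the expression used in the
patch: toy_queue_bytes() is a hypothetical helper, and TOY_MSG_HDR is an
assumed constant standing in for sizeof(struct msg_msg). It charges each
message for its header plus its data and rejects any combination that would
wrap an unsigned long.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stddef.h>

/* Assumed per-message header size; stands in for sizeof(struct msg_msg). */
#define TOY_MSG_HDR 48UL

/*
 * Compute maxmsg * (msgsize + header) without letting the arithmetic wrap.
 * Returns 0 and stores the byte count in *out, or a negative errno value.
 */
static int toy_queue_bytes(unsigned long maxmsg, unsigned long msgsize,
			   unsigned long *out)
{
	unsigned long per_msg;

	assert(out != NULL);
	if (maxmsg == 0 || msgsize == 0)
		return -EINVAL;
	if (msgsize > ULONG_MAX - TOY_MSG_HDR)
		return -EOVERFLOW;	/* the addition would wrap */
	per_msg = msgsize + TOY_MSG_HDR;
	if (maxmsg > ULONG_MAX / per_msg)
		return -EOVERFLOW;	/* the multiplication would wrap */
	*out = maxmsg * per_msg;
	return 0;
}
```

Checking before each operation, rather than inspecting the result afterwards,
keeps every intermediate value within range.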

[akpm@linux-foundation.org: add a local to simplify overflow-checking expression]
Signed-off-by: Doug Ledford <dledford@redhat.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Manfred Spraul <manfred@colorfullife.com>
Acked-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-05-31 17:49:31 -07:00
Doug Ledford
d6629859b3 ipc/mqueue: improve performance of send/recv
The existing implementation of the POSIX message queue send and recv
functions is, well, abysmal.  Even worse than abysmal.  I submitted a
patch to increase the maximum POSIX message queue limit to 65536 due to
customer needs, however, upon looking over the send/recv implementation, I
realized that my customer needs help with that too even if they don't know
it.  The basic problem is that, given the fairly typical use case scenario
for a large queue of queueing lots of messages all at the same priority (I
verified with my customer that this is indeed what their app does), the
msg_insert routine is basically a frikkin' bubble sort.  I mean, whoa,
that's *so* middle school.

OK, OK, to not slam the original author too much, I'm sure they didn't
envision a queue depth of 50,000+ messages.  No one would think that
moving elements in an array, one at a time, and dereferencing each pointer
in that array to check priority of the message being pointed to, again
one at a time, for 50,000+ times would be good.  So let's assume that, as
is typical, the users have found a way to break our code simply by using
it in a way we didn't envision.  Fair enough.

"So, just how broken is it?", you ask.  I wondered the same thing, so I
wrote an app to let me know.  It's my next patch.  It gave me some
interesting results.  Here's what it tested:

Interference with other apps - In continuous mode, the app just sits there
and hits a message queue forever, while you go do something productive on
another terminal using other CPUs.  You then measure how long it takes you
to do that something productive.  Then you restart the app in fake
continuous mode, and it sits in a tight loop on a CPU while you repeat
your tests.  The whole point of this is to keep one CPU tied up (so it
can't be used in your other work) but in one case tied up hitting the
mqueue code so we can see the effect of walking that 65,528 element array
one pointer at a time on the global CPU cache.  If it's bad, then it will
slow down your app on the other CPUs just by polluting cache mercilessly.
In the fake case, it will be in a tight loop, but not polluting cache.

Testing the mqueue subsystem directly - Here we just run a number of tests
to see how the mqueue subsystem performs under different conditions.  A
couple conditions are known to be worst case for the old system, and some
routines, so this tests all of them.

So, on to the results already:

Subsystem/Test                  Old                         New

Time to compile linux
kernel (make -j12 on a
6 core CPU)
  Running mqueue test     user 49m10.744s             user 45m26.294s
                           sys  5m51.924s              sys  4m59.894s
                         total 55m02.668s            total 50m26.188s

  Running fake test       user 45m32.686s             user 45m18.552s
                           sys  5m12.465s              sys  4m56.468s
                         total 50m45.151s            total 50m15.020s

  % slowdown from mqueue
    cache thrashing            ~8%                         ~.5%

Avg time to send/recv (in nanoseconds per message)
  when queue empty            305/288                    349/318
  when queue full (65528 messages)
    constant priority      526589/823                    362/314
    increasing priority    403105/916                    495/445
    decreasing priority     73420/594                    482/409
    random priority        280147/920                    546/436

Time to fill/drain queue (65528 messages, in seconds)
  constant priority         17.37/.12                    .13/.12
  increasing priority        4.14/.14                    .21/.18
  decreasing priority       12.93/.13                    .21/.18
  random priority            8.88/.16                    .22/.17

So, I think the results speak for themselves.  It's possible this
implementation could be improved by caching at least one priority level
in the node tree (that would bring the queue empty performance more in
line with the old implementation), but this works and is *so* much better
than what we had, especially for the common case of a single priority in
use, that further refinements can be in follow on patches.

[akpm@linux-foundation.org: fix typo in comment, remove stray semicolon]
[levinsasha928@gmail.com: use correct gfp flags in msg_insert]
Signed-off-by: Doug Ledford <dledford@redhat.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Manfred Spraul <manfred@colorfullife.com>
Acked-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-05-31 17:49:31 -07:00
KOSAKI Motohiro
cef0184c11 mqueue: separate mqueue default value from maximum value
Commit b231cca438 ("message queues: increase range limits") changed
mqueue default value when attr parameter is specified NULL from hard
coded value to fs.mqueue.{msg,msgsize}_max sysctl value.

This had a large side effect.  Suppose a user needs to run two mqueue
applications: 1) one that passes a !NULL attr parameter and requires a big
message size, and 2) one that passes a NULL attr parameter and only needs
small messages.  App (1) requires raising fs.mqueue.msgsize_max, and app
(2) then consumes a large amount of memory even though it doesn't need it.

Doug Ledford proposed switching back to a static hard coded value.
However, that also has a compatibility problem: some applications might
have started to depend on the default value being tunable.

The solution is to separate default value from maximum value.
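
A minimal userspace sketch of that separation follows. The names toy_limits,
toy_attr and toy_pick_attr() are hypothetical, and clamping the defaults to
the maximums is an assumption of this sketch rather than something stated in
this message. A NULL request gets the defaults; an explicit request is only
checked against the maximums.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Separate knobs: maximums bound explicit requests, defaults serve NULL. */
struct toy_limits {
	long msg_max;
	long msgsize_max;
	long msg_default;
	long msgsize_default;
};

struct toy_attr {
	long maxmsg;
	long msgsize;
};

static long toy_min(long a, long b)
{
	return a < b ? a : b;
}

/* Choose queue attributes; req == NULL models mq_open() without an attr. */
static int toy_pick_attr(const struct toy_limits *lim,
			 const struct toy_attr *req, struct toy_attr *out)
{
	assert(lim != NULL && out != NULL);
	if (!req) {
		/* Defaults, never exceeding the maximums (assumed policy). */
		out->maxmsg = toy_min(lim->msg_default, lim->msg_max);
		out->msgsize = toy_min(lim->msgsize_default, lim->msgsize_max);
		return 0;
	}
	if (req->maxmsg <= 0 || req->msgsize <= 0)
		return -EINVAL;
	if (req->maxmsg > lim->msg_max || req->msgsize > lim->msgsize_max)
		return -EINVAL;
	*out = *req;
	return 0;
}
```

With this split, raising msgsize_max for one application no longer changes
what a NULL-attr application gets.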

Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Acked-by: Doug Ledford <dledford@redhat.com>
Acked-by: Joe Korty <joe.korty@ccur.com>
Cc: Amerigo Wang <amwang@redhat.com>
Acked-by: Serge E. Hallyn <serue@us.ibm.com>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Manfred Spraul <manfred@colorfullife.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-05-31 17:49:31 -07:00
KOSAKI Motohiro
fd1f87d24d mqueue: don't use kmalloc with KMALLOC_MAX_SIZE
KMALLOC_MAX_SIZE is not a good threshold.  It is extremely high and
problematic.  Unfortunately, some silly drivers depend on this and we
can't change it.  But new code need not use such extremely ugly high
order allocations.  They bring awful fragmentation issues and system
slowdown.

Signed-off-by: KOSAKI Motohiro <mkosaki@jp.fujitsu.com>
Acked-by: Doug Ledford <dledford@redhat.com>
Acked-by: Joe Korty <joe.korty@ccur.com>
Cc: Amerigo Wang <amwang@redhat.com>
Cc: Serge E. Hallyn <serue@us.ibm.com>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Joe Korty <joe.korty@ccur.com>
Cc: Manfred Spraul <manfred@colorfullife.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-05-31 17:49:31 -07:00
Doug Ledford
5b5c4d1a14 ipc/mqueue: update maximums for the mqueue subsystem
Commit b231cca438 ("message queues: increase range limits") changed the
maximum size of a message in a message queue from INT_MAX to 8192*128.
Unfortunately, we had customers that relied on a size much larger than
8192*128 on their production systems.  After reviewing POSIX, we found
that it is silent on the maximum message size.  We did find a couple other
areas in which it was not silent.  Fix up the mqueue maximums so that the
customer's system can continue to work, and document both the POSIX and
real world requirements in ipc_namespace.h so that we don't have this
issue crop back up.

Also, commit 9cf18e1dd7 ("ipc: HARD_MSGMAX should be higher not lower
on 64bit") fiddled with HARD_MSGMAX without realizing that the number was
intentionally in place to limit the msg queue depth to one that was small
enough to kmalloc an array of pointers (hence why we divided 128k by
sizeof(long)).  If we wish to meet POSIX requirements, we have no choice
but to change our allocation to a vmalloc instead (at least for the large
queue size case).  With that, it's possible to increase our allowed
maximum to the POSIX requirements (or more if we choose).

[sfr@canb.auug.org.au: using vmalloc requires including vmalloc.h]
Signed-off-by: Doug Ledford <dledford@redhat.com>
Cc: Serge E. Hallyn <serue@us.ibm.com>
Cc: Amerigo Wang <amwang@redhat.com>
Cc: Joe Korty <joe.korty@ccur.com>
Cc: Jiri Slaby <jslaby@suse.cz>
Acked-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Manfred Spraul <manfred@colorfullife.com>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-05-31 17:49:30 -07:00
Doug Ledford
02967ea08e ipc/mqueue: enforce hard limits
In two places we don't enforce the hard limits for CAP_SYS_RESOURCE apps.
In preparation for making more reasonable hard limits, start enforcing
them even on CAP_SYS_RESOURCE.

Signed-off-by: Doug Ledford <dledford@redhat.com>
Cc: Serge E. Hallyn <serue@us.ibm.com>
Cc: Amerigo Wang <amwang@redhat.com>
Cc: Joe Korty <joe.korty@ccur.com>
Cc: Jiri Slaby <jslaby@suse.cz>
Acked-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Manfred Spraul <manfred@colorfullife.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-05-31 17:49:30 -07:00
Doug Ledford
858ee3784e ipc/mqueue: switch back to using non-max values on create
Commit b231cca438 ("message queues: increase range limits") changed
how we create a queue that does not include an attr struct passed to
open so that it creates the queue with whatever the maximum values are.
However, if the admin has set the maximums to allow flexibility in
creating a queue (aka, both a large size and large queue are allowed,
but combined they create a queue too large for the RLIMIT_MSGQUEUE of
the user), then attempts to create a queue without an attr struct will
fail.  Switch back to using acceptable defaults regardless of what the
maximums are.

Note: so far, we only know of a few applications that rely on this
behavior.  Specifically, they set the maximums in /proc, then run the
application, which calls mq_open() without passing in an attr struct and
expects the newly created message queue to have the maximum sizes that
were set in /proc.  All of those applications that we know of are
actually part of regression test suites that were coded to do something
like this:

for size in 4096 65536 $((1024 * 1024)) $((16 * 1024 * 1024)); do
	echo $size > /proc/sys/fs/mqueue/msgsize_max
	mq_open || echo "Error opening mq with size $size"
done

These test suites that depend on any behavior like this are broken.  The
concept that programs should rely upon the system wide maximum in order
to get their desired results instead of simply using an attr struct to
specify what they want is fundamentally unfriendly programming practice
for any multi-tasking OS.

Fixing this will break those few apps that we know of (and those app
authors recognize the brokenness of their code and the need to fix it).
However, the following patch "mqueue: separate mqueue default value"
allows a workaround in the form of new knobs for the default msg queue
creation parameters for any software out there that we don't already
know about that might rely on this behavior at the moment.

Signed-off-by: Doug Ledford <dledford@redhat.com>
Cc: Serge E. Hallyn <serue@us.ibm.com>
Cc: Amerigo Wang <amwang@redhat.com>
Cc: Joe Korty <joe.korty@ccur.com>
Cc: Jiri Slaby <jslaby@suse.cz>
Acked-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Manfred Spraul <manfred@colorfullife.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-05-31 17:49:30 -07:00
Linus Torvalds
90324cc1b1 Merge tag 'writeback' of git://git.kernel.org/pub/scm/linux/kernel/git/wfg/linux
Pull writeback tree from Wu Fengguang:
 "Mainly from Jan Kara to avoid iput() in the flusher threads."

* tag 'writeback' of git://git.kernel.org/pub/scm/linux/kernel/git/wfg/linux:
  writeback: Avoid iput() from flusher thread
  vfs: Rename end_writeback() to clear_inode()
  vfs: Move waiting for inode writeback from end_writeback() to evict_inode()
  writeback: Refactor writeback_single_inode()
  writeback: Remove wb->list_lock from writeback_single_inode()
  writeback: Separate inode requeueing after writeback
  writeback: Move I_DIRTY_PAGES handling
  writeback: Move requeueing when I_SYNC set to writeback_sb_inodes()
  writeback: Move clearing of I_SYNC into inode_sync_complete()
  writeback: initialize global_dirty_limit
  fs: remove 8 bytes of padding from struct writeback_control on 64 bit builds
  mm: page-writeback.c: local functions should not be exposed globally
2012-05-28 09:54:45 -07:00
Jan Kara
dbd5768f87 vfs: Rename end_writeback() to clear_inode()
After we moved inode_sync_wait() out of end_writeback(), it no longer makes
sense to call the function end_writeback(). Rename it to clear_inode(),
which better describes what the function really does: set the I_CLEAR flag.

Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
2012-05-06 13:43:41 +08:00
Eric W. Biederman
76b6db0102 userns: Replace user_ns_map_uid and user_ns_map_gid with from_kuid and from_kgid
These functions are no longer needed; replace them with their more useful equivalents.

Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
2012-05-03 03:28:39 -07:00
Eric W. Biederman
6f9ac6d93a mqueue: Explicitly capture the user namespace to send the notification to.
Stop relying on user->user_ns which is going away and instead capture
the user_namespace of the process we are supposed to notify.

Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
2012-04-07 16:55:53 -07:00
Al Viro
48fde701af switch open-coded instances of d_make_root() to new helper
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-03-20 21:29:35 -04:00
Davidlohr Bueso
2a4e64b8f6 ipc/mqueue: simplify reading msgqueue limit
Because the current task is being used to get the limit, we can simply
use rlimit() instead of task_rlimit().

Signed-off-by: Davidlohr Bueso <dave@gnu.org>
Acked-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-01-23 08:38:47 -08:00
Serge E. Hallyn
6b550f9495 user namespace: make signal.c respect user namespaces
ipc/mqueue.c: for __SI_MESQ, convert the uid being sent to recipient's
user namespace. (new, thanks Oleg)

__send_signal: convert current's uid to the recipient's user namespace
for any siginfo which is not SI_FROMKERNEL (patch from Oleg, thanks
again :)

do_notify_parent and do_notify_parent_cldstop: map task's uid to parent's
user namespace

ptrace_signal maps parent's uid into current's user namespace before
including in signal to current.  IIUC Oleg has argued that this shouldn't
matter as the debugger will play with it, but it seems like not converting
the value currently being set is misleading.

Changelog:
Sep 20: Inspired by Oleg's suggestion, define map_cred_ns() helper to
	simplify callers and help make clear what we are translating
	(which uid into which namespace).  Passing the target task would
	make callers even easier to read, but we pass in user_ns because
	current_user_ns() != task_cred_xxx(current, user_ns).
Sep 20: As recommended by Oleg, also put task_pid_vnr() under rcu_read_lock
	in ptrace_signal().
Sep 23: In send_signal(), detect when (user) signal is coming from an
	ancestor or unrelated user namespace.  Pass that on to __send_signal,
	which sets si_uid to 0 or overflowuid if needed.
Oct 12: Base on Oleg's fixup_uid() patch.  On top of that, handle all
	SI_FROMKERNEL cases at callers, because we can't assume sender is
	current in those cases.
Nov 10: (mhelsley) rename fixup_uid to more meaningful userns_fixup_signal_uid
Nov 10: (akpm) make the !CONFIG_USER_NS case clearer

Signed-off-by: Serge Hallyn <serge.hallyn@canonical.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Matt Helsley <matthltc@us.ibm.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
From: Serge Hallyn <serge.hallyn@canonical.com>
Subject: __send_signal: pass q->info, not info, to userns_fixup_signal_uid (v2)

Eric Biederman pointed out that passing info is a bug and could lead to a
NULL pointer deref to boot.

A collection of signal, securebits, filecaps, cap_bounds, and a few other
LTP tests passed with this kernel.

Changelog:
    Nov 18: previous patch missed a leading '&'

Signed-off-by: Serge Hallyn <serge.hallyn@canonical.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
From: Dan Carpenter <dan.carpenter@oracle.com>
Subject: ipc/mqueue: lock() => unlock() typo

There was a double-lock typo introduced in b085f4bd6b21 ("user namespace:
make signal.c respect user namespaces").

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Matt Helsley <matthltc@us.ibm.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Acked-by: Serge Hallyn <serge@hallyn.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-01-10 16:30:54 -08:00