412 Commits

Author SHA1 Message Date
Nathan Chancellor
52bdbb481c Merge 4.4.181 into android-msm-wahoo-4.4
Changes in 4.4.181: (244 commits)
        x86/speculation/mds: Revert CPU buffer clear on double fault exit
        x86/speculation/mds: Improve CPU buffer clear documentation
        ARM: exynos: Fix a leaked reference by adding missing of_node_put
        crypto: vmx - fix copy-paste error in CTR mode
        crypto: crct10dif-generic - fix use via crypto_shash_digest()
        crypto: x86/crct10dif-pcl - fix use via crypto_shash_digest()
        ALSA: usb-audio: Fix a memory leak bug
        ALSA: hda/hdmi - Consider eld_valid when reporting jack event
        ALSA: hda/realtek - EAPD turn on later
        ASoC: max98090: Fix restore of DAPM Muxes
        ASoC: RT5677-SPI: Disable 16Bit SPI Transfers
        mm/mincore.c: make mincore() more conservative
        ocfs2: fix ocfs2 read inode data panic in ocfs2_iget
        mfd: da9063: Fix OTP control register names to match datasheets for DA9063/63L
        tty/vt: fix write/write race in ioctl(KDSKBSENT) handler
        ext4: actually request zeroing of inode table after grow
        ext4: fix ext4_show_options for file systems w/o journal
        Btrfs: do not start a transaction at iterate_extent_inodes()
        bcache: fix a race between cache register and cacheset unregister
        bcache: never set KEY_PTRS of journal key to 0 in journal_reclaim()
        ipmi:ssif: compare block number correctly for multi-part return messages
        crypto: gcm - Fix error return code in crypto_gcm_create_common()
        crypto: gcm - fix incompatibility between "gcm" and "gcm_base"
        crypto: chacha20poly1305 - set cra_name correctly
        crypto: salsa20 - don't access already-freed walk.iv
        crypto: arm/aes-neonbs - don't access already-freed walk.iv
        writeback: synchronize sync(2) against cgroup writeback membership switches
        fs/writeback.c: use rcu_barrier() to wait for inflight wb switches going into workqueue when umount
        ext4: zero out the unused memory region in the extent tree block
        ALSA: hda/realtek - Fix for Lenovo B50-70 inverted internal microphone bug
        KVM: x86: Skip EFER vs. guest CPUID checks for host-initiated writes
        net: avoid weird emergency message
        net/mlx4_core: Change the error print to info print
        ppp: deflate: Fix possible crash in deflate_init
        tipc: switch order of device registration to fix a crash
        tipc: fix modprobe tipc failed after switch order of device registration
        stm class: Fix channel free in stm output free path
        md: add mddev->pers to avoid potential NULL pointer dereference
        intel_th: msu: Fix single mode with IOMMU
        of: fix clang -Wunsequenced for be32_to_cpu()
        cifs: fix strcat buffer overflow and reduce raciness in smb21_set_oplock_level()
        media: ov6650: Fix sensor possibly not detected on probe
        NFS4: Fix v4.0 client state corruption when mount
        clk: tegra: Fix PLLM programming on Tegra124+ when PMC overrides divider
        fuse: fix writepages on 32bit
        fuse: honor RLIMIT_FSIZE in fuse_file_fallocate
        iommu/tegra-smmu: Fix invalid ASID bits on Tegra30/114
        ceph: flush dirty inodes before proceeding with remount
        tracing: Fix partial reading of trace event's id file
        memory: tegra: Fix integer overflow on tick value calculation
        perf intel-pt: Fix instructions sampling rate
        perf intel-pt: Fix improved sample timestamp
        perf intel-pt: Fix sample timestamp wrt non-taken branches
        fbdev: sm712fb: fix brightness control on reboot, don't set SR30
        fbdev: sm712fb: fix VRAM detection, don't set SR70/71/74/75
        fbdev: sm712fb: fix white screen of death on reboot, don't set CR3B-CR3F
        fbdev: sm712fb: fix boot screen glitch when sm712fb replaces VGA
        fbdev: sm712fb: fix crashes during framebuffer writes by correctly mapping VRAM
        fbdev: sm712fb: fix support for 1024x768-16 mode
        fbdev: sm712fb: use 1024x768 by default on non-MIPS, fix garbled display
        fbdev: sm712fb: fix crashes and garbled display during DPMS modesetting
        PCI: Mark Atheros AR9462 to avoid bus reset
        dm delay: fix a crash when invalid device is specified
        xfrm: policy: Fix out-of-bound array accesses in __xfrm_policy_unlink
        xfrm6_tunnel: Fix potential panic when unloading xfrm6_tunnel module
        vti4: ipip tunnel deregistration fixes.
        xfrm4: Fix uninitialized memory read in _decode_session4
        KVM: arm/arm64: Ensure vcpu target is unset on reset failure
        power: supply: sysfs: prevent endless uevent loop with CONFIG_POWER_SUPPLY_DEBUG
        ufs: fix braino in ufs_get_inode_gid() for solaris UFS flavour
        perf bench numa: Add define for RUSAGE_THREAD if not present
        Revert "Don't jump to compute_result state from check_result state"
        md/raid: raid5 preserve the writeback action after the parity check
        btrfs: Honour FITRIM range constraints during free space trim
        fbdev: sm712fb: fix memory frequency by avoiding a switch/case fallthrough
        ext4: do not delete unlinked inode from orphan list on failed truncate
        KVM: x86: fix return value for reserved EFER
        bio: fix improper use of smp_mb__before_atomic()
        Revert "scsi: sd: Keep disk read-only when re-reading partition"
        crypto: vmx - CTR: always increment IV as quadword
        gfs2: Fix sign extension bug in gfs2_update_stats
        Btrfs: fix race between ranged fsync and writeback of adjacent ranges
        btrfs: sysfs: don't leak memory when failing add fsid
        fbdev: fix divide error in fb_var_to_videomode
        hugetlb: use same fault hash key for shared and private mappings
        fbdev: fix WARNING in __alloc_pages_nodemask bug
        media: cpia2: Fix use-after-free in cpia2_exit
        media: vivid: use vfree() instead of kfree() for dev->bitmap_cap
        ssb: Fix possible NULL pointer dereference in ssb_host_pcmcia_exit
        at76c50x-usb: Don't register led_trigger if usb_register_driver failed
        perf tools: No need to include bitops.h in util.h
        tools include: Adopt linux/bits.h
        gfs2: Fix lru_count going negative
        cxgb4: Fix error path in cxgb4_init_module
        mmc: core: Verify SD bus width
        powerpc/boot: Fix missing check of lseek() return value
        ASoC: imx: fix fiq dependencies
        spi: pxa2xx: fix SCR (divisor) calculation
        brcm80211: potential NULL dereference in brcmf_cfg80211_vndr_cmds_dcmd_handler()
        rtc: 88pm860x: prevent use-after-free on device remove
        w1: fix the resume command API
        dmaengine: pl330: _stop: clear interrupt status
        mac80211/cfg80211: update bss channel on channel switch
        ASoC: fsl_sai: Update is_slave_mode with correct value
        mwifiex: prevent an array overflow
        net: cw1200: fix a NULL pointer dereference
        bcache: return error immediately in bch_journal_replay()
        bcache: fix failure in journal relplay
        bcache: add failure check to run_cache_set() for journal replay
        bcache: avoid clang -Wunintialized warning
        x86/build: Move _etext to actual end of .text
        smpboot: Place the __percpu annotation correctly
        x86/mm: Remove in_nmi() warning from 64-bit implementation of vmalloc_fault()
        mm/uaccess: Use 'unsigned long' to placate UBSAN warnings on older GCC versions
        HID: logitech-hidpp: use RAP instead of FAP to get the protocol version
        pinctrl: pistachio: fix leaked of_node references
        dmaengine: at_xdmac: remove BUG_ON macro in tasklet
        media: coda: clear error return value before picture run
        media: ov6650: Move v4l2_clk_get() to ov6650_video_probe() helper
        media: au0828: stop video streaming only when last user stops
        media: ov2659: make S_FMT succeed even if requested format doesn't match
        audit: fix a memory leak bug
        media: au0828: Fix NULL pointer dereference in au0828_analog_stream_enable()
        media: pvrusb2: Prevent a buffer overflow
        powerpc/numa: improve control of topology updates
        sched/core: Check quota and period overflow at usec to nsec conversion
        sched/core: Handle overflow in cpu_shares_write_u64
        USB: core: Don't unbind interfaces following device reset failure
        x86/irq/64: Limit IST stack overflow check to #DB stack
        i40e: don't allow changes to HW VLAN stripping on active port VLANs
        RDMA/cxgb4: Fix null pointer dereference on alloc_skb failure
        hwmon: (vt1211) Use request_muxed_region for Super-IO accesses
        hwmon: (smsc47m1) Use request_muxed_region for Super-IO accesses
        hwmon: (smsc47b397) Use request_muxed_region for Super-IO accesses
        hwmon: (pc87427) Use request_muxed_region for Super-IO accesses
        hwmon: (f71805f) Use request_muxed_region for Super-IO accesses
        scsi: libsas: Do discovery on empty PHY to update PHY info
        mmc_spi: add a status check for spi_sync_locked
        mmc: sdhci-of-esdhc: add erratum eSDHC5 support
        mmc: sdhci-of-esdhc: add erratum eSDHC-A001 and A-008358 support
        PM / core: Propagate dev->power.wakeup_path when no callbacks
        extcon: arizona: Disable mic detect if running when driver is removed
        s390: cio: fix cio_irb declaration
        cpufreq: ppc_cbe: fix possible object reference leak
        cpufreq/pasemi: fix possible object reference leak
        cpufreq: pmac32: fix possible object reference leak
        x86/build: Keep local relocations with ld.lld
        iio: ad_sigma_delta: Properly handle SPI bus locking vs CS assertion
        iio: hmc5843: fix potential NULL pointer dereferences
        iio: common: ssp_sensors: Initialize calculated_time in ssp_common_process_data
        rtlwifi: fix a potential NULL pointer dereference
        brcmfmac: fix missing checks for kmemdup
        b43: shut up clang -Wuninitialized variable warning
        brcmfmac: convert dev_init_lock mutex to completion
        brcmfmac: fix race during disconnect when USB completion is in progress
        scsi: ufs: Fix regulator load and icc-level configuration
        scsi: ufs: Avoid configuring regulator with undefined voltage range
        arm64: cpu_ops: fix a leaked reference by adding missing of_node_put
        x86/ia32: Fix ia32_restore_sigcontext() AC leak
        chardev: add additional check for minor range overlap
        HID: core: move Usage Page concatenation to Main item
        ASoC: eukrea-tlv320: fix a leaked reference by adding missing of_node_put
        ASoC: fsl_utils: fix a leaked reference by adding missing of_node_put
        cxgb3/l2t: Fix undefined behaviour
        spi: tegra114: reset controller on probe
        media: wl128x: prevent two potential buffer overflows
        virtio_console: initialize vtermno value for ports
        tty: ipwireless: fix missing checks for ioremap
        rcutorture: Fix cleanup path for invalid torture_type strings
        usb: core: Add PM runtime calls to usb_hcd_platform_shutdown
        scsi: qla4xxx: avoid freeing unallocated dma memory
        media: m88ds3103: serialize reset messages in m88ds3103_set_frontend
        media: go7007: avoid clang frame overflow warning with KASAN
        media: saa7146: avoid high stack usage with clang
        scsi: lpfc: Fix SLI3 commands being issued on SLI4 devices
        spi : spi-topcliff-pch: Fix to handle empty DMA buffers
        spi: rspi: Fix sequencer reset during initialization
        spi: Fix zero length xfer bug
        ASoC: davinci-mcasp: Fix clang warning without CONFIG_PM
        ipv6: Consider sk_bound_dev_if when binding a raw socket to an address
        llc: fix skb leak in llc_build_and_send_ui_pkt()
        net-gro: fix use-after-free read in napi_gro_frags()
        net: stmmac: fix reset gpio free missing
        usbnet: fix kernel crash after disconnect
        tipc: Avoid copying bytes beyond the supplied data
        bnxt_en: Fix aggregation buffer leak under OOM condition.
        net: mvpp2: fix bad MVPP2_TXQ_SCHED_TOKEN_CNTR_REG queue value
        crypto: vmx - ghash: do nosimd fallback manually
        xen/pciback: Don't disable PCI_COMMAND on PCI device reset.
        Revert "tipc: fix modprobe tipc failed after switch order of device registration"
        tipc: fix modprobe tipc failed after switch order of device registration -v2
        sparc64: Fix regression in non-hypervisor TLB flush xcall
        include/linux/bitops.h: sanitize rotate primitives
        xhci: Convert xhci_handshake() to use readl_poll_timeout_atomic()
        usb: xhci: avoid null pointer deref when bos field is NULL
        USB: Fix slab-out-of-bounds write in usb_get_bos_descriptor
        USB: sisusbvga: fix oops in error path of sisusb_probe
        USB: Add LPM quirk for Surface Dock GigE adapter
        USB: rio500: refuse more than one device at a time
        USB: rio500: fix memory leak in close after disconnect
        media: usb: siano: Fix general protection fault in smsusb
        media: usb: siano: Fix false-positive "uninitialized variable" warning
        media: smsusb: better handle optional alignment
        scsi: zfcp: fix missing zfcp_port reference put on -EBUSY from port_remove
        scsi: zfcp: fix to prevent port_remove with pure auto scan LUNs (only sdevs)
        Btrfs: fix race updating log root item during fsync
        ALSA: hda/realtek - Set default power save node to 0
        drm/nouveau/i2c: Disable i2c bus access after ->fini()
        tty: serial: msm_serial: Fix XON/XOFF
        tty: max310x: Fix external crystal register setup
        memcg: make it work on sparse non-0-node systems
        kernel/signal.c: trace_signal_deliver when signal_group_exit
        CIFS: cifs_read_allocate_pages: don't iterate through whole page array on ENOMEM
        binder: Replace "%p" with "%pK" for stable
        binder: replace "%p" with "%pK"
        net: create skb_gso_validate_mac_len()
        bnx2x: disable GSO where gso_size is too big for hardware
        brcmfmac: Add length checks on firmware events
        brcmfmac: screening firmware event packet
        brcmfmac: revise handling events in receive path
        brcmfmac: fix incorrect event channel deduction
        brcmfmac: add length checks in scheduled scan result handler
        brcmfmac: add subtype check for event handling in data path
        userfaultfd: don't pin the user memory in userfaultfd_file_create()
        Revert "x86/build: Move _etext to actual end of .text"
        net: cdc_ncm: GetNtbFormat endian fix
        usb: gadget: fix request length error for isoc transfer
        media: uvcvideo: Fix uvc_alloc_entity() allocation alignment
        ethtool: fix potential userspace buffer overflow
        neighbor: Call __ipv4_neigh_lookup_noref in neigh_xmit
        net/mlx4_en: ethtool, Remove unsupported SFP EEPROM high pages query
        net: rds: fix memory leak in rds_ib_flush_mr_pool
        pktgen: do not sleep with the thread lock held.
        rcu: locking and unlocking need to always be at least barriers
        parisc: Use implicit space register selection for loading the coherence index of I/O pdirs
        fuse: fallocate: fix return with locked inode
        MIPS: pistachio: Build uImage.gz by default
        genwqe: Prevent an integer overflow in the ioctl
        drm/gma500/cdv: Check vbt config bits when detecting lvds panels
        fs: stream_open - opener for stream-like files so that read and write can run simultaneously without deadlock
        fuse: Add FOPEN_STREAM to use stream_open()
        ipv4: Define __ipv4_neigh_lookup_noref when CONFIG_INET is disabled
        ethtool: check the return value of get_regs_len
        Linux 4.4.181

Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>

Conflicts:
	drivers/android/binder.c
2019-06-11 08:37:51 -07:00
Paul E. McKenney
1909121a61 rcutorture: Fix cleanup path for invalid torture_type strings
[ Upstream commit b813afae7ab6a5e91b4e16cc567331d9c2ae1f04 ]

If the specified rcutorture.torture_type is not in the rcu_torture_init()
function's torture_ops[] array, rcutorture prints some console messages
and then invokes rcu_torture_cleanup() to set state so that a future
torture test can run.  However, rcu_torture_cleanup() also attempts to
end the test that didn't actually start, and in doing so relies on the
value of cur_ops, a value that is not particularly relevant in this case.
This can result in confusing output or even follow-on failures due to
attempts to use facilities that have not been properly initialized.

This commit therefore sets the value of cur_ops to NULL in this case
and inserts a check near the beginning of rcu_torture_cleanup(),
thus avoiding relying on an irrelevant cur_ops value.
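
The fix described above can be sketched as a small user-space model. This is
an illustration under stated assumptions, not the kernel code: struct
model_ops, tests_ended, torture_init() and torture_cleanup() are hypothetical
stand-ins for the rcutorture structures and functions named in the message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the real ops structure; only the name is modeled. */
struct model_ops {
	const char *name;
};

static struct model_ops rcu_ops = { "rcu" };
static struct model_ops srcu_ops = { "srcu" };
static struct model_ops *torture_ops[] = { &rcu_ops, &srcu_ops };

static struct model_ops *cur_ops;
static int tests_ended;	/* how many times cleanup ended a test that really ran */

static void torture_cleanup(void)
{
	/* The inserted check: no valid ops means no test was ever started. */
	if (!cur_ops)
		return;

	tests_ended++;		/* models ending a test that did start */
	cur_ops = NULL;
}

static int torture_init(const char *torture_type)
{
	size_t n = sizeof(torture_ops) / sizeof(torture_ops[0]);
	size_t i;

	assert(torture_type != NULL);

	for (i = 0; i < n; i++) {
		cur_ops = torture_ops[i];
		if (strcmp(torture_type, cur_ops->name) == 0)
			break;
	}
	if (i == n) {
		/* Here cur_ops still points at the last table entry, which is irrelevant. */
		fprintf(stderr, "invalid torture type: \"%s\"\n", torture_type);
		cur_ops = NULL;	/* the fix: do not let cleanup rely on it */
		torture_cleanup();
		return -1;
	}
	return 0;
}
```

In this model, torture_init() with an unknown type returns -1 and leaves
tests_ended untouched, because torture_cleanup() returns before it tries to
end a test that never started.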

Reported-by: kernel test robot <rong.a.chen@intel.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-06-11 12:24:04 +02:00
Nathan Chancellor
12a1173759 Merge 4.4.177 into android-msm-wahoo-4.4
Changes in 4.4.177: (231 commits)
        ceph: avoid repeatedly adding inode to mdsc->snap_flush_list
        numa: change get_mempolicy() to use nr_node_ids instead of MAX_NUMNODES
        KEYS: allow reaching the keys quotas exactly
        mfd: ti_am335x_tscadc: Use PLATFORM_DEVID_AUTO while registering mfd cells
        mfd: twl-core: Fix section annotations on {,un}protect_pm_master
        mfd: db8500-prcmu: Fix some section annotations
        mfd: ab8500-core: Return zero in get_register_interruptible()
        mfd: qcom_rpm: write fw_version to CTRL_REG
        mfd: wm5110: Add missing ASRC rate register
        mfd: mc13xxx: Fix a missing check of a register-read failure
        net: hns: Fix use after free identified by SLUB debug
        MIPS: ath79: Enable OF serial ports in the default config
        scsi: qla4xxx: check return code of qla4xxx_copy_from_fwddb_param
        scsi: isci: initialize shost fully before calling scsi_add_host()
        MIPS: jazz: fix 64bit build
        isdn: i4l: isdn_tty: Fix some concurrency double-free bugs
        atm: he: fix sign-extension overflow on large shift
        leds: lp5523: fix a missing check of return value of lp55xx_read
        isdn: avm: Fix string plus integer warning from Clang
        RDMA/srp: Rework SCSI device reset handling
        KEYS: user: Align the payload buffer
        KEYS: always initialize keyring_index_key::desc_len
        batman-adv: fix uninit-value in batadv_interface_tx()
        net/packet: fix 4gb buffer limit due to overflow check
        team: avoid complex list operations in team_nl_cmd_options_set()
        sit: check if IPv6 enabled before calling ip6_err_gen_icmpv6_unreach()
        net/mlx4_en: Force CHECKSUM_NONE for short ethernet frames
        ARCv2: Enable unaligned access in early ASM code
        Revert "bridge: do not add port to router list when receives query with source 0.0.0.0"
        libceph: handle an empty authorize reply
        scsi: libsas: Fix rphy phy_identifier for PHYs with end devices attached
        drm/msm: Unblock writer if reader closes file
        ASoC: Intel: Haswell/Broadwell: fix setting for .dynamic field
        ALSA: compress: prevent potential divide by zero bugs
        thermal: int340x_thermal: Fix a NULL vs IS_ERR() check
        usb: dwc3: gadget: Fix the uninitialized link_state when udc starts
        usb: gadget: Potential NULL dereference on allocation error
        ASoC: dapm: change snprintf to scnprintf for possible overflow
        ASoC: imx-audmux: change snprintf to scnprintf for possible overflow
        ARC: fix __ffs return value to avoid build warnings
        mac80211: fix miscounting of ttl-dropped frames
        serial: fsl_lpuart: fix maximum acceptable baud rate with over-sampling
        scsi: csiostor: fix NULL pointer dereference in csio_vport_set_state()
        net: altera_tse: fix connect_local_phy error path
        ibmveth: Do not process frames after calling napi_reschedule
        mac80211: don't initiate TDLS connection if station is not associated to AP
        cfg80211: extend range deviation for DMG
        KVM: nSVM: clear events pending from svm_complete_interrupts() when exiting to L1
        arm/arm64: KVM: Feed initialized memory to MMIO accesses
        KVM: arm/arm64: Fix MMIO emulation data handling
        powerpc: Always initialize input array when calling epapr_hypercall()
        mmc: spi: Fix card detection during probe
        mm: enforce min addr even if capable() in expand_downwards()
        x86/uaccess: Don't leak the AC flag into __put_user() value evaluation
        USB: serial: option: add Telit ME910 ECM composition
        USB: serial: cp210x: add ID for Ingenico 3070
        USB: serial: ftdi_sio: add ID for Hjelmslund Electronics USB485
        cpufreq: Use struct kobj_attribute instead of struct global_attr
        sockfs: getxattr: Fail with -EOPNOTSUPP for invalid attribute names
        ncpfs: fix build warning of strncpy
        isdn: isdn_tty: fix build warning of strncpy
        staging: lustre: fix buffer overflow of string buffer
        net-sysfs: Fix mem leak in netdev_register_kobject
        sky2: Disable MSI on Dell Inspiron 1545 and Gateway P-79
        team: Free BPF filter when unregistering netdev
        bnxt_en: Drop oversize TX packets to prevent errors.
        net: nfc: Fix NULL dereference on nfc_llcp_build_tlv fails
        xen-netback: fix occasional leak of grant ref mappings under memory pressure
        net: Add __icmp_send helper.
        net: avoid use IPCB in cipso_v4_error
        net: phy: Micrel KSZ8061: link failure after cable connect
        x86/CPU/AMD: Set the CPB bit unconditionally on F17h
        applicom: Fix potential Spectre v1 vulnerabilities
        MIPS: irq: Allocate accurate order pages for irq stack
        hugetlbfs: fix races and page leaks during migration
        netlabel: fix out-of-bounds memory accesses
        net: dsa: mv88e6xxx: Fix u64 statistics
        ip6mr: Do not call __IP6_INC_STATS() from preemptible context
        media: uvcvideo: Fix 'type' check leading to overflow
        vti4: Fix a ipip packet processing bug in 'IPCOMP' virtual tunnel
        perf tools: Handle TOPOLOGY headers with no CPU
        IB/{hfi1, qib}: Fix WC.byte_len calculation for UD_SEND_WITH_IMM
        ipvs: Fix signed integer overflow when setsockopt timeout
        iommu/amd: Fix IOMMU page flush when detach device from a domain
        xtensa: SMP: fix ccount_timer_shutdown
        xtensa: SMP: fix secondary CPU initialization
        xtensa: smp_lx200_defconfig: fix vectors clash
        xtensa: SMP: mark each possible CPU as present
        xtensa: SMP: limit number of possible CPUs by NR_CPUS
        net: altera_tse: fix msgdma_tx_completion on non-zero fill_level case
        net: hns: Fix wrong read accesses via Clause 45 MDIO protocol
        net: stmmac: dwmac-rk: fix error handling in rk_gmac_powerup()
        gpio: vf610: Mask all GPIO interrupts
        nfs: Fix NULL pointer dereference of dev_name
        scsi: libfc: free skb when receiving invalid flogi resp
        platform/x86: Fix unmet dependency warning for SAMSUNG_Q10
        cifs: fix computation for MAX_SMB2_HDR_SIZE
        x86/kexec: Don't setup EFI info if EFI runtime is not enabled
        x86_64: increase stack size for KASAN_EXTRA
        mm, memory_hotplug: is_mem_section_removable do not pass the end of a zone
        mm, memory_hotplug: test_pages_in_a_zone do not pass the end of zone
        fs/drop_caches.c: avoid softlockups in drop_pagecache_sb()
        autofs: drop dentry reference only when it is never used
        autofs: fix error return in autofs_fill_super()
        ARM: pxa: ssp: unneeded to free devm_ allocated data
        irqchip/mmp: Only touch the PJ4 IRQ & FIQ bits on enable/disable
        dmaengine: at_xdmac: Fix wrongfull report of a channel as in use
        dmaengine: dmatest: Abort test in case of mapping error
        s390/qeth: fix use-after-free in error path
        perf symbols: Filter out hidden symbols from labels
        MIPS: Remove function size check in get_frame_info()
        Input: wacom_serial4 - add support for Wacom ArtPad II tablet
        Input: elan_i2c - add id for touchpad found in Lenovo s21e-20
        iscsi_ibft: Fix missing break in switch statement
        futex,rt_mutex: Restructure rt_mutex_finish_proxy_lock()
        ARM: dts: exynos: Add minimal clkout parameters to Exynos3250 PMU
        Revert "x86/platform/UV: Use efi_runtime_lock to serialise BIOS calls"
        ARM: dts: exynos: Do not ignore real-world fuse values for thermal zone 0 on Exynos5420
        udplite: call proper backlog handlers
        netfilter: x_tables: enforce nul-terminated table name from getsockopt GET_ENTRIES
        netfilter: nfnetlink_log: just returns error for unknown command
        netfilter: nfnetlink_acct: validate NFACCT_FILTER parameters
        netfilter: nf_conntrack_tcp: Fix stack out of bounds when parsing TCP options
        KEYS: restrict /proc/keys by credentials at open time
        l2tp: fix infoleak in l2tp_ip6_recvmsg()
        net: hsr: fix memory leak in hsr_dev_finalize()
        net: sit: fix UBSAN Undefined behaviour in check_6rd
        net/x25: fix use-after-free in x25_device_event()
        net/x25: reset state in x25_connect()
        pptp: dst_release sk_dst_cache in pptp_sock_destruct
        ravb: Decrease TxFIFO depth of Q3 and Q2 to one
        route: set the deleted fnhe fnhe_daddr to 0 in ip_del_fnhe to fix a race
        tcp: handle inet_csk_reqsk_queue_add() failures
        net/mlx4_core: Fix reset flow when in command polling mode
        net/mlx4_core: Fix qp mtt size calculation
        net/x25: fix a race in x25_bind()
        mdio_bus: Fix use-after-free on device_register fails
        net: Set rtm_table to RT_TABLE_COMPAT for ipv6 for tables > 255
        missing barriers in some of unix_sock ->addr and ->path accesses
        ipvlan: disallow userns cap_net_admin to change global mode/flags
        vxlan: test dev->flags & IFF_UP before calling gro_cells_receive()
        vxlan: Fix GRO cells race condition between receive and link delete
        net/hsr: fix possible crash in add_timer()
        gro_cells: make sure device is up in gro_cells_receive()
        tcp/dccp: remove reqsk_put() from inet_child_forget()
        ALSA: bebob: use more identical mod_alias for Saffire Pro 10 I/O against Liquid Saffire 56
        fs/9p: use fscache mutex rather than spinlock
        It's wrong to add len to sector_nr in raid10 reshape twice
        media: videobuf2-v4l2: drop WARN_ON in vb2_warn_zero_bytesused()
        9p: use inode->i_lock to protect i_size_write() under 32-bit
        9p/net: fix memory leak in p9_client_create
        ASoC: fsl_esai: fix register setting issue in RIGHT_J mode
        stm class: Fix an endless loop in channel allocation
        crypto: caam - fixed handling of sg list
        crypto: ahash - fix another early termination in hash walk
        gpu: ipu-v3: Fix i.MX51 CSI control registers offset
        gpu: ipu-v3: Fix CSI offsets for imx53
        s390/dasd: fix using offset into zero size array error
        ARM: OMAP2+: Variable "reg" in function omap4_dsi_mux_pads() could be uninitialized
        Input: matrix_keypad - use flush_delayed_work()
        i2c: cadence: Fix the hold bit setting
        Input: st-keyscan - fix potential zalloc NULL dereference
        ARM: 8824/1: fix a migrating irq bug when hotplug cpu
        assoc_array: Fix shortcut creation
        scsi: libiscsi: Fix race between iscsi_xmit_task and iscsi_complete_task
        net: systemport: Fix reception of BPDUs
        pinctrl: meson: meson8b: fix the sdxc_a data 1..3 pins
        net: mv643xx_eth: disable clk on error path in mv643xx_eth_shared_probe()
        ASoC: topology: free created components in tplg load error
        arm64: Relax GIC version check during early boot
        tmpfs: fix link accounting when a tmpfile is linked in
        ARC: uacces: remove lp_start, lp_end from clobber list
        phonet: fix building with clang
        mac80211_hwsim: propagate genlmsg_reply return code
        net: set static variable an initial value in atl2_probe()
        tmpfs: fix uninitialized return value in shmem_link
        stm class: Prevent division by zero
        crypto: arm64/aes-ccm - fix logical bug in AAD MAC handling
        CIFS: Fix read after write for files with read caching
        tracing: Do not free iter->trace in fail path of tracing_open_pipe()
        ACPI / device_sysfs: Avoid OF modalias creation for removed device
        regulator: s2mps11: Fix steps for buck7, buck8 and LDO35
        regulator: s2mpa01: Fix step values for some LDOs
        clocksource/drivers/exynos_mct: Move one-shot check from tick clear to ISR
        clocksource/drivers/exynos_mct: Clear timer interrupt when shutdown
        s390/virtio: handle find on invalid queue gracefully
        scsi: virtio_scsi: don't send sc payload with tmfs
        scsi: target/iscsi: Avoid iscsit_release_commands_from_conn() deadlock
        m68k: Add -ffreestanding to CFLAGS
        btrfs: ensure that a DUP or RAID1 block group has exactly two stripes
        Btrfs: fix corruption reading shared and compressed extents after hole punching
        crypto: pcbc - remove bogus memcpy()s with src == dest
        cpufreq: tegra124: add missing of_node_put()
        cpufreq: pxa2xx: remove incorrect __init annotation
        ext4: fix crash during online resizing
        ext2: Fix underflow in ext2_max_size()
        clk: ingenic: Fix round_rate misbehaving with non-integer dividers
        dmaengine: usb-dmac: Make DMAC system sleep callbacks explicit
        mm/vmalloc: fix size check for remap_vmalloc_range_partial()
        kernel/sysctl.c: add missing range check in do_proc_dointvec_minmax_conv
        intel_th: Don't reference unassigned outputs
        parport_pc: fix find_superio io compare code, should use equal test.
        i2c: tegra: fix maximum transfer size
        perf bench: Copy kernel files needed to build mem{cpy,set} x86_64 benchmarks
        serial: 8250_pci: Fix number of ports for ACCES serial cards
        serial: 8250_pci: Have ACCES cards that use the four port Pericom PI7C9X7954 chip use the pci_pericom_setup()
        jbd2: clear dirty flag when revoking a buffer from an older transaction
        jbd2: fix compile warning when using JBUFFER_TRACE
        powerpc/32: Clear on-stack exception marker upon exception return
        powerpc/wii: properly disable use of BATs when requested.
        powerpc/powernv: Make opal log only readable by root
        powerpc/83xx: Also save/restore SPRG4-7 during suspend
        ARM: s3c24xx: Fix boolean expressions in osiris_dvs_notify
        dm: fix to_sector() for 32bit
        NFS41: pop some layoutget errors to application
        perf intel-pt: Fix CYC timestamp calculation after OVF
        perf auxtrace: Define auxtrace record alignment
        perf intel-pt: Fix overlap calculation for padding
        md: Fix failed allocation of md_register_thread
        NFS: Fix an I/O request leakage in nfs_do_recoalesce
        NFS: Don't recoalesce on error in nfs_pageio_complete_mirror()
        nfsd: fix memory corruption caused by readdir
        nfsd: fix wrong check in write_v4_end_grace()
        PM / wakeup: Rework wakeup source timer cancellation
        rcu: Do RCU GP kthread self-wakeup from softirq and interrupt
        media: uvcvideo: Avoid NULL pointer dereference at the end of streaming
        drm/radeon/evergreen_cs: fix missing break in switch statement
        KVM: nVMX: Sign extend displacements of VMX instr's mem operands
        KVM: nVMX: Ignore limit checks on VMX instructions using flat segments
        KVM: X86: Fix residual mmio emulation request to userspace
        Linux 4.4.177

Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>

Conflicts:
	arch/arm/kernel/irq.c
	sound/core/compress_offload.c
2019-03-23 08:25:40 -07:00
Zhang, Jun
25c4c45193 rcu: Do RCU GP kthread self-wakeup from softirq and interrupt
commit 1d1f898df6586c5ea9aeaf349f13089c6fa37903 upstream.

The rcu_gp_kthread_wake() function is invoked when it might be necessary
to wake the RCU grace-period kthread.  Because self-wakeups are normally
a useless waste of CPU cycles, if rcu_gp_kthread_wake() is invoked from
this kthread, it naturally refuses to do the wakeup.

Unfortunately, natural though it might be, this heuristic fails when
rcu_gp_kthread_wake() is invoked from an interrupt or softirq handler
that interrupted the grace-period kthread just after the final check of
the wait-event condition but just before the schedule() call.  In this
case, a wakeup is required, even though the call to rcu_gp_kthread_wake()
is within the RCU grace-period kthread's context.  Failing to provide
this wakeup can result in grace periods failing to start, which in turn
results in out-of-memory conditions.
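
The wakeup decision described above can be modeled in user space as a
pair of predicates.  This is only a sketch: struct wake_ctx and both
function names are hypothetical, and the three flags stand in for the
"current is the grace-period kthread", in_interrupt(), and
in_serving_softirq() checks named in this commit message.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical snapshot of the caller's context. */
struct wake_ctx {
	bool is_gp_kthread;      /* running in the grace-period kthread's context */
	bool in_interrupt;       /* stand-in for in_interrupt() */
	bool in_serving_softirq; /* stand-in for in_serving_softirq() */
};

/* Old heuristic: never wake when called from the kthread itself. */
static bool old_should_wake(const struct wake_ctx *c)
{
	return !c->is_gp_kthread;
}

/* Fixed heuristic: a handler that interrupted the kthread must still wake it. */
static bool new_should_wake(const struct wake_ctx *c)
{
	return !c->is_gp_kthread || c->in_interrupt || c->in_serving_softirq;
}

/* Exercises the one case where the two heuristics disagree. */
static void wake_self_check(void)
{
	struct wake_ctx racy = { true, false, true };

	assert(!old_should_wake(&racy));
	assert(new_should_wake(&racy));
}
```

The two predicates differ only when a handler runs on top of the
grace-period kthread, which is exactly the narrow window described above.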

This race window is quite narrow, but it actually did happen during real
testing.  It would of course need to be fixed even if it was strictly
theoretical in nature.

This patch does not Cc stable because it does not apply cleanly to
earlier kernel versions.

Fixes: 48a7639ce8 ("rcu: Make callers awaken grace-period kthread")
Reported-by: "He, Bo" <bo.he@intel.com>
Co-developed-by: "Zhang, Jun" <jun.zhang@intel.com>
Co-developed-by: "He, Bo" <bo.he@intel.com>
Co-developed-by: "xiao, jin" <jin.xiao@intel.com>
Co-developed-by: Bai, Jie A <jie.a.bai@intel.com>
Signed-off: "Zhang, Jun" <jun.zhang@intel.com>
Signed-off: "He, Bo" <bo.he@intel.com>
Signed-off: "xiao, jin" <jin.xiao@intel.com>
Signed-off: Bai, Jie A <jie.a.bai@intel.com>
Signed-off-by: "Zhang, Jun" <jun.zhang@intel.com>
[ paulmck: Switch from !in_softirq() to "!in_interrupt() &&
  !in_serving_softirq()" to avoid redundant wakeups and to also handle the
  interrupt-handler scenario as well as the softirq-handler scenario that
  actually occurred in testing. ]
Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
Link: https://lkml.kernel.org/r/CD6925E8781EFD4D8E11882D20FC406D52A11F61@SHSMSX104.ccr.corp.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-03-23 08:44:39 +01:00
Nathan Chancellor
23a2dede1f Merge 4.4.174 into android-msm-wahoo-4.4
Changes in 4.4.174: (35 commits)
        inet: frags: change inet_frags_init_net() return value
        inet: frags: add a pointer to struct netns_frags
        inet: frags: refactor ipfrag_init()
        inet: frags: refactor ipv6_frag_init()
        inet: frags: refactor lowpan_net_frag_init()
        rhashtable: add rhashtable_lookup_get_insert_key()
        rhashtable: Add rhashtable_lookup()
        rhashtable: add schedule points
        inet: frags: use rhashtables for reassembly units
        net: ieee802154: 6lowpan: fix frag reassembly
        ipfrag: really prevent allocation on netns exit
        inet: frags: remove some helpers
        inet: frags: get rif of inet_frag_evicting()
        inet: frags: remove inet_frag_maybe_warn_overflow()
        inet: frags: break the 2GB limit for frags storage
        inet: frags: do not clone skb in ip_expire()
        ipv6: frags: rewrite ip6_expire_frag_queue()
        rhashtable: reorganize struct rhashtable layout
        inet: frags: reorganize struct netns_frags
        inet: frags: get rid of ipfrag_skb_cb/FRAG_CB
        inet: frags: fix ip6frag_low_thresh boundary
        ip: discard IPv4 datagrams with overlapping segments.
        net: modify skb_rbtree_purge to return the truesize of all purged skbs.
        ipv6: defrag: drop non-last frags smaller than min mtu
        net: pskb_trim_rcsum() and CHECKSUM_COMPLETE are friends
        ip: use rb trees for IP frag queue.
        ip: add helpers to process in-order fragments faster.
        ip: process in-order fragments efficiently
        ip: frags: fix crash in ip_do_fragment()
        ipv4: frags: precedence bug in ip_expire()
        inet: frags: better deal with smp races
        net: fix pskb_trim_rcsum_slow() with odd trim offset
        net: ipv4: do not handle duplicate fragments as overlapping
        rcu: Force boolean subscript for expedited stall warnings
        Linux 4.4.174

Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
2019-02-08 08:59:53 -07:00
Paul E. McKenney
60c7f8fca1 rcu: Force boolean subscript for expedited stall warnings
commit ec3833ed02ae6ef2a933ece9de7cbab0c64c699e upstream.

The cpu_online() function can return values other than 0 and 1, which
can result in subscript overflow when applied to a two-element array.
This commit allows for this behavior by using "!!" on the return value
from cpu_online() when used as a subscript.
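
A minimal user-space sketch of the problem and the "!!" fix follows.
The fake_cpu_online() helper is a hypothetical stand-in for cpu_online()
that returns the raw mask bit rather than 0 or 1; the two-character
"O." string is assumed here as the two-element array being indexed.

```c
#include <assert.h>

/* Hypothetical stand-in for cpu_online(): returns the raw mask bit,
 * so the result is 0 or some nonzero value that need not be 1.
 * Assumes cpu is small enough that the bit fits in an int. */
static int fake_cpu_online(unsigned long online_mask, int cpu)
{
	return (int)(online_mask & (1UL << cpu));
}

/* "!!" folds any nonzero value to 1, keeping the subscript in {0, 1}. */
static char online_flag(int online)
{
	return "O."[!!online];
}

static void online_flag_self_check(void)
{
	/* CPU 2 online: the raw value is 4, which would overrun "O."[4]. */
	assert(fake_cpu_online(0x4UL, 2) == 4);
	assert(online_flag(fake_cpu_online(0x4UL, 2)) == '.');
	assert(online_flag(fake_cpu_online(0x4UL, 1)) == 'O');
}
```

Without the "!!", the raw value 4 would index past the end of the
string, which is the subscript overflow this commit guards against.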

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: "Rantala, Tommi" <tommi.t.rantala@nokia.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-02-08 11:25:33 +01:00
Nathan Chancellor
e4a2ad5046 Merge 4.4.105 into android-msm-wahoo-4.4-oreo-mr1
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>

Conflicts:
	drivers/gpu/drm/msm/msm_gem_submit.c
	drivers/media/v4l2-core/v4l2-compat-ioctl32.c
	drivers/mmc/core/bus.c
	drivers/net/wireless/iwlwifi/iwl-nvm-parse.c
	drivers/scsi/ufs/ufshcd.h
	kernel/power/process.c
	net/wireless/nl80211.c
	sound/usb/card.c
2017-12-09 13:44:10 -07:00
Paul E. McKenney
5fd4551659 rcu: Allow for page faults in NMI handlers
commit 28585a832602747cbfa88ad8934013177a3aae38 upstream.

A number of architectures invoke rcu_irq_enter() on exception entry in
order to allow RCU read-side critical sections in the exception handler
when the exception is from an idle or nohz_full CPU.  This works, at
least unless the exception happens in an NMI handler.  In that case,
rcu_nmi_enter() would already have exited the extended quiescent state,
which would mean that rcu_irq_enter() would (incorrectly) cause RCU
to think that it is again in an extended quiescent state.  This will
in turn result in lockdep splats in response to later RCU read-side
critical sections.

This commit therefore causes rcu_irq_enter() and rcu_irq_exit() to
take no action if there is an rcu_nmi_enter() in effect, thus avoiding
the unscheduled return to RCU quiescent state.  This in turn should
make the kernel safe for on-demand RCU voyeurism.

Link: http://lkml.kernel.org/r/20170922211022.GA18084@linux.vnet.ibm.com

Cc: stable@vger.kernel.org
Fixes: 0be964be0 ("module: Sanitize RCU usage and locking")
Reported-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-10-18 09:20:41 +02:00
Thierry Strudel
1bfb0526f6 Merge branch 'android-msm-8998-4.4-common' into android-msm-wahoo-4.4
Conflicts:
	Makefile
	arch/arm64/configs/wahoo_defconfig
	arch/arm64/include/asm/cpufeature.h
	arch/arm64/kernel/sleep.S
	arch/arm64/kernel/vmlinux.lds.S
	arch/arm64/mm/fault.c
	drivers/android/binder.c
	drivers/firmware/efi/arm-init.c
	drivers/firmware/efi/efi.c
	drivers/input/keyboard/gpio_keys.c
	drivers/input/misc/Makefile
	drivers/input/misc/vl53L0/Makefile
	drivers/input/misc/vl53L0/inc/vl53l010_api.h
	drivers/input/misc/vl53L0/inc/vl53l010_device.h
	drivers/input/misc/vl53L0/inc/vl53l010_strings.h
	drivers/input/misc/vl53L0/inc/vl53l010_tuning.h
	drivers/input/misc/vl53L0/inc/vl53l0_api.h
	drivers/input/misc/vl53L0/inc/vl53l0_api_calibration.h
	drivers/input/misc/vl53L0/inc/vl53l0_api_core.h
	drivers/input/misc/vl53L0/inc/vl53l0_api_histogram.h
	drivers/input/misc/vl53L0/inc/vl53l0_api_ranging.h
	drivers/input/misc/vl53L0/inc/vl53l0_api_strings.h
	drivers/input/misc/vl53L0/inc/vl53l0_def.h
	drivers/input/misc/vl53L0/inc/vl53l0_device.h
	drivers/input/misc/vl53L0/inc/vl53l0_interrupt_threshold_settings.h
	drivers/input/misc/vl53L0/inc/vl53l0_platform.h
	drivers/input/misc/vl53L0/inc/vl53l0_platform_log.h
	drivers/input/misc/vl53L0/inc/vl53l0_tuning.h
	drivers/input/misc/vl53L0/inc/vl53l0_types.h
	drivers/input/misc/vl53L0/src/vl53l010_api.c
	drivers/input/misc/vl53L0/src/vl53l010_tuning.c
	drivers/input/misc/vl53L0/src/vl53l0_api.c
	drivers/input/misc/vl53L0/src/vl53l0_api_calibration.c
	drivers/input/misc/vl53L0/src/vl53l0_api_core.c
	drivers/input/misc/vl53L0/src/vl53l0_api_histogram.c
	drivers/input/misc/vl53L0/src/vl53l0_api_ranging.c
	drivers/input/misc/vl53L0/src/vl53l0_api_strings.c
	drivers/input/misc/vl53L0/src/vl53l0_i2c_platform.c
	drivers/input/misc/vl53L0/src/vl53l0_platform.c
	drivers/input/misc/vl53L0/src/vl53l0_port_i2c.c
	drivers/input/misc/vl53L0/stmvl53l0-cci.h
	drivers/input/misc/vl53L0/stmvl53l0-i2c.h
	drivers/input/misc/vl53L0/stmvl53l0.h
	drivers/input/misc/vl53L0/stmvl53l0_module-cci.c
	drivers/input/misc/vl53L0/stmvl53l0_module-i2c.c
	drivers/input/misc/vl53L0/stmvl53l0_module.c
	drivers/input/touchscreen/Makefile
	drivers/leds/leds-qpnp.c
	drivers/media/platform/msm/camera_v2/isp/msm_isp_stats_util.c
	drivers/media/platform/msm/camera_v2/msm.c
	drivers/pinctrl/qcom/pinctrl-msm.c
	drivers/platform/msm/ipa/ipa_v3/ipa_client.c
	drivers/platform/msm/mhi/mhi_ssr.c
	drivers/power/supply/qcom/qpnp-smb2.c
	drivers/power/supply/qcom/smb-lib.c
	drivers/power/supply/qcom/smb-lib.h
	drivers/soc/qcom/icnss.c
	drivers/soc/qcom/qdsp6v2/audio_notifier.c
	drivers/soc/qcom/service-notifier.c
	drivers/video/fbdev/msm/mdss_panel.h
	fs/exec.c
	fs/ext4/inode.c
	fs/ext4/readpage.c
	fs/namei.c
	fs/sdcardfs/derived_perm.c
	fs/sdcardfs/file.c
	fs/sdcardfs/inode.c
	fs/sdcardfs/lookup.c
	fs/sdcardfs/main.c
	fs/sdcardfs/multiuser.h
	fs/sdcardfs/packagelist.c
	fs/sdcardfs/sdcardfs.h
	fs/sdcardfs/super.c
	fs/utimes.c
	include/linux/string.h
	lib/kstrtox.c
	lib/string.c
	net/ipv4/tcp_ipv4.c
	net/unix/af_unix.c
	sound/soc/codecs/wcd934x/wcd934x-mbhc.h
	sound/soc/msm/msm8998.c

Change-Id: I918ebad22a5f81d48be07bd2bc2ac435ed9acb0a
Signed-off-by: Thierry Strudel <tstrudel@google.com>
2017-04-07 12:27:45 -07:00
Prasad Sodagudi
4f659aa55e rcu: Induce msm watchdog bite for rcu stalls
Every RCU stall needs to be debugged, so collect RAM dumps on every
RCU stall by inducing a non-secure watchdog bite whenever an RCU
stall is detected.

Change-Id: I6c1cfddc92f06b48c3f22fe9970b70f2ec670bf6
Signed-off-by: Prasad Sodagudi <psodagud@codeaurora.org>
2017-03-09 10:07:35 -08:00
Thierry Strudel
05df0a9e71 Merge branch 'android-msm-8998-4.4-common' into android-msm-muskie-4.4
Merging release LA.UM.5.7.R1.07.01.01.253.064 Pre-CS4 0.0.091.1

Bug: 34911851
Change-Id: Iaaf2a1402940c98a3b36457b5fb99059f4a718f8
Signed-off-by: Thierry Strudel <tstrudel@google.com>
2017-02-09 18:08:53 -08:00
Alex Shi
2f0de5192a Merge branch 'linux-linaro-lsk-v4.4' into linux-linaro-lsk-v4.4-android
2016-12-12 22:17:37 +08:00
Ding Tianhong
dfb704f96c rcu: Fix soft lockup for rcu_nocb_kthread
commit bedc1969150d480c462cdac320fa944b694a7162 upstream.

Carrying out the following steps results in a softlockup in the
RCU callback-offload (rcuo) kthreads:

1. Connect to ixgbevf, and set the speed to 10Gb/s.
2. Use ifconfig to bring the nic up and down repeatedly.

[  317.005148] IPv6: ADDRCONF(NETDEV_CHANGE): eth2: link becomes ready
[  368.106005] BUG: soft lockup - CPU#1 stuck for 22s! [rcuos/1:15]
[  368.106005] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
[  368.106005] task: ffff88057dd8a220 ti: ffff88057dd9c000 task.ti: ffff88057dd9c000
[  368.106005] RIP: 0010:[<ffffffff81579e04>]  [<ffffffff81579e04>] fib_table_lookup+0x14/0x390
[  368.106005] RSP: 0018:ffff88061fc83ce8  EFLAGS: 00000286
[  368.106005] RAX: 0000000000000001 RBX: 00000000020155c0 RCX: 0000000000000001
[  368.106005] RDX: ffff88061fc83d50 RSI: ffff88061fc83d70 RDI: ffff880036d11a00
[  368.106005] RBP: ffff88061fc83d08 R08: 0000000000000001 R09: 0000000000000000
[  368.106005] R10: ffff880036d11a00 R11: ffffffff819e0900 R12: ffff88061fc83c58
[  368.106005] R13: ffffffff816154dd R14: ffff88061fc83d08 R15: 00000000020155c0
[  368.106005] FS:  0000000000000000(0000) GS:ffff88061fc80000(0000) knlGS:0000000000000000
[  368.106005] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  368.106005] CR2: 00007f8c2aee9c40 CR3: 000000057b222000 CR4: 00000000000407e0
[  368.106005] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  368.106005] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[  368.106005] Stack:
[  368.106005]  00000000010000c0 ffff88057b766000 ffff8802e380b000 ffff88057af03e00
[  368.106005]  ffff88061fc83dc0 ffffffff815349a6 ffff88061fc83d40 ffffffff814ee146
[  368.106005]  ffff8802e380af00 00000000e380af00 ffffffff819e0900 020155c0010000c0
[  368.106005] Call Trace:
[  368.106005]  <IRQ>
[  368.106005]
[  368.106005]  [<ffffffff815349a6>] ip_route_input_noref+0x516/0xbd0
[  368.106005]  [<ffffffff814ee146>] ? skb_release_data+0xd6/0x110
[  368.106005]  [<ffffffff814ee20a>] ? kfree_skb+0x3a/0xa0
[  368.106005]  [<ffffffff8153698f>] ip_rcv_finish+0x29f/0x350
[  368.106005]  [<ffffffff81537034>] ip_rcv+0x234/0x380
[  368.106005]  [<ffffffff814fd656>] __netif_receive_skb_core+0x676/0x870
[  368.106005]  [<ffffffff814fd868>] __netif_receive_skb+0x18/0x60
[  368.106005]  [<ffffffff814fe4de>] process_backlog+0xae/0x180
[  368.106005]  [<ffffffff814fdcb2>] net_rx_action+0x152/0x240
[  368.106005]  [<ffffffff81077b3f>] __do_softirq+0xef/0x280
[  368.106005]  [<ffffffff8161619c>] call_softirq+0x1c/0x30
[  368.106005]  <EOI>
[  368.106005]
[  368.106005]  [<ffffffff81015d95>] do_softirq+0x65/0xa0
[  368.106005]  [<ffffffff81077174>] local_bh_enable+0x94/0xa0
[  368.106005]  [<ffffffff81114922>] rcu_nocb_kthread+0x232/0x370
[  368.106005]  [<ffffffff81098250>] ? wake_up_bit+0x30/0x30
[  368.106005]  [<ffffffff811146f0>] ? rcu_start_gp+0x40/0x40
[  368.106005]  [<ffffffff8109728f>] kthread+0xcf/0xe0
[  368.106005]  [<ffffffff810971c0>] ? kthread_create_on_node+0x140/0x140
[  368.106005]  [<ffffffff816147d8>] ret_from_fork+0x58/0x90
[  368.106005]  [<ffffffff810971c0>] ? kthread_create_on_node+0x140/0x140

==================================cut here==============================

It turns out that the rcuos callback-offload kthread is busy processing
a very large quantity of RCU callbacks, and it is not relinquishing the
CPU while doing so.  This commit therefore adds a cond_resched_rcu_qs()
within the loop to allow other tasks to run.

Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>
[ paulmck: Substituted cond_resched_rcu_qs for cond_resched. ]
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Dhaval Giani <dhaval.giani@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-12-08 07:15:24 +01:00
Dmitry Vyukov
53b8336401 UPSTREAM: kernel: add kcov code coverage
kcov provides code coverage collection for coverage-guided fuzzing
(randomized testing).  Coverage-guided fuzzing is a testing technique
that uses coverage feedback to determine new interesting inputs to a
system.  A notable user-space example is AFL
(http://lcamtuf.coredump.cx/afl/).  However, this technique is not
widely used for kernel testing due to missing compiler and kernel
support.

kcov does not aim to collect as much coverage as possible.  It aims to
collect more or less stable coverage that is a function of syscall inputs.
To achieve this goal it does not collect coverage in soft/hard
interrupts and instrumentation of some inherently non-deterministic or
non-interesting parts of kernel is disabled (e.g.  scheduler, locking).

Currently there is a single coverage collection mode (tracing), but the
API anticipates additional collection modes.  Initially I also
implemented a second mode which exposes coverage in a fixed-size hash
table of counters (what Quentin used in his original patch).  I've
dropped the second mode for simplicity.

This patch adds the necessary support on kernel side.  The complementary
compiler support was added in gcc revision 231296.

We've used this support to build syzkaller system call fuzzer, which has
found 90 kernel bugs in just 2 months:

  https://github.com/google/syzkaller/wiki/Found-Bugs

We've also found 30+ bugs in our internal systems with syzkaller.
Another (yet unexplored) direction where kcov coverage would greatly
help is more traditional "blob mutation".  For example, mounting a
random blob as a filesystem, or receiving a random blob over wire.

Why not gcov?  A typical fuzzing loop looks as follows: (1) reset
coverage, (2) execute a bit of code, (3) collect coverage, repeat.  A
typical coverage can be just a dozen basic blocks (e.g.  an invalid
input).  In such context gcov becomes prohibitively expensive as
reset/collect coverage steps depend on total number of basic
blocks/edges in program (in case of kernel it is about 2M).  Cost of
kcov depends only on number of executed basic blocks/edges.  On top of
that, kernel requires per-thread coverage because there are always
background threads and unrelated processes that also produce coverage.
With inlined gcov instrumentation per-thread coverage is not possible.

kcov exposes kernel PCs and control flow to user-space which is
insecure.  But debugfs should not be mapped as user accessible.

Based on a patch by Quentin Casasnovas.

[akpm@linux-foundation.org: make task_struct.kcov_mode have type `enum kcov_mode']
[akpm@linux-foundation.org: unbreak allmodconfig]
[akpm@linux-foundation.org: follow x86 Makefile layout standards]
Signed-off-by: Dmitry Vyukov <dvyukov@google.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Cc: syzkaller <syzkaller@googlegroups.com>
Cc: Vegard Nossum <vegard.nossum@oracle.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Tavis Ormandy <taviso@google.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com>
Cc: Kostya Serebryany <kcc@google.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Kees Cook <keescook@google.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Sasha Levin <sasha.levin@oracle.com>
Cc: David Drysdale <drysdale@google.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Kirill A. Shutemov <kirill@shutemov.name>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

(cherry picked from commit 5c9a8750a6409c63a0f01d51a9024861022f6593)
Signed-off-by: Alexander Potapenko <glider@google.com>

Change-Id: Ia7d116ded35eb31a514406f864d7d591606c65f3
2016-11-02 10:21:27 -07:00
Guenter Roeck
8b94247342 ANDROID: rcu_sync: Export rcu_sync_lockdep_assert
x86_64:allmodconfig fails to build with the following error.

ERROR: "rcu_sync_lockdep_assert" [kernel/locking/locktorture.ko] undefined!

Introduced by commit 3228c5eb7a ("RFC: FROMLIST: locking/percpu-rwsem:
Optimize readers and reduce global impact"). The applied upstream version
exports the missing symbol, so let's do the same.

Change-Id: If4e516715c3415fe8c82090f287174857561550d
Fixes: 3228c5eb7a ("RFC: FROMLIST: locking/percpu-rwsem: Optimize ...")
Signed-off-by: Guenter Roeck <groeck@chromium.org>
2016-09-14 14:26:20 +05:30
Peter Zijlstra
a81c69e149 RFC: FROMLIST: cgroup: avoid synchronize_sched() in __cgroup_procs_write()
The current percpu-rwsem read side is entirely free of serializing insns
at the cost of having a synchronize_sched() in the write path.

The latency of the synchronize_sched() is too high for cgroups. The
commit 1ed1328792 talks about the write path being a fairly cold path
but this is not the case for Android which moves task to the foreground
cgroup and back around binder IPC calls from foreground processes to
background processes, so it is significantly hotter than human initiated
operations.

Switch cgroup_threadgroup_rwsem into the slow mode for now to avoid the
problem, hopefully it should not be that slow after another commit
80127a39681b ("locking/percpu-rwsem: Optimize readers and reduce global
impact").

We could just add rcu_sync_enter() into cgroup_init() but we do not want
another synchronize_sched() at boot time, so this patch adds the new helper
which doesn't block but currently can only be called before the first use.

Cc: Tejun Heo <tj@kernel.org>
Cc: Paul McKenney <paulmck@linux.vnet.ibm.com>
Reported-by: John Stultz <john.stultz@linaro.org>
Reported-by: Dmitry Shmidt <dimitrysh@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
[jstultz: backported to 4.4]
Change-Id: I34aa9c394d3052779b56976693e96d861bd255f2
Mailing-list-URL: https://lkml.org/lkml/2016/8/11/557
Signed-off-by: John Stultz <john.stultz@linaro.org>
2016-09-14 14:26:20 +05:30
Peter Zijlstra
00eaad05be RFC: FROMLIST: cgroup: avoid synchronize_sched() in __cgroup_procs_write()
The current percpu-rwsem read side is entirely free of serializing insns
at the cost of having a synchronize_sched() in the write path.

The latency of the synchronize_sched() is too high for cgroups. The
commit 1ed1328792 talks about the write path being a fairly cold path
but this is not the case for Android which moves task to the foreground
cgroup and back around binder IPC calls from foreground processes to
background processes, so it is significantly hotter than human initiated
operations.

Switch cgroup_threadgroup_rwsem into the slow mode for now to avoid the
problem, hopefully it should not be that slow after another commit
80127a39681b ("locking/percpu-rwsem: Optimize readers and reduce global
impact").

We could just add rcu_sync_enter() into cgroup_init() but we do not want
another synchronize_sched() at boot time, so this patch adds the new helper
which doesn't block but currently can only be called before the first use.

Cc: Tejun Heo <tj@kernel.org>
Cc: Paul McKenney <paulmck@linux.vnet.ibm.com>
Reported-by: John Stultz <john.stultz@linaro.org>
Reported-by: Dmitry Shmidt <dimitrysh@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
[jstultz: backported to 4.4]
Change-Id: I34aa9c394d3052779b56976693e96d861bd255f2
Mailing-list-URL: https://lkml.org/lkml/2016/8/11/557
Signed-off-by: John Stultz <john.stultz@linaro.org>
Git-commit: 0c3240a1ef
Git-repo: https://android.googlesource.com/kernel/common/+/android-4.4
Signed-off-by: Omprakash Dhyade <odhyade@codeaurora.org>
2016-08-29 14:18:07 -07:00
Paul E. McKenney
39cd2dd39a Merge branches 'doc.2015.10.06a', 'percpu-rwsem.2015.10.06a' and 'torture.2015.10.06a' into HEAD
doc.2015.10.06a:  Documentation updates.
percpu-rwsem.2015.10.06a:  Optimization of per-CPU reader-writer semaphores.
torture.2015.10.06a:  Torture-test updates.
2015-10-07 16:06:25 -07:00
Paul E. McKenney
d2856b046d Merge branches 'fixes.2015.10.06a' and 'exp.2015.10.07a' into HEAD
exp.2015.10.07a:  Reduce OS jitter of RCU-sched expedited grace periods.
fixes.2015.10.06a:  Miscellaneous fixes.
2015-10-07 16:05:21 -07:00
Paul E. McKenney
338b0f760e rcu: Better hotplug handling for synchronize_sched_expedited()
Earlier versions of synchronize_sched_expedited() can prematurely end
grace periods due to the fact that a CPU marked as cpu_is_offline()
can still be using RCU read-side critical sections during the time that
CPU makes its last pass through the scheduler and into the idle loop
and during the time that a given CPU is in the process of coming online.
This commit therefore eliminates this window by adding additional
interaction with the CPU-hotplug operations.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-10-07 16:02:50 -07:00
Paul E. McKenney
b08517c76d rcu: Enable stall warnings for synchronize_rcu_expedited()
This commit redirects synchronize_rcu_expedited()'s wait to
synchronize_sched_expedited_wait(), thus enabling RCU CPU
stall warnings.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-10-07 16:02:50 -07:00
Paul E. McKenney
c58656382e rcu: Add tasks to expedited stall-warning messages
This commit adds task-print ability to the expedited RCU CPU stall
warning messages in preparation for adding stall warnings to
synchronize_rcu_expedited().

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-10-07 16:02:50 -07:00
Paul E. McKenney
74611ecb0f rcu: Add online/offline info to expedited stall warning message
This commit makes the RCU CPU stall warning message print online/offline
indications immediately after the CPU number.  An "O" indicates global
offline and a "." global online; an "o" indicates that RCU believes the
CPU is offline for the current grace period and a "." otherwise; and an
"N" indicates that RCU believes the CPU will be offline for the next
grace period and a "." otherwise.  So for CPU 10, you would normally see
"10-...:" indicating that everything believes that the CPU is online.
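
The three-character indicator can be sketched in user space as below.
format_cpu_state() and its boolean parameters are hypothetical names
chosen for illustration; the sketch emits only the "<cpu>-XYZ" part and
leaves the trailing colon to the caller.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: writes "<cpu>-XYZ" into buf, where
 *   X is 'O' if the CPU is globally offline, '.' if online,
 *   Y is 'o' if RCU sees it offline for the current grace period,
 *   Z is 'N' if RCU sees it offline for the next grace period.
 * Each bool is 0 or 1, so it is a safe subscript for a 2-char string.
 */
static int format_cpu_state(char *buf, size_t len, int cpu, bool online,
			    bool rcu_online_now, bool rcu_online_next)
{
	return snprintf(buf, len, "%d-%c%c%c", cpu,
			"O."[online], "o."[rcu_online_now],
			"N."[rcu_online_next]);
}

static void format_cpu_state_self_check(void)
{
	char buf[16];

	format_cpu_state(buf, sizeof(buf), 10, true, true, true);
	assert(strcmp(buf, "10-...") == 0);
}
```

A CPU that is offline in all three senses would be rendered with all
three letters, for example "3-OoN" for CPU 3.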

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-10-07 16:02:50 -07:00
Paul E. McKenney
dcdb8807ba rcu: Consolidate expedited CPU selection
Now that sync_sched_exp_select_cpus() and sync_rcu_exp_select_cpus()
are identical aside from the argument to smp_call_function_single(),
this commit consolidates them with a functional argument.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-10-07 16:02:50 -07:00
Paul E. McKenney
66fe6cbee4 rcu: Prepare for consolidating expedited CPU selection
This commit brings sync_sched_exp_select_cpus() into alignment with
sync_rcu_exp_select_cpus(), as a first step towards consolidating them
into one function.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-10-07 16:02:50 -07:00
Paul E. McKenney
807226e2fb rcu: Stop excluding CPU hotplug in synchronize_sched_expedited()
Now that synchronize_sched_expedited() uses IPIs, a hook in
rcu_sched_qs(), and the ->expmask field in the rcu_node combining
tree, it is no longer necessary to exclude CPU hotplug.  Any
races with CPU hotplug will be detected when attempting to send
the IPI.  This commit therefore removes the code excluding
CPU hotplug operations.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-10-07 16:02:49 -07:00
Paul E. McKenney
83c2c735e7 rcu: Stop silencing lockdep false positive for expedited grace periods
This reverts commit af859beaab (rcu: Silence lockdep false positive
for expedited grace periods).  Because synchronize_rcu_expedited()
no longer invokes synchronize_sched_expedited(), ->exp_funnel_mutex
acquisition is no longer nested, so the false positive no longer happens.
This commit therefore removes the extra lockdep data structures, as they
are no longer needed.
2015-10-07 16:02:49 -07:00
Paul E. McKenney
6587a23b6b rcu: Switch synchronize_sched_expedited() to IPI
This commit switches synchronize_sched_expedited() from stop_one_cpu_nowait()
to smp_call_function_single(), thus moving from an IPI and a pair of
context switches to an IPI and a single pass through the scheduler.
Of course, if the scheduler actually does decide to switch to a different
task, there will still be a pair of context switches, but there would
likely have been a pair of context switches anyway, just a bit later.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-10-07 16:01:12 -07:00
Paul E. McKenney
4f441a258f rcutorture: Fix unused-function warning for torturing_tasks()
The torturing_tasks() function is used only in kernels built with
CONFIG_PROVE_RCU=y, so the second definition can result in unused-function
compiler warnings.  This commit adds __maybe_unused to suppress these
warnings.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
2015-10-06 11:28:09 -07:00
Paul E. McKenney
889d487a26 rcutorture: Fix module unwind when bad torture_type specified
The rcutorture module has a list of torture types, and specifying a
type not on this list is supposed to cleanly fail the module load.
Unfortunately, the "fail" happens without the "cleanly".  This commit
therefore adds the needed clean-up after an incorrect torture_type.

Reported-by: David Miller <davem@davemloft.net>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Acked-by: David Miller <davem@davemloft.net>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
2015-10-06 11:28:01 -07:00
Oleg Nesterov
4bace7344d rcu_sync: Cleanup the CONFIG_PROVE_RCU checks
1. Rename __rcu_sync_is_idle() to rcu_sync_lockdep_assert() and
   change it to use rcu_lockdep_assert().

2. Change rcu_sync_is_idle() to return rsp->gp_state == GP_IDLE
   unconditionally, this way we can remove the same check from
   rcu_sync_lockdep_assert() and clearly isolate the debugging
   code.

Note: rcu_sync_enter()->wait_event(gp_state == GP_PASSED) needs
another CONFIG_PROVE_RCU check, the same as is done in ->sync(); but
this needs some simple preparations in the core RCU code to avoid the
code duplication.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
2015-10-06 11:25:45 -07:00
Oleg Nesterov
07899a6e5f rcu_sync: Introduce rcu_sync_dtor()
This commit allows rcu_sync structures to be safely deallocated.
The trick is to add a new ->wait field to the gp_ops array.
This field is a pointer to the rcu_barrier() function corresponding
to the flavor of RCU in question.  This allows a new rcu_sync_dtor()
to wait for any outstanding callbacks before freeing the rcu_sync
structure.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
2015-10-06 11:25:21 -07:00
Oleg Nesterov
3a518b76af rcu_sync: Add CONFIG_PROVE_RCU checks
This commit validates that the caller of rcu_sync_is_idle() holds the
corresponding type of RCU read-side lock, but only in kernels built
with CONFIG_PROVE_RCU=y.  This validation is carried out via a new
rcu_sync_ops->held() method that is checked within rcu_sync_is_idle().

Note that although this does add code to the fast path, it only does so
in kernels built with CONFIG_PROVE_RCU=y.

Suggested-by: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
2015-10-06 11:25:16 -07:00
Oleg Nesterov
82e8c565be rcu_sync: Simplify rcu_sync using new rcu_sync_ops structure
This commit adds the new struct rcu_sync_ops which holds sync/call
methods, and turns the function pointers in rcu_sync_struct into an array
of struct rcu_sync_ops.  This simplifies the "init" helpers by collapsing
a switch statement and explicit multiple definitions into a simple
assignment and a helper macro, respectively.
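
The table-driven idea can be illustrated with a small user-space sketch.
All names here (enum sync_type, struct sync_ops, ops_table and the stub
functions) are hypothetical and do not reproduce the kernel's actual
definitions; the point is only that indexing an array by type replaces
a switch statement in the init helper.

```c
#include <assert.h>

/* Hypothetical flavor selector, used as an index into ops_table. */
enum sync_type { SYNC_RCU, SYNC_SCHED, SYNC_BH };

/* Hypothetical per-flavor method table. */
struct sync_ops {
	void (*sync)(void);
	const char *name;
};

static int last_called = -1; /* records which stub ran, for the demo */

static void sync_rcu_stub(void)   { last_called = SYNC_RCU; }
static void sync_sched_stub(void) { last_called = SYNC_SCHED; }
static void sync_bh_stub(void)    { last_called = SYNC_BH; }

static const struct sync_ops ops_table[] = {
	[SYNC_RCU]   = { .sync = sync_rcu_stub,   .name = "rcu" },
	[SYNC_SCHED] = { .sync = sync_sched_stub, .name = "sched" },
	[SYNC_BH]    = { .sync = sync_bh_stub,    .name = "bh" },
};

struct sync_state {
	enum sync_type type;
};

/* Init is a plain assignment; no switch over the flavor is needed. */
static void sync_state_init(struct sync_state *s, enum sync_type type)
{
	s->type = type;
}

/* Dispatch through the table using the recorded type. */
static void sync_state_wait(const struct sync_state *s)
{
	ops_table[s->type].sync();
}

static void sync_ops_self_check(void)
{
	struct sync_state s;

	sync_state_init(&s, SYNC_SCHED);
	sync_state_wait(&s);
	assert(last_called == SYNC_SCHED);
}
```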

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
2015-10-06 11:25:10 -07:00
Oleg Nesterov
cc44ca848f rcu: Create rcu_sync infrastructure
The rcu_sync infrastructure can be thought of as infrastructure to be
used to implement reader-writer primitives having extremely lightweight
readers during times when there are no writers.  The first use is in
the percpu_rwsem used by the VFS subsystem.

This infrastructure is functionally equivalent to

        struct rcu_sync_struct {
                atomic_t counter;
        };

        /* Check possibility of fast-path read-side operations. */
        static inline bool rcu_sync_is_idle(struct rcu_sync_struct *rss)
        {
                return atomic_read(&rss->counter) == 0;
        }

        /* Tell readers to use slowpaths. */
        static inline void rcu_sync_enter(struct rcu_sync_struct *rss)
        {
                atomic_inc(&rss->counter);
                synchronize_sched();
        }

        /* Allow readers to once again use fastpaths. */
        static inline void rcu_sync_exit(struct rcu_sync_struct *rss)
        {
                synchronize_sched();
                atomic_dec(&rss->counter);
        }

The main difference is that it records the state and only calls
synchronize_sched() if required.  At least some of the calls to
synchronize_sched() will be optimized away when rcu_sync_enter() and
rcu_sync_exit() are invoked repeatedly in quick succession.
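
The effect of recording the state can be shown with a single-threaded toy
model.  This is only a sketch under strong assumptions: there is no
locking, the grace-period wait is simulated by a counter, and
toy_grace_period_elapsed() stands in for whatever deferred mechanism
returns the state to idle.  All toy_* names are hypothetical.

```c
#include <stdbool.h>

enum toy_gp_state { TOY_GP_IDLE, TOY_GP_PASSED };

struct toy_sync {
	enum toy_gp_state state;
	int writers;		/* enter() calls not yet matched by exit() */
	bool cb_pending;	/* a deferred return to idle was requested */
	int sync_calls;		/* number of simulated grace-period waits */
};

/* Simulated grace-period wait: only counts how often it is needed. */
static void toy_synchronize(struct toy_sync *ts)
{
	ts->sync_calls++;
}

/* Readers may use the fast path only while the state is idle. */
static bool toy_is_idle(const struct toy_sync *ts)
{
	return ts->state == TOY_GP_IDLE;
}

/* Wait only if readers might still be on the fast path. */
static void toy_enter(struct toy_sync *ts)
{
	ts->writers++;
	if (ts->state == TOY_GP_IDLE) {
		toy_synchronize(ts);
		ts->state = TOY_GP_PASSED;
	}
}

/* Do not wait here; just request a deferred return to idle. */
static void toy_exit(struct toy_sync *ts)
{
	if (--ts->writers == 0)
		ts->cb_pending = true;
}

/* Stand-in for a deferred callback that runs after a grace period. */
static void toy_grace_period_elapsed(struct toy_sync *ts)
{
	if (ts->cb_pending && ts->writers == 0)
		ts->state = TOY_GP_IDLE;
	ts->cb_pending = false;
}
```

In this model, two back-to-back enter/exit pairs cost one simulated wait
instead of the four that the equivalent code above would perform.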

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
2015-10-06 11:25:04 -07:00
Paul E. McKenney
3836f5337f torture: Consolidate cond_resched_rcu_qs() into stutter_wait()
This commit moves cond_resched_rcu_qs() into stutter_wait(), saving
a line and also avoiding RCU CPU stall warnings from all torture
loops containing a stutter_wait().

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
2015-10-06 11:25:01 -07:00
Paul E. McKenney
c34d2f4184 rcu: Correct comment for values of ->gp_state field
This commit corrects the comment for the values of the ->gp_state field,
which previously incorrectly said that these were for the ->gp_flags
field.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
2015-10-06 11:16:11 -07:00
Petr Mladek
77f81fe08e rcu: Finish folding ->fqs_state into ->gp_state
Commit 4cdfc175c2 ("rcu: Move quiescent-state forcing
into kthread") started the process of folding the old ->fqs_state into
->gp_state, but did not complete it.  This situation does not cause
any malfunction, but can result in extremely confusing trace output.
This commit completes this task of eliminating ->fqs_state in favor
of ->gp_state.

The old ->fqs_state was also used to decide when to collect dyntick-idle
snapshots.  For this purpose, this commit adds a boolean variable to the
kthread, which is set on the first call to rcu_gp_fqs() for a given grace
period and cleared otherwise.

Signed-off-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
2015-10-06 11:15:59 -07:00
Paul E. McKenney
49f5903b47 rcu: Move preemption disabling out of __srcu_read_lock()
Currently, __srcu_read_lock() cannot be invoked from restricted
environments because it contains calls to preempt_disable() and
preempt_enable(), both of which can invoke lockdep, which is a bad
idea in some restricted execution modes.  This commit therefore moves
the preempt_disable() and preempt_enable() from __srcu_read_lock()
to srcu_read_lock().  It also inserts the preempt_disable() and
preempt_enable() around the call to __srcu_read_lock() in do_exit().

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
2015-10-06 11:15:43 -07:00
Paul E. McKenney
7f21aeef72 rcu: Add online/offline info to stall warning message
This commit makes the RCU CPU stall warning message print online/offline
indications immediately after a hyphen following the CPU number.  An "O"
indicates that the global CPU-hotplug system believes that the CPU is
online, an "o" that RCU perceived the CPU to be online at the beginning
of the current expedited grace period, and an "N" that RCU currently
believes that it will perceive the CPU as being online at the beginning
of the next expedited grace period, with "." otherwise for all three
indications.  So for CPU 10, you would normally see "10-OoN:" indicating
that everything believes that the CPU is online.
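
A minimal sketch of how such a field could be formatted is shown below.
The function name format_cpu_state() and its parameters are hypothetical
and are not taken from the kernel source; only the "O", "o", "N", and "."
characters and the "10-OoN:" layout come from the description above.

```c
#include <stdbool.h>
#include <stdio.h>

/*
 * Format "<cpu>-XYZ:" where each position is the indicator letter when
 * the corresponding condition holds and "." otherwise.
 */
static int format_cpu_state(char *buf, size_t len, int cpu,
			    bool hotplug_online,
			    bool rcu_online_current,
			    bool rcu_online_next)
{
	return snprintf(buf, len, "%d-%c%c%c:", cpu,
			hotplug_online ? 'O' : '.',
			rcu_online_current ? 'o' : '.',
			rcu_online_next ? 'N' : '.');
}
```

With all three conditions true for CPU 10, the buffer holds "10-OoN:".
With only the first one true, it holds "10-O..:".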

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-10-06 11:10:18 -07:00
Paul E. McKenney
ee968ac61d rcu: Eliminate panic when silly boot-time fanout specified
This commit loosens rcutree.rcu_fanout_leaf range checks
and replaces a panic() with a fallback to compile-time values.
This fallback is accompanied by a WARN_ON(), and both occur when the
rcutree.rcu_fanout_leaf value is too small to accommodate the number of
CPUs.  For example, given the current four-level limit for the rcu_node
tree, a system with more than 16 CPUs built with CONFIG_RCU_FANOUT=2 must
have rcutree.rcu_fanout_leaf larger than 2.
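
The arithmetic behind that example can be sketched as follows, assuming
that the capacity of the tree is the leaf fanout multiplied by the
interior fanout once for each additional level.  The helper name
max_cpus() is hypothetical.

```c
/*
 * Upper bound on the number of CPUs a tree can cover: the leaf level
 * contributes leaf_fanout, and each level above it multiplies by fanout.
 */
static unsigned long max_cpus(unsigned int leaf_fanout,
			      unsigned int fanout,
			      unsigned int levels)
{
	unsigned long capacity = leaf_fanout;
	unsigned int i;

	for (i = 1; i < levels; i++)
		capacity *= fanout;
	return capacity;
}
```

With four levels and a fanout of 2, a leaf fanout of 2 gives
2 * 2 * 2 * 2 = 16 CPUs, which is why a larger system needs a larger
leaf fanout.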

Reported-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
2015-10-06 11:09:41 -07:00
Boqun Feng
bb73c52bad rcu: Don't disable preemption for Tiny and Tree RCU readers
Because preempt_disable() maps to barrier() for non-debug builds,
it forces the compiler to spill and reload registers.  Because Tree
RCU and Tiny RCU now only appear in CONFIG_PREEMPT=n builds, these
barrier() instances generate needless extra code for each instance of
rcu_read_lock() and rcu_read_unlock().  This extra code slows down Tree
RCU and bloats Tiny RCU.

This commit therefore removes the preempt_disable() and preempt_enable()
from the non-preemptible implementations of __rcu_read_lock() and
__rcu_read_unlock(), respectively.  However, for debug purposes,
preempt_disable() and preempt_enable() are still invoked if
CONFIG_PREEMPT_COUNT=y, because this allows detection of sleeping inside
atomic sections in non-preemptible kernels.

Note that Tiny and Tree RCU operate by coalescing all RCU read-side
critical sections on a given CPU that lie between successive quiescent
states.  It is therefore necessary to compensate for removing barriers
from __rcu_read_lock() and __rcu_read_unlock() by adding them to a
couple of the RCU functions invoked during quiescent states, namely to
rcu_all_qs() and rcu_note_context_switch().  The latter is more paranoia
than necessity, at least until link-time optimizations become more
aggressive.

This is based on an earlier patch by Paul E. McKenney, fixing
a bug encountered in kernels built with CONFIG_PREEMPT=n and
CONFIG_PREEMPT_COUNT=y.

Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-10-06 11:08:23 -07:00
Boqun Feng
db3e8db45e rcu: Use call_rcu_func_t to replace explicit type equivalents
We have had the call_rcu_func_t typedef for quite a while, but we still
use explicit function pointer types in some places.  These types can
confuse cscope and can be hard to read.  This patch therefore replaces
these types with the call_rcu_func_t typedef.

Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
2015-10-06 11:08:19 -07:00
Boqun Feng
b6a4ae766e rcu: Use rcu_callback_t in call_rcu*() and friends
Now that the rcu_callback_t typedef is the type of RCU callbacks, we
should use it as the parameter type in call_rcu*() and friends.  This
saves a few lines of code and makes it clear which functions require an
RCU callback, rather than some other callback, as an argument.

It can also help cscope generate a better database for code reading.
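
The readability gain can be illustrated with a small userspace sketch.
The names cb_head, callback_t, and call_func_t are hypothetical stand-ins
chosen for this example; they are not the kernel definitions.

```c
#include <stddef.h>

struct cb_head {
	struct cb_head *next;
};

/* One typedef for the callback, one for functions that queue callbacks. */
typedef void (*callback_t)(struct cb_head *head);
typedef void (*call_func_t)(struct cb_head *head, callback_t func);

/*
 * Without the typedefs, the first parameter would have to be spelled
 *   void (*crf)(struct cb_head *, void (*)(struct cb_head *))
 */
static void queue_with(call_func_t crf, struct cb_head *head,
		       callback_t func)
{
	crf(head, func);
}

static int invoked;

/* Example callback: records that it ran. */
static void mark_invoked(struct cb_head *head)
{
	(void)head;
	invoked++;
}

/* Example queueing function: runs the callback immediately. */
static void call_immediately(struct cb_head *head, callback_t func)
{
	func(head);
}
```

The typedef names also give tools and readers a single identifier to
search for when looking for every function that accepts a callback.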

Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
2015-10-06 11:08:05 -07:00
Paul E. McKenney
5b74c45890 rcu: Make ->cpu_no_qs be a union for aggregate OR
This commit converts the rcu_data structure's ->cpu_no_qs field
to a union.  The bytewise side of this union allows individual access
to indications as to whether this CPU needs to find a quiescent state
for a normal (.norm) and/or expedited (.exp) grace period.  The setwise
side of the union allows testing whether or not a quiescent state is
needed at all, for either type of grace period.

For now, only .norm is used.  A later commit will introduce the expedited
usage.
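
A userspace sketch of such a union is shown below, assuming two one-byte
flags overlaid by a single two-byte value.  The type name no_qs and the
helper needs_any_qs() are hypothetical; only the .norm and .exp member
names come from the description above.

```c
#include <stdbool.h>
#include <stdint.h>

union no_qs {
	struct {
		uint8_t norm;	/* normal grace period still needs a QS */
		uint8_t exp;	/* expedited grace period still needs a QS */
	} b;			/* bytewise side */
	uint16_t s;		/* setwise side: both bytes at once */
};

/* One comparison covers both flags: the "aggregate OR". */
static bool needs_any_qs(const union no_qs *q)
{
	return q->s != 0;
}
```

Testing q->s replaces a test of q->b.norm || q->b.exp with one load and
one comparison.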

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-09-20 21:16:21 -07:00
Paul E. McKenney
0d43eb34f9 rcu: Invert passed_quiesce and rename to cpu_no_qs
This commit inverts the sense of the rcu_data structure's ->passed_quiesce
field and renames it to ->cpu_no_qs.  This will allow a later commit to
use an "aggregate OR" operation to test expedited as well as normal grace
periods without added overhead.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-09-20 21:16:21 -07:00
Paul E. McKenney
97c668b8e9 rcu: Rename qs_pending to core_needs_qs
An upcoming commit needs to invert the sense of the ->passed_quiesce
rcu_data structure field, so this commit is taking this opportunity
to clarify things a bit by renaming ->qs_pending to ->core_needs_qs.

So if !rdp->core_needs_qs, then this CPU need not concern itself with
quiescent states; in particular, it need not acquire its leaf rcu_node
structure's ->lock to check.  Otherwise, it needs to report the next
quiescent state.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-09-20 21:16:20 -07:00
Paul E. McKenney
bce5fa12aa rcu: Move synchronize_sched_expedited() to combining tree
Currently, synchronize_sched_expedited() uses a single global counter
to track the number of remaining context switches that the current
expedited grace period must wait on.  This is problematic on large
systems, where the resulting memory contention can be pathological.
This commit therefore makes synchronize_sched_expedited() instead use
the combining tree in the same manner as synchronize_rcu_expedited(),
keeping memory contention down to a dull roar.

This commit creates a temporary function sync_sched_exp_select_cpus()
that is very similar to sync_rcu_exp_select_cpus().  A later commit
will consolidate these two functions, which becomes possible when
synchronize_sched_expedited() switches from stop_one_cpu_nowait() to
smp_call_function_single().

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-09-20 21:16:20 -07:00
Paul E. McKenney
8203d6d0ee rcu: Use single-stage IPI algorithm for RCU expedited grace period
The current preemptible-RCU expedited grace-period algorithm invokes
synchronize_sched_expedited() to enqueue all tasks currently running
in a preemptible-RCU read-side critical section, then waits for all the
->blkd_tasks lists to drain.  This works, but results in both an IPI and
a double context switch even on CPUs that do not happen to be running
in a preemptible RCU read-side critical section.

This commit implements a new algorithm that causes less OS jitter.
This new algorithm IPIs all online CPUs that are not idle (from an
RCU perspective), but refrains from self-IPIs.  If a CPU receiving
this IPI is not in a preemptible RCU read-side critical section (or
is just now exiting one), it pushes quiescence up the rcu_node tree;
otherwise, it sets a flag that will be handled by the upcoming outermost
rcu_read_unlock(), which will then push quiescence up the tree.
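
The per-CPU decision described above can be modeled with a
single-threaded toy.  This is only a sketch: the toy_* names are
hypothetical, a counter stands in for pushing quiescence up the tree, and
none of the real concurrency, idle detection, or blocked-reader handling
is represented.

```c
#include <stdbool.h>

struct toy_cpu {
	int nesting;		/* depth of read-side critical sections */
	bool exp_need_qs;	/* set by the IPI handler if a reader is active */
	int qs_reports;		/* quiescent states reported so far */
};

static void toy_read_lock(struct toy_cpu *c)
{
	c->nesting++;
}

/* Only the outermost unlock handles a pending expedited request. */
static void toy_read_unlock(struct toy_cpu *c)
{
	if (--c->nesting == 0 && c->exp_need_qs) {
		c->exp_need_qs = false;
		c->qs_reports++;
	}
}

/* IPI handler: report at once if no reader is active, else defer. */
static void toy_exp_ipi(struct toy_cpu *c)
{
	if (c->nesting == 0)
		c->qs_reports++;
	else
		c->exp_need_qs = true;
}
```

A CPU outside any critical section reports immediately.  A CPU nested two
levels deep reports only when the second toy_read_unlock() returns the
nesting depth to zero.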

The expedited grace period must of course wait on any pre-existing blocked
readers, and newly blocked readers must be queued carefully based on
the state of both the normal and the expedited grace periods.  This
new queueing approach also avoids the need to update boost state,
courtesy of the fact that blocked tasks are no longer ever migrated to
the root rcu_node structure.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-09-20 21:16:19 -07:00
Paul E. McKenney
b9585e940a rcu: Consolidate tree setup for synchronize_rcu_expedited()
This commit replaces sync_rcu_preempt_exp_init1() and
sync_rcu_preempt_exp_init2() with sync_exp_reset_tree_hotplug()
and sync_exp_reset_tree(), which will also be used by
synchronize_sched_expedited(), and sync_rcu_exp_select_nodes(), which
contains code specific to synchronize_rcu_expedited().

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-09-20 21:16:18 -07:00