Commit Graph

122 Commits

Author SHA1 Message Date
Lee Jones
8728c33713 Revert "Revert "mm/rmap: Fix anon_vma->degree ambiguity leading to double-reuse""
This reverts commit 4f35cec76058557d9eaec0d501d03c7657eb56b4 and does so
in an abi-safe way.

This is done by adding the new fields only to the end of the structure
and this structure is only passed around to other functions as a
pointer, the internal structure layout is only touched by the core
kernel, so adding it to the end is safe.

ABI differences manually updated:

Leaf changes summary: 1 artifact changed
Changed leaf types summary: 1 leaf type changed
Removed/Changed/Added functions summary: 0 Removed, 0 Changed, 0 Added function
Removed/Changed/Added variables summary: 0 Removed, 0 Changed, 0 Added variable

'struct anon_vma at rmap.h:29:1' changed:
  type size changed from 704 to 832 (in bits)
  2 data member insertions:
    'unsigned long int num_children', at offset 704 (in bits) at rmap.h:70:1
    'unsigned long int num_active_vmas', at offset 768 (in bits) at rmap.h:72:1
  761 impacted interfaces

Bug: 260678056
Bug: 253167854
Change-Id: Ib1d45625cbc2e0b21330ca3dc2aa7aff34666d31
Signed-off-by: Lee Jones <joneslee@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Signed-off-by: Wilson Sung <wilsonsung@google.com>
(cherry picked from commit d3e1a50cba092fa9c56fc642ee74f360c4b40a17)
2023-05-18 11:19:41 +00:00
Lucas Wei
b2bed6615a Merge android-4.19-stable (4.19.202) into android-msm-pixel-4.19-lts
Merge 4.19.202 into android-4.19-stable
Linux 4.19.202
    spi: mediatek: Fix fifo transfer
  * padata: add separate cpuhp node for CPUHP_PADATA_DEAD
      include/linux/padata.h
  * padata: validate cpumask without removed CPU during offline
      include/linux/cpuhotplug.h
    Revert "watchdog: iTCO_wdt: Account for rebooting on second timeout"
    firmware: arm_scmi: Ensure drivers provide a probe function
    drm/i915: Ensure intel_engine_init_execlist() builds with Clang
  * Revert "Bluetooth: Shutdown controller after workqueues are flushed or cancelled"
      net/bluetooth/hci_core.c
  * bdi: add a ->dev_name field to struct backing_dev_info
      include/linux/backing-dev-defs.h
      mm/backing-dev.c
  * bdi: use bdi_dev_name() to get device name
      block/blk-cgroup.c
      include/trace/events/wbt.h
  * bdi: move bdi_dev_name out of line
      include/linux/backing-dev.h
      mm/backing-dev.c
  * net: Fix zero-copy head len calculation.
      net/core/skbuff.c
    qed: fix possible unpaired spin_{un}lock_bh in _qed_mcp_cmd_and_union()
  * r8152: Fix potential PM refcount imbalance
      drivers/net/usb/r8152.c
    ASoC: tlv320aic31xx: fix reversed bclk/wclk master bits
  * regulator: rt5033: Fix n_voltages settings for BUCK and LDO
      include/linux/mfd/rt5033-private.h
    btrfs: mark compressed range uptodate only if all bio succeed
    Merge 4.19.201 into android-4.19-stable
Linux 4.19.201
    i40e: Add additional info to PHY type error
    Revert "perf map: Fix dso->nsinfo refcounting"
    powerpc/pseries: Fix regression while building external modules
    can: hi311x: fix a signedness bug in hi3110_cmd()
    sis900: Fix missing pci_disable_device() in probe and remove
    tulip: windbond-840: Fix missing pci_disable_device() in probe and remove
  * sctp: fix return value check in __sctp_rcv_asconf_lookup
      net/sctp/input.c
    net/mlx5: Fix flow table chaining
  * net: llc: fix skb_over_panic
      include/net/llc_pdu.h
    mlx4: Fix missing error code in mlx4_load_one()
  * tipc: fix sleeping in tipc accept routine
      net/tipc/socket.c
    i40e: Fix log TC creation failure when max num of queues is exceeded
    i40e: Fix logic of disabling queues
    netfilter: nft_nat: allow to specify layer 4 protocol NAT only
  * netfilter: conntrack: adjust stop timestamp to real expiry value
      net/netfilter/nf_conntrack_core.c
  * cfg80211: Fix possible memory leak in function cfg80211_bss_update
      net/wireless/scan.c
    nfc: nfcsim: fix use after free during module unload
    NIU: fix incorrect error return, missed in previous revert
    can: esd_usb2: fix memory leak
    can: ems_usb: fix memory leak
    can: usb_8dev: fix memory leak
    can: mcba_usb_start(): add missing urb->transfer_dma initialization
    can: raw: raw_setsockopt(): fix raw_rcv panic for sock UAF
    ocfs2: issue zeroout to EOF blocks
    ocfs2: fix zero out valid data
    x86/kvm: fix vcpu-id indexed array sizes
    btrfs: fix rw device counting in __btrfs_free_extra_devids
    x86/asm: Ensure asm/proto.h can be included stand-alone
  * gro: ensure frag0 meets IP header alignment
      include/linux/skbuff.h
      net/core/dev.c
  * virtio_net: Do not pull payload in skb->head
      include/linux/virtio_net.h
    Merge 4.19.200 into android-4.19-stable
Linux 4.19.200
    ARM: dts: versatile: Fix up interrupt controller node names
    cifs: fix the out of range assignment to bit fields in parse_server_interfaces
    firmware: arm_scmi: Fix range check for the maximum number of pending messages
    firmware: arm_scmi: Fix possible scmi_linux_errmap buffer overflow
    hfs: add lock nesting notation to hfs_find_init
    hfs: fix high memory mapping in hfs_bnode_read
    hfs: add missing clean-up in hfs_fill_super
  * sctp: move 198 addresses from unusable to private scope
      include/net/sctp/constants.h
      net/sctp/protocol.c
  * net: annotate data race around sk_ll_usec
      include/net/busy_poll.h
      net/core/sock.c
    net/802/garp: fix memleak in garp_request_join()
    net/802/mrp: fix memleak in mrp_request_join()
  * workqueue: fix UAF in pwq_unbound_release_workfn()
      kernel/workqueue.c
  * af_unix: fix garbage collect vs MSG_PEEK
      net/unix/af_unix.c
  * net: split out functions related to registering inflight socket files
      include/net/af_unix.h
      net/Makefile
      net/unix/Kconfig
      net/unix/Makefile
      net/unix/af_unix.c
      net/unix/garbage.c
      net/unix/scm.c
      net/unix/scm.h
    KVM: x86: determine if an exception has an error code only when injecting it.
    iio: dac: ds4422/ds4424 drop of_node check
    selftest: fix build error in tools/testing/selftests/vm/userfaultfd.c
  * ANDROID: staging: ion: move buffer kmap from begin/end_cpu_access()
      drivers/staging/android/ion/ion.c
    Merge 4.19.199 into android-4.19-stable
Linux 4.19.199
  * xhci: add xhci_get_virt_ep() helper
      drivers/usb/host/xhci-ring.c
      drivers/usb/host/xhci.h
    spi: spi-fsl-dspi: Fix a resource leak in an error handling path
  * PCI: Mark AMD Navi14 GPU ATS as broken
      drivers/pci/quirks.c
    btrfs: compression: don't try to compress if we don't have enough pages
    iio: accel: bma180: Fix BMA25x bandwidth register values
    iio: accel: bma180: Use explicit member assignment
    net: bcmgenet: ensure EXT_ENERGY_DET_MASK is clear
    net: dsa: mv88e6xxx: use correct .stats_set_histogram() on Topaz
    KVM: Use kvm_pfn_t for local PFN variable in hva_to_pfn_remapped()
    KVM: do not allow mapping valid but non-reference-counted pages
    KVM: do not assume PTE is writable after follow_pfn
  * drm: Return -ENOTTY for non-drm ioctls
      drivers/gpu/drm/drm_ioctl.c
      include/drm/drm_ioctl.h
    nds32: fix up stack guard gap
    selftest: use mmap instead of posix_memalign to allocate memory
    ixgbe: Fix packet corruption due to missing DMA sync
    media: ngene: Fix out-of-bounds bug in ngene_command_config_free_buf()
  * tracing: Fix bug in rb_per_cpu_empty() that might cause deadloop.
      kernel/trace/ring_buffer.c
    usb: dwc2: gadget: Fix sending zero length packet in DDMA mode.
    USB: serial: cp210x: add ID for CEL EM3588 USB ZigBee stick
    USB: serial: cp210x: fix comments for GE CS1000
    USB: serial: option: add support for u-blox LARA-R6 family
    usb: renesas_usbhs: Fix superfluous irqs happen after usb_pkt_pop()
    usb: max-3421: Prevent corruption of freed memory
    USB: usb-storage: Add LaCie Rugged USB3-FW to IGNORE_UAS
  * usb: hub: Fix link power management max exit latency (MEL) calculations
      drivers/usb/core/hub.c
  * usb: hub: Disable USB 3 device initiated lpm if exit latency is too high
      drivers/usb/core/hub.c
    KVM: PPC: Book3S: Fix H_RTAS rets buffer overflow
  * xhci: Fix lost USB 2 remote wake
      drivers/usb/host/xhci-hub.c
    ALSA: sb: Fix potential ABBA deadlock in CSP driver
  * ALSA: usb-audio: Add registration quirk for JBL Quantum headsets
      sound/usb/quirks.c
    s390/ftrace: fix ftrace_update_ftrace_func implementation
    Revert "MIPS: add PMD table accounting into MIPS'pmd_alloc_one"
  * proc: Avoid mixing integer types in mem_rw()
      fs/proc/base.c
    drm/panel: raspberrypi-touchscreen: Prevent double-free
  * net: sched: cls_api: Fix the the wrong parameter
      net/sched/cls_api.c
  * sctp: update active_key for asoc when old key is being replaced
      net/sctp/auth.c
  * Revert "USB: quirks: ignore remote wake-up on Fibocom L850-GL LTE modem"
      drivers/usb/core/quirks.c
    nvme-pci: don't WARN_ON in nvme_reset_work if ctrl.state is not RESETTING
    net/sched: act_skbmod: Skip non-Ethernet packets
  * net/tcp_fastopen: fix data races around tfo_active_disable_stamp
      net/ipv4/tcp_fastopen.c
    spi: cadence: Correct initialisation of runtime PM again
    scsi: target: Fix protect handling in WRITE SAME(32)
    scsi: iscsi: Fix iface sysfs attr detection
    netrom: Decrease sock refcount when sock timers expire
    KVM: PPC: Fix kvm_arch_vcpu_ioctl vcpu_load leak
    net: decnet: Fix sleeping inside in af_decnet
    net: fix uninit-value in caif_seqpkt_sendmsg
    bpftool: Check malloc return value in mount_bpffs_for_pin
    s390/bpf: Perform r1 range checking before accessing jit->seen_reg[r1]
    liquidio: Fix unintentional sign extension issue on left shift of u16
    spi: mediatek: fix fifo rx mode
    perf probe-file: Delete namelist in del_events() on the error path
    perf test bpf: Free obj_buf
    perf lzma: Close lzma stream on exit
    perf dso: Fix memory leak in dso__new_map()
    perf probe: Fix dso->nsinfo refcounting
    perf map: Fix dso->nsinfo refcounting
    nvme-pci: do not call nvme_dev_remove_admin from nvme_remove
  * ipv6: fix 'disable_policy' for fwd packets
      net/ipv6/ip6_output.c
    igb: Fix position of assignment to *ring
    igb: Check if num of q_vectors is smaller than max before array access
    iavf: Fix an error handling path in 'iavf_probe()'
    e1000e: Fix an error handling path in 'e1000_probe()'
    fm10k: Fix an error handling path in 'fm10k_probe()'
    igb: Fix an error handling path in 'igb_probe()'
    ixgbe: Fix an error handling path in 'ixgbe_probe()'
    igb: Fix use-after-free error during reset
  * net: ip_tunnel: fix mtu calculation for ETHER tunnel devices
      net/ipv4/ip_tunnel.c
  * udp: annotate data races around unix_sk(sk)->gso_size
      net/ipv4/udp.c
      net/ipv6/udp.c
    bpftool: Properly close va_list 'ap' by va_end() on error
  * ipv6: tcp: drop silly ICMPv6 packet too big messages
      net/ipv4/tcp_output.c
      net/ipv6/tcp_ipv6.c
  * tcp: annotate data races around tp->mtu_info
      net/ipv4/tcp_ipv4.c
      net/ipv6/tcp_ipv6.c
  * dma-buf/sync_file: Don't leak fences on merge failure
      drivers/dma-buf/sync_file.c
  * net: validate lwtstate->data before returning from skb_tunnel_info()
      include/net/dst_metadata.h
  * net: send SYNACK packet with accepted fwmark
      net/ipv6/tcp_ipv6.c
    net: ti: fix UAF in tlan_remove_one
    net: qcom/emac: fix UAF in emac_remove
    net: moxa: fix UAF in moxart_mac_probe
    net: bcmgenet: Ensure all TX/RX queues DMAs are disabled
  * net: bridge: sync fdb to new unicast-filtering ports
      net/bridge/br_if.c
  * netfilter: ctnetlink: suspicious RCU usage in ctnetlink_dump_helpinfo
      net/netfilter/nf_conntrack_netlink.c
  * net: ipv6: fix return value of ip6_skb_dst_mtu
      include/net/ip6_route.h
      net/ipv6/xfrm6_output.c
    net: dsa: mv88e6xxx: enable .rmu_disable() on Topaz
    dm writecache: fix writing beyond end of underlying device when shrinking
    dm writecache: return the exact table values that were set
  * mm: slab: fix kmem_cache_create failed when sysfs node not destroyed
      mm/slab_common.c
  * sched/fair: Fix CFS bandwidth hrtimer expiry type
      kernel/sched/fair.c
    scsi: libfc: Fix array index out of bound exception
    scsi: libsas: Add LUN number check in .slave_alloc callback
    scsi: aic7xxx: Fix unintentional sign extension issue on left shift of u8
    rtc: max77686: Do not enforce (incorrect) interrupt trigger type
  * kbuild: mkcompile_h: consider timestamp if KBUILD_BUILD_TIMESTAMP is set
      scripts/mkcompile_h
  * thermal/core: Correct function name thermal_zone_device_unregister()
      drivers/thermal/thermal_core.c
    arm64: dts: ls208xa: remove bus-num from dspi node
    soc/tegra: fuse: Fix Tegra234-only builds
    ARM: dts: stm32: move stmmac axi config in ethernet node on stm32mp15
    ARM: dts: stm32: fix i2c node name on stm32f746 to prevent warnings
    ARM: dts: rockchip: fix supply properties in io-domains nodes
    arm64: dts: juno: Update SCPI nodes as per the YAML schema
    ARM: dts: stm32: fix timer nodes on STM32 MCU to prevent warnings
    ARM: dts: stm32: fix RCC node name on stm32f429 MCU
    ARM: dts: stm32: fix gpio-keys node on STM32 MCU boards
    rtc: mxc_v2: add missing MODULE_DEVICE_TABLE
    ARM: imx: pm-imx5: Fix references to imx5_cpu_suspend_info
    ARM: dts: imx6: phyFLEX: Fix UART hardware flow control
    ARM: dts: Hurricane 2: Fix NAND nodes names
    ARM: dts: BCM63xx: Fix NAND nodes names
    ARM: NSP: dts: fix NAND nodes names
    ARM: Cygnus: dts: fix NAND nodes names
    ARM: brcmstb: dts: fix NAND nodes names
    reset: ti-syscon: fix to_ti_syscon_reset_data macro
    arm64: dts: rockchip: Fix power-controller node names for rk3328
    ARM: dts: rockchip: Fix power-controller node names for rk3288
    ARM: dts: rockchip: Fix IOMMU nodes properties on rk322x
    ARM: dts: rockchip: Fix the timer clocks order
    arm64: dts: rockchip: fix pinctrl sleep nodename for rk3399.dtsi
    ARM: dts: rockchip: fix pinctrl sleep nodename for rk3036-kylin and rk3288
    ARM: dts: gemini: add device_type on pci
    ARM: dts: gemini: rename mdio to the right name
  * ANDROID: generate_initcall_order.pl: Use two dash long options for llvm-nm
      scripts/generate_initcall_order.pl
  * Revert "media: subdev: disallow ioctl for saa6588/davinci"
      include/media/v4l2-subdev.h
  * ANDROID: GKI: fix up crc change in ip.h
      include/net/ip.h
    Merge 4.19.198 into android-4.19-stable
Linux 4.19.198
  * seq_file: disallow extremely large seq buffer allocations
      fs/seq_file.c
    scsi: scsi_dh_alua: Fix signedness bug in alua_rtpg()
  * net: bridge: multicast: fix PIM hello router port marking race
      net/bridge/br_multicast.c
    MIPS: vdso: Invalid GIC access through VDSO
    mips: disable branch profiling in boot/decompress.o
    mips: always link byteswap helpers into decompressor
    scsi: be2iscsi: Fix an error handling path in beiscsi_dev_probe()
    ARM: dts: imx6q-dhcom: Add gpios pinctrl for i2c bus recovery
    ARM: dts: imx6q-dhcom: Fix ethernet plugin detection problems
    ARM: dts: imx6q-dhcom: Fix ethernet reset time properties
    ARM: dts: am437x: align ti,pindir-d0-out-d1-in property with dt-shema
    ARM: dts: am335x: align ti,pindir-d0-out-d1-in property with dt-shema
    memory: fsl_ifc: fix leak of private memory on probe failure
    memory: fsl_ifc: fix leak of IO mapping on probe failure
  * reset: bail if try_module_get() fails
      drivers/reset/core.c
    ARM: dts: BCM5301X: Fixup SPI binding
    ARM: dts: r8a7779, marzen: Fix DU clock names
    arm64: dts: renesas: v3msk: Fix memory size
  * rtc: fix snprintf() checking in is_rtc_hctosys()
      drivers/rtc/rtc-proc.c
    memory: atmel-ebi: add missing of_node_put for loop iteration
    ARM: dts: exynos: fix PWM LED max brightness on Odroid XU4
    ARM: dts: exynos: fix PWM LED max brightness on Odroid HC1
    ARM: dts: exynos: fix PWM LED max brightness on Odroid XU/XU3
    reset: a10sr: add missing of_match_table reference
    hexagon: use common DISCARDS macro
    NFSv4/pNFS: Don't call _nfs4_pnfs_v3_ds_connect multiple times
    ALSA: isa: Fix error return code in snd_cmi8330_probe()
    virtio_net: move tx vq operation under tx queue lock
    x86/fpu: Limit xstate copy size in xstateregs_set()
    PCI: iproc: Support multi-MSI only on uniprocessor kernel
    PCI: iproc: Fix multi-MSI base vector number allocation
    ubifs: Set/Clear I_LINKABLE under i_lock for whiteout inode
    nfs: fix acl memory leak of posix_acl_create()
    watchdog: aspeed: fix hardware timeout calculation
    um: fix error return code in winch_tramp()
    um: fix error return code in slip_open()
    NFSv4: Initialise connection to the server in nfs4_alloc_client()
  * power: supply: rt5033_battery: Fix device tree enumeration
      drivers/power/supply/Kconfig
    PCI/sysfs: Fix dsm_label_utf16s_to_utf8s() buffer overrun
  * f2fs: add MODULE_SOFTDEP to ensure crc32 is included in the initramfs
      fs/f2fs/super.c
    virtio_console: Assure used length from device is limited
    virtio_net: Fix error handling in virtnet_restore()
    virtio-blk: Fix memory leak among suspend/resume procedure
    ACPI: video: Add quirk for the Dell Vostro 3350
    ACPI: AMBA: Fix resource name in /proc/iomem
    pwm: tegra: Don't modify HW state in .remove callback
    power: supply: ab8500: add missing MODULE_DEVICE_TABLE
    power: supply: charger-manager: add missing MODULE_DEVICE_TABLE
  * NFS: nfs_find_open_context() may only select open files
      include/linux/nfs_fs.h
    ceph: remove bogus checks and WARN_ONs from ceph_set_page_dirty
    orangefs: fix orangefs df output.
    PCI: tegra: Add missing MODULE_DEVICE_TABLE
    x86/fpu: Return proper error codes from user access functions
    watchdog: iTCO_wdt: Account for rebooting on second timeout
    watchdog: Fix possible use-after-free by calling del_timer_sync()
    watchdog: sc520_wdt: Fix possible use-after-free in wdt_turnoff()
    watchdog: Fix possible use-after-free in wdt_startup()
    ARM: 9087/1: kprobes: test-thumb: fix for LLVM_IAS=1
    power: reset: gpio-poweroff: add missing MODULE_DEVICE_TABLE
    power: supply: max17042: Do not enforce (incorrect) interrupt trigger type
    power: supply: ab8500: Avoid NULL pointers
    pwm: spear: Don't modify HW state in .remove callback
  * lib/decompress_unlz4.c: correctly handle zero-padding around initrds.
      lib/decompress_unlz4.c
  * i2c: core: Disable client irq on reboot/shutdown
      drivers/i2c/i2c-core-base.c
    intel_th: Wait until port is in reset before programming it
    staging: rtl8723bs: fix macro value for 2.4Ghz only device
    ALSA: hda: Add IRQ check for platform_get_irq()
    backlight: lm3630a: Fix return code of .update_status() callback
    powerpc/boot: Fixup device-tree on little endian
    usb: gadget: hid: fix error return code in hid_bind()
  * usb: gadget: f_hid: fix endianness issue with descriptors
      drivers/usb/gadget/function/f_hid.c
  * ALSA: bebob: add support for ToneWeal FW66
      sound/firewire/Kconfig
    Input: hideep - fix the uninitialized use in hideep_nvm_unlock()
  * ASoC: soc-core: Fix the error return code in snd_soc_of_parse_audio_routing()
      sound/soc/soc-core.c
    gpio: pca953x: Add support for the On Semi pca9655
    selftests/powerpc: Fix "no_handler" EBB selftest
    ALSA: ppc: fix error return code in snd_pmac_probe()
    gpio: zynq: Check return value of pm_runtime_get_sync
    powerpc/ps3: Add dma_mask to ps3_dma_region
    ALSA: sb: Fix potential double-free of CSP mixer elements
    selftests: timers: rtcpie: skip test if default RTC device does not exist
    s390/sclp_vt220: fix console name to match device
    mfd: da9052/stmpe: Add and modify MODULE_DEVICE_TABLE
    scsi: qedi: Fix null ref during abort handling
    scsi: iscsi: Fix shost->max_id use
  * scsi: iscsi: Fix conn use after free during resets
      include/scsi/libiscsi.h
  * scsi: iscsi: Add iscsi_cls_conn refcount helpers
      include/scsi/scsi_transport_iscsi.h
    fs/jfs: Fix missing error code in lmLogInit()
    scsi: scsi_dh_alua: Check for negative result value
    tty: serial: 8250: serial_cs: Fix a memory leak in error handling path
    ALSA: ac97: fix PM reference leak in ac97_bus_remove()
  * scsi: core: Cap scsi_host cmd_per_lun at can_queue
      drivers/scsi/hosts.c
    scsi: lpfc: Fix crash when lpfc_sli4_hba_setup() fails to initialize the SGLs
    scsi: lpfc: Fix "Unexpected timeout" error in direct attach topology
    w1: ds2438: fixing bug that would always get page0
  * Revert "ALSA: bebob/oxfw: fix Kconfig entry for Mackie d.2 Pro"
      sound/firewire/Kconfig
    misc/libmasm/module: Fix two use after free in ibmasm_init_one
    tty: serial: fsl_lpuart: fix the potential risk of division or modulo by zero
    PCI: aardvark: Fix kernel panic during PIO transfer
    PCI: aardvark: Don't rely on jiffies while holding spinlock
    tracing: Do not reference char * as a string in histograms
  * scsi: core: Fix bad pointer dereference when ehandler kthread is invalid
      drivers/scsi/hosts.c
    KVM: X86: Disable hardware breakpoints unconditionally before kvm_x86->run()
    KVM: x86: Use guest MAXPHYADDR from CPUID.0x8000_0008 iff TDP is enabled
  * smackfs: restrict bytes count in smk_set_cipso()
      security/smack/smackfs.c
    jfs: fix GPF in diFree
    pinctrl: mcp23s08: Fix missing unlock on error in mcp23s08_irq()
    media: uvcvideo: Fix pixel format change for Elgato Cam Link 4K
    media: gspca/sunplus: fix zero-length control requests
    media: gspca/sq905: fix control-request direction
    media: zr364xx: fix memory leak in zr364xx_start_readpipe
    media: dtv5100: fix control-request directions
  * media: subdev: disallow ioctl for saa6588/davinci
      include/media/v4l2-subdev.h
    PCI: aardvark: Fix checking for PIO Non-posted Request
  * PCI: Leave Apple Thunderbolt controllers on for s2idle or standby
      drivers/pci/quirks.c
    dm btree remove: assign new_root only when removal succeeds
  * coresight: tmc-etf: Fix global-out-of-bounds in tmc_update_etf_buffer()
      drivers/hwtracing/coresight/coresight-tmc-etf.c
    ipack/carriers/tpci200: Fix a double free in tpci200_pci_probe
  * tracing: Resize tgid_map to pid_max, not PID_MAX_DEFAULT
      kernel/trace/trace.c
  * tracing: Simplify & fix saved_tgids logic
      kernel/trace/trace.c
  * seq_buf: Fix overflow in seq_buf_putmem_hex()
      lib/seq_buf.c
  * power: supply: ab8500: Fix an old bug
      include/linux/mfd/abx500/ux500_chargalg.h
    ipmi/watchdog: Stop watchdog timer when the current action is 'none'
    qemu_fw_cfg: Make fw_cfg_rev_attr a proper kobj_attribute
    ASoC: tegra: Set driver_name=tegra for all machine drivers
  * clocksource/arm_arch_timer: Improve Allwinner A64 timer workaround
      drivers/clocksource/arm_arch_timer.c
  * cpu/hotplug: Cure the cpusets trainwreck
      kernel/cpu.c
    ata: ahci_sunxi: Disable DIPM
    mmc: core: Allow UHS-I voltage switch for SDSC cards if supported
    mmc: core: clear flags before allowing to retune
    mmc: sdhci: Fix warning message when accessing RPMB in HS400 mode
    drm/msm/mdp4: Fix modifier support enabling
    pinctrl/amd: Add device HID for new AMD GPIO controller
    drm/amd/display: fix incorrrect valid irq check
    drm/radeon: Add the missed drm_gem_object_put() in radeon_user_framebuffer_create()
  * usb: gadget: f_fs: Fix setting of device and driver data cross-references
      drivers/usb/gadget/function/f_fs.c
    powerpc/barrier: Avoid collision with clang's __lwsync macro
  * fuse: reject internal errno
      fs/fuse/dev.c
    serial: mvebu-uart: fix calculation of clock divisor
    serial: mvebu-uart: clarify the baud rate derivation
  * bdi: Do not use freezable workqueue
      mm/backing-dev.c
  * fscrypt: don't ignore minor_hash when hash is 0
      fs/crypto/fname.c
    MIPS: set mips32r5 for virt extensions
  * sctp: add size validation when walking chunks
      net/sctp/input.c
  * sctp: validate from_addr_param return
      include/net/sctp/structs.h
      net/sctp/bind_addr.c
      net/sctp/input.c
      net/sctp/ipv6.c
      net/sctp/protocol.c
      net/sctp/sm_make_chunk.c
    Bluetooth: btusb: fix bt fiwmare downloading failure issue for qca btsoc.
  * Bluetooth: Shutdown controller after workqueues are flushed or cancelled
      net/bluetooth/hci_core.c
  * Bluetooth: Fix the HCI to MGMT status conversion table
      net/bluetooth/mgmt.c
    RDMA/cma: Fix rdma_resolve_route() memory leak
  * net: ip: avoid OOM kills with large UDP sends over loopback
      net/ipv4/ip_output.c
      net/ipv6/ip6_output.c
    media, bpf: Do not copy more entries than user space requested
  * wireless: wext-spy: Fix out-of-bounds warning
      net/wireless/wext-spy.c
    sfc: error code if SRIOV cannot be disabled
    sfc: avoid double pci_remove of VFs
    iwlwifi: pcie: free IML DMA memory allocation
    iwlwifi: mvm: don't change band on bound PHY contexts
    RDMA/rxe: Don't overwrite errno from ib_umem_get()
    vsock: notify server to shutdown when client has pending signal
    atm: nicstar: register the interrupt handler in the right place
    atm: nicstar: use 'dma_free_coherent' instead of 'kfree'
    MIPS: add PMD table accounting into MIPS'pmd_alloc_one
    rtl8xxxu: Fix device info for RTL8192EU devices
  * net: fix mistake path for netdev_features_strings
      include/linux/netdev_features.h
      include/uapi/linux/ethtool.h
    cw1200: add missing MODULE_DEVICE_TABLE
    wl1251: Fix possible buffer overflow in wl1251_cmd_scan
    wlcore/wl12xx: Fix wl12xx get_mac error if device is in ELP
  * xfrm: Fix error reporting in xfrm_state_construct.
      net/xfrm/xfrm_user.c
  * selinux: use __GFP_NOWARN with GFP_NOWAIT in the AVC
      security/selinux/avc.c
    fjes: check return value after calling platform_get_resource()
    net: micrel: check return value after calling platform_get_resource()
    net: mvpp2: check return value after calling platform_get_resource()
    net: bcmgenet: check return value after calling platform_get_resource()
    virtio_net: Remove BUG() to avoid machine dead
    ice: set the value of global config lock timeout longer
    pinctrl: mcp23s08: fix race condition in irq handler
    dm space maps: don't reset space map allocation cursor when committing
    RDMA/cxgb4: Fix missing error code in create_qp()
  * ipv6: use prandom_u32() for ID generation
      net/ipv6/output_core.c
    clk: tegra: Ensure that PLLU configuration is applied properly
    clk: renesas: r8a77995: Add ZA2 clock
    e100: handle eeprom as little endian
    udf: Fix NULL pointer dereference in udf_symlink function
    drm/virtio: Fix double free on probe failure
    reiserfs: add check for invalid 1st journal block
  * net: Treat __napi_schedule_irqoff() as __napi_schedule() on PREEMPT_RT
      net/core/dev.c
    atm: nicstar: Fix possible use-after-free in nicstar_cleanup()
    mISDN: fix possible use-after-free in HFC_cleanup()
    atm: iphase: fix possible use-after-free in ia_module_exit()
    hugetlb: clear huge pte during flush function on mips platform
    drm/amd/display: fix use_max_lb flag for 420 pixel formats
    net: pch_gbe: Use proper accessors to BE data in pch_ptp_match()
    drm/amd/amdgpu/sriov disable all ip hw status by default
  * drm/zte: Don't select DRM_KMS_FB_HELPER
      drivers/gpu/drm/zte/Kconfig
  * drm/mxsfb: Don't select DRM_KMS_FB_HELPER
      drivers/gpu/drm/mxsfb/Kconfig
    mmc: vub3000: fix control-request direction
    mmc: block: Disable CMDQ on the ioctl path
    perf llvm: Return -ENOMEM when asprintf() fails
    selftests/vm/pkeys: fix alloc_random_pkey() to make it really, really random
    mm/huge_memory.c: don't discard hugepage if other processes are mapping it
    vfio/pci: Handle concurrent vma faults
    arm64: dts: marvell: armada-37xx: Fix reg for standard variant of UART
    serial: mvebu-uart: correctly calculate minimal possible baudrate
    powerpc: Offline CPU in stop_this_cpu()
    leds: ktd2692: Fix an error handling path
    leds: as3645a: Fix error return code in as3645a_parse_node()
  * configfs: fix memleak in configfs_release_bin_file
      fs/configfs/file.c
    ASoC: atmel-i2s: Fix usage of capture and playback at the same time
    extcon: max8997: Add missing modalias string
    extcon: sm5502: Drop invalid register write in sm5502_reg_data
    phy: ti: dm816x: Fix the error handling path in 'dm816x_usb_phy_probe()
    scsi: mpt3sas: Fix error return value in _scsih_expander_add()
    mtd: rawnand: marvell: add missing clk_disable_unprepare() on error in marvell_nfc_resume()
  * of: Fix truncation of memory sizes on 32-bit platforms
      drivers/of/fdt.c
      drivers/of/of_reserved_mem.c
    ASoC: cs42l42: Correct definition of CS42L42_ADC_PDN_MASK
    iio: prox: isl29501: Fix buffer alignment in iio_push_to_buffers_with_timestamp()
    serial: 8250: Actually allow UPF_MAGIC_MULTIPLIER baud rates
    staging: mt7621-dts: fix pci address for PCI memory range
    staging: gdm724x: check for overflow in gdm_lte_netif_rx()
    staging: gdm724x: check for buffer overflow in gdm_lte_multi_sdu_pkt()
    iio: adc: ti-ads8688: Fix alignment of buffer in iio_push_to_buffers_with_timestamp()
    iio: adc: mxs-lradc: Fix buffer alignment in iio_push_to_buffers_with_timestamp()
    iio: adc: hx711: Fix buffer alignment in iio_push_to_buffers_with_timestamp()
    eeprom: idt_89hpesx: Restore printing the unsupported fwnode name
    eeprom: idt_89hpesx: Put fwnode in matching case during ->probe()
    s390: appldata depends on PROC_SYSCTL
    visorbus: fix error return code in visorchipset_init()
    fsi/sbefifo: Fix reset timeout
    fsi/sbefifo: Clean up correct FIFO when receiving reset request from SBE
    fsi: scom: Reset the FSI2PIB engine for any error
    fsi: core: Fix return of error values on failures
    scsi: FlashPoint: Rename si_flags field
    tty: nozomi: Fix the error handling path of 'nozomi_card_init()'
    char: pcmcia: error out if 'num_bytes_read' is greater than 4 in set_protocol()
    Input: hil_kbd - fix error return code in hil_dev_connect()
    ASoC: rsnd: tidyup loop on rsnd_adg_clk_query()
    ASoC: hisilicon: fix missing clk_disable_unprepare() on error in hi6210_i2s_startup()
    iio: potentiostat: lmp91000: Fix alignment of buffer in iio_push_to_buffers_with_timestamp()
    iio: light: tcs3472: Fix buffer alignment in iio_push_to_buffers_with_timestamp()
    iio: light: tcs3414: Fix buffer alignment in iio_push_to_buffers_with_timestamp()
    iio: light: isl29125: Fix buffer alignment in iio_push_to_buffers_with_timestamp()
    iio: prox: as3935: Fix buffer alignment in iio_push_to_buffers_with_timestamp()
    iio: prox: pulsed-light: Fix buffer alignment in iio_push_to_buffers_with_timestamp()
    iio: prox: srf08: Fix buffer alignment in iio_push_to_buffers_with_timestamp()
    iio: humidity: am2315: Fix buffer alignment in iio_push_to_buffers_with_timestamp()
    iio: gyro: bmg160: Fix buffer alignment in iio_push_to_buffers_with_timestamp()
    iio: adc: vf610: Fix buffer alignment in iio_push_to_buffers_with_timestamp()
    iio: adc: ti-ads1015: Fix buffer alignment in iio_push_to_buffers_with_timestamp()
    iio: accel: stk8ba50: Fix buffer alignment in iio_push_to_buffers_with_timestamp()
    iio: accel: stk8312: Fix buffer alignment in iio_push_to_buffers_with_timestamp()
    iio: accel: kxcjk-1013: Fix buffer alignment in iio_push_to_buffers_with_timestamp()
    iio: accel: hid: Fix buffer alignment in iio_push_to_buffers_with_timestamp()
    iio: accel: bma220: Fix buffer alignment in iio_push_to_buffers_with_timestamp()
    iio: accel: bma180: Fix buffer alignment in iio_push_to_buffers_with_timestamp()
    iio: adis_buffer: do not return ints in irq handlers
    mwifiex: re-fix for unaligned accesses
    tty: nozomi: Fix a resource leak in an error handling function
    RDMA/mlx5: Don't access NULL-cleared mpi pointer
    net: sched: fix warning in tcindex_alloc_perfect_hash
  * net: lwtunnel: handle MTU calculation in forwading
      include/net/ip.h
      include/net/ip6_route.h
      net/ipv4/route.c
  * writeback: fix obtain a reference to a freeing memcg css
      fs/fs-writeback.c
  * Bluetooth: Fix handling of HCI_LE_Advertising_Set_Terminated event
      net/bluetooth/hci_event.c
  * Bluetooth: mgmt: Fix slab-out-of-bounds in tlv_data_is_valid
      net/bluetooth/mgmt.c
  * ipv6: fix out-of-bound access in ip6_parse_tlv()
      net/ipv6/exthdrs.c
    ibmvnic: free tx_pool if tso_pool alloc fails
    Revert "ibmvnic: remove duplicate napi_schedule call in open function"
    i40e: Fix autoneg disabling for non-10GBaseT links
    i40e: Fix error handling in i40e_vsi_open
  * bpf: Do not change gso_size during bpf_skb_change_proto()
      net/core/filter.c
  * ipv6: exthdrs: do not blindly use init_net
      net/ipv6/exthdrs.c
    net: bcmgenet: Fix attaching to PYH failed on RPi 4B
    mac80211: remove iwlwifi specific workaround NDPs of null_response
    ieee802154: hwsim: avoid possible crash in hwsim_del_edge_nl()
    ieee802154: hwsim: Fix memory leak in hwsim_add_one
  * net/ipv4: swap flow ports when validating source
      net/ipv4/fib_frontend.c
    vxlan: add missing rcu_read_lock() in neigh_reduce()
    pkt_sched: sch_qfq: fix qfq_change_class() error path
    net: ethernet: ezchip: fix error handling
    net: ethernet: ezchip: fix UAF in nps_enet_remove
    net: ethernet: aeroflex: fix UAF in greth_of_remove
    samples/bpf: Fix the error return code of xdp_redirect's main()
    RDMA/rxe: Fix qp reference counting for atomic ops
    netfilter: nft_tproxy: restrict support to TCP and UDP transport protocols
    netfilter: nft_osf: check for TCP packet before further processing
    netfilter: nft_exthdr: check for IPv6 packet before further processing
    RDMA/mlx5: Don't add slave port to unaffiliated list
  * netlabel: Fix memory leak in netlbl_mgmt_add_common
      net/netlabel/netlabel_mgmt.c
    ath10k: Fix an error code in ath10k_add_interface()
    brcmsmac: mac80211_if: Fix a resource leak in an error handling path
    brcmfmac: correctly report average RSSI in station info
    brcmfmac: fix setting of station info chains bitmask
    ssb: Fix error return code in ssb_bus_scan()
    wcn36xx: Move hal_buf allocation to devm_kmalloc in probe
    ieee802154: hwsim: Fix possible memory leak in hwsim_subscribe_all_others
  * wireless: carl9170: fix LEDS build errors & warnings
      drivers/net/wireless/ath/carl9170/Kconfig
    tools/bpftool: Fix error return code in do_batch()
    drm: qxl: ensure surf.data is ininitialized
    RDMA/rxe: Fix failure during driver load
    ehea: fix error return code in ehea_restart_qps()
    drm/rockchip: cdn-dp-core: add missing clk_disable_unprepare() on error in cdn_dp_grf_write()
    net: pch_gbe: Propagate error from devm_gpio_request_one()
    net: mvpp2: Put fwnode in error case during ->probe()
    ocfs2: fix snprintf() checking
    blk-wbt: make sure throttle is enabled properly
  * blk-wbt: introduce a new disable state to prevent false positive by rwb_enabled()
      block/blk-wbt.h
    ACPI: sysfs: Fix a buffer overrun problem with description_show()
    crypto: nx - Fix RCU warning in nx842_OF_upd_status
    spi: spi-sun6i: Fix chipselect/clock bug
    btrfs: clear log tree recovering status if starting transaction fails
    hwmon: (max31790) Fix fan speed reporting for fan7..12
    hwmon: (max31722) Remove non-standard ACPI device IDs
    media: s5p-g2d: Fix a memory leak on ctx->fh.m2m_ctx
    mmc: usdhi6rol0: fix error return code in usdhi6_probe()
    media: siano: Fix out-of-bounds warnings in smscore_load_firmware_family2()
    media: gspca/gl860: fix zero-length control requests
    media: tc358743: Fix error return code in tc358743_probe_of()
    media: exynos4-is: Fix a use after free in isp_video_release
    pata_ep93xx: fix deferred probing
    media: rc: i2c: Fix an error message
    crypto: ccp - Fix a resource leak in an error handling path
    evm: fix writing <securityfs>/evm overflow
    pata_octeon_cf: avoid WARN_ON() in ata_host_activate()
    media: I2C: change 'RST' to "RSET" to fix multiple build errors
    pata_rb532_cf: fix deferred probing
    sata_highbank: fix deferred probing
    crypto: ux500 - Fix error return code in hash_hw_final()
    crypto: ixp4xx - dma_unmap the correct address
    media: s5p_cec: decrement usage count if disabled
    ia64: mca_drv: fix incorrect array size calculation
    HID: wacom: Correct base usage for capacitive ExpressKey status bits
    ACPI: tables: Add custom DSDT file as makefile prerequisite
  * clocksource: Retry clock read if long delays detected
      kernel/time/clocksource.c
    platform/x86: toshiba_acpi: Fix missing error code in toshiba_acpi_setup_keyboard()
    ACPI: bus: Call kobject_put() in acpi_init() error path
    ACPICA: Fix memory leak caused by _CID repair function
    fs: dlm: fix memory leak when fenced
  * random32: Fix implicit truncation warning in prandom_seed_state()
      include/linux/prandom.h
    fs: dlm: cancel work sync othercon
  * block_dump: remove block_dump feature in mark_inode_dirty()
      fs/fs-writeback.c
    ACPI: EC: Make more Asus laptops use ECDT _GPE
  * lib: vsprintf: Fix handling of number field widths in vsscanf
      lib/kstrtox.c
      lib/kstrtox.h
      lib/vsprintf.c
    hv_utils: Fix passing zero to 'PTR_ERR' warning
    ACPI: processor idle: Fix up C-state latency if not ordered
    EDAC/ti: Add missing MODULE_DEVICE_TABLE
  * HID: do not use down_interruptible() when unbinding devices
      drivers/hid/hid-core.c
    regulator: da9052: Ensure enough delay time for .set_voltage_time_sel
  * btrfs: disable build on platforms having page size 256K
      fs/btrfs/Kconfig
    btrfs: abort transaction if we fail to update the delayed inode
    btrfs: fix error handling in __btrfs_update_delayed_inode
    media: imx-csi: Skip first few frames from a BT.656 source
    media: siano: fix device register error path
    media: dvb_net: avoid speculation from net slot
  * crypto: shash - avoid comparing pointers to exported functions under CFI
      crypto/shash.c
      include/crypto/internal/hash.h
    mmc: via-sdmmc: add a check against NULL pointer dereference
    media: dvd_usb: memory leak in cinergyt2_fe_attach
    media: st-hva: Fix potential NULL pointer dereferences
    media: bt8xx: Fix a missing check bug in bt878_probe
  * media: v4l2-core: Avoid the dangling pointer in v4l2_fh_release
      drivers/media/v4l2-core/v4l2-fh.c
    media: em28xx: Fix possible memory leak of em28xx struct
  * sched/fair: Fix ascii art by relpacing tabs
      kernel/sched/fair.c
    crypto: qat - remove unused macro in FW loader
    crypto: qat - check return code of qat_hal_rd_rel_reg()
    media: pvrusb2: fix warning in pvr2_i2c_core_done
    media: cobalt: fix race condition in setting HPD
    media: cpia2: fix memory leak in cpia2_usb_probe
    crypto: nx - add missing MODULE_DEVICE_TABLE
    regulator: uniphier: Add missing MODULE_DEVICE_TABLE
    spi: omap-100k: Fix the length judgment problem
    spi: spi-topcliff-pch: Fix potential double free in pch_spi_process_messages()
    spi: spi-loopback-test: Fix 'tx_buf' might be 'rx_buf'
  * spi: Make of_register_spi_device also set the fwnode
      drivers/spi/spi.c
  * fuse: check connected before queueing on fpq->io
      fs/fuse/dev.c
    evm: Refuse EVM_ALLOW_METADATA_WRITES only if an HMAC key is loaded
    evm: Execute evm_inode_init_security() only when an HMAC key is loaded
    powerpc/stacktrace: Fix spurious "stale" traces in raise_backtrace_ipi()
  * seq_buf: Make trace_seq_putmem_hex() support data longer than 8
      lib/seq_buf.c
  * tracepoint: Add tracepoint_probe_register_may_exist() for BPF tracing
      include/linux/tracepoint.h
      kernel/trace/bpf_trace.c
      kernel/tracepoint.c
    tracing/histograms: Fix parsing of "sym-offset" modifier
    rsi: fix AP mode with WPA failure due to encrypted EAPOL
    rsi: Assign beacon rate settings to the correct rate_info descriptor field
    ssb: sdio: Don't overwrite const buffer if block_write fails
    ath9k: Fix kernel NULL pointer dereference during ath_reset_internal()
    serial_cs: remove wrong GLOBETROTTER.cis entry
    serial_cs: Add Option International GSM-Ready 56K/ISDN modem
    serial: sh-sci: Stop dmaengine transfer in sci_stop_tx()
    iio: ltr501: ltr501_read_ps(): add missing endianness conversion
    iio: ltr501: ltr559: fix initialization of LTR501_ALS_CONTR
    iio: ltr501: mark register holding upper 8 bits of ALS_DATA{0,1} and PS_DATA as volatile, too
    iio: light: tcs3472: do not free unallocated IRQ
    rtc: stm32: Fix unbalanced clk_disable_unprepare() on probe error path
    s390/cio: dont call css_wait_for_slow_path() inside a lock
    SUNRPC: Should wake up the privileged task firstly.
    SUNRPC: Fix the batch tasks count wraparound.
    can: peak_pciefd: pucan_handle_status(): fix a potential starvation issue in TX path
    can: gw: synchronize rcu operations before removing gw job entry
    can: bcm: delay release of struct bcm_op after synchronize_rcu()
  * ext4: use ext4_grp_locked_error in mb_find_extent
      fs/ext4/mballoc.c
  * ext4: fix avefreec in find_group_orlov
      fs/ext4/ialloc.c
  * ext4: remove check for zero nr_to_scan in ext4_es_scan()
      fs/ext4/extents_status.c
  * ext4: correct the cache_nr in tracepoint ext4_es_shrink_exit
      fs/ext4/extents_status.c
  * ext4: return error code when ext4_fill_flex_info() fails
      fs/ext4/super.c
  * ext4: fix kernel infoleak via ext4_extent_header
      fs/ext4/extents.c
  * ext4: cleanup in-core orphan list if ext4_truncate() failed to get a transaction handle
      fs/ext4/super.c
    btrfs: clear defrag status of a root if starting transaction fails
    btrfs: send: fix invalid path for unlink operations after parent orphanization
    ARM: dts: at91: sama5d4: fix pinctrl muxing
    arm_pmu: Fix write counter incorrect in ARMv7 big-endian mode
    Input: joydev - prevent use of not validated data in JSIOCSBTNMAP ioctl
  * iov_iter_fault_in_readable() should do nothing in xarray case
      lib/iov_iter.c
    ntfs: fix validity check for file name attribute
  * xhci: solve a double free problem while doing s4
      drivers/usb/host/xhci-mem.c
  * usb: typec: Add the missed altmode_id_remove() in typec_register_altmode()
      drivers/usb/typec/class.c
  * usb: dwc3: Fix debugfs creation flow
      drivers/usb/dwc3/core.c
    USB: cdc-acm: blacklist Heimann USB Appset device
    usb: gadget: eem: fix echo command packet response issue
    net: can: ems_usb: fix use-after-free in ems_usb_disconnect()
    Input: usbtouchscreen - fix control-request directions
    media: dvb-usb: fix wrong definition
  * ALSA: usb-audio: Fix OOB access at proc output
      sound/usb/mixer.c
  * ALSA: usb-audio: fix rate on Ozone Z90 USB headset
      sound/usb/format.c
  * scsi: core: Retry I/O for Notify (Enable Spinup) Required error
      drivers/scsi/scsi_lib.c
  * Revert "clocksource/drivers/timer-ti-dm: Handle dra7 timer wrap errata i940"
      include/linux/cpuhotplug.h
    Merge 4.19.197 into android-4.19-stable
Linux 4.19.197
  * clocksource/drivers/timer-ti-dm: Handle dra7 timer wrap errata i940
      include/linux/cpuhotplug.h
    clocksource/drivers/timer-ti-dm: Prepare to handle dra7 timer wrap issue
    clocksource/drivers/timer-ti-dm: Add clockevent and clocksource support
    ARM: OMAP: replace setup_irq() by request_irq()
    KVM: SVM: Call SEV Guest Decommission if ASID binding fails
    xen/events: reset active flag for lateeoi events later
  * kthread: prevent deadlock when kthread_mod_delayed_work() races with kthread_cancel_delayed_work_sync()
      kernel/kthread.c
  * kthread_worker: split code for canceling the delayed work timer
      kernel/kthread.c
    ARM: dts: imx6qdl-sabresd: Remove incorrect power supply assignment
    KVM: SVM: Periodically schedule when unregistering regions on destroy
  * ext4: eliminate bogus error in ext4_data_block_valid_rcu()
      fs/ext4/block_validity.c
    drm/nouveau: fix dma_address check for CPU/GPU sync
    scsi: sr: Return appropriate error code when disk is ejected
  * mm, futex: fix shared futex pgoff on shmem huge page
      include/linux/hugetlb.h
      include/linux/pagemap.h
      kernel/futex.c
  * mm/thp: another PVMW_SYNC fix in page_vma_mapped_walk()
      mm/page_vma_mapped.c
  * mm/thp: fix page_vma_mapped_walk() if THP mapped by ptes
      mm/page_vma_mapped.c
  * mm: page_vma_mapped_walk(): get vma_address_end() earlier
      mm/page_vma_mapped.c
  * mm: page_vma_mapped_walk(): use goto instead of while (1)
      mm/page_vma_mapped.c
  * mm: page_vma_mapped_walk(): add a level of indentation
      mm/page_vma_mapped.c
  * mm: page_vma_mapped_walk(): crossing page table boundary
      mm/page_vma_mapped.c
  * mm: page_vma_mapped_walk(): prettify PVMW_MIGRATION block
      mm/page_vma_mapped.c
  * mm: page_vma_mapped_walk(): use pmde for *pvmw->pmd
      mm/page_vma_mapped.c
  * mm: page_vma_mapped_walk(): settle PageHuge on entry
      mm/page_vma_mapped.c
  * mm: page_vma_mapped_walk(): use page for pvmw->page
      mm/page_vma_mapped.c
    mm: thp: replace DEBUG_VM BUG with VM_WARN when unmap fails for split
  * mm/thp: unmap_mapping_page() to fix THP truncate_cleanup_page()
      include/linux/mm.h
      mm/memory.c
      mm/truncate.c
  * mm/thp: fix page_address_in_vma() on file THP tails
      mm/rmap.c
  * mm/thp: fix vma_address() if virtual address below file offset
      mm/internal.h
      mm/page_vma_mapped.c
      mm/rmap.c
  * mm/thp: try_to_unmap() use TTU_SYNC for safe splitting
      include/linux/rmap.h
      mm/page_vma_mapped.c
      mm/rmap.c
  * mm/thp: make is_huge_zero_pmd() safe and quicker
      include/linux/huge_mm.h
  * mm/thp: fix __split_huge_pmd_locked() on shmem migration entry
      mm/pgtable-generic.c
  * mm/rmap: use page_not_mapped in try_to_unmap()
      mm/rmap.c
  * mm/rmap: remove unneeded semicolon in page_not_mapped()
      mm/rmap.c
  * mm: add VM_WARN_ON_ONCE_PAGE() macro
      include/linux/mmdebug.h

Bug: 196282886
Change-Id: I0af3abfa9aaa6da3e884f1a692da381e8e140bee
Signed-off-by: Lucas Wei <lucaswei@google.com>
2021-08-18 20:48:52 +08:00
Hugh Dickins
205899d6be mm/thp: try_to_unmap() use TTU_SYNC for safe splitting
[ Upstream commit 732ed55823fc3ad998d43b86bf771887bcc5ec67 ]

Stressing huge tmpfs often crashed on unmap_page()'s VM_BUG_ON_PAGE
(!unmap_success): with dump_page() showing mapcount:1, but then its raw
struct page output showing _mapcount ffffffff i.e.  mapcount 0.

And even if that particular VM_BUG_ON_PAGE(!unmap_success) is removed,
it is immediately followed by a VM_BUG_ON_PAGE(compound_mapcount(head)),
and further down an IS_ENABLED(CONFIG_DEBUG_VM) total_mapcount BUG():
all indicative of some mapcount difficulty in development here perhaps.
But the !CONFIG_DEBUG_VM path handles the failures correctly and
silently.

I believe the problem is that once a racing unmap has cleared pte or
pmd, try_to_unmap_one() may skip taking the page table lock, and emerge
from try_to_unmap() before the racing task has reached decrementing
mapcount.

Instead of abandoning the unsafe VM_BUG_ON_PAGE(), and the ones that
follow, use PVMW_SYNC in try_to_unmap_one() in this case: adding
TTU_SYNC to the options, and passing that from unmap_page().

When CONFIG_DEBUG_VM, or for non-debug too? Consensus is to do the same
for both: the slight overhead added should rarely matter, except perhaps
if splitting sparsely-populated multiply-mapped shmem.  Once confident
that bugs are fixed, TTU_SYNC here can be removed, and the race
tolerated.

Link: https://lkml.kernel.org/r/c1e95853-8bcd-d8fd-55fa-e7f2488e78f@google.com
Fixes: fec89c109f ("thp: rewrite freeze_page()/unfreeze_page() with generic rmap walkers")
Signed-off-by: Hugh Dickins <hughd@google.com>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Jue Wang <juew@google.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>
Cc: Miaohe Lin <linmiaohe@huawei.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Naoya Horiguchi <naoya.horiguchi@nec.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Peter Xu <peterx@redhat.com>
Cc: Ralph Campbell <rcampbell@nvidia.com>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Wang Yugui <wangyugui@e16-tech.com>
Cc: Yang Shi <shy828301@gmail.com>
Cc: Zi Yan <ziy@nvidia.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Note on stable backport: upstream TTU_SYNC 0x10 takes the value which
5.11 commit 013339df116c ("mm/rmap: always do TTU_IGNORE_ACCESS") freed.
It is very tempting to backport that commit (as 5.10 already did) and
make no change here; but on reflection, good as that commit is, I'm
reluctant to include any possible side-effect of it in this series.

Signed-off-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-07-11 12:49:28 +02:00
Lucas Wei
5ff9988b08 Merge android-4.19-stable (4.19.183) into android-msm-pixel-4.19-lts
Merge 4.19.183 into android-4.19-stable
    ANDROID: refresh ABI XML to new version
    ANDROID: refresh ABI XML
Linux 4.19.183
    cifs: Fix preauth hash corruption
    x86/apic/of: Fix CPU devicetree-node lookups
  * genirq: Disable interrupts for force threaded handlers
      kernel/irq/manage.c
  * ext4: fix potential error in ext4_do_update_inode
      fs/ext4/inode.c
  * ext4: do not try to set xattr into ea_inode if value is empty
      fs/ext4/xattr.c
  * ext4: find old entry again if failed to rename whiteout
      fs/ext4/namei.c
    x86: Introduce TS_COMPAT_RESTART to fix get_nr_restart_syscall()
    x86: Move TS_COMPAT back to asm/thread_info.h
  * kernel, fs: Introduce and use set_restart_fn() and arch_set_restart_data()
      fs/select.c
      include/linux/thread_info.h
      kernel/futex.c
      kernel/time/alarmtimer.c
      kernel/time/hrtimer.c
      kernel/time/posix-cpu-timers.c
    x86/ioapic: Ignore IRQ2 again
    perf/x86/intel: Fix a crash caused by zero PEBS status
    PCI: rpadlpar: Fix potential drc_name corruption in store functions
    iio: hid-sensor-temperature: Fix issues of timestamp channel
    iio: hid-sensor-prox: Fix scale not correct issue
    iio: hid-sensor-humidity: Fix alignment issue of timestamp channel
    iio: gyro: mpu3050: Fix error handling in mpu3050_trigger_handler
    iio: adis16400: Fix an error code in adis16400_initial_setup()
    iio:adc:qcom-spmi-vadc: add default scale to LR_MUX2_BAT_ID channel
  * iio:adc:stm32-adc: Add HAS_IOMEM dependency
      drivers/iio/adc/Kconfig
  * usb: gadget: configfs: Fix KASAN use-after-free
      drivers/usb/gadget/configfs.c
  * USB: replace hardcode maximum usb string length by definition
      drivers/usb/gadget/composite.c
      drivers/usb/gadget/configfs.c
      drivers/usb/gadget/usbstring.c
      include/uapi/linux/usb/ch9.h
    usbip: Fix incorrect double assignment to udc->ud.tcp_rx
  * usb-storage: Add quirk to defeat Kindle's automatic unload
      drivers/usb/storage/transport.c
      drivers/usb/storage/unusual_devs.h
      include/linux/usb_usual.h
    powerpc: Force inlining of cpu_has_feature() to avoid build failure
    nvme-rdma: fix possible hang when failing to set io queues
    scsi: lpfc: Fix some error codes in debugfs
  * net/qrtr: fix __netdev_alloc_skb call
      net/qrtr/qrtr.c
    sunrpc: fix refcount leak for rpc auth modules
    svcrdma: disable timeouts on rdma backchannel
    NFSD: Repair misuse of sv_lock in 5.10.16-rt30.
    nvmet: don't check iosqes,iocqes for discovery controllers
    ASoC: fsl_ssi: Fix TDM slot setup for I2S mode
    btrfs: fix slab cache flags for free space tree bitmap
    btrfs: fix race when cloning extent buffer during rewind of an old root
    tools build: Check if gettid() is available before providing helper
    tools build feature: Check if eventfd() is available
    tools build feature: Check if get_current_dir_name() is available
    perf tools: Use %define api.pure full instead of %pure-parser
    lkdtm: don't move ctors to .rodata
  * vmlinux.lds.h: Create section for protection against instrumentation
      include/asm-generic/sections.h
      include/asm-generic/vmlinux.lds.h
      include/linux/compiler.h
      include/linux/compiler_types.h
      scripts/mod/modpost.c
  * Revert "PM: runtime: Update device status before letting suppliers suspend"
      drivers/base/power/runtime.c
    ALSA: hda: generic: Fix the micmute led init state
    ASoC: ak5558: Add MODULE_DEVICE_TABLE
    ASoC: ak4458: Add MODULE_DEVICE_TABLE
    ANDROID: clang: update to 12.0.4
    Merge 4.19.182 into android-4.19-stable
Linux 4.19.182
    net: dsa: b53: Support setting learning on port
    net: dsa: tag_mtk: fix 802.1ad VLAN egress
  * bpf: Add sanity check for upper ptr_limit
      kernel/bpf/verifier.c
  * bpf: Simplify alu_limit masking for pointer arithmetic
      kernel/bpf/verifier.c
  * bpf: Fix off-by-one for area size in creating mask to left
      kernel/bpf/verifier.c
  * bpf: Prohibit alu ops for pointer types not defining ptr_limit
      kernel/bpf/verifier.c
  * KVM: arm64: nvhe: Save the SPE context early
      arch/arm64/include/asm/kvm_hyp.h
  * ext4: check journal inode extents more carefully
      fs/ext4/block_validity.c
      fs/ext4/ext4.h
      fs/ext4/extents.c
      fs/ext4/indirect.c
      fs/ext4/inode.c
      fs/ext4/mballoc.c
  * Revert "net: Introduce parse_protocol header_ops callback"
      include/linux/netdevice.h
  * Revert "net: check if protocol extracted by virtio_net_hdr_set_proto is correct"
      include/linux/virtio_net.h
    Merge 4.19.181 into android-4.19-stable
Linux 4.19.181
    xen/events: avoid handling the same event on two cpus at the same time
    xen/events: don't unmask an event channel when an eoi is pending
    xen/events: reset affinity of 2-level event when tearing it down
    KVM: arm64: Fix exclusive limit for IPA size
    hwmon: (lm90) Fix max6658 sporadic wrong temperature reading
    x86/unwind/orc: Disable KASAN checking in the ORC unwinder, part 2
    binfmt_misc: fix possible deadlock in bm_register_write
    powerpc/64s: Fix instruction encoding for lis in ppc_function_entry()
  * include/linux/sched/mm.h: use rcu_dereference in in_vfork()
      include/linux/sched/mm.h
  * stop_machine: mark helpers __always_inline
      include/linux/stop_machine.h
  * hrtimer: Update softirq_expires_next correctly after __hrtimer_get_next_event()
      kernel/time/hrtimer.c
  * configfs: fix a use-after-free in __configfs_open_file
      fs/configfs/file.c
    block: rsxx: fix error return code of rsxx_pci_probe()
    NFSv4.2: fix return value of _nfs4_get_security_label()
    sh_eth: fix TRSCER mask for R7S72100
    staging: comedi: pcl818: Fix endian problem for AI command data
    staging: comedi: pcl711: Fix endian problem for AI command data
    staging: comedi: me4000: Fix endian problem for AI command data
    staging: comedi: dmm32at: Fix endian problem for AI command data
    staging: comedi: das800: Fix endian problem for AI command data
    staging: comedi: das6402: Fix endian problem for AI command data
    staging: comedi: adv_pci1710: Fix endian problem for AI command data
    staging: comedi: addi_apci_1500: Fix endian problem for command sample
    staging: comedi: addi_apci_1032: Fix endian problem for COS sample
    staging: rtl8192e: Fix possible buffer overflow in _rtl92e_wx_set_scan
    staging: rtl8712: Fix possible buffer overflow in r8712_sitesurvey_cmd
    staging: ks7010: prevent buffer overflow in ks_wlan_set_scan()
    staging: rtl8188eu: fix potential memory corruption in rtw_check_beacon_data()
    staging: rtl8712: unterminated string leads to read overflow
    staging: rtl8188eu: prevent ->ssid overflow in rtw_wx_set_scan()
    staging: rtl8192u: fix ->ssid overflow in r8192_wx_set_scan()
    usbip: fix vudc usbip_sockfd_store races leading to gpf
    usbip: fix vhci_hcd attach_store() races leading to gpf
    usbip: fix stub_dev usbip_sockfd_store() races leading to gpf
    usbip: fix vudc to check for stream socket
    usbip: fix vhci_hcd to check for stream socket
    usbip: fix stub_dev to check for stream socket
    USB: serial: cp210x: add some more GE USB IDs
    USB: serial: cp210x: add ID for Acuity Brands nLight Air Adapter
    USB: serial: ch341: add new Product ID
    USB: serial: io_edgeport: fix memory leak in edge_startup
  * usb: xhci: Fix ASMedia ASM1042A and ASM3242 DMA addressing
      drivers/usb/host/xhci-pci.c
  * xhci: Improve detection of device initiated wake signal.
      drivers/usb/host/xhci.c
    usb: renesas_usbhs: Clear PIPECFG for re-enabling pipe with other EPNUM
    USB: usblp: fix a hang in poll() if disconnected
  * usb: dwc3: qcom: Honor wakeup enabled/disabled state
      drivers/usb/dwc3/dwc3-qcom.c
    usb: gadget: f_uac1: stop playback on function disable
    usb: gadget: f_uac2: always increase endpoint max_packet_size by one audio slot
  * USB: gadget: u_ether: Fix a configfs return code
      drivers/usb/gadget/function/u_ether_configfs.h
    Goodix Fingerprint device is not a modem
    mmc: cqhci: Fix random crash when remove mmc module/card
    mmc: core: Fix partition switch time for eMMC
    s390/dasd: fix hanging IO request during DASD driver unbind
    s390/dasd: fix hanging DASD driver unbind
  * Revert 95ebabde382c ("capabilities: Don't allow writing ambiguous v3 file capabilities")
      security/commoncap.c
  * ALSA: usb-audio: Apply the control quirk to Plantronics headsets
      sound/usb/quirks.c
  * ALSA: usb-audio: Fix "cannot get freq eq" errors on Dell AE515 sound bar
      sound/usb/quirks.c
    ALSA: hda: Avoid spurious unsol event handling during S3/S4
    ALSA: hda: Drop the BATCH workaround for AMD controllers
    ALSA: hda/hdmi: Cancel pending works before suspend
  * ALSA: usb: Add Plantronics C320-M USB ctrl msg delay quirk
      sound/usb/quirks.c
    scsi: target: core: Prevent underflow for service actions
  * scsi: target: core: Add cmd length set before cmd complete
      include/target/target_core_backend.h
    scsi: libiscsi: Fix iscsi_prep_scsi_cmd_pdu() error handling
    s390/smp: __smp_rescan_cpus() - move cpumask away from stack
    i40e: Fix memory leak in i40e_probe
  * PCI: Fix pci_register_io_range() memory leak
      drivers/pci/pci.c
      lib/logic_pio.c
    PCI: mediatek: Add missing of_node_put() to fix reference leak
    PCI: xgene-msi: Fix race in installing chained irq handler
    sparc64: Use arch_validate_flags() to validate ADI flag
    sparc32: Limit memblock allocation to low memory
    powerpc/perf: Record counter overflow always if SAMPLE_IP is unset
    powerpc: improve handling of unrecoverable system reset
    powerpc/pci: Add ppc_md.discover_phbs()
    mmc: mediatek: fix race condition between msdc_request_timeout and irq
    mmc: mxs-mmc: Fix a resource leak in an error handling path in 'mxs_mmc_probe()'
    udf: fix silent AED tagLocation corruption
    i2c: rcar: optimize cacheline to minimize HW race condition
  * net: phy: fix save wrong speed and duplex problem if autoneg is on
      drivers/net/phy/phy.c
    media: v4l: vsp1: Fix bru null pointer access
    media: v4l: vsp1: Fix uif null pointer access
    media: usbtv: Fix deadlock on suspend
    sh_eth: fix TRSCER mask for R7S9210
    s390/cio: return -EFAULT if copy_to_user() fails
    drm: meson_drv add shutdown function
  * drm/compat: Clear bounce structures
      drivers/gpu/drm/drm_ioc32.c
    s390/cio: return -EFAULT if copy_to_user() fails again
    perf traceevent: Ensure read cmdlines are null terminated.
    selftests: forwarding: Fix race condition in mirror installation
    net: stmmac: fix watchdog timeout during suspend/resume stress test
    net: stmmac: stop each tx channel independently
  * net: qrtr: fix error return code of qrtr_sendmsg()
      net/qrtr/qrtr.c
    net: davicom: Fix regulator not turned off on driver removal
    net: davicom: Fix regulator not turned off on failed probe
    net: lapbether: Remove netif_start_queue / netif_stop_queue
  * cipso,calipso: resolve a number of problems with the DOI refcounts
      net/ipv4/cipso_ipv4.c
      net/ipv6/calipso.c
      net/netlabel/netlabel_cipso_v4.c
    net: usb: qmi_wwan: allow qmimux add/del with master up
  * net: sched: avoid duplicates in classes dump
      net/sched/sch_api.c
    net: stmmac: fix incorrect DMA channel intr enable setting of EQoS v4.10
    net/mlx4_en: update moderation when config reset
    net: avoid infinite loop in mpls_gso_segment when mpls_hlen == 0
  * net: check if protocol extracted by virtio_net_hdr_set_proto is correct
      include/linux/virtio_net.h
    sh_eth: fix TRSCER mask for SH771x
  * Revert "mm, slub: consider rest of partial list if acquire_slab() fails"
      mm/slub.c
    scripts/recordmcount.{c,pl}: support -ffunction-sections .text.* section names
    cifs: return proper error code in statfs(2)
  * tcp: add sanity tests to TCP_QUEUE_SEQ
      net/ipv4/tcp.c
  * tcp: annotate tp->write_seq lockless reads
      include/net/tcp.h
      net/ipv4/tcp.c
      net/ipv4/tcp_diag.c
      net/ipv4/tcp_ipv4.c
      net/ipv4/tcp_minisocks.c
      net/ipv4/tcp_output.c
      net/ipv6/tcp_ipv6.c
  * tcp: annotate tp->copied_seq lockless reads
      net/ipv4/tcp.c
      net/ipv4/tcp_diag.c
      net/ipv4/tcp_input.c
      net/ipv4/tcp_ipv4.c
      net/ipv4/tcp_minisocks.c
      net/ipv4/tcp_output.c
      net/ipv6/tcp_ipv6.c
    mt76: dma: do not report truncated frames to mac80211
  * netfilter: x_tables: gpf inside xt_find_revision()
      net/netfilter/x_tables.c
    can: flexcan: enable RX FIFO after FRZ/HALT valid
    can: flexcan: assert FRZ bit in flexcan_chip_freeze()
  * can: skb: can_skb_set_owner(): fix ref counting if socket was closed before setting skb ownership
      include/linux/can/skb.h
  * net: Introduce parse_protocol header_ops callback
      include/linux/netdevice.h
  * net: Fix gro aggregation for udp encaps with zero csum
      net/ipv4/udp_offload.c
    ath9k: fix transmitting to stations in dynamic SMPS mode
    ethernet: alx: fix order of calls on resume
  * uapi: nfnetlink_cthelper.h: fix userspace compilation error
      include/uapi/linux/netfilter/nfnetlink_cthelper.h
  * FROMGIT: configfs: fix a use-after-free in __configfs_open_file
      fs/configfs/file.c
    ANDROID: GKI: Enable CONFIG_BT for x86
  * Revert "Revert "zram: close udev startup race condition as default groups""
      drivers/block/zram/zram_drv.c
  * Revert "block: genhd: add 'groups' argument to device_add_disk"
      block/genhd.c
      drivers/scsi/sd.c
      include/linux/genhd.h
    Revert "nvme: register ns_id attributes as default sysfs groups"
    Revert "aoe: register default groups with device_add_disk()"
  * Revert "zram: register default groups with device_add_disk()"
      drivers/block/zram/zram_drv.c
    Revert "virtio-blk: modernize sysfs attribute creation"
    Merge 4.19.180 into android-4.19-stable
Linux 4.19.180
    mmc: sdhci-of-dwcmshc: set SDHCI_QUIRK2_PRESET_VALUE_BROKEN
    drm/msm/a5xx: Remove overwriting A5XX_PC_DBG_ECO_CNTL register
  * misc: eeprom_93xx46: Add quirk to support Microchip 93LC46B eeprom
      include/linux/eeprom_93xx46.h
  * PCI: Add function 1 DMA alias quirk for Marvell 9215 SATA controller
      drivers/pci/quirks.c
    ASoC: Intel: bytcr_rt5640: Add quirk for ARCHOS Cesium 140
    media: cx23885: add more quirks for reset DMA on some AMD IOMMU
  * HID: mf: add support for 0079:1846 Mayflash/Dragonrise USB Gamecube Adapter
      drivers/hid/hid-ids.h
      drivers/hid/hid-quirks.c
    platform/x86: acer-wmi: Add ACER_CAP_KBD_DOCK quirk for the Aspire Switch 10E SW3-016
    platform/x86: acer-wmi: Add support for SW_TABLET_MODE on Switch devices
    platform/x86: acer-wmi: Add ACER_CAP_SET_FUNCTION_MODE capability flag
    platform/x86: acer-wmi: Add new force_caps module parameter
    platform/x86: acer-wmi: Cleanup accelerometer device handling
    platform/x86: acer-wmi: Cleanup ACER_CAP_FOO defines
    mwifiex: pcie: skip cancel_work_sync() on reset failure path
    iommu/amd: Fix sleeping in atomic in increase_address_space()
  * dm table: fix zoned iterate_devices based device capability checks
      drivers/md/dm-table.c
  * dm table: fix DAX iterate_devices based device capability checks
      drivers/md/dm-table.c
  * dm table: fix iterate_devices based device capability checks
      drivers/md/dm-table.c
  * net: dsa: add GRO support via gro_cells
      net/dsa/Kconfig
    r8169: fix resuming from suspend on RTL8105e if machine runs on battery
  * dm verity: fix FEC for RS roots unaligned to block size
      drivers/md/dm-verity-fec.c
    rsxx: Return -EFAULT if copy_to_user() fails
  * RDMA/rxe: Fix missing kconfig dependency on CRYPTO
      drivers/infiniband/sw/rxe/Kconfig
    ALSA: ctxfi: cthw20k2: fix mask on conf to allow 4 bits
    virtio-blk: modernize sysfs attribute creation
  * zram: register default groups with device_add_disk()
      drivers/block/zram/zram_drv.c
    aoe: register default groups with device_add_disk()
    nvme: register ns_id attributes as default sysfs groups
  * block: genhd: add 'groups' argument to device_add_disk
      block/genhd.c
      drivers/scsi/sd.c
      include/linux/genhd.h
  * Revert "zram: close udev startup race condition as default groups"
      drivers/block/zram/zram_drv.c
    usbip: tools: fix build error for multiple definition
    drm/amdgpu: fix parameter error of RREG32_PCIE() in amdgpu_regs_pcie
  * dm bufio: subtract the number of initial sectors in dm_bufio_get_device_size
      drivers/md/dm-bufio.c
  * PM: runtime: Update device status before letting suppliers suspend
      drivers/base/power/runtime.c
    btrfs: unlock extents in btrfs_zero_range in case of quota reservation errors
    btrfs: free correct amount of space in btrfs_delayed_inode_reserve_metadata
    btrfs: validate qgroup inherit for SNAP_CREATE_V2 ioctl
    btrfs: fix raid6 qstripe kmap
    btrfs: raid56: simplify tracking of Q stripe presence
  * ANDROID: GKI: hack up fs/sysfs/file.c to prevent GENKSYMS change
      fs/sysfs/file.c
  * Revert "arm64: Avoid redundant type conversions in xchg() and cmpxchg()"
      arch/arm64/include/asm/atomic_ll_sc.h
      arch/arm64/include/asm/atomic_lse.h
      arch/arm64/include/asm/cmpxchg.h
    Merge 4.19.179 into android-4.19-stable
Linux 4.19.179
    ALSA: hda/realtek: Apply dual codec quirks for MSI Godlike X570 board
    ALSA: hda/realtek: Add quirk for Clevo NH55RZQ
  * media: v4l: ioctl: Fix memory leak in video_usercopy
      drivers/media/v4l2-core/v4l2-ioctl.c
  * swap: fix swapfile read/write offset
      mm/page_io.c
      mm/swapfile.c
  * zsmalloc: account the number of compacted pages correctly
      drivers/block/zram/zram_drv.c
      include/linux/zsmalloc.h
      mm/zsmalloc.c
    xen-netback: respect gnttab_map_refs()'s return value
    Xen/gnttab: handle p2m update errors on a per-slot basis
    scsi: iscsi: Verify lengths on passthrough PDUs
    scsi: iscsi: Ensure sysfs attributes are limited to PAGE_SIZE
  * sysfs: Add sysfs_emit and sysfs_emit_at to format sysfs output
      fs/sysfs/file.c
      include/linux/sysfs.h
    scsi: iscsi: Restrict sessions and handles to admin capabilities
    ASoC: Intel: bytcr_rt5640: Add quirk for the Acer One S1002 tablet
    ASoC: Intel: bytcr_rt5640: Add quirk for the Voyo Winpad A15 tablet
    ASoC: Intel: bytcr_rt5640: Add quirk for the Estar Beauty HD MID 7316R tablet
    parisc: Bump 64-bit IRQ stack size to 64 KB
    btrfs: fix error handling in commit_fs_roots
  * f2fs: fix to set/clear I_LINKABLE under i_lock
      fs/f2fs/namei.c
  * f2fs: handle unallocated section and zone on pinned/atgc
      fs/f2fs/segment.h
    media: uvcvideo: Allow entities with no pads
    drm/amd/display: Guard against NULL pointer deref when get_i2c_info fails
  * PCI: Add a REBAR size quirk for Sapphire RX 5600 XT Pulse
      drivers/pci/pci.c
    crypto: tcrypt - avoid signed overflow in byte count
    staging: most: sound: add sanity check for function argument
  * Bluetooth: Fix null pointer dereference in amp_read_loc_assoc_final_data
      net/bluetooth/amp.c
    x86/build: Treat R_386_PLT32 relocation as R_386_PC32
    ath10k: fix wmi mgmt tx queue full due to race condition
    pktgen: fix misuse of BUG_ON() in pktgen_thread_worker()
    Bluetooth: hci_h5: Set HCI_QUIRK_SIMULTANEOUS_DISCOVERY for btrtl
    wlcore: Fix command execute failure 19 for wl12xx
    vt/consolemap: do font sum unsigned
    x86/reboot: Add Zotac ZBOX CI327 nano PCI reboot quirk
    staging: fwserial: Fix error handling in fwserial_create
    rsi: Move card interrupt handling to RX thread
    rsi: Fix TX EAPOL packet handling against iwlwifi AP
    dt-bindings: net: btusb: DT fix s/interrupt-name/interrupt-names/
  * net: bridge: use switchdev for port flags set through sysfs too
      net/bridge/br_sysfs_if.c
    mm/hugetlb.c: fix unnecessary address expansion of pmd sharing
  * net: fix up truesize of cloned skb in skb_prepare_for_shift()
      net/core/skbuff.c
  * smackfs: restrict bytes count in smackfs write functions
      security/smack/smackfs.c
    xfs: Fix assert failure in xfs_setattr_size()
    media: mceusb: sanity check for prescaler value
    udlfb: Fix memory leak in dlfb_usb_probe
    JFS: more checks for invalid superblock
    MIPS: VDSO: Use CLANG_FLAGS instead of filtering out '--target='
  * arm64: Use correct ll/sc atomic constraints
      arch/arm64/include/asm/atomic_ll_sc.h
  * arm64: cmpxchg: Use "K" instead of "L" for ll/sc immediate constraint
      arch/arm64/include/asm/atomic_ll_sc.h
  * arm64: Avoid redundant type conversions in xchg() and cmpxchg()
      arch/arm64/include/asm/atomic_ll_sc.h
      arch/arm64/include/asm/atomic_lse.h
      arch/arm64/include/asm/cmpxchg.h
  * arm64 module: set plt* section addresses to 0x0
      arch/arm64/kernel/module.lds
    virtio/s390: implement virtio-ccw revision 2 correctly
    drm/virtio: use kvmalloc for large allocations
    hugetlb: fix update_and_free_page contig page struct assumption
    net: usb: qmi_wwan: support ZTE P685M modem
    ANDROID: clang: update to 12.0.3
  * Revert "block: split .sysfs_lock into two locks"
      block/blk-core.c
      block/blk-mq-sysfs.c
      block/blk-sysfs.c
      block/blk.h
      block/elevator.c
      include/linux/blkdev.h
  * Revert "block: fix race between switching elevator and removing queues"
      block/blk-sysfs.c
  * Revert "block: don't release queue's sysfs lock during switching elevator"
      block/blk-sysfs.c
      block/elevator.c
  * Revert "dm: fix deadlock when swapping to encrypted device"
      drivers/md/dm-core.h
      drivers/md/dm.c
      include/linux/device-mapper.h
    Merge 4.19.178 into android-4.19-stable
Linux 4.19.178
    ARM: dts: aspeed: Add LCLK to lpc-snoop
    net: qrtr: Fix memory leak in qrtr_tun_open
    dm era: Update in-core bitset after committing the metadata
  * net: icmp: pass zeroed opts from icmp{,v6}_ndo_send before sending
      include/linux/icmpv6.h
      include/linux/ipv6.h
      include/net/icmp.h
      net/ipv4/icmp.c
      net/ipv6/icmp.c
      net/ipv6/ip6_icmp.c
  * ipv6: silence compilation warning for non-IPV6 builds
      include/linux/icmpv6.h
  * ipv6: icmp6: avoid indirect call for icmpv6_send()
      include/linux/icmpv6.h
      net/ipv6/icmp.c
      net/ipv6/ip6_icmp.c
  * xfrm: interface: use icmp_ndo_send helper
      net/xfrm/xfrm_interface.c
    sunvnet: use icmp_ndo_send helper
    gtp: use icmp_ndo_send helper
  * icmp: allow icmpv6_ndo_send to work with CONFIG_IPV6=n
      include/linux/icmpv6.h
  * icmp: introduce helper for nat'd source address in network device context
      include/linux/icmpv6.h
      include/net/icmp.h
      net/ipv4/icmp.c
      net/ipv6/ip6_icmp.c
    dm era: only resize metadata in preresume
    dm era: Reinitialize bitset cache before digesting a new writeset
    dm era: Use correct value size in equality function of writeset tree
    dm era: Fix bitset memory leaks
    dm era: Verify the data block size hasn't changed
    dm era: Recover committed writeset after crash
  * dm: fix deadlock when swapping to encrypted device
      drivers/md/dm-core.h
      drivers/md/dm.c
      include/linux/device-mapper.h
    gfs2: Don't skip dlm unlock if glock has an lvb
    sparc32: fix a user-triggerable oops in clear_user()
  * f2fs: fix out-of-repair __setattr_copy()
      fs/f2fs/file.c
    cpufreq: intel_pstate: Get per-CPU max freq via MSR_HWP_CAPABILITIES if available
  * printk: fix deadlock when kernel panic
      kernel/printk/printk_safe.c
    gpio: pcf857x: Fix missing first interrupt
    mmc: sdhci-esdhc-imx: fix kernel panic when remove module
  * module: Ignore _GLOBAL_OFFSET_TABLE_ when warning for undefined symbols
      kernel/module.c
  * arm64: Extend workaround for erratum 1024718 to all versions of Cortex-A55
      arch/arm64/Kconfig
      arch/arm64/kernel/cpufeature.c
    libnvdimm/dimm: Avoid race between probe and available_slots_show()
  * hugetlb: fix copy_huge_page_from_user contig page struct assumption
      mm/memory.c
    x86: fix seq_file iteration for pat/memtype.c
    seq_file: document how per-entry resources are managed.
    fs/affs: release old buffer head on error path
    mtd: spi-nor: hisi-sfc: Put child node np on error path
    watchdog: mei_wdt: request stop on unregister
  * arm64: uprobe: Return EOPNOTSUPP for AARCH32 instruction probing
      arch/arm64/kernel/probes/uprobes.c
    floppy: reintroduce O_NDELAY fix
    x86/reboot: Force all cpus to exit VMX root if VMX is supported
    media: ipu3-cio2: Fix mbus_code processing in cio2_subdev_set_fmt()
    staging: rtl8188eu: Add Edimax EW-7811UN V2 to device table
    staging: gdm724x: Fix DMA from stack
    staging/mt7621-dma: mtk-hsdma.c->hsdma-mt7621.c
    dts64: mt7622: fix slow sd card access
  * pstore: Fix typo in compression option name
      fs/pstore/platform.c
    drivers/misc/vmw_vmci: restrict too big queue size in qp_host_alloc_queue
    misc: rtsx: init of rts522a add OCP power off when no card is present
  * seccomp: Add missing return in non-void function
      kernel/seccomp.c
    crypto: sun4i-ss - handle BigEndian for cipher
    crypto: sun4i-ss - checking sg length is not sufficient
  * crypto: arm64/sha - add missing module aliases
      arch/arm64/crypto/sha1-ce-glue.c
      arch/arm64/crypto/sha2-ce-glue.c
    btrfs: fix extent buffer leak on failure to copy root
    btrfs: fix reloc root leak with 0 ref reloc roots on recovery
    btrfs: abort the transaction if we fail to inc ref in btrfs_copy_root
    KEYS: trusted: Fix migratable=1 failing
    tpm_tis: Clean up locality release
    tpm_tis: Fix check_locality for correct locality acquisition
    ALSA: hda/realtek: modify EAPD in the ALC886
    USB: serial: mos7720: fix error code in mos7720_write()
    USB: serial: mos7840: fix error code in mos7840_write()
    USB: serial: ftdi_sio: fix FTX sub-integer prescaler
  * usb: dwc3: gadget: Fix dep->interval for fullspeed interrupt
      drivers/usb/dwc3/gadget.c
  * usb: dwc3: gadget: Fix setting of DEPCFG.bInterval_m1
      drivers/usb/dwc3/gadget.c
    usb: musb: Fix runtime PM race in musb_queue_resume_work
    USB: serial: option: update interface mapping for ZTE P685M
    Input: i8042 - add ASUS Zenbook Flip to noselftest list
    Input: joydev - prevent potential read overflow in ioctl
  * Input: xpad - add support for PowerA Enhanced Wired Controller for Xbox Series X|S
      drivers/input/joystick/xpad.c
    Input: raydium_ts_i2c - do not send zero length
    HID: wacom: Ignore attempts to overwrite the touch_max value from HID
    ACPI: configfs: add missing check after configfs_register_default_group()
    ACPI: property: Fix fwnode string properties matching
  * blk-settings: align max_sectors on "logical_block_size" boundary
      block/blk-settings.c
  * scsi: bnx2fc: Fix Kconfig warning & CNIC build errors
      drivers/scsi/bnx2fc/Kconfig
  * mm/rmap: fix potential pte_unmap on an not mapped pte
      include/linux/rmap.h
    i2c: brcmstb: Fix brcmstd_send_i2c_cmd condition
  * arm64: Add missing ISB after invalidating TLB in __primary_switch
      arch/arm64/kernel/head.S
    r8169: fix jumbo packet handling on RTL8168e
    mm/hugetlb: fix potential double free in hugetlb_register_node() error path
  * mm/memory.c: fix potential pte_unmap_unlock pte error
      mm/memory.c
    ocfs2: fix a use after free on error
    vxlan: move debug check after netdev unregister
    net/mlx4_core: Add missed mlx4_free_cmd_mailbox()
    i40e: Fix add TC filter for IPv6
    i40e: Fix VFs not created
    i40e: Fix overwriting flow control settings during driver loading
    i40e: Add zero-initialization of AQ command structures
    i40e: Fix flow for IPv6 next header (extension header)
    regmap: sdw: use _no_pm functions in regmap_read/write
  * ext4: fix potential htree index checksum corruption
      fs/ext4/namei.c
    drm/msm/dsi: Correct io_start for MSM8994 (20nm PHY)
  * PCI: Align checking of syscall user config accessors
      drivers/pci/syscall.c
    VMCI: Use set_page_dirty_lock() when unregistering guest memory
    pwm: rockchip: rockchip_pwm_probe(): Remove superfluous clk_unprepare()
    misc: eeprom_93xx46: Add module alias to avoid breaking support for non device tree users
    misc: eeprom_93xx46: Fix module alias to enable module autoprobe
    sparc64: only select COMPAT_BINFMT_ELF if BINFMT_ELF is set
    Input: elo - fix an error code in elo_connect()
    perf test: Fix unaligned access in sample parsing test
    perf intel-pt: Fix missing CYC processing in PSB
    Input: sur40 - fix an error code in sur40_probe()
    spi: pxa2xx: Fix the controller numbering for Wildcat Point
    clk: qcom: gcc-msm8998: Fix Alpha PLL type for all GPLLs
    powerpc/8xx: Fix software emulation interrupt
    powerpc/pseries/dlpar: handle ibm, configure-connector delay status
    mfd: wm831x-auxadc: Prevent use after free in wm831x_auxadc_read_irq()
    spi: stm32: properly handle 0 byte transfer
    RDMA/rxe: Correct skb on loopback path
    RDMA/rxe: Fix coding error in rxe_recv.c
    perf tools: Fix DSO filtering when not finding a map for a sampled address
  * tracepoint: Do not fail unregistering a probe due to memory failure
      kernel/tracepoint.c
  * amba: Fix resource leak for drivers without .remove
      drivers/amba/bus.c
    ARM: 9046/1: decompressor: Do not clear SCTLR.nTLSMD for ARMv7+ cores
    mmc: renesas_sdhi_internal_dmac: Fix DMA buffer alignment from 8 to 128-bytes
    mmc: usdhi6rol0: Fix a resource leak in the error handling path of the probe
    powerpc/47x: Disable 256k page size
    KVM: PPC: Make the VMX instruction emulation routines static
    IB/umad: Return EPOLLERR in case of when device disassociated
    IB/umad: Return EIO in case of when device disassociated
    auxdisplay: ht16k33: Fix refresh rate handling
    isofs: release buffer head before return
    regulator: s5m8767: Drop regulators OF node reference
    spi: atmel: Put allocated master before return
  * certs: Fix blacklist flag type confusion
      include/linux/key.h
      security/keys/key.c
    regulator: axp20x: Fix reference cout leak
    clk: sunxi-ng: h6: Fix clock divider range on some clocks
    RDMA/mlx5: Use the correct obj_id upon DEVX TIR creation
    clocksource/drivers/mxs_timer: Add missing semicolon when DEBUG is defined
  * rtc: s5m: select REGMAP_I2C
      drivers/rtc/Kconfig
    power: reset: at91-sama5d2_shdwc: fix wkupdbc mask
  * of/fdt: Make sure no-map does not remove already reserved regions
      drivers/of/fdt.c
  * fdt: Properly handle "no-map" field in the memory region
      drivers/of/fdt.c
    mfd: bd9571mwv: Use devm_mfd_add_devices()
    dmaengine: hsu: disable spurious interrupt
    dmaengine: owl-dma: Fix a resource leak in the remove function
    dmaengine: fsldma: Fix a resource leak in an error handling path of the probe function
    dmaengine: fsldma: Fix a resource leak in the remove function
  * HID: core: detect and skip invalid inputs to snto32()
      drivers/hid/hid-core.c
    clk: sunxi-ng: h6: Fix CEC clock
    spi: cadence-quadspi: Abort read if dummy cycles required are too many
  * quota: Fix memory leak when handling corrupted quota file
      fs/quota/quota_v2.c
    clk: meson: clk-pll: fix initializing the old rate (fallback) for a PLL
  * capabilities: Don't allow writing ambiguous v3 file capabilities
      security/commoncap.c
    jffs2: fix use after free in jffs2_sum_write_data()
    fs/jfs: fix potential integer overflow on shift of a int
  * ima: Free IMA measurement buffer after kexec syscall
      include/linux/kexec.h
    ima: Free IMA measurement buffer on error
  * crypto: ecdh_helper - Ensure 'len >= secret.len' in decode_key()
      crypto/ecdh_helper.c
    hwrng: timeriomem - Fix cooldown period calculation
    btrfs: clarify error returns values in __load_free_space_cache
    Drivers: hv: vmbus: Avoid use-after-free in vmbus_onoffer_rescind()
    drm/amdgpu: Prevent shift wrapping in amdgpu_read_mask()
  * f2fs: fix to avoid inconsistent quota data
      fs/f2fs/file.c
      fs/f2fs/inline.c
    ASoC: cpcap: fix microphone timeslot mask
    ata: ahci_brcm: Add back regulators management
    crypto: talitos - Work around SEC6 ERRATA (AES-CTR mode data size error)
    media: uvcvideo: Accept invalid bFormatIndex and bFrameIndex values
    media: pxa_camera: declare variable when DEBUG is defined
    media: cx25821: Fix a bug when reallocating some dma memory
    media: qm1d1c0042: fix error return code in qm1d1c0042_init()
    media: lmedm04: Fix misuse of comma
    drm/amd/display: Fix 10/12 bpc setup in DCE output bit depth reduction.
    crypto: bcm - Rename struct device_private to bcm_device_private
    ASoC: cs42l56: fix up error handling in probe
    media: tm6000: Fix memleak in tm6000_start_stream
    media: media/pci: Fix memleak in empress_init
    media: em28xx: Fix use-after-free in em28xx_alloc_urbs
    media: vsp1: Fix an error handling path in the probe function
    media: camss: missing error code in msm_video_register()
    media: i2c: ov5670: Fix PIXEL_RATE minimum value
    MIPS: lantiq: Explicitly compare LTQ_EBU_PCC_ISTAT against 0
    MIPS: c-r4k: Fix section mismatch for loongson2_sc_init
    drm/amdgpu: Fix macro name _AMDGPU_TRACE_H_ in preprocessor if condition
    crypto: sun4i-ss - fix kmap usage
    gma500: clean up error handling in init
    drm/gma500: Fix error return code in psb_driver_load()
  * fbdev: aty: SPARC64 requires FB_ATY_CT
      drivers/video/fbdev/Kconfig
    net: mvneta: Remove per-cpu queue mapping for Armada 3700
    net: amd-xgbe: Fix network fluctuations when using 1G BELFUSE SFP
    net: amd-xgbe: Reset link when the link never comes back
    net: amd-xgbe: Fix NETDEV WATCHDOG transmit queue timeout warning
    net: amd-xgbe: Reset the PHY rx data path when mailbox command timeout
    ibmvnic: skip send_request_unmap for timeout reset
    ibmvnic: add memory barrier to protect long term buffer
    b43: N-PHY: Fix the update of coef for the PHY revision >= 3case
    cxgb4/chtls/cxgbit: Keeping the max ofld immediate data size same in cxgb4 and ulds
  * tcp: fix SO_RCVLOWAT related hangs under mem pressure
      include/net/tcp.h
  * bpf: Fix bpf_fib_lookup helper MTU check for SKB ctx
      net/core/filter.c
    mac80211: fix potential overflow when multiplying to u32 integers
    xen/netback: fix spurious event detection for common event case
    bnxt_en: reverse order of TX disable and carrier off
    ibmvnic: Set to CLOSED state even on error
    ath9k: fix data bus crash when setting nf_override via debugfs
  * bpf_lru_list: Read double-checked variable once without lock
      kernel/bpf/bpf_lru_list.c
    soc: aspeed: snoop: Add clock control logic
    ARM: s3c: fix fiq for clang IAS
    arm64: dts: msm8916: Fix reserved and rfsa nodes unit address
    ARM: dts: armada388-helios4: assign pinctrl to each fan
    ARM: dts: armada388-helios4: assign pinctrl to LEDs
    staging: rtl8723bs: wifi_regd.c: Fix incorrect number of regulatory rules
    usb: dwc2: Make "trimming xfer length" a debug message
    usb: dwc2: Abort transaction after errors with unknown reason
    usb: dwc2: Do not update data length if it is 0 on inbound transfers
    ARM: dts: Configure missing thermal interrupt for 4430
    memory: ti-aemif: Drop child node when jumping out loop
  * Bluetooth: Put HCI device if inquiry procedure interrupts
      net/bluetooth/hci_core.c
  * Bluetooth: drop HCI device reference before return
      net/bluetooth/a2mp.c
    usb: gadget: u_audio: Free requests only after callback
  * ACPICA: Fix exception code class checks
      include/acpi/acexcep.h
    cpufreq: brcmstb-avs-cpufreq: Fix resource leaks in ->remove()
    cpufreq: brcmstb-avs-cpufreq: Free resources in error path
    arm64: dts: allwinner: A64: Limit MMC2 bus frequency to 150 MHz
    arm64: dts: allwinner: Drop non-removable from SoPine/LTS SD card
    arm64: dts: allwinner: A64: properly connect USB PHY to port 0
  * bpf: Avoid warning when re-casting __bpf_call_base into __bpf_call_base_args
      include/linux/filter.h
    arm64: dts: exynos: correct PMIC interrupt trigger level on Espresso
    arm64: dts: exynos: correct PMIC interrupt trigger level on TM2
    ARM: dts: exynos: correct PMIC interrupt trigger level on Odroid XU3 family
    ARM: dts: exynos: correct PMIC interrupt trigger level on Arndale Octa
    ARM: dts: exynos: correct PMIC interrupt trigger level on Spring
    ARM: dts: exynos: correct PMIC interrupt trigger level on Rinato
    ARM: dts: exynos: correct PMIC interrupt trigger level on Monk
    ARM: dts: exynos: correct PMIC interrupt trigger level on Artik 5
  * Bluetooth: Fix initializing response id after clearing struct
      net/bluetooth/a2mp.c
    Bluetooth: btqcomsmd: Fix a resource leak in error handling paths in the probe function
    ath10k: Fix error handling in case of CE pipe init failure
  * random: fix the RNDRESEEDCRNG ioctl
      drivers/char/random.c
    MIPS: vmlinux.lds.S: add missing PAGE_ALIGNED_DATA() section
  * ALSA: usb-audio: Fix PCM buffer allocation in non-vmalloc mode
      sound/usb/pcm.c
    bfq: Avoid false bfq queue merging
    PCI: qcom: Use PHY_REFCLK_USE_PAD only for ipq8064
    kdb: Make memory allocations more robust
  * vmlinux.lds.h: add DWARF v5 sections
      include/asm-generic/vmlinux.lds.h
  * locking/static_key: Fix false positive warnings on concurrent dec/inc
      kernel/jump_label.c
  * jump_label/lockdep: Assert we hold the hotplug lock for _cpuslocked() operations
      kernel/jump_label.c
    scripts/recordmcount.pl: support big endian for ARCH sh
    cifs: Set CIFS_MOUNT_USE_PREFIX_PATH flag on setting cifs_sb->prepath.
    NET: usb: qmi_wwan: Adding support for Cinterion MV31
  * block: don't release queue's sysfs lock during switching elevator
      block/blk-sysfs.c
      block/elevator.c
  * block: fix race between switching elevator and removing queues
      block/blk-sysfs.c
  * block: split .sysfs_lock into two locks
      block/blk-core.c
      block/blk-mq-sysfs.c
      block/blk-sysfs.c
      block/blk.h
      block/elevator.c
      include/linux/blkdev.h
  * block: add helper for checking if queue is registered
      block/blk-sysfs.c
      block/elevator.c
      include/linux/blkdev.h
  * scripts: set proper OpenSSL include dir also for sign-file
      scripts/Makefile
  * scripts: use pkg-config to locate libcrypto
      scripts/Makefile
    arm64: tegra: Add power-domain for Tegra210 HDA
    ntfs: check for valid standard information attribute
  * usb: quirks: add quirk to start video capture on ELMO L-12F document camera reliable
      drivers/usb/core/quirks.c
  * USB: quirks: sort quirk entries
      drivers/usb/core/quirks.c
  * HID: make arrays usage and value to be the same
      drivers/hid/hid-core.c
    ANDROID: syscalls/x86: use a weak function for IA32 compat syscalls
    ANDROID: Adding kprobes build configs for Cuttlefish
  * UPSTREAM: locking/static_key: Fix false positive warnings on concurrent dec/inc
      kernel/jump_label.c
  * UPSTREAM: jump_label/lockdep: Assert we hold the hotplug lock for _cpuslocked() operations
      kernel/jump_label.c
    ANDROID: Add symbol of _proc_mkdir

Bug: 184596728
Change-Id: I0dddfc3da829cf53c4716a6df1072ed3396b6f4a
Signed-off-by: Lucas Wei <lucaswei@google.com>
2021-04-26 16:14:00 +08:00
Miaohe Lin
97065f403c mm/rmap: fix potential pte_unmap on an not mapped pte
[ Upstream commit 5d5d19eda6b0ee790af89c45e3f678345be6f50f ]

For PMD-mapped page (usually THP), pvmw->pte is NULL.  For PTE-mapped THP,
pvmw->pte is mapped.  But for HugeTLB pages, pvmw->pte is not mapped and
set to the relevant page table entry.  So in page_vma_mapped_walk_done(),
we may do pte_unmap() for HugeTLB pte which is not mapped.  Fix this by
checking pvmw->page against PageHuge before trying to do pte_unmap().

Link: https://lkml.kernel.org/r/20210127093349.39081-1-linmiaohe@huawei.com
Fixes: ace71a19ce ("mm: introduce page_vma_mapped_walk()")
Signed-off-by: Hongxiang Lou <louhongxiang@huawei.com>
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Nathan Chancellor <natechancellor@gmail.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Michel Lespinasse <walken@google.com>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Wei Yang <richard.weiyang@linux.alibaba.com>
Cc: Dmitry Safonov <0x7f454c46@gmail.com>
Cc: Brian Geffon <bgeffon@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-03-04 09:39:50 +01:00
Minchan Kim
235453b766 mm: per-process reclaim
These day, there are many platforms available in the embedded market
and they are smarter than kernel which has very limited information
about working set so they want to involve memory management more heavily
like android's lowmemory killer and ashmem or recent many lowmemory
notifier.

One of the simple imagine scenario about userspace's intelligence is that
platform can manage tasks as forground and background so it would be
better to reclaim background's task pages for end-user's *responsibility*
although it has frequent referenced pages.

This patch adds new knob "reclaim under proc/<pid>/" so task manager
can reclaim any target process anytime, anywhere. It could give another
method to platform for using memory efficiently.

It can avoid process killing for getting free memory, which was really
terrible experience because I lost my best score of game I had ever
after I switch the phone call while I enjoyed the game.

Reclaim file-backed pages only.
	echo file > /proc/PID/reclaim
Reclaim anonymous pages only.
	echo anon > /proc/PID/reclaim
Reclaim all pages
	echo all > /proc/PID/reclaim

Bug: 131016077
Bug: 153444106
Test: boot
(cherry picked from 18c2af05a553f17d354b88b3a45dadc114c8c72c)
Change-Id: I99b51544f79202c097214d3856678cac4449a743
Signed-off-by: Tim Murray <timmurray@google.com>
Signed-off-by: Minchan Kim <minchan@kernel.org>
Signed-off-by: Martin Liu <liumartin@google.com>
2020-05-30 02:20:48 +08:00
Martin Liu
ced40b3dda Revert "mm: Per process reclaim"
This reverts commit fd3a0c0194.

Reason for revert: depreciated per process reclaim
Bug: 153444106
Test: boot
Signed-off-by: Martin Liu <liumartin@google.com>
Change-Id: I826bff885a16232d3d20597a89c52ebf06e984f2
2020-05-30 02:20:48 +08:00
Martin Liu
4adaf1bb6b Revert "mm: Enhance per process reclaim to consider shared pages"
This reverts commit 5a83f94ad7.

Reason for revert: depreciated per process reclaim
Bug: 153444106
Test: boot
Signed-off-by: Martin Liu <liumartin@google.com>
Change-Id: If635f7787f8b05a5f6a1af01c5afc71fd2642a88
2020-05-30 02:20:46 +08:00
Martin Liu
fdce5b1ab3 Revert "mm: introduce __page_add_new_anon_rmap()"
This reverts commit a4e72cb774.

Reason for revert: remove SPF non upstream code
Bug: 140544941
Test: boot
Signed-off-by: Martin Liu <liumartin@google.com>
Change-Id: I201da5cc6d18a3b831e9e1ad04de32b134a07461
2020-05-30 02:20:33 +08:00
Martin Liu
ef36c60fe7 mm: Revert previous mm revert list
This commit reverts e799c1b10c54...cfb042c6c5d1

Reason for revert: unblock GKI
Bug: 140544941
Test: boot
Change-Id: I4ebe6c01918788cdc2468ceabf101ef7c3e3c452
Signed-off-by: Martin Liu <liumartin@google.com>
2020-05-30 02:20:27 +08:00
Martin Liu
6d8bd4f119 Revert "mm: Per process reclaim"
This reverts commit fd3a0c0194.

Reason for revert: revet customized code
Bug: 153444106
Test: boot

Signed-off-by: Martin Liu <liumartin@google.com>
Change-Id: I380bdbb1ac4ed8220e23a9de39f760d7b6411fe5
2020-05-30 02:20:23 +08:00
Minchan Kim
5bfd8a9a89 Revert "mm: introduce __page_add_new_anon_rmap()"
This reverts commit a4e72cb774.

Reason for revert: revet customized code
Bug: 140544941
Test: boot
Signed-off-by: Minchan Kim <minchan@google.com>
Signed-off-by: Martin Liu <liumartin@google.com>
Change-Id: I5910f56e1f86132275de219a85043bd566f715db
2020-05-30 02:20:15 +08:00
Minchan Kim
65ddcb45af Revert "mm: Enhance per process reclaim to consider shared pages"
This reverts commit 5a83f94ad7.

Reason for revert: revet customized code
Bug: 140544941
Test: boot
Signed-off-by: Minchan Kim <minchan@google.com>
Signed-off-by: Martin Liu <liumartin@google.com>
Change-Id: I20c6ccae1abab92a1b99fcb5ef9e376696abf1d0
2020-05-30 02:20:07 +08:00
Minchan Kim
5a83f94ad7 mm: Enhance per process reclaim to consider shared pages
Some pages could be shared by several processes. (ex, libc)
In case of that, it's too bad to reclaim them from the beginnig.

This patch causes VM to keep them on memory until last task
try to reclaim them so shared pages will be reclaimed only if
all of task has gone swapping out.

This feature doesn't handle non-linear mapping on ramfs because
it's very time-consuming and doesn't make sure of reclaiming and
not common.

Change-Id: I7e5f34f2e947f5db6d405867fe2ad34863ca40f7
Signed-off-by: Sangseok Lee <sangseok.lee@lge.com>
Signed-off-by: Minchan Kim <minchan@kernel.org>
Patch-mainline: linux-mm @ 9 May 2013 16:21:27
[vinmenon@codeaurora.org: merge conflict fixes + fix for ksm]
Signed-off-by: Vinayak Menon <vinmenon@codeaurora.org>
2019-05-06 11:44:49 -07:00
Minchan Kim
fd3a0c0194 mm: Per process reclaim
These day, there are many platforms available in the embedded market
and they are smarter than kernel which has very limited information
about working set so they want to involve memory management more heavily
like android's lowmemory killer and ashmem or recent many lowmemory
notifier.

One of the simple imagine scenario about userspace's intelligence is that
platform can manage tasks as forground and background so it would be
better to reclaim background's task pages for end-user's *responsibility*
although it has frequent referenced pages.

This patch adds new knob "reclaim under proc/<pid>/" so task manager
can reclaim any target process anytime, anywhere. It could give another
method to platform for using memory efficiently.

It can avoid process killing for getting free memory, which was really
terrible experience because I lost my best score of game I had ever
after I switch the phone call while I enjoyed the game.

Reclaim file-backed pages only.
	echo file > /proc/PID/reclaim
Reclaim anonymous pages only.
	echo anon > /proc/PID/reclaim
Reclaim all pages
	echo all > /proc/PID/reclaim

Change-Id: Iabdb7bc2ef3dc4d94e3ea005fbe18f4cd06739ab
Signed-off-by: Minchan Kim <minchan@kernel.org>
Patch-mainline: linux-mm @ 9 May 2013 16:21:24
[vinmenon@codeaurora.org: merge conflict fixes, minor tweak of the commit msg]
Signed-off-by: Vinayak Menon <vinmenon@codeaurora.org>
2019-05-06 11:38:17 -07:00
Laurent Dufour
a4e72cb774 mm: introduce __page_add_new_anon_rmap()
When dealing with speculative page fault handler, we may race with VMA
being split or merged. In this case the vma->vm_start and vm->vm_end
fields may not match the address the page fault is occurring.

This can only happens when the VMA is split but in that case, the
anon_vma pointer of the new VMA will be the same as the original one,
because in __split_vma the new->anon_vma is set to src->anon_vma when
*new = *vma.

So even if the VMA boundaries are not correct, the anon_vma pointer is
still valid.

If the VMA has been merged, then the VMA in which it has been merged
must have the same anon_vma pointer otherwise the merge can't be done.

So in all the case we know that the anon_vma is valid, since we have
checked before starting the speculative page fault that the anon_vma
pointer is valid for this VMA and since there is an anon_vma this
means that at one time a page has been backed and that before the VMA
is cleaned, the page table lock would have to be grab to clean the
PTE, and the anon_vma field is checked once the PTE is locked.

This patch introduce a new __page_add_new_anon_rmap() service which
doesn't check for the VMA boundaries, and create a new inline one
which do the check.

When called from a page fault handler, if this is not a speculative one,
there is a guarantee that vm_start and vm_end match the faulting address,
so this check is useless. In the context of the speculative page fault
handler, this check may be wrong but anon_vma is still valid as explained
above.

Change-Id: I72c47830181579f8c9618df879077d321653b5f1
Signed-off-by: Laurent Dufour <ldufour@linux.vnet.ibm.com>
Patch-mainline: linux-mm @ Tue, 17 Apr 2018 16:33:22
[vinmenon@codeaurora.org: trivial merge conflict fixes +
checkpatch fixes]
Signed-off-by: Vinayak Menon <vinmenon@codeaurora.org>
2019-03-29 03:09:44 -07:00
Greg Kroah-Hartman
b24413180f License cleanup: add SPDX GPL-2.0 license identifier to files with no license
Many source files in the tree are missing licensing information, which
makes it harder for compliance tools to determine the correct license.

By default all files without license information are under the default
license of the kernel, which is GPL version 2.

Update the files which contain no license information with the 'GPL-2.0'
SPDX license identifier.  The SPDX identifier is a legally binding
shorthand, which can be used instead of the full boiler plate text.

This patch is based on work done by Thomas Gleixner and Kate Stewart and
Philippe Ombredanne.

How this work was done:

Patches were generated and checked against linux-4.14-rc6 for a subset of
the use cases:
 - file had no licensing information it it.
 - file was a */uapi/* one with no licensing information in it,
 - file was a */uapi/* one with existing licensing information,

Further patches will be generated in subsequent months to fix up cases
where non-standard license headers were used, and references to license
had to be inferred by heuristics based on keywords.

The analysis to determine which SPDX License Identifier to be applied to
a file was done in a spreadsheet of side by side results from of the
output of two independent scanners (ScanCode & Windriver) producing SPDX
tag:value files created by Philippe Ombredanne.  Philippe prepared the
base worksheet, and did an initial spot review of a few 1000 files.

The 4.13 kernel was the starting point of the analysis with 60,537 files
assessed.  Kate Stewart did a file by file comparison of the scanner
results in the spreadsheet to determine which SPDX license identifier(s)
to be applied to the file. She confirmed any determination that was not
immediately clear with lawyers working with the Linux Foundation.

Criteria used to select files for SPDX license identifier tagging was:
 - Files considered eligible had to be source code files.
 - Make and config files were included as candidates if they contained >5
   lines of source
 - File already had some variant of a license header in it (even if <5
   lines).

All documentation files were explicitly excluded.

The following heuristics were used to determine which SPDX license
identifiers to apply.

 - when both scanners couldn't find any license traces, file was
   considered to have no license information in it, and the top level
   COPYING file license applied.

   For non */uapi/* files that summary was:

   SPDX license identifier                            # files
   ---------------------------------------------------|-------
   GPL-2.0                                              11139

   and resulted in the first patch in this series.

   If that file was a */uapi/* path one, it was "GPL-2.0 WITH
   Linux-syscall-note" otherwise it was "GPL-2.0".  Results of that was:

   SPDX license identifier                            # files
   ---------------------------------------------------|-------
   GPL-2.0 WITH Linux-syscall-note                        930

   and resulted in the second patch in this series.

 - if a file had some form of licensing information in it, and was one
   of the */uapi/* ones, it was denoted with the Linux-syscall-note if
   any GPL family license was found in the file or had no licensing in
   it (per prior point).  Results summary:

   SPDX license identifier                            # files
   ---------------------------------------------------|------
   GPL-2.0 WITH Linux-syscall-note                       270
   GPL-2.0+ WITH Linux-syscall-note                      169
   ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause)    21
   ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause)    17
   LGPL-2.1+ WITH Linux-syscall-note                      15
   GPL-1.0+ WITH Linux-syscall-note                       14
   ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause)    5
   LGPL-2.0+ WITH Linux-syscall-note                       4
   LGPL-2.1 WITH Linux-syscall-note                        3
   ((GPL-2.0 WITH Linux-syscall-note) OR MIT)              3
   ((GPL-2.0 WITH Linux-syscall-note) AND MIT)             1

   and that resulted in the third patch in this series.

 - when the two scanners agreed on the detected license(s), that became
   the concluded license(s).

 - when there was disagreement between the two scanners (one detected a
   license but the other didn't, or they both detected different
   licenses) a manual inspection of the file occurred.

 - In most cases a manual inspection of the information in the file
   resulted in a clear resolution of the license that should apply (and
   which scanner probably needed to revisit its heuristics).

 - When it was not immediately clear, the license identifier was
   confirmed with lawyers working with the Linux Foundation.

 - If there was any question as to the appropriate license identifier,
   the file was flagged for further research and to be revisited later
   in time.

In total, over 70 hours of logged manual review was done on the
spreadsheet to determine the SPDX license identifiers to apply to the
source files by Kate, Philippe, Thomas and, in some cases, confirmation
by lawyers working with the Linux Foundation.

Kate also obtained a third independent scan of the 4.13 code base from
FOSSology, and compared selected files where the other two scanners
disagreed against that SPDX file, to see if there was new insights.  The
Windriver scanner is based on an older version of FOSSology in part, so
they are related.

Thomas did random spot checks in about 500 files from the spreadsheets
for the uapi headers and agreed with SPDX license identifier in the
files he inspected. For the non-uapi files Thomas did random spot checks
in about 15000 files.

In initial set of patches against 4.14-rc6, 3 files were found to have
copy/paste license identifier errors, and have been fixed to reflect the
correct identifier.

Additionally Philippe spent 10 hours this week doing a detailed manual
inspection and review of the 12,461 patched files from the initial patch
version early this week with:
 - a full scancode scan run, collecting the matched texts, detected
   license ids and scores
 - reviewing anything where there was a license detected (about 500+
   files) to ensure that the applied SPDX license was correct
 - reviewing anything where there was no detection but the patch license
   was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied
   SPDX license was correct

This produced a worksheet with 20 files needing minor correction.  This
worksheet was then exported into 3 different .csv files for the
different types of files to be modified.

These .csv files were then reviewed by Greg.  Thomas wrote a script to
parse the csv files and add the proper SPDX tag to the file, in the
format that the file expected.  This script was further refined by Greg
based on the output to detect more types of files automatically and to
distinguish between header and source .c files (which need different
comment types.)  Finally Greg ran the script using the .csv files to
generate the patches.

Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-02 11:10:55 +01:00
Davidlohr Bueso
f808c13fd3 lib/interval_tree: fast overlap detection
Allow interval trees to quickly check for overlaps to avoid unnecesary
tree lookups in interval_tree_iter_first().

As of this patch, all interval tree flavors will require using a
'rb_root_cached' such that we can have the leftmost node easily
available.  While most users will make use of this feature, those with
special functions (in addition to the generic insert, delete, search
calls) will avoid using the cached option as they can do funky things
with insertions -- for example, vma_interval_tree_insert_after().

[jglisse@redhat.com: fix deadlock from typo vm_lock_anon_vma()]
  Link: http://lkml.kernel.org/r/20170808225719.20723-1-jglisse@redhat.com
Link: http://lkml.kernel.org/r/20170719014603.19029-12-dave@stgolabs.net
Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Doug Ledford <dledford@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Cc: David Airlie <airlied@linux.ie>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Christian Benvenuti <benve@cisco.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-09-08 18:26:49 -07:00
Naoya Horiguchi
b5ff8161e3 mm: thp: introduce separate TTU flag for thp freezing
TTU_MIGRATION is used to convert pte into migration entry until thp
split completes.  This behavior conflicts with thp migration added later
patches, so let's introduce a new TTU flag specifically for freezing.

try_to_unmap() is used both for thp split (via freeze_page()) and page
migration (via __unmap_and_move()).  In freeze_page(), ttu_flag given
for head page is like below (assuming anonymous thp):

    (TTU_IGNORE_MLOCK | TTU_IGNORE_ACCESS | TTU_RMAP_LOCKED | \
     TTU_MIGRATION | TTU_SPLIT_HUGE_PMD)

and ttu_flag given for tail pages is:

    (TTU_IGNORE_MLOCK | TTU_IGNORE_ACCESS | TTU_RMAP_LOCKED | \
     TTU_MIGRATION)

__unmap_and_move() calls try_to_unmap() with ttu_flag:

    (TTU_MIGRATION | TTU_IGNORE_MLOCK | TTU_IGNORE_ACCESS)

Now I'm trying to insert a branch for thp migration at the top of
try_to_unmap_one() like below

static int try_to_unmap_one(struct page *page, struct vm_area_struct *vma,
                       unsigned long address, void *arg)
  {
          ...
          /* PMD-mapped THP migration entry */
          if (!pvmw.pte && (flags & TTU_MIGRATION)) {
              if (!PageAnon(page))
                  continue;

              set_pmd_migration_entry(&pvmw, page);
              continue;
          }
	  ...
  }

so try_to_unmap() for tail pages called by thp split can go into thp
migration code path (which converts *pmd* into migration entry), while
the expectation is to freeze thp (which converts *pte* into migration
entry.)

I detected this failure as a "bad page state" error in a testcase where
split_huge_page() is called from queue_pages_pte_range().

Link: http://lkml.kernel.org/r/20170717193955.20207-4-zi.yan@sent.com
Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Signed-off-by: Zi Yan <zi.yan@cs.rutgers.edu>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Anshuman Khandual <khandual@linux.vnet.ibm.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: David Nellans <dnellans@nvidia.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Michal Hocko <mhocko@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-09-08 18:26:45 -07:00
Minchan Kim
83612a948d mm: remove SWAP_[SUCCESS|AGAIN|FAIL]
There is no user for it.  Remove it.

[minchan@kernel.org: use false instead of SWAP_FAIL]
  Link: http://lkml.kernel.org/r/20170316053313.GA19241@bbox
Link: http://lkml.kernel.org/r/1489555493-14659-11-git-send-email-minchan@kernel.org
Signed-off-by: Minchan Kim <minchan@kernel.org>
Cc: Anshuman Khandual <khandual@linux.vnet.ibm.com>
Cc: Hillf Danton <hillf.zj@alibaba-inc.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-05-03 15:52:10 -07:00
Minchan Kim
e4b8222271 mm: make rmap_one boolean function
rmap_one's return value controls whether rmap_work should contine to
scan other ptes or not so it's target for changing to boolean.  Return
true if the scan should be continued.  Otherwise, return false to stop
the scanning.

This patch makes rmap_one's return value to boolean.

Link: http://lkml.kernel.org/r/1489555493-14659-10-git-send-email-minchan@kernel.org
Signed-off-by: Minchan Kim <minchan@kernel.org>
Cc: Anshuman Khandual <khandual@linux.vnet.ibm.com>
Cc: Hillf Danton <hillf.zj@alibaba-inc.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-05-03 15:52:10 -07:00
Minchan Kim
1df631ae19 mm: make rmap_walk() return void
There is no user of the return value from rmap_walk() and friends so
this patch makes them void-returning functions.

Link: http://lkml.kernel.org/r/1489555493-14659-9-git-send-email-minchan@kernel.org
Signed-off-by: Minchan Kim <minchan@kernel.org>
Cc: Anshuman Khandual <khandual@linux.vnet.ibm.com>
Cc: Hillf Danton <hillf.zj@alibaba-inc.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-05-03 15:52:10 -07:00
Minchan Kim
666e5a406c mm: make ttu's return boolean
try_to_unmap() returns SWAP_SUCCESS or SWAP_FAIL so it's suitable for
boolean return.  This patch changes it.

Link: http://lkml.kernel.org/r/1489555493-14659-8-git-send-email-minchan@kernel.org
Signed-off-by: Minchan Kim <minchan@kernel.org>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Anshuman Khandual <khandual@linux.vnet.ibm.com>
Cc: Hillf Danton <hillf.zj@alibaba-inc.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-05-03 15:52:10 -07:00
Minchan Kim
ad6b67041a mm: remove SWAP_MLOCK in ttu
ttu doesn't need to return SWAP_MLOCK.  Instead, just return SWAP_FAIL
because it means the page is not-swappable so it should move to another
LRU list(active or unevictable).  putback friends will move it to right
list depending on the page's LRU flag.

Link: http://lkml.kernel.org/r/1489555493-14659-6-git-send-email-minchan@kernel.org
Signed-off-by: Minchan Kim <minchan@kernel.org>
Cc: Anshuman Khandual <khandual@linux.vnet.ibm.com>
Cc: Hillf Danton <hillf.zj@alibaba-inc.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-05-03 15:52:10 -07:00
Minchan Kim
192d723256 mm: make try_to_munlock() return void
try_to_munlock returns SWAP_MLOCK if the one of VMAs mapped the page has
VM_LOCKED flag.  In that time, VM set PG_mlocked to the page if the page
is not pte-mapped THP which cannot be mlocked, either.

With that, __munlock_isolated_page can use PageMlocked to check whether
try_to_munlock is successful or not without relying on try_to_munlock's
retval.  It helps to make try_to_unmap/try_to_unmap_one simple with
upcoming patches.

[minchan@kernel.org: remove PG_Mlocked VM_BUG_ON check]
  Link: http://lkml.kernel.org/r/20170411025615.GA6545@bbox
Link: http://lkml.kernel.org/r/1489555493-14659-5-git-send-email-minchan@kernel.org
Signed-off-by: Minchan Kim <minchan@kernel.org>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Anshuman Khandual <khandual@linux.vnet.ibm.com>
Cc: Hillf Danton <hillf.zj@alibaba-inc.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-05-03 15:52:10 -07:00
Minchan Kim
18863d3a3f mm: remove SWAP_DIRTY in ttu
If we found lazyfree page is dirty, try_to_unmap_one can just
SetPageSwapBakced in there like PG_mlocked page and just return with
SWAP_FAIL which is very natural because the page is not swappable right
now so that vmscan can activate it.  There is no point to introduce new
return value SWAP_DIRTY in try_to_unmap at the moment.

Link: http://lkml.kernel.org/r/1489555493-14659-3-git-send-email-minchan@kernel.org
Signed-off-by: Minchan Kim <minchan@kernel.org>
Acked-by: Hillf Danton <hillf.zj@alibaba-inc.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Anshuman Khandual <khandual@linux.vnet.ibm.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-05-03 15:52:10 -07:00
Shaohua Li
802a3a92ad mm: reclaim MADV_FREE pages
When memory pressure is high, we free MADV_FREE pages.  If the pages are
not dirty in pte, the pages could be freed immediately.  Otherwise we
can't reclaim them.  We put the pages back to anonumous LRU list (by
setting SwapBacked flag) and the pages will be reclaimed in normal
swapout way.

We use normal page reclaim policy.  Since MADV_FREE pages are put into
inactive file list, such pages and inactive file pages are reclaimed
according to their age.  This is expected, because we don't want to
reclaim too many MADV_FREE pages before used once pages.

Based on Minchan's original patch

[minchan@kernel.org: clean up lazyfree page handling]
  Link: http://lkml.kernel.org/r/20170303025237.GB3503@bbox
Link: http://lkml.kernel.org/r/14b8eb1d3f6bf6cc492833f183ac8c304e560484.1487965799.git.shli@fb.com
Signed-off-by: Shaohua Li <shli@fb.com>
Signed-off-by: Minchan Kim <minchan@kernel.org>
Acked-by: Minchan Kim <minchan@kernel.org>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Hillf Danton <hillf.zj@alibaba-inc.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-05-03 15:52:08 -07:00
Shaohua Li
a128ca71fb mm: delete unnecessary TTU_* flags
Patch series "mm: fix some MADV_FREE issues", v5.

We are trying to use MADV_FREE in jemalloc.  Several issues are found.
Without solving the issues, jemalloc can't use the MADV_FREE feature.

 - Doesn't support system without swap enabled. Because if swap is off,
   we can't or can't efficiently age anonymous pages. And since
   MADV_FREE pages are mixed with other anonymous pages, we can't
   reclaim MADV_FREE pages. In current implementation, MADV_FREE will
   fallback to MADV_DONTNEED without swap enabled. But in our
   environment, a lot of machines don't enable swap. This will prevent
   our setup using MADV_FREE.

 - Increases memory pressure. page reclaim bias file pages reclaim
   against anonymous pages. This doesn't make sense for MADV_FREE pages,
   because those pages could be freed easily and refilled with very
   slight penality. Even page reclaim doesn't bias file pages, there is
   still an issue, because MADV_FREE pages and other anonymous pages are
   mixed together. To reclaim a MADV_FREE page, we probably must scan a
   lot of other anonymous pages, which is inefficient. In our test, we
   usually see oom with MADV_FREE enabled and nothing without it.

 - Accounting. There are two accounting problems. We don't have a global
   accounting. If the system is abnormal, we don't know if it's a
   problem from MADV_FREE side. The other problem is RSS accounting.
   MADV_FREE pages are accounted as normal anon pages and reclaimed
   lazily, so application's RSS becomes bigger. This confuses our
   workloads. We have monitoring daemon running and if it finds
   applications' RSS becomes abnormal, the daemon will kill the
   applications even kernel can reclaim the memory easily.

To address the first the two issues, we can either put MADV_FREE pages
into a separate LRU list (Minchan's previous patches and V1 patches), or
put them into LRU_INACTIVE_FILE list (suggested by Johannes).  The
patchset use the second idea.  The reason is LRU_INACTIVE_FILE list is
tiny nowadays and should be full of used once file pages.  So we can
still efficiently reclaim MADV_FREE pages there without interference
with other anon and active file pages.  Putting the pages into inactive
file list also has an advantage which allows page reclaim to prioritize
MADV_FREE pages and used once file pages.  MADV_FREE pages are put into
the lru list and clear SwapBacked flag, so PageAnon(page) &&
!PageSwapBacked(page) will indicate a MADV_FREE pages.  These pages will
directly freed without pageout if they are clean, otherwise normal swap
will reclaim them.

For the third issue, the previous post adds global accounting and a
separate RSS count for MADV_FREE pages.  The problem is we never get
accurate accounting for MADV_FREE pages.  The pages are mapped to
userspace, can be dirtied without notice from kernel side.  To get
accurate accounting, we could write protect the page, but then there is
extra page fault overhead, which people don't want to pay.  Jemalloc
guys have concerns about the inaccurate accounting, so this post drops
the accounting patches temporarily.  The info exported to
/proc/pid/smaps for MADV_FREE pages are kept, which is the only place we
can get accurate accounting right now.

This patch (of 6):

Johannes pointed out TTU_LZFREE is unnecessary.  It's true because we
always have the flag set if we want to do an unmap.  For cases we don't
do an unmap, the TTU_LZFREE part of code should never run.

Also the TTU_UNMAP is unnecessary.  If no other flags set (for example,
TTU_MIGRATION), an unmap is implied.

The patch includes Johannes's cleanup and dead TTU_ACTION macro removal
code

Link: http://lkml.kernel.org/r/4be3ea1bc56b26fd98a54d0a6f70bec63f6d8980.1487965799.git.shli@fb.com
Signed-off-by: Shaohua Li <shli@fb.com>
Suggested-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Minchan Kim <minchan@kernel.org>
Acked-by: Hillf Danton <hillf.zj@alibaba-inc.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-05-03 15:52:08 -07:00
Kirill A. Shutemov
d53a8b49a6 mm: drop page_check_address{,_transhuge}
All users are gone. Let's drop them.

Link: http://lkml.kernel.org/r/20170129173858.45174-12-kirill.shutemov@linux.intel.com
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Hillf Danton <hillf.zj@alibaba-inc.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-02-24 17:46:55 -08:00
Kirill A. Shutemov
ace71a19ce mm: introduce page_vma_mapped_walk()
Introduce a new interface to check if a page is mapped into a vma.  It
aims to address shortcomings of page_check_address{,_transhuge}.

Existing interface is not able to handle PTE-mapped THPs: it only finds
the first PTE.  The rest lefted unnoticed.

page_vma_mapped_walk() iterates over all possible mapping of the page in
the vma.

Link: http://lkml.kernel.org/r/20170129173858.45174-3-kirill.shutemov@linux.intel.com
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Hillf Danton <hillf.zj@alibaba-inc.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-02-24 17:46:55 -08:00
Vlastimil Babka
d5a187daf5 mm, rmap: handle anon_vma_prepare() common case inline
anon_vma_prepare() is mostly a large "if (unlikely(...))" block, as the
expected common case is that an anon_vma already exists.  We could turn
the condition around and return 0, but it also makes sense to do it
inline and avoid a call for the common case.

Bloat-o-meter naturally shows that inlining the check has some code size
costs:

add/remove: 1/1 grow/shrink: 4/0 up/down: 475/-373 (102)
function                                     old     new   delta
__anon_vma_prepare                             -     359    +359
handle_mm_fault                             2744    2796     +52
hugetlb_cow                                 1146    1170     +24
hugetlb_fault                               2123    2145     +22
wp_page_copy                                1469    1487     +18
anon_vma_prepare                             373       -    -373

Checking the asm however confirms that the hot paths now avoid a call,
which is moved away.

[akpm@linux-foundation.org: coding-style fixes]
Link: http://lkml.kernel.org/r/20161116074005.22768-1-vbabka@suse.cz
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Rik van Riel <riel@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-12 18:55:08 -08:00
Kirill A. Shutemov
dd78fedde4 rmap: support file thp
Naive approach: on mapping/unmapping the page as compound we update
->_mapcount on each 4k page.  That's not efficient, but it's not obvious
how we can optimize this.  We can look into optimization later.

PG_double_map optimization doesn't work for file pages since lifecycle
of file pages is different comparing to anon pages: file page can be
mapped again at any time.

Link: http://lkml.kernel.org/r/1466021202-61880-11-git-send-email-kirill.shutemov@linux.intel.com
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-07-26 16:19:19 -07:00
Hugh Dickins
5a49973d71 mm: thp: refix false positive BUG in page_move_anon_rmap()
The VM_BUG_ON_PAGE in page_move_anon_rmap() is more trouble than it's
worth: the syzkaller fuzzer hit it again.  It's still wrong for some THP
cases, because linear_page_index() was never intended to apply to
addresses before the start of a vma.

That's easily fixed with a signed long cast inside linear_page_index();
and Dmitry has tested such a patch, to verify the false positive.  But
why extend linear_page_index() just for this case? when the avoidance in
page_move_anon_rmap() has already grown ugly, and there's no reason for
the check at all (nothing else there is using address or index).

Remove address arg from page_move_anon_rmap(), remove VM_BUG_ON_PAGE,
remove CONFIG_DEBUG_VM PageTransHuge adjustment.

And one more thing: should the compound_head(page) be done inside or
outside page_move_anon_rmap()? It's usually pushed down to the lowest
level nowadays (and mm/memory.c shows no other explicit use of it), so I
think it's better done in page_move_anon_rmap() than by caller.

Fixes: 0798d3c022 ("mm: thp: avoid false positive VM_BUG_ON_PAGE in page_move_anon_rmap()")
Link: http://lkml.kernel.org/r/alpine.LSU.2.11.1607120444540.12528@eggly.anvils
Signed-off-by: Hugh Dickins <hughd@google.com>
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Mika Westerberg <mika.westerberg@linux.intel.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: <stable@vger.kernel.org>	[4.5+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-07-15 14:54:27 +09:00
Kirill A. Shutemov
e388466de4 mm: make remove_migration_ptes() beyond mm/migration.c
Make remove_migration_ptes() available to be used in split_huge_page().

New parameter 'locked' added: as with try_to_umap() we need a way to
indicate that caller holds rmap lock.

We also shouldn't try to mlock() pte-mapped huge pages: pte-mapeed THP
pages are never mlocked.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-17 15:09:34 -07:00
Kirill A. Shutemov
2a52bcbcc6 rmap: extend try_to_unmap() to be usable by split_huge_page()
Add support for two ttu_flags:

  - TTU_SPLIT_HUGE_PMD would split PMD if it's there, before trying to
    unmap page;

  - TTU_RMAP_LOCKED indicates that caller holds relevant rmap lock;

Also, change rwc->done to !page_mapcount() instead of !page_mapped().
try_to_unmap() works on pte level, so we are really interested in the
mappedness of this small page rather than of the compound page it's a
part of.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-17 15:09:34 -07:00
Kirill A. Shutemov
b97731992d rmap: introduce rmap_walk_locked()
This patchset rewrites freeze_page() and unfreeze_page() using
try_to_unmap() and remove_migration_ptes().  Result is much simpler, but
somewhat slower.

Migration 8GiB worth of PMD-mapped THP:

  Baseline	20.21 +/- 0.393
  Patched	20.73 +/- 0.082
  Slowdown	1.03x

It's 3% slower, comparing to 14% in v1.  I don't it should be a stopper.

Splitting of PTE-mapped pages slowed more.  But this is not a common
case.

Migration 8GiB worth of PMD-mapped THP:

  Baseline	20.39 +/- 0.225
  Patched	22.43 +/- 0.496
  Slowdown	1.10x

rmap_walk_locked() is the same as rmap_walk(), but the caller takes care
of the relevant rmap lock.

This is preparation for switching THP splitting from custom rmap walk in
freeze_page()/unfreeze_page() to the generic one.

There is no support for KSM pages for now: not clear which lock is
implied.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-17 15:09:34 -07:00
Konstantin Khlebnikov
12352d3cae mm: replace vma_lock_anon_vma with anon_vma_lock_read/write
Sequence vma_lock_anon_vma() - vma_unlock_anon_vma() isn't safe if
anon_vma appeared between lock and unlock.  We have to check anon_vma
first or call anon_vma_prepare() to be sure that it's here.  There are
only few users of these legacy helpers.  Let's get rid of them.

This patch fixes anon_vma lock imbalance in validate_mm().  Write lock
isn't required here, read lock is enough.

And reorders expand_downwards/expand_upwards: security_mmap_addr() and
wrapping-around check don't have to be under anon vma lock.

Link: https://lkml.kernel.org/r/CACT4Y+Y908EjM2z=706dv4rV6dWtxTLK9nFg9_7DhRMLppBo2g@mail.gmail.com
Signed-off-by: Konstantin Khlebnikov <koct9i@gmail.com>
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-02-05 18:10:40 -08:00
Minchan Kim
854e9ed09d mm: support madvise(MADV_FREE)
Linux doesn't have an ability to free pages lazy while other OS already
have been supported that named by madvise(MADV_FREE).

The gain is clear that kernel can discard freed pages rather than
swapping out or OOM if memory pressure happens.

Without memory pressure, freed pages would be reused by userspace
without another additional overhead(ex, page fault + allocation +
zeroing).

Jason Evans said:

: Facebook has been using MAP_UNINITIALIZED
: (https://lkml.org/lkml/2012/1/18/308) in some of its applications for
: several years, but there are operational costs to maintaining this
: out-of-tree in our kernel and in jemalloc, and we are anxious to retire it
: in favor of MADV_FREE.  When we first enabled MAP_UNINITIALIZED it
: increased throughput for much of our workload by ~5%, and although the
: benefit has decreased using newer hardware and kernels, there is still
: enough benefit that we cannot reasonably retire it without a replacement.
:
: Aside from Facebook operations, there are numerous broadly used
: applications that would benefit from MADV_FREE.  The ones that immediately
: come to mind are redis, varnish, and MariaDB.  I don't have much insight
: into Android internals and development process, but I would hope to see
: MADV_FREE support eventually end up there as well to benefit applications
: linked with the integrated jemalloc.
:
: jemalloc will use MADV_FREE once it becomes available in the Linux kernel.
: In fact, jemalloc already uses MADV_FREE or equivalent everywhere it's
: available: *BSD, OS X, Windows, and Solaris -- every platform except Linux
: (and AIX, but I'm not sure it even compiles on AIX).  The lack of
: MADV_FREE on Linux forced me down a long series of increasingly
: sophisticated heuristics for madvise() volume reduction, and even so this
: remains a common performance issue for people using jemalloc on Linux.
: Please integrate MADV_FREE; many people will benefit substantially.

How it works:

When madvise syscall is called, VM clears dirty bit of ptes of the
range.  If memory pressure happens, VM checks dirty bit of page table
and if it found still "clean", it means it's a "lazyfree pages" so VM
could discard the page instead of swapping out.  Once there was store
operation for the page before VM peek a page to reclaim, dirty bit is
set so VM can swap out the page instead of discarding.

One thing we should notice is that basically, MADV_FREE relies on dirty
bit in page table entry to decide whether VM allows to discard the page
or not.  IOW, if page table entry includes marked dirty bit, VM
shouldn't discard the page.

However, as a example, if swap-in by read fault happens, page table
entry doesn't have dirty bit so MADV_FREE could discard the page
wrongly.

For avoiding the problem, MADV_FREE did more checks with PageDirty and
PageSwapCache.  It worked out because swapped-in page lives on swap
cache and since it is evicted from the swap cache, the page has PG_dirty
flag.  So both page flags check effectively prevent wrong discarding by
MADV_FREE.

However, a problem in above logic is that swapped-in page has PG_dirty
still after they are removed from swap cache so VM cannot consider the
page as freeable any more even if madvise_free is called in future.

Look at below example for detail.

    ptr = malloc();
    memset(ptr);
    ..
    ..
    .. heavy memory pressure so all of pages are swapped out
    ..
    ..
    var = *ptr; -> a page swapped-in and could be removed from
                   swapcache. Then, page table doesn't mark
                   dirty bit and page descriptor includes PG_dirty
    ..
    ..
    madvise_free(ptr); -> It doesn't clear PG_dirty of the page.
    ..
    ..
    ..
    .. heavy memory pressure again.
    .. In this time, VM cannot discard the page because the page
    .. has *PG_dirty*

To solve the problem, this patch clears PG_dirty if only the page is
owned exclusively by current process when madvise is called because
PG_dirty represents ptes's dirtiness in several processes so we could
clear it only if we own it exclusively.

Firstly, heavy users would be general allocators(ex, jemalloc, tcmalloc
and hope glibc supports it) and jemalloc/tcmalloc already have supported
the feature for other OS(ex, FreeBSD)

  barrios@blaptop:~/benchmark/ebizzy$ lscpu
  Architecture:          x86_64
  CPU op-mode(s):        32-bit, 64-bit
  Byte Order:            Little Endian
  CPU(s):                12
  On-line CPU(s) list:   0-11
  Thread(s) per core:    1
  Core(s) per socket:    1
  Socket(s):             12
  NUMA node(s):          1
  Vendor ID:             GenuineIntel
  CPU family:            6
  Model:                 2
  Stepping:              3
  CPU MHz:               3200.185
  BogoMIPS:              6400.53
  Virtualization:        VT-x
  Hypervisor vendor:     KVM
  Virtualization type:   full
  L1d cache:             32K
  L1i cache:             32K
  L2 cache:              4096K
  NUMA node0 CPU(s):     0-11
  ebizzy benchmark(./ebizzy -S 10 -n 512)

  Higher avg is better.

   vanilla-jemalloc             MADV_free-jemalloc

  1 thread
  records: 10                   records: 10
  avg:   2961.90                avg:  12069.70
  std:     71.96(2.43%)         std:    186.68(1.55%)
  max:   3070.00                max:  12385.00
  min:   2796.00                min:  11746.00

  2 thread
  records: 10                   records: 10
  avg:   5020.00                avg:  17827.00
  std:    264.87(5.28%)         std:    358.52(2.01%)
  max:   5244.00                max:  18760.00
  min:   4251.00                min:  17382.00

  4 thread
  records: 10                   records: 10
  avg:   8988.80                avg:  27930.80
  std:   1175.33(13.08%)        std:   3317.33(11.88%)
  max:   9508.00                max:  30879.00
  min:   5477.00                min:  21024.00

  8 thread
  records: 10                   records: 10
  avg:  13036.50                avg:  33739.40
  std:    170.67(1.31%)         std:   5146.22(15.25%)
  max:  13371.00                max:  40572.00
  min:  12785.00                min:  24088.00

  16 thread
  records: 10                   records: 10
  avg:  11092.40                avg:  31424.20
  std:    710.60(6.41%)         std:   3763.89(11.98%)
  max:  12446.00                max:  36635.00
  min:   9949.00                min:  25669.00

  32 thread
  records: 10                   records: 10
  avg:  11067.00                avg:  34495.80
  std:    971.06(8.77%)         std:   2721.36(7.89%)
  max:  12010.00                max:  38598.00
  min:   9002.00                min:  30636.00

In summary, MADV_FREE is about much faster than MADV_DONTNEED.

This patch (of 12):

Add core MADV_FREE implementation.

[akpm@linux-foundation.org: small cleanups]
Signed-off-by: Minchan Kim <minchan@kernel.org>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Hugh Dickins <hughd@google.com>
Cc: Mika Penttil <mika.penttila@nextfour.com>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Jason Evans <je@fb.com>
Cc: Daniel Micay <danielmicay@gmail.com>
Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
Cc: Shaohua Li <shli@kernel.org>
Cc: <yalin.wang2010@gmail.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
Cc: "Shaohua Li" <shli@kernel.org>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Chen Gang <gang.chen.5i5j@gmail.com>
Cc: Chris Zankel <chris@zankel.net>
Cc: Darrick J. Wong <darrick.wong@oracle.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Helge Deller <deller@gmx.de>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Roland Dreier <roland@kernel.org>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Shaohua Li <shli@kernel.org>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Wu Fengguang <fengguang.wu@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-15 17:56:32 -08:00
Vladimir Davydov
8749cfea11 mm: add page_check_address_transhuge() helper
page_referenced_one() and page_idle_clear_pte_refs_one() duplicate the
code for looking up pte of a (possibly transhuge) page.  Move this code
to a new helper function, page_check_address_transhuge(), and make the
above mentioned functions use it.

This is just a cleanup, no functional changes are intended.

Signed-off-by: Vladimir Davydov <vdavydov@virtuozzo.com>
Reviewed-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-15 17:56:32 -08:00
Kirill A. Shutemov
53f9263bab mm: rework mapcount accounting to enable 4k mapping of THPs
We're going to allow mapping of individual 4k pages of THP compound.  It
means we need to track mapcount on per small page basis.

Straight-forward approach is to use ->_mapcount in all subpages to track
how many time this subpage is mapped with PMDs or PTEs combined.  But
this is rather expensive: mapping or unmapping of a THP page with PMD
would require HPAGE_PMD_NR atomic operations instead of single we have
now.

The idea is to store separately how many times the page was mapped as
whole -- compound_mapcount.  This frees up ->_mapcount in subpages to
track PTE mapcount.

We use the same approach as with compound page destructor and compound
order to store compound_mapcount: use space in first tail page,
->mapping this time.

Any time we map/unmap whole compound page (THP or hugetlb) -- we
increment/decrement compound_mapcount.  When we map part of compound
page with PTE we operate on ->_mapcount of the subpage.

page_mapcount() counts both: PTE and PMD mappings of the page.

Basically, we have mapcount for a subpage spread over two counters.  It
makes tricky to detect when last mapcount for a page goes away.

We introduced PageDoubleMap() for this.  When we split THP PMD for the
first time and there's other PMD mapping left we offset up ->_mapcount
in all subpages by one and set PG_double_map on the compound page.
These additional references go away with last compound_mapcount.

This approach provides a way to detect when last mapcount goes away on
per small page basis without introducing new overhead for most common
cases.

[akpm@linux-foundation.org: fix typo in comment]
[mhocko@suse.com: ignore partial THP when moving task]
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Tested-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Acked-by: Jerome Marchand <jmarchan@redhat.com>
Cc: Sasha Levin <sasha.levin@oracle.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Cc: Jerome Marchand <jmarchan@redhat.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Rik van Riel <riel@redhat.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Steve Capper <steve.capper@linaro.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Christoph Lameter <cl@linux.com>
Cc: David Rientjes <rientjes@google.com>
Signed-off-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-15 17:56:32 -08:00
Kirill A. Shutemov
d281ee6145 rmap: add argument to charge compound page
We're going to allow mapping of individual 4k pages of THP compound
page.  It means we cannot rely on PageTransHuge() check to decide if
map/unmap small page or THP.

The patch adds new argument to rmap functions to indicate whether we
want to operate on whole compound page or only the small page.

[n-horiguchi@ah.jp.nec.com: fix mapcount mismatch in hugepage migration]
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Tested-by: Sasha Levin <sasha.levin@oracle.com>
Tested-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Jerome Marchand <jmarchan@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Rik van Riel <riel@redhat.com>
Cc: Steve Capper <steve.capper@linaro.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Christoph Lameter <cl@linux.com>
Cc: David Rientjes <rientjes@google.com>
Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-15 17:56:32 -08:00
Mel Gorman
72b252aed5 mm: send one IPI per CPU to TLB flush all entries after unmapping pages
An IPI is sent to flush remote TLBs when a page is unmapped that was
potentially accesssed by other CPUs.  There are many circumstances where
this happens but the obvious one is kswapd reclaiming pages belonging to a
running process as kswapd and the task are likely running on separate
CPUs.

On small machines, this is not a significant problem but as machine gets
larger with more cores and more memory, the cost of these IPIs can be
high.  This patch uses a simple structure that tracks CPUs that
potentially have TLB entries for pages being unmapped.  When the unmapping
is complete, the full TLB is flushed on the assumption that a refill cost
is lower than flushing individual entries.

Architectures wishing to do this must give the following guarantee.

        If a clean page is unmapped and not immediately flushed, the
        architecture must guarantee that a write to that linear address
        from a CPU with a cached TLB entry will trap a page fault.

This is essentially what the kernel already depends on but the window is
much larger with this patch applied and is worth highlighting.  The
architecture should consider whether the cost of the full TLB flush is
higher than sending an IPI to flush each individual entry.  An additional
architecture helper called flush_tlb_local is required.  It's a trivial
wrapper with some accounting in the x86 case.

The impact of this patch depends on the workload as measuring any benefit
requires both mapped pages co-located on the LRU and memory pressure.  The
case with the biggest impact is multiple processes reading mapped pages
taken from the vm-scalability test suite.  The test case uses NR_CPU
readers of mapped files that consume 10*RAM.

Linear mapped reader on a 4-node machine with 64G RAM and 48 CPUs

                                           4.2.0-rc1          4.2.0-rc1
                                             vanilla       flushfull-v7
Ops lru-file-mmap-read-elapsed      159.62 (  0.00%)   120.68 ( 24.40%)
Ops lru-file-mmap-read-time_range    30.59 (  0.00%)     2.80 ( 90.85%)
Ops lru-file-mmap-read-time_stddv     6.70 (  0.00%)     0.64 ( 90.38%)

           4.2.0-rc1    4.2.0-rc1
             vanilla flushfull-v7
User          581.00       611.43
System       5804.93      4111.76
Elapsed       161.03       122.12

This is showing that the readers completed 24.40% faster with 29% less
system CPU time.  From vmstats, it is known that the vanilla kernel was
interrupted roughly 900K times per second during the steady phase of the
test and the patched kernel was interrupts 180K times per second.

The impact is lower on a single socket machine.

                                           4.2.0-rc1          4.2.0-rc1
                                             vanilla       flushfull-v7
Ops lru-file-mmap-read-elapsed       25.33 (  0.00%)    20.38 ( 19.54%)
Ops lru-file-mmap-read-time_range     0.91 (  0.00%)     1.44 (-58.24%)
Ops lru-file-mmap-read-time_stddv     0.28 (  0.00%)     0.47 (-65.34%)

           4.2.0-rc1    4.2.0-rc1
             vanilla flushfull-v7
User           58.09        57.64
System        111.82        76.56
Elapsed        27.29        22.55

It's still a noticeable improvement with vmstat showing interrupts went
from roughly 500K per second to 45K per second.

The patch will have no impact on workloads with no memory pressure or have
relatively few mapped pages.  It will have an unpredictable impact on the
workload running on the CPU being flushed as it'll depend on how many TLB
entries need to be refilled and how long that takes.  Worst case, the TLB
will be completely cleared of active entries when the target PFNs were not
resident at all.

[sasha.levin@oracle.com: trace tlb flush after disabling preemption in try_to_unmap_flush]
Signed-off-by: Mel Gorman <mgorman@suse.de>
Reviewed-by: Rik van Riel <riel@redhat.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Cc: Michal Hocko <mhocko@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-04 16:54:41 -07:00
Kirill A. Shutemov
e39155ea11 mm: uninline and cleanup page-mapping related helpers
Most-used page->mapping helper -- page_mapping() -- has already uninlined.
 Let's uninline also page_rmapping() and page_anon_vma().  It saves us
depending on configuration around 400 bytes in text:

   text	   data	    bss	    dec	    hex	filename
 660318	  99254	 410000	1169572	 11d8a4	mm/built-in.o-before
 659854	  99254	 410000	1169108	 11d6d4	mm/built-in.o

I also tried to make code a bit more clean.

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Rik van Riel <riel@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-04-15 16:35:19 -07:00
Matthew Wilcox
e748dcd095 vfs: remove get_xip_mem
All callers of get_xip_mem() are now gone.  Remove checks for it,
initialisers of it, documentation of it and the only implementation of it.
 Also remove mm/filemap_xip.c as it is now empty.  Also remove
documentation of the long-gone get_xip_page().

Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
Cc: Andreas Dilger <andreas.dilger@intel.com>
Cc: Boaz Harrosh <boaz@plexistor.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-02-16 17:56:03 -08:00
Kirill A. Shutemov
27ba0644ea rmap: drop support of non-linear mappings
We don't create non-linear mappings anymore.  Let's drop code which
handles them in rmap.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-02-10 14:30:31 -08:00
Konstantin Khlebnikov
7a3ef208e6 mm: prevent endless growth of anon_vma hierarchy
Constantly forking task causes unlimited grow of anon_vma chain.  Each
next child allocates new level of anon_vmas and links vma to all
previous levels because pages might be inherited from any level.

This patch adds heuristic which decides to reuse existing anon_vma
instead of forking new one.  It adds counter anon_vma->degree which
counts linked vmas and directly descending anon_vmas and reuses anon_vma
if counter is lower than two.  As a result each anon_vma has either vma
or at least two descending anon_vmas.  In such trees half of nodes are
leafs with alive vmas, thus count of anon_vmas is no more than two times
bigger than count of vmas.

This heuristic reuses anon_vmas as few as possible because each reuse
adds false aliasing among vmas and rmap walker ought to scan more ptes
when it searches where page is might be mapped.

Link: http://lkml.kernel.org/r/20120816024610.GA5350@evergreen.ssec.wisc.edu
Fixes: 5beb493052 ("mm: change anon_vma linking to fix multi-process server scalability issue")
[akpm@linux-foundation.org: fix typo, per Rik]
Signed-off-by: Konstantin Khlebnikov <koct9i@gmail.com>
Reported-by: Daniel Forrest <dan.forrest@ssec.wisc.edu>
Tested-by: Michal Hocko <mhocko@suse.cz>
Tested-by: Jerome Marchand <jmarchan@redhat.com>
Reviewed-by: Michal Hocko <mhocko@suse.cz>
Reviewed-by: Rik van Riel <riel@redhat.com>
Cc: <stable@vger.kernel.org>	[2.6.34+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-01-08 15:10:51 -08:00
Sasha Levin
81d1b09c6b mm: convert a few VM_BUG_ON callers to VM_BUG_ON_VMA
Trivially convert a few VM_BUG_ON calls to VM_BUG_ON_VMA to extract
more information when they trigger.

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Reviewed-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Konstantin Khlebnikov <khlebnikov@openvz.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Hugh Dickins <hughd@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Michel Lespinasse <walken@google.com>
Cc: Minchan Kim <minchan@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-10-09 22:25:57 -04:00
Konstantin Khlebnikov
daa5ba768b mm/rmap.c: cleanup ttu_flags
Transform action part of ttu_flags into individiual bits.  These flags
aren't part of any uses-space visible api or even trace events.

Signed-off-by: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Rik van Riel <riel@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-04 16:54:12 -07:00
Kirill A. Shutemov
ac7695012a mm/rmap.c: make page_referenced_one() and try_to_unmap_one() static
KSM was converted to use rmap_walk() and now nobody uses these functions
outside mm/rmap.c.

Let's covert them back to static.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-04 16:54:05 -07:00
Hugh Dickins
7e09e738af mm: fix swapops.h:131 bug if remap_file_pages raced migration
Add remove_linear_migration_ptes_from_nonlinear(), to fix an interesting
little include/linux/swapops.h:131 BUG_ON(!PageLocked) found by trinity:
indicating that remove_migration_ptes() failed to find one of the
migration entries that was temporarily inserted.

The problem comes from remap_file_pages()'s switch from vma_interval_tree
(good for inserting the migration entry) to i_mmap_nonlinear list (no good
for locating it again); but can only be a problem if the remap_file_pages()
range does not cover the whole of the vma (zap_pte() clears the range).

remove_migration_ptes() needs a file_nonlinear method to go down the
i_mmap_nonlinear list, applying linear location to look for migration
entries in those vmas too, just in case there was this race.

The file_nonlinear method does need rmap_walk_control.arg to do this;
but it never needed vma passed in - vma comes from its own iteration.

Reported-and-tested-by: Dave Jones <davej@redhat.com>
Reported-and-tested-by: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-03-20 22:09:09 -07:00