* refs/heads/tmp-1cfd841:
Revert "BACKPORT: perf_event: Add support for LSM and SELinux checks"
Linux 4.14.163
perf/x86/intel/bts: Fix the use of page_private()
xen/blkback: Avoid unmapping unmapped grant pages
s390/smp: fix physical to logical CPU map for SMT
net: add annotations on hh->hh_len lockless accesses
arm64: dts: meson: odroid-c2: Disable usb_otg bus to avoid power failed warning
ath9k_htc: Discard undersized packets
ath9k_htc: Modify byte order for an error message
rxrpc: Fix possible NULL pointer access in ICMP handling
selftests: rtnetlink: add addresses with fixed life time
powerpc/pseries/hvconsole: Fix stack overread via udbg
drm/mst: Fix MST sideband up-reply failure handling
scsi: qedf: Do not retry ELS request if qedf_alloc_cmd fails
fix compat handling of FICLONERANGE, FIDEDUPERANGE and FS_IOC_FIEMAP
tty: serial: msm_serial: Fix lockup for sysrq and oops
dt-bindings: clock: renesas: rcar-usb2-clock-sel: Fix typo in example
media: usb: fix memory leak in af9005_identify_state
regulator: ab8500: Remove AB8505 USB regulator
media: flexcop-usb: ensure -EIO is returned on error condition
Bluetooth: Fix memory leak in hci_connect_le_scan
Bluetooth: delete a stray unlock
Bluetooth: btusb: fix PM leak in error case of setup
platform/x86: pmc_atom: Add Siemens CONNECT X300 to critclk_systems DMI table
xfs: don't check for AG deadlock for realtime files in bunmapi
scsi: qla2xxx: Drop superfluous INIT_WORK of del_work
nfsd4: fix up replay_matches_cache()
PM / devfreq: Check NULL governor in available_governors_show
arm64: Revert support for execute-only user mappings
ftrace: Avoid potential division by zero in function profiler
exit: panic before exit_mm() on global init exit
ALSA: firewire-motu: Correct a typo in the clock proc string
ALSA: cs4236: fix error return comparison of an unsigned integer
tracing: Have the histogram compare functions convert to u64 first
tracing: Fix lock inversion in trace_event_enable_tgid_record()
gpiolib: fix up emulated open drain outputs
ata: ahci_brcm: Fix AHCI resources management
ata: ahci_brcm: Allow optional reset controller to be used
ata: libahci_platform: Export again ahci_platform_<en/dis>able_phys()
compat_ioctl: block: handle BLKREPORTZONE/BLKRESETZONE
compat_ioctl: block: handle Persistent Reservations
dmaengine: Fix access to uninitialized dma_slave_caps
locks: print unsigned ino in /proc/locks
pstore/ram: Write new dumps to start of recycled zones
memcg: account security cred as well to kmemcg
mm/zsmalloc.c: fix the migrated zspage statistics.
media: cec: avoid decrementing transmit_queue_sz if it is 0
media: cec: CEC 2.0-only bcast messages were ignored
media: pulse8-cec: fix lost cec_transmit_attempt_done() call
MIPS: Avoid VDSO ABI breakage due to global register variable
drm/sun4i: hdmi: Remove duplicate cleanup calls
ALSA: ice1724: Fix sleep-in-atomic in Infrasonic Quartet support code
drm: limit to INT_MAX in create_blob ioctl
taskstats: fix data-race
xfs: fix mount failure crash on invalid iclog memory access
PM / hibernate: memory_bm_find_bit(): Tighten node optimisation
xen/balloon: fix ballooned page accounting without hotplug enabled
xen-blkback: prevent premature module unload
IB/mlx4: Follow mirror sequence of device add during device removal
s390/cpum_sf: Avoid SBD overflow condition in irq handler
s390/cpum_sf: Adjust sampling interval to avoid hitting sample limits
md: raid1: check rdev before reference in raid1_sync_request func
net: make socket read/write_iter() honor IOCB_NOWAIT
usb: gadget: fix wrong endpoint desc
drm/nouveau: Move the declaration of struct nouveau_conn_atom up a bit
scsi: libsas: stop discovering if oob mode is disconnected
scsi: iscsi: qla4xxx: fix double free in probe
scsi: qla2xxx: Don't call qlt_async_event twice
scsi: lpfc: Fix memory leak on lpfc_bsg_write_ebuf_set func
rxe: correctly calculate iCRC for unaligned payloads
RDMA/cma: add missed unregister_pernet_subsys in init failure
PM / devfreq: Don't fail devfreq_dev_release if not in list
iio: adc: max9611: Fix too short conversion time delay
nvme_fc: add module to ops template to allow module references
UPSTREAM: selinux: sidtab reverse lookup hash table
UPSTREAM: selinux: avoid atomic_t usage in sidtab
UPSTREAM: selinux: check sidtab limit before adding a new entry
UPSTREAM: selinux: fix context string corruption in convert_context()
BACKPORT: selinux: overhaul sidtab to fix bug and improve performance
UPSTREAM: selinux: refactor mls_context_to_sid() and make it stricter
UPSTREAM: selinux: Cleanup printk logging in services
UPSTREAM: scsi: ilog2: create truly constant version for sparse
BACKPORT: selinux: use separate table for initial SID lookup
UPSTREAM: selinux: make "selinux_policycap_names[]" const char *
UPSTREAM: selinux: refactor sidtab conversion
BACKPORT: selinux: wrap AVC state
UPSTREAM: selinux: wrap selinuxfs state
UPSTREAM: selinux: rename the {is,set}_enforcing() functions
BACKPORT: selinux: wrap global selinux state
UPSTREAM: selinux: Use kmem_cache for hashtab_node
BACKPORT: perf_event: Add support for LSM and SELinux checks
UPSTREAM: binder: Add binder_proc logging to binderfs
UPSTREAM: binder: Make transaction_log available in binderfs
UPSTREAM: binder: Add stats, state and transactions files
UPSTREAM: binder: add a mount option to show global stats
UPSTREAM: binder: Validate the default binderfs device names.
UPSTREAM: binder: Add default binder devices through binderfs when configured
UPSTREAM: binder: fix CONFIG_ANDROID_BINDER_DEVICES
UPSTREAM: android: binder: use kstrdup instead of open-coding it
UPSTREAM: binderfs: remove separate device_initcall()
BACKPORT: binderfs: respect limit on binder control creation
UPSTREAM: binderfs: switch from d_add() to d_instantiate()
UPSTREAM: binderfs: drop lock in binderfs_binder_ctl_create
UPSTREAM: binderfs: kill_litter_super() before cleanup
UPSTREAM: binderfs: rework binderfs_binder_device_create()
UPSTREAM: binderfs: rework binderfs_fill_super()
UPSTREAM: binderfs: prevent renaming the control dentry
UPSTREAM: binderfs: remove outdated comment
UPSTREAM: binderfs: fix error return code in binderfs_fill_super()
UPSTREAM: binderfs: handle !CONFIG_IPC_NS builds
BACKPORT: binderfs: reserve devices for initial mount
UPSTREAM: binderfs: rename header to binderfs.h
BACKPORT: binderfs: implement "max" mount option
UPSTREAM: binderfs: make each binderfs mount a new instance
UPSTREAM: binderfs: remove wrong kern_mount() call
BACKPORT: binder: implement binderfs
UPSTREAM: binder: remove BINDER_DEBUG_ENTRY()
UPSTREAM: seq_file: Introduce DEFINE_SHOW_ATTRIBUTE() helper macro
UPSTREAM: exit: panic before exit_mm() on global init exit
Conflicts:
drivers/gpu/drm/drm_property.c
security/selinux/avc.c
security/selinux/hooks.c
security/selinux/include/security.h
security/selinux/ss/services.c
Changed below files to fix build errors:
gen_headers_arm64.bp
gen_headers_arm.bp
Change-Id: Ie7e5cd66a03cfaa765a491598302b8f073ac159c
Signed-off-by: Srinivasarao P <spathi@codeaurora.org>
* refs/heads/tmp-f960b38:
Linux 4.14.159
of: unittest: fix memory leak in attach_node_and_children
raid5: need to set STRIPE_HANDLE for batch head
gpiolib: acpi: Add Terra Pad 1061 to the run_edge_events_on_boot_blacklist
kernel/module.c: wakeup processes in module_wq on module unload
gfs2: fix glock reference problem in gfs2_trans_remove_revoke
net/mlx5e: Fix SFF 8472 eeprom length
sunrpc: fix crash when cache_head become valid before update
workqueue: Fix missing kfree(rescuer) in destroy_workqueue()
blk-mq: make sure that line break can be printed
mfd: rk808: Fix RK818 ID template
ext4: fix a bug in ext4_wait_for_tail_page_commit
mm/shmem.c: cast the type of unmap_start to u64
firmware: qcom: scm: Ensure 'a0' status code is treated as signed
ext4: work around deleting a file with i_nlink == 0 safely
powerpc: Fix vDSO clock_getres()
powerpc: Avoid clang warnings around setjmp and longjmp
ath10k: fix fw crash by moving chip reset after napi disabled
media: vimc: fix component match compare
mlxsw: spectrum_router: Refresh nexthop neighbour when it becomes dead
power: supply: cpcap-battery: Fix signed counter sample register
x86/MCE/AMD: Carve out the MC4_MISC thresholding quirk
x86/MCE/AMD: Turn off MC4_MISC thresholding on all family 0x15 models
e100: Fix passing zero to 'PTR_ERR' warning in e100_load_ucode_wait
drbd: Change drbd_request_detach_interruptible's return type to int
scsi: lpfc: Correct code setting non existent bits in sli4 ABORT WQE
scsi: lpfc: Cap NPIV vports to 256
omap: pdata-quirks: remove openpandora quirks for mmc3 and wl1251
phy: renesas: rcar-gen3-usb2: Fix sysfs interface of "role"
iio: adis16480: Add debugfs_reg_access entry
xhci: make sure interrupts are restored to correct state
xhci: Fix memory leak in xhci_add_in_port()
scsi: qla2xxx: Fix message indicating vectors used by driver
scsi: qla2xxx: Always check the qla2x00_wait_for_hba_online() return value
scsi: qla2xxx: Fix qla24xx_process_bidir_cmd()
scsi: qla2xxx: Fix session lookup in qlt_abort_work()
scsi: qla2xxx: Fix DMA unmap leak
scsi: zfcp: trace channel log even for FCP command responses
block: fix single range discard merge
reiserfs: fix extended attributes on the root directory
ext4: Fix credit estimate for final inode freeing
quota: fix livelock in dquot_writeback_dquots
ext2: check err when partial != NULL
quota: Check that quota is not dirty before release
video/hdmi: Fix AVI bar unpack
powerpc/xive: Skip ioremap() of ESB pages for LSI interrupts
powerpc: Allow flush_icache_range to work across ranges >4GB
powerpc/xive: Prevent page fault issues in the machine crash handler
powerpc: Allow 64bit VDSO __kernel_sync_dicache to work across ranges >4GB
ppdev: fix PPGETTIME/PPSETTIME ioctls
ARM: dts: omap3-tao3530: Fix incorrect MMC card detection GPIO polarity
mmc: host: omap_hsmmc: add code for special init of wl1251 to get rid of pandora_wl1251_init_card
pinctrl: samsung: Fix device node refcount leaks in S3C64xx wakeup controller init
pinctrl: samsung: Fix device node refcount leaks in init code
pinctrl: samsung: Fix device node refcount leaks in S3C24xx wakeup controller init
pinctrl: samsung: Add of_node_put() before return in error path
ACPI: PM: Avoid attaching ACPI PM domain to certain devices
ACPI: bus: Fix NULL pointer check in acpi_bus_get_private_data()
ACPI: OSL: only free map once in osl.c
cpufreq: powernv: fix stack bloat and hard limit on number of CPUs
PM / devfreq: Lock devfreq in trans_stat_show
intel_th: pci: Add Tiger Lake CPU support
intel_th: pci: Add Ice Lake CPU support
intel_th: Fix a double put_device() in error path
cpuidle: Do not unset the driver if it is there already
media: cec.h: CEC_OP_REC_FLAG_ values were swapped
media: radio: wl1273: fix interrupt masking on release
media: bdisp: fix memleak on release
s390/mm: properly clear _PAGE_NOEXEC bit when it is not supported
ar5523: check NULL before memcpy() in ar5523_cmd()
cgroup: pids: use atomic64_t for pids->limit
blk-mq: avoid sysfs buffer overflow with too many CPU cores
ASoC: Jack: Fix NULL pointer dereference in snd_soc_jack_report
workqueue: Fix pwq ref leak in rescuer_thread()
workqueue: Fix spurious sanity check failures in destroy_workqueue()
dm zoned: reduce overhead of backing device checks
hwrng: omap - Fix RNG wait loop timeout
watchdog: aspeed: Fix clock behaviour for ast2600
md/raid0: Fix an error message in raid0_make_request()
ALSA: hda - Fix pending unsol events at shutdown
ovl: relax WARN_ON() on rename to self
lib: raid6: fix awk build warnings
rtlwifi: rtl8192de: Fix missing enable interrupt flag
rtlwifi: rtl8192de: Fix missing callback that tests for hw release of buffer
rtlwifi: rtl8192de: Fix missing code to retrieve RX buffer address
btrfs: record all roots for rename exchange on a subvol
Btrfs: send, skip backreference walking for extents with many references
btrfs: Remove btrfs_bio::flags member
Btrfs: fix negative subv_writers counter and data space leak after buffered write
btrfs: use refcount_inc_not_zero in kill_all_nodes
btrfs: check page->mapping when loading free space cache
usb: dwc3: ep0: Clear started flag on completion
virtio-balloon: fix managed page counts when migrating pages between zones
mtd: spear_smi: Fix Write Burst mode
tpm: add check after commands attribs tab allocation
usb: mon: Fix a deadlock in usbmon between mmap and read
usb: core: urb: fix URB structure initialization function
USB: adutux: fix interface sanity check
USB: serial: io_edgeport: fix epic endpoint lookup
USB: idmouse: fix interface sanity checks
USB: atm: ueagle-atm: add missing endpoint check
iio: humidity: hdc100x: fix IIO_HUMIDITYRELATIVE channel reporting
ARM: dts: pandora-common: define wl1251 as child node of mmc3
xhci: handle some XHCI_TRUST_TX_LENGTH quirks cases as default behaviour.
xhci: Increase STS_HALT timeout in xhci_suspend()
usb: xhci: only set D3hot for pci device
staging: gigaset: add endpoint-type sanity check
staging: gigaset: fix illegal free on probe errors
staging: gigaset: fix general protection fault on probe
staging: rtl8712: fix interface sanity check
staging: rtl8188eu: fix interface sanity check
usb: Allow USB device to be warm reset in suspended state
USB: documentation: flags on usb-storage versus UAS
USB: uas: heed CAPACITY_HEURISTICS
USB: uas: honor flag to avoid CAPACITY16
media: venus: remove invalid compat_ioctl32 handler
scsi: qla2xxx: Fix driver unload hang
usb: gadget: pch_udc: fix use after free
usb: gadget: configfs: Fix missing spin_lock_init()
appletalk: Set error code if register_snap_client failed
appletalk: Fix potential NULL pointer dereference in unregister_snap_client
KVM: x86: fix out-of-bounds write in KVM_GET_EMULATED_CPUID (CVE-2019-19332)
ASoC: rsnd: fixup MIX kctrl registration
binder: Handle start==NULL in binder_update_page_range()
thermal: Fix deadlock in thermal thermal_zone_device_check
iomap: Fix pipe page leakage during splicing
RDMA/qib: Validate ->show()/store() callbacks before calling them
spi: atmel: Fix CS high support
crypto: user - fix memory leak in crypto_report
crypto: ecdh - fix big endian bug in ECC library
crypto: ccp - fix uninitialized list head
crypto: af_alg - cast ki_complete ternary op to int
crypto: crypto4xx - fix double-free in crypto4xx_destroy_sdr
KVM: x86: fix presentation of TSX feature in ARCH_CAPABILITIES
KVM: x86: do not modify masked bits of shared MSRs
KVM: arm/arm64: vgic: Don't rely on the wrong pending table
drm/i810: Prevent underflow in ioctl
jbd2: Fix possible overflow in jbd2_log_space_left()
kernfs: fix ino wrap-around detection
can: slcan: Fix use-after-free Read in slcan_open
tty: vt: keyboard: reject invalid keycodes
CIFS: Fix SMB2 oplock break processing
CIFS: Fix NULL-pointer dereference in smb2_push_mandatory_locks
x86/PCI: Avoid AMD FCH XHCI USB PME# from D0 defect
Input: Fix memory leak in psxpad_spi_probe
coresight: etm4x: Fix input validation for sysfs.
Input: goodix - add upside-down quirk for Teclast X89 tablet
Input: synaptics-rmi4 - don't increment rmiaddr for SMBus transfers
Input: synaptics-rmi4 - re-enable IRQs in f34v7_do_reflash
Input: synaptics - switch another X1 Carbon 6 to RMI/SMbus
ALSA: hda - Add mute led support for HP ProBook 645 G4
ALSA: pcm: oss: Avoid potential buffer overflows
ALSA: hda/realtek - Dell headphone has noise on unmute for ALC236
fuse: verify attributes
fuse: verify nlink
sched/fair: Scale bandwidth quota and period without losing quota/period ratio precision
tcp: exit if nothing to retransmit on RTO timeout
net: aquantia: fix RSS table and key sizes
media: vimc: fix start stream when link is disabled
ARM: dts: sunxi: Fix PMU compatible strings
usb: mtu3: fix dbginfo in qmu_tx_zlp_error_handler
mlx4: Use snprintf instead of complicated strcpy
IB/hfi1: Close VNIC sdma_progress sleep window
IB/hfi1: Ignore LNI errors before DC8051 transitions to Polling state
mlxsw: spectrum_router: Relax GRE decap matching check
firmware: qcom: scm: fix compilation error when disabled
media: stkwebcam: Bugfix for wrong return values
tty: Don't block on IO when ldisc change is pending
nfsd: Return EPERM, not EACCES, in some SETATTR cases
MIPS: OCTEON: cvmx_pko_mem_debug8: use oldest forward compatible definition
clk: renesas: r8a77995: Correct parent clock of DU
powerpc/math-emu: Update macros from GCC
pstore/ram: Avoid NULL deref in ftrace merging failure path
net/mlx4_core: Fix return codes of unsupported operations
dlm: fix invalid cluster name warning
ARM: dts: realview: Fix some more duplicate regulator nodes
clk: sunxi-ng: h3/h5: Fix CSI_MCLK parent
ARM: dts: pxa: clean up USB controller nodes
mtd: fix mtd_oobavail() incoherent returned value
kbuild: fix single target build for external module
modpost: skip ELF local symbols during section mismatch check
tcp: fix SNMP TCP timeout under-estimation
tcp: fix SNMP under-estimation on failed retransmission
tcp: fix off-by-one bug on aborting window-probing socket
ARM: dts: realview-pbx: Fix duplicate regulator nodes
ARM: dts: mmp2: fix the gpio interrupt cell number
net/x25: fix null_x25_address handling
net/x25: fix called/calling length calculation in x25_parse_address_block
arm64: dts: meson-gxl-khadas-vim: fix GPIO lines names
arm64: dts: meson-gxbb-odroidc2: fix GPIO lines names
arm64: dts: meson-gxbb-nanopi-k2: fix GPIO lines names
arm64: dts: meson-gxl-libretech-cc: fix GPIO lines names
ARM: OMAP1/2: fix SoC name printing
ASoC: au8540: use 64-bit arithmetic instead of 32-bit
nfsd: fix a warning in __cld_pipe_upcall()
ARM: debug: enable UART1 for socfpga Cyclone5
dlm: NULL check before kmem_cache_destroy is not needed
ARM: dts: sun8i: v3s: Change pinctrl nodes to avoid warning
ARM: dts: sun5i: a10s: Fix HDMI output DTC warning
ASoC: rsnd: tidyup registering method for rsnd_kctrl_new()
lockd: fix decoding of TEST results
i2c: imx: don't print error message on probe defer
serial: imx: fix error handling in console_setup
altera-stapl: check for a null key before strcasecmp'ing it
dma-mapping: fix return type of dma_set_max_seg_size()
sparc: Correct ctx->saw_frame_pointer logic.
f2fs: fix to allow node segment for GC by ioctl path
ARM: dts: rockchip: Assign the proper GPIO clocks for rv1108
ARM: dts: rockchip: Fix the PMU interrupt number for rv1108
f2fs: change segment to section in f2fs_ioc_gc_range
f2fs: fix count of seg_freed to make sec_freed correct
ACPI: fix acpi_find_child_device() invocation in acpi_preset_companion()
usb: dwc3: don't log probe deferrals; but do log other error codes
usb: dwc3: debugfs: Properly print/set link state for HS
dmaengine: dw-dmac: implement dma protection control setting
dmaengine: coh901318: Remove unused variable
dmaengine: coh901318: Fix a double-lock bug
media: cec: report Vendor ID after initialization
media: pulse8-cec: return 0 when invalidating the logical address
ARM: dts: exynos: Use Samsung SoC specific compatible for DWC2 module
rtc: dt-binding: abx80x: fix resistance scale
rtc: max8997: Fix the returned value in case of error in 'max8997_rtc_read_alarm()'
math-emu/soft-fp.h: (_FP_ROUND_ZERO) cast 0 to void to fix warning
net/smc: use after free fix in smc_wr_tx_put_slot()
MIPS: OCTEON: octeon-platform: fix typing
iomap: sub-block dio needs to zeroout beyond EOF
net-next/hinic:fix a bug in set mac address
regulator: Fix return value of _set_load() stub
clk: rockchip: fix ID of 8ch clock of I2S1 for rk3328
clk: rockchip: fix I2S1 clock gate register for rk3328
mm/vmstat.c: fix NUMA statistics updates
Staging: iio: adt7316: Fix i2c data reading, set the data field
pinctrl: qcom: ssbi-gpio: fix gpio-hog related boot issues
crypto: bcm - fix normal/non key hash algorithm failure
crypto: ecc - check for invalid values in the key verification test
scsi: zfcp: drop default switch case which might paper over missing case
net: dsa: mv88e6xxx: Work around mv886e6161 SERDES missing MII_PHYSID2
MIPS: SiByte: Enable ZONE_DMA32 for LittleSur
dlm: fix missing idr_destroy for recover_idr
ARM: dts: rockchip: Fix rk3288-rock2 vcc_flash name
clk: rockchip: fix rk3188 sclk_mac_lbtest parameter ordering
clk: rockchip: fix rk3188 sclk_smc gate data
i40e: don't restart nway if autoneg not supported
rtc: s3c-rtc: Avoid using broken ALMYEAR register
net: ethernet: ti: cpts: correct debug for expired txq skb
extcon: max8997: Fix lack of path setting in USB device mode
dlm: fix possible call to kfree() for non-initialized pointer
clk: sunxi-ng: a64: Fix gate bit of DSI DPHY
net/mlx5: Release resource on error flow
ARM: 8813/1: Make aligned 2-byte getuser()/putuser() atomic on ARMv6+
iwlwifi: mvm: Send non offchannel traffic via AP sta
iwlwifi: mvm: synchronize TID queue removal
cxgb4vf: fix memleak in mac_hlist initialization
serial: core: Allow processing sysrq at port unlock time
i2c: core: fix use after free in of_i2c_notify
net: ep93xx_eth: fix mismatch of request_mem_region in remove
rsxx: add missed destroy_workqueue calls in remove
ALSA: pcm: Fix stream lock usage in snd_pcm_period_elapsed()
sched/core: Avoid spurious lock dependencies
Input: cyttsp4_core - fix use after free bug
xfrm: release device reference for invalid state
NFC: nxp-nci: Fix NULL pointer dereference after I2C communication error
audit_get_nd(): don't unlock parent too early
exportfs_decode_fh(): negative pinned may become positive without the parent locked
iwlwifi: pcie: don't consider IV len in A-MSDU
RDMA/hns: Correct the value of HNS_ROCE_HEM_CHUNK_LEN
autofs: fix a leak in autofs_expire_indirect()
serial: ifx6x60: add missed pm_runtime_disable
serial: serial_core: Perform NULL checks for break_ctl ops
serial: pl011: Fix DMA ->flush_buffer()
tty: serial: msm_serial: Fix flow control
tty: serial: fsl_lpuart: use the sg count from dma_map_sg
usb: gadget: u_serial: add missing port entry locking
arm64: tegra: Fix 'active-low' warning for Jetson TX1 regulator
rsi: release skb if rsi_prepare_beacon fails
ANDROID: staging: android: ion: Fix build when CONFIG_ION_SYSTEM_HEAP=n
ANDROID: staging: android: ion: Expose total heap and pool sizes via sysfs
UPSTREAM: include/linux/slab.h: fix sparse warning in kmalloc_type()
UPSTREAM: mm, slab: shorten kmalloc cache names for large sizes
UPSTREAM: mm, proc: add KReclaimable to /proc/meminfo
BACKPORT: mm: rename and change semantics of nr_indirectly_reclaimable_bytes
UPSTREAM: dcache: allocate external names from reclaimable kmalloc caches
BACKPORT: mm, slab/slub: introduce kmalloc-reclaimable caches
UPSTREAM: mm, slab: combine kmalloc_caches and kmalloc_dma_caches
ANDROID: kbuild: disable SCS by default in allmodconfig
ANDROID: arm64: cuttlefish_defconfig: enable LTO, CFI, and SCS
BACKPORT: FROMLIST: arm64: implement Shadow Call Stack
FROMLIST: arm64: disable SCS for hypervisor code
BACKPORT: FROMLIST: arm64: vdso: disable Shadow Call Stack
FROMLIST: arm64: preserve x18 when CPU is suspended
FROMLIST: arm64: reserve x18 from general allocation with SCS
FROMLIST: arm64: disable function graph tracing with SCS
FROMLIST: scs: add support for stack usage debugging
FROMLIST: scs: add accounting
FROMLIST: add support for Clang's Shadow Call Stack (SCS)
FROMLIST: arm64: kernel: avoid x18 in __cpu_soft_restart
FROMLIST: arm64: kvm: stop treating register x18 as caller save
FROMLIST: arm64/lib: copy_page: avoid x18 register in assembler code
FROMLIST: arm64: mm: avoid x18 in idmap_kpti_install_ng_mappings
ANDROID: use non-canonical CFI jump tables
ANDROID: arm64: add __nocfi to __apply_alternatives
ANDROID: arm64: add __pa_function
ANDROID: arm64: allow ThinLTO to be selected
ANDROID: soc/tegra: disable ARCH_TEGRA_210_SOC with LTO
FROMLIST: arm64: fix alternatives with LLVM's integrated assembler
ANDROID: irqchip/gic-v3: rename gic_of_init to work around a ThinLTO+CFI bug
ANDROID: kbuild: limit LTO inlining
ANDROID: kbuild: merge module sections with LTO
ANDROID: init: ensure initcall ordering with LTO
Revert "ANDROID: HACK: init: ensure initcall ordering with LTO"
ANDROID: add support for ThinLTO
ANDROID: Switch to LLD
ANDROID: clang: update to 10.0.1
ANDROID: arm64: add atomic_ll_sc.o to obj-y if using lld
ANDROID: enable ARM64_ERRATUM_843419 by default with LTO_CLANG
ANDROID: kbuild: allow lld to be used with CONFIG_LTO_CLANG
ANDROID: Makefile: set -Qunused-arguments sooner
BACKPORT: FROMLIST: Makefile: lld: tell clang to use lld
BACKPORT: FROMLIST: Makefile: lld: set -O2 linker flag when linking with LLD
ANDROID: scripts/Kbuild: add ld-name support for ld.lld
UPSTREAM: bpf: permit multiple bpf attachments for a single perf event
UPSTREAM: bpf: use the same condition in perf event set/free bpf handler
UPSTREAM: bpf: multi program support for cgroup+bpf
BACKPORT: serdev: make synchronous write return bytes written
UPSTREAM: gnss: serial: fix synchronous write timeout
UPSTREAM: gnss: fix potential error pointer dereference
BACKPORT: gnss: add receiver type support
UPSTREAM: dt-bindings: add generic gnss binding
UPSTREAM: gnss: add generic serial driver
ANDROID: cuttlefish_defconfig: Enable CONFIG_SERIAL_DEV_BUS
ANDROID: cuttlefish_defconfig: Enable CONFIG_GNSS
BACKPORT: gnss: add GNSS receiver subsystem
UPSTREAM: arm64: Validate tagged addresses in access_ok() called from kernel threads
BACKPORT: ARM: 8905/1: Emit __gnu_mcount_nc when using Clang 10.0.0 or newer
fs/lock: skip lock owner pid translation in case we are in init_pid_ns
f2fs: stop GC when the victim becomes fully valid
f2fs: expose main_blkaddr in sysfs
f2fs: choose hardlimit when softlimit is larger than hardlimit in f2fs_statfs_project()
f2fs: Fix deadlock in f2fs_gc() context during atomic files handling
f2fs: show f2fs instance in printk_ratelimited
f2fs: fix potential overflow
f2fs: fix to update dir's i_pino during cross_rename
f2fs: support aligned pinned file
f2fs: avoid kernel panic on corruption test
f2fs: fix wrong description in document
f2fs: cache global IPU bio
f2fs: fix to avoid memory leakage in f2fs_listxattr
f2fs: check total_segments from devices in raw_super
f2fs: update multi-dev metadata in resize_fs
f2fs: mark recovery flag correctly in read_raw_super_block()
f2fs: fix to update time in lazytime mode
vfs: don't allow writes to swap files
mm: set S_SWAPFILE on blockdev swap devices
Conflicts:
drivers/Makefile
drivers/staging/android/ion/ion.c
drivers/staging/android/ion/ion.h
drivers/staging/android/ion/ion_page_pool.c
drivers/usb/dwc3/core.c
drivers/usb/dwc3/debugfs.c
drivers/usb/dwc3/ep0.c
fs/f2fs/data.c
include/linux/mmzone.h
mm/vmstat.c
Discarded below patches, as usb patches not applicable and block patch
causing stability issues:
usb: dwc3: ep0: Clear started flag on completion
usb: dwc3: don't log probe deferrals; but do log other error codes
block: fix single range discard merge
Fixed build errors in below files:
drivers/gpu/msm/kgsl_pool.c
drivers/staging/android/ion/ion_page_pool.c
kernel/taskstats.c
Fixed bootup issue in:
arch/arm64/mm/proc.s
Change-Id: I0a16824c251c14c63af78f9cfd9ede5e82c427fc
Signed-off-by: Srinivasarao P <spathi@codeaurora.org>
Signed-off-by: Blagovest Kolenichev <bkolenichev@codeaurora.org>
[ Upstream commit 0b8d616fb5a8ffa307b1d3af37f55c15dae14f28 ]
When assiging and testing taskstats in taskstats_exit() there's a race
when setting up and reading sig->stats when a thread-group with more
than one thread exits:
write to 0xffff8881157bbe10 of 8 bytes by task 7951 on cpu 0:
taskstats_tgid_alloc kernel/taskstats.c:567 [inline]
taskstats_exit+0x6b7/0x717 kernel/taskstats.c:596
do_exit+0x2c2/0x18e0 kernel/exit.c:864
do_group_exit+0xb4/0x1c0 kernel/exit.c:983
get_signal+0x2a2/0x1320 kernel/signal.c:2734
do_signal+0x3b/0xc00 arch/x86/kernel/signal.c:815
exit_to_usermode_loop+0x250/0x2c0 arch/x86/entry/common.c:159
prepare_exit_to_usermode arch/x86/entry/common.c:194 [inline]
syscall_return_slowpath arch/x86/entry/common.c:274 [inline]
do_syscall_64+0x2d7/0x2f0 arch/x86/entry/common.c:299
entry_SYSCALL_64_after_hwframe+0x44/0xa9
read to 0xffff8881157bbe10 of 8 bytes by task 7949 on cpu 1:
taskstats_tgid_alloc kernel/taskstats.c:559 [inline]
taskstats_exit+0xb2/0x717 kernel/taskstats.c:596
do_exit+0x2c2/0x18e0 kernel/exit.c:864
do_group_exit+0xb4/0x1c0 kernel/exit.c:983
__do_sys_exit_group kernel/exit.c:994 [inline]
__se_sys_exit_group kernel/exit.c:992 [inline]
__x64_sys_exit_group+0x2e/0x30 kernel/exit.c:992
do_syscall_64+0xcf/0x2f0 arch/x86/entry/common.c:296
entry_SYSCALL_64_after_hwframe+0x44/0xa9
Fix this by using smp_load_acquire() and smp_store_release().
Reported-by: syzbot+c5d03165a1bd1dead0c1@syzkaller.appspotmail.com
Fixes: 34ec12349c ("taskstats: cleanup ->signal->stats allocation")
Cc: stable@vger.kernel.org
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Acked-by: Marco Elver <elver@google.com>
Reviewed-by: Will Deacon <will@kernel.org>
Reviewed-by: Andrea Parri <parri.andrea@gmail.com>
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
Link: https://lore.kernel.org/r/20191009114809.8643-1-christian.brauner@ubuntu.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
Bring up changes for SDM660 on 4.14 kernel.
Change-Id: I338a4a19eee9fb8757c5ef49c307ec794610fc11
Signed-off-by: Naitik Bharadiya <bharad@codeaurora.org>
Adds a new taskstats command to share task memory statistics
to userspace. The command works in two modes. It can share
information per pid, and also send the statistics for all
tasks within a given oom_score_adj range.
Change-Id: Iae742ea4f96754022bc634d285c5ce140b32749d
Signed-off-by: Vinayak Menon <vinmenon@codeaurora.org>
Add a new command to get system wide information from
userspace. This patch adds system command to export
memory statistics.
Change-Id: I20c1c24f49024f49b21e3d8e88799559fb5b2058
Signed-off-by: Vinayak Menon <vinmenon@codeaurora.org>
cgroupstats_cmd_get_policy is [CGROUPSTATS_CMD_ATTR_MAX+1],
taskstats_cmd_get_policy[TASKSTATS_CMD_ATTR_MAX+1],
but their family.maxattr is TASKSTATS_CMD_ATTR_MAX.
CGROUPSTATS_CMD_ATTR_MAX is less than TASKSTATS_CMD_ATTR_MAX,
so we could end up accessing out-of-bound.
Change cgroupstats_cmd_get_policy to TASKSTATS_CMD_ATTR_MAX+1,
this is safe because the rest are initialized to 0's.
Reported-by: Andrey Konovalov <andreyknvl@google.com>
Tested-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Now genl_register_family() is the only thing (other than the
users themselves, perhaps, but I didn't find any doing that)
writing to the family struct.
In all families that I found, genl_register_family() is only
called from __init functions (some indirectly, in which case
I've add __init annotations to clarifly things), so all can
actually be marked __ro_after_init.
This protects the data structure from accidental corruption.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Instead of providing macros/inline functions to initialize
the families, make all users initialize them statically and
get rid of the macros.
This reduces the kernel code size by about 1.6k on x86-64
(with allyesconfig).
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Static family IDs have never really been used, the only
use case was the workaround I introduced for those users
that assumed their family ID was also their multicast
group ID.
Additionally, because static family IDs would never be
reserved by the generic netlink code, using a relatively
low ID would only work for built-in families that can be
registered immediately after generic netlink is started,
which is basically only the control family (apart from
the workaround code, which I also had to add code for so
it would reserve those IDs)
Thus, anything other than GENL_ID_GENERATE is flawed and
luckily not used except in the cases I mentioned. Move
those workarounds into a few lines of code, and then get
rid of GENL_ID_GENERATE entirely, making it more robust.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Goal of this patch is to use the new libnl API to align netlink attribute
when needed.
The layout of the netlink message will be a bit different after the patch,
because the padattr (TASKSTATS_TYPE_STATS) will be inside the nested
attribute instead of before it.
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Contrary to common expectations for an "int" return, these functions
return only a positive value -- if used correctly they cannot even
return 0 because the message header will necessarily be in the skb.
This makes the very common pattern of
if (genlmsg_end(...) < 0) { ... }
be a whole bunch of dead code. Many places also simply do
return nlmsg_end(...);
and the caller is expected to deal with it.
This also commonly (at least for me) causes errors, because it is very
common to write
if (my_function(...))
/* error condition */
and if my_function() does "return nlmsg_end()" this is of course wrong.
Additionally, there's not a single place in the kernel that actually
needs the message length returned, and if anyone needs it later then
it'll be very easy to just use skb->len there.
Remove this, and make the functions void. This removes a bunch of dead
code as described above. The patch adds lines because I did
- return nlmsg_end(...);
+ nlmsg_end(...);
+ return 0;
I could have preserved all the function's return values by returning
skb->len, but instead I've audited all the places calling the affected
functions and found that none cared. A few places actually compared
the return value with <= 0 in dump functionality, but that could just
be changed to < 0 with no change in behaviour, so I opted for the more
efficient version.
One instance of the error I've made numerous times now is also present
in net/phonet/pn_netlink.c in the route_dumpit() function - it didn't
check for <0 or <=0 and thus broke out of the loop every single time.
I've preserved this since it will (I think) have caused the messages to
userspace to be formatted differently with just a single message for
every SKB returned to userspace. It's possible that this isn't needed
for the tools that actually use this, but I don't even know what they
are so couldn't test that changing this behaviour would be acceptable.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Convert all uses of __get_cpu_var for address calculation to use
this_cpu_ptr instead.
[Uses of __get_cpu_var with cpumask_var_t are no longer
handled by this patch]
Cc: Peter Zijlstra <peterz@infradead.org>
Acked-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
As suggested by David Miller, make genl_register_family_with_ops()
a macro and pass only the array, evaluating ARRAY_SIZE() in the
macro, this is a little safer.
The openvswitch has some indirection, assing ops/n_ops directly in
that code. This might ultimately just assign the pointers in the
family initializations, saving the struct genl_family_and_ops and
code (once mcast groups are handled differently.)
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Now that genl_ops are no longer modified in place when
registering, they can be made const. This patch was done
mostly with spatch:
@@
identifier ops;
@@
+const
struct genl_ops ops[] = {
...
};
(except the struct thing in net/openvswitch/datapath.c)
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This simplifies the code since there's no longer a
need to have error handling in the registration.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
For registering in add_del_listener(), when kmalloc_node() fails, need
return -ENOMEM instead of success code, and cmd_attr_register_cpumask()
wants to know about it.
After modification, give a simple common test "build -> boot up ->
kernel/controllers/cgroup/getdelays by LTP tools".
Signed-off-by: Chen Gang <gang.chen@asianux.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
If prepare_reply() succeeds we have allocated memory for 'rep_skb'. If
nla_reserve() then subsequently fails and returns NULL we fail to release
the memory we allocated, thus causing a leak.
Signed-off-by: Jesper Juhl <jj@chaosbits.net>
Cc: Balbir Singh <bsingharora@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull vfs update from Al Viro:
- big one - consolidation of descriptor-related logics; almost all of
that is moved to fs/file.c
(BTW, I'm seriously tempted to rename the result to fd.c. As it is,
we have a situation when file_table.c is about handling of struct
file and file.c is about handling of descriptor tables; the reasons
are historical - file_table.c used to be about a static array of
struct file we used to have way back).
A lot of stray ends got cleaned up and converted to saner primitives,
disgusting mess in android/binder.c is still disgusting, but at least
doesn't poke so much in descriptor table guts anymore. A bunch of
relatively minor races got fixed in process, plus an ext4 struct file
leak.
- related thing - fget_light() partially unuglified; see fdget() in
there (and yes, it generates the code as good as we used to have).
- also related - bits of Cyrill's procfs stuff that got entangled into
that work; _not_ all of it, just the initial move to fs/proc/fd.c and
switch of fdinfo to seq_file.
- Alex's fs/coredump.c spiltoff - the same story, had been easier to
take that commit than mess with conflicts. The rest is a separate
pile, this was just a mechanical code movement.
- a few misc patches all over the place. Not all for this cycle,
there'll be more (and quite a few currently sit in akpm's tree)."
Fix up trivial conflicts in the android binder driver, and some fairly
simple conflicts due to two different changes to the sock_alloc_file()
interface ("take descriptor handling from sock_alloc_file() to callers"
vs "net: Providing protocol type via system.sockprotoname xattr of
/proc/PID/fd entries" adding a dentry name to the socket)
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (72 commits)
MAX_LFS_FILESIZE should be a loff_t
compat: fs: Generic compat_sys_sendfile implementation
fs: push rcu_barrier() from deactivate_locked_super() to filesystems
btrfs: reada_extent doesn't need kref for refcount
coredump: move core dump functionality into its own file
coredump: prevent double-free on an error path in core dumper
usb/gadget: fix misannotations
fcntl: fix misannotations
ceph: don't abuse d_delete() on failure exits
hypfs: ->d_parent is never NULL or negative
vfs: delete surplus inode NULL check
switch simple cases of fget_light to fdget
new helpers: fdget()/fdput()
switch o2hb_region_dev_write() to fget_light()
proc_map_files_readdir(): don't bother with grabbing files
make get_file() return its argument
vhost_set_vring(): turn pollstart/pollstop into bool
switch prctl_set_mm_exe_file() to fget_light()
switch xfs_find_handle() to fget_light()
switch xfs_swapext() to fget_light()
...
Pull networking changes from David Miller:
1) GRE now works over ipv6, from Dmitry Kozlov.
2) Make SCTP more network namespace aware, from Eric Biederman.
3) TEAM driver now works with non-ethernet devices, from Jiri Pirko.
4) Make openvswitch network namespace aware, from Pravin B Shelar.
5) IPV6 NAT implementation, from Patrick McHardy.
6) Server side support for TCP Fast Open, from Jerry Chu and others.
7) Packet BPF filter supports MOD and XOR, from Eric Dumazet and Daniel
Borkmann.
8) Increate the loopback default MTU to 64K, from Eric Dumazet.
9) Use a per-task rather than per-socket page fragment allocator for
outgoing networking traffic. This benefits processes that have very
many mostly idle sockets, which is quite common.
From Eric Dumazet.
10) Use up to 32K for page fragment allocations, with fallbacks to
smaller sizes when higher order page allocations fail. Benefits are
a) less segments for driver to process b) less calls to page
allocator c) less waste of space.
From Eric Dumazet.
11) Allow GRO to be used on GRE tunnels, from Eric Dumazet.
12) VXLAN device driver, one way to handle VLAN issues such as the
limitation of 4096 VLAN IDs yet still have some level of isolation.
From Stephen Hemminger.
13) As usual there is a large boatload of driver changes, with the scale
perhaps tilted towards the wireless side this time around.
Fix up various fairly trivial conflicts, mostly caused by the user
namespace changes.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1012 commits)
hyperv: Add buffer for extended info after the RNDIS response message.
hyperv: Report actual status in receive completion packet
hyperv: Remove extra allocated space for recv_pkt_list elements
hyperv: Fix page buffer handling in rndis_filter_send_request()
hyperv: Fix the missing return value in rndis_filter_set_packet_filter()
hyperv: Fix the max_xfer_size in RNDIS initialization
vxlan: put UDP socket in correct namespace
vxlan: Depend on CONFIG_INET
sfc: Fix the reported priorities of different filter types
sfc: Remove EFX_FILTER_FLAG_RX_OVERRIDE_IP
sfc: Fix loopback self-test with separate_tx_channels=1
sfc: Fix MCDI structure field lookup
sfc: Add parentheses around use of bitfield macro arguments
sfc: Fix null function pointer in efx_sriov_channel_type
vxlan: virtual extensible lan
igmp: export symbol ip_mc_leave_group
netlink: add attributes to fdb interface
tg3: unconditionally select HWMON support when tg3 is enabled.
Revert "net: ti cpsw ethernet: allow reading phy interface mode from DT"
gre: fix sparse warning
...
- Explicitly limit exit task stat broadcast to the initial user and
pid namespaces, as it is already limited to the initial network
namespace.
- For broadcast task stats explicitly generate all of the idenitiers
in terms of the initial user namespace and the initial pid
namespace.
- For request stats report them in terms of the current user namespace
and the current pid namespace. Netlink messages are delivered
syncrhonously to the kernel allowing us to get the user namespace
and the pid namespace from the current task.
- Pass the namespaces for representing pids and uids and gids
into bacct_add_task.
Cc: Balbir Singh <bsingharora@gmail.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
It is a frequent mistake to confuse the netlink port identifier with a
process identifier. Try to reduce this confusion by renaming fields
that hold port identifiers portid instead of pid.
I have carefully avoided changing the structures exported to
userspace to avoid changing the userspace API.
I have successfully built an allyesconfig kernel with this change.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Acked-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ok, this isn't optimal, since it means that 'iotop' needs admin
capabilities, and we may have to work on this some more. But at the
same time it is very much not acceptable to let anybody just read
anybody elses IO statistics quite at this level.
Use of the GENL_ADMIN_PERM suggested by Johannes Berg as an alternative
to checking the capabilities by hand.
Reported-by: Vasiliy Kulikov <segoon@openwall.com>
Cc: Johannes Berg <johannes.berg@intel.com>
Acked-by: Balbir Singh <bsingharora@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
When send_cpu_listeners() finds the orphaned listener it marks it as
!valid and drops listeners->sem. Before it takes this sem for writing,
s->pid can be reused and add_del_listener() can wrongly try to re-use
this entry.
Change add_del_listener() to check ->valid = T.
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Reviewed-by: Vasiliy Kulikov <segoon@openwall.com>
Acked-by: Balbir Singh <bsingharora@gmail.com>
Cc: Jerome Marchand <jmarchan@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
1. Commit 26c4caea9d "don't allow duplicate entries in listener mode"
changed add_del_listener(REGISTER) so that "next_cpu:" can reuse the
listener allocated for the previous cpu, this doesn't look exactly
right even if minor.
Change the code to kfree() in the already-registered case, this case
is unlikely anyway so the extra kmalloc_node() shouldn't hurt but
looke more correct and clean.
2. use the plain list_for_each_entry() instead of _safe() to scan
listeners->list.
3. Remove the unneeded INIT_LIST_HEAD(&s->list), we are going to
list_add(&s->list).
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Reviewed-by: Vasiliy Kulikov <segoon@openwall.com>
Cc: Balbir Singh <bsingharora@gmail.com>
Reviewed-by: Jerome Marchand <jmarchan@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Currently a single process may register exit handlers unlimited times.
It may lead to a bloated listeners chain and very slow process
terminations.
Eg after 10KK sent TASKSTATS_CMD_ATTR_REGISTER_CPUMASKs ~300 Mb of
kernel memory is stolen for the handlers chain and "time id" shows 2-7
seconds instead of normal 0.003. It makes it possible to exhaust all
kernel memory and to eat much of CPU time by triggerring numerous exits
on a single CPU.
The patch limits the number of times a single process may register
itself on a single CPU to one.
One little issue is kept unfixed - as taskstats_exit() is called before
exit_files() in do_exit(), the orphaned listener entry (if it was not
explicitly deregistered) is kept until the next someone's exit() and
implicit deregistration in send_cpu_listeners(). So, if a process
registered itself as a listener exits and the next spawned process gets
the same pid, it would inherit taskstats attributes.
Signed-off-by: Vasiliy Kulikov <segooon@gmail.com>
Cc: Balbir Singh <bsingharora@gmail.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
printk()s without a priority level default to KERN_WARNING. To reduce
noise at KERN_WARNING, this patch set the priority level appriopriately
for unleveled printks()s. This should be useful to folks that look at
dmesg warnings closely.
Signed-off-by: Mandeep Singh Baines <msb@chromium.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Commit 4be2c95d ("taskstats: pad taskstats netlink response for aligment
issues on ia64") added a null field to align the taskstats structure but
the discussion centered around ia64. The issue exists on other platforms
with inefficient unaligned access and adding them piecemeal would be an
unmaintainable mess.
This patch uses Dave Miller's suggestion of using a combination of
CONFIG_64BIT && !CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS to determine
whether alignment is needed.
Note that this will cause breakage on those platforms with applications
like iotop which had hard-coded offsets into the packet to access the
taskstats structure.
The message seen on systems without the alignment fixes looks like: kernel
unaligned access to 0xe000023879dca9bc, ip=0xa000000100133d10
The addresses may vary but resolve to locations inside __delayacct_add_tsk.
iotop makes what I'd call unreasonable assumptions about the contents of a
netlink genetlink packet containing generic attributes. They're typed and
have headers that specify value lengths, so the client can (should)
identify and skip the ones the client doesn't understand.
The kernel, as of version 2.6.36, presented a packet like so:
+--------------------------------+
| genlmsghdr - 4 bytes |
+--------------------------------+
| NLA header - 4 bytes | /* Aggregate header */
+-+------------------------------+
| | NLA header - 4 bytes | /* PID header */
| +------------------------------+
| | pid/tgid - 4 bytes |
| +------------------------------+
| | NLA header - 4 bytes | /* stats header */
| + -----------------------------+ <- oops. aligned on 4 byte boundary
| | struct taskstats - 328 bytes |
+-+------------------------------+
The iotop code expects that the kernel will behave as it did then,
assuming that the packet format is set in stone. The format is set in
stone, but the packet offsets are not. There's nothing in the packet
format that guarantees that the packet will be sent in exactly the same
way. The attribute contents are set (or versioned) and the aggregate
contents are set but they can be anywhere in the packet.
The issue here isn't that an unaligned structure gets passed to userspace,
it's that the NLA infrastructure has something of a weakness: The 4 byte
attribute header may force the payload to be unaligned. The taskstats
structure is created at an unaligned location and then 64-bit values are
operated on inside the kernel, so the unaligned access warnings gets
spewed everywhere.
It's possible to use the unaligned access API to operate on the structure
in the kernel but it seems like a wasted effort to work around userspace
code that isn't following the packet format. Any new additions would also
need the be worked around. It's a maintenance nightmare.
The conclusion of the earlier discussion seemed to be "ok fine, if we have
to break it, don't break it on arches that don't have the problem." Dave
pointed out that the unaligned access problem doesn't only exist on ia64,
but also on other 64-bit arches that don't have efficient unaligned access
and it should be fixed there as well. The committed version of the patch
and this addition keep with the conclusion of that discussion not to break
it unnecessarily, which the pid padding and the packet padding fixes did
do. x86_64 and powerpc don't suffer this problem so they shouldn't suffer
the solution. Other 64-bit architectures do and will, though.
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Reported-by: David S. Miller <davem@davemloft.net>
Acked-by: David S. Miller <davem@davemloft.net>
Cc: Dan Carpenter <error27@gmail.com>
Cc: Balbir Singh <balbir@in.ibm.com>
Cc: Florian Mickler <florian@mickler.org>
Cc: Guillaume Chazarain <guichaz@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* 'for-2.6.38' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu: (30 commits)
gameport: use this_cpu_read instead of lookup
x86: udelay: Use this_cpu_read to avoid address calculation
x86: Use this_cpu_inc_return for nmi counter
x86: Replace uses of current_cpu_data with this_cpu ops
x86: Use this_cpu_ops to optimize code
vmstat: User per cpu atomics to avoid interrupt disable / enable
irq_work: Use per cpu atomics instead of regular atomics
cpuops: Use cmpxchg for xchg to avoid lock semantics
x86: this_cpu_cmpxchg and this_cpu_xchg operations
percpu: Generic this_cpu_cmpxchg() and this_cpu_xchg support
percpu,x86: relocate this_cpu_add_return() and friends
connector: Use this_cpu operations
xen: Use this_cpu_inc_return
taskstats: Use this_cpu_ops
random: Use this_cpu_inc_return
fs: Use this_cpu_inc_return in buffer.c
highmem: Use this_cpu_xx_return() operations
vmstat: Use this_cpu_inc_return for vm statistics
x86: Support for this_cpu_add, sub, dec, inc_return
percpu: Generic support for this_cpu_add, sub, dec, inc_return
...
Fixed up conflicts: in arch/x86/kernel/{apic/nmi.c, apic/x2apic_uv_x.c, process.c}
as per Tejun.
The taskstats structure is internally aligned on 8 byte boundaries but the
layout of the aggregrate reply, with two NLA headers and the pid (each 4
bytes), actually force the entire structure to be unaligned. This causes
the kernel to issue unaligned access warnings on some architectures like
ia64. Unfortunately, some software out there doesn't properly unroll the
NLA packet and assumes that the start of the taskstats structure will
always be 20 bytes from the start of the netlink payload. Aligning the
start of the taskstats structure breaks this software, which we don't
want. So, for now the alignment only happens on architectures that
require it and those users will have to update to fixed versions of those
packages. Space is reserved in the packet only when needed. This ifdef
should be removed in several years e.g. 2012 once we can be confident
that fixed versions are installed on most systems. We add the padding
before the aggregate since the aggregate is already a defined type.
Commit 85893120 ("delayacct: align to 8 byte boundary on 64-bit systems")
previously addressed the alignment issues by padding out the pid field.
This was supposed to be a compatible change but the circumstances
described above mean that it wasn't. This patch backs out that change,
since it was a hack, and introduces a new NULL attribute type to provide
the padding. Padding the response with 4 bytes avoids allocating an
aligned taskstats structure and copying it back. Since the structure
weighs in at 328 bytes, it's too big to do it on the stack.
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Reported-by: Brian Rogers <brian@xyzw.org>
Cc: Jeff Mahoney <jeffm@suse.com>
Cc: Guillaume Chazarain <guichaz@gmail.com>
Cc: Balbir Singh <balbir@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Use this_cpu_inc_return in one place and avoid ugly __raw_get_cpu in
another.
V3->V4:
- Fix off by one.
V4-V4f:
- Use &listener_array
Cc: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Acked-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
prepare_reply() sets up an skb for the response. The payload contains:
+--------------------------------+
| genlmsghdr - 4 bytes |
+--------------------------------+
| NLA header - 4 bytes | /* Aggregate header */
+-+------------------------------+
| | NLA header - 4 bytes | /* PID header */
| +------------------------------+
| | pid/tgid - 4 bytes |
| +------------------------------+
| | NLA header - 4 bytes | /* stats header */
| + -----------------------------+ <- oops. aligned on 4 byte boundary
| | struct taskstats - 328 bytes |
+-+------------------------------+
The start of the taskstats struct must be 8 byte aligned on IA64 (and
other systems with 8 byte alignment rules for 64-bit types) or runtime
alignment warnings will be issued.
This patch pads the pid/tgid field out to sizeof(long), which forces the
alignment of taskstats. The getdelays userspace code is ok with this
since it assumes 32-bit pid/tgid and then honors that header's length
field.
An array is used to avoid exposing kernel memory contents to userspace in
the response.
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Cc: Balbir Singh <balbir@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
percpu.h is included by sched.h and module.h and thus ends up being
included when building most .c files. percpu.h includes slab.h which
in turn includes gfp.h making everything defined by the two files
universally available and complicating inclusion dependencies.
percpu.h -> slab.h dependency is about to be removed. Prepare for
this change by updating users of gfp and slab facilities include those
headers directly instead of assuming availability. As this conversion
needs to touch large number of source files, the following script is
used as the basis of conversion.
http://userweb.kernel.org/~tj/misc/slabh-sweep.py
The script does the followings.
* Scan files for gfp and slab usages and update includes such that
only the necessary includes are there. ie. if only gfp is used,
gfp.h, if slab is used, slab.h.
* When the script inserts a new include, it looks at the include
blocks and try to put the new include such that its order conforms
to its surrounding. It's put in the include block which contains
core kernel includes, in the same order that the rest are ordered -
alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
doesn't seem to be any matching order.
* If the script can't find a place to put a new include (mostly
because the file doesn't have fitting include block), it prints out
an error message indicating which .h file needs to be added to the
file.
The conversion was done in the following steps.
1. The initial automatic conversion of all .c files updated slightly
over 4000 files, deleting around 700 includes and adding ~480 gfp.h
and ~3000 slab.h inclusions. The script emitted errors for ~400
files.
2. Each error was manually checked. Some didn't need the inclusion,
some needed manual addition while adding it to implementation .h or
embedding .c file was more appropriate for others. This step added
inclusions to around 150 files.
3. The script was run again and the output was compared to the edits
from #2 to make sure no file was left behind.
4. Several build tests were done and a couple of problems were fixed.
e.g. lib/decompress_*.c used malloc/free() wrappers around slab
APIs requiring slab.h to be added manually.
5. The script was run on all .h files but without automatically
editing them as sprinkling gfp.h and slab.h inclusions around .h
files could easily lead to inclusion dependency hell. Most gfp.h
inclusion directives were ignored as stuff from gfp.h was usually
wildly available and often used in preprocessor macros. Each
slab.h inclusion directive was examined and added manually as
necessary.
6. percpu.h was updated not to include slab.h.
7. Build test were done on the following configurations and failures
were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my
distributed build env didn't work with gcov compiles) and a few
more options had to be turned off depending on archs to make things
build (like ipr on powerpc/64 which failed due to missing writeq).
* x86 and x86_64 UP and SMP allmodconfig and a custom test config.
* powerpc and powerpc64 SMP allmodconfig
* sparc and sparc64 SMP allmodconfig
* ia64 SMP allmodconfig
* s390 SMP allmodconfig
* alpha SMP allmodconfig
* um on x86_64 SMP allmodconfig
8. percpu.h modifications were reverted so that it could be applied as
a separate patch and serve as bisection point.
Given the fact that I had only a couple of failures from tests on step
6, I'm fairly confident about the coverage of this conversion patch.
If there is a breakage, it's likely to be something in one of the arch
headers which should be easily discoverable easily on most builds of
the specific arch.
Signed-off-by: Tejun Heo <tj@kernel.org>
Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
Make remaining netlink policies as const.
Fixup coding style where needed.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This makes generic netlink network namespace aware. No
generic netlink families except for the controller family
are made namespace aware, they need to be checked one by
one and then set the family->netnsok member to true.
A new function genlmsg_multicast_netns() is introduced to
allow sending a multicast message in a given namespace,
for example when it applies to an object that lives in
that namespace, a new function genlmsg_multicast_allns()
to send a message to all network namespaces (for objects
that do not have an associated netns).
The function genlmsg_multicast() is changed to multicast
the message in just init_net, which is currently correct
for all generic netlink families since they only work in
init_net right now. Some will later want to work in all
net namespaces because they do not care about the netns
at all -- those will have to be converted to use one of
the new functions genlmsg_multicast_allns() or
genlmsg_multicast_netns() whenever they are made netns
aware in some way.
After this patch families can easily decide whether or
not they should be available in all net namespaces. Many
genl families us it for objects not related to networking
and should therefore be available in all namespaces, but
that will have to be done on a per family basis.
Note that this doesn't touch on the checkpoint/restart
problem where network namespaces could be used, genl
families and multicast groups are numbered globally and
I see no easy way of changing that, especially since it
must be possible to multicast to all network namespaces
for those families that do not care about netns.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Impact: Reduce stack usage, use new cpumask API.
Mainly changing cpumask_t to 'struct cpumask' and similar simple API
conversion. Two conversions worth mentioning:
1) we use cpumask_any_but to avoid a temporary in kernel/softlockup.c,
2) Use cpumask_var_t in taskstats_user_cmd().
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Mike Travis <travis@sgi.com>
Cc: Balbir Singh <balbir@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@redhat.com>