49eea524bebea0d2b7dfa1c709a6694de808eb8a
79 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
16b6ed19fc |
Merge android-4.9.87 (a290494) into msm-4.9
* refs/heads/tmp-a290494: Linux 4.9.87 btrfs: preserve i_mode if __btrfs_set_acl() fails bpf, ppc64: fix out of bounds access in tail call bpf: add schedule points in percpu arrays management bpf, arm64: fix out of bounds access in tail call bpf, x64: implement retpoline for tail call bpf: fix mlock precharge on arraymaps bpf: fix wrong exposure of map_flags into fdinfo for lpm mpls, nospec: Sanitize array index in mpls_label_ok() net: mpls: Pull common label check into helper sctp: verify size of a new chunk in _sctp_make_chunk() s390/qeth: fix IPA command submission race s390/qeth: fix IP address lookup for L3 devices s390/qeth: fix double-free on IP add/remove race s390/qeth: fix IP removal on offline cards s390/qeth: fix overestimated count of buffer elements s390/qeth: fix SETIP command handling s390/qeth: fix underestimated count of buffer elements sctp: fix dst refcnt leak in sctp_v6_get_dst() tcp_bbr: better deal with suboptimal GSO rxrpc: Fix send in rxrpc_send_data_packet() tcp: Honor the eor bit in tcp_mtu_probe net: phy: fix phy_start to consider PHY_IGNORE_INTERRUPT mlxsw: spectrum_switchdev: Check success of FDB add operation sctp: fix dst refcnt leak in sctp_v4_get_dst udplite: fix partial checksum initialization ppp: prevent unregistered channels from connecting to PPP units netlink: ensure to loop over all netns in genlmsg_multicast_allns() net: ipv4: don't allow setting net.ipv4.route.min_pmtu below 68 net: fix race on decreasing number of TX queues ipv6 sit: work around bogus gcc-8 -Wrestrict warning hdlc_ppp: carrier detect ok, don't turn off negotiation fib_semantics: Don't match route with mismatching tclassid bridge: check brport attr show in brport_show x86/apic/vector: Handle legacy irq data correctly netlink: put module reference if dump start fails md: only allow remove_and_add_spares when no sync_thread running. x86/speculation: Use Indirect Branch Prediction Barrier in context switch x86/mm: Give each mm TLB flush generation a unique ID ARM: dts: LogicPD Torpedo: Fix I2C1 pinmux ARM: dts: LogicPD SOM-LV: Fix I2C1 pinmux dm io: fix duplicate bio completion due to missing ref count PCI/ASPM: Deal with missing root ports in link state handling KVM/VMX: Optimize vmx_vcpu_run() and svm_vcpu_run() by marking the RDMSR path as unlikely() KVM/x86: Remove indirect MSR op calls from SPEC_CTRL KVM: mmu: Fix overlap between public and private memslots ARM: kvm: fix building with gcc-8 ARM: mvebu: Fix broken PL310_ERRATA_753970 selects nospec: Allow index argument to have const-qualified type media: m88ds3103: don't call a non-initalized function x86/platform/intel-mid: Handle Intel Edison reboot correctly x86/xen: Zero MSR_IA32_SPEC_CTRL before suspend dax: fix vma_is_fsdax() helper cpufreq: s3c24xx: Fix broken s3c_cpufreq_init() parisc: Fix ordering of cache and TLB flushes timers: Forward timer base before migrating timers ALSA: hda - Fix pincfg at resume on Lenovo T470 dock ALSA: hda: Add a power_save blacklist ALSA: usb-audio: Add a quirck for B&W PX headphones tpm-dev-common: Reject too short writes tpm_tis_spi: Use DMA-safe memory for SPI transfers tpm: constify transmit data pointers tpm_tis: fix potential buffer overruns caused by bit glitches on the bus tpm_i2c_nuvoton: fix potential buffer overruns caused by bit glitches on the bus tpm_i2c_infineon: fix potential buffer overruns caused by bit glitches on the bus tpm: st33zp24: fix potential buffer overruns caused by bit glitches on the bus FROMLIST: ARM: amba: Don't read past the end of sysfs "driver_override" buffer UPSTREAM: ANDROID: binder: remove WARN() for redundant txn error Conflicts: kernel/time/timer.c Change-Id: I302546c52a480e9a4c661accf021766c499739b9 Signed-off-by: Blagovest Kolenichev <bkolenichev@codeaurora.org> |
||
|
|
13e75c74cd |
timers: Forward timer base before migrating timers
commit c52232a49e203a65a6e1a670cd5262f59e9364a0 upstream.
On CPU hotunplug the enqueued timers of the unplugged CPU are migrated to a
live CPU. This happens from the control thread which initiated the unplug.
If the CPU on which the control thread runs came out from a longer idle
period then the base clock of that CPU might be stale because the control
thread runs prior to any event which forwards the clock.
In such a case the timers from the unplugged CPU are queued on the live CPU
based on the stale clock which can cause large delays due to increased
granularity of the outer timer wheels which are far away from base:;clock.
But there is a worse problem than that. The following sequence of events
illustrates it:
- CPU0 timer1 is queued expires = 59969 and base->clk = 59131.
The timer is queued at wheel level 2, with resulting expiry time = 60032
(due to level granularity).
- CPU1 enters idle @60007, with next timer expiry @60020.
- CPU0 is hotplugged at @60009
- CPU1 exits idle and runs the control thread which migrates the
timers from CPU0
timer1 is now queued in level 0 for immediate handling in the next
softirq because the requested expiry time 59969 is before CPU1 base->clk
60007
- CPU1 runs code which forwards the base clock which succeeds because the
next expiring timer. which was collected at idle entry time is still set
to 60020.
So it forwards beyond 60007 and therefore misses to expire the migrated
timer1. That timer gets expired when the wheel wraps around again, which
takes between 63 and 630ms depending on the HZ setting.
Address both problems by invoking forward_timer_base() for the control CPUs
timer base. All other places, which might run into a similar problem
(mod_timer()/add_timer_on()) already invoke forward_timer_base() to avoid
that.
[ tglx: Massaged comment and changelog ]
Fixes:
|
||
|
|
ea6a32473b |
Revert "time: Run deferrable timers on other CPUs when tick_do_timer_cpu is busy"
Only Watchdog was using deferrable timers, but it is not using deferrable
any more. So reverting the change and let tick_do_timer_cpu run the
deferrable timers.
This reverts 'commit
|
||
|
|
b7e257f0a6 |
Merge "Merge android-4.9-o.80 (a9fd318) into msm-4.9"
|
||
|
|
4fe122d73b |
time: Run deferrable timers on other CPUs when tick_do_timer_cpu is busy
The deferrable timers processing is done by the tick_do_timer_cpu. If softirqs are deferred to the ksoftirqd task on this CPU, it may take longer time to process the deferrable timers under heavy load scenarios. Allow deferrable timers to be processed on other CPUs when ksoftirqd is active on the tick_do_timer_cpu. Change-Id: Ic7b39415b5efd3239a28e41460708a3bcfdb47b2 Signed-off-by: Pavankumar Kondeti <pkondeti@codeaurora.org> |
||
|
|
ec23108aed |
Merge android-4.9-o.78 (bf3b339) into msm-4.9
* refs/heads/tmp-bf3b339: Linux 4.9.78 MIPS: AR7: ensure the port type's FCR value is used x86/retpoline: Optimize inline assembler for vmexit_fill_RSB x86/pti: Document fix wrong index kprobes/x86: Disable optimizing on the function jumps to indirect thunk kprobes/x86: Blacklist indirect thunk functions for kprobes retpoline: Introduce start/end markers of indirect thunk x86/mce: Make machine check speculation protected usbip: fix warning in vhci_hcd_probe/lockdep_init_map x86/cpu, x86/pti: Do not enable PTI on AMD processors arm64: KVM: Fix SMCCC handling of unimplemented SMC/HVC calls dm thin metadata: THIN_MAX_CONCURRENT_LOCKS should be 6 dm btree: fix serious bug in btree_split_beneath() workqueue: avoid hard lockups in show_workqueue_state() libata: apply MAX_SEC_1024 to all LITEON EP1 series devices proc: fix coredump vs read /proc/*/stat race scripts/gdb/linux/tasks.py: fix get_thread_info can: peak: fix potential bug in packet fragmentation ARM: dts: kirkwood: fix pin-muxing of MPP7 on OpenBlocks A7 ARM: sunxi_defconfig: Enable CMA phy: work around 'phys' references to usb-nop-xceiv devices tracing: Fix converting enum's from the map in trace_event_eval_update() Input: twl4030-vibra - fix sibling-node lookup Input: twl6040-vibra - fix child-node lookup Input: 88pm860x-ts - fix child-node lookup Input: ALPS - fix multi-touch decoding on SS4 plus touchpads perf tools: Fix build with ARCH=x86_64 x86/apic/vector: Fix off by one in error path pipe: avoid round_pipe_size() nr_pages overflow on 32-bit x86/tsc: Fix erroneous TSC rate on Skylake Xeon x86/mm/pkeys: Fix fill_sig_info_pkey module: Add retpoline tag to VERMAGIC x86/cpufeature: Move processor tracing out of scattered features objtool: Improve error message for bad file argument x86/retpoline: Add LFENCE to the retpoline/RSB filling RSB macros x86/retpoline: Fill RSB on context switch for affected CPUs sched/deadline: Zero out positive runtime after throttling constrained tasks scsi: hpsa: fix volume offline state iser-target: Fix possible use-after-free in connection establishment error af_key: fix buffer overread in parse_exthdrs() af_key: fix buffer overread in verify_address_len() timers: Unconditionally check deferrable base ALSA: hda - Apply the existing quirk to iMac 14,1 ALSA: hda - Apply headphone noise quirk for another Dell XPS 13 variant ALSA: pcm: Remove yet superfluous WARN_ON() ALSA: seq: Make ioctls race-free futex: Prevent overflow by strengthen input validation scsi: sg: disable SET_FORCE_LOW_DMA libnvdimm, btt: Fix an incompatibility in the log layout FROMLIST: arm64: kpti: Fix the interaction between ASID switching and software PAN FROMLIST: arm64: Move post_ttbr_update_workaround to C code Conflicts: arch/arm64/include/asm/efi.h arch/arm64/include/asm/mmu_context.h arch/arm64/mm/context.c drivers/scsi/sg.c kernel/workqueue.c Change-Id: Icbdef53178fe3e325386cbae73edea918d23f519 Signed-off-by: Blagovest Kolenichev <bkolenichev@codeaurora.org> |
||
|
|
676109b28c |
timers: Unconditionally check deferrable base
commit ed4bbf7910b28ce3c691aef28d245585eaabda06 upstream.
When the timer base is checked for expired timers then the deferrable base
must be checked as well. This was missed when making the deferrable base
independent of base::nohz_active.
Fixes: ced6d5c11d3e ("timers: Use deferrable base independent of base::nohz_active")
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Anna-Maria Gleixner <anna-maria@linutronix.de>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sebastian Siewior <bigeasy@linutronix.de>
Cc: Paul McKenney <paulmck@linux.vnet.ibm.com>
Cc: rt@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
||
|
|
42d425962e |
Merge android-4.9-o.74 (127372f) into msm-4.9
* refs/heads/tmp-127372f: Linux 4.9.74 mm/vmstat: Make NR_TLB_REMOTE_FLUSH_RECEIVED available even on UP tty: fix tty_ldisc_receive_buf() documentation n_tty: fix EXTPROC vs ICANON interaction with TIOCINQ (aka FIONREAD) x86/smpboot: Remove stale TLB flush invocations nohz: Prevent a timer interrupt storm in tick_nohz_stop_sched_tick() timers: Reinitialize per cpu bases on hotplug timers: Invoke timer_start_debug() where it makes sense timers: Use deferrable base independent of base::nohz_active usb: xhci: Add XHCI_TRUST_TX_LENGTH for Renesas uPD720201 USB: Fix off by one in type-specific length check of BOS SSP capability usb: add RESET_RESUME for ELSA MicroLink 56K usb: Add device quirk for Logitech HD Pro Webcam C925e USB: serial: option: adding support for YUGA CLM920-NC5 USB: serial: option: add support for Telit ME910 PID 0x1101 USB: serial: qcserial: add Sierra Wireless EM7565 USB: serial: ftdi_sio: add id for Airbus DS P8GR usbip: vhci: stop printing kernel pointer addresses in messages usbip: stub: stop printing kernel pointer addresses in messages usbip: prevent leaking socket pointer address in messages usbip: fix usbip bind writing random string after command in match_busid s390/qeth: update takeover IPs after configuration change s390/qeth: lock IP table while applying takeover changes s390/qeth: don't apply takeover changes to RXIP s390/qeth: apply takeover changes when mode is toggled net/mlx5: Fix error flow in CREATE_QP command net/mlx5e: Prevent possible races in VXLAN control flow net/mlx5e: Add refcount to VXLAN structure net/mlx5e: Fix possible deadlock of VXLAN lock net/mlx5e: Fix features check of IPv6 traffic net/mlx5: Fix rate limit packet pacing naming and struct tcp: invalidate rate samples during SACK reneging sock: free skb in skb_complete_tx_timestamp on error net: phy: micrel: ksz9031: reconfigure autoneg after phy autoneg workaround net: Fix double free and memory corruption in get_net_ns_by_id() net: fec: Allow reception of frames bigger than 1522 bytes net: bridge: fix early call to br_stp_change_bridge_id and plug newlink leaks ipv4: Fix use-after-free when flushing FIB tables adding missing rcu_read_unlock in ipxip6_rcv sctp: Replace use of sockets_allocated with specified macro. net: mvmdio: disable/unprepare clocks in EPROBE_DEFER case net: ipv4: fix for a race condition in raw_sendmsg tg3: Fix rx hang on MTU change with 5717/5719 tcp md5sig: Use skb's saddr when replying to an incoming segment tcp_bbr: record "full bw reached" decision in new full_bw_reached bit RDS: Check cmsg_len before dereferencing CMSG_DATA ptr_ring: add barriers net: reevalulate autoflowlabel setting after sysctl setting net: qmi_wwan: add Sierra EM7565 1199:9091 netlink: Add netns check on taps net: igmp: Use correct source address on IGMPv3 reports net: fec: unmap the xmit buffer that are not transferred by DMA ipv6: mcast: better catch silly mtu values ipv4: igmp: guard against silly MTU values kbuild: add '-fno-stack-check' to kernel build options x86/mm/64: Fix reboot interaction with CR4.PCIDE x86/mm: Enable CR4.PCIDE on supported systems x86/mm: Add the 'nopcid' boot option to turn off PCID x86/mm: Disable PCID on 32-bit kernels x86/mm: Remove the UP asm/tlbflush.h code, always use the (formerly) SMP code x86/mm: Reimplement flush_tlb_page() using flush_tlb_mm_range() x86/mm: Make flush_tlb_mm_range() more predictable x86/mm: Remove flush_tlb() and flush_tlb_current_task() x86/vm86/32: Switch to flush_tlb_mm_range() in mark_screen_rdonly() ALSA: hda - fix headset mic detection issue on a Dell machine ALSA: hda: Drop useless WARN_ON() ASoC: tlv320aic31xx: Fix GPIO1 register definition ASoC: twl4030: fix child-node lookup ASoC: fsl_ssi: AC'97 ops need regmap, clock and cleaning up on failure ASoC: da7218: fix fix child-node lookup ASoC: wm_adsp: Fix validation of firmware and coeff lengths iw_cxgb4: Only validate the MSN for successful completions ring-buffer: Mask out the info bits when returning buffer page length tracing: Fix crash when it fails to alloc ring buffer tracing: Fix possible double free on failure of allocating trace buffer tracing: Remove extra zeroing out of the ring buffer page sync objtool's copy of x86-opcode-map.txt Conflicts: include/linux/cpuhotplug.h kernel/time/timer.c Change-Id: I0198e2b75715d13acd86237321966774cd6d9f1d Signed-off-by: Blagovest Kolenichev <bkolenichev@codeaurora.org> |
||
|
|
249d4a9b32 |
timers: Reinitialize per cpu bases on hotplug
commit 26456f87aca7157c057de65c9414b37f1ab881d1 upstream.
The timer wheel bases are not (re)initialized on CPU hotplug. That leaves
them with a potentially stale clk and next_expiry valuem, which can cause
trouble then the CPU is plugged.
Add a prepare callback which forwards the clock, sets next_expiry to far in
the future and reset the control flags to a known state.
Set base->must_forward_clk so the first timer which is queued will try to
forward the clock to current jiffies.
Fixes:
|
||
|
|
574e543ff9 |
timers: Invoke timer_start_debug() where it makes sense
commit fd45bb77ad682be728d1002431d77b8c73342836 upstream.
The timer start debug function is called before the proper timer base is
set. As a consequence the trace data contains the stale CPU and flags
values.
Call the debug function after setting the new base and flags.
Fixes:
|
||
|
|
d840687aa8 |
timers: Use deferrable base independent of base::nohz_active
commit ced6d5c11d3e7b342f1a80f908e6756ebd4b8ddd upstream.
During boot and before base::nohz_active is set in the timer bases, deferrable
timers are enqueued into the standard timer base. This works correctly as
long as base::nohz_active is false.
Once it base::nohz_active is set and a timer which was enqueued before that
is accessed the lock selector code choses the lock of the deferred
base. This causes unlocked access to the standard base and in case the
timer is removed it does not clear the pending flag in the standard base
bitmap which causes get_next_timer_interrupt() to return bogus values.
To prevent that, the deferrable timers must be enqueued in the deferrable
base, even when base::nohz_active is not set. Those deferrable timers also
need to be expired unconditional.
Fixes:
|
||
|
|
219fe504c6 |
kernel: time: Fix low resolution timer not fire in 32bit case
Low resolution timer is not fired to run at given expired time which is equal to this cpu's base clk time. This is caused by 32bit intergar overflow. When pos_up is 0, and pos_down is -1, with unsigned add, it will not seen pos_up + clk is bigger than pos_down + clk. So add an cast to u64 to have the expected result. Change-Id: I45777a1fd282d8f70ba94528b04fce2f0436d7e4 Signed-off-by: Maria Yu <aiquny@codeaurora.org> |
||
|
|
cd02e634b8 |
Merge remote-tracking branch '4.9/tmp-379e3b2' into 4.9
* 4.9/tmp-379e3b2:
ANDROID: binder: fix transaction leak.
ANDROID: binder: Add tracing for binder priority inheritance.
Linux 4.9.53
swiotlb-xen: implement xen_swiotlb_dma_mmap callback
video: fbdev: aty: do not leak uninitialized padding in clk to userspace
KVM: VMX: use cmpxchg64
cxl: Fix driver use count
KVM: VMX: remove WARN_ON_ONCE in kvm_vcpu_trigger_posted_interrupt
KVM: VMX: do not change SN bit in vmx_update_pi_irte()
timer/sysclt: Restrict timer migration sysctl values to 0 and 1
gfs2: Fix debugfs glocks dump
x86/fpu: Don't let userspace set bogus xcomp_bv
x86/mm: Fix fault error path using unsafe vma pointer
btrfs: prevent to set invalid default subvolid
btrfs: propagate error to btrfs_cmp_data_prepare caller
btrfs: fix NULL pointer dereference from free_reloc_roots()
PCI: Fix race condition with driver_override
etnaviv: fix gem object list corruption
xfs: validate bdev support for DAX inode flag
kvm: nVMX: Don't allow L2 to access the hardware CR8
KVM: VMX: Do not BUG() on out-of-bounds guest IRQ
kvm/x86: Handle async PF in RCU read-side critical sections
KVM: VMX: simplify and fix vmx_vcpu_pi_load
KVM: VMX: avoid double list add with VT-d posted interrupts
KVM: VMX: extract __pi_post_block
arm64: fault: Route pte translation faults via do_translation_fault
arm64: Make sure SPsel is always set
seccomp: fix the usage of get/put_seccomp_filter() in seccomp_get_filter()
selftests/seccomp: Support glibc 2.26 siginfo_t.h
iw_cxgb4: put ep reference in pass_accept_req()
iw_cxgb4: remove the stid on listen create failure
bsg-lib: don't free job in bsg_prepare_job
nl80211: check for the required netlink attributes presence
vfs: Return -ENXIO for negative SEEK_HOLE / SEEK_DATA offsets
SMB3: Don't ignore O_SYNC/O_DSYNC and O_DIRECT flags
SMB: Validate negotiate (to protect against downgrade) even if signing off
SMB3: Warn user if trying to sign connection that authenticated as guest
Fix SMB3.1.1 guest authentication to Samba
PM: core: Fix device_pm_check_callbacks()
s390/mm: fix write access check in gup_huge_pmd()
powerpc/ftrace: Pass the correct stack pointer for DYNAMIC_FTRACE_WITH_REGS
powerpc/tm: Flush TM only if CPU has TM feature
powerpc/pseries: Fix parent_dn reference leak in add_dt_node()
KEYS: prevent KEYCTL_READ on negative key
KEYS: prevent creating a different user's keyrings
KEYS: fix writing past end of user-supplied buffer in keyring_read()
security/keys: rewrite all of big_key crypto
security/keys: properly zero out sensitive key material in big_key
crypto: talitos - fix hashing
crypto: talitos - fix sha224
crypto: talitos - Don't provide setkey for non hmac hashing algs.
crypto: drbg - fix freeing of resources
drm/radeon: disable hard reset in hibernate for APUs
scsi: scsi_transport_iscsi: fix the issue that iscsi_if_rx doesn't parse nlmsg properly
md/raid5: preserve STRIPE_ON_UNPLUG_LIST in break_stripe_batch_list
md/raid5: fix a race condition in stripe batch
tracing: Erase irqsoff trace with empty write
tracing: Fix trace_pipe behavior for instance traces
KVM: PPC: Book3S HV: Protect updates to spapr_tce_tables list
KVM: PPC: Book3S: Fix race and leak in kvm_vm_ioctl_create_spapr_tce()
genirq: Make sparse_irq_lock protect what it should protect
mac80211: flush hw_roc_start work before cancelling the ROC
mac80211_hwsim: Use proper TX power
mac80211: fix VLAN handling with TXQs
fs/proc: Report eip/esp in /prod/PID/stat for coredumping
cifs: release auth_key.response for reconnect.
cifs: release cifs root_cred after exit_cifs
ANDROID: add script to fetch android kernel config fragments
FROMLIST: binder: fix use-after-free in binder_transaction()
UPSTREAM: ipv6: fib: Unlink replaced routes from their nodes
Linux 4.9.52
bcache: fix bch_hprint crash and improve output
bcache: fix for gc and write-back race
bcache: Correct return value for sysfs attach errors
bcache: correct cache_dirty_target in __update_writeback_rate()
bcache: do not subtract sectors_to_gc for bypassed IO
bcache: Fix leak of bdev reference
bcache: initialize dirty stripes in flash_dev_run()
PM / devfreq: Fix memory leak when fail to register device
media: uvcvideo: Prevent heap overflow when accessing mapped controls
media: v4l2-compat-ioctl32: Fix timespec conversion
s390/mm: fix race on mm->context.flush_mm
s390/mm: fix local TLB flushing vs. detach of an mm address space
net/netfilter/nf_conntrack_core: Fix net_conntrack_lock()
PCI: pciehp: Report power fault only once until we clear it
PCI: shpchp: Enable bridge bus mastering if MSI is enabled
ARC: Re-enable MMU upon Machine Check exception
tracing: Apply trace_clock changes to instance max buffer
tracing: Add barrier to trace_printk() buffer nesting modification
ftrace: Fix memleak when unregistering dynamic ops when tracing disabled
ftrace: Fix selftest goto location on error
scsi: qla2xxx: Fix an integer overflow in sysfs code
scsi: qla2xxx: Correction to vha->vref_count timeout
scsi: sg: fixup infoleak when using SG_GET_REQUEST_TABLE
scsi: sg: factor out sg_fill_request_table()
scsi: sg: off by one in sg_ioctl()
scsi: sg: use standard lists for sg_requests
scsi: sg: remove 'save_scat_len'
scsi: storvsc: fix memory leak on ring buffer busy
scsi: megaraid_sas: Return pended IOCTLs with cmd_status MFI_STAT_WRONG_STATE in case adapter is dead
scsi: megaraid_sas: Check valid aen class range to avoid kernel panic
scsi: megaraid_sas: set minimum value of resetwaittime to be 1 secs
scsi: zfcp: trace high part of "new" 64 bit SCSI LUN
scsi: zfcp: trace HBA FSF response by default on dismiss or timedout late response
scsi: zfcp: fix payload with full FCP_RSP IU in SCSI trace records
scsi: zfcp: fix missing trace records for early returns in TMF eh handlers
scsi: zfcp: fix passing fsf_req to SCSI trace on TMF to correlate with HBA
scsi: zfcp: fix capping of unsuccessful GPN_FT SAN response trace records
scsi: zfcp: add handling for FCP_RESID_OVER to the fcp ingress path
scsi: zfcp: fix queuecommand for scsi_eh commands when DIX enabled
skd: Submit requests to firmware before triggering the doorbell
skd: Avoid that module unloading triggers a use-after-free
md/bitmap: disable bitmap_resize for file-backed bitmaps.
block: Relax a check in blk_start_queue()
powerpc: Fix DAR reporting when alignment handler faults
ext4: fix quota inconsistency during orphan cleanup for read-only mounts
ext4: fix incorrect quotaoff if the quota feature is enabled
crypto: AF_ALG - remove SGL terminator indicator when chaining
crypto: ccp - Fix XTS-AES-128 support on v5 CCPs
MIPS: math-emu: <MADDF|MSUBF>.D: Fix accuracy (64-bit case)
MIPS: math-emu: <MADDF|MSUBF>.S: Fix accuracy (32-bit case)
MIPS: math-emu: <MADDF|MSUBF>.<D|S>: Clean up "maddf_flags" enumeration
MIPS: math-emu: <MADDF|MSUBF>.<D|S>: Fix some cases of zero inputs
MIPS: math-emu: <MADDF|MSUBF>.<D|S>: Fix some cases of infinite inputs
MIPS: math-emu: <MADDF|MSUBF>.<D|S>: Fix NaN propagation
MIPS: math-emu: Handle zero accumulator case in MADDF and MSUBF separately
MIPS: math-emu: MINA.<D|S>: Fix some cases of infinity and zero inputs
MIPS: math-emu: <MAXA|MINA>.<D|S>: Fix cases of both infinite inputs
MIPS: math-emu: <MAXA|MINA>.<D|S>: Fix cases of input values with opposite signs
MIPS: math-emu: <MAX|MIN>.<D|S>: Fix cases of both inputs negative
MIPS: math-emu: <MAX|MAXA|MIN|MINA>.<D|S>: Fix cases of both inputs zero
MIPS: math-emu: <MAX|MAXA|MIN|MINA>.<D|S>: Fix quiet NaN propagation
Input: i8042 - add Gigabyte P57 to the keyboard reset table
pinctrl/amd: save pin registers over suspend/resume
tty: fix __tty_insert_flip_char regression
tty: improve tty_insert_flip_char() slow path
tty: improve tty_insert_flip_char() fast path
IB/addr: Fix setting source address in addr6_resolve()
drm/sun4i: Implement drm_driver lastclose to restore fbdev console
IB/{qib, hfi1}: Avoid flow control testing for RDMA write operation
orangefs: Don't clear SGID when inheriting ACLs
mm: prevent double decrease of nr_reserved_highatomic
NFSv4: Fix callback server shutdown
SUNRPC: Refactor svc_set_num_threads()
UPSTREAM: drm/atomic: Handle -EDEADLK with out-fences correctly
UPSTREAM: sched/fair: Fix FTQ noise bench regression
UPSTREAM: fib_rules: fix error return code
UPSTREAM: ipv4: add missing initialization for flowi4_uid
ANDROID: Squashfs: optimize reading uncompressed data
ANDROID: Squashfs: implement .readpages()
ANDROID: Squashfs: replace buffer_head with BIO
ANDROID: Squashfs: refactor page_actor
ANDROID: Squashfs: remove the FILE_CACHE option
FROMLIST: android: binder: Don't get mm from task
FROMLIST: android: binder: Remove unused vma argument
FROMLIST: android: binder: Drop lru lock in isolate callback
ANDROID: Use sk_uid to replace uid get from socket file
ANDROID: nf: xt_qtaguid: fix handling for cases where tunnels are used.
Revert "ANDROID: Use sk_uid to replace uid get from socket file"
ANDROID: USB gadget: mtp: Fix hang in ioctl(MTP_RECEIVE_FILE) for WritePartialObject
Conflicts:
drivers/android/binder_alloc.c
drivers/media/v4l2-core/v4l2-compat-ioctl32.c
drivers/scsi/sg.c
drivers/usb/gadget/function/f_mtp.c
net/netfilter/xt_qtaguid.c
net/wireless/nl80211.c
Change-Id: I6af673bd4b920bb229fe238a3e96b2330fa18263
Signed-off-by: Kyle Yan <kyan@codeaurora.org>
|
||
|
|
4c00015385 |
timer/sysclt: Restrict timer migration sysctl values to 0 and 1
commit b94bf594cf8ed67cdd0439e70fa939783471597a upstream. timer_migration sysctl acts as a boolean switch, so the allowed values should be restricted to 0 and 1. Add the necessary extra fields to the sysctl table entry to enforce that. [ tglx: Rewrote changelog ] Signed-off-by: Myungho Jung <mhjungk@gmail.com> Link: http://lkml.kernel.org/r/1492640690-3550-1-git-send-email-mhjungk@gmail.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Kazuhiro Hayashi <kazuhiro3.hayashi@toshiba.co.jp> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
||
|
|
7d337cc7f9 |
Merge remote-tracking branch '4.9/tmp-85e1c01' into 4.9
* 4.9/tmp-85e1c01:
Linux 4.9.48
epoll: fix race between ep_poll_callback(POLLFREE) and ep_free()/ep_remove()
kvm: arm/arm64: Force reading uncached stage2 PGD
drm/ttm: Fix accounting error when fail to get pages for pool
xfrm: policy: check policy direction value
lib/mpi: kunmap after finishing accessing buffer
wl1251: add a missing spin_lock_init()
CIFS: remove endian related sparse warning
CIFS: Fix maximum SMB2 header size
alpha: uapi: Add support for __SANE_USERSPACE_TYPES__
cpuset: Fix incorrect memory_pressure control file mapping
cpumask: fix spurious cpumask_of_node() on non-NUMA multi-node configs
ceph: fix readpage from fscache
mm, madvise: ensure poisoned pages are removed from per-cpu lists
mm, uprobes: fix multiple free of ->uprobes_state.xol_area
crypto: algif_skcipher - only call put_page on referenced and used pages
i2c: ismt: Return EMSGSIZE for block reads with bogus length
i2c: ismt: Don't duplicate the receive length for block reads
irqchip: mips-gic: SYNC after enabling GIC region
ANDROID: fiq_debugger: Fix minor bug in code
ANDROID: configs: remove requirement for CONFIG_SYNC
FROMLIST: binder: fix an ret value override
FROMLIST: binder: fix memory corruption in binder_transaction binder
Linux 4.9.47
lz4: fix bogus gcc warning
scsi: sg: reset 'res_in_use' after unlinking reserved array
scsi: sg: protect accesses to 'reserved' page array
locking/spinlock/debug: Remove spinlock lockup detection code
arm64: fpsimd: Prevent registers leaking across exec
x86/io: Add "memory" clobber to insb/insw/insl/outsb/outsw/outsl
arm64: mm: abort uaccess retries upon fatal signal
kvm: arm/arm64: Fix race in resetting stage2 PGD
gcov: support GCC 7.1
staging: wilc1000: simplify vif[i]->ndev accesses
scsi: isci: avoid array subscript warning
p54: memset(0) whole array
FROMLIST: android: binder: Add page usage in binder stats
FROMLIST: android: binder: Add shrinker tracepoints
FROMLIST: android: binder: Add global lru shrinker to binder
FROMLIST: android: binder: Move buffer out of area shared with user space
FROMLIST: android: binder: Add allocator selftest
FROMLIST: android: binder: Refactor prev and next buffer into a helper function
android: android-base.config: enable IP6_NF_MATCH_RPFILTER
Linux 4.9.46
powerpc/mm: Ensure cpumask update is ordered
ACPI: EC: Fix regression related to wrong ECDT initialization order
ACPI / APEI: Add missing synchronize_rcu() on NOTIFY_SCI removal
ACPI: ioapic: Clear on-stack resource before using it
ntb: transport shouldn't disable link due to bogus values in SPADs
ntb: ntb_test: ensure the link is up before trying to configure the mws
ntb: no sleep in ntb_async_tx_submit
NTB: ntb_test: fix bug printing ntb_perf results
ntb_transport: fix bug calculating num_qps_mw
ntb_transport: fix qp count bug
Clarify (and fix) MAX_LFS_FILESIZE macros
staging: rtl8188eu: add RNX-N150NUB support
iio: hid-sensor-trigger: Fix the race with user space powering up sensors
iio: imu: adis16480: Fix acceleration scale factor for adis16480
ANDROID: binder: fix proc->tsk check.
binder: Use wake up hint for synchronous transactions.
binder: use group leader instead of open thread
Revert "android: binder: Sanity check at binder ioctl"
Bluetooth: bnep: fix possible might sleep error in bnep_session
Bluetooth: cmtp: fix possible might sleep error in cmtp_session
Bluetooth: hidp: fix possible might sleep error in hidp_session_thread
netfilter: nat: fix src map lookup
Revert "leds: handle suspend/resume in heartbeat trigger"
net: sunrpc: svcsock: fix NULL-pointer exception
x86/mm: Fix use-after-free of ldt_struct
timers: Fix excessive granularity of new timers after a nohz idle
perf/x86/intel/rapl: Make package handling more robust
perf probe: Fix --funcs to show correct symbols for offline module
perf/core: Fix group {cpu,task} validation
ftrace: Check for null ret_stack on profile function graph entry function
nfsd: Limit end of page list when decoding NFSv4 WRITE
cifs: return ENAMETOOLONG for overlong names in cifs_open()/cifs_lookup()
cifs: Fix df output for users with quota limits
kbuild: linker script do not match C names unless LD_DEAD_CODE_DATA_ELIMINATION is configured
tracing: Fix freeing of filter in create_filter() when set_str is false
tracing: Fix kmemleak in tracing_map_array_free()
tracing: Call clear_boot_tracer() at lateinit_sync
drm: rcar-du: Fix H/V sync signal polarity configuration
drm: rcar-du: Fix display timing controller parameter
drm: rcar-du: Fix crash in encoder failure error path
drm/atomic: If the atomic check fails, return its value first
drm: Release driver tracking before making the object available again
mm/memblock.c: reversed logic in memblock_discard()
fork: fix incorrect fput of ->exe_file causing use-after-free
mm/madvise.c: fix freeing of locked page with MADV_FREE
i2c: designware: Fix system suspend
mm, shmem: fix handling /sys/kernel/mm/transparent_hugepage/shmem_enabled
ARCv2: PAE40: Explicitly set MSB counterpart of SLC region ops addresses
ALSA: firewire: fix NULL pointer dereference when releasing uninitialized data of iso-resource
ALSA: hda - Add stereo mic quirk for Lenovo G50-70 (17aa:3978)
ALSA: core: Fix unexpected error at replacing user TLV
ALSA: usb-audio: Add delay quirk for H650e/Jabra 550a USB headsets
KVM: x86: block guest protection keys unless the host has them enabled
KVM: s390: sthyi: fix specification exception detection
KVM: s390: sthyi: fix sthyi inline assembly
Input: ALPS - fix two-finger scroll breakage in right side on ALPS touchpad
Input: elan_i2c - add ELAN0602 ACPI ID to support Lenovo Yoga310
Input: trackpoint - add new trackpoint firmware ID
bpf/verifier: fix min/max handling in BPF_SUB
bpf: fix mixed signed/unsigned derived min/max value bounds
bpf, verifier: fix alu ops against map_value{, _adj} register types
bpf: adjust verifier heuristics
bpf, verifier: add additional patterns to evaluate_reg_imm_alu
net_sched: fix order of queue length updates in qdisc_replace()
net: sched: fix NULL pointer dereference when action calls some targets
irda: do not leak initialized list.dev to userspace
net/mlx4_core: Enable 4K UAR if SRIOV module parameter is not enabled
tcp: when rearming RTO, if RTO time is in past then fire RTO ASAP
ipv6: repair fib6 tree in failure case
ipv6: reset fn->rr_ptr when replacing route
tipc: fix use-after-free
sctp: fully initialize the IPv6 address in sctp_v6_to_addr()
nfp: fix infinite loop on umapping cleanup
ipv4: better IP_MAX_MTU enforcement
ptr_ring: use kmalloc_array()
openvswitch: fix skb_panic due to the incorrect actions attrlen
bpf: fix bpf_trace_printk on 32 bit archs
net_sched: remove warning from qdisc_hash_add
net_sched/sfq: update hierarchical backlog when drop packet
ipv4: fix NULL dereference in free_fib_info_rcu()
dccp: defer ccid_hc_tx_delete() at dismantle time
dccp: purge write queue in dccp_destroy_sock()
af_key: do not use GFP_KERNEL in atomic contexts
sparc64: remove unnecessary log message
ANDROID: NFC: st21nfca: Fix memory OOB and leak issues in connectivity events handler
Linux 4.9.45
usb: qmi_wwan: add D-Link DWM-222 device ID
usb: optimize acpi companion search for usb port devices
pids: make task_tgid_nr_ns() safe
Sanitize 'move_pages()' permission checks
genirq/ipi: Fixup checks against nr_cpu_ids
genirq: Restore trigger settings in irq_modify_status()
irqchip/atmel-aic: Fix unbalanced refcount in aic_common_rtc_irq_fixup()
irqchip/atmel-aic: Fix unbalanced of_node_put() in aic_common_irq_fixup()
x86/asm/64: Clear AC on NMI entries
xen-blkfront: use a right index when checking requests
powerpc: Fix VSX enabling/flushing to also test MSR_FP and MSR_VEC
blk-mq-pci: add a fallback when pci_irq_get_affinity returns NULL
xen: fix bio vec merging
mm: revert x86_64 and arm64 ELF_ET_DYN_BASE base changes
mm/mempolicy: fix use after free when calling get_mempolicy
mm: fix double mmap_sem unlock on MMF_UNSTABLE enforced SIGBUS
mm: discard memblock data later
ALSA: usb-audio: Add mute TLV for playback volumes on C-Media devices
ALSA: usb-audio: Apply sample rate quirk to Sennheiser headset
ALSA: seq: 2nd attempt at fixing race creating a queue
Input: elan_i2c - Add antoher Lenovo ACPI ID for upcoming Lenovo NB
Input: elan_i2c - add ELAN0608 to the ACPI table
crypto: x86/sha1 - Fix reads beyond the number of blocks passed
crypto: ixp4xx - Fix error handling path in 'aead_perform()'
parisc: pci memory bar assignment fails with 64bit kernels on dino/cujo
audit: Fix use after free in audit_remove_watch_rule()
netfilter: nf_ct_ext: fix possible panic after nf_ct_extend_unregister
ANDROID: check dir value of xfrm_userpolicy_id
ANDROID: NFC: Fix possible memory corruption when handling SHDLC I-Frame commands
ANDROID: nfc: fdp: Fix possible buffer overflow in WCS4000 NFC driver
ANDROID: NFC: st21nfca: Fix out of bounds kernel access when handling ATR_REQ
ANDROID: usb: gadget: assign no-op request complete callbacks
ANDROID: usb: gadget: configfs: fix null ptr in android_disconnect
ANDROID: uid_sys_stats: Fix implicit declaration of get_cmdline()
uid_sys_stats: log task io with a debug flag
Linux 4.9.44
MIPS: DEC: Fix an int-handler.S CPU_DADDI_WORKAROUNDS regression
pinctrl: meson-gxbb: Add missing GPIODV_18 pin entry
pinctrl: samsung: Remove bogus irq_[un]mask from resource management
pinctrl: uniphier: fix WARN_ON() of pingroups dump on LD20
pinctrl: uniphier: fix WARN_ON() of pingroups dump on LD11
pinctrl: intel: merrifield: Correct UART pin lists
pinctrl: sunxi: add a missing function of A10/A20 pinctrl driver
pnfs/blocklayout: require 64-bit sector_t
iio: adc: vf610_adc: Fix VALT selection value for REFSEL bits
usb:xhci:Add quirk for Certain failing HP keyboard on reset after resume
usb: quirks: Add no-lpm quirk for Moshi USB to Ethernet Adapter
usb: core: unlink urbs from the tail of the endpoint's urb_list
USB: Check for dropped connection before switching to full speed
usb: renesas_usbhs: Fix UGCTRL2 value for R-Car Gen3
usb: gadget: udc: renesas_usb3: Fix usb_gadget_giveback_request() calling
uas: Add US_FL_IGNORE_RESIDUE for Initio Corporation INIC-3069
staging: comedi: comedi_fops: do not call blocking ops when !TASK_RUNNING
iio: light: tsl2563: use correct event code
iio: accel: bmc150: Always restore device to normal mode after suspend-resume
staging:iio:resolver:ad2s1210 fix negative IIO_ANGL_VEL read
USB: hcd: Mark secondary HCD as dead if the primary one died
usb: musb: fix tx fifo flush handling again
USB: serial: pl2303: add new ATEN device id
USB: serial: cp210x: add support for Qivicon USB ZigBee dongle
USB: serial: option: add D-Link DWM-222 device ID
drm/i915: Fix out-of-bounds array access in bdw_load_gamma_lut
drm/etnaviv: Fix off-by-one error in reloc checking
nfs/flexfiles: fix leak of nfs4_ff_ds_version arrays
mmc: mmc: correct the logic for setting HS400ES signal voltage
nand: fix wrong default oob layout for small pages using soft ecc
fuse: initialize the flock flag in fuse_file on allocation
target: Fix node_acl demo-mode + uncached dynamic shutdown regression
iscsi-target: Fix iscsi_np reset hung task during parallel delete
iscsi-target: fix memory leak in iscsit_setup_text_cmd()
mtd: nand: Fix timing setup for NANDs that do not support SET FEATURES
xtensa: don't limit csum_partial export by CONFIG_NET
xtensa: mm/cache: add missing EXPORT_SYMBOLs
xtensa: fix cache aliasing handling code for WT cache
futex: Remove unnecessary warning from get_futex_key
mm: fix list corruptions on shmem shrinklist
mm: ratelimit PFNs busy info message
ANDROID: Use sk_uid to replace uid get from socket file
Linux 4.9.43
Revert "ARM: dts: sun8i: Support DTB build for NanoPi M1"
KVM: arm/arm64: Handle hva aging while destroying the vm
sparc64: Prevent perf from running during super critical sections
udp: consistently apply ufo or fragmentation
revert "ipv4: Should use consistent conditional judgement for ip fragment in __ip_append_data and ip_finish_output"
revert "net: account for current skb length when deciding about UFO"
packet: fix tp_reserve race in packet_set_ring
igmp: Fix regression caused by igmp sysctl namespace code.
net: avoid skb_warn_bad_offload false positives on UFO
tcp: fastopen: tcp_connect() must refresh the route
net: sched: set xt_tgchk_param par.nft_compat as 0 in ipt_init_target
net/mlx4_en: don't set CHECKSUM_COMPLETE on SCTP packets
bpf, s390: fix jit branch offset related to ldimm64
net: fix keepalive code vs TCP_FASTOPEN_CONNECT
tcp: avoid setting cwnd to invalid ssthresh after cwnd reduction states
ppp: fix xmit recursion detection on ppp channels
ppp: Fix false xmit recursion detect with two ppp devices
Linux 4.9.42
workqueue: implicit ordered attribute should be overridable
net: phy: Fix PHY unbind crash
net: account for current skb length when deciding about UFO
ipv4: Should use consistent conditional judgement for ip fragment in __ip_append_data and ip_finish_output
net/mlx5: E-Switch, Re-enable RoCE on mode change only after FDB destroy
mm: don't dereference struct page fields of invalid pages
signal: protect SIGNAL_UNKILLABLE from unintentional clearing.
lib/Kconfig.debug: fix frv build failure
mm, slab: make sure that KMALLOC_MAX_SIZE will fit into MAX_ORDER
ARM: 8632/1: ftrace: fix syscall name matching
virtio_blk: fix panic in initialization error path
nbd: blk_mq_init_queue returns an error code on failure, not NULL
iw_cxgb4: do not send RX_DATA_ACK CPLs after close/abort
ARM: dts: sunxi: Change node name for pwrseq pin on Olinuxino-lime2-emmc
ARM: dts: sun8i: Support DTB build for NanoPi M1
drm/virtio: fix framebuffer sparse warning
scsi: qla2xxx: Get mutex lock before checking optrom_state
clk/samsung: exynos542x: mark some clocks as critical
ipv4: make tcp_notsent_lowat sysctl knob behave as true unsigned int
phy state machine: failsafe leave invalid RUNNING state
netfilter: use fwmark_reflect in nf_send_reset
ASoC: rt5645: set sel_i2s_pre_div1 to 2
spi: spi-axi: Free resources on error path
x86/boot: Add missing declaration of string functions
tg3: Fix race condition in tg3_get_stats64().
net: phy: dp83867: fix irq generation
sh_eth: R8A7740 supports packet shecksumming
sh_eth: fix EESIPR values for SH77{34|63}
wext: handle NULL extra data in iwe_stream_add_point better
sparc64: Fix exception handling in UltraSPARC-III memcpy.
sparc64: Measure receiver forward progress to avoid send mondo timeout
xen-netback: correctly schedule rate-limited queues
net: phy: Correctly process PHY_HALTED in phy_stop_machine()
net/mlx5e: Schedule overflow check work to mlx5e workqueue
net/mlx5e: Fix wrong delay calculation for overflow check scheduling
net/mlx5e: Fix outer_header_zero() check size
net/mlx5: Fix command bad flow on command entry allocation failure
net/mlx5: Consider tx_enabled in all modes on remap
sctp: fix the check for _sctp_walk_params and _sctp_walk_errors
sctp: don't dereference ptr before leaving _sctp_walk_{params, errors}()
dccp: fix a memleak for dccp_feat_init err process
dccp: fix a memleak that dccp_ipv4 doesn't put reqsk properly
dccp: fix a memleak that dccp_ipv6 doesn't put reqsk properly
net: ethernet: nb8800: Handle all 4 RGMII modes identically
ipv6: Don't increase IPSTATS_MIB_FRAGFAILS twice in ip6_fragment()
packet: fix use-after-free in prb_retire_rx_blk_timer_expired()
openvswitch: fix potential out of bound access in parse_ct
mcs7780: Fix initialization when CONFIG_VMAP_STACK is enabled
rtnetlink: allocate more memory for dev_set_mac_address()
ipv4: initialize fib_trie prior to register_netdev_notifier call.
net: dsa: b53: Add missing ARL entries for BCM53125
ipv6: avoid overflow of offset in ip6_find_1stfragopt
net: Zero terminate ifr_name in dev_ifname().
ipv4: ipv6: initialize treq->txhash in cookie_v[46]_check()
tcp_bbr: init pacing rate on first RTT sample
tcp_bbr: remove sk_pacing_rate=0 transient during init
tcp_bbr: introduce bbr_init_pacing_rate_from_rtt() helper
tcp_bbr: introduce bbr_bw_to_pacing_rate() helper
tcp_bbr: cut pacing rate only if filled pipe
saa7164: fix double fetch PCIe access condition
Btrfs: fix early ENOSPC due to delalloc
f2fs: sanity check checkpoint segno and blkoff
media: lirc: LIRC_GET_REC_RESOLUTION should return microseconds
mmc: core: Use device_property_read instead of of_property_read
mmc: dw_mmc: Use device_property_read instead of of_property_read
iscsi-target: Fix initial login PDU asynchronous socket close OOPs
media: platform: davinci: return -EINVAL for VPFE_CMD_S_CCDC_RAW_PARAMS ioctl
ARM: dts: tango4: Request RGMII RX and TX clock delays
ARM: dts: armada-38x: Fix irq type for pca955
ext4: fix overflow caused by missing cast in ext4_resize_fs()
ext4: fix SEEK_HOLE/SEEK_DATA for blocksize < pagesize
gpiolib: skip unwanted events, don't convert them to opposite edge
iommu/amd: Enable ga_log_intr when enabling guest_mode
powerpc/64: Fix __check_irq_replay missing decrementer interrupt
powerpc/tm: Fix saving of TM SPRs in core dump
timers: Fix overflow in get_next_timer_interrupt
mm/page_alloc: Remove kernel address exposure in free_reserved_area()
KVM: async_pf: make rcu irq exit if not triggered from idle task
ASoC: do not close shared backend dailink
drm/amdgpu: Fix undue fallthroughs in golden registers initialization
ALSA: hda - Fix speaker output from VAIO VPCL14M1R
cpuset: fix a deadlock due to incomplete patching of cpusets_enabled()
mm, mprotect: flush TLB if potentially racing with a parallel reclaim leaving stale TLB entries
mmc: core: Fix access to HS400-ES devices
device property: Make dev_fwnode() public
mmc: sdhci-of-at91: force card detect value for non removable devices
NFSv4: Fix EXCHANGE_ID corrupt verifier issue
brcmfmac: fix memleak due to calling brcmf_sdiod_sgtable_alloc() twice
iwlwifi: dvm: prevent an out of bounds access
workqueue: restore WQ_UNBOUND/max_active==1 to be ordered
libata: array underflow in ata_find_dev()
cgroup: fix error return value from cgroup_subtree_control()
cgroup: create dfl_root files on subsys registration
parisc: Handle vma's whose context is not current in flush_cache_range
ANDROID: binder: don't queue async transactions to thread.
ANDROID: binder: don't enqueue death notifications to thread todo.
ANDROID: binder: call poll_wait() unconditionally.
ANDROID: keychord: Fix for a memory leak in keychord.
ANDROID: keychord: Fix races in keychord_write.
android: configs: move quota-related configs to recommended
ANDROID: sdcardfs: override credential for ioctl to lower fs
ANDROID: xt_qtaguid: handle properly request sockets
Conflicts:
drivers/staging/android/fiq_debugger/fiq_debugger.c
include/linux/sched.h
kernel/locking/spinlock_debug.c
sound/soc/soc-pcm.c
Change-Id: I163a8c98f1737eeb01b9c8a0636a91d552ef349f
Signed-off-by: Kyle Yan <kyan@codeaurora.org>
|
||
|
|
70b3fd5ce2 |
timers: Fix excessive granularity of new timers after a nohz idle
commit 2fe59f507a65dbd734b990a11ebc7488f6f87a24 upstream.
When a timer base is idle, it is forwarded when a new timer is added
to ensure that granularity does not become excessive. When not idle,
the timer tick is expected to increment the base.
However there are several problems:
- If an existing timer is modified, the base is forwarded only after
the index is calculated.
- The base is not forwarded by add_timer_on.
- There is a window after a timer is restarted from a nohz idle, after
it is marked not-idle and before the timer tick on this CPU, where a
timer may be added but the ancient base does not get forwarded.
These result in excessive granularity (a 1 jiffy timeout can blow out
to 100s of jiffies), which cause the rcu lockup detector to trigger,
among other things.
Fix this by keeping track of whether the timer base has been idle
since it was last run or forwarded, and if so then forward it before
adding a new timer.
There is still a case where mod_timer optimises the case of a pending
timer mod with the same expiry time, where the timer can see excessive
granularity relative to the new, shorter interval. A comment is added,
but it's not changed because it is an important fastpath for
networking.
This has been tested and found to fix the RCU softlockup messages.
Testing was also done with tracing to measure requested versus
achieved wakeup latencies for all non-deferrable timers in an idle
system (with no lockup watchdogs running). Wakeup latency relative to
absolute latency is calculated (note this suffers from round-up skew
at low absolute times) and analysed:
max avg std
upstream 506.0 1.20 4.68
patched 2.0 1.08 0.15
The bug was noticed due to the lockup detector Kconfig changes
dropping it out of people's .configs and resulting in larger base
clk skew When the lockup detectors are enabled, no CPU can go idle for
longer than 4 seconds, which limits the granularity errors.
Sub-optimal timer behaviour is observable on a smaller scale in that
case:
max avg std
upstream 9.0 1.05 0.19
patched 2.0 1.04 0.11
Fixes: Fixes:
|
||
|
|
ce49c27f13 |
kernel: time: Fix accuracy for low resolution timer
timer wheel calculates the index for any timer based on the expiry
value and level granularity of the timer. Due to the level granularity
timer will not fire at the exact time instead expire at a time value
expires + granularity. This is done in the timer code when the index for
each timer is calculated based on the expiry and granularity at each
level:
expires = (expires + LVL_GRAN(lvl)) >> LVL_SHIFT(lvl);
For devfreq drivers the requirement is to fire the timer at the exact
time. If the timer does not expire at the exact time then it'll take
much longer to react and increase the device frequency. Devfreq driver
registers timer for 10ms expiry and due to slack in timer code the
expirty happens at 20 ms. For eg: Frame rendering time is 16ms.
If devfreq driver reacts after 20ms instead of 10ms, that's
way past a frame rendering time.
Timers with 10ms to 630ms expiry fall under level 0, to overcome the
granularity issue for level 0 with low expirty values do not add the
granularity by introducing a new function calc_index_min_granularity.
With the above approach if the timer interrupt on the cpu is delayed for
a long time then there is a chance of missing the timer if base clk is
forwarded to jiffies. In order to account for this corner case modify
the nex_pending_bucket function to choose the least of expiry.
The next_pending_bucket starts at the index based on base clk value and
increments till the end of the bucket.
------------------------------
| | | | | |
| 0 | 1 | -----------| | 63 |
| | | | | |
------------------------------
^
[start] | [pos] [end]
Above pos is the position based on the current base clk value, current
code looks for first pending timer from pos to end. But there is a
chance that there is pending timer from start to pos which could have
lesser expiry. Modify the implementation to pick the least of the
expiries.
Change-Id: I60f6f4394de4b5f409829de9734645e1a0d7659e
Signed-off-by: Channagoud Kadabi <ckadabi@codeaurora.org>
|
||
|
|
9ef8b23b94 |
timers: Fix overflow in get_next_timer_interrupt
commit 34f41c0316ed52b0b44542491d89278efdaa70e4 upstream.
For e.g. HZ=100, timer being 430 jiffies in the future, and 32 bit
unsigned int, there is an overflow on unsigned int right-hand side
of the expression which results with wrong values being returned.
Type cast the multiplier to 64bit to avoid that issue.
Fixes:
|
||
|
|
602c4e27af |
sched: Add a check for cpu unbound deferrable timers
Add a check for cpu unbound deferrable timer expiry and raise softirq for handling the expired timers so that the CPU can process the cpu unbound deferrable times as early as possible when a cpu tries to enter/exit idle loop. Change-Id: Ieffa74fa22a4d25493f5590b5ac1e0d784fcbbad Signed-off-by: Prasad Sodagudi <psodagud@codeaurora.org> |
||
|
|
a967be8dec |
time: Remove CONFIG_TIMER_STATS
Currently CONFIG_TIMER_STATS exposes process information across namespaces:
kernel/time/timer_list.c print_timer():
SEQ_printf(m, ", %s/%d", tmp, timer->start_pid);
/proc/timer_list:
#11: <0000000000000000>, hrtimer_wakeup, S:01, do_nanosleep, cron/2570
Given that the tracer can give the same information, this patch entirely
removes CONFIG_TIMER_STATS.
Change-Id: I66e06ae2d6e32c309824310d3d9bf54d1047eab1
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Kees Cook <keescook@chromium.org>
Acked-by: John Stultz <john.stultz@linaro.org>
Cc: Nicolas Pitre <nicolas.pitre@linaro.org>
Cc: linux-doc@vger.kernel.org
Cc: Lai Jiangshan <jiangshanlai@gmail.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Xing Gao <xgao01@email.wm.edu>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Jessica Frazelle <me@jessfraz.com>
Cc: kernel-hardening@lists.openwall.com
Cc: Nicolas Iooss <nicolas.iooss_linux@m4x.org>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Richard Cochran <richardcochran@gmail.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Michal Marek <mmarek@suse.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Olof Johansson <olof@lixom.net>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: linux-api@vger.kernel.org
Cc: Arjan van de Ven <arjan@linux.intel.com>
Link: http://lkml.kernel.org/r/20170208192659.GA32582@beast
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Git-commit: dfb4357da6ddbdf57d583ba64361c9d792b0e0b1
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
[ohaugan@codeaurora.org: Fixed merge conflicts]
Signed-off-by: Olav Haugan <ohaugan@codeaurora.org>
|
||
|
|
78a643ea96 |
timer: Update code that migrates timers and hrtimers during isolation
__migrate_timers() can be called from both hotplug and isolation contexts. When called from the isolation context, we might sometimes encounter running timers. This is OK since the currently running or just expired timers are off of the timer wheel and so everything else can be migrated off. In the case of hrtimers, we must wait until all callbacks have finished. However, a udelay is important while waiting to allow for store-exclusive fairness when run_hrtimer is attempting to grab the hrtimer base lock. Change-Id: I4dccc66e09819a44b2f9597408a6a3ac4e11f5d7 Signed-off-by: Vikram Mulukutla <markivx@codeaurora.org> Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org> |
||
|
|
df06519aa2 |
timer: Fix incorrect parenthesis in timer
timer code was missing parenthesis in one of the checks to differentiate between global deferrable timer and percpu deferrable timer. Change-Id: I6894da3c3ca8ba01fe267d35aa22f2ec4303cd88 Signed-off-by: Kyle Yan <kyan@codeaurora.org> |
||
|
|
e980f1e966 |
timer: Initialize global deferrable timer
Initialize timer_base_deferrable variables properly along with the initialization of the per cpu timers. Change-Id: I14599cb6ab2fcc657edc7489ee1a55535183e3db Signed-off-by: Kyle Yan <kyan@codeaurora.org> |
||
|
|
c1f109ce0a |
timer: Add a global deferrable timer
Add a global deferrable timer in addition to the per-cpu deferrable timer to allow deferrable timers without the TIMER_PINNED flag to run on any active CPU. Change-Id: I8e6b77cef972589912ad18f324c46c936fbbb96f Signed-off-by: Kyle Yan <kyan@codeaurora.org> |
||
|
|
0f3f78edd7 |
timer: Do not require CPUSETS to be enabled for migration
Do not require CPUSETS to be enabled to allow migration of timers and hrtimers. Change-Id: Ib911a0d34c250c4df020bdb265b92d2b8df8db93 Signed-off-by: Olav Haugan <ohaugan@codeaurora.org> [rameezmustafa@codeaurora.org: Port to msm-4.9] Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org> |
||
|
|
e92935e2b4 |
timer: Add function to migrate timers
Add function to migrate timer that will be used by later patch set. Change-Id: I370e404001344e635a663822b07557abbe0f6f52 Signed-off-by: Santosh Shukla <santosh.shukla@linaro.org> [ohaugan@codeaurora.org: Updated commit text and fixed trivial merge conflict] Git-commit: 3633b88d8fcb4273807574c27c328b6908a741e5 Git-repo: git://git.linaro.org/people/mike.holmes/santosh.shukla/lng-isol.git Signed-off-by: Olav Haugan <ohaugan@codeaurora.org> [rameezmustafa@codeaurora.org: Port to msm-4.9] Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org> |
||
|
|
9536efe77a |
timer: create timer_quiesce_cpu() to isolate CPU from timers
To isolate CPUs (isolate from timers) from sysfs using cpusets, we need some support from the timer core. i.e. A routine timer_quiesce_cpu() which would migrates away all the unpinned timers, but shouldn't touch the pinned ones. This patch creates this routine. Change-Id: I8624e0659b86b7b8fa425a3fafdb0784fe005124 Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> [forward port to 3.18] Signed-off-by: Santosh Shukla <santosh.shukla@linaro.org> [ohaugan@codeaurora.org: Port to 4.4. Fixes for compilation error] Git-commit: 313910b70ea0c73f8789d9189c11e1f339080646 Git-repo: git://git.linaro.org/people/mike.holmes/santosh.shukla/lng-isol.git Signed-off-by: Olav Haugan <ohaugan@codeaurora.org> [rameezmustafa@codeaurora.org: Port to msm-4.9 and rebase to change patch dependency order.] Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org> |
||
|
|
6bad6bccf2 |
timers: Prevent base clock corruption when forwarding
When a timer is enqueued we try to forward the timer base clock. This
mechanism has two issues:
1) Forwarding a remote base unlocked
The forwarding function is called from get_target_base() with the current
timer base lock held. But if the new target base is a different base than
the current base (can happen with NOHZ, sigh!) then the forwarding is done
on an unlocked base. This can lead to corruption of base->clk.
Solution is simple: Invoke the forwarding after the target base is locked.
2) Possible corruption due to jiffies advancing
This is similar to the issue in get_net_timer_interrupt() which was fixed
in the previous patch. jiffies can advance between check and assignement
and therefore advancing base->clk beyond the next expiry value.
So we need to read jiffies into a local variable once and do the checks and
assignment with the local copy.
Fixes: a683f390b93f("timers: Forward the wheel clock whenever possible")
Reported-by: Ashton Holmes <scoopta@gmail.com>
Reported-by: Michael Thayer <michael.thayer@oracle.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Michal Necasek <michal.necasek@oracle.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: knut.osmundsen@oracle.com
Cc: stable@vger.kernel.org
Cc: stern@rowland.harvard.edu
Cc: rt@linutronix.de
Link: http://lkml.kernel.org/r/20161022110552.253640125@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
||
|
|
041ad7bc75 |
timers: Prevent base clock rewind when forwarding clock
Ashton and Michael reported, that kernel versions 4.8 and later suffer from
USB timeouts which are caused by the timer wheel rework.
This is caused by a bug in the base clock forwarding mechanism, which leads
to timers expiring early. The scenario which leads to this is:
run_timers()
while (jiffies >= base->clk) {
collect_expired_timers();
base->clk++;
expire_timers();
}
So base->clk = jiffies + 1. Now the cpu goes idle:
idle()
get_next_timer_interrupt()
nextevt = __next_time_interrupt();
if (time_after(nextevt, base->clk))
base->clk = jiffies;
jiffies has not advanced since run_timers(), so this assignment effectively
decrements base->clk by one.
base->clk is the index into the timer wheel arrays. So let's assume the
following state after the base->clk increment in run_timers():
jiffies = 0
base->clk = 1
A timer gets enqueued with an expiry delta of 63 ticks (which is the case
with the USB timeout and HZ=250) so the resulting bucket index is:
base->clk + delta = 1 + 63 = 64
The timer goes into the first wheel level. The array size is 64 so it ends
up in bucket 0, which is correct as it takes 63 ticks to advance base->clk
to index into bucket 0 again.
If the cpu goes idle before jiffies advance, then the bug in the forwarding
mechanism sets base->clk back to 0, so the next invocation of run_timers()
at the next tick will index into bucket 0 and therefore expire the timer 62
ticks too early.
Instead of blindly setting base->clk to jiffies we must make the forwarding
conditional on jiffies > base->clk, but we cannot use jiffies for this as
we might run into the following issue:
if (time_after(jiffies, base->clk) {
if (time_after(nextevt, base->clk))
base->clk = jiffies;
jiffies can increment between the check and the assigment far enough to
advance beyond nextevt. So we need to use a stable value for checking.
get_next_timer_interrupt() has the basej argument which is the jiffies
value snapshot taken in the calling code. So we can just that.
Thanks to Ashton for bisecting and providing trace data!
Fixes:
|
||
|
|
4da9152a43 |
timers: Lock base for same bucket optimization
Linus stumbled over the unlocked modification of the timer expiry value in
mod_timer() which is an optimization for timers which stay in the same
bucket - due to the bucket granularity - despite their expiry time getting
updated.
The optimization itself still makes sense even if we take the lock, because
in case that the bucket stays the same, we avoid the pointless
queue/enqueue dance.
Make the check and the modification of timer->expires protected by the base
lock and shuffle the remaining code around so we can keep the lock held
when we actually have to requeue the timer to a different bucket.
Fixes:
|
||
|
|
b831275a35 |
timers: Plug locking race vs. timer migration
Linus noticed that lock_timer_base() lacks a READ_ONCE() for accessing the
timer flags. As a consequence the compiler is allowed to reload the flags
between the initial check for TIMER_MIGRATION and the following timer base
computation and the spin lock of the base.
While this has not been observed (yet), we need to make sure that it never
happens.
Fixes:
|
||
|
|
0766f788eb |
latent_entropy: Mark functions with __latent_entropy
The __latent_entropy gcc attribute can be used only on functions and variables. If it is on a function then the plugin will instrument it for gathering control-flow entropy. If the attribute is on a variable then the plugin will initialize it with random contents. The variable must be an integer, an integer array type or a structure with integer fields. These specific functions have been selected because they are init functions (to help gather boot-time entropy), are called at unpredictable times, or they have variable loops, each of which provide some level of latent entropy. Signed-off-by: Emese Revfy <re.emese@gmail.com> [kees: expanded commit message] Signed-off-by: Kees Cook <keescook@chromium.org> |
||
|
|
46c8f0b077 |
timers: Fix get_next_timer_interrupt() computation
The tick_nohz_stop_sched_tick() routine is not properly canceling the sched timer when nothing is pending, because get_next_timer_interrupt() is no longer returning KTIME_MAX in that case. This causes periodic interrupts when none are needed. When determining the next interrupt time, we first use __next_timer_interrupt() to get the first expiring timer in the timer wheel. If no timer is found, we return the base clock value plus NEXT_TIMER_MAX_DELTA to indicate there is no timer in the timer wheel. Back in get_next_timer_interrupt(), we set the "expires" value by converting the timer wheel expiry (in ticks) to a nsec value. But we don't want to do this if the timer wheel expiry value indicates no timer; we want to return KTIME_MAX. Prior to commit |
||
|
|
24f73b9971 |
timers/core: Convert to hotplug state machine
When tearing down, call timers_dead_cpu() before notify_dead(). There is a hidden dependency between: - timers - block multiqueue - rcutree If timers_dead_cpu() comes later than blk_mq_queue_reinit_notify() that latter function causes a RCU stall. Signed-off-by: Richard Cochran <rcochran@linutronix.de> Signed-off-by: Anna-Maria Gleixner <anna-maria@linutronix.de> Reviewed-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: John Stultz <john.stultz@linaro.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: rt@linutronix.de Link: http://lkml.kernel.org/r/20160713153337.566790058@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org> |
||
|
|
4b4b20852d | Merge branch 'timers/fast-wheel' into timers/core | ||
|
|
f00c0afdfa |
timers: Implement optimization for same expiry time in mod_timer()
The existing optimization for same expiry time in mod_timer() checks whether
the timer expiry time is the same as the new requested expiry time. In the old
timer wheel implementation this does not take the slack batching into account,
neither does the new implementation evaluate whether the new expiry time will
requeue the timer to the same bucket.
To optimize that, we can calculate the resulting bucket and check if the new
expiry time is different from the current expiry time. This calculation
happens outside the base lock held region. If the resulting bucket is the same
we can avoid taking the base lock and requeueing the timer.
If the timer needs to be requeued then we have to check under the base lock
whether the base time has changed between the lockless calculation and taking
the lock. If it has changed we need to recalculate under the lock.
This optimization takes effect for timers which are enqueued into the less
granular wheel levels (1 and above). With a simple test case the functionality
has been verified:
Before After
Match: 5.5% 86.6%
Requeue: 94.5% 13.4%
Recalc: <0.01%
In the non optimized case the timer is requeued in 94.5% of the cases. With
the index optimization in place the requeue rate drops to 13.4%. The case
where the lockless index calculation has to be redone is less than 0.01%.
With a real world test case (networking) we observed the following changes:
Before After
Match: 97.8% 99.7%
Requeue: 2.2% 0.3%
Recalc: <0.001%
That means two percent fewer lock/requeue/unlock operations done in one of
the hot path use cases of timers.
Signed-off-by: Anna-Maria Gleixner <anna-maria@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Arjan van de Ven <arjan@infradead.org>
Cc: Chris Mason <clm@fb.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: George Spelvin <linux@sciencehorizons.net>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Len Brown <lenb@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: rt@linutronix.de
Link: http://lkml.kernel.org/r/20160704094342.778527749@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
||
|
|
ffdf047728 |
timers: Split out index calculation
For further optimizations we need to seperate index calculation from queueing. No functional change. Signed-off-by: Anna-Maria Gleixner <anna-maria@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Arjan van de Ven <arjan@infradead.org> Cc: Chris Mason <clm@fb.com> Cc: Eric Dumazet <edumazet@google.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: George Spelvin <linux@sciencehorizons.net> Cc: Josh Triplett <josh@joshtriplett.org> Cc: Len Brown <lenb@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: rt@linutronix.de Link: http://lkml.kernel.org/r/20160704094342.691159619@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org> |
||
|
|
4e85876a9d |
timers: Only wake softirq if necessary
With the wheel forwading in place and with the HZ=1000 4ms folding we can avoid running the softirq at all. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Arjan van de Ven <arjan@infradead.org> Cc: Chris Mason <clm@fb.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: George Spelvin <linux@sciencehorizons.net> Cc: Josh Triplett <josh@joshtriplett.org> Cc: Len Brown <lenb@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: rt@linutronix.de Link: http://lkml.kernel.org/r/20160704094342.607650550@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org> |
||
|
|
a683f390b9 |
timers: Forward the wheel clock whenever possible
The wheel clock is stale when a CPU goes into a long idle sleep. This has the side effect that timers which are queued end up in the outer wheel levels. That results in coarser granularity. To solve this, we keep track of the idle state and forward the wheel clock whenever possible. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Arjan van de Ven <arjan@infradead.org> Cc: Chris Mason <clm@fb.com> Cc: Eric Dumazet <edumazet@google.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: George Spelvin <linux@sciencehorizons.net> Cc: Josh Triplett <josh@joshtriplett.org> Cc: Len Brown <lenb@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: rt@linutronix.de Link: http://lkml.kernel.org/r/20160704094342.512039360@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org> |
||
|
|
236968383c |
timers: Optimize collect_expired_timers() for NOHZ
After a NOHZ idle sleep the timer wheel must be forwarded to current jiffies. There might be expired timers so the current code loops and checks the expired buckets for timers. This can take quite some time for long NOHZ idle periods. The pending bitmask in the timer base allows us to do a quick search for the next expiring timer and therefore a fast forward of the base time which prevents pointless long lasting loops. For a 3 seconds idle sleep this reduces the catchup time from ~1ms to 5us. Signed-off-by: Anna-Maria Gleixner <anna-maria@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Arjan van de Ven <arjan@infradead.org> Cc: Chris Mason <clm@fb.com> Cc: Eric Dumazet <edumazet@google.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: George Spelvin <linux@sciencehorizons.net> Cc: Josh Triplett <josh@joshtriplett.org> Cc: Len Brown <lenb@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: rt@linutronix.de Link: http://lkml.kernel.org/r/20160704094342.351296290@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org> |
||
|
|
73420fea80 |
timers: Move __run_timers() function
Move __run_timers() below __next_timer_interrupt() and next_pending_bucket() in preparation for __run_timers() NOHZ optimization. No functional change. Signed-off-by: Anna-Maria Gleixner <anna-maria@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Arjan van de Ven <arjan@infradead.org> Cc: Chris Mason <clm@fb.com> Cc: Eric Dumazet <edumazet@google.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: George Spelvin <linux@sciencehorizons.net> Cc: Josh Triplett <josh@joshtriplett.org> Cc: Len Brown <lenb@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: rt@linutronix.de Link: http://lkml.kernel.org/r/20160704094342.271872665@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org> |
||
|
|
53bf837b78 |
timers: Remove set_timer_slack() leftovers
We now have implicit batching in the timer wheel. The slack API is no longer used, so remove it. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Alan Stern <stern@rowland.harvard.edu> Cc: Andrew F. Davis <afd@ti.com> Cc: Arjan van de Ven <arjan@infradead.org> Cc: Chris Mason <clm@fb.com> Cc: David S. Miller <davem@davemloft.net> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com> Cc: Eric Dumazet <edumazet@google.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: George Spelvin <linux@sciencehorizons.net> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Jaehoon Chung <jh80.chung@samsung.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: John Stultz <john.stultz@linaro.org> Cc: Josh Triplett <josh@joshtriplett.org> Cc: Len Brown <lenb@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mathias Nyman <mathias.nyman@intel.com> Cc: Pali Rohár <pali.rohar@gmail.com> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Sebastian Reichel <sre@kernel.org> Cc: Ulf Hansson <ulf.hansson@linaro.org> Cc: linux-block@vger.kernel.org Cc: linux-kernel@vger.kernel.org Cc: linux-mmc@vger.kernel.org Cc: linux-pm@vger.kernel.org Cc: linux-usb@vger.kernel.org Cc: netdev@vger.kernel.org Cc: rt@linutronix.de Link: http://lkml.kernel.org/r/20160704094342.189813118@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org> |
||
|
|
500462a9de |
timers: Switch to a non-cascading wheel
The current timer wheel has some drawbacks:
1) Cascading:
Cascading can be an unbound operation and is completely pointless in most
cases because the vast majority of the timer wheel timers are canceled or
rearmed before expiration. (They are used as timeout safeguards, not as
real timers to measure time.)
2) No fast lookup of the next expiring timer:
In NOHZ scenarios the first timer soft interrupt after a long NOHZ period
must fast forward the base time to the current value of jiffies. As we
have no way to find the next expiring timer fast, the code loops linearly
and increments the base time one by one and checks for expired timers
in each step. This causes unbound overhead spikes exactly in the moment
when we should wake up as fast as possible.
After a thorough analysis of real world data gathered on laptops,
workstations, webservers and other machines (thanks Chris!) I came to the
conclusion that the current 'classic' timer wheel implementation can be
modified to address the above issues.
The vast majority of timer wheel timers is canceled or rearmed before
expiry. Most of them are timeouts for networking and other I/O tasks. The
nature of timeouts is to catch the exception from normal operation (TCP ack
timed out, disk does not respond, etc.). For these kinds of timeouts the
accuracy of the timeout is not really a concern. Timeouts are very often
approximate worst-case values and in case the timeout fires, we already
waited for a long time and performance is down the drain already.
The few timers which actually expire can be split into two categories:
1) Short expiry times which expect halfways accurate expiry
2) Long term expiry times are inaccurate today already due to the
batching which is done for NOHZ automatically and also via the
set_timer_slack() API.
So for long term expiry timers we can avoid the cascading property and just
leave them in the less granular outer wheels until expiry or
cancelation. Timers which are armed with a timeout larger than the wheel
capacity are no longer cascaded. We expire them with the longest possible
timeout (6+ days). We have not observed such timeouts in our data collection,
but at least we handle them, applying the rule of the least surprise.
To avoid extending the wheel levels for HZ=1000 so we can accomodate the
longest observed timeouts (5 days in the network conntrack code) we reduce the
first level granularity on HZ=1000 to 4ms, which effectively is the same as
the HZ=250 behaviour. From our data analysis there is nothing which relies on
that 1ms granularity and as a side effect we get better batching and timer
locality for the networking code as well.
Contrary to the classic wheel the granularity of the next wheel is not the
capacity of the first wheel. The granularities of the wheels are in the
currently chosen setting 8 times the granularity of the previous wheel.
So for HZ=250 we end up with the following granularity levels:
Level Offset Granularity Range
0 0 4 ms 0 ms - 252 ms
1 64 32 ms 256 ms - 2044 ms (256ms - ~2s)
2 128 256 ms 2048 ms - 16380 ms (~2s - ~16s)
3 192 2048 ms (~2s) 16384 ms - 131068 ms (~16s - ~2m)
4 256 16384 ms (~16s) 131072 ms - 1048572 ms (~2m - ~17m)
5 320 131072 ms (~2m) 1048576 ms - 8388604 ms (~17m - ~2h)
6 384 1048576 ms (~17m) 8388608 ms - 67108863 ms (~2h - ~18h)
7 448 8388608 ms (~2h) 67108864 ms - 536870911 ms (~18h - ~6d)
That's a worst case inaccuracy of 12.5% for the timers which are queued at the
beginning of a level.
So the new wheel concept addresses the old issues:
1) Cascading is avoided completely
2) By keeping the timers in the bucket until expiry/cancelation we can track
the buckets which have timers enqueued in a bucket bitmap and therefore can
look up the next expiring timer very fast and O(1).
A further benefit of the concept is that the slack calculation which is done
on every timer start is no longer necessary because the granularity levels
provide natural batching already.
Our extensive testing with various loads did not show any performance
degradation vs. the current wheel implementation.
This patch does not address the 'fast lookup' issue as we wanted to make sure
that there is no regression introduced by the wheel redesign. The
optimizations are in follow up patches.
This patch contains fixes from Anna-Maria Gleixner and Richard Cochran.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Arjan van de Ven <arjan@infradead.org>
Cc: Chris Mason <clm@fb.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: George Spelvin <linux@sciencehorizons.net>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Len Brown <lenb@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: rt@linutronix.de
Link: http://lkml.kernel.org/r/20160704094342.108621834@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
||
|
|
494af3ed78 |
timers: Give a few structs and members proper names
Some of the names in the internal implementation of the timer code are not longer correct and others are simply too long to type. Clean it up before we switch the wheel implementation over to the new scheme. No functional change. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Arjan van de Ven <arjan@infradead.org> Cc: Chris Mason <clm@fb.com> Cc: Eric Dumazet <edumazet@google.com> Cc: George Spelvin <linux@sciencehorizons.net> Cc: Josh Triplett <josh@joshtriplett.org> Cc: Len Brown <lenb@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: rt@linutronix.de Link: http://lkml.kernel.org/r/20160704094341.948752516@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org> |
||
|
|
177ec0a0a5 |
timers: Remove the deprecated mod_timer_pinned() API
We switched all users to initialize the timers as pinned and call mod_timer(). Remove the now unused timer API function. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Arjan van de Ven <arjan@infradead.org> Cc: Chris Mason <clm@fb.com> Cc: Eric Dumazet <edumazet@google.com> Cc: George Spelvin <linux@sciencehorizons.net> Cc: Josh Triplett <josh@joshtriplett.org> Cc: Len Brown <lenb@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: rt@linutronix.de Link: http://lkml.kernel.org/r/20160704094341.706205231@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org> |
||
|
|
e675447bda |
timers: Make 'pinned' a timer property
We want to move the timer migration logic from a 'push' to a 'pull' model. Under the current 'push' model pinned timers are handled via a runtime API variant: mod_timer_pinned(). The 'pull' model requires us to store the pinned attribute of a timer in the timer_list structure itself, as a new TIMER_PINNED bit in timer->flags. This flag must be set at initialization time and the timer APIs recognize the flag. This patch: - Implements the new flag and associated new-style initialization methods - makes mod_timer() recognize new-style pinned timers, - and adds some migration helper facility to allow step by step conversion of old-style to new-style pinned timers. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Arjan van de Ven <arjan@infradead.org> Cc: Chris Mason <clm@fb.com> Cc: Eric Dumazet <edumazet@google.com> Cc: George Spelvin <linux@sciencehorizons.net> Cc: Josh Triplett <josh@joshtriplett.org> Cc: Len Brown <lenb@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: rt@linutronix.de Link: http://lkml.kernel.org/r/20160704094341.049338558@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org> |
||
|
|
b5227d03b7 |
timers: Clarify usleep_range() function comment
Update the usleep_range() function comment to make it clear that it can only be used in non-atomic context. Previously we claimed usleep_range() was a drop-in replacement for udelay() where wakeup is flexible. But that's only true in non-atomic contexts, where it's possible to sleep instead of delay. Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: John Stultz <john.stultz@linaro.org> Link: http://lkml.kernel.org/r/20160531212302.28502.44995.stgit@bhelgaas-glaptop2.roam.corp.google.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de> |
||
|
|
b9fdac7f66 |
debugobjects: insulate non-fixup logic related to static obj from fixup callbacks
When activating a static object we need make sure that the object is tracked in the object tracker. If it is a non-static object then the activation is illegal. In previous implementation, each subsystem need take care of this in their fixup callbacks. Actually we can put it into debugobjects core. Thus we can save duplicated code, and have *pure* fixup callbacks. To achieve this, a new callback "is_static_object" is introduced to let the type specific code decide whether a object is static or not. If yes, we take it into object tracker, otherwise give warning and invoke fixup callback. This change has paassed debugobjects selftest, and I also do some test with all debugobjects supports enabled. At last, I have a concern about the fixups that can it change the object which is in incorrect state on fixup? Because the 'addr' may not point to any valid object if a non-static object is not tracked. Then Change such object can overwrite someone's memory and cause unexpected behaviour. For example, the timer_fixup_activate bind timer to function stub_timer. Link: http://lkml.kernel.org/r/1462576157-14539-1-git-send-email-changbin.du@intel.com [changbin.du@intel.com: improve code comments where invoke the new is_static_object callback] Link: http://lkml.kernel.org/r/1462777431-8171-1-git-send-email-changbin.du@intel.com Signed-off-by: Du, Changbin <changbin.du@intel.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Josh Triplett <josh@kernel.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tejun Heo <tj@kernel.org> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
|
|
e3252464da |
timer: update debugobjects fixup callbacks return type
Update the return type to use bool instead of int, corresponding to cheange (debugobjects: make fixup functions return bool instead of int). Signed-off-by: Du, Changbin <changbin.du@intel.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Josh Triplett <josh@kernel.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tejun Heo <tj@kernel.org> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
|
|
69b27baf00 |
sched: add schedule_timeout_idle()
This will be needed in the patch "mm, oom: introduce oom reaper". Acked-by: Michal Hocko <mhocko@suse.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |