* aosp/android-4.14-stable:
BACKPORT: drm/virtio: use kvmalloc for large allocations
Linux 4.14.223
dm era: Update in-core bitset after committing the metadata
net: icmp: pass zeroed opts from icmp{,v6}_ndo_send before sending
ipv6: silence compilation warning for non-IPV6 builds
ipv6: icmp6: avoid indirect call for icmpv6_send()
sunvnet: use icmp_ndo_send helper
gtp: use icmp_ndo_send helper
icmp: allow icmpv6_ndo_send to work with CONFIG_IPV6=n
icmp: introduce helper for nat'd source address in network device context
dm era: only resize metadata in preresume
dm era: Reinitialize bitset cache before digesting a new writeset
dm era: Use correct value size in equality function of writeset tree
dm era: Fix bitset memory leaks
dm era: Verify the data block size hasn't changed
dm era: Recover committed writeset after crash
gfs2: Don't skip dlm unlock if glock has an lvb
sparc32: fix a user-triggerable oops in clear_user()
f2fs: fix out-of-repair __setattr_copy()
printk: fix deadlock when kernel panic
gpio: pcf857x: Fix missing first interrupt
mmc: sdhci-esdhc-imx: fix kernel panic when remove module
module: Ignore _GLOBAL_OFFSET_TABLE_ when warning for undefined symbols
libnvdimm/dimm: Avoid race between probe and available_slots_show()
usb: renesas_usbhs: Clear pipe running flag in usbhs_pkt_pop()
mm: hugetlb: fix a race between freeing and dissolving the page
hugetlb: fix copy_huge_page_from_user contig page struct assumption
fs/affs: release old buffer head on error path
mtd: spi-nor: hisi-sfc: Put child node np on error path
watchdog: mei_wdt: request stop on unregister
arm64: uprobe: Return EOPNOTSUPP for AARCH32 instruction probing
floppy: reintroduce O_NDELAY fix
x86/reboot: Force all cpus to exit VMX root if VMX is supported
staging: rtl8188eu: Add Edimax EW-7811UN V2 to device table
drivers/misc/vmw_vmci: restrict too big queue size in qp_host_alloc_queue
seccomp: Add missing return in non-void function
crypto: sun4i-ss - handle BigEndian for cipher
crypto: sun4i-ss - checking sg length is not sufficient
btrfs: fix extent buffer leak on failure to copy root
btrfs: fix reloc root leak with 0 ref reloc roots on recovery
btrfs: abort the transaction if we fail to inc ref in btrfs_copy_root
KEYS: trusted: Fix migratable=1 failing
tpm_tis: Fix check_locality for correct locality acquisition
ALSA: hda/realtek: modify EAPD in the ALC886
usb: dwc3: gadget: Fix dep->interval for fullspeed interrupt
usb: dwc3: gadget: Fix setting of DEPCFG.bInterval_m1
USB: serial: mos7720: fix error code in mos7720_write()
USB: serial: mos7840: fix error code in mos7840_write()
usb: musb: Fix runtime PM race in musb_queue_resume_work
USB: serial: option: update interface mapping for ZTE P685M
Input: i8042 - add ASUS Zenbook Flip to noselftest list
Input: joydev - prevent potential read overflow in ioctl
Input: xpad - add support for PowerA Enhanced Wired Controller for Xbox Series X|S
Input: raydium_ts_i2c - do not send zero length
HID: wacom: Ignore attempts to overwrite the touch_max value from HID
ACPI: configfs: add missing check after configfs_register_default_group()
ACPI: property: Fix fwnode string properties matching
blk-settings: align max_sectors on "logical_block_size" boundary
scsi: bnx2fc: Fix Kconfig warning & CNIC build errors
mm/rmap: fix potential pte_unmap on an not mapped pte
i2c: brcmstb: Fix brcmstd_send_i2c_cmd condition
arm64: Add missing ISB after invalidating TLB in __primary_switch
mm/hugetlb: fix potential double free in hugetlb_register_node() error path
mm/memory.c: fix potential pte_unmap_unlock pte error
ocfs2: fix a use after free on error
net/mlx4_core: Add missed mlx4_free_cmd_mailbox()
i40e: Fix overwriting flow control settings during driver loading
i40e: Fix flow for IPv6 next header (extension header)
ext4: fix potential htree index checksum corruption
drm/msm/dsi: Correct io_start for MSM8994 (20nm PHY)
PCI: Align checking of syscall user config accessors
VMCI: Use set_page_dirty_lock() when unregistering guest memory
pwm: rockchip: rockchip_pwm_probe(): Remove superfluous clk_unprepare()
misc: eeprom_93xx46: Add module alias to avoid breaking support for non device tree users
misc: eeprom_93xx46: Fix module alias to enable module autoprobe
sparc64: only select COMPAT_BINFMT_ELF if BINFMT_ELF is set
Input: elo - fix an error code in elo_connect()
perf test: Fix unaligned access in sample parsing test
perf intel-pt: Fix missing CYC processing in PSB
spi: pxa2xx: Fix the controller numbering for Wildcat Point
powerpc/8xx: Fix software emulation interrupt
powerpc/pseries/dlpar: handle ibm, configure-connector delay status
mfd: wm831x-auxadc: Prevent use after free in wm831x_auxadc_read_irq()
spi: stm32: properly handle 0 byte transfer
RDMA/rxe: Fix coding error in rxe_recv.c
perf tools: Fix DSO filtering when not finding a map for a sampled address
tracepoint: Do not fail unregistering a probe due to memory failure
amba: Fix resource leak for drivers without .remove
ARM: 9046/1: decompressor: Do not clear SCTLR.nTLSMD for ARMv7+ cores
mmc: usdhi6rol0: Fix a resource leak in the error handling path of the probe
powerpc/47x: Disable 256k page size
IB/umad: Return EIO in case of when device disassociated
auxdisplay: ht16k33: Fix refresh rate handling
isofs: release buffer head before return
spi: atmel: Put allocated master before return
certs: Fix blacklist flag type confusion
regulator: axp20x: Fix reference cout leak
clocksource/drivers/mxs_timer: Add missing semicolon when DEBUG is defined
rtc: s5m: select REGMAP_I2C
power: reset: at91-sama5d2_shdwc: fix wkupdbc mask
of/fdt: Make sure no-map does not remove already reserved regions
fdt: Properly handle "no-map" field in the memory region
mfd: bd9571mwv: Use devm_mfd_add_devices()
dmaengine: hsu: disable spurious interrupt
dmaengine: fsldma: Fix a resource leak in an error handling path of the probe function
dmaengine: fsldma: Fix a resource leak in the remove function
HID: core: detect and skip invalid inputs to snto32()
spi: cadence-quadspi: Abort read if dummy cycles required are too many
quota: Fix memory leak when handling corrupted quota file
clk: meson: clk-pll: fix initializing the old rate (fallback) for a PLL
capabilities: Don't allow writing ambiguous v3 file capabilities
jffs2: fix use after free in jffs2_sum_write_data()
fs/jfs: fix potential integer overflow on shift of a int
ima: Free IMA measurement buffer after kexec syscall
ima: Free IMA measurement buffer on error
crypto: ecdh_helper - Ensure 'len >= secret.len' in decode_key()
hwrng: timeriomem - Fix cooldown period calculation
btrfs: clarify error returns values in __load_free_space_cache
Drivers: hv: vmbus: Avoid use-after-free in vmbus_onoffer_rescind()
ata: ahci_brcm: Add back regulators management
media: uvcvideo: Accept invalid bFormatIndex and bFrameIndex values
media: pxa_camera: declare variable when DEBUG is defined
media: cx25821: Fix a bug when reallocating some dma memory
media: qm1d1c0042: fix error return code in qm1d1c0042_init()
media: lmedm04: Fix misuse of comma
crypto: bcm - Rename struct device_private to bcm_device_private
ASoC: cs42l56: fix up error handling in probe
media: tm6000: Fix memleak in tm6000_start_stream
media: media/pci: Fix memleak in empress_init
media: vsp1: Fix an error handling path in the probe function
media: i2c: ov5670: Fix PIXEL_RATE minimum value
MIPS: lantiq: Explicitly compare LTQ_EBU_PCC_ISTAT against 0
MIPS: c-r4k: Fix section mismatch for loongson2_sc_init
crypto: sun4i-ss - fix kmap usage
gma500: clean up error handling in init
drm/gma500: Fix error return code in psb_driver_load()
fbdev: aty: SPARC64 requires FB_ATY_CT
net: mvneta: Remove per-cpu queue mapping for Armada 3700
net: amd-xgbe: Reset link when the link never comes back
net: amd-xgbe: Reset the PHY rx data path when mailbox command timeout
ibmvnic: skip send_request_unmap for timeout reset
b43: N-PHY: Fix the update of coef for the PHY revision >= 3case
mac80211: fix potential overflow when multiplying to u32 integers
xen/netback: fix spurious event detection for common event case
bnxt_en: reverse order of TX disable and carrier off
ath9k: fix data bus crash when setting nf_override via debugfs
bpf_lru_list: Read double-checked variable once without lock
ARM: s3c: fix fiq for clang IAS
arm64: dts: msm8916: Fix reserved and rfsa nodes unit address
staging: rtl8723bs: wifi_regd.c: Fix incorrect number of regulatory rules
usb: dwc2: Make "trimming xfer length" a debug message
usb: dwc2: Abort transaction after errors with unknown reason
usb: dwc2: Do not update data length if it is 0 on inbound transfers
ARM: dts: Configure missing thermal interrupt for 4430
Bluetooth: Put HCI device if inquiry procedure interrupts
Bluetooth: drop HCI device reference before return
usb: gadget: u_audio: Free requests only after callback
cpufreq: brcmstb-avs-cpufreq: Fix resource leaks in ->remove()
arm64: dts: exynos: correct PMIC interrupt trigger level on Espresso
arm64: dts: exynos: correct PMIC interrupt trigger level on TM2
ARM: dts: exynos: correct PMIC interrupt trigger level on Arndale Octa
ARM: dts: exynos: correct PMIC interrupt trigger level on Spring
ARM: dts: exynos: correct PMIC interrupt trigger level on Rinato
ARM: dts: exynos: correct PMIC interrupt trigger level on Monk
Bluetooth: Fix initializing response id after clearing struct
Bluetooth: btqcomsmd: Fix a resource leak in error handling paths in the probe function
random: fix the RNDRESEEDCRNG ioctl
MIPS: vmlinux.lds.S: add missing PAGE_ALIGNED_DATA() section
kdb: Make memory allocations more robust
vmlinux.lds.h: add DWARF v5 sections
scripts/recordmcount.pl: support big endian for ARCH sh
cifs: Set CIFS_MOUNT_USE_PREFIX_PATH flag on setting cifs_sb->prepath.
NET: usb: qmi_wwan: Adding support for Cinterion MV31
arm64: tegra: Add power-domain for Tegra210 HDA
ntfs: check for valid standard information attribute
usb: quirks: add quirk to start video capture on ELMO L-12F document camera reliable
HID: make arrays usage and value to be the same
Signed-off-by: UtsavBalar1231 <utsavbalar1231@gmail.com>
Changes in 4.14.223
HID: make arrays usage and value to be the same
usb: quirks: add quirk to start video capture on ELMO L-12F document camera reliable
ntfs: check for valid standard information attribute
arm64: tegra: Add power-domain for Tegra210 HDA
NET: usb: qmi_wwan: Adding support for Cinterion MV31
cifs: Set CIFS_MOUNT_USE_PREFIX_PATH flag on setting cifs_sb->prepath.
scripts/recordmcount.pl: support big endian for ARCH sh
vmlinux.lds.h: add DWARF v5 sections
kdb: Make memory allocations more robust
MIPS: vmlinux.lds.S: add missing PAGE_ALIGNED_DATA() section
random: fix the RNDRESEEDCRNG ioctl
Bluetooth: btqcomsmd: Fix a resource leak in error handling paths in the probe function
Bluetooth: Fix initializing response id after clearing struct
ARM: dts: exynos: correct PMIC interrupt trigger level on Monk
ARM: dts: exynos: correct PMIC interrupt trigger level on Rinato
ARM: dts: exynos: correct PMIC interrupt trigger level on Spring
ARM: dts: exynos: correct PMIC interrupt trigger level on Arndale Octa
arm64: dts: exynos: correct PMIC interrupt trigger level on TM2
arm64: dts: exynos: correct PMIC interrupt trigger level on Espresso
cpufreq: brcmstb-avs-cpufreq: Fix resource leaks in ->remove()
usb: gadget: u_audio: Free requests only after callback
Bluetooth: drop HCI device reference before return
Bluetooth: Put HCI device if inquiry procedure interrupts
ARM: dts: Configure missing thermal interrupt for 4430
usb: dwc2: Do not update data length if it is 0 on inbound transfers
usb: dwc2: Abort transaction after errors with unknown reason
usb: dwc2: Make "trimming xfer length" a debug message
staging: rtl8723bs: wifi_regd.c: Fix incorrect number of regulatory rules
arm64: dts: msm8916: Fix reserved and rfsa nodes unit address
ARM: s3c: fix fiq for clang IAS
bpf_lru_list: Read double-checked variable once without lock
ath9k: fix data bus crash when setting nf_override via debugfs
bnxt_en: reverse order of TX disable and carrier off
xen/netback: fix spurious event detection for common event case
mac80211: fix potential overflow when multiplying to u32 integers
b43: N-PHY: Fix the update of coef for the PHY revision >= 3case
ibmvnic: skip send_request_unmap for timeout reset
net: amd-xgbe: Reset the PHY rx data path when mailbox command timeout
net: amd-xgbe: Reset link when the link never comes back
net: mvneta: Remove per-cpu queue mapping for Armada 3700
fbdev: aty: SPARC64 requires FB_ATY_CT
drm/gma500: Fix error return code in psb_driver_load()
gma500: clean up error handling in init
crypto: sun4i-ss - fix kmap usage
MIPS: c-r4k: Fix section mismatch for loongson2_sc_init
MIPS: lantiq: Explicitly compare LTQ_EBU_PCC_ISTAT against 0
media: i2c: ov5670: Fix PIXEL_RATE minimum value
media: vsp1: Fix an error handling path in the probe function
media: media/pci: Fix memleak in empress_init
media: tm6000: Fix memleak in tm6000_start_stream
ASoC: cs42l56: fix up error handling in probe
crypto: bcm - Rename struct device_private to bcm_device_private
media: lmedm04: Fix misuse of comma
media: qm1d1c0042: fix error return code in qm1d1c0042_init()
media: cx25821: Fix a bug when reallocating some dma memory
media: pxa_camera: declare variable when DEBUG is defined
media: uvcvideo: Accept invalid bFormatIndex and bFrameIndex values
ata: ahci_brcm: Add back regulators management
Drivers: hv: vmbus: Avoid use-after-free in vmbus_onoffer_rescind()
btrfs: clarify error returns values in __load_free_space_cache
hwrng: timeriomem - Fix cooldown period calculation
crypto: ecdh_helper - Ensure 'len >= secret.len' in decode_key()
ima: Free IMA measurement buffer on error
ima: Free IMA measurement buffer after kexec syscall
fs/jfs: fix potential integer overflow on shift of a int
jffs2: fix use after free in jffs2_sum_write_data()
capabilities: Don't allow writing ambiguous v3 file capabilities
clk: meson: clk-pll: fix initializing the old rate (fallback) for a PLL
quota: Fix memory leak when handling corrupted quota file
spi: cadence-quadspi: Abort read if dummy cycles required are too many
HID: core: detect and skip invalid inputs to snto32()
dmaengine: fsldma: Fix a resource leak in the remove function
dmaengine: fsldma: Fix a resource leak in an error handling path of the probe function
dmaengine: hsu: disable spurious interrupt
mfd: bd9571mwv: Use devm_mfd_add_devices()
fdt: Properly handle "no-map" field in the memory region
of/fdt: Make sure no-map does not remove already reserved regions
power: reset: at91-sama5d2_shdwc: fix wkupdbc mask
rtc: s5m: select REGMAP_I2C
clocksource/drivers/mxs_timer: Add missing semicolon when DEBUG is defined
regulator: axp20x: Fix reference cout leak
certs: Fix blacklist flag type confusion
spi: atmel: Put allocated master before return
isofs: release buffer head before return
auxdisplay: ht16k33: Fix refresh rate handling
IB/umad: Return EIO in case of when device disassociated
powerpc/47x: Disable 256k page size
mmc: usdhi6rol0: Fix a resource leak in the error handling path of the probe
ARM: 9046/1: decompressor: Do not clear SCTLR.nTLSMD for ARMv7+ cores
amba: Fix resource leak for drivers without .remove
tracepoint: Do not fail unregistering a probe due to memory failure
perf tools: Fix DSO filtering when not finding a map for a sampled address
RDMA/rxe: Fix coding error in rxe_recv.c
spi: stm32: properly handle 0 byte transfer
mfd: wm831x-auxadc: Prevent use after free in wm831x_auxadc_read_irq()
powerpc/pseries/dlpar: handle ibm, configure-connector delay status
powerpc/8xx: Fix software emulation interrupt
spi: pxa2xx: Fix the controller numbering for Wildcat Point
perf intel-pt: Fix missing CYC processing in PSB
perf test: Fix unaligned access in sample parsing test
Input: elo - fix an error code in elo_connect()
sparc64: only select COMPAT_BINFMT_ELF if BINFMT_ELF is set
misc: eeprom_93xx46: Fix module alias to enable module autoprobe
misc: eeprom_93xx46: Add module alias to avoid breaking support for non device tree users
pwm: rockchip: rockchip_pwm_probe(): Remove superfluous clk_unprepare()
VMCI: Use set_page_dirty_lock() when unregistering guest memory
PCI: Align checking of syscall user config accessors
drm/msm/dsi: Correct io_start for MSM8994 (20nm PHY)
ext4: fix potential htree index checksum corruption
i40e: Fix flow for IPv6 next header (extension header)
i40e: Fix overwriting flow control settings during driver loading
net/mlx4_core: Add missed mlx4_free_cmd_mailbox()
ocfs2: fix a use after free on error
mm/memory.c: fix potential pte_unmap_unlock pte error
mm/hugetlb: fix potential double free in hugetlb_register_node() error path
arm64: Add missing ISB after invalidating TLB in __primary_switch
i2c: brcmstb: Fix brcmstd_send_i2c_cmd condition
mm/rmap: fix potential pte_unmap on an not mapped pte
scsi: bnx2fc: Fix Kconfig warning & CNIC build errors
blk-settings: align max_sectors on "logical_block_size" boundary
ACPI: property: Fix fwnode string properties matching
ACPI: configfs: add missing check after configfs_register_default_group()
HID: wacom: Ignore attempts to overwrite the touch_max value from HID
Input: raydium_ts_i2c - do not send zero length
Input: xpad - add support for PowerA Enhanced Wired Controller for Xbox Series X|S
Input: joydev - prevent potential read overflow in ioctl
Input: i8042 - add ASUS Zenbook Flip to noselftest list
USB: serial: option: update interface mapping for ZTE P685M
usb: musb: Fix runtime PM race in musb_queue_resume_work
USB: serial: mos7840: fix error code in mos7840_write()
USB: serial: mos7720: fix error code in mos7720_write()
usb: dwc3: gadget: Fix setting of DEPCFG.bInterval_m1
usb: dwc3: gadget: Fix dep->interval for fullspeed interrupt
ALSA: hda/realtek: modify EAPD in the ALC886
tpm_tis: Fix check_locality for correct locality acquisition
KEYS: trusted: Fix migratable=1 failing
btrfs: abort the transaction if we fail to inc ref in btrfs_copy_root
btrfs: fix reloc root leak with 0 ref reloc roots on recovery
btrfs: fix extent buffer leak on failure to copy root
crypto: sun4i-ss - checking sg length is not sufficient
crypto: sun4i-ss - handle BigEndian for cipher
seccomp: Add missing return in non-void function
drivers/misc/vmw_vmci: restrict too big queue size in qp_host_alloc_queue
staging: rtl8188eu: Add Edimax EW-7811UN V2 to device table
x86/reboot: Force all cpus to exit VMX root if VMX is supported
floppy: reintroduce O_NDELAY fix
arm64: uprobe: Return EOPNOTSUPP for AARCH32 instruction probing
watchdog: mei_wdt: request stop on unregister
mtd: spi-nor: hisi-sfc: Put child node np on error path
fs/affs: release old buffer head on error path
hugetlb: fix copy_huge_page_from_user contig page struct assumption
mm: hugetlb: fix a race between freeing and dissolving the page
usb: renesas_usbhs: Clear pipe running flag in usbhs_pkt_pop()
libnvdimm/dimm: Avoid race between probe and available_slots_show()
module: Ignore _GLOBAL_OFFSET_TABLE_ when warning for undefined symbols
mmc: sdhci-esdhc-imx: fix kernel panic when remove module
gpio: pcf857x: Fix missing first interrupt
printk: fix deadlock when kernel panic
f2fs: fix out-of-repair __setattr_copy()
sparc32: fix a user-triggerable oops in clear_user()
gfs2: Don't skip dlm unlock if glock has an lvb
dm era: Recover committed writeset after crash
dm era: Verify the data block size hasn't changed
dm era: Fix bitset memory leaks
dm era: Use correct value size in equality function of writeset tree
dm era: Reinitialize bitset cache before digesting a new writeset
dm era: only resize metadata in preresume
icmp: introduce helper for nat'd source address in network device context
icmp: allow icmpv6_ndo_send to work with CONFIG_IPV6=n
gtp: use icmp_ndo_send helper
sunvnet: use icmp_ndo_send helper
ipv6: icmp6: avoid indirect call for icmpv6_send()
ipv6: silence compilation warning for non-IPV6 builds
net: icmp: pass zeroed opts from icmp{,v6}_ndo_send before sending
dm era: Update in-core bitset after committing the metadata
Linux 4.14.223
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: Ib3da7b72393e257416645cd59c380fce3c801177
commit ee576c47db60432c37e54b1e2b43a8ca6d3a8dca upstream.
The icmp{,v6}_send functions make all sorts of use of skb->cb, casting
it with IPCB or IP6CB, assuming the skb to have come directly from the
inet layer. But when the packet comes from the ndo layer, especially
when forwarded, there's no telling what might be in skb->cb at that
point. As a result, the icmp sending code risks reading bogus memory
contents, which can result in nasty stack overflows such as this one
reported by a user:
    panic+0x108/0x2ea
    __stack_chk_fail+0x14/0x20
    __icmp_send+0x5bd/0x5c0
    icmp_ndo_send+0x148/0x160
In icmp_send, skb->cb is cast with IPCB and an ip_options struct is read
from it. The optlen parameter there is of particular note, as it can
induce writes beyond bounds. There are quite a few ways that can happen
in __ip_options_echo. For example:
    // sptr/skb are attacker-controlled skb bytes
    sptr = skb_network_header(skb);
    // dptr/dopt points to stack memory allocated by __icmp_send
    dptr = dopt->__data;
    // sopt is the corrupt skb->cb in question
    if (sopt->rr) {
        optlen = sptr[sopt->rr+1]; // corrupt skb->cb + skb->data
        soffset = sptr[sopt->rr+2]; // corrupt skb->cb + skb->data
        // this now writes potentially attacker-controlled data,
        // overflowing the stack:
        memcpy(dptr, sptr+sopt->rr, optlen);
    }
In the icmpv6_send case, the story is similar, but not as dire, as only
IP6CB(skb)->iif and IP6CB(skb)->dsthao are used. The dsthao case is
worse than the iif case, but it is passed to ipv6_find_tlv, which does
a bit of bounds checking on the value.
This is easy to simulate by doing a `memset(skb->cb, 0x41,
sizeof(skb->cb));` before calling icmp{,v6}_ndo_send, and it's only by
good fortune and the rarity of icmp sending from that context that we've
avoided reports like this until now. For example, in KASAN:
    BUG: KASAN: stack-out-of-bounds in __ip_options_echo+0xa0e/0x12b0
    Write of size 38 at addr ffff888006f1f80e by task ping/89
    CPU: 2 PID: 89 Comm: ping Not tainted 5.10.0-rc7-debug+ #5
    Call Trace:
     dump_stack+0x9a/0xcc
     print_address_description.constprop.0+0x1a/0x160
     __kasan_report.cold+0x20/0x38
     kasan_report+0x32/0x40
     check_memory_region+0x145/0x1a0
     memcpy+0x39/0x60
     __ip_options_echo+0xa0e/0x12b0
     __icmp_send+0x744/0x1700
Actually, out of the 4 drivers that do this, only gtp zeroed the cb for
the v4 case, while the rest did not. So this commit actually removes the
gtp-specific zeroing, while putting the code where it belongs in the
shared infrastructure of icmp{,v6}_ndo_send.
This commit fixes the issue by passing an empty IPCB or IP6CB along to
the functions that actually do the work. For the icmp_send, this was
already trivial, thanks to __icmp_send providing the plumbing function.
For icmpv6_send, this required a tiny bit of refactoring to make it
behave like the v4 case, after which it was straightforward.
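The idea of passing empty options from device context can be sketched in
userspace C. This is a minimal illustration, not the kernel code: every
toy_* name, the struct layouts, and the buffer sizes are hypothetical
stand-ins, and the modulo on the index exists only so the sketch never
reads out of bounds. It computes the length the echo step would copy,
rather than performing the copy.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for kernel types; sizes are illustrative only. */
#define CB_SIZE   48u
#define OPT_SPACE 40u   /* room the sender reserves on its stack for options */

struct toy_opts {       /* loosely models the options read out of skb->cb */
    uint8_t optlen;
    uint8_t rr;         /* offset of a record-route option, 0 if none */
};

struct toy_skb {        /* loosely models struct sk_buff */
    uint8_t cb[CB_SIZE];   /* scratch area owned by whichever layer holds the skb */
    uint8_t data[64];      /* packet bytes */
};

/*
 * Models the shared plumbing function: it trusts the options handed in by
 * the caller and returns how many bytes the echo step would copy.
 */
static size_t toy_echo_len(const struct toy_skb *skb, const struct toy_opts *opt)
{
    assert(skb != NULL && opt != NULL);
    if (opt->rr == 0)
        return 0;       /* no option to echo */
    /* Index wraps inside data[] so this sketch stays memory-safe. */
    return skb->data[(opt->rr + 1u) % sizeof(skb->data)];
}

/* Inet-layer style path: reinterprets cb as options, whatever it holds. */
static size_t toy_icmp_send_len(const struct toy_skb *skb)
{
    struct toy_opts opt;

    memcpy(&opt, skb->cb, sizeof(opt));
    return toy_echo_len(skb, &opt);
}

/* Device-context path: cb contents are unknown, so pass zeroed options. */
static size_t toy_icmp_ndo_send_len(const struct toy_skb *skb)
{
    struct toy_opts opt = { 0 };

    return toy_echo_len(skb, &opt);
}
```

Filling cb with 0x41 bytes, as in the simulation described above, makes
the first path report a copy length taken from packet data, which can
exceed OPT_SPACE, while the second path always reports zero.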
Fixes: a2b78e9b2c ("sunvnet: generate ICMP PTMUD messages for smaller port MTUs")
Reported-by: SinYu <liuxyon@gmail.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://lore.kernel.org/netdev/CAF=yD-LOF116aHub6RMe8vB8ZpnrrnoTdqhobEx+bvoA8AsP0w@mail.gmail.com/T/
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Link: https://lore.kernel.org/r/20210223131858.72082-1-Jason@zx2c4.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Add a new procfs flag to toggle the automatic addition of prefix
routes on a per-device basis. The new flag is accept_ra_prefix_route.
It defaults to 1 so as not to break existing behavior.
CRs-Fixed: 2197954
Change-Id: If25493890c7531c27f5b2c4855afebbbbf5d072a
Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org>
Changes in 4.14.11
tracing: Remove extra zeroing out of the ring buffer page
tracing: Fix possible double free on failure of allocating trace buffer
tracing: Fix crash when it fails to alloc ring buffer
x86/cpufeatures: Add X86_BUG_CPU_INSECURE
x86/mm/pti: Disable global pages if PAGE_TABLE_ISOLATION=y
x86/mm/pti: Prepare the x86/entry assembly code for entry/exit CR3 switching
x86/mm/pti: Add infrastructure for page table isolation
x86/pti: Add the pti= cmdline option and documentation
x86/mm/pti: Add mapping helper functions
x86/mm/pti: Allow NX poison to be set in p4d/pgd
x86/mm/pti: Allocate a separate user PGD
x86/mm/pti: Populate user PGD
x86/mm/pti: Add functions to clone kernel PMDs
x86/mm/pti: Force entry through trampoline when PTI active
x86/mm/pti: Share cpu_entry_area with user space page tables
x86/entry: Align entry text section to PMD boundary
x86/mm/pti: Share entry text PMD
x86/mm/pti: Map ESPFIX into user space
x86/cpu_entry_area: Add debugstore entries to cpu_entry_area
x86/events/intel/ds: Map debug buffers in cpu_entry_area
x86/mm/64: Make a full PGD-entry size hole in the memory map
x86/pti: Put the LDT in its own PGD if PTI is on
x86/pti: Map the vsyscall page if needed
x86/mm: Allow flushing for future ASID switches
x86/mm: Abstract switching CR3
x86/mm: Use/Fix PCID to optimize user/kernel switches
x86/mm: Optimize RESTORE_CR3
x86/mm: Use INVPCID for __native_flush_tlb_single()
x86/mm: Clarify the whole ASID/kernel PCID/user PCID naming
x86/dumpstack: Indicate in Oops whether PTI is configured and enabled
x86/mm/pti: Add Kconfig
x86/mm/dump_pagetables: Add page table directory to the debugfs VFS hierarchy
x86/mm/dump_pagetables: Check user space page table for WX pages
x86/mm/dump_pagetables: Allow dumping current pagetables
x86/ldt: Make the LDT mapping RO
ring-buffer: Mask out the info bits when returning buffer page length
ring-buffer: Do no reuse reader page if still in use
iw_cxgb4: Only validate the MSN for successful completions
ASoC: codecs: msm8916-wcd: Fix supported formats
ASoC: wm_adsp: Fix validation of firmware and coeff lengths
ASoC: da7218: fix fix child-node lookup
ASoC: fsl_ssi: AC'97 ops need regmap, clock and cleaning up on failure
ASoC: twl4030: fix child-node lookup
ASoC: tlv320aic31xx: Fix GPIO1 register definition
gpio: fix "gpio-line-names" property retrieval
IB/hfi: Only read capability registers if the capability exists
IB/mlx5: Serialize access to the VMA list
IB/uverbs: Fix command checking as part of ib_uverbs_ex_modify_qp()
IB/core: Verify that QP is security enabled in create and destroy
ALSA: hda: Drop useless WARN_ON()
ALSA: hda - Add MIC_NO_PRESENCE fixup for 2 HP machines
ALSA: hda - change the location for one mic on a Lenovo machine
ALSA: hda - fix headset mic detection issue on a Dell machine
ALSA: hda - Fix missing COEF init for ALC225/295/299
cpufreq: schedutil: Use idle_calls counter of the remote CPU
block: fix blk_rq_append_bio
block: don't let passthrough IO go into .make_request_fn()
kbuild: add '-fno-stack-check' to kernel build options
ipv4: igmp: guard against silly MTU values
ipv6: mcast: better catch silly mtu values
net: fec: unmap the xmit buffer that are not transferred by DMA
net: igmp: Use correct source address on IGMPv3 reports
netlink: Add netns check on taps
net: qmi_wwan: add Sierra EM7565 1199:9091
net: reevalulate autoflowlabel setting after sysctl setting
ptr_ring: add barriers
RDS: Check cmsg_len before dereferencing CMSG_DATA
tcp_bbr: record "full bw reached" decision in new full_bw_reached bit
tcp md5sig: Use skb's saddr when replying to an incoming segment
tg3: Fix rx hang on MTU change with 5717/5719
tcp_bbr: reset full pipe detection on loss recovery undo
tcp_bbr: reset long-term bandwidth sampling on loss recovery undo
s390/qeth: apply takeover changes when mode is toggled
s390/qeth: don't apply takeover changes to RXIP
s390/qeth: lock IP table while applying takeover changes
s390/qeth: update takeover IPs after configuration change
net: ipv4: fix for a race condition in raw_sendmsg
net: mvmdio: disable/unprepare clocks in EPROBE_DEFER case
sctp: Replace use of sockets_allocated with specified macro.
adding missing rcu_read_unlock in ipxip6_rcv
ip6_gre: fix device features for ioctl setup
ipv4: Fix use-after-free when flushing FIB tables
net: bridge: fix early call to br_stp_change_bridge_id and plug newlink leaks
net: Fix double free and memory corruption in get_net_ns_by_id()
net: phy: micrel: ksz9031: reconfigure autoneg after phy autoneg workaround
sock: free skb in skb_complete_tx_timestamp on error
tcp: invalidate rate samples during SACK reneging
net/mlx5: Fix rate limit packet pacing naming and struct
net/mlx5e: Fix possible deadlock of VXLAN lock
net/mlx5e: Fix features check of IPv6 traffic
net/mlx5e: Add refcount to VXLAN structure
net/mlx5e: Prevent possible races in VXLAN control flow
net/mlx5: Fix error flow in CREATE_QP command
openvswitch: Fix pop_vlan action for double tagged frames
sfc: pass valid pointers from efx_enqueue_unwind
net: dsa: bcm_sf2: Clear IDDQ_GLOBAL_PWR bit for PHY
s390/qeth: fix error handling in checksum cmd callback
sctp: make sure stream nums can match optlen in sctp_setsockopt_reset_streams
tipc: fix hanging poll() for stream sockets
mlxsw: spectrum: Disable MAC learning for ovs port
tcp: fix potential underestimation on rcv_rtt
net: phy: marvell: Limit 88m1101 autoneg errata to 88E1145 as well.
ipv6: Honor specified parameters in fibmatch lookup
tcp: refresh tcp_mstamp from timers callbacks
net/mlx5: FPGA, return -EINVAL if size is zero
vxlan: restore dev->mtu setting based on lower device
net: sched: fix static key imbalance in case of ingress/clsact_init error
bnxt_en: Fix sources of spurious netpoll warnings
phylink: ensure the PHY interface mode is appropriately set
phylink: ensure AN is enabled
ipv4: fib: Fix metrics match when deleting a route
ipv6: set all.accept_dad to 0 by default
Revert "mlx5: move affinity hints assignments to generic code"
skbuff: orphan frags before zerocopy clone
skbuff: skb_copy_ubufs must release uarg even without user frags
skbuff: in skb_copy_ubufs unclone before releasing zerocopy
sparc64: repair calling incorrect hweight function from stubs
usbip: fix usbip bind writing random string after command in match_busid
usbip: prevent leaking socket pointer address in messages
usbip: stub: stop printing kernel pointer addresses in messages
usbip: vhci: stop printing kernel pointer addresses in messages
USB: chipidea: msm: fix ulpi-node lookup
USB: serial: ftdi_sio: add id for Airbus DS P8GR
USB: serial: qcserial: add Sierra Wireless EM7565
USB: serial: option: add support for Telit ME910 PID 0x1101
USB: serial: option: adding support for YUGA CLM920-NC5
usb: Add device quirk for Logitech HD Pro Webcam C925e
usb: add RESET_RESUME for ELSA MicroLink 56K
USB: Fix off by one in type-specific length check of BOS SSP capability
usb: xhci: Add XHCI_TRUST_TX_LENGTH for Renesas uPD720201
timers: Use deferrable base independent of base::nohz_active
timers: Invoke timer_start_debug() where it makes sense
timers: Reinitialize per cpu bases on hotplug
binder: fix proc->files use-after-free
phy: tegra: fix device-tree node lookups
drivers: base: cacheinfo: fix cache type for non-architected system cache
staging: android: ion: Fix dma direction for dma_sync_sg_for_cpu/device
nohz: Prevent a timer interrupt storm in tick_nohz_stop_sched_tick()
x86/smpboot: Remove stale TLB flush invocations
x86/mm: Remove preempt_disable/enable() from __native_flush_tlb()
x86-32: Fix kexec with stack canary (CONFIG_CC_STACKPROTECTOR)
x86/espfix/64: Fix espfix double-fault handling on 5-level systems
x86/ldt: Plug memory leak in error path
x86/ldt: Make LDT pgtable free conditional
n_tty: fix EXTPROC vs ICANON interaction with TIOCINQ (aka FIONREAD)
tty: fix tty_ldisc_receive_buf() documentation
Linux 4.14.11
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
[ Upstream commit 513674b5a2c9c7a67501506419da5c3c77ac6f08 ]
sysctl.ipv6.auto_flowlabels defaults to 1. On our hosts, we set it to 2.
If the sockopt doesn't set autoflowlabel, outgoing packets from the hosts
are not supposed to include a flowlabel. This is true for normal packets,
but not for reset packets.
The reason is ipv6_pinfo.autoflowlabel is set in sock creation. Later if
we change sysctl.ip6.auto_flowlabels, the ipv6_pinfo.autoflowlabel isn't
changed, so the sock will keep the old behavior in terms of auto
flowlabel. Reset packet is suffering from this problem, because reset
packet is sent from a special control socket, which is created at boot
time. Since sysctl.ipv6.auto_flowlabels is 1 by default, the control
socket will always have its ipv6_pinfo.autoflowlabel set, even after
user set sysctl.ipv6.auto_flowlabels to 2, so reset packet will always
have flowlabel. Normal sock created before sysctl setting suffers from
the same issue. We can't even turn off autoflowlabel unless we kill all
socks in the hosts.
To fix this, if IPV6_AUTOFLOWLABEL sockopt is used, we use the
autoflowlabel setting from user, otherwise we always call
ip6_default_np_autolabel() which has the new settings of sysctl.
Note, this changes behavior a little bit. Before commit 42240901f7
(ipv6: Implement different admin modes for automatic flow labels), the
autoflowlabel behavior of a sock isn't sticky, eg, if sysctl changes,
existing connection will change autoflowlabel behavior. After that
commit, autoflowlabel behavior is sticky in the whole life of the sock.
With this patch, the behavior isn't sticky again.
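The lookup order described above can be modeled roughly as follows. This is a simplified Python sketch, not the kernel code: the Sock class and its field are hypothetical, and the mapping of sysctl values to defaults (0 off, 1 opt-out, 2 opt-in, 3 forced) is an assumption based on the admin modes mentioned for commit 42240901f7. Forced mode is only modeled as far as the default goes.

```python
from dataclasses import dataclass
from typing import Optional

# Assumed meaning of the net.ipv6.auto_flowlabels values.
AUTO_FLOW_LABEL_OFF = 0
AUTO_FLOW_LABEL_OPTOUT = 1
AUTO_FLOW_LABEL_OPTIN = 2
AUTO_FLOW_LABEL_FORCED = 3


def default_np_autolabel(sysctl_value):
    """Default applied when the socket never used IPV6_AUTOFLOWLABEL."""
    return sysctl_value in (AUTO_FLOW_LABEL_OPTOUT, AUTO_FLOW_LABEL_FORCED)


@dataclass
class Sock:
    # None means the IPV6_AUTOFLOWLABEL sockopt was never set (hypothetical field).
    autoflowlabel: Optional[bool] = None


def sock_autoflowlabel(sk, sysctl_value):
    """Use the user's sockopt value if present, else the current sysctl default."""
    if sk.autoflowlabel is not None:
        return sk.autoflowlabel
    return default_np_autolabel(sysctl_value)


# A control socket created at boot, before the admin changes the sysctl.
ctl_sk = Sock()
```

Because the default is evaluated at send time, changing the sysctl from 1 to 2 changes the result for ctl_sk without recreating the socket.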
Cc: Martin KaFai Lau <kafai@fb.com>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Tom Herbert <tom@quantonium.net>
Signed-off-by: Shaohua Li <shli@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Currently, IPv6 router discovery always puts routes into
RT6_TABLE_MAIN. This causes problems for connection managers
that want to support multiple simultaneous network connections
and want control over which one is used by default (e.g., wifi
and wired).
To work around this connection managers typically take the routes
they prefer and copy them to static routes with low metrics in
the main table. This puts the burden on the connection manager
to watch netlink to see if the routes have changed, delete the
routes when their lifetime expires, etc.
Instead, this patch adds a per-interface sysctl to have the
kernel put autoconf routes into different tables. This allows
each interface to have its own autoconf table, and choosing the
default interface (or using different interfaces at the same
time for different types of traffic) can be done using
appropriate ip rules.
The sysctl behaves as follows:
- = 0: default. Put routes into RT6_TABLE_MAIN as before.
- > 0: manual. Put routes into the specified table.
- < 0: automatic. Add the absolute value of the sysctl to the
device's ifindex, and use that table.
The automatic mode is most useful in conjunction with
net.ipv6.conf.default.accept_ra_rt_table. A connection manager
or distribution could set it to, say, -100 on boot, and
thereafter just use IP rules.
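The three sysctl modes listed above map to a routing table as in this minimal Python sketch. The function name is hypothetical; only the table number of RT6_TABLE_MAIN (254, the same value as RT_TABLE_MAIN) is taken from the kernel headers.

```python
RT6_TABLE_MAIN = 254  # same numeric value as RT_TABLE_MAIN


def ra_route_table(accept_ra_rt_table, ifindex):
    """Pick the table that receives autoconf routes for one interface."""
    if accept_ra_rt_table == 0:
        # Default: behave as before.
        return RT6_TABLE_MAIN
    if accept_ra_rt_table > 0:
        # Manual: use the configured table directly.
        return accept_ra_rt_table
    # Automatic: offset the interface index by the absolute value.
    return abs(accept_ra_rt_table) + ifindex
```

With the sysctl set to -100, an interface with ifindex 3 would use table 103, so a rule such as "lookup 103" selects that interface's autoconf routes.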
Change-Id: I82d16e3737d9cdfa6489e649e247894d0d60cbb1
Signed-off-by: Lorenzo Colitti <lorenzo@google.com>
[AmitP: Refactored original changes to align with
the changes introduced by upstream commit
830218c1ad ("net: ipv6: Fix processing of RAs in presence of VRF")
Also folded following android-4.9 commit changes into this patch
be65fb01da4d ("ANDROID: net: ipv6: remove unused variable ifindex in")]
Signed-off-by: Amit Pundir <amit.pundir@linaro.org>
Many source files in the tree are missing licensing information, which
makes it harder for compliance tools to determine the correct license.
By default all files without license information are under the default
license of the kernel, which is GPL version 2.
Update the files which contain no license information with the 'GPL-2.0'
SPDX license identifier. The SPDX identifier is a legally binding
shorthand, which can be used instead of the full boiler plate text.
This patch is based on work done by Thomas Gleixner and Kate Stewart and
Philippe Ombredanne.
How this work was done:
Patches were generated and checked against linux-4.14-rc6 for a subset of
the use cases:
- file had no licensing information in it,
- file was a */uapi/* one with no licensing information in it,
- file was a */uapi/* one with existing licensing information,
Further patches will be generated in subsequent months to fix up cases
where non-standard license headers were used, and references to license
had to be inferred by heuristics based on keywords.
The analysis to determine which SPDX License Identifier to be applied to
a file was done in a spreadsheet of side by side results from the
output of two independent scanners (ScanCode & Windriver) producing SPDX
tag:value files created by Philippe Ombredanne. Philippe prepared the
base worksheet, and did an initial spot review of a few 1000 files.
The 4.13 kernel was the starting point of the analysis with 60,537 files
assessed. Kate Stewart did a file by file comparison of the scanner
results in the spreadsheet to determine which SPDX license identifier(s)
to be applied to the file. She confirmed any determination that was not
immediately clear with lawyers working with the Linux Foundation.
Criteria used to select files for SPDX license identifier tagging were:
- Files considered eligible had to be source code files.
- Make and config files were included as candidates if they contained >5
lines of source
- File already had some variant of a license header in it (even if <5
lines).
All documentation files were explicitly excluded.
The following heuristics were used to determine which SPDX license
identifiers to apply.
- when both scanners couldn't find any license traces, file was
considered to have no license information in it, and the top level
COPYING file license applied.
For non */uapi/* files that summary was:
SPDX license identifier                              # files
----------------------------------------------------|-------
GPL-2.0                                                11139
and resulted in the first patch in this series.
If that file was a */uapi/* path one, it was "GPL-2.0 WITH
Linux-syscall-note" otherwise it was "GPL-2.0". Results of that was:
SPDX license identifier                              # files
----------------------------------------------------|-------
GPL-2.0 WITH Linux-syscall-note                          930
and resulted in the second patch in this series.
- if a file had some form of licensing information in it, and was one
of the */uapi/* ones, it was denoted with the Linux-syscall-note if
any GPL family license was found in the file or had no licensing in
it (per prior point). Results summary:
SPDX license identifier                              # files
----------------------------------------------------|-------
GPL-2.0 WITH Linux-syscall-note                          270
GPL-2.0+ WITH Linux-syscall-note                         169
((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause)       21
((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause)       17
LGPL-2.1+ WITH Linux-syscall-note                         15
GPL-1.0+ WITH Linux-syscall-note                          14
((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause)       5
LGPL-2.0+ WITH Linux-syscall-note                          4
LGPL-2.1 WITH Linux-syscall-note                           3
((GPL-2.0 WITH Linux-syscall-note) OR MIT)                 3
((GPL-2.0 WITH Linux-syscall-note) AND MIT)                1
and that resulted in the third patch in this series.
- when the two scanners agreed on the detected license(s), that became
the concluded license(s).
- when there was disagreement between the two scanners (one detected a
license but the other didn't, or they both detected different
licenses) a manual inspection of the file occurred.
- In most cases a manual inspection of the information in the file
resulted in a clear resolution of the license that should apply (and
which scanner probably needed to revisit its heuristics).
- When it was not immediately clear, the license identifier was
confirmed with lawyers working with the Linux Foundation.
- If there was any question as to the appropriate license identifier,
the file was flagged for further research and to be revisited later
in time.
In total, over 70 hours of logged manual review was done on the
spreadsheet to determine the SPDX license identifiers to apply to the
source files by Kate, Philippe, Thomas and, in some cases, confirmation
by lawyers working with the Linux Foundation.
Kate also obtained a third independent scan of the 4.13 code base from
FOSSology, and compared selected files where the other two scanners
disagreed against that SPDX file, to see if there were new insights. The
Windriver scanner is based on an older version of FOSSology in part, so
they are related.
Thomas did random spot checks in about 500 files from the spreadsheets
for the uapi headers and agreed with SPDX license identifier in the
files he inspected. For the non-uapi files Thomas did random spot checks
in about 15000 files.
In the initial set of patches against 4.14-rc6, 3 files were found to have
copy/paste license identifier errors, and have been fixed to reflect the
correct identifier.
Additionally Philippe spent 10 hours this week doing a detailed manual
inspection and review of the 12,461 patched files from the initial patch
version early this week with:
- a full scancode scan run, collecting the matched texts, detected
license ids and scores
- reviewing anything where there was a license detected (about 500+
files) to ensure that the applied SPDX license was correct
- reviewing anything where there was no detection but the patch license
was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied
SPDX license was correct
This produced a worksheet with 20 files needing minor correction. This
worksheet was then exported into 3 different .csv files for the
different types of files to be modified.
These .csv files were then reviewed by Greg. Thomas wrote a script to
parse the csv files and add the proper SPDX tag to the file, in the
format that the file expected. This script was further refined by Greg
based on the output to detect more types of files automatically and to
distinguish between header and source .c files (which need different
comment types.) Finally Greg ran the script using the .csv files to
generate the patches.
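The actual script is not part of this message. As a rough illustration only, the following Python sketch shows the two decisions described above for a file with no license information: the identifier depends on whether the path is a */uapi/* one, and the comment style depends on the file type. The function names are hypothetical, and the comment styles follow the kernel's documented SPDX convention (C++-style for .c files, C-style for headers).

```python
import os


def spdx_identifier(path):
    """Identifier for a file that carries no license information."""
    if "uapi" in path.split("/"):
        return "GPL-2.0 WITH Linux-syscall-note"
    return "GPL-2.0"


def spdx_line(path):
    """First-line tag in the comment style the file type expects."""
    ident = spdx_identifier(path)
    name = os.path.basename(path)
    if name.endswith(".c"):
        return "// SPDX-License-Identifier: " + ident
    if name.endswith(".h"):
        return "/* SPDX-License-Identifier: " + ident + " */"
    if name in ("Makefile", "Kconfig"):
        return "# SPDX-License-Identifier: " + ident
    # Other file types are deliberately left out of this sketch.
    raise ValueError("unhandled file type: " + path)
```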
Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Add a second device index, sdif, to udp socket lookups. sdif is the
index for ingress devices enslaved to an l3mdev. It allows the lookups
to consider the enslaved device as well as the L3 domain when searching
for a socket.
Early demux lookups are handled in the next patch as part of INET_MATCH
changes.
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Since commit 67a51780ae ("ipv6: udp: leverage scratch area
helpers") udp6_recvmsg() read the skb len from the scratch area,
to avoid a cache miss.
But the UDP6 rx path supports RFC 2675 UDPv6 jumbograms, and their
length exceeds the 16 bits available in the scratch area. As a side
effect the length returned by recvmsg() is:
<ingress datagram len> % (1<<16)
This commit addresses the issue by allocating one more bit in the
IP6CB flags field and setting it for incoming jumbograms.
That field is still in the first cacheline, so at recvmsg()
time we can check it and fall back to accessing skb->len if
required, without a measurable overhead.
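The truncation and the fallback can be illustrated with a few lines of Python. This is a model of the arithmetic only; the function names and the boolean flag argument are hypothetical stand-ins for the scratch area and the new IP6CB flag.

```python
SCRATCH_LEN_BITS = 16  # width of the length field in the scratch area


def scratch_len(skb_len):
    """Length as stored in the 16-bit scratch field (wraps for jumbograms)."""
    return skb_len % (1 << SCRATCH_LEN_BITS)


def recvmsg_len(skb_len, jumbogram_flag):
    """Length reported to the caller, falling back to skb->len when flagged."""
    if jumbogram_flag:
        return skb_len
    return scratch_len(skb_len)
```

For a 70000-byte datagram the scratch field alone yields 70000 % 65536 = 4464, while the flagged path reports the full 70000.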
Fixes: 67a51780ae ("ipv6: udp: leverage scratch area helpers")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This commit adds a new sysctl accept_ra_rt_info_min_plen that
defines the minimum acceptable prefix length of Route Information
Options. The new sysctl is intended to be used together with
accept_ra_rt_info_max_plen to configure a range of acceptable
prefix lengths. It is useful to prevent misconfigurations from
unintentionally blackholing too much of the IPv6 address space
(e.g., home routers announcing RIOs for fc00::/7, which is
incorrect).
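A minimal Python sketch of the range check that the two sysctls express together. The function name and the example bounds (48 and 64) are hypothetical, and the kernel may apply further conditions that are not modeled here.

```python
import ipaddress


def rio_acceptable(prefix, min_plen, max_plen):
    """Accept a Route Information Option only if its prefix length is in range."""
    plen = ipaddress.IPv6Network(prefix).prefixlen
    return min_plen <= plen <= max_plen


# Example bounds chosen for illustration only.
EXAMPLE_MIN_PLEN = 48
EXAMPLE_MAX_PLEN = 64
```

With these example bounds, an RIO for fc00::/7 is rejected because 7 is below the minimum, while a /48 is accepted.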
Signed-off-by: Joel Scherpelz <jscherpelz@google.com>
Acked-by: Lorenzo Colitti <lorenzo@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This provides equivalent functionality to the existing ipv4
"disable_policy" sysctl, i.e., it allows IPsec processing to be skipped
on terminating packets on a per-interface basis.
Signed-off-by: David Forster <dforster@brocade.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The address generation mode for IPv6 link-local can only be configured
by netlink messages. This patch adds the ability to change the address
generation mode via sysctl.
v1 -> v2
Removed the rtnl lock and switch to use RCU lock to iterate through
the netdev list.
v2 -> v3
Removed the addrgenmode variable from the idev structure and use the
sysctl storage for the flag.
Simplified the logic for sysctl handling by removing the support
for the "all" operation.
Added support for more types of tunnel interfaces for link-local
address generation.
Based the patches on net-next.
v3 -> v4
Removed unnecessary whitespace changes.
Signed-off-by: Felix Jia <felix.jia@alliedtelesis.co.nz>
Signed-off-by: David S. Miller <davem@davemloft.net>
Implemented RFC7527 Enhanced DAD.
IPv6 duplicate address detection can fail if there is some temporary
loopback of Ethernet frames. RFC7527 solves this by including a random
nonce in the NS messages used for DAD, and if an NS is received with the
same nonce it is assumed to be a looped back DAD probe and is ignored.
RFC7527 is enabled by default. Can be disabled by setting both of
conf/{all,interface}/enhanced_dad to zero.
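The nonce check can be sketched in Python as below. This is an illustration of the idea, not the kernel implementation: the class and method names are hypothetical, and the 6-byte nonce length is an assumption about the option format.

```python
import os

NONCE_LEN = 6  # assumed nonce size in bytes


def enhanced_dad_enabled(conf_all, conf_interface):
    """Disabled only when both the 'all' and per-interface knobs are zero."""
    return bool(conf_all or conf_interface)


class DadProbe:
    """Tracks the nonces sent in NS messages for one tentative address."""

    def __init__(self):
        self.sent_nonces = set()

    def make_nonce(self):
        nonce = os.urandom(NONCE_LEN)
        self.sent_nonces.add(nonce)
        return nonce

    def ns_is_looped_back(self, received_nonce):
        """True if a received NS carries a nonce we sent ourselves."""
        return received_nonce is not None and received_nonce in self.sent_nonces
```

An NS whose nonce matches one we sent is treated as our own probe and ignored; an NS with a different nonce, or with none, still counts as a possible duplicate.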
Signed-off-by: Erik Nordmark <nordmark@arista.com>
Signed-off-by: Bob Gilligan <gilligan@arista.com>
Reviewed-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds the necessary functions to compute and check the HMAC signature
of an SR-enabled packet. Two HMAC algorithms are supported: hmac(sha1) and
hmac(sha256).
In order to avoid dynamic memory allocation for each HMAC computation,
a per-cpu ring buffer is allocated for this purpose.
A new per-interface sysctl called seg6_require_hmac is added, allowing a
user-defined policy for processing HMAC-signed SR-enabled packets.
A value of -1 means that the HMAC field will always be ignored.
A value of 0 means that if an HMAC field is present, its validity will
be enforced (the packet is dropped if the signature is incorrect).
Finally, a value of 1 means that any SR-enabled packet that does not
contain an HMAC signature or whose signature is incorrect will be dropped.
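The three policy values can be summarized in a small Python decision function. The function and parameter names are hypothetical; the signature verification itself is reduced to a boolean input.

```python
def sr_packet_accepted(seg6_require_hmac, has_hmac, hmac_valid):
    """Decide whether an SR-enabled packet passes the per-interface HMAC policy."""
    if seg6_require_hmac == -1:
        # HMAC field is always ignored.
        return True
    if seg6_require_hmac == 0:
        # Validate only when an HMAC is present.
        return (not has_hmac) or hmac_valid
    # seg6_require_hmac == 1: an HMAC must be present and correct.
    return has_hmac and hmac_valid
```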
Signed-off-by: David Lebrun <david.lebrun@uclouvain.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
Implement minimal support for processing of SR-enabled packets
as described in
https://tools.ietf.org/html/draft-ietf-6man-segment-routing-header-02.
This patch implements the following operations:
- Intermediate segment endpoint: incrementation of active segment and rerouting.
- Egress for SR-encapsulated packets: decapsulation of outer IPv6 header + SRH
and routing of inner packet.
- Cleanup flag support for SR-inlined packets: removal of SRH if we are the
penultimate segment endpoint.
A per-interface sysctl seg6_enabled is provided, to accept/deny SR-enabled
packets. Default is deny.
This patch does not provide support for HMAC-signed packets.
Signed-off-by: David Lebrun <david.lebrun@uclouvain.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
When reading a datagram or raw packet that arrived fragmented, expose
the maximum fragment size if recorded to allow applications to
estimate receive path MTU.
At this point, the field is only recorded when ipv6 connection
tracking is enabled. A follow-up patch will record this field also
in the ipv6 input path.
Tested using the test for IP_RECVFRAGSIZE plus
ip netns exec to ip addr add dev veth1 fc07::1/64
ip netns exec from ip addr add dev veth0 fc07::2/64
ip netns exec to ./recv_cmsg_recvfragsize -6 -u -p 6000 &
ip netns exec from nc -q 1 -u fc07::1 6000 < payload
Both with and without enabling connection tracking
ip6tables -A INPUT -m state --state NEW -p udp -j LOG
Signed-off-by: Willem de Bruijn <willemb@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently, socket lookups for l3mdev (vrf) use cases can match a socket
that is bound to a port but not a device (ie., a global socket). If the
sysctl tcp_l3mdev_accept is not set this leads to ack packets going out
based on the main table even though the packet came in from an L3 domain.
The end result is that the connection does not establish creating
confusion for users since the service is running and a socket shows in
ss output. Fix by requiring an exact dif to sk_bound_dev_if match if the
skb came through an interface enslaved to an l3mdev device and the
tcp_l3mdev_accept is not set.
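The matching rule can be written down as a short Python predicate. This is a sketch of the rule as stated above, with hypothetical names; a bound device index of 0 stands for an unbound (global) socket.

```python
def bound_dev_matches(sk_bound_dev_if, dif, via_l3mdev_slave, tcp_l3mdev_accept):
    """Device part of the socket lookup for an incoming skb."""
    exact_dif = via_l3mdev_slave and not tcp_l3mdev_accept
    if exact_dif:
        # Packet came through an enslaved interface: require an exact match,
        # so an unbound socket (0) no longer qualifies.
        return sk_bound_dev_if == dif
    # Otherwise an unbound socket matches any ingress device.
    return sk_bound_dev_if == 0 or sk_bound_dev_if == dif
```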
skb's through an l3mdev interface are marked by setting a flag in
inet{6}_skb_parm. The IPv6 variant is already set; this patch adds the
flag for IPv4. Using an skb flag avoids a device lookup on the dif. The
flag is set in the VRF driver using the IP{6}CB macros. For IPv4, the
inet_skb_parm struct is moved in the cb per commit 971f10eca1, so the
match function in the TCP stack needs to use TCP_SKB_CB. For IPv6, the
move is done after the socket lookup, so IP6CB is used.
The flags field in inet_skb_parm struct needs to be increased to add
another flag. There is currently a 1-byte hole following the flags,
so it can be expanded to u16 without increasing the size of the struct.
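The padding argument can be checked with ctypes. The two structures below are hypothetical, reduced layouts (the real inet_skb_parm has more members); they only show that widening a u8 that is followed by a 1-byte hole does not change the structure size on platforms where a 16-bit field is 2-byte aligned.

```python
import ctypes


class ParmFlagsU8(ctypes.Structure):
    # Hypothetical reduced layout: u8 flags followed by a 1-byte hole.
    _fields_ = [
        ("iif", ctypes.c_int),
        ("flags", ctypes.c_uint8),
        ("frag_max_size", ctypes.c_uint16),
    ]


class ParmFlagsU16(ctypes.Structure):
    # Same layout with flags widened to u16, filling the hole.
    _fields_ = [
        ("iif", ctypes.c_int),
        ("flags", ctypes.c_uint16),
        ("frag_max_size", ctypes.c_uint16),
    ]


size_before = ctypes.sizeof(ParmFlagsU8)
size_after = ctypes.sizeof(ParmFlagsU16)
```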
Fixes: 193125dbd8 ("net: Introduce VRF device driver")
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This implements:
https://tools.ietf.org/html/rfc7559
Backoff is performed according to RFC3315 section 14:
https://tools.ietf.org/html/rfc3315#section-14
We allow setting /proc/sys/net/ipv6/conf/*/router_solicitations
to a negative value meaning an unlimited number of retransmits,
and we make this the new default (in line with the RFC).
We also add a new setting:
/proc/sys/net/ipv6/conf/*/router_solicitation_max_interval
defaulting to 1 hour (per RFC recommendation).
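The RFC 3315 section 14 backoff can be sketched in Python as follows. This is an illustration rather than the kernel code: the 4 second initial interval is an assumed value for router_solicitation_interval, the function names are hypothetical, and the rand parameter exists only so the randomization factor can be fixed when experimenting.

```python
import random

IRT = 4.0     # assumed initial retransmission time in seconds
MRT = 3600.0  # router_solicitation_max_interval: 1 hour


def next_rt(prev_rt, rand=None):
    """Next retransmission timeout; rand is drawn from [-0.1, 0.1]."""
    if rand is None:
        rand = random.uniform(-0.1, 0.1)
    if prev_rt is None:
        # First transmission: RT = IRT + RAND * IRT
        return IRT + rand * IRT
    # Later transmissions: RT = 2 * RTprev + RAND * RTprev
    rt = 2 * prev_rt + rand * prev_rt
    if rt > MRT:
        # Cap: RT = MRT + RAND * MRT
        rt = MRT + rand * MRT
    return rt


def schedule(count, rand=None):
    """Successive timeouts for the first `count` solicitations."""
    timeouts, rt = [], None
    for _ in range(count):
        rt = next_rt(rt, rand)
        timeouts.append(rt)
    return timeouts
```

With the randomization factor pinned to zero the interval doubles from 4 seconds until it reaches the 1 hour ceiling and stays there.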
Signed-off-by: Maciej Żenczykowski <maze@google.com>
Acked-by: Erik Kline <ek@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Frank Kellermann reported a kernel crash with 4.5.0 when IPv6 is
disabled at boot using the kernel option ipv6.disable=1. Using
current net-next with the boot option:
$ ip link add red type vrf table 1001
Generates:
[12210.919584] BUG: unable to handle kernel NULL pointer dereference at 0000000000000748
[12210.921341] IP: [<ffffffff814b30e3>] fib6_get_table+0x2c/0x5a
[12210.922537] PGD b79e3067 PUD bb32b067 PMD 0
[12210.923479] Oops: 0000 [#1] SMP
[12210.924001] Modules linked in: ipvlan 8021q garp mrp stp llc
[12210.925130] CPU: 3 PID: 1177 Comm: ip Not tainted 4.7.0-rc1+ #235
[12210.926168] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.7.5-20140531_083030-gandalf 04/01/2014
[12210.928065] task: ffff8800b9ac4640 ti: ffff8800bacac000 task.ti: ffff8800bacac000
[12210.929328] RIP: 0010:[<ffffffff814b30e3>] [<ffffffff814b30e3>] fib6_get_table+0x2c/0x5a
[12210.930697] RSP: 0018:ffff8800bacaf888 EFLAGS: 00010202
[12210.931563] RAX: 0000000000000748 RBX: ffffffff81a9e280 RCX: ffff8800b9ac4e28
[12210.932688] RDX: 00000000000000e9 RSI: 0000000000000002 RDI: 0000000000000286
[12210.933820] RBP: ffff8800bacaf898 R08: ffff8800b9ac4df0 R09: 000000000052001b
[12210.934941] R10: 00000000657c0000 R11: 000000000000c649 R12: 00000000000003e9
[12210.936032] R13: 00000000000003e9 R14: ffff8800bace7800 R15: ffff8800bb3ec000
[12210.937103] FS: 00007faa1766c700(0000) GS:ffff88013ac00000(0000) knlGS:0000000000000000
[12210.938321] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[12210.939166] CR2: 0000000000000748 CR3: 00000000b79d6000 CR4: 00000000000406e0
[12210.940278] Stack:
[12210.940603] ffff8800bb3ec000 ffffffff81a9e280 ffff8800bacaf8c8 ffffffff814b3135
[12210.941818] ffff8800bb3ec000 ffffffff81a9e280 ffffffff81a9e280 ffff8800bace7800
[12210.943040] ffff8800bacaf8f0 ffffffff81397c88 ffff8800bb3ec000 ffffffff81a9e280
[12210.944288] Call Trace:
[12210.944688] [<ffffffff814b3135>] fib6_new_table+0x24/0x8a
[12210.945516] [<ffffffff81397c88>] vrf_dev_init+0xd4/0x162
[12210.946328] [<ffffffff814091e1>] register_netdevice+0x100/0x396
[12210.947209] [<ffffffff8139823d>] vrf_newlink+0x40/0xb3
[12210.948001] [<ffffffff814187f0>] rtnl_newlink+0x5d3/0x6d5
...
The problem above is due to the fact that the fib hash table is not
allocated when IPv6 is disabled at boot.
As for the VRF driver it should not do any IPv6 initializations if IPv6
is disabled, so it needs to know if IPv6 is disabled at boot. The disable
parameter is private to the IPv6 module, so provide an accessor for
modules to determine if IPv6 was disabled at boot time.
Fixes: 35402e3136 ("net: Add IPv6 support to VRF device")
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently the VRF driver uses the rx_handler to switch the skb device
to the VRF device. Switching the dev prior to the ip / ipv6 layer
means the VRF driver has to duplicate IP/IPv6 processing which adds
overhead and makes features such as retaining the ingress device index
more complicated than necessary.
This patch moves the hook to the L3 layer just after the first NF_HOOK
for PRE_ROUTING. This location makes exposing the original ingress device
trivial (next patch) and allows adding other NF_HOOKs to the VRF driver
in the future.
dev_queue_xmit_nit is exported so that the VRF driver can cycle the skb
with the switched device through the packet taps to maintain current
behavior (tcpdump can be used on either the vrf device or the enslaved
devices).
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Struct ctl_table_header holds pointer to sysctl table which could be used
for freeing it after unregistration. IPv4 sysctls already use that.
Remove redundant NULL assignment: ndev allocated using kzalloc.
This also saves some bytes: sysctl table could be shorter than
DEVCONF_MAX+1 if some options are disabled in config.
Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently, all ipv6 addresses are flushed when the interface is configured
down, including global, static addresses:
$ ip -6 addr show dev eth1
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 state UP qlen 1000
inet6 2100:1::2/120 scope global
valid_lft forever preferred_lft forever
inet6 fe80::e0:f9ff:fe79:34bd/64 scope link
valid_lft forever preferred_lft forever
$ ip link set dev eth1 down
$ ip -6 addr show dev eth1
<< nothing; all addresses have been flushed>>
Add a new sysctl to make this behavior optional. The new setting defaults to
flushing all addresses to maintain backwards compatibility. When it is set,
global addresses with no expire times are not flushed on an admin down. The
sysctl is per-interface or system-wide for all interfaces:
$ sysctl -w net.ipv6.conf.eth1.keep_addr_on_down=1
or
$ sysctl -w net.ipv6.conf.all.keep_addr_on_down=1
Will keep addresses on eth1 on an admin down.
$ ip -6 addr show dev eth1
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 state UP qlen 1000
inet6 2100:1::2/120 scope global
valid_lft forever preferred_lft forever
inet6 fe80::e0:f9ff:fe79:34bd/64 scope link
valid_lft forever preferred_lft forever
$ ip link set dev eth1 down
$ ip -6 addr show dev eth1
3: eth1: <BROADCAST,MULTICAST> mtu 1500 state DOWN qlen 1000
inet6 2100:1::2/120 scope global tentative
valid_lft forever preferred_lft forever
inet6 fe80::e0:f9ff:fe79:34bd/64 scope link tentative
valid_lft forever preferred_lft forever
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In certain 802.11 wireless deployments, there will be NA proxies
that use knowledge of the network to correctly answer requests.
To prevent unsolicited advertisements on the shared medium from
being a problem, on such deployments wireless needs to drop them.
Enable this by providing an option called "drop_unsolicited_na".
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In order to solve a problem with 802.11, the so-called hole-196 attack,
add an option (sysctl) called "drop_unicast_in_l2_multicast" which, if
enabled, causes the stack to drop IPv6 unicast packets encapsulated in
link-layer multi- or broadcast frames. Such frames can (as an attack)
be created by any member of the same wireless network and transmitted
as valid encrypted frames since the symmetric key for broadcast frames
is shared between all stations.
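The drop condition can be expressed as a small Python predicate. The function name is hypothetical; the packet type constants use the values from linux/if_packet.h.

```python
import ipaddress

# Packet types as defined in linux/if_packet.h.
PACKET_HOST = 0
PACKET_BROADCAST = 1
PACKET_MULTICAST = 2


def should_drop(dst_addr, pkt_type, drop_unicast_in_l2_multicast):
    """Drop IPv6 unicast that arrived in a link-layer multi- or broadcast frame."""
    if not drop_unicast_in_l2_multicast:
        return False
    l2_multicast = pkt_type in (PACKET_BROADCAST, PACKET_MULTICAST)
    ip_unicast = not ipaddress.IPv6Address(dst_addr).is_multicast
    return l2_multicast and ip_unicast
```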
Reviewed-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch addresses multiple problems :
UDP/RAW sendmsg() need to get a stable struct ipv6_txoptions
while socket is not locked : Other threads can change np->opt
concurrently. Dmitry posted a syzkaller
(http://github.com/google/syzkaller) program demonstrating
use-after-free.
Starting with TCP/DCCP lockless listeners, tcp_v6_syn_recv_sock()
and dccp_v6_request_recv_sock() also need to use RCU protection
to dereference np->opt once (before calling ipv6_dup_options())
This patch adds full RCU protection to np->opt
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
SYN_RECV & TIMEWAIT sockets are not full blown, they do not have a pinet6
pointer.
Fixes: ca6fb06518 ("tcp: attach SYNACK messages to request sockets instead of listener")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Like the ipv4 patch with a similar title, this adds a sysctl to allow
the user to change routing behavior based on whether or not the
interface associated with the nexthop was an up or down link. The
default setting preserves the current behavior, but anyone that enables
it will notice that nexthops on down interfaces will no longer be
selected:
net.ipv6.conf.all.ignore_routes_with_linkdown = 0
net.ipv6.conf.default.ignore_routes_with_linkdown = 0
net.ipv6.conf.lo.ignore_routes_with_linkdown = 0
...
When the above sysctls are set, not only will link status be reported to
userspace, but an indication that a nexthop is dead and will not be used
is also reported.
1000::/8 via 7000::2 dev p7p1 metric 1024 dead linkdown pref medium
1000::/8 via 8000::2 dev p8p1 metric 1024 pref medium
7000::/8 dev p7p1 proto kernel metric 256 dead linkdown pref medium
8000::/8 dev p8p1 proto kernel metric 256 pref medium
9000::/8 via 8000::2 dev p8p1 metric 2048 pref medium
9000::/8 via 7000::2 dev p7p1 metric 1024 dead linkdown pref medium
fe80::/64 dev p7p1 proto kernel metric 256 dead linkdown pref medium
fe80::/64 dev p8p1 proto kernel metric 256 pref medium
This also adds devconf support and notification when sysctl values
change.
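The effect on nexthop selection can be illustrated with a simplified Python model using the two 9000::/8 routes from the listing above. The Nexthop class and the lowest-metric selection are assumptions made for illustration; the real route lookup considers more than the metric.

```python
from dataclasses import dataclass


@dataclass
class Nexthop:
    gateway: str
    dev: str
    metric: int
    linkdown: bool


def select_nexthop(nexthops, ignore_routes_with_linkdown):
    """Pick the lowest-metric nexthop, optionally skipping link-down ones."""
    usable = [
        nh for nh in nexthops
        if not (ignore_routes_with_linkdown and nh.linkdown)
    ]
    if not usable:
        return None
    return min(usable, key=lambda nh: nh.metric)


# The two routes to 9000::/8 shown in the listing.
routes_9000 = [
    Nexthop("8000::2", "p8p1", 2048, linkdown=False),
    Nexthop("7000::2", "p7p1", 1024, linkdown=True),
]
```

With the sysctl at 0 the lower-metric route through p7p1 is still chosen even though its link is down; with it set, the route through p8p1 is used instead.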
v2: drop use of rt6i_nhflags since it is not needed right now
Signed-off-by: Andy Gospodarek <gospo@cumulusnetworks.com>
Signed-off-by: Dinesh Dutt <ddutt@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit 6fd99094de ("ipv6: Don't reduce hop limit for an interface")
stopped accepting a hop limit from an RA if it is smaller than the current
hop limit, for security reasons. But this behavior breaks the RFC definition.
RFC 4861, 6.3.4. Processing Received Router Advertisements
A Router Advertisement field (e.g., Cur Hop Limit, Reachable Time,
and Retrans Timer) may contain a value denoting that it is
unspecified. In such cases, the parameter should be ignored and the
host should continue using whatever value it is already using.
If the received Cur Hop Limit value is non-zero, the host SHOULD set
its CurHopLimit variable to the received value.
So add sysctl option accept_ra_min_hop_limit to let user choose the minimum
hop limit value they can accept from RA. And set default to 1 to meet RFC
standards.
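A minimal Python sketch of the resulting check, with hypothetical names. A received value of zero means "unspecified" per the RFC text quoted above and leaves the current value untouched.

```python
def updated_hop_limit(current, ra_hop_limit, accept_ra_min_hop_limit=1):
    """Hop limit to use after processing a Router Advertisement."""
    if ra_hop_limit == 0:
        # Unspecified in the RA: keep whatever value is already in use.
        return current
    if ra_hop_limit < accept_ra_min_hop_limit:
        # Below the administrator's minimum: ignore the advertised value.
        return current
    return ra_hop_limit
```

With the default of 1 every non-zero advertised hop limit is accepted; raising the minimum, for example to 32, makes the host ignore smaller advertised values.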
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Acked-by: YOSHIFUJI Hideaki <hideaki.yoshifuji@miraclelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Per RFC 6724, section 4, "Candidate Source Addresses":
It is RECOMMENDED that the candidate source addresses be the set
of unicast addresses assigned to the interface that will be used
to send to the destination (the "outgoing" interface).
Add a sysctl to enable this behaviour.
Signed-off-by: Erik Kline <ek@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch implements the procfs logic for the stable_address knob:
The secret is formatted as an ipv6 address and will be stored per
interface and per namespace. We track initialized flag and return EIO
errors until the secret is set.
We don't inherit the secret to newly created namespaces.
Cc: Erik Kline <ek@google.com>
Cc: Fernando Gont <fgont@si6networks.com>
Cc: Lorenzo Colitti <lorenzo@google.com>
Cc: YOSHIFUJI Hideaki/吉藤英明 <hideaki.yoshifuji@miraclelinux.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull IPv6 cork initialization into its own function that
can be re-used. IPv6 specific cork data did not have an
explicit data structure. This patch creates one so that
just ipv6 cork data can be passed as arguments. Also, since
IPv6 tries to save the flow label into the inet_cork_full
structure, pass the full cork.
Adjust ip6_cork_release() to take cork data structures.
Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The kernel forcefully applies MTU values received in router
advertisements provided the new MTU is less than the current. This
behavior is undesirable when the user space is managing the MTU. Instead
a sysctl flag 'accept_ra_mtu' is introduced such that the user space
can control whether or not RA provided MTU updates should be applied. The
default behavior is unchanged; user space must explicitly set this flag
to 0 for RA MTUs to be ignored.
Signed-off-by: Harout Hedeshian <harouth@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add a sysctl that causes an interface's optimistic addresses
to be considered equivalent to other non-deprecated addresses
for source address selection purposes. Preferred addresses
will still take precedence over optimistic addresses, subject
to other ranking in the source address selection algorithm.
This is useful where different interfaces are connected to
different networks from different ISPs (e.g., a cell network
and a home wifi network).
The current behaviour complies with RFC 3484/6724, and it
makes sense if the host has only one interface, or has
multiple interfaces on the same network (same or cooperating
administrative domain(s)), but not in the multiple distinct
networks case.
For example, if a mobile device has an IPv6 address on an LTE
network and then connects to IPv6-enabled wifi, while the wifi
IPv6 address is undergoing DAD, IPv6 connections will try to use
the wifi default route with the LTE IPv6 address, and will get
stuck until they time out.
Also, because optimistic nodes can receive frames, issue
an RTM_NEWADDR as soon as DAD starts (with the IFA_F_OPTIMISTIC
flag appropriately set). A second RTM_NEWADDR is sent if DAD
completes (the address flags have changed), otherwise an
RTM_DELADDR is sent.
Also: add an entry in ip-sysctl.txt for optimistic_dad.
Signed-off-by: Erik Kline <ek@google.com>
Acked-by: Lorenzo Colitti <lorenzo@google.com>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Automatically generate flow labels for IPv6 packets on transmit.
The flow label is computed based on skb_get_hash. The flow label will
only automatically be set when it is zero otherwise (i.e. flow label
manager hasn't set one). This supports the transmit side functionality
of RFC 6438.
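A simplified model of the rule above, assuming a 32-bit flow hash is already available. `flow_hash` is a stand-in for skb_get_hash built on CRC32 purely for illustration, and `make_flowlabel` is a hypothetical name; the kernel's actual computation may differ in detail.

```python
import zlib

FLOWLABEL_MASK = 0x000FFFFF  # the IPv6 flow label is 20 bits wide


def flow_hash(src: str, dst: str, sport: int, dport: int) -> int:
    """Stand-in for the kernel's flow hash: any stable 32-bit hash of the flow."""
    key = f"{src}|{dst}|{sport}|{dport}".encode()
    return zlib.crc32(key) & 0xFFFFFFFF


def make_flowlabel(flowlabel: int, hash32: int, auto_enabled: bool) -> int:
    """Derive a label from the hash only if none is set and auto mode is on."""
    if flowlabel != 0 or not auto_enabled:
        # A label set by the flow label manager is never overwritten.
        return flowlabel
    return hash32 & FLOWLABEL_MASK
```

Because the label is derived from the flow hash, every packet of one flow carries the same label, which is what lets routers use it for load balancing.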
Added an IPv6 sysctl auto_flowlabels to enable/disable this behavior
system wide, and added IPV6_AUTOFLOWLABEL socket option to enable this
functionality per socket.
By default, auto flowlabels are disabled to avoid possible conflicts
with the flow label manager; however, if this feature proves useful we
may want to enable it by default.
It should also be noted that FreeBSD has already implemented automatic
flow labels (including the sysctl and socket option). In FreeBSD,
automatic flow labels default to enabled.
Performance impact:
Running super_netperf with 200 flows for TCP_RR and UDP_RR for
IPv6. Note that in the UDP case, __skb_get_hash will be called for
every packet, which explains the slight regression. In the TCP case
the hash is saved in the socket so there is no regression.
Automatic flow labels disabled:
  TCP_RR:
    86.53% CPU utilization
    127/195/322 90/95/99% latencies
    1.40498e+06 tps
  UDP_RR:
    90.70% CPU utilization
    118/168/243 90/95/99% latencies
    1.50309e+06 tps
Automatic flow labels enabled:
  TCP_RR:
    85.90% CPU utilization
    128/199/337 90/95/99% latencies
    1.40051e+06 tps
  UDP_RR:
    92.61% CPU utilization
    115/164/236 90/95/99% latencies
    1.4687e+06 tps
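To put the throughput figures above in relative terms, the change can be computed directly. This is plain arithmetic on the reported numbers, assuming the last line of each block is transactions per second in every case; `relative_change` is a helper name introduced here.

```python
def relative_change(before: float, after: float) -> float:
    """Percentage change from 'before' to 'after' (negative means slower)."""
    return (after - before) / before * 100.0


# Transactions per second with automatic flow labels disabled vs. enabled.
tcp_change = relative_change(1.40498e6, 1.40051e6)
udp_change = relative_change(1.50309e6, 1.4687e6)

print(f"TCP_RR: {tcp_change:+.2f}%")
print(f"UDP_RR: {udp_change:+.2f}%")
```

This works out to roughly a 0.3% drop for TCP_RR and a 2.3% drop for UDP_RR, in line with the note that only the UDP path pays for a per-packet hash.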
Signed-off-by: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When a UDP application switches from AF_INET to AF_INET6 sockets, we
have a small performance degradation for IPv4 communications because of
extra cache line misses to access ipv6only information.
This can also be noticed for TCP listeners, as ipv6_only_sock() is also
used from __inet_lookup_listener()->compute_score().
This is magnified when SO_REUSEPORT is used.
Move ipv6only into struct sock_common so that it is available at
no extra cost in lookups.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This can be used in virtual networking applications, and
may have other uses as well. The option is disabled by
default.
A specific use case is setting up virtual routers, bridges, and
hosts on a single OS without the use of network namespaces or
virtual machines. With proper use of ip rules, routing tables,
veth interface pairs and/or other virtual interfaces,
and applications that can bind to interfaces and/or IP addresses,
it is possible to create one or more virtual routers with multiple
hosts attached. The host interfaces can act as IPv6 systems,
with radvd running on the ports in the virtual routers. With the
option provided in this patch enabled, those hosts can now properly
obtain IPv6 addresses from the radvd.
Signed-off-by: Ben Greear <greearb@candelatech.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Since pktopts is used for IPv6 only and opts is used for IPv4
only, we can move these fields into a union. This allows us to drop
the inet6_reqsk_alloc function, as after this change it becomes
equivalent to inet_reqsk_alloc.
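The space effect of sharing two mutually exclusive pointers can be illustrated with ctypes. This is a sketch, not the kernel's structure definition: the class names are made up, and both fields are modelled as untyped pointers.

```python
import ctypes


class SeparateFields(ctypes.Structure):
    # Before: one pointer per address family, only one of them ever used.
    _fields_ = [("opts", ctypes.c_void_p), ("pktopts", ctypes.c_void_p)]


class SharedFields(ctypes.Union):
    # After: both names refer to the same storage.
    _fields_ = [("opts", ctypes.c_void_p), ("pktopts", ctypes.c_void_p)]


pointer_size = ctypes.sizeof(ctypes.c_void_p)
print(ctypes.sizeof(SeparateFields) // pointer_size, "pointer slots before")
print(ctypes.sizeof(SharedFields) // pointer_size, "pointer slot after")
```

Since the two fields now occupy the same storage, a single allocation and initialisation path can serve both families.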
This patch also fixes a kmemcheck issue in the IPv6 stack: the flags
field was not annotated after a request_sock was allocated.
Signed-off-by: Octavian Purdila <octavian.purdila@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We currently don't report IPV6_RECVPKTINFO in cmsg access ancillary data
for IPv4 datagrams on IPv6 sockets.
This patch splits the ip6_datagram_recv_ctl into two functions, one
which handles both protocol families, AF_INET and AF_INET6, while the
ip6_datagram_recv_specific_ctl only handles IPv6 cmsg data.
ip6_datagram_recv_*_ctl never reported back any errors, so we can make
them return void. Also provide a helper for protocols which don't offer dual
personality to further use ip6_datagram_recv_ctl, which is exported to
modules.
I needed to shuffle the code for ping around a bit to make it easier to
implement dual personality for ping ipv6 sockets in future.
Reported-by: Gert Doering <gert@space.net>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
With this option, the socket will reply with the flow label value read
on received packets.
The goal is to have a connection with the same flow label in both
directions of the communication.
Changelog of V4:
* Do not erase the flow label on the listening socket. Use pktopts to
store the received value
Signed-off-by: Florent Fourcot <florent.fourcot@enst-bretagne.fr>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
IPV6_PMTU_INTERFACE is the same as IPV6_PMTU_PROBE for ipv6. Add it
nonetheless for symmetry with IPv4 sockets. Also drop incoming MTU
information if this mode is enabled.
The additional bit in ipv6_pinfo just eats in the padding behind the
bitfield. There are no changes to the layout of the struct at all.
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
The current implementation of IPV6_FLOWINFO only gives a
result if pktoptions is available (thanks to the
ip6_datagram_recv_ctl function).
It gives inconsistent results to user space, sometimes
there is a result for getsockopt(IPV6_FLOWINFO), sometimes
not.
This patch adds rcv_flowinfo to store it, and returns it to
user space in the same way as other pkt_options.
Signed-off-by: Florent Fourcot <florent.fourcot@enst-bretagne.fr>
Reviewed-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
The code to detect fragments in checksum_setup() was missing for IPv4 and
too eager for IPv6. (It transpires that Windows seems to send IPv6 packets
with a fragment header even if they are not a fragment - i.e. offset is zero,
and M bit is not set).
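The distinction above can be sketched with the standard header layouts. This is not the driver's checksum_setup() code: the function names are illustrative, and only the fragment fields are inspected. An IPv6 fragment header whose offset is zero and whose M bit is clear does not describe a real fragment.

```python
import struct

IP_MF = 0x2000       # IPv4 "more fragments" flag
IP_OFFSET = 0x1FFF   # IPv4 fragment offset (13 bits)
IP6_MF = 0x0001      # IPv6 "M" flag
IP6_OFFSET = 0xFFF8  # IPv6 fragment offset (upper 13 bits)


def ipv4_is_fragment(header: bytes) -> bool:
    """True if the IPv4 header describes a fragment (DF alone does not count)."""
    (frag_off,) = struct.unpack_from("!H", header, 6)
    return (frag_off & (IP_MF | IP_OFFSET)) != 0


def ipv6_frag_header_is_fragment(frag_hdr: bytes) -> bool:
    """True only if the 8-byte fragment header has a non-zero offset or M set."""
    (frag_off,) = struct.unpack_from("!H", frag_hdr, 2)
    return (frag_off & (IP6_MF | IP6_OFFSET)) != 0
```

A header built with `struct.pack("!BBHI", 6, 0, 0, 0x1234)` (next header TCP, offset 0, M clear) is therefore reported as not a fragment, so the packet can still be handled as a whole datagram.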
This patch also incorporates a fix to callers of maybe_pull_tail() where
skb->network_header was being erroneously added to the length argument.
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Signed-off-by: Zoltan Kiss <zoltan.kiss@citrix.com>
Cc: Wei Liu <wei.liu2@citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: David Vrabel <david.vrabel@citrix.com>
cc: David Miller <davem@davemloft.net>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The code for privacy extensions is very mature, and making it
configurable only gives marginal memory/code savings in exchange
for obfuscation and hard-to-read code via CPP ifdef'ery.
Signed-off-by: David S. Miller <davem@davemloft.net>
TCP listener refactoring, part 5:
We want to be able to insert request sockets (SYN_RECV) into main
ehash table instead of the per listener hash table to allow RCU
lookups and remove listener lock contention.
This patch includes the needed struct sock_common in front
of struct request_sock.
This means there is no more inet6_request_sock IPv6 specific
structure.
The following inet_request_sock fields were renamed, as they became
macros that reference fields in struct sock_common.
The prefix ir_ was chosen to avoid name collisions.
loc_port -> ir_loc_port
loc_addr -> ir_loc_addr
rmt_addr -> ir_rmt_addr
rmt_port -> ir_rmt_port
iif -> ir_iif
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>