group.of.nepali.translators team mailing list archive

[Bug 1772675] Re: i40e PF reset due to incorrect MDD event

 

This bug was fixed in the package linux - 4.15.0-141.145

---------------
linux (4.15.0-141.145) bionic; urgency=medium

  * bionic/linux: 4.15.0-141.145 -proposed tracker (LP: #1919536)

  * binary assembly failures with CONFIG_MODVERSIONS present (LP: #1919315)
    - [Packaging] quiet (nominally) benign errors in BUILD script

  * selftests: bpf verifier fails after sanitize_ptr_alu fixes (LP: #1920995)
    - bpf: Simplify alu_limit masking for pointer arithmetic
    - bpf: Add sanity check for upper ptr_limit
    - bpf, selftests: Fix up some test_verifier cases for unprivileged

  * Packaging resync (LP: #1786013)
    - update dkms package versions

  * CVE-2018-13095
    - xfs: More robust inode extent count validation

  * i40e PF reset due to incorrect MDD event (LP: #1772675)
    - i40e: change behavior on PF in response to MDD event

  * Bionic update: upstream stable patchset 2021-03-09 (LP: #1918330)
    - ACPI: sysfs: Prefer "compatible" modalias
    - ARM: dts: imx6qdl-gw52xx: fix duplicate regulator naming
    - wext: fix NULL-ptr-dereference with cfg80211's lack of commit()
    - net: usb: qmi_wwan: added support for Thales Cinterion PLSx3 modem family
    - drivers: soc: atmel: Avoid calling at91_soc_init on non AT91 SoCs
    - drivers: soc: atmel: add null entry at the end of at91_soc_allowed_list[]
    - KVM: x86/pmu: Fix HW_REF_CPU_CYCLES event pseudo-encoding in
      intel_arch_events[]
    - KVM: x86: get smi pending status correctly
    - xen: Fix XenStore initialisation for XS_LOCAL
    - leds: trigger: fix potential deadlock with libata
    - mt7601u: fix kernel crash unplugging the device
    - mt7601u: fix rx buffer refcounting
    - xen-blkfront: allow discard-* nodes to be optional
    - ARM: imx: build suspend-imx6.S with arm instruction set
    - netfilter: nft_dynset: add timeout extension to template
    - xfrm: Fix oops in xfrm_replay_advance_bmp
    - RDMA/cxgb4: Fix the reported max_recv_sge value
    - iwlwifi: pcie: use jiffies for memory read spin time limit
    - iwlwifi: pcie: reschedule in long-running memory reads
    - mac80211: pause TX while changing interface type
    - can: dev: prevent potential information leak in can_fill_info()
    - x86/entry/64/compat: Preserve r8-r11 in int $0x80
    - x86/entry/64/compat: Fix "x86/entry/64/compat: Preserve r8-r11 in int $0x80"
    - iommu/vt-d: Gracefully handle DMAR units with no supported address widths
    - iommu/vt-d: Don't dereference iommu_device if IOMMU_API is not built
    - NFC: fix resource leak when target index is invalid
    - NFC: fix possible resource leak
    - team: protect features update by RCU to avoid deadlock
    - tcp: fix TLP timer not set when CA_STATE changes from DISORDER to OPEN
    - kernel: kexec: remove the lock operation of system_transition_mutex
    - PM: hibernate: flush swap writer after marking
    - pNFS/NFSv4: Fix a layout segment leak in pnfs_layout_process()
    - net/mlx5: Fix memory leak on flow table creation error flow
    - rxrpc: Fix memory leak in rxrpc_lookup_local
    - net: dsa: bcm_sf2: put device node before return
    - ibmvnic: Ensure that CRQ entry read are correctly ordered
    - ACPI: thermal: Do not call acpi_thermal_check() directly
    - net_sched: gen_estimator: support large ewma log
    - phy: cpcap-usb: Fix warning for missing regulator_disable
    - x86: __always_inline __{rd,wr}msr()
    - scsi: scsi_transport_srp: Don't block target in failfast state
    - scsi: libfc: Avoid invoking response handler twice if ep is already
      completed
    - mac80211: fix fast-rx encryption check
    - scsi: ibmvfc: Set default timeout to avoid crash during migration
    - objtool: Don't fail on missing symbol table
    - kthread: Extract KTHREAD_IS_PER_CPU
    - workqueue: Restrict affinity change to rescuer
    - USB: serial: cp210x: add pid/vid for WSDA-200-USB
    - USB: serial: cp210x: add new VID/PID for supporting Teraoka AD2000
    - USB: serial: option: Adding support for Cinterion MV31
    - arm64: dts: ls1046a: fix dcfg address range
    - net: lapb: Copy the skb before sending a packet
    - elfcore: fix building with clang
    - USB: gadget: legacy: fix an error code in eth_bind()
    - USB: usblp: don't call usb_set_interface if there's a single alt
    - usb: dwc2: Fix endpoint direction check in ep_from_windex
    - ovl: fix dentry leak in ovl_get_redirect
    - mac80211: fix station rate table updates on assoc
    - kretprobe: Avoid re-registration of the same kretprobe earlier
    - xhci: fix bounce buffer usage for non-sg list case
    - cifs: report error instead of invalid when revalidating a dentry fails
    - smb3: Fix out-of-bounds bug in SMB2_negotiate()
    - mmc: core: Limit retries when analyse of SDIO tuples fails
    - nvme-pci: avoid the deepest sleep state on Kingston A2000 SSDs
    - ARM: footbridge: fix dc21285 PCI configuration accessors
    - mm: hugetlbfs: fix cannot migrate the fallocated HugeTLB page
    - mm: hugetlb: fix a race between isolating and freeing page
    - mm: hugetlb: remove VM_BUG_ON_PAGE from page_huge_active
    - mm: thp: fix MADV_REMOVE deadlock on shmem THP
    - x86/build: Disable CET instrumentation in the kernel
    - x86/apic: Add extra serialization for non-serializing MSRs
    - Input: xpad - sync supported devices with fork on GitHub
    - iommu/vt-d: Do not use flush-queue when caching-mode is on
    - net: dsa: mv88e6xxx: override existent unicast portvec in port_fdb_add
    - net: mvpp2: TCAM entry enable should be written after SRAM data
    - memblock: do not start bottom-up allocations with kernel_end
    - usb: renesas_usbhs: Clear pipe running flag in usbhs_pkt_pop()
    - genirq/msi: Activate Multi-MSI early when MSI_FLAG_ACTIVATE_EARLY is set
    - KVM: SVM: Treat SVM as unsupported when running as an SEV guest
    - md: Set prev_flush_start and flush_bio in an atomic way
    - net: ip_tunnel: fix mtu calculation
    - block: fix NULL pointer dereference in register_disk
    - remoteproc: qcom_q6v5_mss: Validate modem blob firmware size before load
    - remoteproc: qcom_q6v5_mss: Validate MBA firmware size before load
    - af_key: relax availability checks for skb size calculation
    - pNFS/NFSv4: Try to return invalid layout in pnfs_layout_process()
    - iwlwifi: mvm: take mutex for calling iwl_mvm_get_sync_time()
    - iwlwifi: pcie: add a NULL check in iwl_pcie_txq_unmap
    - iwlwifi: mvm: guard against device removal in reprobe
    - SUNRPC: Move simple_get_bytes and simple_get_netobj into private header
    - SUNRPC: Handle 0 length opaque XDR object data properly
    - lib/string: Add strscpy_pad() function
    - include/trace/events/writeback.h: fix -Wstringop-truncation warnings
    - memcg: fix a crash in wb_workfn when a device disappears
    - blk-mq: don't hold q->sysfs_lock in blk_mq_map_swqueue
    - squashfs: add more sanity checks in id lookup
    - squashfs: add more sanity checks in inode lookup
    - squashfs: add more sanity checks in xattr id lookup

  * SRU: Add FUA support for XFS (LP: #1917918)
    - block: add blk_queue_fua() helper function
    - xfs: move generic_write_sync calls inwards
    - iomap: iomap_dio_rw() handles all sync writes
    - iomap: Use FUA for pure data O_DSYNC DIO writes

  * CVE-2021-3348
    - nbd: freeze the queue while we're adding connections

  * Bionic kernel 4.15.0-136 causes dosemu2 (with kvm mode) freezes due to lack
    of KVM patch (LP: #1917138)
    - KVM: x86: handle !lapic_in_kernel case in kvm_cpu_*_extint

  * switch LRM to be signed using the Ubuntu Drivers signing key (LP: #1917034)
    - [Packaging] sync dkms-build to updated API

  * Bionic update: upstream stable patchset 2021-02-26 (LP: #1917093)
    - i2c: bpmp-tegra: Ignore unknown I2C_M flags
    - ALSA: seq: oss: Fix missing error check in snd_seq_oss_synth_make_info()
    - ALSA: hda/via: Add minimum mute flag
    - ACPI: scan: Make acpi_bus_get_device() clear return pointer on error
    - mmc: sdhci-xenon: fix 1.8v regulator stabilization
    - dm: avoid filesystem lookup in dm_get_dev_t()
    - drm/atomic: put state on error path
    - ASoC: Intel: haswell: Add missing pm_ops
    - scsi: ufs: Correct the LUN used in eh_device_reset_handler() callback
    - xen: Fix event channel callback via INTX/GSI
    - drm/nouveau/bios: fix issue shadowing expansion ROMs
    - drm/nouveau/privring: ack interrupts the same way as RM
    - drm/nouveau/i2c/gm200: increase width of aux semaphore owner fields
    - i2c: octeon: check correct size of maximum RECV_LEN packet
    - can: dev: can_restart: fix use after free bug
    - can: vxcan: vxcan_xmit: fix use after free bug
    - iio: ad5504: Fix setting power-down state
    - irqchip/mips-cpu: Set IPI domain parent chip
    - intel_th: pci: Add Alder Lake-P support
    - stm class: Fix module init return on allocation failure
    - ehci: fix EHCI host controller initialization sequence
    - USB: ehci: fix an interrupt calltrace error
    - usb: udc: core: Use lock when write to soft_connect
    - usb: bdc: Make bdc pci driver depend on BROKEN
    - [Config] updateconfigs for USB_BDC_PCI
    - xhci: make sure TRB is fully written before giving it to the controller
    - xhci: tegra: Delay for disabling LFPS detector
    - compiler.h: Raise minimum version of GCC to 5.1 for arm64
    - netfilter: rpfilter: mask ecn bits before fib lookup
    - sh: dma: fix kconfig dependency for G2_DMA
    - sh_eth: Fix power down vs. is_opened flag ordering
    - skbuff: back tiny skbs with kmalloc() in __netdev_alloc_skb() too
    - udp: mask TOS bits in udp_v4_early_demux()
    - ipv6: create multicast route with RTPROT_KERNEL
    - net_sched: avoid shift-out-of-bounds in tcindex_set_parms()
    - net: dsa: b53: fix an off by one in checking "vlan->vid"
    - gpio: mvebu: fix pwm .get_state period calculation
    - Revert "mm/slub: fix a memory leak in sysfs_slab_add()"
    - futex: Ensure the correct return value from futex_lock_pi()
    - futex: Replace pointless printk in fixup_owner()
    - futex: Provide and use pi_state_update_owner()
    - rtmutex: Remove unused argument from rt_mutex_proxy_unlock()
    - futex: Use pi_state_update_owner() in put_pi_state()
    - futex: Simplify fixup_pi_state_owner()
    - futex: Handle faults correctly for PI futexes
    - tracing: Fix race in trace_open and buffer resize call
    - fs: move I_DIRTY_INODE to fs.h
    - writeback: Drop I_DIRTY_TIME_EXPIRE
    - fs: fix lazytime expiration handling in __writeback_single_inode()
    - mmc: core: don't initialize block size from ext_csd if not present
    - scsi: qedi: Correct max length of CHAP secret
    - riscv: Fix kernel time_init()
    - HID: Ignore battery for Elan touchscreen on ASUS UX550
    - clk: tegra30: Add hda clock default rates to clock driver
    - drm/nouveau/mmu: fix vram heap sizing
    - scsi: megaraid_sas: Fix MEGASAS_IOC_FIRMWARE regression
    - can: peak_usb: fix use after free bugs
    - serial: mvebu-uart: fix tx lost characters at power off
    - driver core: Extend device_is_dependent()
    - net_sched: reject silly cell_log in qdisc_get_rtab()
    - tools: Factor HOSTCC, HOSTLD, HOSTAR definitions

  * Enforce CONFIG_DRM_BOCHS=m (LP: #1916290)
    - [Config] Enforce CONFIG_DRM_BOCHS=m

  * Please trust Canonical Livepatch Service kmod signing key (LP: #1898716)
    - [Config] enable CONFIG_MODVERSIONS=y
    - [Packaging] build canonical-certs.pem from branch/arch certs
    - [Config] add Canonical Livepatch Service key to SYSTEM_TRUSTED_KEYS
    - [Config] add ubuntu-drivers key to SYSTEM_TRUSTED_KEYS

 -- Kleber Sacilotto de Souza <kleber.souza@xxxxxxxxxxxxx>  Wed, 24 Mar 2021 18:47:50 +0100

** Changed in: linux (Ubuntu Bionic)
       Status: Fix Committed => Fix Released

** CVE added: https://cve.mitre.org/cgi-bin/cvename.cgi?name=2018-13095

** CVE added: https://cve.mitre.org/cgi-bin/cvename.cgi?name=2021-3348

-- 
You received this bug notification because you are a member of नेपाली
भाषा समायोजकहरुको समूह (the group.of.nepali.translators team), which is
subscribed to Xenial.
Matching subscriptions: Ubuntu 16.04 Bugs
https://bugs.launchpad.net/bugs/1772675

Title:
  i40e PF reset due to incorrect MDD event

Status in linux package in Ubuntu:
  Fix Released
Status in linux source package in Xenial:
  Fix Committed
Status in linux source package in Bionic:
  Fix Released
Status in linux source package in Cosmic:
  Won't Fix

Bug description:
  [Impact]
  The i40e driver sometimes triggers a Malicious Driver Detection (MDD) event in the firmware, which leads to a reset of the NIC. The reset interrupts network traffic at least temporarily and can cause further problems, e.g. if the interface is part of a bond.

  [Fix]
  MDD events raised for the PF are usually the result of a misconfigured TX descriptor rather than "bad" actions in the VFs. We therefore don't need to reset the whole NIC; the TX hang checks should handle those cases if necessary.

  [Test Procedure]
  The bug is unfortunately difficult to reproduce, as there's no detailed documentation on how the i40e firmware detects and raises MDDs. We have seen reports of this happening in Xenial and Bionic, for workloads stressing i40e bonds in LACP mode.
  A successful reproduction is easy to detect: network traffic is interrupted and the system logs contain a message like:
  i40e 0000:02:00.1: TX driver issue detected, PF reset issued
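
  As a minimal sketch of how the check above could be automated, the following Python helper scans log lines for that message and returns the PCI addresses that reported it. The function name and the regular expression are illustrative assumptions built only from the message text quoted above; other driver versions may word the message differently.

```python
import re

# Pattern derived from the log message quoted in this bug description.
# The PCI address (e.g. 0000:02:00.1) differs from system to system.
PF_RESET_RE = re.compile(
    r"i40e (?P<pci>[0-9a-fA-F:.]+): TX driver issue detected, PF reset issued"
)


def find_pf_resets(log_lines):
    """Return the PCI addresses of i40e devices that logged a PF reset.

    `log_lines` is any iterable of strings, e.g. the lines of dmesg output
    or of a saved kernel log. (Hypothetical helper, not part of the driver.)
    """
    hits = []
    for line in log_lines:
        match = PF_RESET_RE.search(line)
        if match:
            hits.append(match.group("pci"))
    return hits
```

  Feeding it the lines of a captured kernel log (for example, a file produced with dmesg) gives one entry per logged reset, so an empty result over a test run suggests the bug was not reproduced.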

  An alternative test procedure makes use of the kprobes attached to the LP bug. The test setup is as follows:
  - Create 2 VFs on primary NIC
  - Passthrough VF 1 to a Bionic VM
  - Start iperf3 client on VM, going through i40evf interface
  - Start another iperf3 client on host, going through i40e interface
  Both iperf3 clients should be using an external server located on a separate host. By loading the kprobe module while iperf3 is running, we should be able to raise MDDs more consistently. MDD behaviour can change according to firmware version, so we may need to try with different sets of probes. The one with the most consistent results seems to be 'corrupt_tx_desc_addr', which corrupts the cmd_type_offset_bsz field of the last TX descriptor before the NIC is notified of new data.
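
  The host-side part of this setup can be sketched in Python as follows. This is only an illustration under stated assumptions: the helper names are hypothetical, the interface name and server address are placeholders, VF creation is assumed to go through the standard sriov_numvfs sysfs attribute, and the VM passthrough and kprobe module loading steps are not covered.

```python
from pathlib import Path


def set_num_vfs(iface, num_vfs, sysfs_root="/sys/class/net"):
    """Request `num_vfs` SR-IOV virtual functions on `iface` (needs root).

    Writes to <sysfs_root>/<iface>/device/sriov_numvfs. The kernel refuses
    to change one non-zero VF count to another, so write 0 first if VFs
    already exist. `sysfs_root` is a parameter only to allow dry runs.
    """
    path = Path(sysfs_root) / iface / "device" / "sriov_numvfs"
    path.write_text(f"{num_vfs}\n")
    return path


def iperf3_client_cmd(server, seconds=60):
    """Build an iperf3 client command line targeting an external server."""
    return ["iperf3", "-c", server, "-t", str(seconds)]
```

  On the host, set_num_vfs("<primary NIC>", 2) would create the two VFs, and the list returned by iperf3_client_cmd can be passed to subprocess.run on both the host and the VM so that traffic flows while the probes are loaded.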

  [Regression Potential]
  Since we're removing resets for the NIC, regressions could show up as issues in connectivity after the MDD events are raised. If the firmware expects the whole NIC to reset, we could see TX/RX hangs and general unresponsiveness in networking. The potential for this should however be fairly low, as this patch has been present since kernel 5.2 and hasn't seen any fixes or regressions upstream. Basic smoke tests also showed that the driver continues working as expected, and that necessary PF resets will be issued by the netdev watchdog in case of any hung queues.

  ==
  [original description]

  This is a continuation of bug 1713553 and then bug 1723127. A patch
  was added under the first bug and then under the second to attempt to
  fix this. It may have reduced the issue but, based on further reports,
  appears not to have fixed it.

  See bug 1713553 and bug 1723127 for more details.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1772675/+subscriptions