kernel-packages team mailing list archive
Message #83285
[Bug 1377851] [NEW] Kernel panic skb_segment+0x5d7/0x980
Public bug reported:
On two Ubuntu 14.04 amd64 servers with tg3 NICs, each acting as an OpenVPN gateway, we recently had a lot of trouble with kernel panics (linux-image-3.13.0-36-generic 3.13.0-36.63).
The panics occurred at seemingly random times: sometimes during the boot process, sometimes a couple of hours later.
The boxes had been running perfectly fine for a couple of months before that. We believe some kind of "special" packet triggered the bug.
Potentially related bugs:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1331219
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1313591
Upgrading to linux-image-3.16.3-031603-generic
(3.16.3-031603.201409171435) solved the problem for us.
[ 6076.726520] BUG: unable to handle kernel NULL pointer dereference at 000000000000006c
[ 6076.737716] IP: [<ffffffff81616787>] skb_segment+0x5d7/0x980
[ 6076.745780] PGD 0
[ 6076.748641] Oops: 0000 [#1] SMP
[ 6076.753268] Modules linked in: btrfs ufs qnx4 hfsplus hfs minix ntfs msdos jfs xfs libcrc32c cdc_ether usbnet mii mpt3sas mpt2sas raid_class scsi_transport_sas mptctl mptbase ipmi_si ipmi_devintf dell_rbu gpio_ich intel_rapl x86_pkg_temp_thermal intel_powerclamp coretemp dcdbas kvm_intel kvm crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd 8021q garp stp mrp llc sb_edac edac_core shpchp joydev pl2303 usbserial lpc_ich wmi mei_me mei mac_hid acpi_power_meter ioatdma nf_conntrack dca lp parport raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq raid1 tg3 ahci hid_generic raid0 ptp usbhid multipath hid libahci pps_core linear [last unloaded: ipmi_si]
[ 6076.850743] CPU: 3 PID: 0 Comm: swapper/3 Not tainted 3.13.0-36-generic #63-Ubuntu
[ 6076.861485] Hardware name: Dell Inc. PowerEdge R620/0PXXHP, BIOS 1.6.0 03/07/2013
[ 6076.872104] task: ffff880223841800 ti: ffff880223848000 task.ti: ffff880223848000
[ 6076.882722] RIP: 0010:[<ffffffff81616787>] [<ffffffff81616787>] skb_segment+0x5d7/0x980
[ 6076.894243] RSP: 0018:ffff880227263790 EFLAGS: 00010246
[ 6076.901769] RAX: 0000000000000646 RBX: ffff88021f03f000 RCX: ffff8801ed4fff00
[ 6076.911893] RDX: 0000000000000646 RSI: 00000000000000c2 RDI: ffffea0007f6de00
[ 6076.971665] RBP: ffff880227263858 R08: 000000000000fff6 R09: 0000000000000001
[ 6077.031463] R10: ffff88021f03e800 R11: 0000000000010552 R12: ffff8801fdb1fc80
[ 6077.091313] R13: 0000000000000001 R14: 0000000000000000 R15: 0000000000000646
[ 6077.151619] FS: 0000000000000000(0000) GS:ffff880227260000(0000) knlGS:0000000000000000
[ 6077.262424] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 6077.320083] CR2: 000000000000006c CR3: 0000000001c0e000 CR4: 00000000000407e0
[ 6077.379854] Stack:
[ 6077.433555] ffffffff811a48f9 ffff8802272772c0 000000000000fff6 ffffffffffff000a
[ 6077.546280] ffffffff00010552 000000000000006a ffff88021f03e800 0000000100000020
[ 6077.658915] ffffffffffffffe4 0000000000010012 0000001c0000055c ffff88021f03f000
[ 6077.771400] Call Trace:
[ 6077.824494] <IRQ>
[ 6077.827251] [<ffffffff811a48f9>] ? __kmalloc_node_track_caller+0xb9/0x290
[ 6077.933211] [<ffffffff8168149d>] tcp_gso_segment+0x10d/0x3f0
[ 6077.988641] [<ffffffff81691822>] inet_gso_segment+0x132/0x360
[ 6078.043154] [<ffffffff810a5db2>] ? enqueue_task_fair+0x422/0x6c0
[ 6078.097358] [<ffffffff81623ffc>] skb_mac_gso_segment+0x9c/0x180
[ 6078.150464] [<ffffffff816a0fb4>] gre_gso_segment+0x134/0x370
[ 6078.202321] [<ffffffff8109828d>] ? ttwu_do_activate.constprop.74+0x5d/0x70
[ 6078.255348] [<ffffffff81691822>] inet_gso_segment+0x132/0x360
[ 6078.306129] [<ffffffff8109a800>] ? try_to_wake_up+0x240/0x2c0
[ 6078.355712] [<ffffffff81623ffc>] skb_mac_gso_segment+0x9c/0x180
[ 6078.404660] [<ffffffff8162413d>] __skb_gso_segment+0x5d/0xb0
[ 6078.452918] [<ffffffff8162444a>] dev_hard_start_xmit+0x18a/0x560
[ 6078.501057] [<ffffffff8164360e>] sch_direct_xmit+0xee/0x1c0
[ 6078.548821] [<ffffffff81624a50>] __dev_queue_xmit+0x230/0x500
[ 6078.596793] [<ffffffff81624d30>] dev_queue_xmit+0x10/0x20
[ 6078.644041] [<ffffffff8162be31>] neigh_direct_output+0x11/0x20
[ 6078.691822] [<ffffffff8165d370>] ip_finish_output+0x1b0/0x3b0
[ 6078.739211] [<ffffffff8165e8d8>] ip_output+0x58/0x90
[ 6078.784448] [<ffffffff8165a84b>] ip_forward_finish+0x8b/0x170
[ 6078.830211] [<ffffffff8165ac85>] ip_forward+0x355/0x410
[ 6078.874484] [<ffffffff8165899d>] ip_rcv_finish+0x7d/0x350
[ 6078.918046] [<ffffffff816592e8>] ip_rcv+0x298/0x3d0
[ 6078.959829] [<ffffffff81622bb6>] __netif_receive_skb_core+0x666/0x840
[ 6079.003064] [<ffffffff8101b200>] ? flush_ptrace_hw_breakpoint+0x30/0x60
[ 6079.045844] [<ffffffff81622da8>] __netif_receive_skb+0x18/0x60
[ 6079.086823] [<ffffffff81622e13>] netif_receive_skb+0x23/0x90
[ 6079.126412] [<ffffffff81622f24>] napi_gro_complete+0xa4/0xe0
[ 6079.164669] [<ffffffff816234a0>] dev_gro_receive+0x210/0x2d0
[ 6079.203193] [<ffffffff816237e5>] napi_gro_receive+0x25/0xb0
[ 6079.242062] [<ffffffffa00d8c2b>] tg3_poll_work+0xc2b/0xf30 [tg3]
[ 6079.281003] [<ffffffffa00d8f6b>] tg3_poll_msix+0x3b/0x140 [tg3]
[ 6079.319178] [<ffffffff81623192>] net_rx_action+0x152/0x250
[ 6079.356843] [<ffffffff8106cbac>] __do_softirq+0xec/0x2c0
[ 6079.394099] [<ffffffff8106d0f5>] irq_exit+0x105/0x110
[ 6079.430844] [<ffffffff817312d6>] do_IRQ+0x56/0xc0
[ 6079.466888] [<ffffffff81726a6d>] common_interrupt+0x6d/0x6d
[ 6079.503660] <EOI>
[ 6079.506415] [<ffffffff815d11bf>] ? cpuidle_enter_state+0x4f/0xc0
[ 6079.573717] [<ffffffff815d12e9>] cpuidle_idle_call+0xb9/0x1f0
[ 6079.611376] [<ffffffff8101cede>] arch_cpu_idle+0xe/0x30
[ 6079.648243] [<ffffffff810bed95>] cpu_startup_entry+0xc5/0x290
[ 6079.685771] [<ffffffff81041018>] start_secondary+0x218/0x2c0
[ 6079.723121] Code: 4c 24 60 eb 21 0f 1f 80 00 00 00 00 41 83 c5 01 49 83 c4 10 48 83 c1 10 41 39 c3 0f 86 83 01 00 00 41 89 c7 89 c2 45 39 e9 7f 37 <41> 8b 46 6c 41 39 46 68 0f 85 75 03 00 00 45 8b a6 cc 00 00 00
[ 6079.844842] RIP [<ffffffff81616787>] skb_segment+0x5d7/0x980
[ 6079.885283] RSP <ffff880227263790>
[ 6079.922830] CR2: 000000000000006c
[ 6080.024035] ---[ end trace 6e658236aae2d239 ]---
[ 6080.065884] Kernel panic - not syncing: Fatal exception in interrupt
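As a cross-check of the oops above, the faulting instruction can be decoded by hand. The marked bytes in the Code: line, `<41> 8b 46 6c`, are a 32-bit load of the form mov eax, [r14+0x6c], and R14 is 0 in the register dump, so the access lands at 0x6c, which is the address reported in CR2. The following Python sketch redoes that arithmetic on a shortened excerpt of the oops. It is a minimal illustration, not a general disassembler: it only understands an optional REX prefix followed by opcode 0x8b with an 8-bit displacement, and the helper names (parse_registers, faulting_bytes, decode_mov_load) are made up for this example.

```python
import re

# Shortened excerpt copied from the oops above (timestamps dropped).
OOPS = """\
R13: 0000000000000001 R14: 0000000000000000 R15: 0000000000000646
CR2: 000000000000006c CR3: 0000000001c0e000 CR4: 00000000000407e0
Code: 45 39 e9 7f 37 <41> 8b 46 6c 41 39 46 68 0f 85 75 03 00 00
"""

# x86-64 register numbering, using the names the oops prints (R08, R09).
REG_NAMES = ["RAX", "RCX", "RDX", "RBX", "RSP", "RBP", "RSI", "RDI",
             "R08", "R09", "R10", "R11", "R12", "R13", "R14", "R15"]


def parse_registers(text):
    """Collect 'NAME: 16-hex-digit value' pairs from the register dump."""
    pairs = re.findall(r"\b([A-Z][A-Z0-9]{2}): ([0-9a-f]{16})\b", text)
    return {name: int(value, 16) for name, value in pairs}


def faulting_bytes(text):
    """Return the code bytes starting at the one marked <..> in 'Code:'."""
    code = re.search(r"Code: (.*)", text).group(1).split()
    start = next(i for i, b in enumerate(code) if b.startswith("<"))
    return [int(b.strip("<>"), 16) for b in code[start:]]


def decode_mov_load(insn):
    """Decode [REX] 8b /r with mod=01 (base register + disp8) only."""
    rex, pos = 0, 0
    if 0x40 <= insn[0] <= 0x4F:          # optional REX prefix
        rex, pos = insn[0], 1
    opcode, modrm = insn[pos], insn[pos + 1]
    mod, rm = modrm >> 6, modrm & 7
    if opcode != 0x8B or mod != 0b01 or rm == 0b100:
        raise ValueError("instruction form not handled by this sketch")
    disp = insn[pos + 2]
    if disp >= 0x80:                     # disp8 is sign-extended
        disp -= 0x100
    base = REG_NAMES[rm | ((rex & 1) << 3)]  # REX.B extends the base register
    return base, disp


regs = parse_registers(OOPS)
base, disp = decode_mov_load(faulting_bytes(OOPS))
fault_address = (regs[base] + disp) & (2**64 - 1)
print(f"{base} + {disp:#x} = {fault_address:#x}, CR2 = {regs['CR2']:#x}")
```

The computed address matching CR2 is consistent with a NULL structure pointer held in R14 being dereferenced at offset 0x6c inside skb_segment. Which structure and field that is cannot be read off the oops alone.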
ethtool -k port1
Features for port1:
rx-checksumming: on
tx-checksumming: on
tx-checksum-ipv4: on
tx-checksum-ip-generic: off [fixed]
tx-checksum-ipv6: on
tx-checksum-fcoe-crc: off [fixed]
tx-checksum-sctp: off [fixed]
scatter-gather: on
tx-scatter-gather: on
tx-scatter-gather-fraglist: off [fixed]
tcp-segmentation-offload: on
tx-tcp-segmentation: on
tx-tcp-ecn-segmentation: on
tx-tcp6-segmentation: on
udp-fragmentation-offload: off [fixed]
generic-segmentation-offload: on
generic-receive-offload: on
large-receive-offload: off [fixed]
rx-vlan-offload: on [fixed]
tx-vlan-offload: on [fixed]
ntuple-filters: off [fixed]
receive-hashing: off [fixed]
highdma: on
rx-vlan-filter: off [fixed]
vlan-challenged: off [fixed]
tx-lockless: off [fixed]
netns-local: off [fixed]
tx-gso-robust: off [fixed]
tx-fcoe-segmentation: off [fixed]
tx-gre-segmentation: off [fixed]
tx-ipip-segmentation: off [fixed]
tx-sit-segmentation: off [fixed]
tx-udp_tnl-segmentation: off [fixed]
tx-mpls-segmentation: off [fixed]
fcoe-mtu: off [fixed]
tx-nocache-copy: on
loopback: off [fixed]
rx-fcs: off [fixed]
rx-all: off [fixed]
tx-vlan-stag-hw-insert: off [fixed]
rx-vlan-stag-hw-parse: off [fixed]
rx-vlan-stag-filter: off [fixed]
l2-fwd-offload: off [fixed]
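To make the feature dump above easier to scan, here is a small Python sketch that parses a shortened copy of the `ethtool -k` output and lists the offloads that are switched on and not marked [fixed]. The helper name parse_features and the shortened sample are assumptions made for illustration. Of the features it reports, generic-receive-offload and generic-segmentation-offload correspond to the napi_gro_receive and __skb_gso_segment frames in the call trace, while tx-gre-segmentation is off [fixed]. The sketch only reads the settings; it does not establish which of them is involved in the panic.

```python
# Shortened copy of the `ethtool -k port1` output from the report.
ETHTOOL_OUTPUT = """\
Features for port1:
rx-checksumming: on
scatter-gather: on
tcp-segmentation-offload: on
udp-fragmentation-offload: off [fixed]
generic-segmentation-offload: on
generic-receive-offload: on
large-receive-offload: off [fixed]
tx-gre-segmentation: off [fixed]
"""


def parse_features(text):
    """Map feature name -> (enabled, fixed) from `ethtool -k` style output."""
    features = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        value = value.strip()
        if not sep or not value:
            continue  # header line such as "Features for port1:"
        state, _, flag = value.partition(" ")
        features[name.strip()] = (state == "on", flag == "[fixed]")
    return features


features = parse_features(ETHTOOL_OUTPUT)

# Features that are on and can be toggled by the administrator.
changeable_on = [name for name, (enabled, fixed) in features.items()
                 if enabled and not fixed]

for name in changeable_on:
    print(name)
```

Run against the full output from the report instead of the shortened sample, the same function would also pick up the checksum and VLAN settings.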
** Affects: linux (Ubuntu)
Importance: Undecided
Status: New
--
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1377851
Title:
Kernel panic skb_segment+0x5d7/0x980
Status in “linux” package in Ubuntu:
New
To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1377851/+subscriptions
Follow ups
- [Bug 1377851] Re: Kernel panic skb_segment+0x5d7/0x980
  From: Launchpad Bug Tracker, 2014-10-29
- [Bug 1377851] Re: Kernel panic skb_segment+0x5d7/0x980
  From: Frederik Kriewitz, 2014-10-27
- [Bug 1377851] Re: Kernel panic skb_segment+0x5d7/0x980
  From: Frederik Kriewitz, 2014-10-22
- [Bug 1377851] Re: Kernel panic skb_segment+0x5d7/0x980
  From: Chris J Arges, 2014-10-21
- [Bug 1377851] Re: Kernel panic skb_segment+0x5d7/0x980
  From: Brad Figg, 2014-10-16
- [Bug 1377851] Re: Kernel panic skb_segment+0x5d7/0x980
  From: Launchpad Bug Tracker, 2014-10-14
- [Bug 1377851] Re: Kernel panic skb_segment+0x5d7/0x980
  From: Launchpad Bug Tracker, 2014-10-10
- [Bug 1377851] Re: Kernel panic skb_segment+0x5d7/0x980
  From: Luis Henriques, 2014-10-09
- [Bug 1377851] Re: Kernel panic skb_segment+0x5d7/0x980
  From: Chris J Arges, 2014-10-07
- [Bug 1377851] Re: Kernel panic skb_segment+0x5d7/0x980
  From: Chris J Arges, 2014-10-07
- [Bug 1377851] Re: Kernel panic skb_segment+0x5d7/0x980
  From: Frederik Kriewitz, 2014-10-07
- [Bug 1377851] Re: Kernel panic skb_segment+0x5d7/0x980
  From: Joseph Salisbury, 2014-10-06
- [Bug 1377851] Re: Kernel panic skb_segment+0x5d7/0x980
  From: Frederik Kriewitz, 2014-10-06
- [Bug 1377851] Missing required logs.
  From: Brad Figg, 2014-10-06
- [Bug 1377851] [NEW] Kernel panic skb_segment+0x5d7/0x980
  From: Frederik Kriewitz, 2014-10-06