kernel-packages team mailing list archive
Message #74449
[Bug 1349768] Re: kernel 3.13.0-32 ipvs "IPv6 header not found" related to UDP socket sendto() EPERM errors
The utopic kernel appears to be affected as well. Booting the same
Ubuntu 14.04 machine into the Linux 3.16.0-6-generic #11-Ubuntu kernel
installed from
https://launchpad.net/ubuntu/+source/linux/3.16.0-6.11/+build/6217368
gives identical symptoms, i.e. dnsmasq-tftp stalls, dmesg reports "IPv6
header not found", stracing dnsmasq shows sendto() giving EPERM, and
unloading ip_vs allows the stalled tftp transfer to continue.
--
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1349768
Title:
kernel 3.13.0-32 ipvs "IPv6 header not found" related to UDP socket
sendto() EPERM errors
Status in “linux” package in Ubuntu:
Confirmed
Status in “linux” source package in Trusty:
In Progress
Status in “linux” source package in Utopic:
Confirmed
Bug description:
I have an Ubuntu 14.04 host that I am using as both a keepalived/ipvs
loadbalancer and dnsmasq server for pxebooting servers.
After updating linux-image from 3.13.0-29.53 to 3.13.0-32.57, I noticed
that dnsmasq-tftp stopped working. PXE boot clients would hang on the
"Loading ..../linux" TFTP transfer, stalling roughly 1000 blocks into
the transfer:
10:30:51.011728 IP 10.1.1.2.43540 > 10.1.12.1.49165: UDP, length 1412
10:30:51.011924 IP 10.1.12.1.49165 > 10.1.1.2.43540: UDP, length 4
10:30:51.012012 IP 10.1.1.2.43540 > 10.1.12.1.49165: UDP, length 1412
10:30:51.012183 IP 10.1.12.1.49165 > 10.1.1.2.43540: UDP, length 4
Stracing dnsmasq, I noticed something very odd: sendto() on the
socket(PF_INET, SOCK_DGRAM, IPPROTO_IP) would suddenly start
persistently returning EPERM mid-transfer, even as dnsmasq
continued to retry periodically:
select(18, [4 5 6 7 8 9 10 11 12 15 17], [], [], {0, 250000}) = 1 (in [17], left {0, 249834})
recvfrom(17, "\0\4\3\352", 4096, 0, NULL, NULL) = 4
lseek(16, 1410816, SEEK_SET) = 1410816
read(16, "\25\306\345f\2{\r\4)W\276\32\336q\252_\230q\213\341U\354\25\374k7\243\32\221X+\v"..., 1408) = 1408
sendto(17, "\0\3\3\353\25\306\345f\2{\r\4)W\276\32\336q\252_\230q\213\341U\354\25\374k7\243\32"..., 1412, 0, {sa_family=AF_INET, sin_port=htons(49165), sin_addr=inet_addr("10.1.11.3")}, 16) = 1412
select(18, [4 5 6 7 8 9 10 11 12 15 17], [], [], {0, 250000}) = 1 (in [17], left {0, 249839})
recvfrom(17, "\0\4\3\353", 4096, 0, NULL, NULL) = 4
lseek(16, 1412224, SEEK_SET) = 1412224
read(16, "*\360 <C\363l\320:\256~\307\236\26P\323\274%\260\362\341&\232\r\243\370\224\277\221\\\307\372"..., 1408) = 1408
sendto(17, "\0\3\3\354*\360 <C\363l\320:\256~\307\236\26P\323\274%\260\362\341&\232\r\243\370\224\277"..., 1412, 0, {sa_family=AF_INET, sin_port=htons(49165), sin_addr=inet_addr("10.1.11.3")}, 16) = -1 EPERM (Operation not permitted)
select(18, [4 5 6 7 8 9 10 11 12 15 17], [], [], {0, 250000}) = 0 (Timeout)
select(18, [4 5 6 7 8 9 10 11 12 15 17], [], [], {0, 250000}) = 0 (Timeout)
select(18, [4 5 6 7 8 9 10 11 12 15 17], [], [], {0, 250000}) = 0 (Timeout)
select(18, [4 5 6 7 8 9 10 11 12 15 17], [], [], {0, 250000}) = 0 (Timeout)
select(18, [4 5 6 7 8 9 10 11 12 15 17], [], [], {0, 250000}) = 0 (Timeout)
select(18, [4 5 6 7 8 9 10 11 12 15 17], [], [], {0, 250000}) = 0 (Timeout)
select(18, [4 5 6 7 8 9 10 11 12 15 17], [], [], {0, 250000}) = 0 (Timeout)
select(18, [4 5 6 7 8 9 10 11 12 15 17], [], [], {0, 250000}) = 0 (Timeout)
lseek(16, 1412224, SEEK_SET) = 1412224
read(16, "*\360 <C\363l\320:\256~\307\236\26P\323\274%\260\362\341&\232\r\243\370\224\277\221\\\307\372"..., 1408) = 1408
sendto(17, "\0\3\3\354*\360 <C\363l\320:\256~\307\236\26P\323\274%\260\362\341&\232\r\243\370\224\277"..., 1412, 0, {sa_family=AF_INET, sin_port=htons(49165), sin_addr=inet_addr("10.1.11.3")}, 16) = -1 EPERM (Operation not permitted)
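To read the trace above more easily, the first four bytes of each datagram can be decoded as a TFTP header. The following is a minimal sketch, not part of the original report: the opcode table follows RFC 1350, and the 1408-byte block size is an assumption inferred from the read() length in the trace. The helper names are hypothetical.

```python
import struct

# TFTP opcodes as defined in RFC 1350.
OPCODES = {1: "RRQ", 2: "WRQ", 3: "DATA", 4: "ACK", 5: "ERROR"}

# Assumption: 1408-byte blocks, matching the read() size in the strace output.
BLOCK_SIZE = 1408


def decode_tftp_header(packet: bytes):
    """Return (opcode name, block number) from the first four bytes."""
    opcode, block = struct.unpack("!HH", packet[:4])
    return OPCODES.get(opcode, "UNKNOWN"), block


def file_offset(block: int, block_size: int = BLOCK_SIZE) -> int:
    """Offset of the first byte carried by DATA block `block` (1-based)."""
    return (block - 1) * block_size


# Header bytes copied from the strace output (octal escapes preserved).
last_ack = b"\0\4\3\353"      # final recvfrom() before the failure
failed_data = b"\0\3\3\354"   # first sendto() that returned EPERM

print(decode_tftp_header(last_ack))
print(decode_tftp_header(failed_data))
print(file_offset(decode_tftp_header(failed_data)[1]))
```

Under these assumptions the last acknowledged block is 1003 and the first failing DATA packet is block 1004, whose offset agrees with the lseek() to 1412224 in the trace.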
This was with all iptables rules unloaded (so no OUTPUT -j DROP or
REJECT rule) and apparmor profiles torn down.
I also noticed the following dmesg message appearing at roughly the
same times as the tftp transfers getting stuck (although not coinciding
exactly with the stall):
[70325.516724] IPv6 header not found
The error pointed to ipvs (which I am using on the same host as an IPv4 NAT loadbalancer):
http://archive.linuxvirtualserver.org/html/lvs-devel/2012-08/msg00018.html
http://comments.gmane.org/gmane.comp.linux.lvs.devel/3614
I then tore down the ipvs rules (service keepalived stop) and unloaded
the modules (rmmod ip_vs_rr ip_vs), and the issue resolved itself -
the stalled dnsmasq-tftp transfer resumed!
This seems to be reproducible, i.e. modprobing ip_vs and starting
keepalived will cause dnsmasq-tftp to stall again, and stopping
keepalived and unloading the modules lets the transfer resume.
This happens reproducibly on boot with 3.13.0-32 and 3.13.0-30. It does
NOT seem to happen with 3.13.0-29, which I was using up until now.
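For anyone trying to reproduce this without a PXE client, the two observations above (ip_vs loaded, sendto() returning EPERM) could be checked with a small probe. This is only a sketch under stated assumptions: the function names are hypothetical, the module list is assumed to come from /proc/modules, and it is not confirmed that plain UDP sends outside of dnsmasq trigger the same failure.

```python
import errno
import socket


def loaded_modules(proc_modules_text: str) -> set:
    """Parse /proc/modules content; the module name is the first field."""
    return {
        line.split()[0]
        for line in proc_modules_text.splitlines()
        if line.strip()
    }


def count_eperm(sock, addr, payload: bytes, attempts: int = 10) -> int:
    """Send `attempts` datagrams and count how many fail with EPERM."""
    failures = 0
    for _ in range(attempts):
        try:
            sock.sendto(payload, addr)
        except OSError as exc:
            if exc.errno != errno.EPERM:
                raise  # unrelated error, do not mask it
            failures += 1
    return failures


def run_probe(target, payload_len: int = 1412) -> None:
    """Report whether ip_vs is loaded and how many sends hit EPERM.

    `target` is a hypothetical (host, port) tuple chosen by the tester.
    """
    with open("/proc/modules") as fh:
        print("ip_vs loaded:", "ip_vs" in loaded_modules(fh.read()))
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        print("EPERM failures:", count_eperm(sock, target, b"\0" * payload_len))
```

Calling run_probe() with a client address and port, once with ip_vs loaded and once after rmmod, would give a rough comparison. The payload length of 1412 simply mirrors the datagram size seen in the trace.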
---
AlsaDevices:
total 0
crw-rw---- 1 root audio 116, 1 Jul 29 13:43 seq
crw-rw---- 1 root audio 116, 33 Jul 29 13:43 timer
AplayDevices: Error: [Errno 2] No such file or directory
ApportVersion: 2.14.1-0ubuntu3.2
Architecture: amd64
ArecordDevices: Error: [Errno 2] No such file or directory
AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1:
CRDA: Error: [Errno 2] No such file or directory
DistroRelease: Ubuntu 14.04
HibernationDevice: RESUME=/dev/mapper/catcp2-swap
InstallationDate: Installed on 2014-06-03 (56 days ago)
InstallationMedia: Ubuntu-Server 14.04 LTS "Trusty Tahr" - Release amd64 (20140416.2)
MachineType: Dell Inc. PowerEdge R410
Package: linux-image-3.13.0-32-generic 3.13.0-32.57
PackageArchitecture: amd64
PciMultimedia:
ProcEnviron:
TERM=xterm
PATH=(custom, no user)
LANG=C.UTF-8
SHELL=/bin/bash
ProcFB:
ProcKernelCmdLine: BOOT_IMAGE=/vmlinuz-3.13.0-32-generic root=/dev/mapper/hostname-root ro console=ttyS1,115200n8 console=tty0 nomdmonddf nomdmonisw
ProcVersionSignature: Ubuntu 3.13.0-32.57-generic 3.13.11.4
RelatedPackageVersions:
linux-restricted-modules-3.13.0-32-generic N/A
linux-backports-modules-3.13.0-32-generic N/A
linux-firmware 1.127.5
RfKill: Error: [Errno 2] No such file or directory
Tags: trusty
Uname: Linux 3.13.0-32-generic x86_64
UpgradeStatus: No upgrade log present (probably fresh install)
UserGroups:
_MarkForUpload: True
dmi.bios.date: 07/30/2013
dmi.bios.vendor: Dell Inc.
dmi.bios.version: 1.12.0
dmi.board.name: 01V648
dmi.board.vendor: Dell Inc.
dmi.board.version: A03
dmi.chassis.type: 23
dmi.chassis.vendor: Dell Inc.
dmi.modalias: dmi:bvnDellInc.:bvr1.12.0:bd07/30/2013:svnDellInc.:pnPowerEdgeR410:pvr:rvnDellInc.:rn01V648:rvrA03:cvnDellInc.:ct23:cvr:
dmi.product.name: PowerEdge R410
dmi.sys.vendor: Dell Inc.
To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1349768/+subscriptions