canonical-ubuntu-qa team mailing list archive
-
canonical-ubuntu-qa team
-
Mailing list archive
-
Message #02171
[Bug 2009141] Re: subiquity fails to handle a large burst of udev events
** Tags removed: targetmilestone-inin---
** Tags added: targetmilestone-inin2204
--
You received this bug notification because you are a member of
Canonical's Ubuntu QA, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/2009141
Title:
subiquity fails to handle a large burst of udev events
Status in Ubuntu on IBM z Systems:
Fix Released
Status in subiquity package in Ubuntu:
Fix Released
Bug description:
Subiquity, when faced with a large burst of events, may fail to
process them in a timely fashion which can result in a traceback from
pyudev related to filled event buffers. For a reproducer, please see
the "many-partitions" attachment.
Related pyudev issue: https://github.com/pyudev/pyudev/issues/194
There also appears to be a udev-related issue on Mantic that makes
this issue more likely to be seen: LP: #2037569
---
Original bug description:
Bug Description:
Installation of Ubuntu 22.04 on s390x failed with an unknown error just after having successfully activated a zfcp HBA with Fibre-Channel-attached SCSI disks.
I do see 0.0.100d successfully being online and having paths attached:
zfcp-host
0.0.100d online ?
0x500173800cef0111:0x0000000000000000 sg16
0x500173800cef0111:0x0001000000000000 sdp sg17
0x500173800cef0111:0x0002000000000000 sdq sg18
[host.ilabg13_9.11.116.213_ubuntu-22.04_ssh_installer.txt lines 41-41/9373 byte 238696/1146881 21%]
But immediately after that, the installer reports an error:
An error occurred during installation
??????????????????????????????????????????????????????????????????????????
?subiquity/Early/apply_autoinstall_config
?subiquity/Reporting/apply_autoinstall_config
?subiquity/Error/apply_autoinstall_config
?subiquity/Userdata/apply_autoinstall_config
?subiquity/Package/apply_autoinstall_config
?subiquity/Debconf/apply_autoinstall_config
?subiquity/Kernel/apply_autoinstall_config
?subiquity/Late/apply_autoinstall_config ??????????????????????????????????????????????????????????????????????????
? Sorry, an unknown error occurred. ?
? Information is being collected from the system that will help the ?
? developers diagnose the report. |
It looks like there was some ascii art progress bar while data was
collected and then the dialog updated to:
? [ View full report ] ?
? If you want to help improve the installer, you can send an error ?
? report. ?
? [ Send to Canonical ] ?
? [ Close report ] |
Next "View full report" was selected?
ProblemType: Bug
Architecture: s390x
CrashDB: {'impl': 'launchpad', 'project': 'subiquity'}
CurrentDmesg:
[ 0.093997] Linux version 5.15.0-43-generic (buildd@bos02-s390x-005) (gcc (Ubuntu 11.2.0-19ubuntu1) 11.2.0, GNU ld (GNU Binutils for Ubuntu) 2.38) #46-Ubuntu SMP Tue Jul 12 12:40:17 UTC 2022 (Ubuntu 5.15.0-43.46-generic 5.15.39)
[ 0.094001] setup: Linux is running as a z/VM guest operating system in 64-bit mode
[ 0.360759] Kernel command line:
ip=9.11.116.213::9.11.116.1:255.255.255.0:ilabg13:ence0f:none:9.11.227.25
https://cdimage.ubuntu.com/releases/jammy/release/ubuntu-22.04.1-live-
server-s39
[ 2677.003438] zfcp 0.0.100d: qdio: ZFCP on SC 10 using AI:1 QEBSM:1 PRI:1 TDD:1 SIGA: W
[ 2677.032397] scsi host0: zfcp
[ 2677.060025] scsi 0:0:0:0: RAID IBM 2810XIV-LUN-0 0000 PQ: 0 ANSI: 5
[ 2677.061097] scsi 0:0:0:0: alua: disable for non-disk devices
[ 2677.061134] scsi 0:0:0:0: Attached scsi generic sg0 type 12
[ 2677.062988] scsi 0:0:0:1: Direct-Access IBM 2810XIV 0000 PQ: 0 ANSI: 5
[ 2677.064628] scsi 0:0:0:1: alua: supports implicit TPGS
[ 2677.064632] scsi 0:0:0:1: alua: device naa.6001738cfc900cef0000000000013596 port group 0 rel port 301
[ 2677.064679] sd 0:0:0:1: Attached scsi generic sg1 type 0
==> zfcp could successfully set 0.0.100d online and automatic LUN scan
worked.
[ 2677.513028] sd 0:0:16:1074020357: [sdac] Attached SCSI disk
[ 2677.522641] sd 0:0:16:1074020356: [sdab] Attached SCSI disk
[ 2677.522677] sd 0:0:15:1074020357: [sdaa] Attached SCSI disk
[ 2677.522693] sd 0:0:15:1074020356: [sdz] Attached SCSI disk
[ 2678.660373] device-mapper: multipath service-time: version 0.3.0 loaded
Date: Wed Jan 18 21:04:03 2023
DistroRelease: Ubuntu 22.04
ExecutablePath: /snap/subiquity/3699/lib/python3.8/site-packages/subiquity/cmd/server.py
InstallerServerLog:
2023-01-18 20:23:57,901 INFO subiquity:112 Starting Subiquity server revision 3699
2023-01-18 21:02:53,381 DEBUG root:39 start: subiquity/Zdev/GET:
2023-01-18 21:02:53,381 DEBUG subiquitycore.utils:64 run_command called: ['lszdev', '--quiet', '--pairs', '--columns', 'id,type,on,exists,pers,auto,failed,names']
2023-01-18 21:02:53,405 DEBUG subiquitycore.utils:77 run_command ['lszdev', '--quiet', '--pairs', '--columns', 'id,type,on,exists,pers,auto,failed,names'] exited with code 0
2023-01-18 21:02:53,406 DEBUG root:39 finish: subiquity/Zdev/GET: SUCCESS: 200 [{"id": "0.0.0009", "type": "generic-ccw", "on": true, "exists": true, "pers"...
2023-01-18 21:02:53,414 INFO aiohttp.access:233 [18/Jan/2023:21:02:53 +0000] "GET /zdev HTTP/1.1" 200 2059 "-" "Python/3.8 aiohttp/3.6.2"
2023-01-18 21:03:06,492 DEBUG subiquitycore.utils:64 run_command called: ['udevadm', 'settle', '-t', '0']
2023-01-18 21:03:06,534 DEBUG subiquitycore.utils:77 run_command ['udevadm', 'settle', '-t', '0'] exited with code 0
2023-01-18 21:03:06,534 DEBUG probert.network:585 event for addr_change: CHANGE {'ifindex': 2, 'flags': 768, 'family': 10, 'scope': 0, 'local': b'2002:90b:e006:116:76:54ff:fe00:5/64'}
2023-01-18 21:03:06,534 DEBUG probert.network:717 addr_change CHANGE {'ifindex': 2, 'flags': 768, 'family': 10, 'scope': 0, 'local': b'2002:90b:e006:116:76:54ff:fe00:5/64'}
2023-01-18 21:03:06,534 DEBUG root:39 start: subiquity/Network/_send_update: CHANGE ence0f
2023-01-18 21:03:06,534 DEBUG subiquity.server.controllers.network:354 dev_info ence0f {'addresses': ['9.11.116.213/24'], 'gateway4': '9.11.116.1', 'nameservers': {'addresses': ['9.11.227.25']}}
2023-01-18 21:03:06,535 DEBUG root:39 finish: subiquity/Network/_send_update: SUCCESS: CHANGE ence0f
2023-01-18 21:03:19,283 DEBUG subiquitycore.utils:64 run_command called: ['udevadm', 'settle', '-t', '0']
2023-01-18 21:03:19,314 DEBUG subiquitycore.utils:77 run_command ['udevadm', 'settle', '-t', '0'] exited with code 0
2023-01-18 21:03:19,314 DEBUG probert.network:585 event for addr_change: CHANGE {'ifindex': 2, 'flags': 768, 'family': 10, 'scope': 0, 'local': b'2002:90b:e006:116:76:54ff:fe00:5/64'}
2023-01-18 21:03:19,314 DEBUG probert.network:717 addr_change CHANGE {'ifindex': 2, 'flags': 768, 'family': 10, 'scope': 0, 'local': b'2002:90b:e006:116:76:54ff:fe00:5/64'}
2023-01-18 21:03:19,314 DEBUG root:39 start: subiquity/Network/_send_update: CHANGE ence0f
2023-01-18 21:03:19,314 DEBUG subiquity.server.controllers.network:354 dev_info ence0f {'addresses': ['9.11.116.213/24'], 'gateway4': '9.11.116.1', 'nameservers': {'addresses': ['9.11.227.25']}}
2023-01-18 21:03:19,314 DEBUG root:39 finish: subiquity/Network/_send_update: SUCCESS: CHANGE ence0f
2023-01-18 21:03:59,037 DEBUG root:39 start: subiquity/Zdev/chzdev_POST:
2023-01-18 21:03:59,038 DEBUG subiquitycore.utils:92 arun_command called: ['chzdev', '--enable', '0.0.100d']
2023-01-18 21:03:59,357 DEBUG subiquitycore.utils:64 run_command called: ['udevadm', 'settle', '-t', '0']
2023-01-18 21:03:59,373 DEBUG subiquitycore.utils:77 run_command ['udevadm', 'settle', '-t', '0'] exited with code 1
2023-01-18 21:03:59,373 DEBUG subiquity.server.controller.filesystem:495 waiting 0.1 to let udev event queue settle
... <<< some time with repeated udevadm settle >>> ...
2023-01-18 21:04:02,984 DEBUG subiquitycore.utils:64 run_command called: ['udevadm', 'settle', '-t', '0']
2023-01-18 21:04:02,986 DEBUG subiquitycore.utils:77 run_command ['udevadm', 'settle', '-t', '0'] exited with code 1
2023-01-18 21:04:02,986 DEBUG subiquity.server.controller.filesystem:495 waiting 0.1 to let udev event queue settle
2023-01-18 21:04:03,029 DEBUG subiquitycore.utils:101 arun_command ['chzdev', '--enable', '0.0.100d'] exited with code 0
2023-01-18 21:04:03,029 DEBUG subiquitycore.utils:64 run_command called: ['lszdev', '--quiet', '--pairs', '--columns', 'id,type,on,exists,pers,auto,failed,names']
2023-01-18 21:04:03,047 DEBUG subiquitycore.utils:77 run_command ['lszdev', '--quiet', '--pairs', '--columns', 'id,type,on,exists,pers,auto,failed,names'] exited with code 0
2023-01-18 21:04:03,052 DEBUG root:39 finish: subiquity/Zdev/chzdev_POST: SUCCESS: 200 [{"id": "0.0.0009", "type": "generic-ccw", "on": true, "exists": true, "pers"...
2023-01-18 21:04:03,052 INFO aiohttp.access:233 [18/Jan/2023:21:03:59 +0000] "POST /zdev/chzdev?action=%22enable%22&zdev=%7B%22id%22:+%220.0.100d%22,+%22type%22:+%22zfcp-host%22,+%22on%22:+false,+%22exists%22:+true,+%22pers%22:+false,+%22auto%22:+false,+%22failed%22:+false,+%22names%22:+%22%22%7D HTTP/1.1" 200 7956 "-" "Python/3.8 aiohttp/3.6.2"
2023-01-18 21:04:03,087 DEBUG subiquitycore.utils:64 run_command called: ['udevadm', 'settle', '-t', '0']
2023-01-18 21:04:03,114 DEBUG subiquitycore.utils:77 run_command ['udevadm', 'settle', '-t', '0'] exited with code 0
2023-01-18 21:04:03,114 ERROR subiquity.server.server:416 top level error
2023-01-18 21:04:03,114 ERROR subiquity.server.server:416 top level error
Traceback (most recent call last):
File "/snap/subiquity/3699/usr/lib/python3.8/asyncio/events.py", line 81, in _run
self._context.run(self._callback, *self._args)
File "/snap/subiquity/3699/lib/python3.8/site-packages/subiquity/server/controllers/filesystem.py", line 506, in _udev_event
action, dev = self._monitor.receive_device()
File "/snap/subiquity/3699/lib/python3.8/site-packages/pyudev/monitor.py", line 397, in receive_device
device = self.poll()
File "/snap/subiquity/3699/lib/python3.8/site-packages/pyudev/monitor.py", line 357, in poll
if eintr_retry_call(poll.Poll.for_events((self, 'r')).poll, timeout):
File "/snap/subiquity/3699/lib/python3.8/site-packages/pyudev/_util.py", line 163, in eintr_retry_call
return func(*args, **kwargs)
File "/snap/subiquity/3699/lib/python3.8/site-packages/pyudev/_os/poll.py", line 97, in poll
return list(self._parse_events(eintr_retry_call(self._notifier.poll, timeout)))
File "/snap/subiquity/3699/lib/python3.8/site-packages/pyudev/_os/poll.py", line 112, in _parse_events
raise IOError('Error while polling fd: {0!r}'.format(fd))
OSError: Error while polling fd: 20
2023-01-18 21:04:03,116 DEBUG subiquitycore.common.errorreport:384 generating crash report
2023-01-18 21:04:03,116 INFO subiquitycore.common.errorreport:406 saving crash report 'unknown error crashed with OSError' to /var/crash/1674075843.116781473.unknown.crash
2023-01-18 21:04:03,117 INFO root:39 start: subiquity/ErrorReporter/1674075843.116781473.unknown/add_info:
2023-01-18 21:04:03,117 INFO root:39 finish: subiquity/Meta/status_GET: SUCCESS: 200 {"state": "ERROR", "confirming_tty": "", "error": {"state": "INCOMPLETE", "ba...
2023-01-18 21:04:03,117 INFO root:39 finish: subiquity/Meta/status_GET: SUCCESS: 200 {"state": "ERROR", "confirming_tty": "", "error": {"state": "INCOMPLETE", "ba...
2023-01-18 21:04:03,117 INFO root:39 finish: subiquity/Meta/status_GET: SUCCESS: 200 {"state": "ERROR", "confirming_tty": "", "error": {"state": "INCOMPLETE", "ba...
2023-01-18 21:04:03,118 INFO root:39 finish: subiquity/Meta/status_GET: SUCCESS: 200 {"state": "ERROR", "confirming_tty": "", "error": {"state": "INCOMPLETE", "ba...
2023-01-18 21:04:03,118 DEBUG subiquitycore.utils:64 run_command called: ['udevadm', 'settle', '-t', '0']
InstallerServerLogInfo:
InterpreterPath: /snap/subiquity/3699/usr/bin/python3.8
UdevDb:
...
P: /devices/css0/0.0.0010/0.0.100d
L: 0
E: DEVPATH=/devices/css0/0.0.0010/0.0.100d
E: SUBSYSTEM=ccw
E: DRIVER=zfcp
E: CU_TYPE=1731
E: CU_MODEL=03
E: DEV_TYPE=1732
E: DEV_MODEL=03
E: MODALIAS=ccw:t1731m03dt1732dm03
...
P: /devices/css0/0.0.0010/0.0.100d/host0/rport-0:0-0/target0:0:0/0:0:0:1/block/sda
N: sda
L: 0
S: disk/by-path/ccw-0.0.100d-fc-0x500173800cef0131-lun-1
S: disk/by-id/scsi-0IBM_2810XIV_host=ilabg13_tuc3_fcp_32G
S: disk/by-id/scsi-36001738cfc900cef0000000000013596
S: disk/by-id/wwn-0x6001738cfc900cef0000000000013596
S: disk/by-uuid/da904d37-e306-4223-8fbb-9670357ca708
S: disk/by-id/scsi-SIBM_2810XIV_6000CEF0000000000013596
S: disk/by-id/scsi-1IBM_2810XIV_6000CEF0000000000013596
...
E: DM_MULTIPATH_DEVICE_PATH=1
...
P: /devices/css0/0.0.0010/0.0.100d/host0/rport-0:0-17/target0:0:17/0:0:17:1/block/sdae
N: sdae
L: 0
S: disk/by-id/scsi-1IBM_FlashSystem-984026c6702d26c6-0000-007d-000294
S: disk/by-id/scsi-SIBM_FlashSystem-9840_26c6702d26c6-0000-007d-000294
S: disk/by-path/ccw-0.0.100d-fc-0x500507605e8b7271-lun-1
S: disk/by-uuid/97c7452a-2f45-4f7b-9127-d63b4a34e1e4
S: disk/by-id/scsi-36005076b19bcb49b180000007d000294
S: disk/by-id/wwn-0x6005076b19bcb49b180000007d000294
E: DEVPATH=/devices/css0/0.0.0010/0.0.100d/host0/rport-0:0-17/target0:0:17/0:0:17:1/block/sdae
E: SUBSYSTEM=block
E: DEVNAME=/dev/sdae
E: DEVTYPE=disk
E: DISKSEQ=43
E: MAJOR=65
E: MINOR=224
E: USEC_INITIALIZED=2677514875
E: DM_MULTIPATH_DEVICE_PATH=1
...
P: /devices/css0/0.0.0010/0.0.100d/host0/rport-0:0-9/target0:0:9/0:0:9:1/block/sdo
N: sdo
L: 0
S: disk/by-path/ccw-0.0.100d-fc-0x500507680b2541ba-lun-1
S: disk/by-id/wwn-0x6005076400818089b000000000000157
S: disk/by-id/scsi-36005076400818089b000000000000157
S: disk/by-id/scsi-SIBM_2145_010020c0226cXX00
S: disk/by-uuid/0fbe15bc-49fb-4a09-b9f2-269129fa9913
E: DEVPATH=/devices/css0/0.0.0010/0.0.100d/host0/rport-0:0-9/target0:0:9/0:0:9:1/block/sdo
E: SUBSYSTEM=block
E: DEVNAME=/dev/sdo
E: DEVTYPE=disk
E: DISKSEQ=27
E: MAJOR=8
E: MINOR=224
E: USEC_INITIALIZED=2677616364
E: DM_MULTIPATH_DEVICE_PATH=1
==> Probably the last discovered SCSI LUN.
P: /devices/virtual/block/dm-0
N: dm-0
L: 50
S: disk/by-id/scsi-360050762198c1fc2180000000b000132
S: disk/by-id/wwn-0x60050762198c1fc2180000000b000132
S: disk/by-id/dm-name-mpatha
S: disk/by-id/dm-uuid-mpath-360050762198c1fc2180000000b000132
S: mapper/mpatha
E: DEVPATH=/devices/virtual/block/dm-0
E: SUBSYSTEM=block
E: DEVNAME=/dev/dm-0
E: DEVTYPE=disk
E: DISKSEQ=44
E: MAJOR=253
E: MINOR=0
E: USEC_INITIALIZED=2678671661
E: DM_UDEV_DISABLE_LIBRARY_FALLBACK_FLAG=1
E: DM_UDEV_PRIMARY_SOURCE_FLAG=1
E: DM_ACTIVATION=1
E: DM_NAME=mpatha
E: DM_UUID=mpath-360050762198c1fc2180000000b000132
E: DM_SUSPENDED=0
E: DM_UDEV_RULES=1
E: DM_UDEV_RULES_VSN=2
E: MPATH_SBIN_PATH=/sbin
E: MPATH_DEVICE_READY=1
==> Multipath devices also got assembled.
Canonical, why did the installer get an error?
Does the installer really have a busy(!) waiting loop calling udevadm settle with zero timeout?
But even if so, with the number of discovered devices and the settle finally returning with success errorlevel 0, it should just work?
To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu-z-systems/+bug/2009141/+subscriptions