dx-packages team mailing list archive
-
dx-packages team
-
Mailing list archive
-
Message #43767
[Bug 1871538] Re: dbus timeout-ed during an upgrade, taking services down including gdm
Quoting upstream systemd developers (https://github.com/systemd/systemd/issues/22737#issuecomment-1077682307):
"We essentially traded one problem (lockup when starting services) for another (the failure described in this commit).
I actually think that the lockup is worse. Here there is a simple solution: switch from dbus-daemon to dbus-broker. [...]
A proper fix will most likely be to make all dbus calls asynchronous in
systemd, but that is a lot of work and it's unclear when/if it'll
happen. The regression is unfortunate, but I don't think we can fix it
in reasonable time."
So I wonder what's the best path forward in Ubuntu... if we revert,
we'll re-introduce the lockup/timeout problem with dbus-daemon. If we
keep the current version, fwupd-refresh.service is broken.
The comment at https://github.com/fwupd/fwupd/issues/3037#issuecomment-1100816992 suggests that disabling the DynamicUser= setting makes the service work again. Maybe that's worth a try, in order to get both problems solved? (i.e. shipping an override config for fwupd)
$cat /etc/systemd/system/fwupd-refresh.service.d/override.conf
[Service]
DynamicUser=no
--
You received this bug notification because you are a member of DX
Packages, which is subscribed to accountsservice in Ubuntu.
Matching subscriptions: dx-packages
https://bugs.launchpad.net/bugs/1871538
Title:
dbus timeout-ed during an upgrade, taking services down including gdm
Status in D-Bus:
Unknown
Status in systemd:
New
Status in accountsservice package in Ubuntu:
Invalid
Status in dbus package in Ubuntu:
Invalid
Status in gnome-shell package in Ubuntu:
Invalid
Status in systemd package in Ubuntu:
Fix Released
Status in accountsservice source package in Focal:
Invalid
Status in dbus source package in Focal:
Invalid
Status in gnome-shell source package in Focal:
Invalid
Status in systemd source package in Focal:
Fix Released
Status in accountsservice source package in Groovy:
Invalid
Status in dbus source package in Groovy:
Invalid
Status in gnome-shell source package in Groovy:
Invalid
Status in accountsservice source package in Hirsute:
Invalid
Status in dbus source package in Hirsute:
Won't Fix
Status in gnome-shell source package in Hirsute:
Invalid
Status in accountsservice source package in Impish:
Invalid
Status in dbus source package in Impish:
Invalid
Status in gnome-shell source package in Impish:
Invalid
Status in systemd source package in Impish:
Fix Released
Status in accountsservice source package in Jammy:
Invalid
Status in dbus source package in Jammy:
Invalid
Status in gnome-shell source package in Jammy:
Invalid
Status in systemd source package in Jammy:
Fix Released
Bug description:
[Impact]
* There's currently a deadlock between PID 1 and dbus-daemon: in some
cases dbus-daemon will do NSS lookups (which are blocking) at the same
time PID 1 synchronously blocks on some call to dbus-daemon (e.g. `GetConnectionUnixUser` DBus call). Let's break
that by setting SYSTEMD_NSS_DYNAMIC_BYPASS=1 env var for dbus-daemon,
which will disable synchronously blocking varlink calls from nss-systemd
to PID 1.
* This can lead to delayed boot times
* It can also lead to dbus-daemon being killed/re-started, taking
down other services with it, like GDM, killing user sessions on the
way (e.g. on installing updates)
[Test Plan]
* This bug is really hard to reproduce, as can be seen from the
multi-year long discussion at
https://github.com/systemd/systemd/issues/15316
* Canonical's CPC team has the ability to reproduce this issue (with
a relatively high probability) in their Azure test environment, due to
the specific setup they are using
* So our test plan is to ask CPC (@gjolly) for confirmation if the
issue is fixed.
[Where problems could occur]
* This fix touches the communication between systemd and dbus daemon,
especially the NSS lookup, so if something is broken the (user-)name
resolution could be broken.
* As a workaround dbus-daemon could be replaced by dbus-broker, which
never showed this issue or the behaviour could be changed back by
using the `SYSTEMD_NSS_DYNAMIC_BYPASS` env variable, like this:
#/etc/systemd/system/dbus.service.d/override.conf
[Service]
Environment=SYSTEMD_NSS_DYNAMIC_BYPASS=0
[Other Info]
* Fixed upstream (v251) in https://github.com/systemd/systemd/pull/22552
=== Original Description ===
This morning I found my computer on the login screen.
But not the one of the screen log, no a new one - so something must have crashed.
Logging in again confirmed that all apps were gone and the gnome shell
was brought down what seems like triggered by a background update o
accountsservice.
As always things are not perfectly clear :-/
The following goes *back* in time through my logs one by one.
Multiple apps crashed at 06:09, but we will find later that this is a follow on issue of the underlying gnome/X/... recycling.
-rw-r----- 1 paelzer whoopsie 52962868 Apr 8 06:09 _usr_bin_konversation.1000.crash
-rw-r----- 1 paelzer whoopsie 986433 Apr 8 06:09 _usr_lib_x86_64-linux-gnu_libexec_drkonqi.1000.crash
rdkit was failing fast and giving up (that will be a different bug, it just seems broken on my system):
Apr 08 06:10:13 Keschdeichel systemd[1]: Started RealtimeKit Scheduling Policy Service.
Apr 08 06:10:13 Keschdeichel rtkit-daemon[1729333]: Successfully called chroot.
Apr 08 06:10:13 Keschdeichel rtkit-daemon[1729333]: Successfully dropped privileges.
Apr 08 06:10:13 Keschdeichel rtkit-daemon[1729333]: Successfully limited resources.
Apr 08 06:10:13 Keschdeichel rtkit-daemon[1729333]: pthread_create failed: Resource temporarily unavailable
Apr 08 06:10:13 Keschdeichel rtkit-daemon[1729333]: Canary thread running.
Apr 08 06:10:13 Keschdeichel rtkit-daemon[1729333]: Exiting canary thread.
Apr 08 06:10:13 Keschdeichel rtkit-daemon[1729333]: Demoting known real-time threads.
Apr 08 06:10:13 Keschdeichel rtkit-daemon[1729333]: Demoted 0 threads.
Apr 08 06:10:13 Keschdeichel systemd[1]: rtkit-daemon.service: Main process exited, code=exited, status=1/FAILURE
Apr 08 06:10:13 Keschdeichel systemd[1]: rtkit-daemon.service: Failed with result 'exit-code'.
Apr 08 06:10:13 Keschdeichel dbus-daemon[1208]: [system] Activating via systemd: service name='org.freedesktop.RealtimeKit1' unit='rtkit-daemon.service' requested by ':1.1176' (uid=121 pid=>
Apr 08 06:10:13 Keschdeichel systemd[1]: rtkit-daemon.service: Start request repeated too quickly.
Apr 08 06:10:13 Keschdeichel systemd[1]: rtkit-daemon.service: Failed with result 'exit-code'.
Apr 08 06:10:13 Keschdeichel systemd[1]: Failed to start RealtimeKit Scheduling Policy Service.
Apr 08 06:10:13 Keschdeichel bluetoothd[1729331]: Bluetooth daemon 5.53
But that already was only triggered by a gnome restart that kicked of
earlier:
Apr 08 06:09:27 Keschdeichel systemd[1726656]: Started GNOME Shell on Wayland.
Apr 08 06:09:27 Keschdeichel systemd[1726656]: Reached target GNOME Shell on Wayland.
Apr 08 06:09:27 Keschdeichel systemd[1726656]: Reached target GNOME Session is initialized.
Apr 08 06:09:27 Keschdeichel systemd[1726656]: Reached target GNOME Wayland Session.
Apr 08 06:09:27 Keschdeichel systemd[1726656]: Reached target GNOME Session (session: gnome-login).
X was recycleing before:
Apr 08 06:09:19 Keschdeichel systemd[10683]: Stopping GNOME Shell on X11...
...
Apr 08 06:09:22 Keschdeichel /usr/lib/gdm3/gdm-x-session[10710]: (EE) systemd-logind: ReleaseControl failed: Unknown object '/org/freedesktop/login1/session/_32'.
Apr 08 06:09:22 Keschdeichel /usr/lib/gdm3/gdm-x-session[10710]: (II) Server terminated successfully (0). Closing log file.
It seems like some internal service broke and everything that followed was a secondary issue to that:
Apr 08 06:09:19 Keschdeichel systemd[1]: NetworkManager.service: Unexpected error response from GetNameOwner(): Connection terminated
Apr 08 06:09:19 Keschdeichel systemd[1]: wpa_supplicant.service: Unexpected error response from GetNameOwner(): Connection terminated
Apr 08 06:09:19 Keschdeichel systemd[1]: thermald.service: Unexpected error response from GetNameOwner(): Connection terminated
Apr 08 06:09:19 Keschdeichel thermald[1256]: [WARN]Terminating ...
Apr 08 06:09:19 Keschdeichel avahi-daemon[1204]: Got SIGTERM, quitting.
Apr 08 06:09:19 Keschdeichel systemd[1]: udisks2.service: Unexpected error response from GetNameOwner(): Connection terminated
Apr 08 06:09:19 Keschdeichel ModemManager[1308]: <info> Caught signal, shutting down...
Apr 08 06:09:19 Keschdeichel systemd[1]: switcheroo-control.service: Unexpected error response from GetNameOwner(): Connection terminated
Apr 08 06:09:19 Keschdeichel avahi-daemon[1204]: Leaving mDNS multicast group on interface vnet0.IPv6 with address fe80::fc54:ff:fe78:26c2.
Apr 08 06:09:19 Keschdeichel systemd[1]: accounts-daemon.service: Unexpected error response from GetNameOwner(): Connection terminated
Apr 08 06:09:19 Keschdeichel avahi-daemon[1204]: Leaving mDNS multicast group on interface strswanbr2.IPv4 with address 192.168.2.0.
Apr 08 06:09:19 Keschdeichel systemd[1]: gdm.service: Unexpected error response from GetNameOwner(): Connection terminated
Apr 08 06:09:19 Keschdeichel avahi-daemon[1204]: Leaving mDNS multicast group on interface virbr0.IPv4 with address 192.168.122.1.
Apr 08 06:09:19 Keschdeichel systemd[1]: systemd-logind.service: Unexpected error response from GetNameOwner(): Connection terminated
Apr 08 06:09:19 Keschdeichel avahi-daemon[1204]: Leaving mDNS multicast group on interface strswanbr1.IPv4 with address 192.168.1.0.
Apr 08 06:09:19 Keschdeichel systemd[1]: systemd-machined.service: Unexpected error response from GetNameOwner(): Connection terminated
Apr 08 06:09:19 Keschdeichel avahi-daemon[1204]: Leaving mDNS multicast group on interface lxdbr0.IPv6 with address fd42:fa18:c923:35d5::1.
Apr 08 06:09:19 Keschdeichel systemd-machined[1255]: Failed to drop reference to machine scope, ignoring: Transport endpoint is not connected
Apr 08 06:09:19 Keschdeichel avahi-daemon[1204]: Leaving mDNS multicast group on interface lxdbr0.IPv4 with address 10.253.194.1.
Apr 08 06:09:19 Keschdeichel systemd[1]: switcheroo-control.service: Succeeded.
Apr 08 06:09:19 Keschdeichel avahi-daemon[1204]: Leaving mDNS multicast group on interface wlp4s0.IPv6 with address fe80::6832:c873:a074:2c7b.
Apr 08 06:09:19 Keschdeichel systemd[1]: systemd-machined.service: Succeeded.
Apr 08 06:09:19 Keschdeichel bluetoothd[1724347]: Terminating
Apr 08 06:09:19 Keschdeichel systemd[1]: avahi-daemon.service: Unexpected error response from GetNameOwner(): Connection terminated
Apr 08 06:09:19 Keschdeichel avahi-daemon[1204]: Leaving mDNS multicast group on interface wlp4s0.IPv4 with address 10.7.0.42.
Apr 08 06:09:19 Keschdeichel systemd[1]: ModemManager.service: Unexpected error response from GetNameOwner(): Connection terminated
Apr 08 06:09:19 Keschdeichel avahi-daemon[1204]: Leaving mDNS multicast group on interface enp0s31f6.IPv6 with address fe80::80a8:743:6b37:e561.
Apr 08 06:09:19 Keschdeichel systemd[1]: polkit.service: Unexpected error response from GetNameOwner(): Connection terminated
Apr 08 06:09:19 Keschdeichel avahi-daemon[1204]: Leaving mDNS multicast group on interface enp0s31f6.IPv4 with address 10.7.0.69.
Apr 08 06:09:19 Keschdeichel systemd[1]: bolt.service: Unexpected error response from GetNameOwner(): Connection terminated
Apr 08 06:09:19 Keschdeichel avahi-daemon[1204]: Leaving mDNS multicast group on interface lo.IPv6 with address ::1.
Apr 08 06:09:19 Keschdeichel systemd[1]: rtkit-daemon.service: Unexpected error response from GetNameOwner(): Connection terminated
Apr 08 06:09:19 Keschdeichel avahi-daemon[1204]: Leaving mDNS multicast group on interface lo.IPv4 with address 127.0.0.1.
Apr 08 06:09:19 Keschdeichel systemd[1]: packagekit.service: Unexpected error response from GetNameOwner(): Connection terminated
Apr 08 06:09:19 Keschdeichel wpa_supplicant[1265]: nl80211: deinit ifname=p2p-dev-wlp4s0 disabled_11b_rates=0
Apr 08 06:09:19 Keschdeichel systemd[1]: colord.service: Unexpected error response from GetNameOwner(): Connection terminated
Apr 08 06:09:19 Keschdeichel systemd[1]: upower.service: Unexpected error response from GetNameOwner(): Connection terminated
Apr 08 06:09:19 Keschdeichel systemd[1]: bluetooth.service: Unexpected error response from GetNameOwner(): Connection terminated
Apr 08 06:09:19 Keschdeichel systemd[1]: fwupd.service: Unexpected error response from GetNameOwner(): Connection terminated
Apr 08 06:09:19 Keschdeichel xdg-desktop-por[11011]: xdg-desktop-portal-gtk: Fatal IO error 11 (Resource temporarily unavailable) on X server :0.
Apr 08 06:09:19 Keschdeichel NetworkManager[1209]: <info> [1586318959.7797] caught SIGTERM, shutting down normally.
Apr 08 06:09:19 Keschdeichel avahi-daemon[1204]: avahi-daemon 0.7 exiting.
Apr 08 06:09:19 Keschdeichel systemd[1]: rtkit-daemon.service: Succeeded.
Apr 08 06:09:19 Keschdeichel systemd[1]: Starting Process error reports when automatic reporting is enabled...
Apr 08 06:09:19 Keschdeichel systemd[1]: colord.service: Succeeded.
Apr 08 06:09:19 Keschdeichel systemd[10683]: Stopped target GNOME X11 Session (session: ubuntu).
Apr 08 06:09:19 Keschdeichel systemd[10683]: Stopped target Current graphical user session.
Apr 08 06:09:19 Keschdeichel systemd[10683]: unicast-local-avahi.path: Succeeded.
Apr 08 06:09:19 Keschdeichel systemd[10683]: Stopped Path trigger for Avahi .local domain notifications.
Apr 08 06:09:19 Keschdeichel systemd[10683]: update-notifier-release.path: Succeeded.
Apr 08 06:09:19 Keschdeichel gnome-shell[11036]: gnome-shell: Fatal IO error 11 (Resource temporarily unavailable) on X server :0.
The last things before was systemd reload - maybe due to a background
upgrade?
Apr 08 06:08:54 Keschdeichel dbus-daemon[1208]: Unknown group "power" in message bus configuration file
Apr 08 06:08:54 Keschdeichel dbus-daemon[1208]: [system] Reloaded configuration
Apr 08 06:08:54 Keschdeichel dbus-daemon[1208]: Unknown group "power" in message bus configuration file
Apr 08 06:08:54 Keschdeichel dbus-daemon[1208]: [system] Reloaded configuration
Apr 08 06:08:54 Keschdeichel dbus-daemon[1208]: Unknown group "power" in message bus configuration file
Apr 08 06:08:54 Keschdeichel systemd[1]: Reloading.
And indeed there was an upgrade going on:
40773 2020-04-08 06:08:46 status installed libc-bin:amd64 2.31-0ubuntu7
40774 2020-04-08 06:08:54 startup archives unpack
40775 2020-04-08 06:08:54 upgrade accountsservice:amd64 0.6.55-0ubuntu10 0.6.55-0ubuntu11
40776 2020-04-08 06:08:54 status half-configured accountsservice:amd64 0.6.55-0ubuntu10
40777 2020-04-08 06:08:54 status unpacked accountsservice:amd64 0.6.55-0ubuntu10
40778 2020-04-08 06:08:54 status half-installed accountsservice:amd64 0.6.55-0ubuntu10
40779 2020-04-08 06:08:54 status triggers-pending dbus:amd64 1.12.16-2ubuntu2
40780 2020-04-08 06:09:19 status unpacked accountsservice:amd64 0.6.55-0ubuntu11
40781 2020-04-08 06:09:20 upgrade libaccountsservice0:amd64 0.6.55-0ubuntu10 0.6.55-0ubuntu11
40782 2020-04-08 06:09:20 status triggers-pending libc-bin:amd64 2.31-0ubuntu7
40783 2020-04-08 06:09:20 status half-configured libaccountsservice0:amd64 0.6.55-0ubuntu10
40784 2020-04-08 06:09:20 status unpacked libaccountsservice0:amd64 0.6.55-0ubuntu10
40785 2020-04-08 06:09:20 status half-installed libaccountsservice0:amd64 0.6.55-0ubuntu10
40786 2020-04-08 06:09:20 status unpacked libaccountsservice0:amd64 0.6.55-0ubuntu11
Nothing suspicious before that time.
ProblemType: Bug
DistroRelease: Ubuntu 20.04
Package: accountsservice 0.6.55-0ubuntu11
ProcVersionSignature: Ubuntu 5.3.0-45.37~18.04.1-generic 5.3.18
Uname: Linux 5.3.0-45-generic x86_64
NonfreeKernelModules: zfs zunicode zavl icp zcommon znvpair
ApportVersion: 2.20.11-0ubuntu24
Architecture: amd64
CurrentDesktop: ubuntu:GNOME
Date: Wed Apr 8 07:51:03 2020
InstallationDate: Installed on 2018-10-12 (543 days ago)
InstallationMedia: Ubuntu 18.04.1 LTS "Bionic Beaver" - Release amd64 (20180725)
SourcePackage: accountsservice
UpgradeStatus: Upgraded to focal on 2020-04-03 (4 days ago)
To manage notifications about this bug go to:
https://bugs.launchpad.net/dbus/+bug/1871538/+subscriptions
References