sts-sponsors team mailing list archive
Message #02747
[Bug 1870087] Re: Old broker lockfile blocks landscape-client starts
I verified the fixes in the groovy-proposed and focal-proposed packages,
versions 19.12-0ubuntu5.1 and 19.12-0ubuntu4.2 respectively.
I followed the test case without any issues.
Additionally, I tried a few scenarios for terminating processes; they now all terminate cleanly when stopped with SIGTERM, whether directly or indirectly (through systemd).
** Tags removed: verification-needed verification-needed-focal verification-needed-groovy
** Tags added: verification-done verification-done-focal verification-done-groovy
--
You received this bug notification because you are a member of STS
Sponsors, which is subscribed to the bug report.
https://bugs.launchpad.net/bugs/1870087
Title:
Old broker lockfile blocks landscape-client starts
Status in Landscape Client:
In Progress
Status in landscape-client package in Ubuntu:
Fix Released
Status in landscape-client source package in Focal:
Fix Committed
Status in landscape-client source package in Groovy:
Fix Committed
Bug description:
[Impact]
* landscape-client services are prevented from starting if their old PIDs
get recycled.
* The conditions for the issue are more likely to occur on a release
upgrade. This is exacerbated by the fact that clients did not wait for
their shutdown routine to finish, and thus were likely to leak their lock
file.
* The proposed fix tries to verify that existing locks actually belong
to landscape-client, instead of just verifying that they exist.
* The follow-up patch ensures that some of the processes actually complete
their shutdown.
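As an illustration of the ownership check described above, here is a minimal sketch, not the actual patch. It assumes a Linux /proc layout and a Twisted-style lock, i.e. a symlink whose target is the owner's PID. The function names, the `expected` marker and the `proc_root` parameter are hypothetical and only exist for this example.

```python
import os


def lock_owner_pid(lock_path):
    """Return the PID stored in a symlink lock, or None if unreadable."""
    try:
        return int(os.readlink(lock_path))
    except (OSError, ValueError):
        # Missing file, not a symlink, or a target that is not a number.
        return None


def lock_is_stale(lock_path, expected=b"landscape", proc_root="/proc"):
    """Guess whether the lock belongs to something other than landscape.

    Hypothetical helper: compares the owner's command line against a
    marker instead of only checking that the PID exists.
    """
    pid = lock_owner_pid(lock_path)
    if pid is None:
        return False  # no readable lock, so nothing to clean up
    cmdline_path = os.path.join(proc_root, str(pid), "cmdline")
    try:
        with open(cmdline_path, "rb") as f:
            cmdline = f.read()
    except OSError:
        # Assumption: an unreadable entry means the owner is gone.
        return True
    # A recycled PID now running some other program does not match.
    return expected not in cmdline
```

With the lock from the test case (`broker.sock.lock -> 1`), the owner's command line would normally be that of init, so this sketch would report the lock as stale.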
[Test Case]
* systemctl stop landscape-client
* No files should remain in
/var/lib/landscape/client/sockets/
* ln -sf 1 /var/lib/landscape/client/sockets/broker.sock.lock
* systemctl start landscape-client
[Regression Potential]
* The existing Twisted logic is kept, so even if checking process names
fails, lock conflicts should still be detected normally.
* The locks which Twisted creates are unlikely to actually see conflicts in
the wild, as those processes are managed by systemd. False positives in
the detection check should have minimal impact.
[Original description]
I have a machine which was failing to connect to the landscape
service. In syslog I found this traceback:
Apr 1 03:27:53 maas-1 landscape-client[1538354]: Traceback (most recent call last):
Apr 1 03:27:53 maas-1 landscape-client[1538354]: File "/usr/lib/python3/dist-packages/twisted/python/lockfile.py", line 160, in lock
Apr 1 03:27:53 maas-1 landscape-client[1538354]: symlink(str(os.getpid()), self.name)
Apr 1 03:27:53 maas-1 landscape-client[1538354]: FileExistsError: [Errno 17] File exists: '1538397' -> b'/var/lib/landscape/client/sockets/broker.sock.lock'
Apr 1 03:27:53 maas-1 landscape-client[1538354]: During handling of the above exception, another exception occurred:
Apr 1 03:27:53 maas-1 landscape-client[1538354]: Traceback (most recent call last):
Apr 1 03:27:53 maas-1 landscape-client[1538354]: File "/usr/bin/landscape-broker", line 8, in <module>
Apr 1 03:27:53 maas-1 landscape-client[1538354]: run(sys.argv)
Apr 1 03:27:53 maas-1 landscape-client[1538354]: File "/usr/lib/python3/dist-packages/landscape/client/broker/service.py", line 93, in run
Apr 1 03:27:53 maas-1 landscape-client[1538354]: run_landscape_service(BrokerConfiguration, BrokerService, args)
Apr 1 03:27:53 maas-1 landscape-client[1538354]: File "/usr/lib/python3/dist-packages/landscape/client/service.py", line 115, in run_landscape_service
Apr 1 03:27:53 maas-1 landscape-client[1538354]: startApplication(application, False)
Apr 1 03:27:53 maas-1 landscape-client[1538354]: File "/usr/lib/python3/dist-packages/twisted/application/app.py", line 690, in startApplication
Apr 1 03:27:53 maas-1 landscape-client[1538354]: service.IService(application).startService()
Apr 1 03:27:53 maas-1 landscape-client[1538354]: File "/usr/lib/python3/dist-packages/twisted/application/service.py", line 288, in startService
Apr 1 03:27:53 maas-1 landscape-client[1538354]: service.startService()
Apr 1 03:27:53 maas-1 landscape-client[1538354]: File "/usr/lib/python3/dist-packages/landscape/client/broker/service.py", line 79, in startService
Apr 1 03:27:53 maas-1 landscape-client[1538354]: self.publisher.start()
Apr 1 03:27:53 maas-1 landscape-client[1538354]: File "/usr/lib/python3/dist-packages/landscape/client/amp.py", line 45, in start
Apr 1 03:27:53 maas-1 landscape-client[1538354]: self._port = self._reactor.listen_unix(socket_path, factory)
Apr 1 03:27:53 maas-1 landscape-client[1538354]: File "/usr/lib/python3/dist-packages/landscape/lib/reactor.py", line 228, in listen_unix
Apr 1 03:27:53 maas-1 landscape-client[1538354]: return self._reactor.listenUNIX(socket, factory, wantPID=True)
Apr 1 03:27:53 maas-1 landscape-client[1538354]: File "/usr/lib/python3/dist-packages/twisted/internet/posixbase.py", line 397, in listenUNIX
Apr 1 03:27:53 maas-1 landscape-client[1538354]: p.startListening()
Apr 1 03:27:53 maas-1 landscape-client[1538354]: File "/usr/lib/python3/dist-packages/twisted/internet/unix.py", line 372, in startListening
Apr 1 03:27:53 maas-1 landscape-client[1538354]: if not self.lockFile.lock():
Apr 1 03:27:53 maas-1 landscape-client[1538354]: File "/usr/lib/python3/dist-packages/twisted/python/lockfile.py", line 185, in lock
Apr 1 03:27:53 maas-1 landscape-client[1538354]: kill(int(pid), 0)
Apr 1 03:27:53 maas-1 landscape-client[1538354]: PermissionError: [Errno 1] Operation not permitted
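The traceback above ends in `kill(int(pid), 0)`, which sends no signal and only asks the kernel whether the PID can be signalled. The sketch below shows the three possible outcomes of that probe. `probe_pid` and its return strings are hypothetical names for this example and are not part of Twisted or landscape-client.

```python
import os


def probe_pid(pid):
    """Classify a PID using the signal-0 probe seen in the traceback."""
    try:
        os.kill(pid, 0)  # signal 0: existence and permission check only
    except ProcessLookupError:
        # ESRCH: no such process, so a lock naming this PID is stale.
        return "gone"
    except PermissionError:
        # EPERM: the PID exists but belongs to a user we may not signal.
        return "alive-other-user"
    return "alive"
```

In this report the lock pointed at PID 905, which had presumably been recycled by a process that the landscape user could not signal, so the probe fell into the EPERM case instead of reporting the old owner as gone.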
In the sockets directory I saw:
$ sudo ls /var/lib/landscape/client/sockets/ -la
total 8
drwxr-x--- 2 landscape root 4096 Apr 1 03:27 .
drwxr-xr-x 7 landscape root 4096 Apr 1 03:27 ..
srw-rw-rw- 1 landscape landscape 0 Mar 12 01:41 broker.sock
lrwxrwxrwx 1 landscape landscape 3 Mar 12 01:41 broker.sock.lock -> 905
Removing those two files allowed the landscape client to start as
normal.
Looks like we need some lockfile cleanup code on start.
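A rough sketch of what such cleanup on start could look like, mirroring the manual removal of the two files described above. This is not the fix that was eventually applied. `remove_stale_socket` and the `is_stale` callback are hypothetical, and the decision of when a lock counts as stale is deliberately left to the caller.

```python
import os


def remove_stale_socket(socket_path, is_stale):
    """Remove a socket and its symlink lock if the lock is judged stale.

    `is_stale` is a caller-supplied function taking the lock path and
    returning True when the lock no longer belongs to a live owner.
    Returns True if cleanup was performed.
    """
    lock_path = socket_path + ".lock"
    if not os.path.islink(lock_path) or not is_stale(lock_path):
        return False
    for path in (socket_path, lock_path):
        try:
            os.unlink(path)
        except FileNotFoundError:
            pass  # already gone, nothing to do
    return True
```

Called with `/var/lib/landscape/client/sockets/broker.sock` before the broker starts listening, this would remove both `broker.sock` and `broker.sock.lock` when the callback reports the lock as stale.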
To manage notifications about this bug go to:
https://bugs.launchpad.net/landscape-client/+bug/1870087/+subscriptions