sts-sponsors team mailing list archive

[Bug 1870087] Re: Old broker lockfile blocks landscape-client starts

 

** Description changed:

+ [Impact]
+ 
+  * landscape-client services are prevented from starting if the PIDs
+    recorded in their old lock files get recycled by other processes.
+ 
+  * The conditions that trigger the issue are more likely to occur on a
+    release upgrade.
+ 
+  * The proposed fix tries to verify that existing locks actually belong
+    to landscape-client, instead of just verifying that they exist.
+ 
+ [Test Case]
+ 
+  * systemctl stop landscape-client
+ 
+  * ln -sf 1 /var/lib/landscape/client/sockets/broker.sock.lock
+ 
+  * systemctl start landscape-client
+ 
+ [Regression Potential]
+ 
+  * The existing twisted logic is still kept, so if checking process
+    names fails, lock conflicts should still be detected normally.
+ 
+  * The locks which twisted creates are unlikely to actually see conflicts in
+    the wild as those processes are managed by systemd. False positives in
+    the detection check should have minimal impact.
+ 
+ [Original description]
+ 
  I have a machine which was failing to connect to the landscape service.
  In syslog I found this traceback:
  
  Apr  1 03:27:53 maas-1 landscape-client[1538354]: Traceback (most recent call last):
  Apr  1 03:27:53 maas-1 landscape-client[1538354]:   File "/usr/lib/python3/dist-packages/twisted/python/lockfile.py", line 160, in lock
  Apr  1 03:27:53 maas-1 landscape-client[1538354]:     symlink(str(os.getpid()), self.name)
  Apr  1 03:27:53 maas-1 landscape-client[1538354]: FileExistsError: [Errno 17] File exists: '1538397' -> b'/var/lib/landscape/client/sockets/broker.sock.lock'
  Apr  1 03:27:53 maas-1 landscape-client[1538354]: During handling of the above exception, another exception occurred:
  Apr  1 03:27:53 maas-1 landscape-client[1538354]: Traceback (most recent call last):
  Apr  1 03:27:53 maas-1 landscape-client[1538354]:   File "/usr/bin/landscape-broker", line 8, in <module>
  Apr  1 03:27:53 maas-1 landscape-client[1538354]:     run(sys.argv)
  Apr  1 03:27:53 maas-1 landscape-client[1538354]:   File "/usr/lib/python3/dist-packages/landscape/client/broker/service.py", line 93, in run
  Apr  1 03:27:53 maas-1 landscape-client[1538354]:     run_landscape_service(BrokerConfiguration, BrokerService, args)
  Apr  1 03:27:53 maas-1 landscape-client[1538354]:   File "/usr/lib/python3/dist-packages/landscape/client/service.py", line 115, in run_landscape_service
  Apr  1 03:27:53 maas-1 landscape-client[1538354]:     startApplication(application, False)
  Apr  1 03:27:53 maas-1 landscape-client[1538354]:   File "/usr/lib/python3/dist-packages/twisted/application/app.py", line 690, in startApplication
  Apr  1 03:27:53 maas-1 landscape-client[1538354]:     service.IService(application).startService()
  Apr  1 03:27:53 maas-1 landscape-client[1538354]:   File "/usr/lib/python3/dist-packages/twisted/application/service.py", line 288, in startService
  Apr  1 03:27:53 maas-1 landscape-client[1538354]:     service.startService()
  Apr  1 03:27:53 maas-1 landscape-client[1538354]:   File "/usr/lib/python3/dist-packages/landscape/client/broker/service.py", line 79, in startService
  Apr  1 03:27:53 maas-1 landscape-client[1538354]:     self.publisher.start()
  Apr  1 03:27:53 maas-1 landscape-client[1538354]:   File "/usr/lib/python3/dist-packages/landscape/client/amp.py", line 45, in start
  Apr  1 03:27:53 maas-1 landscape-client[1538354]:     self._port = self._reactor.listen_unix(socket_path, factory)
  Apr  1 03:27:53 maas-1 landscape-client[1538354]:   File "/usr/lib/python3/dist-packages/landscape/lib/reactor.py", line 228, in listen_unix
  Apr  1 03:27:53 maas-1 landscape-client[1538354]:     return self._reactor.listenUNIX(socket, factory, wantPID=True)
  Apr  1 03:27:53 maas-1 landscape-client[1538354]:   File "/usr/lib/python3/dist-packages/twisted/internet/posixbase.py", line 397, in listenUNIX
  Apr  1 03:27:53 maas-1 landscape-client[1538354]:     p.startListening()
  Apr  1 03:27:53 maas-1 landscape-client[1538354]:   File "/usr/lib/python3/dist-packages/twisted/internet/unix.py", line 372, in startListening
  Apr  1 03:27:53 maas-1 landscape-client[1538354]:     if not self.lockFile.lock():
  Apr  1 03:27:53 maas-1 landscape-client[1538354]:   File "/usr/lib/python3/dist-packages/twisted/python/lockfile.py", line 185, in lock
  Apr  1 03:27:53 maas-1 landscape-client[1538354]:     kill(int(pid), 0)
  Apr  1 03:27:53 maas-1 landscape-client[1538354]: PermissionError: [Errno 1] Operation not permitted
  
  In the sockets directory I saw:
  
  $ sudo ls /var/lib/landscape/client/sockets/ -la
  total 8
  drwxr-x--- 2 landscape root      4096 Apr  1 03:27 .
  drwxr-xr-x 7 landscape root      4096 Apr  1 03:27 ..
  srw-rw-rw- 1 landscape landscape    0 Mar 12 01:41 broker.sock
  lrwxrwxrwx 1 landscape landscape    3 Mar 12 01:41 broker.sock.lock -> 905
  
  Removing those two files allowed the landscape client to start as
  normal.
  
  Looks like we need some lockfile cleanup code on start.

-- 
You received this bug notification because you are a member of STS
Sponsors, which is subscribed to the bug report.
https://bugs.launchpad.net/bugs/1870087

Title:
  Old broker lockfile blocks landscape-client starts

Status in Landscape Client:
  Fix Committed
Status in landscape-client package in Ubuntu:
  In Progress
Status in landscape-client source package in Focal:
  In Progress

Bug description:
  [Impact]

   * landscape-client services are prevented from starting if the PIDs
     recorded in their old lock files get recycled by other processes.

   * The conditions that trigger the issue are more likely to occur on a
     release upgrade.

   * The proposed fix tries to verify that existing locks actually belong
     to landscape-client, instead of just verifying that they exist.
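
The ownership check described above could look roughly like the sketch below. It is an illustration only, not the code of the proposed fix: the helper names, the `landscape-` name prefix and the use of `/proc/<pid>/comm` are all assumptions made here. When the owner cannot be determined, the sketch reports the lock as not stale, so the decision stays with the existing twisted logic.

```python
import os


def lock_owner_pid(lock_path):
    """Return the PID stored in a symlink-style lock, or None if unreadable."""
    try:
        target = os.readlink(lock_path)
    except OSError:
        return None  # lock missing or not a symlink
    try:
        return int(target)
    except ValueError:
        return None  # symlink target is not a PID


def process_name(pid, proc_root="/proc"):
    """Return the command name from <proc_root>/<pid>/comm, or None."""
    try:
        with open(os.path.join(proc_root, str(pid), "comm")) as handle:
            return handle.read().strip()
    except OSError:
        return None  # no such process, or /proc is unavailable


def lock_is_stale(lock_path, expected_prefix="landscape-", proc_root="/proc"):
    """Return True only if the lock's PID belongs to an unrelated process.

    The "landscape-" prefix is an assumption made for this sketch.
    """
    pid = lock_owner_pid(lock_path)
    if pid is None:
        return False  # nothing to judge
    name = process_name(pid, proc_root)
    if name is None:
        return False  # cannot tell: defer to the existing lock logic
    return not name.startswith(expected_prefix)
```

With the lock from the test case below (a symlink pointing at PID 1), `lock_is_stale` would return True on a system where PID 1 is the init process, since its name does not start with `landscape-`.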

  [Test Case]

   * systemctl stop landscape-client

   * ln -sf 1 /var/lib/landscape/client/sockets/broker.sock.lock

   * systemctl start landscape-client

  [Regression Potential]

   * The existing twisted logic is still kept, so if checking process
     names fails, lock conflicts should still be detected normally.

   * The locks which twisted creates are unlikely to actually see conflicts in
     the wild as those processes are managed by systemd. False positives in
     the detection check should have minimal impact.

  [Original description]

  I have a machine which was failing to connect to the landscape
  service. In syslog I found this traceback:

  Apr  1 03:27:53 maas-1 landscape-client[1538354]: Traceback (most recent call last):
  Apr  1 03:27:53 maas-1 landscape-client[1538354]:   File "/usr/lib/python3/dist-packages/twisted/python/lockfile.py", line 160, in lock
  Apr  1 03:27:53 maas-1 landscape-client[1538354]:     symlink(str(os.getpid()), self.name)
  Apr  1 03:27:53 maas-1 landscape-client[1538354]: FileExistsError: [Errno 17] File exists: '1538397' -> b'/var/lib/landscape/client/sockets/broker.sock.lock'
  Apr  1 03:27:53 maas-1 landscape-client[1538354]: During handling of the above exception, another exception occurred:
  Apr  1 03:27:53 maas-1 landscape-client[1538354]: Traceback (most recent call last):
  Apr  1 03:27:53 maas-1 landscape-client[1538354]:   File "/usr/bin/landscape-broker", line 8, in <module>
  Apr  1 03:27:53 maas-1 landscape-client[1538354]:     run(sys.argv)
  Apr  1 03:27:53 maas-1 landscape-client[1538354]:   File "/usr/lib/python3/dist-packages/landscape/client/broker/service.py", line 93, in run
  Apr  1 03:27:53 maas-1 landscape-client[1538354]:     run_landscape_service(BrokerConfiguration, BrokerService, args)
  Apr  1 03:27:53 maas-1 landscape-client[1538354]:   File "/usr/lib/python3/dist-packages/landscape/client/service.py", line 115, in run_landscape_service
  Apr  1 03:27:53 maas-1 landscape-client[1538354]:     startApplication(application, False)
  Apr  1 03:27:53 maas-1 landscape-client[1538354]:   File "/usr/lib/python3/dist-packages/twisted/application/app.py", line 690, in startApplication
  Apr  1 03:27:53 maas-1 landscape-client[1538354]:     service.IService(application).startService()
  Apr  1 03:27:53 maas-1 landscape-client[1538354]:   File "/usr/lib/python3/dist-packages/twisted/application/service.py", line 288, in startService
  Apr  1 03:27:53 maas-1 landscape-client[1538354]:     service.startService()
  Apr  1 03:27:53 maas-1 landscape-client[1538354]:   File "/usr/lib/python3/dist-packages/landscape/client/broker/service.py", line 79, in startService
  Apr  1 03:27:53 maas-1 landscape-client[1538354]:     self.publisher.start()
  Apr  1 03:27:53 maas-1 landscape-client[1538354]:   File "/usr/lib/python3/dist-packages/landscape/client/amp.py", line 45, in start
  Apr  1 03:27:53 maas-1 landscape-client[1538354]:     self._port = self._reactor.listen_unix(socket_path, factory)
  Apr  1 03:27:53 maas-1 landscape-client[1538354]:   File "/usr/lib/python3/dist-packages/landscape/lib/reactor.py", line 228, in listen_unix
  Apr  1 03:27:53 maas-1 landscape-client[1538354]:     return self._reactor.listenUNIX(socket, factory, wantPID=True)
  Apr  1 03:27:53 maas-1 landscape-client[1538354]:   File "/usr/lib/python3/dist-packages/twisted/internet/posixbase.py", line 397, in listenUNIX
  Apr  1 03:27:53 maas-1 landscape-client[1538354]:     p.startListening()
  Apr  1 03:27:53 maas-1 landscape-client[1538354]:   File "/usr/lib/python3/dist-packages/twisted/internet/unix.py", line 372, in startListening
  Apr  1 03:27:53 maas-1 landscape-client[1538354]:     if not self.lockFile.lock():
  Apr  1 03:27:53 maas-1 landscape-client[1538354]:   File "/usr/lib/python3/dist-packages/twisted/python/lockfile.py", line 185, in lock
  Apr  1 03:27:53 maas-1 landscape-client[1538354]:     kill(int(pid), 0)
  Apr  1 03:27:53 maas-1 landscape-client[1538354]: PermissionError: [Errno 1] Operation not permitted
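
The final PermissionError in the traceback above is raised by a `kill(pid, 0)` probe. Signal 0 delivers nothing and only checks whether the caller could signal the process, so the probe has three outcomes, sketched below. The function name and return labels are hypothetical. The traceback suggests that the "foreign" case (a live process owned by another user, for example after PID reuse) is the one that propagated as an unhandled error.

```python
import os


def probe_pid(pid):
    """Classify a PID using signal 0, which checks without signalling."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return "missing"  # ESRCH: no such process, the lock is stale
    except PermissionError:
        return "foreign"  # EPERM: process exists but belongs to another user
    return "signalable"  # process exists and we may signal it
```

For example, `probe_pid(os.getpid())` returns "signalable", while an unprivileged caller probing PID 1 would typically get "foreign".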

  In the sockets directory I saw:

  $ sudo ls /var/lib/landscape/client/sockets/ -la
  total 8
  drwxr-x--- 2 landscape root      4096 Apr  1 03:27 .
  drwxr-xr-x 7 landscape root      4096 Apr  1 03:27 ..
  srw-rw-rw- 1 landscape landscape    0 Mar 12 01:41 broker.sock
  lrwxrwxrwx 1 landscape landscape    3 Mar 12 01:41 broker.sock.lock -> 905

  Removing those two files allowed the landscape client to start as
  normal.

  Looks like we need some lockfile cleanup code on start.

To manage notifications about this bug go to:
https://bugs.launchpad.net/landscape-client/+bug/1870087/+subscriptions