yahoo-eng-team team mailing list archive
-
yahoo-eng-team team
-
Mailing list archive
-
Message #75978
[Bug 1805446] [NEW] per-instance is not run after reboot during boot
Public bug reported:
Bumped into a problem with missing user after node reboot during per-
instance module execution. What happens is that the file lock is
acquired, but module does not finish before rebooting[1]. After reboot
we can see that cloud-init thinks module ran successfully although it
was interrupted before succeeding to add user[2]. And later we can see
other modules failing due to missing user[3].
I suppose there is attempts to run same module multiple times at the
same time so we need to have the lock file written before module
execution? Obviously would be easiest if sem.lock() could just be moved
to after execution. If this is not possible, I guess the locking could
be improved in a way that has_run() would distinguish module
running/finished from module failed mid-run on first boot.
[1]
2018-11-23 11:11:18,195 - stages.py[DEBUG]: Running module users-groups (<module 'cloudinit.config.cc_users_groups' from '/usr/lib/python2.7/site-packages/cloudinit/config/cc_users_groups.pyc'>) with frequency once-per-instance
2018-11-23 11:11:18,195 - handlers.py[DEBUG]: start: init-network/config-users-groups: running config-users-groups with frequency once-per-instance
2018-11-23 11:11:18,196 - util.py[DEBUG]: Running command sync with allowed return codes [0] (shell=False, capture=True)
^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@2018-11-23 11:11:51,582 - util.py[DEBUG]: Cloud-init v. 17.1 running 'init-local' at Fri, 23 Nov 2018 11:11:51 +0000. Up 13.86 seconds.
2018-11-23 11:11:51,608 - main.py[DEBUG]: No kernel command line url found.
[2]
2018-11-23 11:12:02,624 - stages.py[DEBUG]: Running module users-groups (<module 'cloudinit.config.cc_users_groups' from '/usr/lib/python2.7/site-packages/cloudinit/config/cc_users_groups.pyc'>) with frequency once-per-instance
2018-11-23 11:12:02,625 - handlers.py[DEBUG]: start: init-network/config-users-groups: running config-users-groups with frequency once-per-instance
2018-11-23 11:12:02,625 - helpers.py[DEBUG]: config-users-groups already ran (freq=once-per-instance)
2018-11-23 11:12:02,625 - handlers.py[DEBUG]: finish: init-network/config-users-groups: SUCCESS: config-users-groups previously ran
[3]
2018-11-23 11:12:31,806 - util.py[DEBUG]: Running module ssh-authkey-fingerprints (<module 'cloudinit.config.cc_ssh_authkey_fingerprints' from '/usr/lib/python2.7/site-packages/cloudinit/config/cc_ssh_authkey_fingerprints.pyc'>) failed
Traceback (most recent call last):
File "/usr/lib/python2.7/site-packages/cloudinit/stages.py", line 793, in _run_modules
freq=freq)
File "/usr/lib/python2.7/site-packages/cloudinit/cloud.py", line 54, in run
return self._runners.run(name, functor, args, freq, clear_on_fail)
File "/usr/lib/python2.7/site-packages/cloudinit/helpers.py", line 198, in run
results = functor(*args)
File "/usr/lib/python2.7/site-packages/cloudinit/config/cc_ssh_authkey_fingerprints.py", line 108, in handle
(key_fn, key_entries) = ssh_util.extract_authorized_keys(user_name)
File "/usr/lib/python2.7/site-packages/cloudinit/ssh_util.py", line 212, in extract_authorized_keys
(ssh_dir, pw_ent) = users_ssh_info(username)
File "/usr/lib/python2.7/site-packages/cloudinit/ssh_util.py", line 205, in users_ssh_info
pw_ent = pwd.getpwnam(username)
KeyError: 'getpwnam(): name not found: testuser'
** Affects: cloud-init
Importance: Undecided
Status: New
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to cloud-init.
https://bugs.launchpad.net/bugs/1805446
Title:
per-instance is not run after reboot during boot
Status in cloud-init:
New
Bug description:
Bumped into a problem with missing user after node reboot during per-
instance module execution. What happens is that the file lock is
acquired, but module does not finish before rebooting[1]. After reboot
we can see that cloud-init thinks module ran successfully although it
was interrupted before succeeding to add user[2]. And later we can see
other modules failing due to missing user[3].
I suppose there is attempts to run same module multiple times at the
same time so we need to have the lock file written before module
execution? Obviously would be easiest if sem.lock() could just be
moved to after execution. If this is not possible, I guess the locking
could be improved in a way that has_run() would distinguish module
running/finished from module failed mid-run on first boot.
[1]
2018-11-23 11:11:18,195 - stages.py[DEBUG]: Running module users-groups (<module 'cloudinit.config.cc_users_groups' from '/usr/lib/python2.7/site-packages/cloudinit/config/cc_users_groups.pyc'>) with frequency once-per-instance
2018-11-23 11:11:18,195 - handlers.py[DEBUG]: start: init-network/config-users-groups: running config-users-groups with frequency once-per-instance
2018-11-23 11:11:18,196 - util.py[DEBUG]: Running command sync with allowed return codes [0] (shell=False, capture=True)
^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@2018-11-23 11:11:51,582 - util.py[DEBUG]: Cloud-init v. 17.1 running 'init-local' at Fri, 23 Nov 2018 11:11:51 +0000. Up 13.86 seconds.
2018-11-23 11:11:51,608 - main.py[DEBUG]: No kernel command line url found.
[2]
2018-11-23 11:12:02,624 - stages.py[DEBUG]: Running module users-groups (<module 'cloudinit.config.cc_users_groups' from '/usr/lib/python2.7/site-packages/cloudinit/config/cc_users_groups.pyc'>) with frequency once-per-instance
2018-11-23 11:12:02,625 - handlers.py[DEBUG]: start: init-network/config-users-groups: running config-users-groups with frequency once-per-instance
2018-11-23 11:12:02,625 - helpers.py[DEBUG]: config-users-groups already ran (freq=once-per-instance)
2018-11-23 11:12:02,625 - handlers.py[DEBUG]: finish: init-network/config-users-groups: SUCCESS: config-users-groups previously ran
[3]
2018-11-23 11:12:31,806 - util.py[DEBUG]: Running module ssh-authkey-fingerprints (<module 'cloudinit.config.cc_ssh_authkey_fingerprints' from '/usr/lib/python2.7/site-packages/cloudinit/config/cc_ssh_authkey_fingerprints.pyc'>) failed
Traceback (most recent call last):
File "/usr/lib/python2.7/site-packages/cloudinit/stages.py", line 793, in _run_modules
freq=freq)
File "/usr/lib/python2.7/site-packages/cloudinit/cloud.py", line 54, in run
return self._runners.run(name, functor, args, freq, clear_on_fail)
File "/usr/lib/python2.7/site-packages/cloudinit/helpers.py", line 198, in run
results = functor(*args)
File "/usr/lib/python2.7/site-packages/cloudinit/config/cc_ssh_authkey_fingerprints.py", line 108, in handle
(key_fn, key_entries) = ssh_util.extract_authorized_keys(user_name)
File "/usr/lib/python2.7/site-packages/cloudinit/ssh_util.py", line 212, in extract_authorized_keys
(ssh_dir, pw_ent) = users_ssh_info(username)
File "/usr/lib/python2.7/site-packages/cloudinit/ssh_util.py", line 205, in users_ssh_info
pw_ent = pwd.getpwnam(username)
KeyError: 'getpwnam(): name not found: testuser'
To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-init/+bug/1805446/+subscriptions
Follow ups