yahoo-eng-team team mailing list archive
Message #47223
[Bug 1552293] [NEW] glance plugin not exiting when tar exits from bad data being passed.
Public bug reported:
This bug only applies to OpenStack users running xenapi in combination
with the glance plugin to fetch images from Swift. There is logic in
utils.py to kill the tar process when Python hits an exception from
subprocess during tar extraction in the xenapi glance plugin. This
logic works when tar exits cleanly with an EOF on a truncated file.
However, if bad data is appended to that truncated file (for example, a
malformed HTTP response), tar dies and respawns outside the
child/parent process tree of the xen fork executioner daemon, and the
new tar continues to read from the same stdin pipe.

Looking at the code, it appears we don't poll the process and instead
expect tar to exit in a way that raises an exception in Python, which
is not always the case. The most minimal code change is to poll the
process inside the extraction function. With that change tar no longer
hangs like this: even though the new tar is its own process under
ppid 1 (init), when the glance plugin reclaims the defunct tar it
closes the pipe, which kills the new tar process because it is still
trying to read from that pipe.

Below is what it looks like when this issue occurs:
root 8750 10.6 0.7 9752 5836 ? Ss 23:44 0:06 \_ python /etc/xapi.d/plugins/glance <methodCall><methodName>download_vhd2</methodName><params><param><value>OpaqueRef:a9b81f62-0281-2301-1eec-754fd2f1a057</value></
root 8820 5.4 0.0 0 0 ? Z 23:45 0:03 \_ [tar] <defunct>
root 8829 1.7 0.0 2552 464 ? S 23:45 0:01 tar -zx --directory=/var/run/sr-mount/637c4bf0-3cf6-b283-66f0-7087dec0439e/tmpIbRPA1
root 8830 18.9 0.0 0 0 ? Z 23:45 0:11 \_ [gzip] <defunct>
[root@]# ls -la fd/*
l-wx------ 1 root root 64 Mar 1 23:45 fd/0 -> /dev/null
l-wx------ 1 root root 64 Mar 1 23:45 fd/1 -> /tmp/execute_command_get_outc14016.log
l-wx------ 1 root root 64 Mar 1 23:45 fd/2 -> /tmp/execute_command_get_errf62557.log
lrwx------ 1 root root 64 Mar 1 23:45 fd/3 -> socket:[718064936]
lrwx------ 1 root root 64 Mar 1 23:45 fd/4 -> socket:[718064938]
l-wx------ 1 root root 64 Mar 1 23:45 fd/6 -> pipe:[718065596]
lr-x------ 1 root root 64 Mar 1 23:45 fd/7 -> pipe:[718065597]
[root@]# ls -la ../8829/fd/*
lr-x------ 1 root root 64 Mar 1 23:46 ../8829/fd/0 -> pipe:[718065596]
l-wx------ 1 root root 64 Mar 1 23:46 ../8829/fd/1 -> pipe:[718065612]
l-wx------ 1 root root 64 Mar 1 23:46 ../8829/fd/2 -> pipe:[718065597]
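The polling change proposed in the description can be sketched roughly
as follows. This is only an illustration, not the actual plugin code:
the names stream_to_process and ExtractionError are hypothetical, the
sketch assumes Python 3 (for subprocess.DEVNULL), and the command is
passed in as a parameter rather than hard-coding tar. The idea is to
call poll() before every write so that a child that has already exited
is noticed even when the write itself would not raise, and to always
close the pipe and reap the child on the way out.

```python
import subprocess


class ExtractionError(Exception):
    """Raised when the extraction process fails or exits early."""


def stream_to_process(chunks, cmd):
    """Feed chunks to cmd's stdin, polling the child before each write.

    Hypothetical helper for illustration; not the nova plugin's API.
    """
    proc = subprocess.Popen(cmd,
                            stdin=subprocess.PIPE,
                            stdout=subprocess.DEVNULL,
                            stderr=subprocess.DEVNULL)
    try:
        for chunk in chunks:
            # poll() returns None while the child is still running.
            returncode = proc.poll()
            if returncode is not None:
                raise ExtractionError(
                    "process exited early with code %s" % returncode)
            proc.stdin.write(chunk)
            proc.stdin.flush()
    except OSError as exc:
        # Covers BrokenPipeError when the read end has gone away.
        raise ExtractionError("write to process failed: %s" % exc)
    finally:
        # Closing our end of the pipe gives any remaining reader an EOF.
        try:
            proc.stdin.close()
        except OSError:
            pass
        # Reap the child so it does not linger as a defunct process.
        proc.wait()
    if proc.returncode != 0:
        raise ExtractionError(
            "process exited with code %s" % proc.returncode)
    return proc.returncode
```

A caller would pass the image data iterator and a command such as
["tar", "-zx", "--directory=" + target_dir], where target_dir is a
placeholder for the staging path. Whether the final non-zero exit check
is wanted depends on how the plugin treats tar warnings.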
** Affects: nova
Importance: Undecided
Assignee: Tim Pownall (pownalltim)
Status: In Progress
** Changed in: nova
Assignee: (unassigned) => Tim Pownall (pownalltim)
** Changed in: nova
Status: New => In Progress
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1552293
Title:
glance plugin not exiting when tar exits from bad data being passed.
Status in OpenStack Compute (nova):
In Progress
To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1552293/+subscriptions