
yahoo-eng-team team mailing list archive

[Bug 1552293] [NEW] glance plugin not exiting when tar exits from bad data being passed.

 

Public bug reported:

This bug only applies to OpenStack users running XenAPI in combination
with the glance plugin to fetch images from Swift.  There is logic in
utils.py to kill the tar process when Python hits an exception from
subprocess during tar extraction in the XenAPI glance plugin.  This
logic works when tar exits cleanly with an EOF on a truncated file.
However, if bad data is appended to that truncated file (a malformed
HTTP response), tar dies and respawns outside of the child/parent
process tree of the Xen fork executioner daemon and continues to read
from that stdin pipe.

The most minimal code change is to poll for the process in the
extraction function.  With this change tar no longer hangs like this.
Even though the new tar is its own process under ppid 1 (init), when the
glance plugin reclaims the defunct tar it closes the pipe, which kills
the new tar process because that process is trying to read from that
pipe.

Looking at the code, it appears we don't poll for the process and
instead expect tar to exit in a way that raises an exception in Python.
However, this is not always the case.  Below is what it looks like when
this issue occurs:

root      8750 10.6  0.7   9752  5836 ?        Ss   23:44   0:06  \_ python /etc/xapi.d/plugins/glance <methodCall><methodName>download_vhd2</methodName><params><param><value>OpaqueRef:a9b81f62-0281-2301-1eec-754fd2f1a057</value></
root      8820  5.4  0.0      0     0 ?        Z    23:45   0:03      \_ [tar] <defunct>
root      8829  1.7  0.0   2552   464 ?        S    23:45   0:01 tar -zx --directory=/var/run/sr-mount/637c4bf0-3cf6-b283-66f0-7087dec0439e/tmpIbRPA1
root      8830 18.9  0.0      0     0 ?        Z    23:45   0:11  \_ [gzip] <defunct>

[root@]# ls -la fd/*
l-wx------ 1 root root 64 Mar  1 23:45 fd/0 -> /dev/null
l-wx------ 1 root root 64 Mar  1 23:45 fd/1 -> /tmp/execute_command_get_outc14016.log
l-wx------ 1 root root 64 Mar  1 23:45 fd/2 -> /tmp/execute_command_get_errf62557.log
lrwx------ 1 root root 64 Mar  1 23:45 fd/3 -> socket:[718064936]
lrwx------ 1 root root 64 Mar  1 23:45 fd/4 -> socket:[718064938]
l-wx------ 1 root root 64 Mar  1 23:45 fd/6 -> pipe:[718065596]
lr-x------ 1 root root 64 Mar  1 23:45 fd/7 -> pipe:[718065597]
[root@]# ls -la ../8829/fd/*
lr-x------ 1 root root 64 Mar  1 23:46 ../8829/fd/0 -> pipe:[718065596]
l-wx------ 1 root root 64 Mar  1 23:46 ../8829/fd/1 -> pipe:[718065612]
l-wx------ 1 root root 64 Mar  1 23:46 ../8829/fd/2 -> pipe:[718065597]
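
The polling change described above can be sketched roughly as follows.
This is not the plugin's code: stream_to_process and ExtractError are
hypothetical names, and the sketch assumes Python 3 and a child that
reads its input from stdin (as "tar -zx" does).  The idea is to check
the direct child with poll() between writes, so a defunct child is
noticed even when the write itself raises no exception, and to always
close our end of the pipe so that any process still reading from it
sees EOF.

```python
import subprocess
import sys


class ExtractError(Exception):
    """Hypothetical error type: the child did not consume all input."""


def stream_to_process(cmd, chunks):
    """Write chunks to cmd's stdin, polling the child between writes.

    Hypothetical helper; the name and signature are assumptions and do
    not come from the plugin's utils.py.
    """
    proc = subprocess.Popen(cmd, stdin=subprocess.PIPE,
                            stdout=subprocess.DEVNULL,
                            stderr=subprocess.DEVNULL)
    try:
        for chunk in chunks:
            # poll() returns None while the direct child is running.
            # Anything else means it has exited, even if another
            # process still holds the read end of the pipe open.
            if proc.poll() is not None:
                raise ExtractError(
                    "child exited early with status %d" % proc.returncode)
            proc.stdin.write(chunk)
            proc.stdin.flush()
        proc.stdin.close()
        status = proc.wait()
    except OSError as exc:
        # Covers BrokenPipeError when nobody reads the pipe any more.
        raise ExtractError("write to child failed: %s" % exc) from exc
    finally:
        # Runs on every path: closing our write end delivers EOF to
        # whatever is still reading from the pipe.
        try:
            proc.stdin.close()
        except OSError:
            pass  # pipe already broken, nothing left to flush
        if proc.poll() is None:
            proc.kill()
        proc.wait()  # reap the child so it does not stay defunct
    if status != 0:
        raise ExtractError("child exited with status %d" % status)
    return status


if __name__ == "__main__":
    # Stand-in child that drains stdin, so no real archive is needed.
    drain = [sys.executable, "-c", "import sys; sys.stdin.buffer.read()"]
    print(stream_to_process(drain, [b"chunk-1", b"chunk-2"]))
```

In the plugin, cmd would be the tar command line and chunks the image
data read from the HTTP response.  The demo at the bottom uses a
stand-in child instead.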

** Affects: nova
     Importance: Undecided
     Assignee: Tim Pownall (pownalltim)
         Status: In Progress

** Changed in: nova
     Assignee: (unassigned) => Tim Pownall (pownalltim)

** Changed in: nova
       Status: New => In Progress

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1552293

Title:
  glance plugin not exiting when tar exits from bad data being passed.

Status in OpenStack Compute (nova):
  In Progress

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1552293/+subscriptions

