← Back to team overview

yahoo-eng-team team mailing list archive

[Bug 1469754] [NEW] tempest.thirdparty.boto.test_ec2_instance_run.InstanceRunTest.test_compute_with_volumes fails with SSHTimeout

 

Public bug reported:

http://logs.openstack.org/23/193223/3/check/check-tempest-dsvm-
nova-v21-full/5f73178/console.html.gz#_2015-06-26_23_49_24_797

2015-06-26 23:49:24.797 | tempest.thirdparty.boto.test_ec2_instance_run.InstanceRunTest.test_compute_with_volumes[id-ab836c29-737b-4101-9fb9-87045eaf89e9]
2015-06-26 23:49:24.797 | --------------------------------------------------------------------------------------------------------------------------------
2015-06-26 23:49:24.797 | 
2015-06-26 23:49:24.798 | Captured traceback:
2015-06-26 23:49:24.798 | ~~~~~~~~~~~~~~~~~~~
2015-06-26 23:49:24.798 |     Traceback (most recent call last):
2015-06-26 23:49:24.798 |       File "tempest/thirdparty/boto/test_ec2_instance_run.py", line 338, in test_compute_with_volumes
2015-06-26 23:49:24.798 |         wait.state_wait(_part_state, 'INCREASE')
2015-06-26 23:49:24.798 |       File "tempest/thirdparty/boto/utils/wait.py", line 36, in state_wait
2015-06-26 23:49:24.798 |         old_status = status = lfunction()
2015-06-26 23:49:24.799 |       File "tempest/thirdparty/boto/test_ec2_instance_run.py", line 330, in _part_state
2015-06-26 23:49:24.799 |         current = ssh.get_partitions().split('\n')
2015-06-26 23:49:24.799 |       File "tempest/common/utils/linux/remote_client.py", line 82, in get_partitions
2015-06-26 23:49:24.799 |         output = self.exec_command(command)
2015-06-26 23:49:24.799 |       File "tempest/common/utils/linux/remote_client.py", line 56, in exec_command
2015-06-26 23:49:24.799 |         return self.ssh_client.exec_command(cmd)
2015-06-26 23:49:24.800 |       File "/opt/stack/new/tempest/.tox/full/local/lib/python2.7/site-packages/tempest_lib/common/ssh.py", line 111, in exec_command
2015-06-26 23:49:24.800 |         ssh = self._get_ssh_connection()
2015-06-26 23:49:24.800 |       File "/opt/stack/new/tempest/.tox/full/local/lib/python2.7/site-packages/tempest_lib/common/ssh.py", line 87, in _get_ssh_connection
2015-06-26 23:49:24.800 |         password=self.password)
2015-06-26 23:49:24.800 |     tempest_lib.exceptions.SSHTimeout: Connection to the 172.24.5.1 via SSH timed out.
2015-06-26 23:49:24.800 |     User: cirros, Password: None

This is a job with nova-network and in tracing through the calls we
definitely need some more debug logging in this path to see that we have
security groups associated with the instance to refresh the sg rules.

nw_info and instance uuid for the instance:

2015-06-26 23:49:24.883 |     === network info ===
2015-06-26 23:49:24.883 |     if-info: lo,up,127.0.0.1,8,::1
2015-06-26 23:49:24.883 |     if-info: eth0,up,10.1.0.2,20,fe80::f816:3eff:fefd:78bd
2015-06-26 23:49:24.883 |     ip-route:default via 10.1.0.1 dev eth0 
2015-06-26 23:49:24.883 |     ip-route:10.1.0.0/20 dev eth0  src 10.1.0.2 
2015-06-26 23:49:24.883 |     === datasource: configdrive local ===
2015-06-26 23:49:24.884 |     instance-id: 7377fb75-089e-4a7f-aa13-42073ca4d981
2015-06-26 23:49:24.884 |     name: Server 7377fb75-089e-4a7f-aa13-42073ca4d981
2015-06-26 23:49:24.884 |     availability-zone: test_az-1643222956
2015-06-26 23:49:24.884 |     local-hostname: server-7377fb75-089e-4a7f-aa13-42073ca4d981.novalocal
2015-06-26 23:49:24.884 |     launch-index: 0

We see that the fixed IP is associated with the instance here:

http://logs.openstack.org/23/193223/3/check/check-tempest-dsvm-
nova-v21-full/5f73178/logs/screen-n-net.txt.gz#_2015-06-26_23_41_39_500

The request ID is req-3b037057-5783-4160-a320-248eb4f2e724.

We see it refresh security groups and there are several messages about
skipping duplicate iptables rule additions:

2015-06-26 23:41:39.685 DEBUG nova.network.linux_net [req-
3b037057-5783-4160-a320-248eb4f2e724 InstanceRunTest-1951055199
InstanceRunTest-702105106] Skipping duplicate iptables rule addition.
[0:0] -A nova-network-snat -s 10.1.0.0/20 -d 0.0.0.0/0 -j SNAT --to-
source 10.0.4.130 -o br100 already in [[0:0] -A PREROUTING -j nova-
network-PREROUTING, [0:0] -A OUTPUT -j nova-network-OUTPUT, [0:0] -A
POSTROUTING -j nova-network-POSTROUTING, [0:0] -A POSTROUTING -j nova-
postrouting-bottom, [0:0] -A nova-postrouting-bottom -j nova-network-
snat, [0:0] -A nova-network-snat -j nova-network-float-snat, [0:0] -A
nova-network-PREROUTING -s 0.0.0.0/0 -d 169.254.169.254/32 -p tcp -m tcp
--dport 80 -j DNAT --to-destination 10.0.4.130:8775, [0:0] -A nova-
network-snat -s 10.1.0.0/20 -d 0.0.0.0/0 -j SNAT --to-source 10.0.4.130
-o br100, [0:0] -A nova-network-POSTROUTING -s 10.1.0.0/20 -d
10.0.4.130/32 -j ACCEPT, [0:0] -A nova-network-POSTROUTING -s
10.1.0.0/20 -d 10.1.0.0/20 -m conntrack ! --ctstate DNAT -j ACCEPT]
add_rule /opt/stack/new/nova/nova/network/linux_net.py:285

2015-06-26 23:41:39.685 DEBUG nova.network.linux_net [req-
3b037057-5783-4160-a320-248eb4f2e724 InstanceRunTest-1951055199
InstanceRunTest-702105106] Skipping apply due to lack of new rules apply
/opt/stack/new/nova/nova/network/linux_net.py:444

I'm sure this is probably a duplicate, maybe of bug 1355573, but that
bug is targeted at the test_volume_boot_pattern test and this is a
different test - although it's related to volumes so maybe related.

** Affects: nova
     Importance: Undecided
         Status: Confirmed


** Tags: boto volumes

** Changed in: nova
       Status: New => Confirmed

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1469754

Title:
  tempest.thirdparty.boto.test_ec2_instance_run.InstanceRunTest.test_compute_with_volumes
  fails with SSHTimeout

Status in OpenStack Compute (Nova):
  Confirmed

Bug description:
  http://logs.openstack.org/23/193223/3/check/check-tempest-dsvm-
  nova-v21-full/5f73178/console.html.gz#_2015-06-26_23_49_24_797

  2015-06-26 23:49:24.797 | tempest.thirdparty.boto.test_ec2_instance_run.InstanceRunTest.test_compute_with_volumes[id-ab836c29-737b-4101-9fb9-87045eaf89e9]
  2015-06-26 23:49:24.797 | --------------------------------------------------------------------------------------------------------------------------------
  2015-06-26 23:49:24.797 | 
  2015-06-26 23:49:24.798 | Captured traceback:
  2015-06-26 23:49:24.798 | ~~~~~~~~~~~~~~~~~~~
  2015-06-26 23:49:24.798 |     Traceback (most recent call last):
  2015-06-26 23:49:24.798 |       File "tempest/thirdparty/boto/test_ec2_instance_run.py", line 338, in test_compute_with_volumes
  2015-06-26 23:49:24.798 |         wait.state_wait(_part_state, 'INCREASE')
  2015-06-26 23:49:24.798 |       File "tempest/thirdparty/boto/utils/wait.py", line 36, in state_wait
  2015-06-26 23:49:24.798 |         old_status = status = lfunction()
  2015-06-26 23:49:24.799 |       File "tempest/thirdparty/boto/test_ec2_instance_run.py", line 330, in _part_state
  2015-06-26 23:49:24.799 |         current = ssh.get_partitions().split('\n')
  2015-06-26 23:49:24.799 |       File "tempest/common/utils/linux/remote_client.py", line 82, in get_partitions
  2015-06-26 23:49:24.799 |         output = self.exec_command(command)
  2015-06-26 23:49:24.799 |       File "tempest/common/utils/linux/remote_client.py", line 56, in exec_command
  2015-06-26 23:49:24.799 |         return self.ssh_client.exec_command(cmd)
  2015-06-26 23:49:24.800 |       File "/opt/stack/new/tempest/.tox/full/local/lib/python2.7/site-packages/tempest_lib/common/ssh.py", line 111, in exec_command
  2015-06-26 23:49:24.800 |         ssh = self._get_ssh_connection()
  2015-06-26 23:49:24.800 |       File "/opt/stack/new/tempest/.tox/full/local/lib/python2.7/site-packages/tempest_lib/common/ssh.py", line 87, in _get_ssh_connection
  2015-06-26 23:49:24.800 |         password=self.password)
  2015-06-26 23:49:24.800 |     tempest_lib.exceptions.SSHTimeout: Connection to the 172.24.5.1 via SSH timed out.
  2015-06-26 23:49:24.800 |     User: cirros, Password: None

  This is a job with nova-network and in tracing through the calls we
  definitely need some more debug logging in this path to see that we
  have security groups associated with the instance to refresh the sg
  rules.

  nw_info and instance uuid for the instance:

  2015-06-26 23:49:24.883 |     === network info ===
  2015-06-26 23:49:24.883 |     if-info: lo,up,127.0.0.1,8,::1
  2015-06-26 23:49:24.883 |     if-info: eth0,up,10.1.0.2,20,fe80::f816:3eff:fefd:78bd
  2015-06-26 23:49:24.883 |     ip-route:default via 10.1.0.1 dev eth0 
  2015-06-26 23:49:24.883 |     ip-route:10.1.0.0/20 dev eth0  src 10.1.0.2 
  2015-06-26 23:49:24.883 |     === datasource: configdrive local ===
  2015-06-26 23:49:24.884 |     instance-id: 7377fb75-089e-4a7f-aa13-42073ca4d981
  2015-06-26 23:49:24.884 |     name: Server 7377fb75-089e-4a7f-aa13-42073ca4d981
  2015-06-26 23:49:24.884 |     availability-zone: test_az-1643222956
  2015-06-26 23:49:24.884 |     local-hostname: server-7377fb75-089e-4a7f-aa13-42073ca4d981.novalocal
  2015-06-26 23:49:24.884 |     launch-index: 0

  We see that the fixed IP is associated with the instance here:

  http://logs.openstack.org/23/193223/3/check/check-tempest-dsvm-
  nova-v21-full/5f73178/logs/screen-n-net.txt.gz#_2015-06-26_23_41_39_500

  The request ID is req-3b037057-5783-4160-a320-248eb4f2e724.

  We see it refresh security groups and there are several messages about
  skipping duplicate iptables rule additions:

  2015-06-26 23:41:39.685 DEBUG nova.network.linux_net [req-
  3b037057-5783-4160-a320-248eb4f2e724 InstanceRunTest-1951055199
  InstanceRunTest-702105106] Skipping duplicate iptables rule addition.
  [0:0] -A nova-network-snat -s 10.1.0.0/20 -d 0.0.0.0/0 -j SNAT --to-
  source 10.0.4.130 -o br100 already in [[0:0] -A PREROUTING -j nova-
  network-PREROUTING, [0:0] -A OUTPUT -j nova-network-OUTPUT, [0:0] -A
  POSTROUTING -j nova-network-POSTROUTING, [0:0] -A POSTROUTING -j nova-
  postrouting-bottom, [0:0] -A nova-postrouting-bottom -j nova-network-
  snat, [0:0] -A nova-network-snat -j nova-network-float-snat, [0:0] -A
  nova-network-PREROUTING -s 0.0.0.0/0 -d 169.254.169.254/32 -p tcp -m
  tcp --dport 80 -j DNAT --to-destination 10.0.4.130:8775, [0:0] -A
  nova-network-snat -s 10.1.0.0/20 -d 0.0.0.0/0 -j SNAT --to-source
  10.0.4.130 -o br100, [0:0] -A nova-network-POSTROUTING -s 10.1.0.0/20
  -d 10.0.4.130/32 -j ACCEPT, [0:0] -A nova-network-POSTROUTING -s
  10.1.0.0/20 -d 10.1.0.0/20 -m conntrack ! --ctstate DNAT -j ACCEPT]
  add_rule /opt/stack/new/nova/nova/network/linux_net.py:285

  2015-06-26 23:41:39.685 DEBUG nova.network.linux_net [req-
  3b037057-5783-4160-a320-248eb4f2e724 InstanceRunTest-1951055199
  InstanceRunTest-702105106] Skipping apply due to lack of new rules
  apply /opt/stack/new/nova/nova/network/linux_net.py:444

  I'm sure this is probably a duplicate, maybe of bug 1355573, but that
  bug is targeted at the test_volume_boot_pattern test and this is a
  different test - although it's related to volumes so maybe related.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1469754/+subscriptions


Follow ups

References