yahoo-eng-team team mailing list archive

[Bug 1959349] [NEW] test_live_migration_with_trunk fails intermittently due to subport not becoming ACTIVE

 

Public bug reported:

Seeing an issue with
tempest.api.compute.admin.test_live_migration.LiveMigrationTest.test_live_migration_with_trunk
failing intermittently. The live migration appears to complete
successfully, but the subport fails to become ACTIVE within the
60-second timeout:

2022-01-27 21:49:54,119 107445 DEBUG    [tempest.lib.common.rest_client] Request - Headers: {'Content-Type': 'application/json', 'Accept': 'application/json', 'X-Auth-Token': '<omitted>'}
        Body: None
    Response - Headers: {'date': 'Thu, 27 Jan 2022 21:49:54 GMT', 'server': 'Apache/2.4.41 (Ubuntu)', 'content-type': 'application/json', 'content-length': '782', 'x-openstack-request-id': 'req-79ecb361-fe12-4efe-9d7d-323378afabdf', 'connection': 'close', 'status': '200', 'content-location': 'https://10.176.196.23:9696/v2.0/ports/94892f6e-22d8-4707-bfd6-05ae6159c1e5'}
        Body: b'{"port":{"id":"94892f6e-22d8-4707-bfd6-05ae6159c1e5","name":"tempest-subport-887366041","network_id":"d60604d8-e700-41d5-8d80-ae2f840e7a9e","tenant_id":"2fe75fa1bd044fb89156fa9b6245a9af","mac_address":"fa:16:3e:74:a4:4d","admin_state_up":true,"status":"DOWN","device_id":"","device_owner":"trunk:subport","fixed_ips":[{"subnet_id":"cd95ee55-6cb3-4401-be11-53207a7794c9","ip_address":"19.80.0.197"}],"allowed_address_pairs":[],"extra_dhcp_opts":[],"security_groups":["9eef9e28-4f49-4f80-9680-af8011de0c8b"],"description":"","binding:vnic_type":"normal","port_security_enabled":true,"qos_policy_id":null,"qos_network_policy_id":null,"tags":[],"created_at":"2022-01-27T21:47:56Z","updated_at":"2022-01-27T21:48:56Z","revision_number":6,"project_id":"2fe75fa1bd044fb89156fa9b6245a9af"}}'
2022-01-27 21:49:59,125 107445 DEBUG    [tempest.lib.common.utils.test_utils] Call _is_port_status_active returns false in 60.000000 seconds

Traceback (most recent call last):
  File "/opt/stack/tempest/tempest/common/utils/__init__.py", line 89, in wrapper
    return func(*func_args, **func_kwargs)
  File "/opt/stack/tempest/tempest/common/utils/__init__.py", line 70, in wrapper
    return f(*func_args, **func_kwargs)
  File "/opt/stack/tempest/tempest/api/compute/admin/test_live_migration.py", line 281, in test_live_migration_with_trunk
    self.assertTrue(
  File "/usr/lib/python3.8/unittest/case.py", line 765, in assertTrue
    raise self.failureException(msg)
AssertionError: False is not true

This is where the tempest test is failing [1]:

        # Wait till subport status is ACTIVE
        self.assertTrue(
            test_utils.call_until_true(
                self._is_port_status_active, CONF.validation.connect_timeout,
                5, subport['id']))
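
The _is_port_status_active helper essentially checks whether the Neutron
port's status field equals ACTIVE. A minimal standalone sketch of the same
poll-until-active loop, written against openstacksdk rather than the tempest
clients (the cloud name 'devstack' and credentials are assumptions), would
look roughly like this:

    # Minimal sketch: poll a Neutron port until it reports ACTIVE or the
    # timeout expires, mirroring what call_until_true does around
    # _is_port_status_active in the test. Assumes openstacksdk and a
    # clouds.yaml entry named "devstack".
    import time

    import openstack


    def wait_for_port_active(conn, port_id, timeout=60, interval=5):
        """Return True if the port reaches ACTIVE within `timeout` seconds."""
        deadline = time.monotonic() + timeout
        while time.monotonic() < deadline:
            port = conn.network.get_port(port_id)
            if port.status == 'ACTIVE':
                return True
            time.sleep(interval)
        return False


    if __name__ == '__main__':
        conn = openstack.connect(cloud='devstack')
        # Subport UUID from the failing run shown above, for illustration only.
        active = wait_for_port_active(
            conn, '94892f6e-22d8-4707-bfd6-05ae6159c1e5')
        print('ACTIVE' if active else 'still not ACTIVE after timeout')

In the failing run the port was still reporting DOWN when the 60 seconds ran
out, as the rest_client response above shows.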

The instance in this example, d0ef464c-2c66-4619-8d5d-e0eb0568f545, is
shown to have migrated successfully [2]:

Jan 27 21:48:58.568894 ubuntu-focal-rax-iad-0028206891 nova-compute[53318]: INFO nova.compute.manager [None req-0fb6b829-6b7e-460f-96f2-5774ba2fbc3e None None] [instance: d0ef464c-2c66-4619-8d5d-e0eb0568f545] Migrating instance to ubuntu-focal-rax-iad-0028206890 finished successfully.

Checking the neutron logs (q-svc and q-agt), I did not find any errors
for port 94892f6e-22d8-4707-bfd6-05ae6159c1e5, but these messages in
q-svc might be relevant [3]:

Jan 27 21:48:56.252261 ubuntu-focal-rax-iad-0028206890 neutron-server[90137]: DEBUG neutron.plugins.ml2.rpc [None req-8069a001-aba6-49cb-a9b7-8812281db396 None None] Device 94892f6e-22d8-4707-bfd6-05ae6159c1e5 no longer exists at agent ovs-agent-ubuntu-focal-rax-iad-0028206891 {{(pid=90137) update_device_down /opt/stack/neutron/neutron/plugins/ml2/rpc.py:259}}

[...]

Jan 27 21:48:56.585883 ubuntu-focal-rax-iad-0028206890 neutron-server[90137]: DEBUG neutron.plugins.ml2.plugin [None req-8069a001-aba6-49cb-a9b7-8812281db396 None None] Current status of the port 94892f6e-22d8-4707-bfd6-05ae6159c1e5 is: ACTIVE; New status is: DOWN {{(pid=90137) _update_individual_port_db_status /opt/stack/neutron/neutron/plugins/ml2/plugin.py:2225}}

[...]

Jan 27 21:48:56.675076 ubuntu-focal-rax-iad-0028206890 neutron-server[90137]: DEBUG neutron.notifiers.nova [None req-8069a001-aba6-49cb-a9b7-8812281db396 None None] device_id is not set on port 94892f6e-22d8-4707-bfd6-05ae6159c1e5 yet. {{(pid=90137) _can_notify /opt/stack/neutron/neutron/notifiers/nova.py:194}}

[...]

Jan 27 21:48:56.763220 ubuntu-focal-rax-iad-0028206890 neutron-server[90137]: DEBUG neutron.plugins.ml2.db [None req-8069a001-aba6-49cb-a9b7-8812281db396 None None] For port 94892f6e-22d8-4707-bfd6-05ae6159c1e5, host ubuntu-focal-rax-iad-0028206891, got binding levels [PortBindingLevel(driver='openvswitch',host='ubuntu-focal-rax-iad-0028206891',level=0,port_id=94892f6e-22d8-4707-bfd6-05ae6159c1e5,segment=NetworkSegment(a64eebe3-ed10-423e-89ad-949a28f525d6),segment_id=a64eebe3-ed10-423e-89ad-949a28f525d6)] {{(pid=90137) get_binding_level_objs /opt/stack/neutron/neutron/plugins/ml2/db.py:74}}

[...]

Jan 27 21:48:56.947538 ubuntu-focal-rax-iad-0028206890 neutron-server[90137]: DEBUG neutron.api.rpc.handlers.resources_rpc [None req-8069a001-aba6-49cb-a9b7-8812281db396 None None] Pushing event updated for resources: {'Port': ['ID=94892f6e-22d8-4707-bfd6-05ae6159c1e5,revision_number=6']} {{(pid=90137) push /opt/stack/neutron/neutron/api/rpc/handlers/resources_rpc.py:237}}


[...]

Jan 27 21:48:58.985314 ubuntu-focal-rax-iad-0028206890 neutron-server[90137]: DEBUG neutron.plugins.ml2.rpc [None req-eeef9c46-be6a-44e1-b19d-d45aae4f830d None None] Device 94892f6e-22d8-4707-bfd6-05ae6159c1e5 up at agent ovs-agent-ubuntu-focal-rax-iad-0028206890 {{(pid=90137) update_device_up /opt/stack/neutron/neutron/plugins/ml2/rpc.py:296}}
Jan 27 21:48:59.121212 ubuntu-focal-rax-iad-0028206890 neutron-server[90137]: DEBUG neutron.plugins.ml2.plugin [None req-eeef9c46-be6a-44e1-b19d-d45aae4f830d None None] The host ubuntu-focal-rax-iad-0028206890 is not matching for port 94892f6e-22d8-4707-bfd6-05ae6159c1e5 host ubuntu-focal-rax-iad-0028206891! {{(pid=90137) port_bound_to_host /opt/stack/neutron/neutron/plugins/ml2/plugin.py:2323}}
Jan 27 21:48:59.121805 ubuntu-focal-rax-iad-0028206890 neutron-server[90137]: DEBUG neutron.plugins.ml2.rpc [None req-eeef9c46-be6a-44e1-b19d-d45aae4f830d None None] Device 94892f6e-22d8-4707-bfd6-05ae6159c1e5 not bound to the agent host ubuntu-focal-rax-iad-0028206890 {{(pid=90137) update_device_up /opt/stack/neutron/neutron/plugins/ml2/rpc.py:304}}

Then the last three log messages above repeat once more, and after that
all I see is the periodic polling of port
94892f6e-22d8-4707-bfd6-05ae6159c1e5, presumably the tempest test
waiting for it to go ACTIVE.
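
Reading the messages above, it looks like the subport's binding still points
at the source host (ubuntu-focal-rax-iad-0028206891) while the OVS agent on
the destination (ubuntu-focal-rax-iad-0028206890) reports the device up, so
update_device_up is rejected and the port never leaves DOWN. A hypothetical
diagnostic along those lines, comparing the subport's binding:host_id with
the host the migrated instance ended up on (openstacksdk attribute names,
admin credentials and the cloud name 'devstack' are assumptions), might look
like:

    # Hypothetical check for this failure mode: compare where the subport is
    # bound (binding:host_id) with the host the migrated instance runs on.
    # Admin credentials are required; the cloud name and UUIDs are taken from
    # this report purely for illustration.
    import openstack

    PORT_ID = '94892f6e-22d8-4707-bfd6-05ae6159c1e5'
    SERVER_ID = 'd0ef464c-2c66-4619-8d5d-e0eb0568f545'

    conn = openstack.connect(cloud='devstack')
    port = conn.network.get_port(PORT_ID)
    server = conn.compute.get_server(SERVER_ID)

    print('subport status      :', port.status)
    print('subport binding host:', port.binding_host_id)
    print('instance host       :', server.compute_host)

    # If the binding host still names the source node while the instance is
    # on the destination node, the "not bound to the agent host" messages
    # above are expected: the destination agent's update_device_up is
    # rejected and the subport stays DOWN.
    if port.binding_host_id != server.compute_host:
        print('Mismatch: subport binding was not moved to the destination host.')

For what it's worth, the port body in the tempest log above also shows an
empty device_id for this trunk subport, which matches the notifier's
"device_id is not set on port ... yet" message.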


[1] https://github.com/openstack/tempest/blob/8c4b99e26be51272692f51b00a534ee15dcf74d1/tempest/api/compute/admin/test_live_migration.py#L281
[2] https://zuul.opendev.org/t/openstack/build/c8f780748b10417a90dc74db65f8dd3d/log/compute1/logs/screen-n-cpu.txt#10973
[3] https://zuul.opendev.org/t/openstack/build/c8f780748b10417a90dc74db65f8dd3d/log/controller/logs/screen-q-svc.txt#11519

** Affects: nova
     Importance: Undecided
         Status: New


** Tags: gate-failure neutron

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1959349

Title:
  test_live_migration_with_trunk fails intermittently due to subport not
  becoming ACTIVE

Status in OpenStack Compute (nova):
  New

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1959349/+subscriptions