
openstack team mailing list archive

Re: Failed to query agent version

 

Ahhhh - understood.  I now see the changes. Sorry, I had assumed that the config key I was talking about was present in the version of the code you are running.

You could also achieve a similar result by setting agent_version_timeout to a value lower than 300, which reduces the time Nova spends waiting for the agent.  This would still check that the agent exists, but would fail significantly faster.
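
To illustrate why a lower agent_version_timeout shortens the wait without skipping the check, here is a minimal Python sketch of a deadline-bounded poll. It is not Nova's actual code: `wait_for_agent`, `make_silent_agent` and `FakeClock` are hypothetical names used only for illustration, and the 30-second per-call timeout is taken from the log messages quoted further down the thread.

```python
import time


def wait_for_agent(query_agent, timeout, clock=time.monotonic):
    """Poll query_agent() until it returns a version or `timeout` seconds pass.

    Hypothetical helper, not part of Nova: it only models an overall
    deadline (in the spirit of agent_version_timeout) around repeated attempts.
    """
    deadline = clock() + timeout
    attempts = 0
    while clock() < deadline:
        attempts += 1
        version = query_agent()
        if version is not None:
            return version, attempts
    return None, attempts


class FakeClock:
    """Simulated clock so the example finishes instantly."""

    def __init__(self):
        self.now = 0.0

    def __call__(self):
        return self.now


def make_silent_agent(clock, per_call_timeout=30):
    """Build a query function that 'blocks' for per_call_timeout and fails."""
    def query():
        clock.now += per_call_timeout  # simulate one timed-out agent call
        return None
    return query


for timeout in (300, 60):
    clock = FakeClock()
    version, attempts = wait_for_agent(make_silent_agent(clock), timeout, clock)
    print(f"timeout={timeout}s: gave up after {clock.now:.0f}s "
          f"and {attempts} attempts")
```

In this model the guest without an agent is still queried, but with the lower deadline the loop gives up after 60 simulated seconds instead of 300.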

Thanks,

Bob

From: Gonzalo Alvarez [mailto:gonzaloab@xxxxxxxxx]
Sent: 09 May 2013 14:54
To: Bob Ball
Cc: OpenStack
Subject: Re: [Openstack] Failed to query agent version

Hi Bob,

    Well, it seems to me that the OpenStack version shipped with Debian 7.0 Wheezy (2012.1.1) does not include that config key.

    I've made the change myself in the vmops.py file to ignore all the logic regarding the agent. I've created a patch in case somebody with the same Debian version hits the same problem: http://pastebin.com/FqvGNAUP

    For the sake of documentation, here is my nova.conf file: http://pastebin.com/UxJUEKZd

    So, now it works and I'm really happy with it :)

Thanks a lot Bob for your support.

Regards,
Gonzalo Alvarez.

---
GonzaloAlvarez.es<http://www.gonzaloalvarez.es/>
Ponzano 80, 6º 6 - 28003 - Madrid
Mail<mailto:gonzaloab@xxxxxxxxx> | Facebook<https://www.facebook.com/gonzalo.alvarez> | Linkedin<http://www.linkedin.com/in/gonzaloab>
Tel: +34 678 252 458

On 9 May 2013 15:12, Bob Ball <bob.ball@xxxxxxxxxx> wrote:
Hi Gonzalo,

That's very strange.  We've checked the code and have confirmed it should be responding to this config value as described.  We've also got several test scenarios that do not have the agent installed that are not suffering from this - although there may be another configuration difference between the two setups.

Can you confirm that you restarted nova after changing the config key?

If restarting nova didn't fix the problem, please can you pastebin your nova config and some more context around the log file messages.

Thanks,

Bob

From: Gonzalo Alvarez [mailto:gonzaloab@xxxxxxxxx]
Sent: 09 May 2013 14:01
To: Bob Ball
Cc: OpenStack
Subject: Re: [Openstack] Failed to query agent version

Hi Bob,

    Still nothing :(

    I've changed the nova.conf file to reflect the new key:

xenapi_disable_agent=True

    but it still doesn't work. I keep getting the timeout error messages in the nova-compute.log and the dashboard keeps showing the 'Spawning' message much longer than it should. The instance is up in around 40 seconds, but the OpenStack dashboard marks it as running only after around 4 minutes.

    In case it helps, I'm using Debian Wheezy 7.0, Xen 4.1.4 and XCP 1.3.2 as my domU (so it is XCP Kronos), and I obviously have the nova-xcp-plugins installed.

    BTW, I made a mistake the first time. It is always the other way around in the nova-compute.log. First I get the 'Running' state change message in the log, and then all the errors regarding the timeout. Sorry about that.

Regards,
Gonzalo.


On 9 May 2013 14:33, Bob Ball <bob.ball@xxxxxxxxxx> wrote:
Hi Gonzalo,

My apologies - I made a mistake when copy/pasting the key.  I hope that adding the key didn't make any difference?

The key should be using underscores, not hyphens: xenapi_disable_agent=True
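
A small Python sketch of how such a mistyped key could be spotted in a config file. It is only a heuristic under the assumption that every option name in the file is meant to use underscores (as in the nova.conf quoted later in this thread); `find_hyphenated_keys` and `SAMPLE` are hypothetical names, not part of Nova.

```python
SAMPLE = """\
[DEFAULT]
connection_type=xenapi
xenapi-disable-agent=True
"""


def find_hyphenated_keys(conf_text):
    """Return (line_no, section, key, suggestion) for option names with '-'."""
    findings = []
    section = None
    for line_no, raw in enumerate(conf_text.splitlines(), start=1):
        line = raw.strip()
        # Skip blank lines and comments.
        if not line or line.startswith(("#", ";")):
            continue
        # Track the current [section] header.
        if line.startswith("[") and line.endswith("]"):
            section = line[1:-1]
            continue
        key, sep, _value = line.partition("=")
        key = key.strip()
        if sep and "-" in key:
            findings.append((line_no, section, key, key.replace("-", "_")))
    return findings


for line_no, section, key, suggestion in find_hyphenated_keys(SAMPLE):
    print(f"line {line_no} [{section}]: '{key}' -> did you mean '{suggestion}'?")
```

Only the option name is inspected, so hyphens inside values (for example in a URL or a path) are not flagged.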

I raised https://bugs.launchpad.net/nova/+bug/1178223 as a bug to fix the default value.

Thanks,

Bob

From: Gonzalo Alvarez [mailto:gonzaloab@xxxxxxxxx]
Sent: 09 May 2013 13:24
To: Bob Ball
Cc: OpenStack
Subject: Re: [Openstack] Failed to query agent version

Hi Bob,

    thanks a lot for your answer, but neither of the solutions worked for me.

    The latest version of the Rackspace-supplied agent simply refuses to install on Ubuntu 13.03 and Debian 7.0. I will ask the rackerlabs people about it with a detailed description of my errors, but it doesn't seem straightforward at all...

    Also, as you suggested, I added the xenapi-disable-agent key to my nova.conf file. Now the message saying that the instance is in the 'Running' state appears earlier in the nova-compute log, but in the dashboard it is still marked as 'Spawning' :(

    By the way, this is my nova.conf file, in case it helps...

[DEFAULT]
logdir=/var/log/nova
state_path=/var/lib/nova
lock_path=/var/lock/nova
root_helper=sudo nova-rootwrap
auth_strategy=keystone
verbose=True

sql_connection=mysql://nova-common:xxxxxxxx@localhost/nova

sr_matching_filter=default-sr:true
connection_type=xenapi
xenapi_connection_url=https://192.168.2.48
xenapi_connection_username=root
xenapi_connection_password=xxxxxxx
rescue_timeout=86400
xenapi_inject_image=False

network_manager=nova.network.manager.FlatManager
image_service=nova.image.glance.GlanceImageService
flat_network_bridge=xenbr0
public_interface=eth0
flat_interface=eth0

host=192.168.2.14

volume_driver=nova.volume.xensm.XenSMDriver
use_local_volumes=False

xenapi-disable-agent=True

# AUTHENTICATION
auth_strategy=keystone
[keystone_authtoken]
auth_host = 127.0.0.1
auth_port = 35357
auth_protocol = http
admin_tenant_name = admin
admin_user = admin
admin_password = admin
signing_dirname = /tmp/keystone-signing-nova


Thanks a lot.

Regards,
Gonzalo.


On 9 May 2013 13:20, Bob Ball <bob.ball@xxxxxxxxxx> wrote:
Hi Gonzalo,

The agent referred to here is the Rackspace-supplied agent, which can be installed in images so that it is present when they are booted and you do not suffer this timeout.  I believe the agents are currently available from https://github.com/rackerlabs/openstack-guest-agents-unix and https://github.com/rackerlabs/openstack-guest-agents-windows-xenserver.

Set the value xenapi-disable-agent=true in /etc/nova/nova.conf and the presence of the agent will not be checked.  The impact of this is that Nova will think the guest is running before the bootup sequence has actually finished, and (of course) you will not have the agent functionality such as live password reset.

Thanks,

Bob

From: Openstack [mailto:openstack-bounces+bob.ball=citrix.com@xxxxxxxxxxxxxxxxxxx] On Behalf Of Gonzalo Alvarez
Sent: 09 May 2013 11:38
To: OpenStack
Subject: [Openstack] Failed to query agent version

Hi all,

    I've managed to install OpenStack (Version 2012.1.1-18) on top of a XenServer installation. Instances are created fine, but it takes a really long time for OpenStack to realize that the instance is up and running. In the nova-compute.log I see the following messages:

2013-05-09 05:19:47 DEBUG nova.virt.xenapi_conn [-] Got exception: ['XENAPI_PLUGIN_FAILURE', 'version', 'PluginError', 'TIMEOUT: No response from agent within 30 seconds.'] from (pid=3047) _unwrap_plugin_exceptions /usr/lib/python2.7/dist-packages/nova/virt/xenapi_conn.py:612
2013-05-09 05:19:47 ERROR nova.virt.xenapi.vmops [req-80868dbf-a0ec-4e5c-ba1c-40073407a69e e2d41ace30d1492ebb24edfbf30b9089 136c4d2c8f3e4c74a0a4a4ef0e45a901] TIMEOUT: The call to version timed out. VM id=a1978a21-7598-4d5e-984b-a9ca858f7237; args={'path': '', 'dom_id': '4', 'id': '957d7359-1632-4b38-bf19-8bfd0d45aca5', 'host_uuid': 'c38359bb-7a82-ac2b-0ee6-4a6dd68c5285'}
2013-05-09 05:19:47 ERROR nova.virt.xenapi.vmops [req-80868dbf-a0ec-4e5c-ba1c-40073407a69e e2d41ace30d1492ebb24edfbf30b9089 136c4d2c8f3e4c74a0a4a4ef0e45a901] Failed to query agent version: {'message': 'TIMEOUT: No response from agent within 30 seconds.', 'returncode': 'timeout'}

These three messages are repeated 5 times, until I get these messages:

2013-05-09 05:24:20 DEBUG nova.compute.manager [req-80868dbf-a0ec-4e5c-ba1c-40073407a69e e2d41ace30d1492ebb24edfbf30b9089 136c4d2c8f3e4c74a0a4a4ef0e45a901] [instance: a1978a21-7598-4d5e-984b-a9ca858f7237] Checking state from (pid=3047) _get_power_state /usr/lib/python2.7/dist-packages/nova/compute/manager.py:272
2013-05-09 05:24:20 INFO nova.virt.xenapi.vm_utils [req-80868dbf-a0ec-4e5c-ba1c-40073407a69e e2d41ace30d1492ebb24edfbf30b9089 136c4d2c8f3e4c74a0a4a4ef0e45a901] (VM_UTILS) xenserver vm state -> |Running|
2013-05-09 05:24:20 INFO nova.virt.xenapi.vm_utils [req-80868dbf-a0ec-4e5c-ba1c-40073407a69e e2d41ace30d1492ebb24edfbf30b9089 136c4d2c8f3e4c74a0a4a4ef0e45a901] (VM_UTILS) xenapi power_state -> |1|

And then the dashboard properly shows the instance as running. But the instance was up and running long before that. The instance takes about 30 seconds to reach a command prompt, but OpenStack waits about 4 minutes before marking it as 'Running'.
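
As a rough check of that delay, the following Python sketch computes the gap between the two timestamps quoted above (the timeout errors at 05:19:47 and the 'Running' state message at 05:24:20). The function name `seconds_between` is hypothetical, and the sketch assumes the quoted lines belong to the same spawn request, which their shared request id suggests.

```python
from datetime import datetime

# Timestamps copied from the log excerpts above.
TIMEOUT_LOGGED = "2013-05-09 05:19:47"
RUNNING_LOGGED = "2013-05-09 05:24:20"


def seconds_between(start, end, fmt="%Y-%m-%d %H:%M:%S"):
    """Return the number of whole seconds from `start` to `end`."""
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return int(delta.total_seconds())


delay = seconds_between(TIMEOUT_LOGGED, RUNNING_LOGGED)
minutes, seconds = divmod(delay, 60)
print(f"{delay} s = {minutes} min {seconds} s")
```

The gap comes out at 273 seconds (4 min 33 s), which is in line with the roughly 4 minutes observed in the dashboard.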

Is there any way to fix this?

Thanks in advance.

Regards,
Gonzalo Alvarez.



