openstack team mailing list archive

How do I unconfuse my resource_tracker?

To: "openstack@xxxxxxxxxxxxxxxxxxx" <openstack@xxxxxxxxxxxxxxxxxxx>
From: Jonathan Proulx <jon@xxxxxxxxxxxxx>
Date: Tue, 6 Nov 2012 16:03:16 -0500
Sender: jonathan.proulx@xxxxxxxxx

My compute nodes are confused about how many resources they have free.  I
suspect this is largely due to RPC timeouts I was experiencing, caused by a
misconfiguration and compounded by high load and a scheduler bug, but I'm
less interested in how it got this way than in how to clean it up.

For example, on node nova-1 there are actually 7 single-CPU instances
running, as shown by virsh on the system and in the instances table, and the
node has 24 vCPUs in total. However, it reports -102 VCPUs available:

root@nova-1:~# grep AUDIT /var/log/nova/nova-compute.log|tail -4
2012-11-06 15:49:20 AUDIT nova.compute.resource_tracker [-] Free VCPUS: -102
2012-11-06 15:50:50 AUDIT nova.compute.resource_tracker [-] Free ram (MB): -205145
2012-11-06 15:50:50 AUDIT nova.compute.resource_tracker [-] Free disk (GB): -2175
2012-11-06 15:50:50 AUDIT nova.compute.resource_tracker [-] Free VCPUS: -102

nova-manage service describe_resource nova-1
HOST                              PROJECT     cpu mem(mb)     hdd
nova-1          (total)                        24   48295     605
nova-1          (used_now)                    126  253440    2780
nova-1          (used_max)                      7   14336     210
nova-1          3008a142e9524f7295b06ea811908f93    7   14336     210

Both of these match what I see in the compute_nodes table, which also tells
me there are 88 running_vms (not 7).
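
As a sanity check on the numbers above, the negative "Free" values in the log
are exactly total minus used_now from the describe_resource output. The short
Python sketch below redoes that arithmetic with the figures copied from this
message. The dictionary keys and the free() helper are illustrative names, not
part of nova, and it assumes the used_max row reflects the 7 instances that
are actually running:

```python
# Figures copied from the describe_resource output for nova-1 above.
# Key names (vcpus, ram_mb, disk_gb) are illustrative, not nova's own.
total = {"vcpus": 24, "ram_mb": 48295, "disk_gb": 605}
used_now = {"vcpus": 126, "ram_mb": 253440, "disk_gb": 2780}
used_max = {"vcpus": 7, "ram_mb": 14336, "disk_gb": 210}


def free(total, used):
    """Return total minus used for each resource."""
    return {name: total[name] - used[name] for name in total}


# What the tracker reports: matches the AUDIT log lines.
reported_free = free(total, used_now)
# What it would report if usage matched the 7 real instances (assumption).
expected_free = free(total, used_max)

print(reported_free)  # {'vcpus': -102, 'ram_mb': -205145, 'disk_gb': -2175}
print(expected_free)  # {'vcpus': 17, 'ram_mb': 33959, 'disk_gb': 395}
```

So the log and describe_resource agree with each other; it is the used_now
figures themselves that are inflated.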

Where is the resource_tracker on nova-1 getting this information, and how do
I correct it?  (It's clearly pushing this bad info back to the database,
since I first tried to correct it there.)

-Jon