yahoo-eng-team team mailing list archive
Message #79241
[Bug 1836253] [NEW] Sometimes InstanceMetadata API returns 404 due to invalid InstanceID returned by _get_instance_and_tenant_id()
Public bug reported:
Sometimes on instance initialization, the metadata step fails.
In metadata-agent.log there are lots of 404s:
"GET /2009-04-04/meta-data/instance-id HTTP/1.1" status: 404 len: 297 time: 0.0771070
In nova-api.log we get 404s too:
"GET /2009-04-04/meta-data/instance-id HTTP/1.1" status: 404
After some debugging we found that the problem occurs when a new instance gets the same IP that was used by a deleted instance.
The problem is related to the cache used by the method "_get_ports_for_remote_address()" in "neutron/agent/metadata/agent.py": it returns a cached port belonging to the deleted instance (with the same IP), which yields the wrong instance ID. That ID is then sent to nova-api, which fails because the instance no longer exists.
This problem only occurs with the cache enabled on the neutron metadata-agent.
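The failure mode described above can be illustrated with a minimal, self-contained sketch (this is not the agent's actual code; the cache class, port dicts, and instance IDs below are hypothetical stand-ins for the memoized _get_ports_for_remote_address() lookup keyed by the requesting IP):

```python
import time

class TTLCache:
    """Hypothetical in-memory TTL cache keyed by instance IP."""
    def __init__(self, ttl_seconds):
        self.ttl = ttl_seconds
        self._store = {}  # ip -> (expiry_timestamp, port)

    def get(self, ip):
        entry = self._store.get(ip)
        if entry and entry[0] > time.time():
            return entry[1]
        return None

    def put(self, ip, port):
        self._store[ip] = (time.time() + self.ttl, port)

def lookup_port(cache, neutron_ports, ip):
    # Mimics the cached lookup: serve from cache if present, otherwise
    # query "neutron" (here just a dict) and cache the result.
    port = cache.get(ip)
    if port is None:
        port = neutron_ports.get(ip)
        if port is not None:
            cache.put(ip, port)
    return port

cache = TTLCache(ttl_seconds=300)
neutron_ports = {"10.0.0.5": {"device_id": "instance-OLD"}}

# First instance boots: its port is looked up and cached.
first = lookup_port(cache, neutron_ports, "10.0.0.5")

# That instance is deleted and a new one reuses the same IP.
neutron_ports["10.0.0.5"] = {"device_id": "instance-NEW"}

# Within the TTL, the stale cache entry is served: the agent would
# forward the old instance ID to nova-api, which returns 404.
stale = lookup_port(cache, neutron_ports, "10.0.0.5")
print(stale["device_id"])  # still "instance-OLD"
```

The key point is that the cache is keyed only by IP, so nothing invalidates the entry when the port behind that IP is deleted and recreated.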
Version: Queens
How to reproduce:
---
#!/bin/bash

computenodelist=(
  'computenode00.test.openstack.net'
  'computenode01.test.openstack.net'
  'computenode02.test.openstack.net'
  'computenode03.test.openstack.net'
)

validate_metadata(){
  cat << EOF > /tmp/metadata
#!/bin/sh -x
if curl 192.168.10.2
then
    echo "ControllerNode00 - OK"
else
    echo "ControllerNode00 - ERROR"
fi
EOF
  #SUBNAME=$(date +%s)
  openstack server delete "${node}" 2>/dev/null
  source /root/admin-openrc
  openstack server create --image cirros --nic net-id=internal --flavor Cirros --security-group default --user-data /tmp/metadata --availability-zone "nova:${node}" --wait "${node}" &> /dev/null

  # Poll the console log until the user-data script has run (or give up).
  i=0
  until [ "$i" -gt 3 ] || openstack console log show "${node}" | grep -q "ControllerNode00"
  do
    i=$((i+1))
    sleep 1
  done

  if openstack console log show "${node}" | grep -q "ControllerNode00 - OK"; then
    echo "Metadata Servers OK: ${node}"
  else
    echo "Metadata Servers ERROR: ${node}"
  fi

  rm /tmp/metadata
}

for node in "${computenodelist[@]}"
do
  export node
  validate_metadata
done
echo -e "\n"
---
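Since the report notes the problem only occurs with the cache enabled, disabling the agent-side cache is a possible temporary mitigation (at the cost of more neutron-server queries) until the stale-entry bug is fixed. A sketch, assuming the Queens metadata agent reads the standard oslo.cache options; verify the exact section and option names against your deployment's metadata_agent.ini:

```
# metadata_agent.ini (sketch; option names assume oslo.cache)
[cache]
enabled = false
```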
** Affects: neutron
Importance: Undecided
Status: New
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1836253