yahoo-eng-team team mailing list archive

[Bug 1650188] [NEW] Concurrently update server's metadata are handled badly

 

Public bug reported:

Currently, we have two APIs to update a server's metadata:
1. update: only updates/adds the key=value pairs the user provided in this call
2. update_all: replaces all previously added key=value pairs with the key=value pairs the user provided in this call
Both use the same _update_instance_metadata method and differ only in one boolean flag:
http://git.openstack.org/cgit/openstack/nova/tree/nova/api/openstack/compute/server_metadata.py#n78
http://git.openstack.org/cgit/openstack/nova/tree/nova/api/openstack/compute/server_metadata.py#n95

The request is then handled in the following workflow:
I. Get the server object:
   http://git.openstack.org/cgit/openstack/nova/tree/nova/api/openstack/compute/server_metadata.py#n108
II. Call update_instance_metadata in compute/api.py:
   http://git.openstack.org/cgit/openstack/nova/tree/nova/compute/api.py#n3689
   The difference between update_all and update is handled here:
   i) with update_all, the target metadata is set to the metadata passed in, say new_meta
   ii) with update, the target metadata is set to old_meta + new_meta
III. Set instance.metadata to the target metadata and call instance.save() to do the job.
IV. Finally, the change is handled in the DB layer:
    http://git.openstack.org/cgit/openstack/nova/tree/nova/db/sqlalchemy/api.py#n2694
    Here the metadata read from the DB is compared with the target metadata passed in:
    i) with update_all, the target metadata contains only new_meta, so key=value pairs that are in old_meta but
    not in new_meta are deleted.
    ii) with update, the target metadata contains new_meta + old_meta, so nothing from old_meta is deleted.
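
To make the workflow concrete, here is a minimal sketch in plain Python. It is not the nova code: the function
names build_target_metadata and reconcile_db, the delete flag name and the sample keys are made up for
illustration, and plain dicts stand in for the instance object and the DB rows.

```python
def build_target_metadata(old_meta, new_meta, delete):
    """Step II (sketch): compute the metadata the server should end up with."""
    if delete:
        # update_all: the metadata passed in replaces everything.
        return dict(new_meta)
    # update: merge the metadata passed in on top of what was read earlier.
    target = dict(old_meta)
    target.update(new_meta)
    return target


def reconcile_db(db_meta, target):
    """Step IV (sketch): make the stored metadata match the target exactly."""
    result = dict(db_meta)
    for key in list(result):
        if key not in target:
            # Keys missing from the target are deleted from the DB.
            del result[key]
    result.update(target)
    return result


old_meta = {"role": "web"}
new_meta = {"env": "prod"}

updated = reconcile_db(old_meta, build_target_metadata(old_meta, new_meta, delete=False))
replaced = reconcile_db(old_meta, build_target_metadata(old_meta, new_meta, delete=True))
print(updated)   # {'role': 'web', 'env': 'prod'}
print(replaced)  # {'env': 'prod'}
```

Note that in both cases the DB step only sees the final target; it cannot tell whether a missing key was
dropped on purpose or was simply unknown to the caller.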

The process described above works well for general use, but with concurrent usage a problem may occur:

For example, suppose we have two concurrent metadata update calls, say meta_A and meta_B, issued at the same time.
Since the target metadata is generated at the API level (steps I and II above), if the two calls come in at the
same time, the instance.metadata read in both calls will be the same (old_meta). Here comes the problem: the target
metadata for call A will be old_meta + A, the target metadata for call B will be old_meta + B, and both then go
through the rest of the process.

When it comes to the DB layer (step IV), we have only one DB, so the calls are handled first come, first served.
Let's say call A has been handled successfully, so the metadata in the DB is now old_meta + A. Then call B is
handled; since its target metadata is old_meta + B, A will be removed.
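
The interleaving can be reproduced with a small deterministic simulation. Again this is only a sketch under
assumptions: merge and reconcile_db are hypothetical helpers, a dict stands in for the DB, and the two calls are
run one after another in the problematic order instead of in real threads.

```python
def merge(old_meta, new_meta):
    """API level (steps I and II): target for an 'update' call."""
    target = dict(old_meta)
    target.update(new_meta)
    return target


def reconcile_db(db_meta, target):
    """DB level (step IV): delete keys missing from the target, then upsert."""
    result = {k: v for k, v in db_meta.items() if k in target}
    result.update(target)
    return result


db = {"role": "web"}  # old_meta as stored in the DB

# Both calls read the server before either one has written anything.
seen_by_a = dict(db)
seen_by_b = dict(db)
target_a = merge(seen_by_a, {"a": "1"})  # old_meta + A
target_b = merge(seen_by_b, {"b": "2"})  # old_meta + B

# The DB handles call A first, then call B.
db = reconcile_db(db, target_a)
after_a = dict(db)
db = reconcile_db(db, target_b)
after_b = dict(db)

print(after_a)  # {'role': 'web', 'a': '1'}
print(after_b)  # {'role': 'web', 'b': '2'}
```

Both calls were plain updates, yet the key added by call A is gone after call B completes.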

** Affects: nova
     Importance: Undecided
     Assignee: Zhenyu Zheng (zhengzhenyu)
         Status: New

** Changed in: nova
     Assignee: (unassigned) => Zhenyu Zheng (zhengzhenyu)

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1650188

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1650188/+subscriptions