
yahoo-eng-team team mailing list archive

[Bug 1528834] [NEW] HostManager host_state_map should have synchronized access

 

Public bug reported:

Reporting this as a bug to get some discussion rolling on how best to
fix it (or whether to fix it at all). I assume this is a well-known
problem, but since the code isn't covered with NOTEs about it, I thought
it best to raise it here.

In nova's master branch as of Dec 2015, the HostState objects and
host_state_map are not robust in the face of concurrent access. That
concurrent access rarely causes failures under test is a product of the
test setups and luck, not of the correctness of the code. Unless I'm
completely misunderstanding the way eventlet is being used, each of the
following methods has the potential to yield on I/O (rpc calls, log
messages) or sleeps (not used in the code, but handy when debugging).
This is not an exhaustive list, just what came up in my digging through
nova/scheduler/host_manager.py (described in more detail below):

* HostManager.get_all_host_states
* HostState.consume_from_request
* HostState.update_from_compute_node

If these yield at exactly the wrong time, some of an ongoing set of
select_destinations calls will operate on incorrect host information.
As we know, this is often true anyway, but the suggested fix (below) is
pretty lightweight, so perhaps a small improvement is worth it?
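The lost update described above can be forced deterministically. This is a toy sketch, not nova's actual code: OS threads and events stand in for eventlet greenthreads and their implicit yield points, and HostState here is a minimal stand-in for nova's class.

```python
import threading

class HostState:
    """Toy stand-in for nova's HostState; free_ram_mb is shared state."""
    def __init__(self, free_ram_mb):
        self.free_ram_mb = free_ram_mb

host = HostState(free_ram_mb=2048)
read_done = threading.Event()
resume = threading.Event()

def consume_from_request(ram_mb):
    # Read-modify-write: a greenthread switch between the read and the
    # write lets a concurrent deduction be silently overwritten.
    current = host.free_ram_mb           # read
    read_done.set()
    resume.wait()                        # stand-in for an eventlet yield
    host.free_ram_mb = current - ram_mb  # write back a stale value

t = threading.Thread(target=consume_from_request, args=(1024,))
t.start()
read_done.wait()
# While the first request is "yielded", a second request consumes 512 MB.
host.free_ram_mb -= 512
resume.set()
t.join()
print(host.free_ram_mb)  # 1024, but correct accounting would give 512
```

The 512 MB deduction is lost because the first consumer writes back a value it read before the second consumer ran; with enough interleavings of this shape the counter can go negative.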

How I figured this out:

I was reading the code and realized that access to HostState is not
synchronized, so I tried to see if I could get incorrect information to
show up at the scheduler when running just one scheduler and one compute
node while doing multiple "concurrent" server creates (calling three
creates over the API in a backgrounding loop). It was pretty easy to
produce incorrect RAM usage. To make things fail more reliably I
introduced small sleeps into consume_from_request to force it to yield,
and managed to get wildly incorrect usage information (negative free
RAM).

Adding a utils.synchronized decorator to consume_from_request improved
the situation somewhat: at the end of the scheduling run the RAM usage
was off by one deduction. Adding synchronized to get_all_host_states as
well cleared this up.
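A minimal sketch of what that fix looks like. The real nova decorator is utils.synchronized (built on oslo's lockutils, taking a lock name); this stand-in uses a single module-level lock and OS threads in place of greenthreads, purely to show the shape of the change:

```python
import functools
import threading

_lock = threading.Lock()

def synchronized(fn):
    """Toy stand-in for nova's utils.synchronized decorator."""
    @functools.wraps(fn)
    def inner(*args, **kwargs):
        with _lock:
            return fn(*args, **kwargs)
    return inner

class HostState:
    def __init__(self, free_ram_mb):
        self.free_ram_mb = free_ram_mb

    @synchronized
    def consume_from_request(self, ram_mb):
        # The read-modify-write is now atomic with respect to other
        # synchronized callers, even if it yields internally.
        self.free_ram_mb = self.free_ram_mb - ram_mb

host = HostState(free_ram_mb=4096)
threads = [threading.Thread(target=host.consume_from_request, args=(256,))
           for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(host.free_ram_mb)  # 4096 - 8*256 = 2048, no deductions lost
```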

I'm not clear on the implications of these changes: they make sense to
me (we don't want concurrent access to a shared data structure), but I
wonder if there are other considerations I'm not aware of, such as
"well, actually, this stuff was supposed to be written so it never
yields; that it can or does is the bug" (in which case the methods need
some phat warnings on them for future maintainers).

I'm also not clear on which semaphores ought to be used (if
synchronization is the way to go). In my POC solution,
get_all_host_states had its own semaphore while the other two methods
shared one.
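That semaphore arrangement might be sketched like this. All names here are hypothetical stand-ins (the real decorator is nova's utils.synchronized, and the real classes live in nova/scheduler/host_manager.py); the point is only the two-semaphore split:

```python
import functools
import threading

# Two semaphores, as in the POC: get_all_host_states gets its own,
# while the two HostState mutators share the other.
_SEMAPHORES = {
    'host_state_map': threading.Semaphore(1),
    'host_state': threading.Semaphore(1),
}

def synchronized(name):
    """Toy stand-in for nova's utils.synchronized decorator."""
    def decorator(fn):
        @functools.wraps(fn)
        def inner(*args, **kwargs):
            with _SEMAPHORES[name]:
                return fn(*args, **kwargs)
        return inner
    return decorator

class HostState:
    def __init__(self, free_ram_mb=2048):
        self.free_ram_mb = free_ram_mb

    @synchronized('host_state')
    def consume_from_request(self, ram_mb):
        self.free_ram_mb -= ram_mb

    @synchronized('host_state')
    def update_from_compute_node(self, free_ram_mb):
        self.free_ram_mb = free_ram_mb

class HostManager:
    def __init__(self):
        self.host_state_map = {'node1': HostState()}

    @synchronized('host_state_map')
    def get_all_host_states(self):
        return list(self.host_state_map.values())
```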

Beyond that, though, it might make sense for the host_state_map and the
entire HostState object to be made "thread" safe, to cover all the
places where this could be a problem rather than dealing with methods
piecemeal as curious people notice them.

** Affects: nova
     Importance: Undecided
         Status: New


** Tags: scheduler

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1528834


To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1528834/+subscriptions

