
yahoo-eng-team team mailing list archive

[Bug 1790195] Re: performance problems starting up nova process due to regex code

 

Reviewed:  https://review.openstack.org/599071
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=90b206894a2c442a5c475a3d90fff8f89a9b3ce0
Submitter: Zuul
Branch:    master

commit 90b206894a2c442a5c475a3d90fff8f89a9b3ce0
Author: Sean Mooney <work@xxxxxxxxxxxxxxx>
Date:   Fri Aug 31 21:35:14 2018 +0100

    add caching to _build_regex_range
    
    - _build_regex_range is called 17 times on
      import of nova.api.validation.parameter_types.
      _build_regex_range internally calls re.escape
      and valid_char on every char returned
      from _get_all_chars.
      _get_all_chars yields all chars up to 0xffff.
      As a result, re.escape and valid_char are called
      1.1 million times when
      nova.api.validation.parameter_types is imported.
    
    - This change adds a memorize decorator and uses
      it to cache _build_regex_range
    
    - This change does not cache valid_char,
      _is_printable or re.escape as hashing and
      caching them for each invocation would
      be far more costly both in time and memory
      than computing the result.
    
    Change-Id: Ic1f2c560a6da815b26fdf770450bbe439d18d4f9
    Closes-Bug: #1790195


** Changed in: nova
       Status: In Progress => Fix Released

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1790195

Title:
  performance problems starting up nova process due to regex code

Status in OpenStack Compute (nova):
  Fix Released

Bug description:
  We noticed that nova process startup seems to take a long time.  It
  looks like one major culprit is the regex code at
  https://github.com/openstack/nova/blob/master/nova/api/validation/parameter_types.py

  Sean K Mooney highlighted one possible culprit:

  <sean-k-mooney> i dont really like this https://github.com/openstack/nova/blob/master/nova/api/validation/parameter_types.py#L128-L142
  <sean-k-mooney> def _get_all_chars():
  <sean-k-mooney>     for i in range(0xFFFF):
  <sean-k-mooney>         yield six.unichr(i)
  <sean-k-mooney> so that is got to loop 65535 times
  <sean-k-mooney> *going too
  <sean-k-mooney> and we call the function 17 times
  <sean-k-mooney> so that's 1.1 million calls to re.escape every time we load that module
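
The arithmetic in the chat excerpt above can be checked with a short sketch. It assumes Python 3, so `chr` replaces `six.unichr`, and it takes the figure of 17 calls from the bug report rather than measuring it; the variable names are illustrative only.

```python
def _get_all_chars():
    # Same shape as the generator quoted above; chr() replaces six.unichr.
    for i in range(0xFFFF):
        yield chr(i)


# Number of characters produced by one pass over the generator.
chars_per_call = sum(1 for _ in _get_all_chars())

# Figure reported in the bug, not measured here.
calls_on_import = 17

# Lower bound on per-character helper calls (e.g. re.escape) at import time.
total_char_visits = chars_per_call * calls_on_import

print(chars_per_call, total_char_visits)
```

This confirms the order of magnitude quoted in the report: roughly 1.1 million per-character calls each time the module is imported.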

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1790195/+subscriptions

