
credativ team mailing list archive

[Bug 885682] Re: [Trunk] Windows-specific random deadlock in upgrade_module

 

Hi Thibaut,

Let's confirm this bug: you have provided enough information for us to investigate further, even though we cannot reproduce it yet. The importance should stay Low unless we find an easier way to reproduce it.
Also, it happens only during module installation, not during normal production use of the system.

It's sad to see that Python is not as cross-platform as you'd think. We
should perhaps advise integrators to always deploy the server on Linux.
There is usually no problem with doing that, even in a Windows-based
company, since clients can still connect from Windows.

** Changed in: openobject-server
   Importance: Undecided => Low

** Changed in: openobject-server
       Status: New => Confirmed

** Changed in: openobject-server
     Assignee: (unassigned) => OpenERP's Framework R&D (openerp-dev-framework)

-- 
You received this bug notification because you are a member of OpenERP
Framework Experts, which is subscribed to OpenERP Server.
https://bugs.launchpad.net/bugs/885682

Title:
  [Trunk] Windows-specific random deadlock in upgrade_module

Status in OpenERP Server:
  Confirmed

Bug description:
  Hi.

  I'm using OpenERP Trunk (updated at the beginning of the week) and I
  encounter a deadlock problem with threads. The bug is not easy to
  reproduce, because it's kind of random. The best way I found to
  trigger it is to install/uninstall the same module several times.

  I looked into the code with pdb, and found that the blocking line is
  this one [1]:

  openerp/modules/registry.py:
      with cls.registries_lock:

  I enabled debug output on the lock (passing verbose=True to the RLock
  constructor). Here is a "normal" output:

  netrpc-client-127.0.0.1:51862: <_RLock owner='netrpc-client-127.0.0.1:51862' count=1>.acquire(1): initial success
  netrpc-client-127.0.0.1:51862: <_RLock owner='netrpc-client-127.0.0.1:51862' count=2>.acquire(1): recursive success
  netrpc-client-127.0.0.1:51862: <_RLock owner='netrpc-client-127.0.0.1:51862' count=1>.release(): non-final release
  netrpc-client-127.0.0.1:51862: <_RLock owner=None count=0>.release(): final release
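The four-line trace above (initial acquire, recursive acquire, non-final release, final release) can be mimicked with a small wrapper around `threading.RLock`. This is only a sketch for illustration: `TracedRLock` and its log format are hypothetical, not part of OpenERP or of the standard library, and the `verbose` argument used in the report is specific to older Python versions.

```python
import threading


class TracedRLock(object):
    """Hypothetical wrapper that records each acquire/release of an RLock."""

    def __init__(self):
        self._lock = threading.RLock()
        self._count = 0    # recursion depth; only touched while holding the lock
        self.events = []   # recorded trace lines

    def acquire(self):
        self._lock.acquire()
        self._count += 1
        kind = "initial success" if self._count == 1 else "recursive success"
        self._log("acquire", kind)

    def release(self):
        self._count -= 1
        kind = "final release" if self._count == 0 else "non-final release"
        self._log("release", kind)   # logged while the lock is still held
        self._lock.release()

    def _log(self, op, kind):
        name = threading.current_thread().name
        self.events.append("%s: count=%d %s(): %s" % (name, self._count, op, kind))

    def __enter__(self):
        self.acquire()
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.release()


lock = TracedRLock()
with lock:        # initial acquire
    with lock:    # recursive acquire by the same thread
        pass
for line in lock.events:
    print(line)
```

A balanced trace always ends with a "final release" entry; a trace that stops after the acquires means the owning thread has not left the `with` block yet.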

  But when the deadlock occurs, I only get this output before the
  server stops responding:

  netrpc-client-127.0.0.1:51872: <_RLock owner='netrpc-client-127.0.0.1:51872' count=1>.acquire(1): initial success
  netrpc-client-127.0.0.1:51872: <_RLock owner='netrpc-client-127.0.0.1:51872' count=2>.acquire(1): recursive success

  I don't understand why it doesn't work, because we can see that it
  has been released correctly just before (here, the output is from the
  same instance).
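One possible reading of the truncated trace, offered as an assumption rather than a diagnosis, is that the owning thread is stuck somewhere inside the locked section, so it never logs a release and every other thread reaching the same `with` statement waits. The sketch below reproduces only that waiting behaviour with a plain `threading.RLock`; the names `holder`, `waiter` and `results` are hypothetical, and it says nothing about why the owner would be stuck on Windows.

```python
import threading

registries_lock = threading.RLock()    # stand-in for the lock in the report
holder_ready = threading.Event()
holder_may_finish = threading.Event()
results = {}


def holder():
    # Acquire twice, like the "initial" and "recursive" lines in the trace.
    with registries_lock:
        with registries_lock:
            holder_ready.set()
            holder_may_finish.wait()   # simulate the owner being stuck


def waiter():
    holder_ready.wait()
    # Non-blocking attempt: fails because another thread owns the lock.
    results["while_held"] = registries_lock.acquire(False)
    holder_may_finish.set()
    # Blocking attempt: succeeds once the owner has released both levels.
    results["after_release"] = registries_lock.acquire()
    registries_lock.release()


threads = [threading.Thread(target=holder), threading.Thread(target=waiter)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(results)
```

If the owner never resumed, the blocking `acquire()` in `waiter` would hang indefinitely, which would look like a server that no longer responds.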

  Notes :

  - I couldn't trigger the bug on Linux, where everything seems OK. But
  running the server on Windows makes this happen almost every time. I
  don't really understand why, but I'm not an expert.

  - I was using the GTK Client

  - Using Python 2.6, with all libs installed manually using pip and
  .exe distributions.

  Thanks for working on this !

  [1] http://bazaar.launchpad.net/~openerp/openobject-server/trunk/view/head:/openerp/modules/registry.py#L149

To manage notifications about this bug go to:
https://bugs.launchpad.net/openobject-server/+bug/885682/+subscriptions