registry team mailing list archive

[Bug 376145] Re: gvfs-gdu-volume-monitor crashed with SIGSEGV in gdu_pool_get_presentables()

 

Launchpad has imported 5 comments from the remote bug at
http://bugs.freedesktop.org/show_bug.cgi?id=24254.

If you reply to an imported comment from within Launchpad, your comment
will be sent to the remote bug automatically. Read more about
Launchpad's inter-bugtracker facilities at
https://help.launchpad.net/InterBugTracking.

------------------------------------------------------------------------
On 2009-10-01T06:32:08+00:00 James Westby wrote:

Hi,

We were getting a lot of crashes related to devicekit-disks on startup.

Chris Coulson correctly deduced that this was due to activation, and found
a way to make the problem reproducible.

The problem is this:

  * D-Bus activated service that is slow to start.
  * Call to that service with a timeout that is higher than, but not too far
    removed from, the time it takes to start the service (e.g. a 16s timeout
    for a service that takes 5s to start).
  * The caller gets a timeout error that comes back to them in
    a time that is close to the time that it takes to start the
    service, even though the service is correctly started.

The cause is this:

  * In _dbus_connection_block_pending_call the elapsed time is counted
    from the start of the block.
  * Every time a message is received this elapsed time is subtracted from
    the total timeout.
  * This means that the timeout is hit much quicker than it should be.

What happens is:

  * The service takes 5s to start, so elapsed_milliseconds is ~5000.
  * There are a number of messages received before the response, in this
    case 3 NameOwnerChanged signals.
  * For each of these the timeout is reduced by about 5000 milliseconds.
  * The real response then comes back. The timeout left is compared against
    the elapsed time, and if it is less the timeout error is returned.
  * So, in this case 3x5000 for the NameOwnerChanged signals leaves only ~1000
    of the timeout when the response comes back, so it thinks it has timed out,
    even though it's only about 5s into the 16s timeout.
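
The sequence above can be modelled with a small sketch. This is not the
libdbus source: the function name, its parameters and the idea of passing
arrival times as an array are all assumptions made for illustration. It only
reproduces the arithmetic described in the list.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the buggy accounting (not the real libdbus code).
 * arrival_ms[i] is the time, measured from the start of the block, at which
 * message i is received; the last entry stands for the real reply. */
bool buggy_wait_times_out(int timeout_ms, const int *arrival_ms, int n_messages)
{
    assert(n_messages >= 0);

    int remaining_ms = timeout_ms;

    for (int i = 0; i < n_messages; i++) {
        /* Elapsed time is always counted from the start of the block. */
        int elapsed_ms = arrival_ms[i];

        /* The remaining timeout is compared against the total elapsed time. */
        if (remaining_ms < elapsed_ms)
            return true; /* a timeout error is reported */

        /* Bug: the total elapsed time is subtracted again for every message,
         * so the same seconds are charged against the timeout repeatedly. */
        remaining_ms -= elapsed_ms;
    }
    return false;
}
```

With a 16000 ms timeout and four messages that all arrive around the 5000 ms
mark (three signals, then the reply), the remaining timeout in this model goes
16000, 11000, 6000, 1000, and the reply is then rejected because 1000 < 5000.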

Therefore this bug will be hit when you are activating the service and

   (time to activate service) * (number of messages received in that time) > (timeout)

so it explains the non-linearities. Also, the determinism of the test setup
explains why I was able to pinpoint the threshold timeout so accurately, and
the non-determinism at boot probably explains why some people see it and
some people don't.

The attached patch fixes this by:

  * Not subtracting the elapsed time from the total time, just comparing them.
  * When a timeout is needed for other calls, using the difference so that
    their timeout is the remaining time.
  * Using the already calculated elapsed_milliseconds in the _dbus_verbose at
    end, as we already have the value, and the fact that it wasn't being re-used
    muddied the debugging waters slightly.
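
As a rough illustration of the approach in the first two points, here is a
sketch of the corrected accounting. It is not the attached patch: the function
names and the array-of-arrival-times model are hypothetical, chosen only to
show the total timeout being left untouched and the remaining time being
derived as a difference.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper: the time budget to hand to any nested blocking call.
 * It is derived from the untouched total timeout, never stored back. */
int remaining_budget_ms(int timeout_ms, int elapsed_ms)
{
    return elapsed_ms >= timeout_ms ? 0 : timeout_ms - elapsed_ms;
}

/* Hypothetical model of the corrected accounting (not the real libdbus code).
 * arrival_ms[i] is the time, measured from the start of the block, at which
 * message i is received; the last entry stands for the real reply. */
bool fixed_wait_times_out(int timeout_ms, const int *arrival_ms, int n_messages)
{
    assert(n_messages >= 0);

    for (int i = 0; i < n_messages; i++) {
        /* Elapsed time is still counted from the start of the block. */
        int elapsed_ms = arrival_ms[i];

        /* Only compare against the total timeout; do not subtract from it. */
        if (elapsed_ms >= timeout_ms)
            return true; /* a genuine timeout */
    }
    return false;
}
```

In this model the 16000 ms call with four messages arriving at about 5000 ms
no longer times out, and a nested call made at that point would be given
remaining_budget_ms(16000, 5000), i.e. 11000 ms.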

Please consider applying this patch. We will probably ship it in the
Ubuntu package after review as so many people are hitting this bug.

Thanks,

James

Reply at: https://bugs.launchpad.net/dbus/+bug/376145/comments/76

------------------------------------------------------------------------
On 2009-10-01T06:33:48+00:00 James Westby wrote:

Created an attachment (id=29976)
Patch to fix timeout accounting

Reply at: https://bugs.launchpad.net/dbus/+bug/376145/comments/77

------------------------------------------------------------------------
On 2009-10-01T06:39:07+00:00 Thiago Macieira wrote:

Patch looks good.

Can you make that a Git commit and attach it (git format-patch -n1) ?

Reply at: https://bugs.launchpad.net/dbus/+bug/376145/comments/78

------------------------------------------------------------------------
On 2009-10-01T07:04:33+00:00 Scott James Remnant wrote:

I think I got nearest to that code last, and it looks right to me.

Reply at: https://bugs.launchpad.net/dbus/+bug/376145/comments/79

------------------------------------------------------------------------
On 2009-10-01T07:12:59+00:00 Scott James Remnant wrote:

I've taken care of the heavy lifting for James:

http://cgit.freedesktop.org/dbus/dbus/commit/?id=03cc20707a3e7b2d8629e84d7a766f41edb8b444

Reply at: https://bugs.launchpad.net/dbus/+bug/376145/comments/80


** Changed in: dbus
   Importance: Unknown => High

-- 
gvfs-gdu-volume-monitor crashed with SIGSEGV in gdu_pool_get_presentables()
https://bugs.launchpad.net/bugs/376145
You received this bug notification because you are a member of Registry
Administrators, which is the registrant for D-Bus.