registry team mailing list archive

[Bug 401823] Re: Gdk-WARNING **: XID collision, trouble ahead

 

Launchpad has imported 21 comments from the remote bug at
http://bugs.freedesktop.org/show_bug.cgi?id=21583.

If you reply to an imported comment from within Launchpad, your comment
will be sent to the remote bug automatically. Read more about
Launchpad's inter-bugtracker facilities at
https://help.launchpad.net/InterBugTracking.

------------------------------------------------------------------------
On 2009-05-05T16:14:42+00:00 Matthias Clasen wrote:

http://bugzilla.gnome.org/show_bug.cgi?id=581526 describes a scenario
where XCreateWindow appears to reuse an XID while the DestroyNotify for
the previous owner of that XID is still sitting in the event queue. This
causes GDK to get confused, and things go downhill from there.

It seems unreasonable to demand that clients peek the queue for pending
destroy notifies whenever they want to create a window, in particular
since this problem does not occur without the resource-reusing
extension.

Reply at: https://bugs.launchpad.net/firefox/+bug/401823/comments/0
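
A toy illustration of the race described above, not GDK's actual code:
the ToyClient class, its fields, and the XID value 0x2A are all
hypothetical. It only assumes that a toolkit keeps a client-side table
keyed by XID and that a DestroyNotify event carries nothing but the XID.

```python
from collections import deque

DESTROY_NOTIFY = "DestroyNotify"


class ToyClient:
    """Toy stand-in for a toolkit's client-side XID table (hypothetical)."""

    def __init__(self):
        self.windows = {}      # XID -> window label
        self.queue = deque()   # events received but not yet processed
        self.warnings = []

    def register(self, xid, label):
        # A still-registered XID means the old DestroyNotify is unprocessed.
        if xid in self.windows:
            self.warnings.append(f"XID collision on {xid:#x}")
        self.windows[xid] = label

    def process_one(self):
        kind, xid = self.queue.popleft()
        if kind == DESTROY_NOTIFY:
            # The event names only the XID, so the table cannot tell
            # whether it refers to the old window or the new one.
            self.windows.pop(xid, None)


client = ToyClient()
client.register(0x2A, "old window")
client.queue.append((DESTROY_NOTIFY, 0x2A))  # old window destroyed
client.register(0x2A, "new window")          # same XID handed out again
client.process_one()                         # stale event handled late

print(client.warnings)          # one collision warning was recorded
print(0x2A in client.windows)   # the new window's entry is gone
```

In this sketch the stale event removes the entry for the new window,
which is one way "things go downhill" after the warning.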

------------------------------------------------------------------------
On 2009-07-05T16:50:42+00:00 Bugs-freedesktop wrote:

I can't see a reasonable way for either Xlib or the Xserver to guarantee that
XIDs in client's event queues are unique.

The X server has handed off the DestroyNotify event, so it thinks it has
finished with the event.

Xlib could ensure not to allocate an XID referenced in its own event queue
(for known event types), but it wouldn't know what other clients might have a
reference to a candidate XID sitting in their event queues.

If the server were to keep XIDs of destroyed windows allocated until clients
have processed events on that window, it would need to know when the events in
Xlib's queue have been processed.  I can't see how the Xserver can know this
(without some change in protocol).

The other way of looking at this is that the events are a history of what has
happened and need to be interpreted in the context of when they happened.

Reply at: https://bugs.launchpad.net/firefox/+bug/401823/comments/11

------------------------------------------------------------------------
On 2009-09-13T15:28:51+00:00 Davidsboogs wrote:

This bug causes serious problems for some of us.  In my case (after bug
20254 was fixed), I believe this is the cause behind the way most of my
firefox sessions terminate (after sometimes producing the disembodied
windows mentioned in the first comment at
https://bugzilla.gnome.org/show_bug.cgi?id=581526 )

So.  Even if it's not possible to completely prevent an XID from being
reused before it's processed, perhaps it could be made so unlikely that
it won't happen in reasonable circumstances?  I am thinking of the way
process IDs work - each one is higher than the previous one assigned
until it hits an integer limit and wraps back to 0, but any unallocated
XIDs that old would hopefully not still be in queues.

I tried to take a look at the code but quickly came to the conclusion
that this isn't something I personally could just jump into.  So I don't
know if it's a feasible suggestion or not - if not perhaps there could
be some similar workaround to delay a given ID's reuse until it's simply
unlikely to be a problem.

Reply at: https://bugs.launchpad.net/firefox/+bug/401823/comments/31
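
A minimal sketch of the process-ID-style suggestion above, assuming a
fixed-size range of IDs. MonotonicIdAllocator and its size parameter
are hypothetical names for illustration; this is not how Xlib or the X
server allocate XIDs.

```python
class MonotonicIdAllocator:
    """Hands out IDs in increasing order, wrapping around like PIDs do."""

    def __init__(self, size):
        self.size = size        # hypothetical number of IDs in the range
        self.next_id = 0
        self.in_use = set()

    def allocate(self):
        # Scan forward from the last position. A freed ID is not handed
        # out again until the counter has wrapped around to it.
        for _ in range(self.size):
            candidate = self.next_id
            self.next_id = (self.next_id + 1) % self.size
            if candidate not in self.in_use:
                self.in_use.add(candidate)
                return candidate
        raise RuntimeError("ID range exhausted")

    def release(self, xid):
        self.in_use.discard(xid)


alloc = MonotonicIdAllocator(size=4)
first = alloc.allocate()
alloc.release(first)                          # ID 0 is free again...
later = [alloc.allocate() for _ in range(3)]  # ...but 1, 2, 3 come first
wrapped = alloc.allocate()                    # 0 is reused only after wrapping
```

With a realistically large range, the gap between release and reuse
would be correspondingly long, which is the point of the suggestion.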

------------------------------------------------------------------------
On 2009-09-13T16:11:40+00:00 Bugs-freedesktop wrote:

Improving the algorithm providing the XID range so that it provided a
larger range where possible would make this less likely (though it could
still happen less often in reasonable circumstances).

Keeping a buffer of a certain number of recently released XIDs is
another possibility.

Or perhaps calculating the range in advance, so that the range used is a
range of XIDs that were available (but not advertised) at the time of a
previous range request.

Reply at: https://bugs.launchpad.net/firefox/+bug/401823/comments/32
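
A minimal sketch of the "buffer of recently released XIDs" idea,
assuming a simple FIFO quarantine. QuarantineAllocator and
quarantine_size are hypothetical names; whether such a buffer would
live in Xlib or in the server is left open here, as it is in the
comment.

```python
from collections import deque


class QuarantineAllocator:
    """Delays reuse of released IDs by parking them in a FIFO buffer."""

    def __init__(self, ids, quarantine_size):
        self.free = deque(ids)
        self.quarantine = deque()            # most recently released IDs
        self.quarantine_size = quarantine_size

    def allocate(self):
        if not self.free:
            if not self.quarantine:
                raise RuntimeError("no IDs available")
            # Under pressure, fall back to the oldest quarantined ID.
            self.free.append(self.quarantine.popleft())
        return self.free.popleft()

    def release(self, xid):
        self.quarantine.append(xid)
        # Once the buffer exceeds its limit, the oldest ID becomes reusable.
        while len(self.quarantine) > self.quarantine_size:
            self.free.append(self.quarantine.popleft())


alloc = QuarantineAllocator(ids=range(3), quarantine_size=2)
a = alloc.allocate()
alloc.release(a)
b = alloc.allocate()    # not the ID that was just released
alloc.release(b)
c = alloc.allocate()
alloc.release(c)        # buffer overflows, so the oldest ID is freed
d = alloc.allocate()    # the first ID comes back only now
```

As the comment says, this only makes a collision less likely: an ID can
still be reused early when the free list runs dry.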

------------------------------------------------------------------------
On 2009-09-14T17:12:32+00:00 D. Hugh Redelmeier wrote:

Reducing the frequency of the problem would provide relief.  In my
(possibly naive) opinion it is the wrong approach: the design flaw needs
to be fixed.  Perhaps that requires an API change.

Reply at: https://bugs.launchpad.net/firefox/+bug/401823/comments/33

------------------------------------------------------------------------
On 2009-09-22T19:19:43+00:00 Ben Gamari wrote:

This seems to be biting me too, to the order of once every 15 minutes
(closing a firefox tab has by my estimate a 10% chance of crashing the
firefox process). Meanwhile, .xsession-errors is flooded with messages
from GDK warning of XID collisions.

I run most of the Xorg stack from git and interestingly enough, this
behavior started a few weeks ago. I haven't had a chance to try
bisecting yet, but as soon as I get a chance I'll drop a note.

Reply at: https://bugs.launchpad.net/firefox/+bug/401823/comments/34

------------------------------------------------------------------------
On 2009-09-25T14:06:11+00:00 Ben Gamari wrote:

Created an attachment (id=29852)
Firefox backtrace with RenderBadPicture

It seems that Google Maps serves as an excellent reproduction case for
the Firefox crash. Opening Google Maps in a tab and closing it will
almost always result in a RenderBadPicture within 3 attempts. Attached
is a backtrace from doing just that. Is it possible that this backtrace
is caused by aggressive XID reuse?

Reply at: https://bugs.launchpad.net/firefox/+bug/401823/comments/36

------------------------------------------------------------------------
On 2009-09-27T23:25:59+00:00 Bugs-freedesktop wrote:

(From update of attachment 29852)
(In reply to comment #6)
> Is it possible that this backtrace is caused by aggressive XID reuse?

I wouldn't have expected RenderBadPicture from this bug.  If you can get
a stack when running Firefox with --sync, then it would be best to file
a bug at https://bugzilla.mozilla.org/ under Core -> Widget: Gtk

Reply at: https://bugs.launchpad.net/firefox/+bug/401823/comments/37

------------------------------------------------------------------------
On 2009-09-28T07:00:23+00:00 Sandmann wrote:

The RenderBadPicture may be caused by running cairo master. If you are,
try downgrading to 1.8.8.

Reply at: https://bugs.launchpad.net/firefox/+bug/401823/comments/38

------------------------------------------------------------------------
On 2009-09-28T10:21:12+00:00 Ben Gamari wrote:

(In reply to comment #8)
> The RenderBadPicture may be caused by running cairo master. If you are, try
> downgrading to 1.8.8.
> 

Yep, indeed I am running cairo from master. I just reverted and the
usual reproduction cases seem to be stable. This is evidently a known
issue? Has a bug been opened for it? Can I do anything to help? Thanks a
ton for your comment. I've been passively scratching my head over this
for weeks now.

Reply at: https://bugs.launchpad.net/firefox/+bug/401823/comments/39

------------------------------------------------------------------------
On 2009-09-28T12:37:17+00:00 Sandmann wrote:

I don't know if a bug has been filed, but I do know that it has been
talked about on the #cairo IRC channel, and that at least Chris Wilson
is aware of it.

I'm sure they'd appreciate a bisecting, although that's a bit painful to
do because the bug isn't 100% reproducible.

Reply at: https://bugs.launchpad.net/firefox/+bug/401823/comments/40

------------------------------------------------------------------------
On 2009-09-28T12:42:33+00:00 Ben Gamari wrote:

(In reply to comment #10)
> I don't know if a bug has been filed, but I do know that it has been talked
> about on the #cairo IRC channel, and that at least Chris Wilson is aware of it.
> 
Yeah, Chris and I talked briefly on #intel-gfx.

> I'm sure they'd appreciate a bisecting, although that's a bit painful to do
> because the bug isn't 100% reproducible. 
> 
I actually tried but it looks like the bug predates 1.8.8. Arg!

Reply at: https://bugs.launchpad.net/firefox/+bug/401823/comments/41

------------------------------------------------------------------------
On 2009-09-29T09:49:33+00:00 Sandmann wrote:

Note that if you install 1.8.8 on top of a 1.9 installation, you'll
need to delete the existing libcairo.so, or it won't take effect.

Reply at: https://bugs.launchpad.net/firefox/+bug/401823/comments/42

------------------------------------------------------------------------
On 2009-09-29T09:56:06+00:00 Ben Gamari wrote:

(In reply to comment #12)
> Note that if you install 1.8.8 on top of a 1.9 installation, you'll need to
> delete the existing libcairo.so, or it won't take effect.
> 

Yep, restarted my Xorg session in between tests which I thought should
be sufficient. Moreover, I'm fairly certain the newly installed
libraries did take effect after the restart as a scaling bug seen in
firefox in 1.8.8 reared its head again. So anyways, I'm fairly confident
that I did in fact establish that the bug predates 1.8.8, although it
strikes me as odd that it's not seen by more people.

Reply at: https://bugs.launchpad.net/firefox/+bug/401823/comments/43

------------------------------------------------------------------------
On 2009-09-29T10:01:47+00:00 Chris Wilson wrote:

My analysis into this bug indicates that the RenderBadPicture results
from a delayed cairo_surface_destroy() after firefox has called
XDestroyWindow() on the *parent* Window. In this situation firefox
should be calling cairo_surface_finish(), or cairo_surface_destroy() and
disposing of the cairo_surface_t, on the destroyed hierarchy.

So the RenderBadPicture is a separate bug (and not ours! ;-) from the
XID reuse.

Reply at: https://bugs.launchpad.net/firefox/+bug/401823/comments/44

------------------------------------------------------------------------
On 2009-10-04T13:44:05+00:00 Sandmann wrote:

Well, I haven't looked into this bug, but for me, it is definitely the
case that it happens with cairo master and not with 1.8.8.

Reply at: https://bugs.launchpad.net/firefox/+bug/401823/comments/45

------------------------------------------------------------------------
On 2009-10-08T12:25:00+00:00 Chris Wilson wrote:

Created an attachment (id=30180)
xtrace of a typical crash

Note that cairo calls RenderFreePicture (4ebda) immediately upon the
cairo_surface_finish() [which presumably is actually triggered by the
final cairo_surface_destroy() and is not being manually called], but the
drawable was destroyed much earlier (the DestroyNotify arrives at 47608)
and note that the drawable is never explicitly destroyed but is reaped
along with its parent (475f7).

The full trace is available at
http://people.freedesktop.org/~ickle/ff.crash.log

Reply at: https://bugs.launchpad.net/firefox/+bug/401823/comments/47

------------------------------------------------------------------------
On 2009-10-17T06:57:36+00:00 Roberto Jimeno wrote:

I saw a way to reproduce this bug in Firefox at:
https://bugzilla.mozilla.org/show_bug.cgi?id=522635
I can confirm it gets reliably triggered with cairo 1.9.4 but not with cairo 1.8.8

Reply at: https://bugs.launchpad.net/firefox/+bug/401823/comments/48

------------------------------------------------------------------------
On 2009-11-21T22:36:25+00:00 D. Hugh Redelmeier wrote:

Created an attachment (id=31379)
firefox crash and gdb of corpse

Reply at: https://bugs.launchpad.net/firefox/+bug/401823/comments/85

------------------------------------------------------------------------
On 2009-11-21T22:37:45+00:00 D. Hugh Redelmeier wrote:

I still get crashes from Firefox every few days.
Before each crash, I see one or more messages like this:
 (firefox:5290): Gdk-WARNING **: XID collision, trouble ahead

The actual crash is usually a SEGV.  I think that it is a null pointer
dereference but I cannot be sure because GDB is unreliable with
optimized code.  (I have an example where gdb prints 0 for a pointer
variable but when I look at the assembly code I see that that variable
is not represented at that point in the code.)

I don't think my problem has anything to do with cairo because I don't
find RenderBadPicture in any of the tracebacks.  Am I being naive?
Should I look for something else?  I'm using an up-to-date Fedora 11 on
x86-64; cairo-1.8.8-1.fc11.x86_64; no flash plugin.

I think that the Cairo problems are a different bug and should have a
different bugzilla entry.

The original posting in this bugzilla entry describes a bug that I still
think is real.  I imagine that this is the bug that is afflicting me.

I'm attaching a very long typescript of a firefox session that failed
and a gdb of the resulting core file.  Perhaps someone could tell from
this if what I've said in this comment is wrong.

Reply at: https://bugs.launchpad.net/firefox/+bug/401823/comments/86

------------------------------------------------------------------------
On 2009-12-01T14:48:24+00:00 No-tellin wrote:

You may want to view the following video here:
http://www.youtube.com/watch?v=fwIwZazMTgM

I created this video to clearly demonstrate at least one trigger for the XID
Collision message. I believe there are at least two triggers and that both
triggers are adobe flash 10 related.

You can see from the video that you should have re-createable real life test
cases for this problem.

I run a Gentoo installation.

For those familiar with Gentoo, at the end of the video, I run:

emerge -epv mozilla-firefox | less
emerge --info

I have saved the output of these to text files if anyone is interested. Just
contact me.

The reason is that the emerge -epv mozilla-firefox command will display every
package and dependency required for mozilla-firefox. For the record, prior to
creating the video, I actually did re-compile every package in this list
(emerge -e mozilla-firefox) in order to ensure a clean run.

In the video, the left part of the screen is a konsole terminal window. The
right part of the screen is firefox. I start firefox with the command "firefox
-sync" in the terminal window.

I have FF set up to start with a number of tabs. As I change focus from tab to
tab, watch the terminal window. There are two tabs where changing focus causes
XID Collision messages to appear. It is particularly obvious that the error
messages are generated during flash activity. Note especially the generation of
messages as the flash window controls autohide and then re-appear. It's not
clear to me in the second tab (The Daily Show) what kind of flash control is
causing the messages. However, that site never seems to stop loading flash
objects. Or rather, my patience runs out before the flash downloads can
complete.

My reading of other people's problems suggests that x86 (i386) based systems
don't have this problem but please regard this as an unconfirmed data point.

In this thread in the Gentoo forums, I am 'dufeu':
http://forums.gentoo.org/viewtopic-t-788609-highlight-.html

The video is best viewed in HD on a screen 1384x768 or larger (full
screen mode).

Thank you all for your time and patience!

BTW - I did understand the discussion of asynchronous ID assignment and
release. However, while the problem seems to be properly identified, I'm
not sure that the exact trigger for invoking the problem has been
properly identified. I hope the video will be helpful. Unless I (as an
end-user) have completely misunderstood what I see, it seems clear
that the actual trigger is probably flash 10.

Disclaimer: I am only an end user. I am not a programmer.

Reply at: https://bugs.launchpad.net/firefox/+bug/401823/comments/89


** Changed in: xlibs
   Importance: Unknown => Medium

** Bug watch added: Mozilla Bugzilla #522635
   https://bugzilla.mozilla.org/show_bug.cgi?id=522635

-- 
Gdk-WARNING **: XID collision, trouble ahead
https://bugs.launchpad.net/bugs/401823
You received this bug notification because you are a member of Registry
Administrators, which is the registrant for xlibs.