kernel-packages team mailing list archive

[Bug 1140716] Re: [regression] 3.5.0-26-generic and 3.2.0-39-generic GPU hangs on Sandybridge

 

Launchpad has imported 100 comments from the remote bug at
https://bugs.freedesktop.org/show_bug.cgi?id=54226.

If you reply to an imported comment from within Launchpad, your comment
will be sent to the remote bug automatically. Read more about
Launchpad's inter-bugtracker facilities at
https://help.launchpad.net/InterBugTracking.

------------------------------------------------------------------------
On 2012-08-29T19:19:16+00:00 NT Man wrote:

Created attachment 66289
dmesg output

From time to time the interface freezes, and these records appear in dmesg:
[drm:i915_hangcheck_ring_idle] *ERROR* Hangcheck timer elapsed...
blitter ring idle

$ lspci
00:00.0 Host bridge: Intel Corporation 2nd Generation Core Processor Family DRAM Controller (rev 09)
00:02.0 VGA compatible controller: Intel Corporation 2nd Generation Core Processor Family Integrated Graphics Controller (rev 09)
00:16.0 Communication controller: Intel Corporation 6 Series/C200 Series Chipset Family MEI Controller #1 (rev 04)
00:1a.0 USB Controller: Intel Corporation 6 Series/C200 Series Chipset Family USB Enhanced Host Controller #2 (rev 05)
00:1c.0 PCI bridge: Intel Corporation 6 Series/C200 Series Chipset Family PCI Express Root Port 1 (rev b5)
00:1c.1 PCI bridge: Intel Corporation 82801 PCI Bridge (rev b5)
00:1c.2 PCI bridge: Intel Corporation 6 Series/C200 Series Chipset Family PCI Express Root Port 3 (rev b5)
00:1c.3 PCI bridge: Intel Corporation 6 Series/C200 Series Chipset Family PCI Express Root Port 4 (rev b5)
00:1c.4 PCI bridge: Intel Corporation 6 Series/C200 Series Chipset Family PCI Express Root Port 5 (rev b5)
00:1d.0 USB Controller: Intel Corporation 6 Series/C200 Series Chipset Family USB Enhanced Host Controller #1 (rev 05)
00:1f.0 ISA bridge: Intel Corporation H61 Express Chipset Family LPC Controller (rev 05)
00:1f.2 SATA controller: Intel Corporation 6 Series/C200 Series Chipset Family 6 port SATA AHCI Controller (rev 05)
00:1f.3 SMBus: Intel Corporation 6 Series/C200 Series Chipset Family SMBus Controller (rev 05)
02:00.0 PCI bridge: ASMedia Technology Inc. Device 1080 (rev 01)
03:01.0 Multimedia audio controller: VIA Technologies Inc. VT1720/24 [Envy24PT/HT] PCI Multi-Channel Audio Controller (rev 01)
04:00.0 Ethernet controller: Atheros Communications AR8151 v2.0 Gigabit Ethernet (rev c0)
05:00.0 USB Controller: ASMedia Technology Inc. ASM1042 SuperSpeed USB Host Controller
06:00.0 SATA controller: ASMedia Technology Inc. Device 0612 (rev 01)

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1140716/comments/0

------------------------------------------------------------------------
On 2012-10-21T18:10:55+00:00 Chris Wilson wrote:

If you can easily reproduce this error, can you please build a kernel
using http://cgit.freedesktop.org/~ickle/linux-2.6/log/?h=xv-overlay
which has some revised memory barriers.

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1140716/comments/1

------------------------------------------------------------------------
On 2012-10-27T08:06:45+00:00 NT Man wrote:

Can you help me build an RPM for Fedora?

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1140716/comments/2

------------------------------------------------------------------------
On 2012-11-22T09:53:50+00:00 Chris Wilson wrote:

On second thoughts, I think this should be fixed by the slight
robustification in more recent hangcheck.

Please try the latest kernel for your distribution (should be 3.6.7 atm)
and reopen if it still occurs.

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1140716/comments/3

------------------------------------------------------------------------
On 2012-11-24T13:13:12+00:00 NT Man wrote:

I am using Fedora 18 with the 3.6.7-5.fc18.i686 kernel, and the dmesg output still contains this message:
[22826.654365] [drm:i915_hangcheck_hung] *ERROR* Hangcheck timer elapsed... GPU hung
[22826.654369] [drm] capturing error event; look for more information in /debug/dri/0/i915_error_state

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1140716/comments/4

------------------------------------------------------------------------
On 2012-11-24T14:39:24+00:00 Chris Wilson wrote:

That is not the same bug, so you need to attach a fresh set of debug
info (please remember the i915_error_state)...

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1140716/comments/5

------------------------------------------------------------------------
On 2012-11-24T14:42:03+00:00 NT Man wrote:

Please explain how to get the needed debug info. Thanks.

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1140716/comments/6

------------------------------------------------------------------------
On 2012-11-24T14:50:07+00:00 Chris Wilson wrote:

http://intellinuxgraphics.org/how_to_report_bug.html

From which we need the i915_error_state, so

$ sudo mount -tdebugfs debug /sys/kernel/debug
$ sudo cat /sys/kernel/debug/dri/0/i915_error_state > i915_error_state

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1140716/comments/7
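
Since the error state gets captured repeatedly in this thread, the two commands above can be wrapped in a small helper. This is only a sketch: the function name capture_error_state and the timestamped output name are hypothetical, and the default path is simply the one quoted above. It assumes debugfs is already mounted and that the function is run with enough privileges to read the file.

```shell
#!/bin/sh
# Hypothetical helper (not part of any i915 tooling): copy the error state to
# a file and print the name of the copy.
#   $1  source file, defaults to the debugfs path quoted in this thread
#   $2  destination, defaults to a timestamped name in the current directory
capture_error_state() {
    src="${1:-/sys/kernel/debug/dri/0/i915_error_state}"
    dest="${2:-i915_error_state-$(date +%Y%m%d-%H%M%S)}"
    if [ ! -r "$src" ]; then
        echo "cannot read $src (is debugfs mounted, and are you root?)" >&2
        return 1
    fi
    # Same approach as the manual command: read the file and redirect it.
    cat "$src" > "$dest" || return 1
    echo "$dest"
}
```

Called with no arguments from a root shell, it would write a new timestamped copy each time, which avoids overwriting an earlier capture.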

------------------------------------------------------------------------
On 2012-11-24T14:57:07+00:00 NT Man wrote:

Created attachment 70518
i915_error_state

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1140716/comments/8

------------------------------------------------------------------------
On 2012-11-24T15:09:56+00:00 Chris Wilson wrote:

Looks like that corresponds to the bug that

commit 1c8b46fc8c865189f562c9ab163d63863759712f
Author: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>
Date:   Wed Nov 14 09:15:14 2012 +0000

    drm/i915: Use LRI to update the semaphore registers
    
    The bspec was recently updated to remove the ability to update the
    semaphore using the MI_SEMAPHORE_MBOX command, the ability to wait upon
    the semaphore value remained. Instead the advice is to update the
    register using the MI_LOAD_REGISTER_IMM command. In cursory testing,
    semaphores continue to function - the question is whether this fixes
    some of the deadlocks where the semaphore registers contained stale
    values?
    
hopefully addresses.

That patch is only available on drm-intel-next at the moment, which is
available either at http://cgit.freedesktop.org/~danvet/drm-intel or
available as drm-intel-experimental in the ubuntu kernel-ppa.

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1140716/comments/9

------------------------------------------------------------------------
On 2012-12-08T11:37:21+00:00 NT Man wrote:

The problem recurred with the patched kernel.

[118637.439016] [drm:i915_hangcheck_hung] *ERROR* Hangcheck timer elapsed... GPU hung
[118637.439020] [drm] capturing error event; look for more information in /debug/dri/0/i915_error_state
[mikhail@localhost ~]$ uname -a
Linux localhost.localdomain 3.6.9-4.1.fc18.i686.PAE #1 SMP Wed Dec 5 15:16:33 UTC 2012 i686 i686 i386 GNU/Linux
[mikhail@localhost ~]$ sudo cat /sys/kernel/debug/dri/0/i915_error_state > i915_error_state
[sudo] password for mikhail: 
[mikhail@localhost ~]$

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1140716/comments/10

------------------------------------------------------------------------
On 2012-12-08T11:38:52+00:00 NT Man wrote:

Created attachment 71192
i915_error_state (new)

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1140716/comments/11

------------------------------------------------------------------------
On 2012-12-08T13:36:15+00:00 NT Man wrote:

sudo cat /sys/kernel/debug/dri/0/i915_error_state > i915_error_state-8
cat: /sys/kernel/debug/dri/0/i915_error_state: Cannot allocate memory


What does this mean?

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1140716/comments/12

------------------------------------------------------------------------
On 2012-12-08T13:37:34+00:00 NT Man wrote:

Created attachment 71199
i915_error_state (new)

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1140716/comments/13

------------------------------------------------------------------------
On 2012-12-08T14:07:49+00:00 NT Man wrote:

Created attachment 71200
dmesg output (new)

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1140716/comments/14

------------------------------------------------------------------------
On 2012-12-08T17:22:52+00:00 Chris Wilson wrote:

Lalalalala.

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1140716/comments/15

------------------------------------------------------------------------
On 2012-12-09T21:25:18+00:00 Chris Wilson wrote:

*** Bug 58057 has been marked as a duplicate of this bug. ***

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1140716/comments/16

------------------------------------------------------------------------
On 2012-12-12T21:34:41+00:00 Chris Wilson wrote:

*** Bug 58212 has been marked as a duplicate of this bug. ***

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1140716/comments/17

------------------------------------------------------------------------
On 2012-12-13T08:30:44+00:00 Chris Wilson wrote:

We can confirm the synopsis by disabling semaphores (i915.semaphores=0),
but can we also test whether this is an rc6 side-effect
(i915.i915_enable_rc6=0)?

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1140716/comments/18
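
When testing options like these, it helps to confirm that the option really reached the kernel before drawing conclusions from a run. A minimal sketch, assuming the options were passed on the boot command line: the function name cmdline_has is hypothetical, and the optional second argument exists only so the check can be tried against a sample file instead of /proc/cmdline.

```shell
#!/bin/sh
# Hypothetical helper: succeed if "name=value" appears as a whole word on a
# kernel command line.
#   $1  the option to look for, e.g. i915.semaphores=0
#   $2  file to inspect, defaults to the running kernel's /proc/cmdline
cmdline_has() {
    # Put each space-separated word on its own line, then require an exact,
    # fixed-string, whole-line match.
    tr ' ' '\n' < "${2:-/proc/cmdline}" | grep -qxF -- "$1"
}
```

For example, `cmdline_has i915.semaphores=0 && echo "semaphores disabled at boot"` prints the message only when that exact option was given.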

------------------------------------------------------------------------
On 2012-12-13T08:35:07+00:00 Chris Wilson wrote:

Also maybe time for 'git revert 4e0e90dcb8a7df1229c69e30abebb59b0b3c2a1f'

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1140716/comments/19

------------------------------------------------------------------------
On 2012-12-15T14:20:12+00:00 NT Man wrote:

Created attachment 71549
i915_error_state

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1140716/comments/20

------------------------------------------------------------------------
On 2012-12-15T14:21:59+00:00 NT Man wrote:

Created attachment 71550
dmesg

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1140716/comments/21

------------------------------------------------------------------------
On 2012-12-17T07:24:13+00:00 NT Man wrote:

Created attachment 71629
i915_error_state

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1140716/comments/22

------------------------------------------------------------------------
On 2012-12-17T07:24:33+00:00 NT Man wrote:

Created attachment 71630
dmesg

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1140716/comments/23

------------------------------------------------------------------------
On 2012-12-30T10:28:48+00:00 Chris Wilson wrote:

Mikhail, for the time being you can set i915.semaphores=0 (or echo 0 >
/sys/module/i915/parameters/semaphores) to prevent this hang.

The only interesting patch I can suggest atm is

commit 31643d54a739382626c27c0f2a12b3bbc22d1a38
Author: Ben Widawsky <ben@xxxxxxxxxxxx>
Date:   Wed Sep 26 10:34:01 2012 -0700

    drm/i915: Workaround to bump rc6 voltage to 450
    
    BIOS should be setting the minimum voltage for rc6 to be 450mV. Old or
    buggy BIOSen may not be doing this, so we correct it for them. Ideally
    customers should update the BIOS as only it would know the optimal
    values for the platform, so we leave that fact as a DRM_ERROR for the
    user to see.

in 3.8-rc1 or look for a BIOS update.

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1140716/comments/24
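
To check that the workaround is in effect, the parameter can be read back from sysfs. The sketch below assumes the standard /sys/module/i915/parameters directory; the function name i915_param is hypothetical, and the second argument is there only so the function can be tried against a scratch directory.

```shell
#!/bin/sh
# Hypothetical helper: print the current value of an i915 module parameter.
#   $1  parameter name, e.g. semaphores
#   $2  parameter directory, defaults to the standard sysfs location
i915_param() {
    name="$1"
    dir="${2:-/sys/module/i915/parameters}"
    if [ -r "$dir/$name" ]; then
        cat "$dir/$name"
    else
        echo "parameter $name not found under $dir" >&2
        return 1
    fi
}

# Note: when setting the value through sudo, the redirect in
# "sudo echo 0 > file" is performed by the unprivileged shell, so something
# like "echo 0 | sudo tee file" is needed instead.
```

Running `i915_param semaphores` after applying the workaround should print 0 if the setting took effect.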

------------------------------------------------------------------------
On 2013-01-03T16:00:37+00:00 Chris Wilson wrote:

*** Bug 58986 has been marked as a duplicate of this bug. ***

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1140716/comments/25

------------------------------------------------------------------------
On 2013-01-10T01:10:49+00:00 Chris Wilson wrote:

Created attachment 72766
Read back semaphore mboxes after update

Can you please try this patch, enable semaphores and see if the bug
persists?

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1140716/comments/26

------------------------------------------------------------------------
On 2013-01-10T01:42:16+00:00 NT Man wrote:

(In reply to comment #24)
> Mikhail, for the time being you can set i915.semaphores=0 (or echo 0 >
> /sys/module/i915/parameters/semaphores) to prevent this hang.

What are the consequences?

> The only interesting patch I can suggest atm is
> 
> commit 31643d54a739382626c27c0f2a12b3bbc22d1a38
> Author: Ben Widawsky <ben@xxxxxxxxxxxx>
> Date:   Wed Sep 26 10:34:01 2012 -0700
> 
>     drm/i915: Workaround to bump rc6 voltage to 450
>     
>     BIOS should be setting the minimum voltage for rc6 to be 450mV. Old or
>     buggy BIOSen may not be doing this, so we correct it for them. Ideally
>     customers should update the BIOS as only it would know the optimal
>     values for the platform, so we leave that fact as a DRM_ERROR for the
>     user to see.
> 
> in 3.8-rc1 or look for a BIOS update.

I have an H61M/U3S3 motherboard with the latest BIOS, ver 2.20 from 8/15/2012:
ftp://174.142.97.10/bios/1155/H61MU3S3(2.20)ROM.zip
How can I check whether the problem persists?

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1140716/comments/27

------------------------------------------------------------------------
On 2013-01-10T02:30:37+00:00 Chris Wilson wrote:

(In reply to comment #27)
> (In reply to comment #24)
> > Mikhail, for the time being you can set i915.semaphores=0 (or echo 0 >
> > /sys/module/i915/parameters/semaphores) to prevent this hang.
> 
> What are the consequences?

Rendering throughput is dropped by 10% with SNA, or as much as 3x with
UXA. OpenGL performance is likely to be reduced by about 30%. More CPU
time is spent waiting for the GPU with rc6 disabled, so increased power
consumption.

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1140716/comments/28

------------------------------------------------------------------------
On 2013-01-20T20:07:03+00:00 Ben-chgtaa3qpp0 wrote:

(In reply to comment #27)

> > The only interesting patch I can suggest atm is
> > 
> > commit 31643d54a739382626c27c0f2a12b3bbc22d1a38
> > Author: Ben Widawsky <ben@xxxxxxxxxxxx>
> > Date:   Wed Sep 26 10:34:01 2012 -0700
> > 
> >     drm/i915: Workaround to bump rc6 voltage to 450
> >     
> >     BIOS should be setting the minimum voltage for rc6 to be 450mV. Old or
> >     buggy BIOSen may not be doing this, so we correct it for them. Ideally
> >     customers should update the BIOS as only it would know the optimal
> >     values for the platform, so we leave that fact as a DRM_ERROR for the
> >     user to see.
> > 
> > in 3.8-rc1 or look for a BIOS update.
> 
> I have an H61M/U3S3 motherboard with the latest BIOS, ver 2.20 from 8/15/2012:
> ftp://174.142.97.10/bios/1155/H61MU3S3(2.20)ROM.zip
> How can I check whether the problem persists?

The easiest way is to apply the patch and look for DRM_DEBUG_DRIVER
messages. This is unlikely to fix the problem, but also can't hurt.

We've only assumed new BIOS will fix the problem, but who knows.
Especially if it's a 3rd party BIOS.

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1140716/comments/29

------------------------------------------------------------------------
On 2013-01-24T10:33:59+00:00 Chris Wilson wrote:

*** Bug 59786 has been marked as a duplicate of this bug. ***

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1140716/comments/30

------------------------------------------------------------------------
On 2013-01-24T11:07:56+00:00 Daniel-ffwll wrote:

Created attachment 73560
write mbox regs twice on snb

Another piece of magic which might help. Please test this patch and the
one from Chris ("Read back semaphore mboxes after update") separately
and report back whether anything changes.

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1140716/comments/31

------------------------------------------------------------------------
On 2013-01-24T13:21:57+00:00 Daniel-ffwll wrote:

Created attachment 73577
write mbox regs twice on snb, v2

Now actually the right patch attached, the old one didn't compile ...

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1140716/comments/32

------------------------------------------------------------------------
On 2013-01-30T21:01:58+00:00 NT Man wrote:

Which patch do I need to apply to fix this issue?

I see that the patches from comments 26 and 32 have similar logic:

@@ -596,6 +606,16 @@ gen6_add_request(struct intel_ring_buffer *ring)
 	intel_ring_emit(ring, MI_USER_INTERRUPT);
 	intel_ring_advance(ring);
 
+	if (IS_GEN6(ring->dev)) {
+		ret = intel_ring_begin(ring, 6);
+		if (ret)
+			return ret;
+
+		read_mboxes(ring, mbox1_reg, 1024);
+		read_mboxes(ring, mbox2_reg, 1028);
+		intel_ring_advance(ring);
+	}
+
 	return 0;
 }

@@ -598,6 +598,19 @@ gen6_add_request(struct intel_ring_buffer *ring)
 	intel_ring_emit(ring, MI_USER_INTERRUPT);
 	intel_ring_advance(ring);
 
+	if (IS_GEN6(ring->dev)) {
+		ret = intel_ring_begin(ring, 6);
+		if (ret)
+			return ret;
+
+		mbox1_reg = ring->signal_mbox[0];
+		mbox2_reg = ring->signal_mbox[1];
+
+		update_mboxes(ring, mbox1_reg);
+		update_mboxes(ring, mbox2_reg);
+		intel_ring_advance(ring);
+	}
+
 	return 0;
 }

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1140716/comments/33

------------------------------------------------------------------------
On 2013-01-30T21:37:10+00:00 Daniel-ffwll wrote:

> --- Comment #33 from mikhail.v.gavrilov@xxxxxxxxx ---
> Which patch do I need to apply to fix this issue?

We can't reproduce the bug, so those are just patches to test
different ideas. Please test each of them individually (i.e. remove
the first before testing the 2nd patch) and then report whether
anything changes (i.e. harder or easier for you to hit the issue).

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1140716/comments/34

------------------------------------------------------------------------
On 2013-02-02T14:25:05+00:00 NT Man wrote:

I can't compile the kernel with the patch above:

drivers/gpu/drm/i915/intel_ringbuffer.c: In function 'gen6_add_request':
drivers/gpu/drm/i915/intel_ringbuffer.c:611:3: error: too few arguments to function 'update_mboxes'
drivers/gpu/drm/i915/intel_ringbuffer.c:557:1: note: declared here
drivers/gpu/drm/i915/intel_ringbuffer.c:612:3: error: too few arguments to function 'update_mboxes'
drivers/gpu/drm/i915/intel_ringbuffer.c:557:1: note: declared here
make[4]: *** [drivers/gpu/drm/i915/intel_ringbuffer.o] Error 1
make[3]: *** [drivers/gpu/drm/i915] Error 2
make[2]: *** [drivers/gpu/drm] Error 2
make[1]: *** [drivers/gpu] Error 2
make[1]: *** Waiting for unfinished jobs....
make: *** [drivers] Error 2
make: *** Waiting for unfinished jobs....

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1140716/comments/35

------------------------------------------------------------------------
On 2013-02-02T14:25:54+00:00 NT Man wrote:

Created attachment 74087
kernel.spec

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1140716/comments/36

------------------------------------------------------------------------
On 2013-02-10T18:49:10+00:00 NT Man wrote:

Created attachment 74561
i915_error_state

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1140716/comments/37

------------------------------------------------------------------------
On 2013-02-10T20:26:14+00:00 NT Man wrote:

Created attachment 74566
i915_error_state (kernel 3.8 Ubuntu)

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1140716/comments/38

------------------------------------------------------------------------
On 2013-02-13T19:28:34+00:00 NT Man wrote:

Created attachment 74779
i915_error_state (kernel 3.7 Fedora)

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1140716/comments/39

------------------------------------------------------------------------
On 2013-02-13T20:05:33+00:00 NT Man wrote:

Created attachment 74781
i915_error_state (kernel 3.7 Fedora)

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1140716/comments/40

------------------------------------------------------------------------
On 2013-02-15T03:22:32+00:00 NT Man wrote:

Created attachment 74850
i915_error_state (kernel 3.7 Fedora)

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1140716/comments/41

------------------------------------------------------------------------
On 2013-02-20T05:00:01+00:00 Norman Yarvin wrote:

I'm seeing this bug, or something like it, on an older chip (G965,
desktop version):

Feb 19 22:05:56 muttonhead kernel: [drm:i915_hangcheck_hung] *ERROR* Hangcheck timer elapsed... GPU hung
Feb 19 22:05:56 muttonhead kernel: [drm] capturing error event; look for more information in /debug/dri/0/i915_error_state
Feb 19 22:05:56 muttonhead kernel: [drm:kick_ring] *ERROR* Kicking stuck wait on render ring
Feb 19 22:05:57 muttonhead kernel: [drm:i915_reset] *ERROR* Failed to reset chip.

after which the mouse pointer sticks in one spot (with most other things
working), and then when I shut down X, the console fails to appear,
requiring a reboot.  Not knowing that the given file path was under
/sys/kernel, I failed to capture the error state, but will do so next
time this happens (which is maybe every other day).  This is with a 3.7
kernel (Gentoo); before 3.7, the driver was stable.  I don't know what
the 'generation' numbers in the driver mean, but I'm guessing that
generation 6 is later, so many of the suggested fixes would not make any
difference on this machine.

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1140716/comments/42

------------------------------------------------------------------------
On 2013-02-20T09:13:34+00:00 Chris Wilson wrote:

(In reply to comment #42)
> I'm seeing this bug, or something like it, on an older chip (G965, desktop
> version):

Good news, it is not this bug. Please make sure you have the latest
stable driver (a gentoo user not using 3.8 already! ;-) and latest
xf86-video-intel, then file a fresh bug report, attaching your dmesg,
Xorg.0.log and i915_error_state.

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1140716/comments/43

------------------------------------------------------------------------
On 2013-02-20T14:04:23+00:00 gneman wrote:

I subscribed to this bug because I was seeing this hang too. It happened
randomly several times, without a specific cause or way to reproduce it.

This was around December, and it happened maybe 4-5 times along a month.
The GPU would hang with that error in dmesg, and everything continued to
work, though very slowly.

However, I must say that it hasn't happened again since then, for almost 2
months maybe. I use Arch Linux, which means I always update to the
latest stable packages of everything, so it seems that for me it got
solved at some point (or at least much harder to reproduce).

This is an Ironlake / HD 2000 based Dell laptop. I did update the BIOS
when I found this bug report, but it didn't solve the problem, the hang
happened after updating it.

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1140716/comments/44

------------------------------------------------------------------------
On 2013-02-22T22:14:55+00:00 Chris Wilson wrote:

*** Bug 61310 has been marked as a duplicate of this bug. ***

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1140716/comments/45

------------------------------------------------------------------------
On 2013-03-03T07:05:37+00:00 NT Man wrote:

Created attachment 75818
i915_error_state (kernel 3.8.1 Fedora)

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1140716/comments/51

------------------------------------------------------------------------
On 2013-03-03T07:07:04+00:00 NT Man wrote:

Today Fedora 18 updated the kernel to 3.8.1 and the message
"[drm:i915_hangcheck_hung] *ERROR* Hangcheck timer elapsed... GPU hung"
is still here. Please look at my last log. Any updates?

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1140716/comments/52

------------------------------------------------------------------------
On 2013-03-06T02:29:32+00:00 Ben-chgtaa3qpp0 wrote:

This looks weird to me:

0x00005a58:      0x11000001: MI_LOAD_REGISTER_IMM
0x00005a5c:      0x00012044:    dword 1
0x00005a60:      0x0043b625:    dword 2
0x00005a64:      0x11000001: MI_LOAD_REGISTER_IMM
0x00005a68:      0x00022040:    dword 1
0x00005a6c:      0x0043b625:    dword 2
0x00005a70:      0x10800001: MI_STORE_DATA_INDEX
0x00005a74:      0x00000080:    index
0x00005a78:      0x0043b625:    dword
0x00005a7c:      0x01000000: MI_USER_INTERRUPT
0x00005a80:      0x0b160001: MI_SEMAPHORE_MBOX compare semaphore, use compare reg 2
0x00005a84:      0x0043b625:    value
0x00005a88:      0x00000000:    address
0x00005a8c:      0x00000000: MI_NOOP


Chris?

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1140716/comments/56

------------------------------------------------------------------------
On 2013-03-06T09:03:14+00:00 Chris Wilson wrote:

Weird? Did you just forget that the hw does a strictly greater-than
comparison?

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1140716/comments/57
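
The point about the comparison can be illustrated with plain integer arithmetic. This is a toy model only, not driver code: the function name semaphore_passes is hypothetical, and seqno wraparound is ignored. Under a strictly-greater-than rule, a wait carrying value V is released only once the mailbox holds more than V, so waiting for seqno N to have been signalled means programming a compare value of N - 1.

```shell
#!/bin/sh
# Toy model of the comparison described above (hypothetical, wraparound
# ignored): succeed only if the mailbox value is strictly greater than the
# value carried by the wait command. Hex arguments such as 0x0043b625 are
# accepted because they are evaluated by shell arithmetic.
semaphore_passes() {
    mbox=$(($1))
    wait_value=$(($2))
    [ "$mbox" -gt "$wait_value" ]
}
```

With equal values, as in `semaphore_passes 0x0043b625 0x0043b625`, the check fails; with a wait value one lower, `semaphore_passes 0x0043b625 0x0043b624`, it succeeds.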

------------------------------------------------------------------------
On 2013-03-06T09:04:19+00:00 Chris Wilson wrote:

(In reply to comment #47)
> Today Fedora 18 updated kernel to 3.8.1 and message
> "[drm:i915_hangcheck_hung] *ERROR* Hangcheck timer elapsed... GPU hung"
> still here. Please look at my last log. Any updates?

We're still waiting for you to apply the patches and report back.

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1140716/comments/58

------------------------------------------------------------------------
On 2013-03-06T22:51:02+00:00 Daniel-ffwll wrote:

*** Bug 61925 has been marked as a duplicate of this bug. ***

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1140716/comments/59

------------------------------------------------------------------------
On 2013-03-08T22:20:09+00:00 NT Man wrote:

Created attachment 76196
i915_error_state (kernel 3.8.1 Fedora) with patch (write mbox regs twice on snb, v2)

I applied the patch "write mbox regs twice on snb, v2" but still have the
problem: [drm:i915_hangcheck_hung] *ERROR* Hangcheck timer elapsed... GPU
hung

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1140716/comments/60

------------------------------------------------------------------------
On 2013-03-09T08:16:24+00:00 NT Man wrote:

Created attachment 76208
i915_error_state (kernel 3.8.1 Fedora) with patch (Read back semaphore mboxes after update)

I also applied the patch "Read back semaphore mboxes after update" but
still have the problem: [drm:i915_hangcheck_hung] *ERROR* Hangcheck timer
elapsed... GPU hung

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1140716/comments/61

------------------------------------------------------------------------
On 2013-03-09T10:24:13+00:00 Chris Wilson wrote:

(In reply to comment #52)
> Created attachment 76196 [details]
> i915_error_state (kernel 3.8.1 Fedora) with patch (write mbox regs twice on
> snb, v2)
> 
> I applied the patch "write mbox regs twice on snb, v2" but still have the problem:
> [drm:i915_hangcheck_hung] *ERROR* Hangcheck timer elapsed... GPU hung

0x00052cc8:      0x18800100: MI_BATCH_BUFFER_START
0x00052ccc:      0x0d59b000:    dword 1
0x00052cd0:      0x13000001: MI_FLUSH_DW post_sync_op='no write' 
0x00052cd4:      0x000000c4:    address
0x00052cd8:      0x00000000:    dword
0x00052cdc:      0x00000000: MI_NOOP
0x00052ce0:      0x11000001: MI_LOAD_REGISTER_IMM
0x00052ce4:      0x00002044:    dword 1
0x00052ce8:      0x0007a582:    dword 2
0x00052cec:      0x11000001: MI_LOAD_REGISTER_IMM
0x00052cf0:      0x00012040:    dword 1
0x00052cf4:      0x0007a582:    dword 2
0x00052cf8:      0x10800001: MI_STORE_DATA_INDEX
0x00052cfc:      0x00000080:    index
0x00052d00:      0x0007a582:    dword
0x00052d04:      0x01000000: MI_USER_INTERRUPT

That's only a single LRI per semaphore, the patch wasn't tested.

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1140716/comments/62
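
The check made above, counting how many MI_LOAD_REGISTER_IMM commands target each register, can be done mechanically on a decoded ring dump. A rough sketch, assuming the dump has exactly the three-column layout quoted above, with the register offset on the "dword 1" line that follows each command; the function name lri_per_register is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: for a decoded ring dump in the layout quoted above,
# print each register written by MI_LOAD_REGISTER_IMM and how many times.
#   $1  file containing the decoded dump
lri_per_register() {
    awk '
        # The line right after an LRI command holds the register in field 2.
        prev_is_lri {
            reg = $2
            sub(/:$/, "", reg)      # drop the trailing colon
            count[reg]++
            prev_is_lri = 0
            next
        }
        /MI_LOAD_REGISTER_IMM/ { prev_is_lri = 1 }
        END { for (reg in count) print reg, count[reg] }
    ' "$1" | sort
}
```

On the excerpt quoted above this reports a count of 1 for each of 0x00002044 and 0x00012040. A kernel that really wrote each mailbox register twice would be expected to show a count of 2 per register.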

------------------------------------------------------------------------
On 2013-03-09T10:25:44+00:00 Chris Wilson wrote:

I would say '3.8.1-203.fc18.i686.PAE' was the distro kernel and not your
patched version.

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1140716/comments/63

------------------------------------------------------------------------
On 2013-03-09T11:56:27+00:00 NT Man wrote:

Created attachment 76215
kernel.spec

(In reply to comment #55)
> I would say '3.8.1-203.fc18.i686.PAE' was the distro kernel and not your
> patched version.

That's impossible. The distro kernel is 3.8.1-201.fc18.i686.PAE;
3.8.1-202.fc18.i686.PAE and 3.8.1-203.fc18.i686.PAE are kernels patched
by me.

You can verify this by looking at my build spec file.

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1140716/comments/64

------------------------------------------------------------------------
On 2013-03-09T19:31:22+00:00 NT Man wrote:

Created attachment 76239
i915_error_state (kernel 3.8.1 Fedora) with patch (Read back semaphore mboxes after update)

I am sorry. It seems I forgot to add "ApplyPatch" to the spec. I rebuilt the
kernel with the
"0001-drm-i915-Read-back-semaphore-mboxes-after-updating-t.patch"
patch, but it seems the problem is still here.

Does it make sense to check the
"0001-write-mbox-regs-twice-on-gen6.patch" patch?

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1140716/comments/65

------------------------------------------------------------------------
On 2013-03-09T21:19:24+00:00 NT Man wrote:

Created attachment 76243
i915_error_state (kernel 3.8.1 Fedora) with patch (Read back semaphore mboxes after update)

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1140716/comments/66

------------------------------------------------------------------------
On 2013-03-10T09:05:05+00:00 NT Man wrote:

Created attachment 76261
i915_error_state (kernel 3.8.2 Fedora) with patch (write mbox regs twice on snb, v2)

The "write mbox regs twice on snb, v2" patch also does not solve the problem.

[ 1399.270341] [drm:i915_hangcheck_hung] *ERROR* Hangcheck timer elapsed... GPU hung
[ 1399.270345] [drm] capturing error event; look for more information in /debug/dri/0/i915_error_state
[ 1399.277331] [drm:__gen6_gt_force_wake_get] *ERROR* Timed out waiting for forcewake old ack to clear.

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1140716/comments/67

------------------------------------------------------------------------
On 2013-03-10T21:47:45+00:00 NT Man wrote:

Created attachment 76293
i915_error_state (kernel 3.8.2 Fedora) with patch (write mbox regs twice on snb, v2)

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1140716/comments/68

------------------------------------------------------------------------
On 2013-03-12T21:49:44+00:00 NT Man wrote:

Created attachment 76448
i915_error_state (kernel 3.8.2 Fedora) with patch (write mbox regs twice on snb, v2)

Any updates?

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1140716/comments/71

------------------------------------------------------------------------
On 2013-03-17T18:49:52+00:00 Chris Wilson wrote:

*** Bug 62443 has been marked as a duplicate of this bug. ***

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1140716/comments/79

------------------------------------------------------------------------
On 2013-03-19T08:59:38+00:00 Chris Wilson wrote:

As a workaround, this

commit a24a11e6b4e96bca817f854e0ffcce75d3eddd13
Author: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>
Date:   Thu Mar 14 17:52:05 2013 +0200

    drm/i915: Resurrect ring kicking for semaphores, selectively

should improve the recovery from the hangs.

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1140716/comments/81

------------------------------------------------------------------------
On 2013-03-20T10:18:36+00:00 CB wrote:

OK, I've been experiencing this bug from time to time on my Arch Linux
box. No apparent reason, last time it happened I was watching a Youtube
video, and it also seems to happen more often when I'm running
VirtualBox. However, this might just be a coincidence.

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1140716/comments/86

------------------------------------------------------------------------
On 2013-03-31T09:36:10+00:00 Longerdev wrote:

I have this bug too.

Gentoo 64bit
00:02.0 VGA compatible controller: Intel Corporation 2nd Generation Core Processor Family Integrated Graphics Controller (rev 09) (prog-if 00 [VGA controller])
        Subsystem: Samsung Electronics Co Ltd Device c0a0
        Flags: bus master, fast devsel, latency 0, IRQ 16
        Memory at f5c00000 (64-bit, non-prefetchable) [size=4M]
        Memory at e0000000 (64-bit, prefetchable) [size=256M]
        I/O ports at e000 [size=64]
        Expansion ROM at <unassigned> [disabled]
        Capabilities: <access denied>
        Kernel driver in use: i915

Kernel 3.8.0 gentoo-sources

I tried patch a24a11e6b4e96bca817f854e0ffcce75d3eddd13, but nothing changed.
Mar 31 15:14:37 localhost kernel: [64379.291736] [drm:i915_hangcheck_hung] *ERROR* Hangcheck timer elapsed... GPU hung
Mar 31 15:14:37 localhost kernel: [64379.291742] [drm] capturing error event; look for more information in /debug/dri/0/i915_error_state

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1140716/comments/132

------------------------------------------------------------------------
On 2013-04-05T12:52:57+00:00 Mika-kuoppala wrote:

Created attachment 77475
[PATCH] drm/i915: Resurrect ring kicking for semaphores, selectively

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1140716/comments/157

------------------------------------------------------------------------
On 2013-04-05T12:55:10+00:00 Mika-kuoppala wrote:

(In reply to comment #61)
> Created attachment 76448 [details]
> i915_error_state (kernel 3.8.2 Fedora) with patch (write mbox regs twice on
> snb, v2)
> 
> Any updates?

Mikhail,

Could you please try patch:
[PATCH] drm/i915: Resurrect ring kicking for semaphores, selectively

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1140716/comments/158

------------------------------------------------------------------------
On 2013-04-05T19:03:41+00:00 Daniel-ffwll wrote:

Patch is also included in latest drm-intel-nightly, linux-next. So you
can test it by grabbing a distro-build of one of those.

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1140716/comments/160

------------------------------------------------------------------------
On 2013-04-09T20:59:41+00:00 NT Man wrote:

(In reply to comment #67)
> (In reply to comment #61)
> > Created attachment 76448 [details]
> > i915_error_state (kernel 3.8.2 Fedora) with patch (write mbox regs twice on
> > snb, v2)
> > 
> > Any updates?
> 
> Mikhail,
> 
> Could you please try patch:
> [PATCH] drm/i915: Resurrect ring kicking for semaphores, selectively

Hm, it seems better, but the problem is still here:

[59120.008798] [drm:i915_hangcheck_hung] *ERROR* Hangcheck timer elapsed... GPU hung
[59120.008802] [drm] capturing error event; look for more information in /debug/dri/0/i915_error_state
[59120.012173] [drm:kick_ring] *ERROR* Kicking stuck semaphore on render ring

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1140716/comments/196

------------------------------------------------------------------------
On 2013-04-09T21:01:13+00:00 NT Man wrote:

Created attachment 77692
i915_error_state (kernel 3.8.5 Fedora) with patch (drm/i915: Resurrect ring kicking for semaphores, selectively)

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1140716/comments/197

------------------------------------------------------------------------
On 2013-04-09T21:02:28+00:00 NT Man wrote:

Created attachment 77693
dmesg (kernel 3.8.5 Fedora) with patch (drm/i915: Resurrect ring kicking for semaphores, selectively)

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1140716/comments/198

------------------------------------------------------------------------
On 2013-04-09T21:08:44+00:00 Chris Wilson wrote:

\o/ It kicked the right ring.

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1140716/comments/199

------------------------------------------------------------------------
On 2013-04-09T21:16:56+00:00 NT Man wrote:

(In reply to comment #72)
> \o/ It kicked the right ring.

So is this normal?

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1140716/comments/200

------------------------------------------------------------------------
On 2013-04-09T22:13:53+00:00 Chris Wilson wrote:

It's the expected 'improved' recovery behaviour for this bug.

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1140716/comments/201

------------------------------------------------------------------------
On 2013-04-15T09:35:58+00:00 Chris Wilson wrote:

*** Bug 63542 has been marked as a duplicate of this bug. ***

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1140716/comments/217

------------------------------------------------------------------------
On 2013-04-22T23:55:08+00:00 B-harrington wrote:

Chris, what is the upstream status for the ring kicker patch?  Is that
likely to get incorporated upstream, or do you feel it needs further
polish before it's ready?  Would this patch incur some risk of
regressions in other areas were it to be backported for inclusion in
Ubuntu?

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1140716/comments/265

------------------------------------------------------------------------
On 2013-04-23T14:57:39+00:00 Daniel-ffwll wrote:

(In reply to comment #76)
> Chris, what is the upstream status for the ring kicker patch?  Is that
> likely to get incorporated upstream, or do you feel it needs further polish
> before it's ready?  Would this patch incur some risk of regressions in other
> areas were it to be backported for inclusion in Ubuntu?

Merged for 3.10 as

commit a24a11e6b4e96bca817f854e0ffcce75d3eddd13
Author: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>
Date:   Thu Mar 14 17:52:05 2013 +0200

    drm/i915: Resurrect ring kicking for semaphores, selectively

Nothing else is planned for now, but I think we can just keep this bug
open in case we stumble across a new idea. And it seems to be good honey
to attract all the me-too reports ;-)

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1140716/comments/267

------------------------------------------------------------------------
On 2013-04-23T16:37:43+00:00 Tomwij-1 wrote:

(In reply to comment #65)
> Kernel 3.8.0 gentoo-sources

Did you report this at the Gentoo Bugzilla?

When you do, please attach /debug/dri/0/i915_error_state
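
For anyone collecting that file, here is a minimal sketch of saving a copy of the error state so it can be attached to a report. The two candidate paths are the ones the kernel messages in this thread mention; the function names are hypothetical, and the sketch assumes debugfs is mounted and that the script runs with enough privileges (normally root) to read it.

```python
import os
import shutil
from pathlib import Path

# Locations named by the kernel messages quoted in this thread.
CANDIDATES = [
    Path("/sys/kernel/debug/dri/0/i915_error_state"),
    Path("/debug/dri/0/i915_error_state"),
]


def find_error_state(candidates=CANDIDATES):
    """Return the first readable candidate path, or None if none is readable."""
    for path in candidates:
        # os.access returns False (instead of raising) for missing or
        # permission-protected paths such as an unreadable debugfs mount.
        if os.access(str(path), os.R_OK):
            return path
    return None


def save_error_state(dest, candidates=CANDIDATES):
    """Copy the error state to dest; return the source path used, or None."""
    src = find_error_state(candidates)
    if src is None:
        return None
    # debugfs files usually report a size of 0, so stream until EOF
    # rather than relying on the reported file size.
    with open(src, "rb") as fin, open(dest, "wb") as fout:
        shutil.copyfileobj(fin, fout)
    return src


if __name__ == "__main__":
    used = save_error_state("i915_error_state.txt")
    print("saved from", used if used else "nowhere (file not readable)")
```

The copy should be taken soon after the hang is reported in dmesg, so that it reflects the hang being reported.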

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1140716/comments/269

------------------------------------------------------------------------
On 2013-04-29T12:13:16+00:00 Longerdev wrote:

>Did you report this at the Gentoo Bugzilla?

>When you do, please attach /debug/dri/0/i915_error_state

I have not reported it in the Gentoo Bugzilla so far (their kernel
carries no patches for the Intel drivers). But with this patch I have
not been able to reproduce the bug for 2 weeks on kernel 3.9-rc6.
However, I have not tested with Blender (when I used Blender, the GPU
hang recurred within 1-5 minutes).

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1140716/comments/289

------------------------------------------------------------------------
On 2013-05-01T06:44:05+00:00 Chris Wilson wrote:

*** Bug 64094 has been marked as a duplicate of this bug. ***

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1140716/comments/292

------------------------------------------------------------------------
On 2013-05-01T07:04:06+00:00 NT Man wrote:

Created attachment 78692
i915_error_state (kernel 3.9 Fedora)

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1140716/comments/293

------------------------------------------------------------------------
On 2013-05-01T07:04:33+00:00 NT Man wrote:

Created attachment 78693
i915_error_state (kernel 3.9 Fedora)

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1140716/comments/294

------------------------------------------------------------------------
On 2013-05-07T07:42:11+00:00 Chris Wilson wrote:

*** Bug 64094 has been marked as a duplicate of this bug. ***

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1140716/comments/303

------------------------------------------------------------------------
On 2013-05-23T12:46:34+00:00 Freedesktop-l wrote:

Created attachment 79704
i915_error_state - kernel 3.10-rc2, dual monitor, Dell E6430

I can reproduce this bug every time I try to quickly drag a Chrome
window with a YouTube movie to a secondary monitor connected to my
laptop Dell E6430. It is very annoying. Tested on latest kernel
3.10-rc2.

I can give you any additional information you want, test patches, etc.
Just please try to fix this :)

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1140716/comments/308

------------------------------------------------------------------------
On 2013-05-23T12:55:27+00:00 Freedesktop-l wrote:

(In reply to comment #84)
> Created attachment 79704 [details]
> i915_error_state - kernel 3.10-rc2, dual monitor, Dell E6430
> 
> I can reproduce this bug every time I try to quickly drag a Chrome window
> with a YouTube movie to a secondary monitor connected to my laptop Dell
> E6430.

One more piece of information: you need to enable "Override software
rendering list" in chrome://flags

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1140716/comments/309

------------------------------------------------------------------------
On 2013-05-29T17:54:42+00:00 Cwawak wrote:

Created attachment 79979
i915_error_state - 3.9.2-201.rhbz879823.fc18.x86_64 (included patch write mbox regs twice on snb, v2)

Linux bobloblaw 3.9.2-201.rhbz879823.fc18.x86_64 #1 SMP Thu May 16
13:35:12 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux

[45482.757631] [drm:i915_hangcheck_hung] *ERROR* Hangcheck timer elapsed... GPU hung
[45482.757645] [drm] capturing error event; look for more information in/sys/kernel/debug/dri/0/i915_error_state
[45482.766942] [drm:kick_ring] *ERROR* Kicking stuck semaphore on render ring
[45482.770617] [drm:__gen6_gt_force_wake_get] *ERROR* Timed out waiting for forcewake old ack to clear.

I added patch (drm/i915: Resurrect ring kicking for semaphores,
selectively) to Fedora 18's 3.9.2-200 x86_64 kernel.

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1140716/comments/314

------------------------------------------------------------------------
On 2013-06-30T20:28:46+00:00 Cwawak wrote:

Is there any input or assistance I can give to help move this along?

Thanks!

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1140716/comments/333

------------------------------------------------------------------------
On 2013-07-20T21:33:52+00:00 Chris Wilson wrote:

Created attachment 82747
New read-after-write patch

New patch for testing, thanks!

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1140716/comments/349

------------------------------------------------------------------------
On 2013-07-20T21:35:45+00:00 Chris Wilson wrote:

Created attachment 82748
New read-after-write patch

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1140716/comments/350

------------------------------------------------------------------------
On 2013-07-20T21:46:24+00:00 NT Man wrote:

Which kernel version is this patch for?

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1140716/comments/351

------------------------------------------------------------------------
On 2013-07-21T10:49:00+00:00 Longerdev wrote:

I tried the patch on linux-3.11_rc1, but when X starts I see:
791966 Jul 21 16:17:07 localhost kernel: [   19.320879] BUG: unable to handle kernel NULL pointer dereference at 0000000000000010
791967 Jul 21 16:17:07 localhost kernel: [   19.320948] IP: [<ffffffff8136bfc0>] gen6_add_request+0xe7/0x178
791968 Jul 21 16:17:07 localhost kernel: [   19.320995] PGD b0d80067 PUD b0c18067 PMD 0
791969 Jul 21 16:17:07 localhost kernel: [   19.321031] Oops: 0000 [#1] PREEMPT SMP
791970 Jul 21 16:17:07 localhost kernel: [   19.321064] Modules linked in: snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_intel snd_hda_codec brcmsmac snd_hwdep snd_pcm cordic brcmutil bcma snd_page_alloc snd_timer snd soundcore
791971 Jul 21 16:17:07 localhost kernel: [   19.321209] CPU: 0 PID: 2696 Comm: X Not tainted 3.11.0-rc1 #1
791972 Jul 21 16:17:07 localhost kernel: [   19.321249] Hardware name: SAMSUNG ELECTRONICS CO., LTD. SF311/SF411/SF511/SF311/SF411/SF511, BIOS 06HW.M011.20110503.SCY 05/03/2011
791973 Jul 21 16:17:07 localhost kernel: [   19.321322] task: ffff8800b1c07590 ti: ffff8800b0c24000 task.ti: ffff8800b0c24000
791974 Jul 21 16:17:07 localhost kernel: [   19.321370] RIP: 0010:[<ffffffff8136bfc0>]  [<ffffffff8136bfc0>] gen6_add_request+0xe7/0x178
791975 Jul 21 16:17:07 localhost kernel: [   19.321426] RSP: 0018:ffff8800b0c25bc8  EFLAGS: 00010286
791976 Jul 21 16:17:07 localhost kernel: [   19.321461] RAX: 0000000000000000 RBX: ffff8800b1c3d4d8 RCX: 0000000000027330
791977 Jul 21 16:17:07 localhost kernel: [   19.321506] RDX: 0000000000000080 RSI: ffffc900045c003c RDI: ffffc900045c0038
791978 Jul 21 16:17:07 localhost kernel: [   19.321550] RBP: ffff8800b0c25c08 R08: ffff8800b0d97f00 R09: 00000000000145c0
791979 Jul 21 16:17:07 localhost kernel: [   19.321594] R10: 0000000000001000 R11: ffff8800b1c3c000 R12: 0000000000000000
791980 Jul 21 16:17:07 localhost kernel: [   19.321638] R13: 0000000000002044 R14: 0000000000000000 R15: ffff8800b1c3c000
791981 Jul 21 16:17:07 localhost kernel: [   19.321682] FS:  00007ff167ae8880(0000) GS:ffff880100200000(0000) knlGS:0000000000000000
791982 Jul 21 16:17:07 localhost kernel: [   19.321732] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
791983 Jul 21 16:17:07 localhost kernel: [   19.321767] CR2: 0000000000000010 CR3: 00000000b1cc9000 CR4: 00000000000407f0
791984 Jul 21 16:17:07 localhost kernel: [   19.321810] Stack:
791985 Jul 21 16:17:07 localhost kernel: [   19.321824]  ffff8800b1c3d4d8 0000000000000000 ffff8800aff24000 0000000000000000
791986 Jul 21 16:17:07 localhost kernel: [   19.321876]  ffff8800b1c3c000 ffff8800b0d97f00 ffff8800b1f66a00 ffff8800b1c3d4d8
791987 Jul 21 16:17:07 localhost kernel: [   19.321927]  ffff8800b0c25c68 ffffffff81334b11 ffff880000000028 0000000000000000
791988 Jul 21 16:17:07 localhost kernel: [   19.321979] Call Trace:
791989 Jul 21 16:17:07 localhost kernel: [   19.322000]  [<ffffffff81334b11>] __i915_add_request+0x6d/0x215
791990 Jul 21 16:17:07 localhost kernel: [   19.322045]  [<ffffffff8133b8d9>] i915_gem_do_execbuffer.isra.14+0xd07/0xdc5
791991 Jul 21 16:17:07 localhost kernel: [   19.322089]  [<ffffffff8133bd5e>] ? i915_gem_execbuffer2+0x5d/0x1e3
791992 Jul 21 16:17:07 localhost kernel: [   19.322128]  [<ffffffff8133be5a>] i915_gem_execbuffer2+0x159/0x1e3
791993 Jul 21 16:17:07 localhost kernel: [   19.322170]  [<ffffffff8130e167>] drm_ioctl+0x302/0x446
791994 Jul 21 16:17:07 localhost kernel: [   19.322204]  [<ffffffff8133bd01>] ? i915_gem_execbuffer+0x36a/0x36a
791995 Jul 21 16:17:07 localhost kernel: [   19.322245]  [<ffffffff8102a823>] ? __do_page_fault+0x34f/0x3f3
791996 Jul 21 16:17:07 localhost kernel: [   19.322285]  [<ffffffff810d3621>] vfs_ioctl+0x21/0x34
791997 Jul 21 16:17:07 localhost kernel: [   19.322317]  [<ffffffff810d3e7a>] do_vfs_ioctl+0x3b8/0x3fb
791998 Jul 21 16:17:07 localhost kernel: [   19.322353]  [<ffffffff810dbab9>] ? fget_light+0xa1/0xb8
791999 Jul 21 16:17:07 localhost kernel: [   19.322387]  [<ffffffff810d3efd>] SyS_ioctl+0x40/0x6b
792000 Jul 21 16:17:07 localhost kernel: [   19.322420]  [<ffffffff816450d2>] system_call_fastpath+0x16/0x1b
792001 Jul 21 16:17:07 localhost kernel: [   19.322457] Code: e8 d4 c0 f0 ff 8b 73 2c 44 89 ef 83 c6 04 89 73 2c 48 03 73 10 e8 bf c0 f0 ff 8b 73 2c 48 8b 45 c8 83 c6 04 89 73 2c 48 03 73 10 <8b> 78 10 83 ef 80 e8 a3 c0 f0 ff 83 43 2c 04 49 ff c4 49 83 fc
792002 Jul 21 16:17:07 localhost kernel: [   19.322688] RIP  [<ffffffff8136bfc0>] gen6_add_request+0xe7/0x178
792003 Jul 21 16:17:07 localhost kernel: [   19.322728]  RSP <ffff8800b0c25bc8>
792004 Jul 21 16:17:07 localhost kernel: [   19.322750] CR2: 0000000000000010
792005 Jul 21 16:17:07 localhost kernel: [   19.330669] ---[ end trace b13215eb98a2df5f ]---

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1140716/comments/352

------------------------------------------------------------------------
On 2013-07-21T11:05:03+00:00 Chris Wilson wrote:

Created attachment 82768
New read-after-write patch

Oops, my mistake, please try again.

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1140716/comments/353

------------------------------------------------------------------------
On 2013-07-21T11:38:42+00:00 Longerdev wrote:

Created attachment 82773
i915_error_state with new patch

(In reply to comment #92)
> Created attachment 82768 [details] [review]
> New read-after-write patch
> 
> Oops, my mistake, please try again.

Now loading, but after five minutes test:
793485 Jul 21 17:32:56 localhost kernel: [  321.432882] hda-intel 0000:00:1b.0: Unstable LPIB (32740 >= 4096); disabling LPIB delay counting
793486 Jul 21 17:34:49 localhost kernel: [  434.291085] [drm:i915_hangcheck_elapsed] *ERROR* stuck on render ring
793487 Jul 21 17:34:49 localhost kernel: [  434.291088] [drm] capturing error event; look for more information in /sys/kernel/debug/dri/0/i915_error_state
793488 Jul 21 17:34:49 localhost kernel: [  434.307124] [drm:i915_set_reset_status] *ERROR* render ring hung inside bo (0xbfe2000 ctx 1) at 0xbfe21dc

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1140716/comments/354

------------------------------------------------------------------------
On 2013-07-21T11:42:36+00:00 Chris Wilson wrote:

(In reply to comment #93)
> Created attachment 82773 [details]
> i915_error_state with new patch
> 
> (In reply to comment #92)
> > Created attachment 82768 [details] [review] [review]
> > New read-after-write patch
> > 
> > Oops, my mistake, please try again.
> 
> Now loading, but after five minutes test:
> 793485 Jul 21 17:32:56 localhost kernel: [  321.432882] hda-intel
> 0000:00:1b.0: Unstable LPIB (32740 >= 4096); disabling LPIB delay counting
> 793486 Jul 21 17:34:49 localhost kernel: [  434.291085]
> [drm:i915_hangcheck_elapsed] *ERROR* stuck on render ring
> 793487 Jul 21 17:34:49 localhost kernel: [  434.291088] [drm] capturing
> error event; look for more information in
> /sys/kernel/debug/dri/0/i915_error_state
> 793488 Jul 21 17:34:49 localhost kernel: [  434.307124]
> [drm:i915_set_reset_status] *ERROR* render ring hung inside bo (0xbfe2000
> ctx 1) at 0xbfe21dc

That is a blorp (mesa/i965) bug and not the semaphore deadlock.
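
To make that distinction easier to spot in long logs, here is a rough sketch that sorts the `*ERROR*` lines quoted in this thread into categories. The category names and the matching rules are assumptions derived only from the messages seen here (for example, "Kicking stuck semaphore" versus "hung inside bo"); they are a reading aid, not an authoritative diagnosis.

```python
import re

# Matches lines such as:
# [59120.012173] [drm:kick_ring] *ERROR* Kicking stuck semaphore on render ring
LINE_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+\[drm:(?P<func>\w+)\]\s+\*ERROR\*\s+(?P<msg>.*)"
)


def classify(line):
    """Return (timestamp, drm_function, kind) or None for non-matching lines."""
    m = LINE_RE.search(line)
    if m is None:
        return None
    msg = m.group("msg")
    # Heuristic labels (hypothetical names), checked from most to least specific.
    if "Kicking stuck semaphore" in msg:
        kind = "semaphore-kick"
    elif "hung inside bo" in msg:
        kind = "hang-inside-bo"
    elif "forcewake" in msg:
        kind = "forcewake-timeout"
    elif "GPU hung" in msg or "stuck on" in msg:
        kind = "hangcheck"
    else:
        kind = "other"
    return float(m.group("ts")), m.group("func"), kind


if __name__ == "__main__":
    import sys

    # Usage: dmesg | python3 this_script.py
    for raw in sys.stdin:
        result = classify(raw)
        if result is not None:
            print(result)
```

A log that shows "semaphore-kick" entries points at the deadlock tracked in this bug, while "hang-inside-bo" entries without a kick are more likely the separate issue mentioned above.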

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1140716/comments/355

------------------------------------------------------------------------
On 2013-08-11T11:53:14+00:00 Chris Wilson wrote:

Will someone please try
https://bugs.freedesktop.org/attachment.cgi?id=82768 with a working
mesa! :)

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1140716/comments/357

------------------------------------------------------------------------
On 2013-08-24T01:49:55+00:00 Andy Lutomirski wrote:

The patch seems to have helped -- my box survived a couple days with the
patch applied.

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1140716/comments/363

------------------------------------------------------------------------
On 2013-08-25T13:25:37+00:00 Chris Wilson wrote:

The bad news is that I've just had the semaphore hang with all the read-
after-write patch applied. :|

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1140716/comments/365

------------------------------------------------------------------------
On 2013-09-03T20:18:48+00:00 Januszmk6 wrote:

(In reply to comment #94)
> (In reply to comment #93)
> > Created attachment 82773 [details]
> > i915_error_state with new patch
> > 
> > (In reply to comment #92)
> > > Created attachment 82768 [details] [review] [review] [review]
> > > New read-after-write patch
> > > 
> > > Oops, my mistake, please try again.
> > 
> > Now loading, but after five minutes test:
> > 793485 Jul 21 17:32:56 localhost kernel: [  321.432882] hda-intel
> > 0000:00:1b.0: Unstable LPIB (32740 >= 4096); disabling LPIB delay counting
> > 793486 Jul 21 17:34:49 localhost kernel: [  434.291085]
> > [drm:i915_hangcheck_elapsed] *ERROR* stuck on render ring
> > 793487 Jul 21 17:34:49 localhost kernel: [  434.291088] [drm] capturing
> > error event; look for more information in
> > /sys/kernel/debug/dri/0/i915_error_state
> > 793488 Jul 21 17:34:49 localhost kernel: [  434.307124]
> > [drm:i915_set_reset_status] *ERROR* render ring hung inside bo (0xbfe2000
> > ctx 1) at 0xbfe21dc
> 
> That is a blorp (mesa/i965) bug and not the semaphore deadlock.
Could you please provide a link to this blorp bug report?
I had the semaphore deadlock problem; with kernel 3.11 it no longer seems to occur (without the patch), but now I have:

[22221.843000] [drm:i915_hangcheck_elapsed] *ERROR* stuck on render ring
[22221.843483] [drm:i915_set_reset_status] *ERROR* render ring hung inside bo (0x4dfb5000 ctx 1) at 0x4dfb5518

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1140716/comments/367

------------------------------------------------------------------------
On 2013-09-04T00:27:20+00:00 Chris Wilson wrote:

*** Bug 68913 has been marked as a duplicate of this bug. ***

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1140716/comments/368


** Changed in: dri
       Status: Unknown => Confirmed

** Changed in: dri
   Importance: Unknown => Medium

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1140716

Title:
  [regression] 3.5.0-26-generic and 3.2.0-39-generic GPU hangs on
  Sandybridge

Status in Direct Rendering Infrastructure:
  Confirmed
Status in The Linux Kernel:
  Invalid
Status in “linux” package in Ubuntu:
  Invalid
Status in “linux-lts-quantal” package in Ubuntu:
  Invalid
Status in “linux” source package in Precise:
  Fix Released
Status in “linux-lts-quantal” source package in Precise:
  Fix Released
Status in “linux” source package in Quantal:
  Fix Released
Status in “linux-lts-quantal” source package in Quantal:
  Invalid
Status in “linux” source package in Raring:
  Confirmed
Status in “linux-lts-quantal” source package in Raring:
  Invalid
Status in “linux” package in Debian:
  New
Status in “linux” package in Fedora:
  Unknown

Bug description:
  I'm getting errors about GPU hangs every minute or so (usually only
  when using FF and scrolling a webpage or something). I also get an
  annoying ubuntu dialog saying there is a "system error".

  This didn't happen with 3.5.0-24-generic.

  Here is the dmesg:
  [15169.033709] [drm:i915_hangcheck_hung] *ERROR* Hangcheck timer elapsed... GPU hung
  [15169.034517] [drm] Enabling RC6 states: RC6 on, RC6p off, RC6pp off
  [15628.480216] [drm:i915_hangcheck_hung] *ERROR* Hangcheck timer elapsed... GPU hung
  [15628.480570] [drm] Enabling RC6 states: RC6 on, RC6p off, RC6pp off
  [15844.231372] [drm:i915_hangcheck_hung] *ERROR* Hangcheck timer elapsed... GPU hung
  [15844.231773] [drm] Enabling RC6 states: RC6 on, RC6p off, RC6pp off
  [20173.232593] [drm:i915_hangcheck_hung] *ERROR* Hangcheck timer elapsed... GPU hung
  [20173.233211] [drm] Enabling RC6 states: RC6 on, RC6p off, RC6pp off
  [26285.650393] [drm:i915_hangcheck_hung] *ERROR* Hangcheck timer elapsed... GPU hung
  [26285.650980] [drm] Enabling RC6 states: RC6 on, RC6p off, RC6pp off
  [26285.658405] ------------[ cut here ]------------
  [26285.658472] WARNING: at /build/buildd/linux-3.5.0/drivers/gpu/drm/i915/intel_pm.c:2505 gen6_enable_rps+0x706/0x710 [i915]()
  [26285.658474] Hardware name: SATELLITE Z830
  [26285.658476] Modules linked in: sdhci_pci sdhci btrfs zlib_deflate libcrc32c ufs qnx4 hfsplus hfs minix ntfs msdos jfs xfs reiserfs ext2 snd_hda_codec_hdmi snd_hda_codec_realtek joydev btusb coretemp kvm_intel kvm arc4 ghash_clmulni_intel aesni_intel cryptd aes_x86_64 snd_hda_intel snd_hda_codec snd_hwdep uvcvideo snd_pcm videobuf2_core microcode videodev bnep iwlwifi videobuf2_vmalloc snd_seq_midi psmouse videobuf2_memops snd_rawmidi rfcomm pcspkr snd_seq_midi_event serio_raw snd_seq bluetooth mac80211 snd_timer snd_seq_device i915 drm_kms_helper cfg80211 drm toshiba_acpi snd sparse_keymap soundcore wmi i2c_algo_bit toshiba_bluetooth snd_page_alloc parport_pc mei video mac_hid lpc_ich ppdev nfsd nfs lockd fscache auth_rpcgss nfs_acl sunrpc lp parport e1000e ahci libahci [last unloaded: sdhci]
  [26285.658537] Pid: 23433, comm: kworker/u:0 Not tainted 3.5.0-26-generic #40-Ubuntu
  [26285.658539] Call Trace:
  [26285.658549]  [<ffffffff81051bef>] warn_slowpath_common+0x7f/0xc0
  [26285.658553]  [<ffffffff81051c4a>] warn_slowpath_null+0x1a/0x20
  [26285.658569]  [<ffffffffa02d32e6>] gen6_enable_rps+0x706/0x710 [i915]
  [26285.658584]  [<ffffffffa02bf3f6>] intel_modeset_init_hw+0x66/0xa0 [i915]
  [26285.658595]  [<ffffffffa02954b4>] i915_reset+0x1a4/0x6e0 [i915]
  [26285.658601]  [<ffffffff8101257b>] ? __switch_to+0x12b/0x420
  [26285.658612]  [<ffffffffa029a943>] i915_error_work_func+0xc3/0x110 [i915]
  [26285.658618]  [<ffffffff8107097a>] process_one_work+0x12a/0x420
  [26285.658629]  [<ffffffffa029a880>] ? gen6_pm_rps_work+0xe0/0xe0 [i915]
  [26285.658632]  [<ffffffff8107152e>] worker_thread+0x12e/0x2f0
  [26285.658636]  [<ffffffff81071400>] ? manage_workers.isra.26+0x200/0x200
  [26285.658640]  [<ffffffff81076023>] kthread+0x93/0xa0
  [26285.658644]  [<ffffffff8168a3e4>] kernel_thread_helper+0x4/0x10
  [26285.658649]  [<ffffffff81075f90>] ? kthread_freezable_should_stop+0x70/0x70
  [26285.658652]  [<ffffffff8168a3e0>] ? gs_change+0x13/0x13
  [26285.658654] ---[ end trace 59c6162fdfcbffee ]---
  [26756.021167] [drm:i915_hangcheck_hung] *ERROR* Hangcheck timer elapsed... GPU hung
  [26756.021426] [drm] Enabling RC6 states: RC6 on, RC6p off, RC6pp off
  [26766.014093] [drm:i915_hangcheck_hung] *ERROR* Hangcheck timer elapsed... GPU hung
  [26766.014397] [drm] Enabling RC6 states: RC6 on, RC6p off, RC6pp off
  [26932.376233] [drm:i915_hangcheck_hung] *ERROR* Hangcheck timer elapsed... GPU hung
  [26932.376544] [drm] Enabling RC6 states: RC6 on, RC6p off, RC6pp off
  [26932.384285] ------------[ cut here ]------------
  [26932.384354] WARNING: at /build/buildd/linux-3.5.0/drivers/gpu/drm/i915/intel_pm.c:2505 gen6_enable_rps+0x706/0x710 [i915]()
  [26932.384356] Hardware name: SATELLITE Z830
  [26932.384358] Modules linked in: sdhci_pci sdhci btrfs zlib_deflate libcrc32c ufs qnx4 hfsplus hfs minix ntfs msdos jfs xfs reiserfs ext2 snd_hda_codec_hdmi snd_hda_codec_realtek joydev btusb coretemp kvm_intel kvm arc4 ghash_clmulni_intel aesni_intel cryptd aes_x86_64 snd_hda_intel snd_hda_codec snd_hwdep uvcvideo snd_pcm videobuf2_core microcode videodev bnep iwlwifi videobuf2_vmalloc snd_seq_midi psmouse videobuf2_memops snd_rawmidi rfcomm pcspkr snd_seq_midi_event serio_raw snd_seq bluetooth mac80211 snd_timer snd_seq_device i915 drm_kms_helper cfg80211 drm toshiba_acpi snd sparse_keymap soundcore wmi i2c_algo_bit toshiba_bluetooth snd_page_alloc parport_pc mei video mac_hid lpc_ich ppdev nfsd nfs lockd fscache auth_rpcgss nfs_acl sunrpc lp parport e1000e ahci libahci [last unloaded: sdhci]
  [26932.384421] Pid: 24262, comm: kworker/u:2 Tainted: G        W    3.5.0-26-generic #40-Ubuntu
  [26932.384422] Call Trace:
  [26932.384431]  [<ffffffff81051bef>] warn_slowpath_common+0x7f/0xc0
  [26932.384436]  [<ffffffff81051c4a>] warn_slowpath_null+0x1a/0x20
  [26932.384451]  [<ffffffffa02d32e6>] gen6_enable_rps+0x706/0x710 [i915]
  [26932.384466]  [<ffffffffa02bf3f6>] intel_modeset_init_hw+0x66/0xa0 [i915]
  [26932.384476]  [<ffffffffa02954b4>] i915_reset+0x1a4/0x6e0 [i915]
  [26932.384482]  [<ffffffff8101257b>] ? __switch_to+0x12b/0x420
  [26932.384493]  [<ffffffffa029a943>] i915_error_work_func+0xc3/0x110 [i915]
  [26932.384500]  [<ffffffff8107097a>] process_one_work+0x12a/0x420
  [26932.384511]  [<ffffffffa029a880>] ? gen6_pm_rps_work+0xe0/0xe0 [i915]
  [26932.384514]  [<ffffffff8107152e>] worker_thread+0x12e/0x2f0
  [26932.384517]  [<ffffffff81071400>] ? manage_workers.isra.26+0x200/0x200
  [26932.384521]  [<ffffffff81076023>] kthread+0x93/0xa0
  [26932.384526]  [<ffffffff8168a3e4>] kernel_thread_helper+0x4/0x10
  [26932.384531]  [<ffffffff81075f90>] ? kthread_freezable_should_stop+0x70/0x70
  [26932.384534]  [<ffffffff8168a3e0>] ? gs_change+0x13/0x13
  [26932.384536] ---[ end trace 59c6162fdfcbffef ]---

  ProblemType: Bug
  DistroRelease: Ubuntu 12.10
  Package: linux-image-3.5.0-26-generic 3.5.0-26.40
  ProcVersionSignature: Ubuntu 3.5.0-26.40-generic 3.5.7.6
  Uname: Linux 3.5.0-26-generic x86_64
  ApportVersion: 2.6.1-0ubuntu10
  Architecture: amd64
  AudioDevicesInUse:
   USER        PID ACCESS COMMAND
   /dev/snd/controlC0:  luca       2084 F.... pulseaudio
  CheckboxSubmission: f8b82cd9bc23fe075e5068a9824afda5
  CheckboxSystem: b1865df84255b8716d3bcc269ff410d1
  Date: Sat Mar  2 22:25:14 2013
  HibernationDevice: RESUME=UUID=20fe6da8-7d68-4660-953f-6e4ae1d348a7
  InstallationDate: Installed on 2012-04-26 (310 days ago)
  InstallationMedia: Ubuntu 12.04 LTS "Precise Pangolin" - Release amd64 (20120425)
  MachineType: TOSHIBA SATELLITE Z830
  MarkForUpload: True
  ProcFB: 0 inteldrmfb
  ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-3.5.0-26-generic root=UUID=36929bf3-a158-44d9-a80d-3adac2840fa8 ro quiet splash acpi_backlight=vendor i915.i915_enable_rc6=1 i915.lvds_downclock=1 vt.handoff=7
  RelatedPackageVersions:
   linux-restricted-modules-3.5.0-26-generic N/A
   linux-backports-modules-3.5.0-26-generic  N/A
   linux-firmware                            1.95
  SourcePackage: linux
  UpgradeStatus: Upgraded to quantal on 2012-10-28 (125 days ago)
  dmi.bios.date: 07/31/2012
  dmi.bios.vendor: TOSHIBA
  dmi.bios.version: Version 1.70
  dmi.board.asset.tag: 0000000000
  dmi.board.name: Portable PC
  dmi.board.vendor: TOSHIBA
  dmi.board.version: Version A0
  dmi.chassis.asset.tag: 0000000000
  dmi.chassis.type: 10
  dmi.chassis.vendor: TOSHIBA
  dmi.chassis.version: Version 1.0
  dmi.modalias: dmi:bvnTOSHIBA:bvrVersion1.70:bd07/31/2012:svnTOSHIBA:pnSATELLITEZ830:pvrPT22LE-00300GGR:rvnTOSHIBA:rnPortablePC:rvrVersionA0:cvnTOSHIBA:ct10:cvrVersion1.0:
  dmi.product.name: SATELLITE Z830
  dmi.product.version: PT22LE-00300GGR
  dmi.sys.vendor: TOSHIBA

To manage notifications about this bug go to:
https://bugs.launchpad.net/dri/+bug/1140716/+subscriptions