Re: Stability Under Load


On Fri, Aug 19, 2011 at 03:53:09PM +0100, Gordan Bobic wrote:
> So you are suggesting that hf platform is actually more stable? Are
> your results repeatable in terms of demonstrating that on hf it
> doesn't happen?
I didn't try it multiple times. It might have a slightly newer
userspace though, which might fix some of the problems, if
only accidentally.

> >I ran some memory testing tools, but they did not find any
> >problem.
> Ditto, I ran many, many times and it hasn't found any issues, but it
> is generally not good for stress-testing.
But I believe it shows us that the memory itself is fine, and the
problem must be somewhere else.

> It would be very hard for the binary Xorg driver to cause other
> programs to randomly crash.
Part of the system memory is used by the display driver, so if
a kernel bug makes the kernel use one of those portions of RAM
even though the graphics system is using it, that could
explain it.

> The obvious question I have now is that since there clearly are
> several people who have seen stability issues, why hasn't this been
> raised before?
I raised the issue multiple times on IRC, but obviously only when
you were not there.

> If it turns out that AC100 is systematically suffering from duff,
> pre-over-overclocked hardware (as is fairly typical of nvidia -
> their chips generally cannot handle running at full load at default
> clocks for reasonable periods of time, and they have no margin for
> error at all, both in terms of default voltages and clock-speeds),
> it seems the effort going into it may well be wasted, at least until
> other similar hardware becomes available. I'm eagerly awaiting
> Jeremiah's report on whether TrimSlice is exhibiting the same
> issues. I sincerely hope it isn't and that it's down to memory
> timings, since at least we can try to do something about those.

We could still underclock devices if needed.

PS. My Mail-Followup-To header indicated that followups should
    be sent to the list; I am subscribed to it and do not need
    to be in To/Cc.

Julian Andres Klode  - Debian Developer, Ubuntu Member

See http://wiki.debian.org/JulianAndresKlode and http://jak-linux.org/.

