
ac100 team mailing list archive

Re: Stability Under Load

 

On Fri, Aug 19, 2011 at 10:18:31AM +0100, Gordan Bobic wrote:
> As some of you may have already heard on the IRC channel, I had my
> AC100 suddenly become very unstable under load. When doing big
> compile jobs, the compiler would relatively regularly segfault or
> detect hardware errors, or errors it didn't think were hardware-related
> and invited me to post a bug report with the pre-processed C file. None of
> these were reproducible (it would error out in a different place on
> different runs). So I figured I had duff hardware and got another
> one. This is a lot better, but I still get spurious, unreproducible
> errors like this every few hours (old one would error out up to a
> few times/hour if it was being hammered with compiling jobs for a
> few hours). Both of mine are the 10U models with Micron RAM.
> 
> Now, either I am incredibly unlucky or something else is going on.
> What I would like to know is:
> 1) Do you use your AC100 for big compile jobs (e.g. the 2-day gcc
> compile)?
> 2) If 1), are you seeing random errors like what I'm describing?

Yes, I am seeing GCC crash or recurse endlessly when building
kernels on my 10V under Debian armel, at least when compiling on
multiple cores. With one core I got a complete build (although
that takes 4 hours), at least when I left the machine off for a
day, then booted it and started the build right away.

I also built a kernel in an armhf environment using 3-5 parallel
jobs without a problem (in 2 hours build time).

Furthermore, while building a kernel I tried to decompress a
file; this failed on the first attempt, but succeeded on the
second attempt.

I ran some memory testing tools, but they did not find any
problem.
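
Since the memory testers found nothing and the decompression
failure only showed up while a build was running, a repeated
round-trip check under load may be more telling. What follows is a
minimal sketch, not a tool I actually ran: the function names, the
payload sizes and the worker count of 2 are all assumptions chosen
for illustration. It compresses and decompresses a known payload
in several threads at once and counts every round whose result
does not match the original checksum.

```python
import hashlib
import random
import zlib
from concurrent.futures import ThreadPoolExecutor


def make_payload(size, seed):
    """Build a deterministic, partly compressible test payload."""
    rng = random.Random(seed)
    block = rng.getrandbits(8 * 1024).to_bytes(1024, "little")
    return (block * (size // 1024 + 1))[:size]


def roundtrip_failures(rounds, size, seed=0):
    """Compress and decompress the payload repeatedly; count bad rounds."""
    data = make_payload(size, seed)
    expected = hashlib.sha256(data).hexdigest()
    failures = 0
    for _ in range(rounds):
        try:
            result = zlib.decompress(zlib.compress(data))
        except zlib.error:
            # A corrupted stream usually surfaces as a zlib error.
            failures += 1
            continue
        if hashlib.sha256(result).hexdigest() != expected:
            # Silent corruption: the stream decoded, but to wrong data.
            failures += 1
    return failures


def run_parallel(workers, rounds, size):
    """Run one round-trip loop per worker thread and sum the failures."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        jobs = [pool.submit(roundtrip_failures, rounds, size, seed)
                for seed in range(workers)]
        return sum(job.result() for job in jobs)


if __name__ == "__main__":
    # Hypothetical settings: 2 workers, 200 rounds of 8 MiB each.
    bad = run_parallel(workers=2, rounds=200, size=8 * 1024 * 1024)
    print("failed round trips:", bad)
```

On healthy hardware the count should stay at zero. A non-zero
count that moves around between runs would match the
unreproducible failures described above, although it would not say
whether RAM, CPU or something else is at fault.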

My questions are:
  (a) do you run a custom-built kernel?
  (b) do you use the binary nvidia driver?

If I recall correctly, the builds have so far only succeeded on
systems without binary drivers. But I could be wrong.

-- 
Julian Andres Klode  - Debian Developer, Ubuntu Member

See http://wiki.debian.org/JulianAndresKlode and http://jak-linux.org/.


