ac100 team mailing list archive

Re: Stability Under Load


On Sun, Aug 21, 2011 at 01:09:09PM +0100, Gordan Bobic wrote:
> I'm also curious how come my powertop is showing 1000MHz with no
> errors in the log when I set SM1 to 975mV.
975mV might be equal to 1000mV; there is some rounding up involved,
and as far as I know, the voltage steps are 50mV.
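
To illustrate the rounding idea only: a minimal sketch, assuming the
regulator snaps a requested value up to the next point on a fixed grid.
The function name and the `base` parameter are hypothetical, and the
50mV step is the unconfirmed figure from above; none of this is taken
from the kernel source.

```python
def round_up_to_step(millivolts, step=50, base=0):
    """Round a requested voltage up to the next grid point.

    The grid is base, base + step, base + 2*step, ...
    Both `step` and `base` are assumptions, not values read from hardware.
    """
    if step <= 0:
        raise ValueError("step must be positive")
    offset = millivolts - base
    steps = -(-offset // step)  # ceiling division on integers
    return base + steps * step


# Grid anchored at 0 mV: 975 is not a multiple of 50, so it rounds up.
print(round_up_to_step(975))            # 1000
# Grid anchored at 725 mV: 975 already lies on the grid and is unchanged.
print(round_up_to_step(975, base=725))  # 975
```

Whether 975mV ends up as 1000mV therefore depends on where the grid
starts, which is something I have not verified.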

> >>>Harmony sets minimum voltage to 750 and maximum voltage to 1125,
> >>>maybe that gives more stability?
> >>
> >>Interesting. It also occurs to me that just tweaking voltages
> >>(which, again, would be much easier if they were run-time adjustable
> >>via /sys as I said in a previous post), it would be really handy to
> >>get core temperature readings? Does the AC100 have temperature
> >>sensors built in?
> >No, we don't know what happens if we started exposing the various
> >settings somewhere, when they are read, etc. Too unsafe, in my
> >opinion.
> Provided there are limit checks set in place (e.g. hard-code a limit
> check so that you can't set the voltage > 1250mV), I don't see what
> harm could come of it, other than making stability stress testing
> easier.
They are exposed in /sys/class/regulator/, but read-only. There's
also read-only /sys/kernel/debug/clock/dvfs for the current dvfs

The question is if the kernel code is ready for having those values
changed at run-time, or whether it reads them at start and builds
other structures out of them. We don't know this, that's why I
said it shouldn't be done.
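
For reading the values without changing anything, something like the
following could be used. It is a sketch that assumes the usual regulator
class layout, /sys/class/regulator/regulator.N/ with the attributes
name, microvolts, min_microvolts and max_microvolts; which of these a
given regulator actually provides varies, so unreadable ones are
skipped. The function name is made up for this example.

```python
import os


def read_regulators(base="/sys/class/regulator"):
    """Return {regulator dir: {attribute: value}} for readable attributes.

    Only reads; nothing is ever written to sysfs.
    """
    wanted = ("name", "microvolts", "min_microvolts", "max_microvolts")
    result = {}
    if not os.path.isdir(base):
        return result
    for entry in sorted(os.listdir(base)):
        info = {}
        for attr in wanted:
            path = os.path.join(base, entry, attr)
            try:
                with open(path) as f:
                    info[attr] = f.read().strip()
            except OSError:
                # Attribute missing or not readable for this regulator.
                continue
        if info:
            result[entry] = info
    return result


if __name__ == "__main__":
    for reg, info in read_regulators().items():
        print(reg, info)
```

The values are reported in microvolts, so a 975mV limit would show up
as 975000.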

> >>On an unrelated note, I noticed an interesting possible correlation
> >>of an error in my message log with instability that I am currently
> >>investigating. It is possible that I have been barking up a
> >>completely wrong tree so far. I need to do some more investigating
> >>(A _LOT_ of SLUB memory allocation failures, possibly to do with
> >>zram swapping and/or the size of vmalloc set on the kernel command
> >>line).
> >SLUB errors come from rt2800usb usually, without the module loaded,
> >the errors should vanish. You could also try using SLAB instead of
> >SLUB.
> Yes, I did notice that the rt* modules were in the error dump. I
> don't remember seeing the option in the kernel config to choose SLAB
> over SLUB. Where is it?
In init/Kconfig, aka "General setup", "Choose SLAB allocator". There
you can choose between SLAB, SLUB, and SLOB.
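
To see which allocator an existing build uses, the .config can be
checked for CONFIG_SLAB, CONFIG_SLUB or CONFIG_SLOB. A small sketch;
the function name and the sample text are invented for illustration:

```python
def selected_allocator(config_text):
    """Return "SLAB", "SLUB" or "SLOB" if enabled in the config text, else None."""
    for line in config_text.splitlines():
        line = line.strip()
        for name in ("SLAB", "SLUB", "SLOB"):
            if line == "CONFIG_%s=y" % name:
                return name
    return None


# Made-up excerpt in the format of a kernel .config file.
sample = """
# CONFIG_SLAB is not set
CONFIG_SLUB=y
# CONFIG_SLOB is not set
"""
print(selected_allocator(sample))  # SLUB
```

For a real check, pass in the contents of the .config in the kernel
build directory instead of the sample text.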

> This stability problem is particularly frustrating because I saw the
> errors occurring on 2.6.29 which didn't have zram, so in theory, it
> can't be directly zram related (and I've been running zram on my
> SheevaPlug on kernel for ages with much heavier loads).
I don't have zram either.

> So I'm taking all my observations at the moment with a fist sized
> grain of salt. What is weird, however, is that I seem to be running
> completely stable today at 975mV SM1 set in board-paz00-power.c, and
> it's warmer than it was yesterday.
You always need to remember that this value is just a maximum for the
regulator, it is not fixed. In your case, the regulator scales from
725mV to 975mV, in 50(?) mV steps.

> The only other differences are:
> 1) Disabled zram swap (still have normal swap)
> 2) Changed vm.swappiness from 100 to 0
> 3) Unloaded rt* and related modules
> 4) Rebuilding the kernel (with -j4) instead of glibc
> The obvious difference with 4) is that glibc compile takes a lot
> more memory to compile than the kernel, which causes swapping. When
> the kernel compile finishes if there are no errors, I'll try the
> glibc building again. If that shakes it loose, the only thing I can
> think of is the vmalloc kernel boot parameter which came from the
> original Android setup (vmalloc=320M). I'm pretty sure this
> shouldn't be needed, but it is vaguely plausible it is causing
> issues under high memory pressure, at least in combination with
> other things that I have running.

It's most likely not memory related, but a bug in voltage
scaling. I currently have DVFS disabled, and the build seems
to be running without errors, whereas it would error out after
a few minutes with DVFS enabled.

The commit 1f8100366e46c626becd71a34cdcf7976570ea11[1] for seaboard might
be of interest; it reduces the slew rate to keep the voltage from going
down too quickly.

[1] http://gitorious.org/~marvin24/ac100/marvin24s-kernel/commit/1f8100366e46c626becd71a34cdcf7976570ea11

Julian Andres Klode  - Debian Developer, Ubuntu Member

See http://wiki.debian.org/JulianAndresKlode and http://jak-linux.org/.
