kernel-packages team mailing list archive

[Bug 1013807] [NEW] transparent hugepages and thrashing on amd64

 

You have been subscribed to a public bug:

I seem to have found a solution to a severe thrashing/swapping/freezing
problem that I've been having for months now.  I guess the real question
is - should I turn it into a bug report and what would be useful data to
include if so.

This is a quad-core AMD Phenom system with 4 GB of RAM, dual monitors and
a single 1 TB WD Caviar Black HD.  It had been behaving normally until
something broke sometime late in the 11.x release cycle, and the problem
continues in the current 12.04 LTS.  The symptoms: while running a
moderate load of apps (firefox with ~8 tabs, a terminal or 2, and
aisleriot solitaire for example) the system freezes, and the entire UI
becomes totally unresponsive for anywhere from 20 seconds to 5 minutes
with solid disk activity.  Watching iotop and top shows jbd2 and kswapd
accounting for the largest load, but since the freeze hits iotop like
everything else I can't tell what's going on during the worst storms.
Googling around shows a fair number of other people with similar
problems, most of them with multi-core amd64 systems.
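
Since interactive tools like iotop freeze along with everything else,
one option is to compare kernel counters taken before and after a storm.
Here is a minimal Python sketch, assuming /proc/vmstat exposes the usual
pswpin/pswpout swap counters and thp_* counters (the exact set of thp_*
names varies by kernel version, so the code only matches on the prefix).
The function names are made up for illustration.

```python
import time
from pathlib import Path

VMSTAT = Path("/proc/vmstat")  # standard procfs location


def parse_vmstat(text):
    """Turn 'name value' lines into a dict of integer counters."""
    counters = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            counters[parts[0]] = int(parts[1])
    return counters


def interesting_delta(before, after):
    """Return the change in swap and transparent hugepage counters."""
    keys = [k for k in after
            if k.startswith("thp_") or k in ("pswpin", "pswpout")]
    return {k: after[k] - before.get(k, 0) for k in keys}


def sample_interval(seconds):
    """Snapshot /proc/vmstat twice, 'seconds' apart, and return the delta."""
    before = parse_vmstat(VMSTAT.read_text())
    time.sleep(seconds)
    after = parse_vmstat(VMSTAT.read_text())
    return interesting_delta(before, after)
```

Calling sample_interval(60) from a script started before the freeze and
saving the result to a file would give numbers that can be attached to a
bug report, even if the terminal itself is unresponsive in the meantime.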

The other day I spotted this report on opensuse that looked similar but
not identical:

http://lists.opensuse.org/opensuse/2012-03/msg00657.html

I booted with the grub parameter transparent_hugepage=never yesterday
and the problem went away and hasn't come back. I've stressed the
system by running a bunch of flash/java tabs in firefox, running a
large java based stock app (ThinkorSwim) in another workspace and
playing a 1080p 60fps movie in a third workspace.  This certainly causes
swapping, but not freezing or stumbling.  It actually did a bit of
swapping a minute ago while I was typing and it managed to make Pandora
radio stumble for a moment - but that's orders of magnitude better than
it has been.
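
To confirm that the boot parameter actually took effect, the current
mode can be read back from sysfs.  A minimal Python sketch, assuming the
mainline path /sys/kernel/mm/transparent_hugepage/enabled, where the
active option is shown in square brackets (for example
"always madvise [never]").  The helper name is made up for illustration.

```python
from pathlib import Path

# Mainline sysfs location; some distro kernels may use a different path.
THP_ENABLED = Path("/sys/kernel/mm/transparent_hugepage/enabled")


def active_thp_mode(contents):
    """Return the option wrapped in [brackets], i.e. the active mode."""
    for token in contents.split():
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    raise ValueError("no active mode found in %r" % contents)


if __name__ == "__main__":
    if THP_ENABLED.exists():
        print(active_thp_mode(THP_ENABLED.read_text()))
    else:
        print("transparent hugepage sysfs file not found")
```

After booting with transparent_hugepage=never this should report "never";
any other value would suggest the parameter was not picked up.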

I think there may be a fundamental problem with how transparent
hugepages are handled with some AMD CPUs.  I think this problem started
when this feature was implemented and enabled by default. The manpage
for madvise() says this was added in 2.6.38, but I don't know if it was
enabled by default at that point.

Here's a partial list of things that haven't worked well in the past:

Playing with the swappiness value: setting swappiness to very low values
makes the problem take longer to surface, but (unsurprisingly) makes it
even worse once it does.

swapoff -a ; swapon -a: this makes it go away for a while.  A potentially
interesting thing is that as soon as I can get the system to act on the
swapoff -a the system becomes responsive again. It pegs one CPU core at
100% and the HD grinds like crazy but it stops freezing right away.

Moving swap from the HD to a USB thumb drive: Obviously I didn't expect
that to be faster but wanted to see if segregating swap to a different
device on a different bus would make it swap more smoothly - it didn't.

Playing with nice and ionice priorities for jbd2 and kswapd: running
these processes at a lower priority than anything else on the system
makes no difference, which leads me to think they were just symptoms and
not the root of the problem.

I think this may be the tip of the iceberg and there may be a lot of
others having this problem.  Looking around I see a fair number of
reports, most of them unsolved.  Some may have been fixed by just adding
enough RAM that dirty hugepages just don't collect.  Some may have been
fixed by changing filesystems - ext4 seems like something a lot of people
with this problem have in common.

Workaround:
Hold down the spacebar during boot to bring up the grub menu, edit the kernel command line and add
transparent_hugepage=never

If this fixes the problem you can make it permanent by editing
/etc/default/grub, adding transparent_hugepage=never to the
GRUB_CMDLINE_LINUX_DEFAULT line and then running update-grub.
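
For anyone who prefers to script the edit, here is a minimal Python
sketch of the text change only.  It assumes the
GRUB_CMDLINE_LINUX_DEFAULT value is wrapped in double quotes, as in the
stock Ubuntu file; the function name and the sample contents are made up
for illustration.  It works on a string and does not touch
/etc/default/grub itself, and update-grub still has to be run afterwards.

```python
import re


def add_kernel_param(grub_text, param):
    """Append param to GRUB_CMDLINE_LINUX_DEFAULT unless already present."""
    pattern = re.compile(
        r'^(GRUB_CMDLINE_LINUX_DEFAULT=")([^"]*)(")', re.MULTILINE)

    def repl(match):
        params = match.group(2).split()
        if param not in params:
            params.append(param)
        return match.group(1) + " ".join(params) + match.group(3)

    new_text, count = pattern.subn(repl, grub_text)
    if count == 0:
        raise ValueError("GRUB_CMDLINE_LINUX_DEFAULT line not found")
    return new_text


# Hypothetical file contents, for illustration only.
sample = 'GRUB_DEFAULT=0\nGRUB_CMDLINE_LINUX_DEFAULT="quiet splash"\n'
print(add_kernel_param(sample, "transparent_hugepage=never"))
```

Running the function twice leaves the line unchanged the second time, so
it is safe to re-apply.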

Problems with this workaround:
1) Transparent hugepages should work.  Disabling them may cause a small performance hit in some situations and a larger hit in others.
2) If you do this you will probably never know when or if the underlying problem actually gets fixed.

PS: Lars Müller [ˈlaː(r)z ˈmʏlɐ]
Samba Team
SUSE Linux, Maxfeldstraße 5, 90409 Nürnberg, Germany

is looking for bugzilla reports on this too.

** Affects: linux (Ubuntu)
     Importance: Undecided
         Status: New


** Tags: bot-comment
-- 
transparent hugepages and thrashing on amd64
https://bugs.launchpad.net/bugs/1013807
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.