kernel-packages team mailing list archive

[Bug 605773] Re: Wrong kernel setting zone_reclaim_mode leads to performance problems

To: kernel-packages@xxxxxxxxxxxxxxxxxxx
From: Andras Fabian <605773@xxxxxxxxxxxxxxxxxx>
Date: Thu, 15 Aug 2013 11:36:23 -0000
Reply-to: Bug 605773 <605773@xxxxxxxxxxxxxxxxxx>
Sender: bounces@xxxxxxxxxxxxx

Hi,

Well, the problem was "solved" with the described workaround (setting
"zone_reclaim_mode = 0" manually). Since then, we have never had this
issue again. To be honest, I was even surprised to see new activity on
this issue (as I reported it over 3 years ago :-) ) and really had to
read it before remembering what was going on at all.

The affected servers are still in use, but usually they have no real
maintenance windows (in the classic sense), nor do we have the time and
resources to experiment with different kernel versions on them.

Nor do I see a way to sanely run apport-collect, as the server has
no X, and the text GUI is really hard to work with (for example, I didn't
find a way to supply the necessary email address ... etc.)

Andras Fabian

--
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/605773

Title:
Wrong kernel setting zone_reclaim_mode leads to performance problems

Status in “linux” package in Ubuntu:
Incomplete

Bug description:
Binary package hint: linux-image-server

--------------------------------------------------
Description: Ubuntu 10.04 LTS
Release: 10.04
--------------------------------------------------
linux-image-server version:
Installed: 2.6.32.22.23
--------------------------------------------------

The background of this problem - or how I discovered it - is the
migration of a PostgreSQL database server from old hardware + old OS to
new hardware + new OS. The transition was no problem, but after we started
using the server in production, we discovered a strange problem during
nightly backups. The runtime of the backups went up from 2 1/2 hours
to 6 1/2 hours (despite the fact that the new hardware was designed
to have much more power ... which clearly showed up in most other
tasks!).

A longer investigation of the issue, drawing on the knowledge of many
helpful people on the PostgreSQL mailing list, finally helped to find the
reason for this slowdown. It turned out to be a problem in the VM part
of the kernel! In some situations where a lot of memory was consumed
for caching purposes (which easily happens while backing up
100 GByte DBs), congestion occurred in the VM, which slowed down the
process dramatically.

After in-depth analysis of many parts (via the /proc file system, ps,
strace, etc.) and comparison with the settings on the old machines, I
finally found an essential kernel setting, vm.zone_reclaim_mode, that was
solely responsible for the issue. Luckily I could construct a simple test
scenario (COPY-to-STDOUT: exporting the data from a database table via
stdout and writing it via a pipe to the file system) where I could
reproduce the issue. Our new server had the value zone_reclaim_mode = 1
set, whereas our old servers used zone_reclaim_mode = 0. By switching
this value back and forth (via sysctl), I could easily slow the
experimental export process down to a crawl, or let it run normally again.
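
The back-and-forth switching described above can be sketched in Python as follows. This is only a minimal illustration, not the procedure used in the original analysis: it assumes the setting is exposed at the usual procfs path /proc/sys/vm/zone_reclaim_mode (the file behind "sysctl vm.zone_reclaim_mode"), and the function names are hypothetical. Writing to the real file requires root.

```python
from pathlib import Path

# Usual procfs location of the setting (assumed; this is the file that
# `sysctl vm.zone_reclaim_mode` reads and writes).
ZONE_RECLAIM_PATH = Path("/proc/sys/vm/zone_reclaim_mode")


def read_zone_reclaim_mode(path=ZONE_RECLAIM_PATH):
    """Return the current zone_reclaim_mode value as an integer."""
    return int(Path(path).read_text().strip())


def write_zone_reclaim_mode(value, path=ZONE_RECLAIM_PATH):
    """Write a new zone_reclaim_mode value (needs root for the real file)."""
    if value < 0:
        raise ValueError("zone_reclaim_mode must be a non-negative integer")
    Path(path).write_text(f"{value}\n")
```

Calling write_zone_reclaim_mode(0) and then write_zone_reclaim_mode(1) while the export is running would correspond to the toggling described above. A change made this way does not survive a reboot; a persistent setting would normally go into the sysctl configuration instead.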

The complete path of the analysis can be viewed on the PostgreSQL mailing list here:
(there is also a description of how the problem can be reproduced and what the many symptoms are)
http://archives.postgresql.org/pgsql-general/2010-07/msg00267.php

Now, the conclusion to use "zone_reclaim_mode = 0" on our type of hardware was further strengthened by a very interesting thread on LKML, where kernel developers discussed potential issues with this setting. You can read it here:
http://lkml.org/lkml/2009/5/12/586

That discussion boils down to the fact that, for some reasons
(described there in detail), the Linux kernel thinks on modern CPU
architectures (our new servers use Core i7 generation CPUs, which are
explicitly mentioned!) that it has a NUMA architecture. And for NUMA
architectures it automatically enables "zone_reclaim_mode = 1" ...
even though it is wrong, and not even recommended under many
circumstances. Interestingly, even most posters in the LKML thread
think that it would be better to always(!) default this value to
"zone_reclaim_mode = 0" instead of making some automatic decision.
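
Whether the kernel treats a given machine as NUMA can be checked roughly by counting the node directories it exports in sysfs. The sketch below assumes the usual layout /sys/devices/system/node/nodeN; the function name is hypothetical, and the result is only an indication, not the kernel's actual decision logic for the default.

```python
import re
from pathlib import Path

# Usual sysfs location of per-node directories (node0, node1, ...); assumed.
NODE_DIR = Path("/sys/devices/system/node")


def count_numa_nodes(base=NODE_DIR):
    """Count the nodeN directories found under the given sysfs directory."""
    base = Path(base)
    if not base.is_dir():
        return 0  # no NUMA information exported at this location
    return sum(
        1
        for entry in base.iterdir()
        if entry.is_dir() and re.fullmatch(r"node\d+", entry.name)
    )
```

A result greater than 1 suggests that the kernel sees several memory nodes, which is the situation in which the automatic "zone_reclaim_mode = 1" default described above would be worth checking.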

Some more detail on what zone_reclaim_mode does can also be found here:
http://www.linuxinsight.com/proc_sys_vm_zone_reclaim_mode.html
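
For quick reference, the value is a bitmask. The sketch below decodes it using the bit meanings given in the kernel's sysctl documentation for vm.zone_reclaim_mode (1, 2 and 4); the function and dictionary names are hypothetical and only serve as illustration.

```python
# Bit meanings as listed in the kernel's vm sysctl documentation.
ZONE_RECLAIM_FLAGS = {
    1: "zone reclaim on",
    2: "zone reclaim writes dirty pages out",
    4: "zone reclaim swaps pages",
}


def describe_zone_reclaim_mode(value):
    """Return the list of behaviours enabled by a zone_reclaim_mode value."""
    if value == 0:
        return ["zone reclaim off"]
    # Keep every flag whose bit is set in the value.
    return [label for bit, label in ZONE_RECLAIM_FLAGS.items() if value & bit]
```

With this, the value 1 found on the affected server decodes to plain "zone reclaim on", and the workaround value 0 to "zone reclaim off".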

Now, I don't know why this "defaulting to 0" is still not in the
mainline kernels. That discussion from May 2009 on LKML died down, and
apparently no one felt responsible for committing the patches (even
though one of the participants had obviously already prepared some!). BUT, I
would ask the Ubuntu team to maybe act on their own and provide a way
to fix this issue in Ubuntu 10.04 LTS (because some reports on
the net suggest that "zone_reclaim_mode = 1" can harm
performance in many ways)! And I believe that I will not be the only
PostgreSQL admin affected by this issue!

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/605773/+subscriptions