sts-sponsors team mailing list archive
Message #00594
[Bug 1316970] Re: g_dbus memory leak in lrmd
Hi Dan, I made some progress on the investigation (nothing definitive,
but it still helps us continue with the SRU process).
Using Valgrind's Memcheck analyzer, I couldn't observe any non-constant
leaks once Seyeong's patch was applied. I then used two more analyzers
to learn more about lrmd's memory behavior: DHAT and Massif, both part
of the Valgrind tool suite.
DHAT (dynamic heap analysis tool) allowed me to test some hypotheses.
By running the experiment 3 times, first for 10 minutes, then for
20 minutes and finally for 30 minutes, I observed that:
1) lrmd allocated a total of 20,671,285 bytes in the 10 minute run,
38,486,917 in the 20 minute run and 56,338,077 in the 30 minute run;
2) In all 3 cases, the total leaked memory (from the GLib library) was
104,146 bytes, a constant value;
3) In all 3 cases, 20,673 bytes lived for more than half of the run,
meaning the application allocates and deallocates memory over short
intervals, rather than holding allocated memory until it exits (when
it would free all chunks).
Number #1 above indicates the total memory allocated - it doesn't mean
this entire amount was live at the same time. It sums the sizes of all
allocations made through malloc-like functions during the program
execution.
Number #2 indicates we have a constant amount of leaked memory, that is
not increasing and so is not responsible for the slow memory increase
we're observing.
Finally, number #3 shows us that this is not a case of a program
allocating memory constantly and only deallocating all chunks at once
when the application ends. This was one of my hypotheses, now proved
false.
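As a quick sanity check on the three numbers above, the sketch below
divides each DHAT total by its run length and compares the constant
leak against it. Only the figures come from the runs described above;
the variable and function names are made up for illustration.

```python
# Figures reported above: (run length in minutes, total bytes allocated).
runs = [(10, 20_671_285), (20, 38_486_917), (30, 56_338_077)]
LEAKED_BYTES = 104_146  # leaked amount, identical in all three runs


def allocation_rate(minutes, total_bytes):
    """Average number of bytes allocated per minute of run time."""
    return total_bytes / minutes


rates = [allocation_rate(minutes, total) for minutes, total in runs]

for (minutes, total), rate in zip(runs, rates):
    leaked_share = LEAKED_BYTES / total * 100
    print(f"{minutes:2d} min: {rate:,.0f} bytes/min allocated, "
          f"leaked share of total {leaked_share:.2f}%")
```

The per-minute rate stays in the same range (about 1.9 to 2.1 million
bytes per minute) across the three runs, while the leaked amount does
not move, so its share of the total shrinks as the run gets longer.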
That said, it's clear that heap-wise the application is not leaking an
increasing amount of memory. As for the stack, running the application
under the Massif analyzer makes it possible to observe stack usage: the
maximum stack size was 5008 bytes and the minimum was 744 bytes. It
moved between those 2 limits irregularly, increasing and decreasing
multiple times over the run. This indicates the stack has little
influence on the issue.
After that I started observing the application's /proc/<pid>/smaps,
and it showed an important data point: the "area" that is growing is an
anonymous non-heap mapping, so it was allocated through the mmap()
syscall. The Valgrind tools used above track malloc-like allocations by
default, not memory mapped directly with mmap(), so they are likely to
miss a possible leak if the memory in question was allocated through
mmap(). By "stracing" the application, I saw many mmap() calls, more
than munmap() calls. And by inspecting the GLib code, I could see
mmap() calls there (whereas the lrmd code itself has none). So it could
be a GLib wrapper causing this slow increase of memory.
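The smaps observation can be repeated with a small script. The sketch
below is only an illustration, not what was used in the investigation:
it lists mappings that have no pathname (so neither file-backed nor
[heap]/[stack]) together with their Rss. The sample text, including
the binary path and all addresses, is made up.

```python
import re

# Mapping header: "start-end perms offset dev inode [pathname]"
HEADER = re.compile(
    r"^([0-9a-f]+)-([0-9a-f]+)\s+(\S+)\s+\S+\s+\S+\s+\S+\s*(.*)$"
)

# Made-up sample in the smaps layout (hypothetical path and addresses).
SAMPLE = """\
00400000-00452000 r-xp 00000000 08:02 173521 /usr/lib/pacemaker/lrmd
Size:                328 kB
Rss:                 200 kB
01a3c000-01a5d000 rw-p 00000000 00:00 0 [heap]
Size:                132 kB
Rss:                  96 kB
7f3c54000000-7f3c54021000 rw-p 00000000 00:00 0
Size:                132 kB
Rss:                  64 kB
"""


def anonymous_mappings(smaps_text):
    """Return (address_range, rss_kb) for every mapping without a pathname."""
    result = []
    current = None  # address range of the anonymous mapping being read
    for line in smaps_text.splitlines():
        header = HEADER.match(line)
        if header:
            start, end, _perms, path = header.groups()
            current = f"{start}-{end}" if not path.strip() else None
        elif current and line.startswith("Rss:"):
            result.append((current, int(line.split()[1])))
    return result


print(anonymous_mappings(SAMPLE))
```

On a live system the text would come from reading /proc/<pid>/smaps
for the lrmd process; taking a snapshot every few minutes and comparing
the lists shows which anonymous mapping is the one growing.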
My last hypothesis is memory fragmentation, but I'd like to first
exclude or confirm the mmap() idea before going with the memory
fragmentation hypothesis.
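One way to put numbers on the mmap() idea is to tally the calls from an
strace log, for example one captured with
"strace -e trace=mmap,munmap -p <pid> -o <file>". The sketch below
assumes the usual strace line layout; the sample lines and the tally()
helper are made up for illustration and were not part of the
investigation.

```python
import re

# Successful mmap(): the second argument is the length, the result is an address.
MMAP = re.compile(r"\bmmap\((?:NULL|0x[0-9a-f]+), (\d+), .*\) = 0x[0-9a-f]+")
# Successful munmap(): the second argument is the length, the result is 0.
MUNMAP = re.compile(r"\bmunmap\(0x[0-9a-f]+, (\d+)\) = 0")

# Made-up sample lines in strace's output format.
SAMPLE = """\
mmap(NULL, 135168, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f3c54000000
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f3c5c1f0000
munmap(0x7f3c5c1f0000, 4096) = 0
"""


def tally(strace_text):
    """Count successful mmap()/munmap() calls and the bytes they cover."""
    counts = {"mmap_calls": 0, "munmap_calls": 0,
              "mapped_bytes": 0, "unmapped_bytes": 0}
    for line in strace_text.splitlines():
        mapped = MMAP.search(line)
        if mapped:
            counts["mmap_calls"] += 1
            counts["mapped_bytes"] += int(mapped.group(1))
            continue
        unmapped = MUNMAP.search(line)
        if unmapped:
            counts["munmap_calls"] += 1
            counts["unmapped_bytes"] += int(unmapped.group(1))
    return counts


totals = tally(SAMPLE)
print(totals, "net bytes:", totals["mapped_bytes"] - totals["unmapped_bytes"])
```

The call counts alone can mislead, since a single munmap() may release
several adjacent mappings and this sketch also counts file-backed
mappings; a net byte figure that keeps growing between runs of
different length would be the more telling sign.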
That all said, I believe we should continue the SRU process, since
Seyeong's patch has proved to be a valid fix for the heap leaks we had.
I intend to continue the investigation to understand exactly what
memory behavior in lrmd explains the slow but steady memory growth we
still see.
Thanks,
Guilherme
--
You received this bug notification because you are a member of STS
Sponsors, which is subscribed to the bug report.
https://bugs.launchpad.net/bugs/1316970
Title:
g_dbus memory leak in lrmd
Status in pacemaker package in Ubuntu:
Fix Released
Status in pacemaker source package in Trusty:
In Progress
Bug description:
[Impact]
The lrmd daemon leaks memory in Trusty when managing upstart resources.
Affects pacemaker 1.1.10.
Affects glib2.0 2.40.2-0ubuntu1; a new LP bug was created for glib2.0 [1].
Please note that I created the pacemaker patch myself.
[Test Case]
https://pastebin.ubuntu.com/p/fqK6Cx3SKK/
You can check for the memory leak with this script.
[Regression]
The daemon needs to be restarted after upgrading this package. This patch frees dynamically allocated memory that was previously never freed, which fixes the memory leak.
[Others]
I wrote and tested this patch myself.
Please review it carefully.
[1] https://bugs.launchpad.net/ubuntu/+source/glib2.0/+bug/1750741
[Original Description]
I'm running Pacemaker 1.1.10+git20130802-1ubuntu1 on Ubuntu Saucy
(13.10) and have encountered a memory leak in lrmd.
The details of the bug are covered here in this thread
(http://oss.clusterlabs.org/pipermail/pacemaker/2014-May/021689.html)
but to summarise, the Pacemaker developers believe the leak is caused
by the g_dbus API, the use of which was removed in Pacemaker 1.11.
I've also attached the Valgrind output from the run that exposed the
issue.
Given that this issue affects production stability (a periodic restart
of Pacemaker is required), will a version of 1.11 be released for
Trusty? (I'm happy to upgrade the OS to Trusty to get it).
If not, can you advise which version of the OS will be the first to
take 1.11 please?