yahoo-eng-team mailing list archive: Message #74779
[Bug 1793411] [NEW] Dashboard memory leaks
Public bug reported:
1. Issue description
Recently, we found that the server hosting the Horizon dashboard hit OOM
several times, caused by the Horizon services. After restarting the
dashboard, memory usage grows very quickly whenever the
/project/network_topology/ path is accessed.
2. How to reproduce
Log in to the dashboard, go to the 'Network Topology' tab, and leave it
open (auto-refresh is 10s by default). Then monitor the memory changes on
the host.
3. Versions and Components
Dashboard: stable/pike
Server: uWSGI 1.9.17-1
OS: Ubuntu 14.04 Trusty
Python: 2.7.6
As the memoized code has changed little since Pike, the issue should also
be reproducible on the Queens and Rocky releases.
4. The investigation
The root cause of the memory leak is the memoized decorator
(horizon/utils/memoized.py), which is used to cache function calls in
Horizon. After disabling it, the memory growth is under control.
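One simple way to disable the cache for such a test (an assumption on our
part; the report does not show how it was actually done) is to replace the
decorator with a pass-through stub:

    # Hypothetical stub: keeps the @memoized call sites working,
    # but caches nothing.
    def memoized(func):
        return func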
The following compares the per-request memory growth (measured with guppy)
for /project/network_topology:
- original (no code change): 684 KB
- with manual garbage collection: 185 KB
- with the memoized cache disabled: 10 KB
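For reference, here is a minimal sketch of how such per-request deltas can
be measured with guppy; the do_one_request() placeholder is our assumption,
since the report does not show the actual instrumentation (Python 2.7, as
in this report):

    from guppy import hpy

    def do_one_request():
        # Hypothetical placeholder: in the real measurement this would
        # issue one request to /project/network_topology.
        return [object() for _ in range(100)]

    hp = hpy()
    hp.setrelheap()   # later heap() calls are relative to this point
    leftover = do_one_request()
    print hp.heap()   # "Partition of a set of N objects. Total size = ..."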
As is known, memoized uses weakref to cache objects. A weak reference to
an object is not enough to keep the object alive: when the only remaining
references to a referent are weak references, garbage collection is free
to destroy the referent and reuse its memory for something else.
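A minimal demonstration of that property (the Token class here is purely
illustrative):

    import weakref

    class Token(object):
        pass

    obj = Token()
    ref = weakref.ref(obj)
    print ref() is obj    # True: the strong reference keeps obj alive

    del obj               # drop the only strong reference
    print ref()           # None: the weak reference alone did not keep it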
In memory we can see many weakref-related objects; the following is an
example:
Partition of a set of 394 objects. Total size = 37824 bytes.
 Index  Count   %     Size   % Cumulative  %  Kind (class / dict of class)
     0    197  50    18912  50      18912  50  _cffi_backend.CDataGCP
     1    197  50    18912  50      37824 100  weakref.KeyedRef
But the rest of the leaked objects are not weak references. The following
shows the object growth per /project/network_topology access, with garbage
collection run manually:
Partition of a set of 1017 objects. Total size = 183680 bytes.
 Index  Count   %     Size   % Cumulative  %  Referrers by Kind (class / dict of class)
     0    419  41    58320  32      58320  32  dict (no owner)
     1    100  10    23416  13      81736  44  list
     2    135  13    15184   8      96920  53  <Nothing>
     3      2   0     6704   4     103624  56  urllib3.connection.VerifiedHTTPSConnection
     4      2   0     6704   4     110328  60  urllib3.connectionpool.HTTPSConnectionPool
     5      1   0     3352   2     113680  62  novaclient.v2.client.Client
     6      2   0     2096   1     115776  63  OpenSSL.SSL.Connection
     7      2   0     2096   1     117872  64  OpenSSL.SSL.Context
     8      2   0     2096   1     119968  65  Queue.LifoQueue
     9     12   1     2096   1     122064  66  dict of urllib3.connectionpool.HTTPSConnectionPool
Most of them are dicts. The following shows those dict-referenced objects
partitioned by class; as you can see, most of them are not weakref objects:
Partition of a set of 419 objects. Total size = 58320 bytes.
 Index  Count   %     Size   % Cumulative  %  Class
     0    362  86    50712  87      50712  87  unicode
     1     27   6     3736   6      54448  93  list
     2      5   1     2168   4      56616  97  dict
     3     22   5     1448   2      58064 100  str
     4      2   0      192   0      58256 100  weakref.KeyedRef
     5      1   0       64   0      58320 100  keystoneauth1.discover.Discover
5. The issue
So the problem is that memoized does not behave as we expect: it allocates
memory to cache objects, but some of those objects can never be released.
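To illustrate one plausible mechanism consistent with the dumps above, here
is a simplified sketch in the spirit of horizon/utils/memoized.py (not the
actual code): arguments that cannot be weakly referenced, such as
per-request unicode tokens, become strongly held cache keys, so their
entries and the cached values are never evicted:

    import weakref

    cache = {}   # module-level, lives as long as the process

    def memoized(func):
        # Simplified sketch; the real decorator also registers cleanup
        # callbacks to evict entries whose weak referents have died.
        def wrapped(*args):
            key_parts = []
            for arg in args:
                try:
                    key_parts.append(weakref.ref(arg))  # evictable in principle
                except TypeError:
                    # unicode/str/int/dict cannot be weakly referenced:
                    # the argument itself becomes part of the key, held
                    # strongly, and can never be evicted.
                    key_parts.append(arg)
            key = (func, tuple(key_parts))
            if key not in cache:
                cache[key] = func(*args)   # the value is held strongly too
            return cache[key]
        return wrapped

    @memoized
    def get_client(token):     # stands in for a client factory
        return object()        # stands in for e.g. a novaclient Client

    for i in range(1000):      # 1000 "requests", each with a fresh token
        get_client(u"token-%d" % i)

    print len(cache)           # 1000: nothing was ever evicted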
** Affects: horizon
Importance: Undecided
Status: New
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Dashboard (Horizon).
https://bugs.launchpad.net/bugs/1793411
Title:
Dashboard memory leaks
Status in OpenStack Dashboard (Horizon):
New
To manage notifications about this bug go to:
https://bugs.launchpad.net/horizon/+bug/1793411/+subscriptions