openstack team mailing list archive
Message #10157
Swift replication analysis and questions (high CPU usage)
Hi there,
We're currently running a 3-node installation of Swift. CPU usage on
these nodes is always very high, around 50%, even with no users
connected, and we'd like to be sure nothing is wrong.
So I did some analysis, and the point of this mail is to have it
confirmed or invalidated.
Facts
=====
These 3 nodes host accounts, containers and objects.
The following processes take a lot of CPU time:
swift-container-replicator
swift-container-server
swift-object-replicator
swift-object-server
According to the logs and swift-recon, an object replication pass takes
around 5 minutes.
The options used are the defaults (nothing fancy).
We keep 3 replicas of each account, container and object.
Data volume:
- 450 GB of storage used (each node has 19 TB)
- 57 accounts
- 7929 containers in 7870 partitions
- 58158 object partitions used so far
Each ring has been built with 2^18 (262144) partitions.
The container and object syncs run at around 300 partitions/s.
Analysis
========
What's taking CPU time is replication. If I understand correctly, the
container and object replicators walk through their respective
directories and send a REPLICATE request to all the other servers
responsible for the same partition. Since there are only 3 servers for 3
replicas, every node holds every partition. That means that the
swift-{object,container}-replicator processes on host A send
REPLICATE requests to hosts B and C, making the
swift-{object,container}-server processes on hosts B and C use CPU to
return a response indicating whether the data must be rsync-ed or not.
For containers, this means walking ~8K directories, opening 8K SQLite
databases, sending 2 REPLICATE requests for each, and then doing
nothing (when everything is in sync, which is the case 99.999% of the
time).
For objects, this means walking ~58K directories, opening the hash file
of each of the 58K partitions, sending 2 REPLICATE requests for each,
and then doing nothing (when everything is in sync, which is the case
99.999% of the time).
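As a rough sanity check on the request volume described above, here is
a minimal sketch. It assumes one REPLICATE request per locally held
partition per other replica, which is my reading of the behaviour and
not something verified against the code. The function name is made up
for illustration.

```python
def replicate_requests_per_pass(partitions, replicas=3):
    """REPLICATE requests one host sends during a full pass.

    Assumption: one request per partition for each *other* replica.
    """
    return partitions * (replicas - 1)


# Partition counts taken from the "Facts" section.
container_requests = replicate_requests_per_pass(7870)
object_requests = replicate_requests_per_pass(58158)

print(container_requests)  # 15740
print(object_requests)     # 116316
```

With 3 hosts, each host also receives the same number of requests from
its two peers, which would explain why the *-server processes are busy
too.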
For containers, it does that every 30s (by default). With around 8K
containers, a pass takes more than 35s, so the next one starts right
away and swift-container-replicator keeps one CPU at 100 % all the time.
For objects, a pass takes 5 minutes and is followed by a 30s pause (by
default).
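The two scheduling patterns can be modelled with a small sketch. It
assumes the behaviour as I described it (containers: one pass per
interval, back to back when a pass overruns; objects: a fixed pause
after each pass). The function names are hypothetical and the model
ignores everything except pass time.

```python
def container_duty_cycle(pass_seconds, interval=30.0):
    """Busy fraction when a pass starts every `interval` seconds,
    or immediately if the previous pass overran the interval."""
    cycle = max(pass_seconds, interval)
    return pass_seconds / cycle


def object_duty_cycle(pass_seconds, pause=30.0):
    """Busy fraction when every pass is followed by a fixed pause."""
    return pass_seconds / (pass_seconds + pause)


container_busy = container_duty_cycle(35.0)  # pass longer than interval
object_busy = object_duty_cycle(5 * 60.0)    # 5 minute pass, 30s pause

print(container_busy)          # 1.0
print(round(object_busy, 3))   # 0.909
```

Under this model the container replicator never sleeps, and the object
replicator is busy about 91 % of the time.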
If this is the normal behaviour, I can then conclude that:
- we need more hosts to lower CPU usage, because once all 262K object
partitions are in use, CPU usage will explode and the object
synchronisation time will increase roughly fivefold.
For example, adding 2 more hosts would reduce the number of partitions
hosted on each host by 40 % (each would hold 60 % of the partition
space rather than 100 %).
- we should have used fewer partitions for such a small cluster,
probably something between 2^10 and 2^14 (we'll add more hosts, but
probably not thousands).
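The arithmetic behind these two conclusions can be sketched as follows.
It assumes a perfectly balanced ring with equally weighted hosts, and a
pass time that grows linearly with the number of partitions in use.
Both are simplifications, and the function names are made up for
illustration.

```python
def fraction_per_host(hosts, replicas=3):
    """Share of the partition space each host holds (balanced ring)."""
    return min(1.0, replicas / hosts)


def projected_pass_minutes(current_minutes, current_partitions,
                           future_partitions):
    """Linear extrapolation of the replication pass time."""
    return current_minutes * future_partitions / current_partitions


print(fraction_per_host(3))  # 1.0: every host holds every partition
print(fraction_per_host(5))  # 0.6: 40 % fewer partitions per host

# 5 minute pass today with 58158 partitions in use, 2^18 when full.
projected = projected_pass_minutes(5, 58158, 2 ** 18)
print(round(projected, 1))   # 22.5
```

So the linear model gives about 22.5 minutes per object pass on a full
ring, about 4.5 times today's figure.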
What I did so far is to increase the number of workers and the
concurrency setting. Setting concurrency to 32 (each host has 8 CPU
cores) for the object replicator pushed CPU usage to 80 % (reminder:
50 % before) but cut the replication time by more than half (~2 minutes
instead of 5 minutes).
This obviously doesn't help with CPU usage, it's worse, but at least it
takes just 30 points more CPU (80 % instead of 50 %) to run the same
pass twice in the same time frame.
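To compare the two settings, here is a rough sketch of passes per hour
against overall CPU usage. It assumes the default 30s pause after each
object pass and treats the reported CPU percentages as constant over
time, which is a simplification. The helper name is hypothetical.

```python
def passes_per_hour(pass_seconds, pause=30.0):
    """Object replication passes completed per hour, pause included."""
    return 3600.0 / (pass_seconds + pause)


before = passes_per_hour(5 * 60)  # default concurrency, ~5 minute pass
after = passes_per_hour(2 * 60)   # concurrency = 32, ~2 minute pass

# Passes per hour for each point of CPU usage (50 % vs 80 %).
before_per_cpu = before / 50
after_per_cpu = after / 80

print(round(before, 1), round(before_per_cpu, 3))  # 10.9 0.218
print(round(after, 1), round(after_per_cpu, 3))    # 24.0 0.3
```

By this measure the higher concurrency does more replication work per
point of CPU, even though total CPU usage goes up.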
Thanks for any hint or helpful comment about this!
--
Julien Danjou
// eNovance http://enovance.com
// ✉ julien.danjou@xxxxxxxxxxxx ☎ +33 1 49 70 99 81