openstack team mailing list archive
Message #14095
Re: [Swift] [Storage node] Lots of timeouts in load test after several hours around 1,000,000 operations
I found that running the updater and replicator could improve this issue.
In my original practice, to get the best performance, I only started the main
workers (account-server, container-server, object-server) and kept
uploading / downloading / deleting objects over 1,000,000 times.
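For reference, this is roughly how the daemons were managed; a sketch using
swift-init (exact daemon names, and whether several can be listed in one
command, may vary slightly between releases):

# start only the main workers on each storage node
swift-init account-server container-server object-server start

# later, start the daemons that clear async pendings and replicate data
swift-init object-updater container-updater start
swift-init object-replicator container-replicator account-replicator start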
Issues:
1. XFS or Swift consumes a lot of memory for some reason. Does anyone know
what is being cached (or buffered; the reported "cached" usage is not that
high) in memory during this test? After running the container/object
replicators, all of that memory was released. I'm curious what the memory
holds. Is it all object metadata, or something else? (See the first sketch
after this list for one way to inspect it.)
2. Plenty of 10s timeouts in the proxy-server's log, caused by timing out
while waiting for the final status of a PUT from the storage node.
At the beginning, the object workers complain about a 3s timeout when
updating the container (the update is deferred as an async pending), but
there are not too many of those complaints. As more and more PUT / GET /
DELETE operations run, more and more timeouts happen.
It seems that the updater can improve this issue.
Could this behavior be related to the number of pending updates stored in
the pickle files? (See the second sketch after this list.)
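For issue 1, one way to check whether that memory is sitting in the kernel's
slab caches (XFS inode and dentry entries) rather than in the page cache;
these are plain Linux commands, nothing Swift-specific:

# top slab consumers; xfs_inode and dentry tend to dominate after heavy PUT load
slabtop -o | head -20
grep -E 'xfs_inode|dentry' /proc/slabinfo
# reclaimable slab memory is reported here rather than under "cached"
grep -E 'Slab|SReclaimable' /proc/meminfo

For issue 2, my understanding is that the 10s in the proxy log comes from the
proxy's node_timeout (default 10) and the 3s on the object server from the
object server's node_timeout for container updates (default 3). The deferred
updates are stored as pickle files under each device's async_pending
directory, so the backlog can be counted directly; a sketch assuming the
default /srv/node devices path:

# count pending container updates waiting for the object-updater
find /srv/node/*/async_pending -type f 2>/dev/null | wc -l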
Thanks
Hugo
2012/7/2 Kuo Hugo <tonytkdk@xxxxxxxxx>
> Hi all,
>
> I did several load tests for Swift in recent days.
>
> I'm facing an issue... Hope you can share your thoughts with me.
>
> My environment:
> Swift proxy with TempAuth on one server: 4 cores / 32 GB RAM
>
> Swift object + account + container servers on 3 storage nodes, each with:
> 8 cores / 32 GB RAM, 7 x 2 TB SATA HDD
>
> =====================================================================================
> bench.conf :
>
> [bench]
> auth = http://172.168.1.1:8082/auth/v1.0
> user = admin:admin
> key = admin
> concurrency = 200
> object_size = 4048
> num_objects = 100000
> num_gets = 100000
> delete = yes
> =====================================================================
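> (For reference, I invoke it roughly like this; a sketch, assuming swift-bench
> takes the conf file as its positional argument:)
>
> swift-bench bench.conf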
>
> After 70 rounds...
>
> PUT operations get lots of failures, but GETs still work properly.
> *ERROR log:*
> Jul 1 04:35:03 proxy-server ERROR with Object server
> 192.168.100.103:36000/DISK6 re: Trying to get final status of PUT to
> /v1/AUTH_admin/af5862e653054f7b803d8cf1728412d2_6/24fc2f997bcc4986a86ac5ff992c4370:
> Timeout (10s) (txn: txd60a2a729bae46be9b667d10063a319f) (client_ip:
> 172.168.1.2)
> Jul 1 04:34:32 proxy-server ERROR with Object server
> 192.168.100.103:36000/DISK2 re: Expect: 100-continue on
> /AUTH_admin/af5862e653054f7b803d8cf1728412d2_19/35993faa53b849a89f96efd732652e31:Timeout (10s)
>
>
> And the kernel starts to report failure messages as below.
> *kernel failure log:*
> Jul 1 16:37:50 angryman-storage-03 kernel: [350840.020736] w83795 0-002f: Failed to read from register 0x03c, err -6
> Jul 1 16:37:50 angryman-storage-03 kernel: [350840.052654] w83795 0-002f: Failed to read from register 0x015, err -6
> Jul 1 16:37:50 angryman-storage-03 kernel: [350840.080613] w83795 0-002f: Failed to read from register 0x03c, err -6
> Jul 1 16:37:50 angryman-storage-03 kernel: [350840.112583] w83795 0-002f: Failed to read from register 0x016, err -6
> Jul 1 16:37:50 angryman-storage-03 kernel: [350840.144517] w83795 0-002f: Failed to read from register 0x03c, err -6
> Jul 1 16:37:50 angryman-storage-03 kernel: [350840.176468] w83795 0-002f: Failed to read from register 0x017, err -6
> Jul 1 16:37:50 angryman-storage-03 kernel: [350840.208455] w83795 0-002f: Failed to read from register 0x03c, err -6
> Jul 1 16:37:51 angryman-storage-03 kernel: [350840.240410] w83795 0-002f: Failed to read from register 0x01b, err -6
> Jul 1 16:37:51 angryman-storage-03 kernel: [350840.272 (truncated)
> Jul 1 17:05:28 angryman-storage-03 kernel: imklog 6.2.0, log source = /proc/kmsg started.
>
> PUTs become slower and slower, from 1,200/s down to 200/s...
>
> I'm not sure if this is a bug or a limitation of XFS. If it's a limit of
> XFS, how can I improve it?
>
> An additional question: XFS seems to consume a lot of memory. Does anyone
> know the reason for this behavior?
>
>
> Appreciated...
>
>
> --
> +Hugo Kuo+
> tonytkdk@xxxxxxxxx
> +886 935004793
>
>
--
+Hugo Kuo+
tonytkdk@xxxxxxxxx
+886 935004793