openstack team mailing list archive
Message #16502
Re: [OpenStack][Swift][Replicator] Does object replicator push "exist" object to handoff node while a node/disk/network fails ?
You can force a replicator to push to a handoff node by unmounting the drive that holds one of the primary replicas.
--John
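
The rule in this thread (push to a handoff only when a primary reports its drive as bad, never on general errors) can be sketched as below. This is an illustrative sketch, not Swift's replicator code. `plan_pushes` and `probe` are hypothetical names, and it assumes an object server answers 507 Insufficient Storage for an unmounted device.

```python
# Hypothetical sketch of the handoff rule described in this thread;
# not the actual Swift object-replicator implementation.
HTTP_INSUFFICIENT_STORAGE = 507  # assumed reply for an unmounted/bad drive


def plan_pushes(primaries, handoffs, probe):
    """Return the nodes a replicator would try to sync to.

    probe(node) is a hypothetical callable that returns an HTTP status
    code, or raises OSError on a general failure (timeout, reboot,
    network partition).
    """
    targets = []
    spare = iter(handoffs)
    for node in primaries:
        try:
            status = probe(node)
        except OSError:
            # General error: skip this primary, do NOT use a handoff.
            continue
        if status == HTTP_INSUFFICIENT_STORAGE:
            # The primary says its drive is bad: use the next handoff.
            handoff = next(spare, None)
            if handoff is not None:
                targets.append(handoff)
            continue
        targets.append(node)
    return targets
```

With this rule, a rebooting or unreachable primary causes no handoff traffic, while a primary that reports an unmounted drive is replaced by the next handoff in the list.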
On Sep 6, 2012, at 9:00 AM, Kuo Hugo <tonytkdk@xxxxxxxxx> wrote:
> Hi folks, John and Chmouel,
>
> I posted a question about this a long time ago, and my test results match Chmouel's answer.
>
> https://answers.launchpad.net/swift/+question/191924
> "The object replicator will push an object to a handoff node if another primary node returns that the drive the object is supposed to go on is bad. We don't push to handoff nodes on general errors, otherwise things like network partitions or rebooting machines would cause storms of unneeded handoff traffic."
>
> But I read something different from John (or perhaps I misunderstood), so I want to clarify it.
>
> Assumptions :
> Storage Nodes : 5 (one per zone)
> Zones : 5
> Replica : 3
> Disks : 2*5 ( 1 disk/per node )
>
> Account AUTH_test
> Container Con_1
> Object Obj1
>
>
> Partition 3430
> Hash 6b342ac122448ef16bf1655d652bfe1e
>
> Server:Port Device 192.168.1.101:36000 DISK1
> Server:Port Device 192.168.1.102:36000 DISK1
> Server:Port Device 192.168.1.103:36000 DISK1
> Server:Port Device 192.168.1.104:36000 DISK1 [Handoff]
> Server:Port Device 192.168.1.105:36000 DISK1 [Handoff]
>
>
> curl -I -XHEAD "http://192.168.1.101:36000/DISK1/3430/AUTH_test/Con_1/Obj1"
> curl -I -XHEAD "http://192.168.1.102:36000/DISK1/3430/AUTH_test/Con_1/Obj1"
> curl -I -XHEAD "http://192.168.1.103:36000/DISK1/3430/AUTH_test/Con_1/Obj1"
> curl -I -XHEAD "http://192.168.1.104:36000/DISK1/3430/AUTH_test/Con_1/Obj1" # [Handoff]
> curl -I -XHEAD "http://192.168.1.105:36000/DISK1/3430/AUTH_test/Con_1/Obj1" # [Handoff]
>
>
> ssh 192.168.1.101 "ls -lah /srv/node/DISK1/objects/3430/e1e/6b342ac122448ef16bf1655d652bfe1e/"
> ssh 192.168.1.102 "ls -lah /srv/node/DISK1/objects/3430/e1e/6b342ac122448ef16bf1655d652bfe1e/"
> ssh 192.168.1.103 "ls -lah /srv/node/DISK1/objects/3430/e1e/6b342ac122448ef16bf1655d652bfe1e/"
> ssh 192.168.1.104 "ls -lah /srv/node/DISK1/objects/3430/e1e/6b342ac122448ef16bf1655d652bfe1e/" # [Handoff]
> ssh 192.168.1.105 "ls -lah /srv/node/DISK1/objects/3430/e1e/6b342ac122448ef16bf1655d652bfe1e/" # [Handoff]
>
> Case :
> Obj1 has already been uploaded to the 3 primary devices properly. What kind of failure on "192.168.1.101:36000 DISK1" will trigger the replicator to push a copy to the "192.168.1.104:36000 DISK1 [Handoff]" device?
>
> In my past tests, the replicator did not push a copy to a handoff node for an "existing" object. Whether it was a network failure, a machine reboot, or an unmounted disk, I think these are all the general errors Chmouel mentioned. But I'm not sure what "replicator will push an object to a handoff node if another primary node returns that the drive the object is supposed to go on is bad" means. How does the object-replicator know that the drive the object is supposed to go on is bad? (I think the replicator will never know. Does it work together with the object-auditor?)
>
> How can I produce a failure that triggers the replicator to push an object to a handoff node?
>
> As I understand it, for the replicator to push an object to a handoff node, the primary device must not have the object and the replicator must also be unable to push it to that device (192.168.1.101:36000 DISK1). The object might have been moved to quarantine because the object-auditor found it to be corrupted.
>
> So even though the disk (192.168.1.101:36000 DISK1) is still mounted, the target partition 3430 does not have Obj1. Another node's object-replicator tries to push its Obj1 to "192.168.1.101:36000 DISK1", but unluckily "192.168.1.101:36000 DISK1" is bad, so the object-replicator now pushes the object to "192.168.1.104:36000 DISK1 [Handoff]".
>
> That's my inference; please feel free to correct it. I'm really confused about how to produce the kind of failure that makes the replicator push an object to a handoff node.
> Any ideas would be great.
>
>
> Cheers
> --
> +Hugo Kuo+
> tonytkdk@xxxxxxxxx
> +886 935004793
>
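
The `curl` and `ls` commands in the quoted message follow a regular layout, so a small helper can generate the checks for every primary and handoff node. This is a minimal sketch inferred only from the paths shown in this thread. The function names are hypothetical, and the `/srv/node` root and the "last three hex characters of the hash" suffix directory are assumptions read off the example rather than confirmed Swift behaviour.

```python
import posixpath


def object_dir(device, partition, name_hash, root="/srv/node"):
    """Build the on-disk directory for an object, as seen in the thread.

    Assumption: the suffix directory is the last three characters of
    the object's hash (e.g. "e1e" for a hash ending in "...bfe1e").
    """
    suffix = name_hash[-3:]
    return posixpath.join(
        root, device, "objects", str(partition), suffix, name_hash
    )


def object_url(host, port, device, partition, account, container, obj):
    """Build the direct object-server URL used by the HEAD requests."""
    return "http://{}:{}/{}/{}/{}/{}/{}".format(
        host, port, device, partition, account, container, obj
    )


if __name__ == "__main__":
    name_hash = "6b342ac122448ef16bf1655d652bfe1e"
    for last_octet in range(101, 106):
        host = "192.168.1.{}".format(last_octet)
        print(object_url(host, 36000, "DISK1", 3430,
                         "AUTH_test", "Con_1", "Obj1"))
        print(object_dir("DISK1", 3430, name_hash))
```

Running the checks against all five nodes before and after a failure shows whether a copy has appeared on a handoff device.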