
openstack team mailing list archive

[Swift] LFS patch (Ia32c9c34)



I've submitted a patch (https://review.openstack.org/#/c/7101/) that helps Swift use special features of the file system it is running on.

One of the changes in this patch reduces the number of network replicas of a partition when the user has self-repairing mirrored devices. To use it, the user adds a mirror_copies parameter to each device. By default mirror_copies is 1 for all devices, so the code changes have no effect on current Swift deployments. For almost all systems, three singleton replicas can be replaced by two mirrored replicas. So if all of a user's devices are mirrored (mirror_copies >= 2), the number of network copies of most partitions is reduced, and operations like PUT and POST make fewer requests. The definition of mirroring specifically requires that the local file system detect a bad replica on its own, for example by calculating checksums of the content, and automatically repair data defects when they are discovered. So if one of the devices fails, recovery is done by the file system without copying data from another device. These changes were made in the ring builder and take effect only if mirror_copies > 1, so the code poses no danger to current Swift users, while giving other users a new option.
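To make the replica trade-off concrete, here is a minimal Python sketch. It is not the code from the patch: the function name network_replicas_for and the plain-dict device layout are made up for illustration, and the rule it encodes is only the one stated above (three singleton replicas versus two mirrored replicas, with mirror_copies defaulting to 1).

```python
def network_replicas_for(devices, singleton_replicas=3, mirrored_replicas=2):
    """Return the number of network replicas for a partition.

    Hypothetical helper, not the ring builder's real logic: it applies the
    "three singleton replicas can be replaced by two mirrored replicas" rule
    only when every candidate device is mirrored.
    """
    if not devices:
        return singleton_replicas
    # mirror_copies defaults to 1, meaning "not mirrored", so existing
    # deployments that never set it keep the usual replica count.
    all_mirrored = all(dev.get("mirror_copies", 1) >= 2 for dev in devices)
    return mirrored_replicas if all_mirrored else singleton_replicas


# Existing deployment: no device sets mirror_copies, so nothing changes.
legacy = [{"id": 0}, {"id": 1}, {"id": 2}]
# Deployment where every device is a self-repairing mirror.
mirrored = [{"id": 0, "mirror_copies": 2}, {"id": 1, "mirror_copies": 2}]

legacy_count = network_replicas_for(legacy)      # 3
mirrored_count = network_replicas_for(mirrored)  # 2
```

In this sketch a PUT to a partition on the mirrored deployment would fan out to two devices instead of three, which is where the saving in requests comes from.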

This patch also adds hooks that can be used to manipulate the file system when Swift operates on account, container, or object files. These hooks are used by middleware that lives in a separate project, so if the user does not install it, these changes have no effect.
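The idea that the hooks do nothing unless middleware registers a handler can be sketched as follows. This is only an illustration of the pattern, assuming a simple registry; the class HookRegistry, the event name "object_put", and the example path are hypothetical and are not taken from the patch or from the LFS middleware.

```python
class HookRegistry:
    """Hypothetical hook registry; not the API added by the patch."""

    def __init__(self):
        # Maps an event name to the list of callables registered for it.
        self._handlers = {}

    def register(self, event, handler):
        """Attach a handler that will be called with the file path."""
        self._handlers.setdefault(event, []).append(handler)

    def run(self, event, path):
        """Call every handler for the event and collect the results.

        With no handlers registered this returns an empty list, so a
        deployment without the middleware behaves as before.
        """
        return [handler(path) for handler in self._handlers.get(event, [])]


registry = HookRegistry()

# Without middleware installed: the hook point is a no-op.
before = registry.run("object_put", "/srv/node/sda/objects/1")

# Middleware, if installed, would register a file-system-specific handler.
registry.register("object_put", lambda path: "handled " + path)
after = registry.run("object_put", "/srv/node/sda/objects/1")
```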

This feature is enabled only for customers who have chosen to install the enabling software and turn it on, and it is easy to test that these patches have no impact on generic deployments.

Most of the patch code was restructured: most of the logic was moved to the middleware level and uses hooks in the Swift code. I created a separate project for it (LFS middleware, https://github.com/nexenta/lfs); for now it supports only two file system types (XFS and ZFS). The middleware also provides an API for getting file system status information (for example, the current pool status for ZFS).

Furthermore, the Nexenta side project is not the only local file system solution that could provide this form of local replication and data protection. Trading off between network replication and local replication is a valid performance decision. Insisting on a fixed amount of network replication, without regard to the degree of local protection against data loss, would effectively bias the system towards network replication only. Why pay for local data protection if you cannot reduce the amount you pay for network replication? This patch enables solutions that widen the range of choices users have for protecting their data.

Victor Rodionov
