openstack team mailing list archive
Message #18010
Re: Troubleshooting Swift 1.7.4 on mini servers
Ok, if I was giving out t-shirts for finding this issue then the prize
would go to Pete. Thank you!!!!
Disabling fallocate did the trick. I was slowly working my way through
all the object-server config options and hadn't gotten to that one yet.
Turning features on and off by brute force is admittedly lame, but
sometimes that's all you have.
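
For anyone who would rather test preallocation directly than toggle options, here is a minimal sketch of a probe. It assumes Python 3.3+ on Linux and uses os.posix_fallocate, which is NOT the same code path the Swift object-server uses, so a success here only says that the kernel and filesystem accept a preallocation request. The function name probe_fallocate and the 200K default (the size of bigfile3) are illustrative choices.

```python
import os
import tempfile


def probe_fallocate(directory, size=200 * 1024):
    """Try to preallocate `size` bytes in a scratch file under `directory`.

    Returns (True, None) on success, or (False, errno) if the OS refuses.
    The scratch file is deleted automatically when the block exits.
    """
    with tempfile.NamedTemporaryFile(dir=directory) as scratch:
        try:
            os.posix_fallocate(scratch.fileno(), 0, size)
        except OSError as exc:
            return False, exc.errno
        return True, None


if __name__ == "__main__":
    # Point this at the mounted storage device instead of the temp dir.
    print(probe_fallocate(tempfile.gettempdir()))
```

If this succeeds on the XFS mount while the object-server still fails with fallocate enabled, that would point at the layer between Swift and libc rather than at the filesystem.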
I also turned off all the other things I was doing to try to slow down the
mini-servers, but disabling fallocate was all that was necessary. Here is
my config:
[DEFAULT]
bind_ip = 192.168.1.202
workers = 1
disable_fallocate = true
[pipeline:main]
pipeline = object-server
[app:object-server]
use = egg:swift#object
[object-replicator]
[object-updater]
[object-auditor]
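
A quick way to confirm the setting is really present in a given config file is sketched below. It assumes Python 3 and the standard configparser module, which is not how Swift itself loads its configuration; the helper name fallocate_disabled is made up for illustration, and the sample text is just the config above.

```python
import configparser

SAMPLE_CONF = """\
[DEFAULT]
bind_ip = 192.168.1.202
workers = 1
disable_fallocate = true

[pipeline:main]
pipeline = object-server

[app:object-server]
use = egg:swift#object
"""


def fallocate_disabled(conf_text):
    """Return True if [DEFAULT] sets disable_fallocate to a true value."""
    # interpolation=None so a literal '%' in other options cannot break parsing.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(conf_text)
    return parser.getboolean("DEFAULT", "disable_fallocate", fallback=False)


if __name__ == "__main__":
    print(fallocate_disabled(SAMPLE_CONF))
```

To check a real node, read the contents of its object-server config file and pass the text to the helper.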
A few more details...
My servers are running Ubuntu 12.04 LTS. A straight-up apt-get of all the
prerequisites did NOT produce a working Swift deployment on ARM.
Although all the dependencies would deploy fine and the Swift services
would start up, the proxy-server could not communicate with the storage
nodes.
So I also had to get older armel versions of python-greenlet and
python-eventlet:
https://launchpad.net/ubuntu/precise/armel/python-greenlet/0.3.1-1ubuntu5.1
https://launchpad.net/ubuntu/precise/armel/python-eventlet/0.9.16-1ubuntu4.1
Once I deployed those older libraries for armel, my Swift cluster
worked (except for the fallocate issue).
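
To double-check which versions Python actually picks up after a downgrade like this, a small sketch follows. It assumes Python 3.8+ for importlib.metadata, which is newer than the Python that shipped with Ubuntu 12.04, so treat it as an illustration of the idea; installed_version is a hypothetical helper name.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version string for dist_name, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    for name in ("greenlet", "eventlet"):
        print(name, installed_version(name))
```

With the packages linked above, the reported versions should correspond to greenlet 0.3.1 and eventlet 0.9.16.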
Thanks for everyone's help.
-N
On Tue, Oct 30, 2012 at 11:07 AM, Nathan Trueblood
<nathan@xxxxxxxxxxxxxxxx> wrote:
> The filesystem is XFS, and I used the recommended mkfs and mount options
> for Swift.
>
> The file size seems to have no bearing on the issue, although I haven't
> tried really tiny files. Bigfile3 is only 200K.
>
> I'll try disabling fallocate...
>
>
> On Mon, Oct 29, 2012 at 7:37 PM, Pete Zaitcev <zaitcev@xxxxxxxxxx> wrote:
>
>> On Mon, 29 Oct 2012 18:16:52 -0700
>> Nathan Trueblood <nathan@xxxxxxxxxxxxxxxx> wrote:
>>
>> > Definitely NOT a problem with the filesystem, but something is causing the
>> > object-server to think there is a problem with the filesystem.
>>
>> If you are willing to go all-out, you can probably catch the
>> error with strace, if it works on ARM. Failing that, find all places
>> where 507 is generated and see if any exceptions are caught, by
>> modifying the source, I'm afraid to say.
>>
>> > I suspect a bug in one of the underlying libraries.
>>
>> That's a possibility. Or, it could be a kernel bug. You are using XFS,
>> right? If it were something other than XFS or ext4, I would suspect
>> ARM blowing over the 2GB barrier somewhere, since your object is
>> called "bigfile3". As it is, you have little option other than to divide
>> the layers until you identify the one that's broken.
>>
>> BTW, make sure to disable the fallocate, since we're at it.
>>
>> -- Pete
>>
>
>