sts-sponsors team mailing list archive
Message #06061
[Bug 1999816] Re: Failure to get free disk space breaks "rabbitmqctl status" command
In focal (I haven't yet checked the other releases in this regard),
rabbitmq (the server) seems to call "df" every 10s, regardless of
whether I run "rabbitmqctl status" or not:
# execsnoop-bpfcc -T -n df
TIME PCOMM PID PPID RET ARGS
13:17:48 df 7203 7202 0 /usr/bin/df -kP /var/lib/rabbitmq/mnesia/rabbit@f-rabbit
13:17:48 df.orig 7204 7203 0 /usr/bin/df.orig -kP /var/lib/rabbitmq/mnesia/rabbit@f-rabbit
13:17:58 df 7209 7208 0 /usr/bin/df -kP /var/lib/rabbitmq/mnesia/rabbit@f-rabbit
13:17:58 df.orig 7210 7209 0 /usr/bin/df.orig -kP /var/lib/rabbitmq/mnesia/rabbit@f-rabbit
13:18:08 df 7212 7211 0 /usr/bin/df -kP /var/lib/rabbitmq/mnesia/rabbit@f-rabbit
13:18:08 df.orig 7213 7212 0 /usr/bin/df.orig -kP /var/lib/rabbitmq/mnesia/rabbit@f-rabbit
13:18:18 df 7215 7214 0 /usr/bin/df -kP /var/lib/rabbitmq/mnesia/rabbit@f-rabbit
13:18:18 df.orig 7216 7215 0 /usr/bin/df.orig -kP /var/lib/rabbitmq/mnesia/rabbit@f-rabbit
13:18:28 df 7218 7217 0 /usr/bin/df -kP /var/lib/rabbitmq/mnesia/rabbit@f-rabbit
13:18:28 df.orig 7219 7218 0 /usr/bin/df.orig -kP /var/lib/rabbitmq/mnesia/rabbit@f-rabbit
13:18:38 df 7221 7220 0 /usr/bin/df -kP /var/lib/rabbitmq/mnesia/rabbit@f-rabbit
13:18:38 df.orig 7222 7221 0 /usr/bin/df.orig -kP /var/lib/rabbitmq/mnesia/rabbit@f-rabbit
(I used a /usr/bin/df wrapper that calls /usr/bin/df.orig "$@")
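For reference, a minimal sketch of how such a wrapper could be put in
place. It assumes the real binary was first moved aside to df.orig; the
helper name write_df_wrapper and its two arguments are made up for
illustration and are not the exact commands used here. The wrapper runs
the real binary as a child process, which would be consistent with the
PID/PPID pairs in the trace above.

```shell
# write_df_wrapper TARGET REAL_DF   (hypothetical helper, for illustration)
# Writes a tiny sh script to TARGET that forwards all of its arguments,
# unchanged, to the binary at REAL_DF.
write_df_wrapper() {
    target=$1
    real_df=$2
    # Unquoted heredoc: $real_df is expanded now, \$@ is kept literal
    # so that the generated script receives "$@".
    cat >"$target" <<EOF
#!/bin/sh
# Forward every argument to the real df binary.
"$real_df" "\$@"
EOF
    chmod +x "$target"
}
```

On a test machine this would be used, as root, roughly as: move
/usr/bin/df to /usr/bin/df.orig, then run
write_df_wrapper /usr/bin/df /usr/bin/df.orig.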
If you call "rabbitmqctl status" in between those df calls, you will get
the report from the last df run.
If I add the long sleep, and call rabbitmqctl status while that sleep is
running, then my status command hangs until the sleep is over, or a
timeout is reached.
How about changing the test case to have df exit without printing
anything?
Like:
cat <<EOF >$SH
#!/bin/sh
exit 0
EOF
I noticed that in this case (focal at least) the server calls df once,
probably notices it isn't working, and stops the repeated calls every
10s. I left it for a while and it looks like the new frequency is every
2min. Once df is working again (for example, if I let the wrapper call
df.orig), it resumes the 10s frequency.
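To compare the polling frequency in the two cases, the gaps between
consecutive df calls can be computed from the execsnoop output. A small
sketch, assuming the -T column layout shown earlier (TIME first, PCOMM
second); df_intervals is a hypothetical helper name, and it does not
handle a trace that crosses midnight.

```shell
# df_intervals   (hypothetical helper, for illustration)
# Reads "execsnoop -T" style lines on stdin and prints the number of
# seconds between consecutive calls whose PCOMM is exactly "df".
# Lines for df.orig and the column header are ignored.
df_intervals() {
    awk '$2 == "df" {
        split($1, t, ":")                     # TIME column is HH:MM:SS
        now = t[1] * 3600 + t[2] * 60 + t[3]
        if (seen) print now - prev            # gap since the previous df call
        prev = now
        seen = 1
    }'
}
```

With the trace saved to a file (trace.txt here is just an example
name), it can be run as: df_intervals <trace.txt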
--
You received this bug notification because you are a member of SE
("STS") Sponsors, which is subscribed to the bug report.
https://bugs.launchpad.net/bugs/1999816
Title:
Failure to get free disk space breaks "rabbitmqctl status" command
Status in rabbitmq-server package in Ubuntu:
Fix Released
Status in rabbitmq-server source package in Focal:
Fix Committed
Status in rabbitmq-server source package in Jammy:
Fix Committed
Status in rabbitmq-server source package in Kinetic:
Fix Committed
Bug description:
[Impact]
When for some reason the df command fails to get the free disk space
(for example, a timeout on a heavily loaded system), the result is a
hardcoded value of "unknown". As this is not a valid number, it
generates arithmetic errors when the "rabbitmqctl status" command is
run and tries to divide that value to convert it to another unit.
This has been fixed upstream here:
https://github.com/rabbitmq/rabbitmq-server/pull/4897
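The failure mode is the usual one of doing arithmetic on a placeholder
string. A rough shell illustration of the kind of guard involved (this
is not the upstream patch, which lives in the rabbitmq code itself;
format_free_space and the bytes-to-MB conversion are assumptions made
up for the example):

```shell
# format_free_space VALUE   (hypothetical helper, for illustration)
# VALUE is either a free-space figure in bytes (plain decimal digits,
# no leading zeros expected) or a non-numeric placeholder such as
# "unknown". Only numeric input is divided; anything else is reported
# as "unknown" instead of being fed into the arithmetic.
format_free_space() {
    case $1 in
        ''|*[!0-9]*)
            printf 'unknown\n'
            ;;
        *)
            printf '%s MB\n' "$(( $1 / 1000000 ))"
            ;;
    esac
}
```

For example, format_free_space 5000000 prints "5 MB", while
format_free_space unknown prints "unknown" rather than failing.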
[Test Plan]
The df command can be linked to another file that just sleeps for a few
minutes, forcing a timeout. For example (detailed steps in comment
#5):
#!/bin/bash
sleep 5m
After the timeout occurs, "rabbitmqctl status" returns an error
with the unpatched version. With the patch, it shows all the
information and displays "unknown" in the free space line.
[Where problems could occur]
The patch just changes how the information is displayed; it should not
break anything in the core operations of the package.
To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/rabbitmq-server/+bug/1999816/+subscriptions