maria-discuss team mailing list archive
Message #03227
MariaDB init.d Script Probable Bug in wait_for_ready()
Hi,
Context:
-----------
Version: MariaDB 10.0
Repo: http://yum.mariadb.org/10.0/centos7-amd64
OS: CentOS Linux release 7.2.1511
Kernel: 3.10.0-229.7.2.el7.x86_64
Cloud Provider: Windows Azure
I run a three-node MariaDB Galera cluster in production for my company,
with the custom config file at:
*/etc/my.cnf.d/server.cnf*
My data disk is an lvm2 logical volume spanning two physical volumes,
mounted at the path:
*/datadrive/galera/mysql*
The logical volume only takes up 50% of the volume group as I leave the
empty space for snapshot operations.
This is reflected in my configuration as:
*[mysqld]*
*datadir = /datadrive/galera/mysql*
*socket = /datadrive/galera/mysql/mysql.sock*
Issue:
--------
Coming to the issue at hand;
When starting the primary component through:
*$ sudo service mysql bootstrap*
or,
*$ sudo service mysqld start --wsrep-new-cluster*
(Essentially the same, as bootstrap defers to *start --wsrep-new-cluster*.)
The bootstrap sequence correctly calls mysqld_safe, as below:
*/usr/bin/mysqld_safe --datadir=/datadrive/galera/mysql
--pid-file=/datadrive/galera/mysql/ciq-test-db01.pid --wsrep-new-cluster*
However, it then gets stuck in the *wait_for_ready()* function of
*/etc/init.d/mysql*.
The function probes the server with mysqladmin ping:
* if $bindir/mysqladmin ping >/dev/null 2>&1; then*
*log_success_msg*
* return 0*
*elif kill -0 $! 2>/dev/null ; then*
* : # mysqld_safe is still running*
*else*
* # mysqld_safe is no longer running, abort the wait loop*
* break*
It gets stuck here as the test:
* if $bindir/mysqladmin ping >/dev/null 2>&1; then*
always fails, so the counter keeps increasing until it reaches 900 and the
function errors out with:
*log_failure_msg*
*return 1*
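The loop described above can be modelled roughly as follows. This is a simplified sketch, not the shipped init script: the function name wait_for_ready_sketch, its parameters, and the one-second default interval are assumptions made for illustration. Only the 900-iteration limit and the ping / kill -0 structure come from the script excerpt quoted above.

```shell
#!/bin/sh
# Hypothetical, simplified model of the wait loop (not the real init script).
#   $1 probe command, stand-in for "$bindir/mysqladmin ping"
#   $2 pid to watch, stand-in for mysqld_safe ("$!" in the init script)
#   $3 maximum number of tries (900 in the behaviour described above)
#   $4 seconds to sleep between tries (assumed to be 1)
wait_for_ready_sketch() {
    probe_cmd=$1
    pid=$2
    max_tries=${3:-900}
    interval=${4:-1}
    i=0
    while [ "$i" -lt "$max_tries" ]; do
        if $probe_cmd >/dev/null 2>&1; then
            return 0    # server answered the probe
        elif kill -0 "$pid" 2>/dev/null; then
            :           # watched process is still running, keep waiting
        else
            break       # watched process is gone, abort the wait loop
        fi
        i=$((i + 1))
        sleep "$interval"
    done
    return 1            # timed out or watched process exited
}
```

If the probe can never succeed, for example because it looks for the socket in the wrong place, the loop only ends once the counter runs out, which matches the long hang seen here.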
Test:
I tested that line from the console as:
*$ mysqladmin ping >/dev/null 2>&1*
and was thrown the following error:
*mysqladmin: connect to server at 'localhost' failed*
*error: 'Can't connect to local MySQL server through socket '/var/lib/mysql/mysql.sock' (2 "No such file or directory")'*
*Check that mysqld is running and that the socket: '/var/lib/mysql/mysql.sock' exists!*
Obviously, */var/lib/mysql* does not exist in my setup, and the socket
file was at */datadrive/galera/mysql/mysql.sock*.
But my socket can be passed explicitly to *mysqladmin* by doing:
*$ mysqladmin --socket=/datadrive/galera/mysql/mysql.sock ping*
This actually worked as expected and I got the desired output,
*mysqld is alive*
Solution:
-----------
What I surmised, with my novice abilities, is that the init script is not
passing the variables it reads from */etc/my.cnf.d/server.cnf* to
*mysqladmin*.
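One possible explanation, offered as a sketch rather than a confirmed diagnosis: client programs such as mysqladmin read the [client] and [mysqladmin] option groups, not [mysqld], so a socket set only under [mysqld] is not visible to them. The snippet below writes a hypothetical drop-in file into a scratch directory to show what such a group might look like. The file name client-socket.cnf and the scratch location are assumptions; on a real system the file would have to live in a directory that my.cnf includes.

```shell
#!/bin/sh
# Sketch: a [client] group so that client tools can find the non-default socket.
# conf_dir is a scratch directory used only for illustration;
# the file name client-socket.cnf is hypothetical.
conf_dir=$(mktemp -d)
cat > "$conf_dir/client-socket.cnf" <<'EOF'
[client]
socket = /datadrive/galera/mysql/mysql.sock
EOF
cat "$conf_dir/client-socket.cnf"
```

With a group like this in place, a bare mysqladmin ping would be expected to use the same socket as the server, without any change to the init script.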
So, for now, I have hacked around it by changing the test in the init
script's wait_for_ready() function to:
*if $bindir/mysqladmin --socket=/datadrive/galera/mysql/mysql.sock ping >/dev/null 2>&1; then*
* log_success_msg*
* return 0*
And voila,
*$ sudo service mysql start*
*$ sudo service mysql start --wsrep-new-cluster*
*$ sudo service mysql bootstrap*
All of the above work. Evidently, I need to pass *mysqladmin* all the
variables from */etc/my.cnf.d/server.cnf* that it needs to set up its
environment correctly.
However, I think this could, and should, be done in the distributed init
script itself rather than through a user-side hack.
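A less hard-coded variant of the workaround would look up the socket path instead of embedding it in the script. The sketch below is one way that might be done, assuming the socket is set in the [mysqld] group as in my configuration. socket_from_cnf is a hypothetical helper, not something the init script provides, and the demo runs against a temporary copy of the settings rather than the real file.

```shell
#!/bin/sh
# Sketch: read the "socket" value from the [mysqld] group of an option file.
# socket_from_cnf is a hypothetical helper, not part of the shipped init script.
socket_from_cnf() {
    awk -F= '
        # A new [group] header: remember whether it is [mysqld].
        /^[ \t]*\[/ { in_group = ($0 ~ /^[ \t]*\[mysqld\][ \t]*$/); next }
        # Inside [mysqld], print the first "socket" value with whitespace trimmed.
        in_group && $1 ~ /^[ \t]*socket[ \t]*$/ {
            value = $2
            gsub(/^[ \t]+|[ \t]+$/, "", value)
            print value
            exit
        }
    ' "$1"
}

# Demo against a temporary file holding the settings quoted earlier.
cnf=$(mktemp)
printf '[mysqld]\ndatadir = /datadrive/galera/mysql\nsocket = /datadrive/galera/mysql/mysql.sock\n' > "$cnf"
socket_from_cnf "$cnf"
```

In the init script, the test could then take a form like *$bindir/mysqladmin --socket="$(socket_from_cnf /etc/my.cnf.d/server.cnf)" ping*. This is untested against the real script, and the my_print_defaults tool shipped with the server is probably a more robust way to resolve the value than hand-parsing the file.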
While I have tried to be thorough in my inspection of the issue, I may have
missed something that is either very basic or inherently complex in the
process of initializing the mysql service. Please point me in the right
direction if that is the case.
P.S. This is my first post to the list, so I may have stated things unacceptably.
P.P.S. Merry Christmas to you all.
Thanks and regards,
Joy Bhattacherjee
Mob: +91-9011235028