maria-developers team mailing list archive
Mailing list archive
Re: [Commits] 454a9dc: MDEV-10004: Galera's pc.recovery process fails in 10.1 with systemd
I believe this patch is acceptable. Still I didn't completely understand the
need for all this complexity and your answer to Daniel's question.
Some minor comments inline.
On Fri, May 20, 2016 at 11:04:24PM -0400, Nirbhay Choubey wrote:
> revision-id: 454a9dc53a5b02af316dd713c577082eb8cf79c4 (mariadb-10.1.14-4-g454a9dc)
> parent(s): 9a1c4e900b98fdb9940aab57c895753f175c2bd8
> author: Nirbhay Choubey
> committer: Nirbhay Choubey
> timestamp: 2016-05-20 23:04:20 -0400
> MDEV-10004: Galera's pc.recovery process fails in 10.1 with systemd
> Galera recovery process works in two phases. In the first
> phase, mysqld is started as non-daemon with --wsrep-recover
> to recover and fetch the last logged global transaction ID.
> This ID is then used in second phase as the start position
> (--wsrep-start-position=XX) to start mysqld as daemon.
> As this process was implemented in mysqld_safe script, the
> recovery did not work when server was started using systemd.
> Fixed by introducing a shell script (wsrep_recovery.sh) that
> mimics the first phase of the recovery process.
> cmake/systemd.cmake | 3 +-
> scripts/galera_recovery.sh | 94 ++++++++++++++++++++++++++++++++++++++++
> support-files/mariadb.service.in | 11 ++++-
> 3 files changed, 106 insertions(+), 2 deletions(-)
> diff --git a/cmake/systemd.cmake b/cmake/systemd.cmake
> index b0161cf..17f44f7 100644
> --- a/cmake/systemd.cmake
> +++ b/cmake/systemd.cmake
> @@ -55,9 +55,10 @@ MACRO(CHECK_SYSTEMD)
> IF(HAVE_SYSTEMD AND HAVE_SYSTEMD_SD_DAEMON_H AND HAVE_SYSTEMD_SD_LISTEN_FDS
> AND HAVE_SYSTEMD_SD_NOTIFY AND HAVE_SYSTEMD_SD_NOTIFYF)
> - SET(SYSTEMD_SCRIPTS mariadb-service-convert galera_new_cluster)
> + SET(SYSTEMD_SCRIPTS mariadb-service-convert galera_new_cluster galera_recovery)
> SET(SYSTEMD_DEB_FILES "usr/bin/mariadb-service-convert
> + usr/bin/galera_recovery
> diff --git a/scripts/galera_recovery.sh b/scripts/galera_recovery.sh
> new file mode 100755
> index 0000000..6d4f1d5
> --- /dev/null
> +++ b/scripts/galera_recovery.sh
> @@ -0,0 +1,94 @@
> +# Copyright (c) 2016 MariaDB Corporation
> +# This program is free software; you can redistribute it and/or modify
> +# it under the terms of the GNU General Public License as published by
> +# the Free Software Foundation; version 2 of the License.
> +# This program is distributed in the hope that it will be useful,
> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
> +# GNU General Public License for more details.
> +# You should have received a copy of the GNU General Public License
> +# along with this program; if not, write to the Free Software
> +# Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA */
> +# This script is intended to be executed by systemd. It starts mysqld with
> +# --wsrep-recover to recover from a non-graceful shutdown, determines the
> +# last stored global transaction ID and echoes it in --wsrep-start-position=XX
> +# format. The output is then captured and used by systemd to start mysqld.
> +# If the server was configured to start without wsrep, nothing is echoed.
> +log_file=$(mktemp /tmp/wsrep_recovery.XXXXXX)
> +euid=$(id -u)
> +log ()
> + local msg="$1"
> + # Print all messages to stderr as we reserve stdout for printing
> + # --wsrep-start-position=XXXX.
> + echo "$msg" >&2
> + rm -f "$log_file"
> +trap finish EXIT
> +# Safety checks
> +if [ -z "$log_file" ]; then
> + log "WSREP: mktemp failed"
> + exit 1
> +if [ -f "$log_file" ]; then
> + [ "$euid" = "0" ] && chown $user $log_file
> + chmod 600 $log_file
> + log "WSREP: Failed to change the permissions of $log_file."
I'd say it's rather mktemp failed than change permissions failed.
> + exit 1
> +# Redirect server's error log to the log file.
> +eval /usr/sbin/mysqld --user=$user --wsrep_recover 2> "$log_file"
> +if [ $ret -ne 0 ]; then
> + # Something went wrong, let us also print the error log so that it
> + # shows up in systemctl status output as a hint to the user.
> + log "WSREP: Failed to start mysqld for wsrep recovery: '`cat $log_file`'"
> + exit 1
Since this script is executed unconditionally, what shall happen if wsrep is
not enabled at compile time (--wsrep_recover is not known)? Fail the whole
> +# Parse server's error log for recovered position. The server prints
> +# "..skipping position recovery.." if started without wsrep.
> +recovered_pos="$(grep 'WSREP: Recovered position:' $log_file)"
> +if [ -z "$recovered_pos" ]; then
> + skipped="$(grep WSREP $log_file | grep 'skipping position recovery')"
> + if [ -z "$skipped" ]; then
> + log "WSREP: Failed to recover position: '`cat $log_file`'"
> + exit 1
Can we come here with wsrep disabled?
> + else
> + log "WSREP: Position recovery skipped."
> + fi
> + start_pos="$(echo $recovered_pos | sed 's/.*WSREP\:\ Recovered\ position://' \
> + | sed 's/^[ \t]*//')"
> + log "WSREP: Recovered position $start_pos"
> + start_pos_opt="--wsrep_start_position=$start_pos"
> +echo "$start_pos_opt"
> diff --git a/support-files/mariadb.service.in b/support-files/mariadb.service.in
> index b18674b..00162d4 100644
> --- a/support-files/mariadb.service.in
> +++ b/support-files/mariadb.service.in
> @@ -48,6 +48,12 @@ CapabilityBoundingSet=CAP_IPC_LOCK
> # Execute pre and post scripts as root, otherwise it does it as User=
> +# Perform automatic wsrep recovery. When server is started without wsrep,
> +# galera_recovery simply returns an empty string. In any case, however,
> +# the script is not expected to return with a non-zero status.
> +ExecStartPre=/bin/sh -c "VAR=`/usr/bin/galera_recovery`; [ $? -eq 0 ] && \
> + systemctl set-environment _WSREP_START_POSITION=$VAR || exit 1"
I believe _WSREP_START_POSITION won't work with multi instance version. Not
sure if it works but we may need to use something like _WSREP_START_POSITION%I
> # Needed to create system tables etc.
> # ExecStartPre=/usr/bin/mysql_install_db -u mysql
> @@ -57,9 +63,12 @@ PermissionsStartOnly=true
> # This isn't a replacement for my.cnf.
> # _WSREP_NEW_CLUSTER is for the exclusive use of the script galera_new_cluster
> -ExecStart=/usr/sbin/mysqld $MYSQLD_OPTS $_WSREP_NEW_CLUSTER
> +ExecStart=/usr/sbin/mysqld $MYSQLD_OPTS $_WSREP_NEW_CLUSTER $_WSREP_START_POSITION
> +# Unset _WSREP_START_POSITION environment variable.
> +ExecStartPost=/bin/sh -c "systemctl unset-environment _WSREP_START_POSITION"
> commits mailing list