
maria-developers team mailing list archive

Re: [Commits] 7a7ad82: MDEV-13478 Full SST sync fails because of the error in the cleaning part


Hi, Sachin!

Unfortunately, your comment is rather difficult to understand. What
about this one:

 The command was:
    find $paths -mindepth 1 -regex $cpat -prune -o -exec rm -rf {} \+
 Which was supposed to work as
    * skipping $paths directories themselves (-mindepth 1)
    * see if the dir/file name matches $cpat (-regex)
    * if yes - don't dive into the directory, skip it (-prune)
    * otherwise (-o)
    * remove it and everything inside (-exec)
 Now -exec ... \+ works like this:
    every newly found path is appended to the end of the command line.
    When the accumulated command line length reaches `getconf ARG_MAX`
    (typically ~2MB on Linux), the command is executed, and find
    continues, appending to a new command line.

 What happens here is that find appends some directory to the command
 line, then dives into it and starts appending files from that directory.
 At some point the command line overflows, rm -rf gets executed and
 removes the whole directory. Now find tries to continue scanning a
 directory that has already been removed.

Fix: don't dive into directories that will be recursively removed
anyway; use -prune for them. Basically, we should be pruning both paths
that have matched $cpat and paths that have not matched it. This is
achieved by pruning unconditionally, before the regex is tested:
    find $paths -mindepth 1 -prune -regex $cpat -o -exec rm -rf {} \+
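
To see the difference between the two expressions without deleting anything, here is a minimal sketch that swaps -exec rm -rf for -print. The sandbox layout (data/keep, data/drop, data/top.txt) and the pattern in cpat are made up for illustration; they are not the values used by wsrep_sst_xtrabackup-v2.sh.

```shell
#!/bin/sh
# Sketch only: hypothetical directory names, -print instead of rm -rf.
set -eu

sandbox=$(mktemp -d)
mkdir -p "$sandbox/data/keep" "$sandbox/data/drop/sub"
touch "$sandbox/data/keep/a" "$sandbox/data/drop/b" \
      "$sandbox/data/drop/sub/c" "$sandbox/data/top.txt"

# Assumed pattern: protect anything named "keep" (-regex matches the whole path).
cpat='.*/keep'

# Old order: only paths matching $cpat are pruned, so find still descends
# into data/drop and lists every entry below it.
old=$(cd "$sandbox" && find data -mindepth 1 -regex "$cpat" -prune -o -print | LC_ALL=C sort)

# New order: every path at depth 1 is pruned first, so find never descends;
# paths that do not match $cpat are listed once, at the top level only.
new=$(cd "$sandbox" && find data -mindepth 1 -prune -regex "$cpat" -o -print | LC_ALL=C sort)

echo "old order would pass to rm:"
echo "$old"
echo "new order would pass to rm:"
echo "$new"

rm -rf "$sandbox"
```

With the old order the list contains data/drop and everything under it, which is what lets rm -rf remove a directory find is still walking. With the new order only data/drop and data/top.txt are listed, and data/keep is skipped in both cases.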

On Dec 19, sachin wrote:
> revision-id: 7a7ad82029a6c78d31d6736a562c12d02c4d968c (mariadb-galera-5.5.58-3-g7a7ad82)
> parent(s): e6e026ae51a77969749de201d491a176483bbc69
> author: Sachin Setiya
> committer: Sachin Setiya
> timestamp: 2017-12-19 22:30:43 +0530
> message:
> MDEV-13478 Full SST sync fails because of the error in the cleaning part
> Problem:- The problem is in wsrep_sst_xtrabackup-v2.sh we use
> find $ib_home_dir $ib_log_dir $ib_undo_dir $DATA -mindepth 1  -regex $cpat  -prune  -o -exec rm -rfv {} 1>&2 \+
>     the problem is that since we have '\+' in end that means all output
>     will be expanded after rm -rfv . If we have really large database(
>     quite a no of tables with big names) then this create a problem.
>     This will result in calling 'rm -rvf' two time(or may be more). So non
>     deterministicly this might that upto directory name went to rm and
>     remaining was truncated. For example consider a folder xyz with lots
>     of files. We executed
>     find xyz -exec rm -rfv {} \+
>     Since we have like millions of file in rm, So it will be greater then
>     ARG_MAX , so there will be multiple rm invocation. Say in nth invocation
>     this might happen rm ....... xyz/ {So it get truncated at xyz/}
>     Above will remove the whole xyz directory and remaining invocation
>     of rm will return error. How ever this type of error is non deterministic
>     so is bug.
> Solution:-
>     In above example if instead of removing each file in xyz if we remove
>     xyz then we have our solution :).
>     Actually if we shift the -prune term in find we can get to solution.
>     Why currently find is working like this
>        find ( -regex && -prune) || -exec
>     that means if -regex is true (-prune will always return true) then exec
>     (and hence rm )wont work. But if regex fails then exec is applied with
>     out -prune which makes rm delete each single file instead of folder.
>     So we change the position of regex and prune then prune will always be
>     applied whether regex is true or not.
>        find (-prune && -regex) || -exec
> Patch Credit:- Serg
> ---
>  scripts/wsrep_sst_xtrabackup-v2.sh | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> diff --git a/scripts/wsrep_sst_xtrabackup-v2.sh b/scripts/wsrep_sst_xtrabackup-v2.sh
> index 327c92f..00d8fe2 100644
> --- a/scripts/wsrep_sst_xtrabackup-v2.sh
> +++ b/scripts/wsrep_sst_xtrabackup-v2.sh
> @@ -863,7 +863,7 @@ then
>          wsrep_log_info "Cleaning the existing datadir and innodb-data/log directories"
> -        find $ib_home_dir $ib_log_dir $ib_undo_dir $DATA -mindepth 1  -regex $cpat  -prune  -o -exec rm -rfv {} 1>&2 \+
> +        find $ib_home_dir $ib_log_dir $ib_undo_dir $DATA -mindepth 1 -prune -regex $cpat -o -exec rm -rfv {} 1>&2 \+
>          tempdir=$(parse_cnf mysqld log-bin "")
>          if [[ -n ${tempdir:-} ]];then

Chief Architect MariaDB
and security@xxxxxxxxxxx
