Skip to content

MDEV-39061 mariadb-backup compatible wrappers for BACKUP SERVER#5140

Open
Thirunarayanan wants to merge 2 commits into
MDEV-14992from
MDEV-39061
Open

MDEV-39061 mariadb-backup compatible wrappers for BACKUP SERVER#5140
Thirunarayanan wants to merge 2 commits into
MDEV-14992from
MDEV-39061

Conversation

@Thirunarayanan

Copy link
Copy Markdown
Member

scripts/mariabackup/mariabackup.sh: a drop-in wrapper that
lets existing mariabackup --backup invocations drive the server-side
BACKUP SERVER command without changing user scripts.

mariabackup.sh translates
--backup into "BACKUP SERVER TO '<dir>'" via the mariadb client,
forwarding connection options, and layers
--stream/--compress as tar/gzip pipelines on the result. It
validates the target directory up front and rejects the

- mbstream.sh shims the mbstream CLI onto tar, dropping
mbstream-only flags (-p/--parallel) so legacy pipelines keep working.
- README.md maps every supported --backup option to its BACKUP SERVER
equivalent and documents current limitations.

Add include/have_mariabackup_wrapper.inc redirects
$XTRABACKUP to the wrapper so a test opts in by sourcing one
file; skips when the wrapper, bash, or the mariadb client
is unavailable.

wrapper_basic.test : Exercises full backup, streaming, compression,
the ignored legacy options

@Thirunarayanan Thirunarayanan requested a review from dr-m May 28, 2026 09:52
@CLAassistant

CLAassistant commented May 28, 2026

Copy link
Copy Markdown

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.
You have signed the CLA already but the status is still pending? Let us recheck it.

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces server-side backup support (BACKUP SERVER) for MariaDB, implementing backup and log archiving mechanisms for both the InnoDB and Aria storage engines. It also includes a compatibility wrapper script (mariabackup.sh) to map legacy mariabackup CLI commands to the new server-side SQL interface. The code review identified several critical issues, including a potential server crash in backup_innodb.cc due to invalid format arguments in error reporting, null pointer dereferences and assertion failures in both engines when backup steps are executed out of order or fail to initialize, and security vulnerabilities (command injection) and argument parsing bugs in the shell wrapper script.

Comment thread storage/innobase/handler/backup_innodb.cc Outdated
Comment thread storage/maria/ma_backup.cc Outdated
Comment thread storage/innobase/handler/backup_innodb.cc Outdated
Comment thread storage/innobase/handler/backup_innodb.cc Outdated
Comment thread scripts/mariabackup/mariabackup.sh Outdated
Comment on lines +315 to +328
if [[ -n "$FINAL_INCLUDE" ]]; then
echo "Setting backup_include='$FINAL_INCLUDE'" >&2
mariadb $MARIADB_OPTS -e "SET GLOBAL backup_include='$FINAL_INCLUDE'"
fi

if [[ -n "$FINAL_EXCLUDE" ]]; then
echo "Setting backup_exclude='$FINAL_EXCLUDE'" >&2
mariadb $MARIADB_OPTS -e "SET GLOBAL backup_exclude='$FINAL_EXCLUDE'"
fi

# Execute BACKUP SERVER
SQL="BACKUP SERVER TO '$TARGET_DIR'"
echo "Executing: $SQL" >&2
mariadb $MARIADB_OPTS -e "$SQL"

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

security-high high

The parameters FINAL_INCLUDE, FINAL_EXCLUDE, and TARGET_DIR are interpolated into shell commands. To effectively prevent command injection, we should validate these inputs using a strict whitelist of allowed characters (such as alphanumeric characters and a limited set of safe symbols like '-', '_', '.') instead of relying on a blacklist or simple escaping.

Suggested change
if [[ -n "$FINAL_INCLUDE" ]]; then
echo "Setting backup_include='$FINAL_INCLUDE'" >&2
mariadb $MARIADB_OPTS -e "SET GLOBAL backup_include='$FINAL_INCLUDE'"
fi
if [[ -n "$FINAL_EXCLUDE" ]]; then
echo "Setting backup_exclude='$FINAL_EXCLUDE'" >&2
mariadb $MARIADB_OPTS -e "SET GLOBAL backup_exclude='$FINAL_EXCLUDE'"
fi
# Execute BACKUP SERVER
SQL="BACKUP SERVER TO '$TARGET_DIR'"
echo "Executing: $SQL" >&2
mariadb $MARIADB_OPTS -e "$SQL"
if [[ -n "$FINAL_INCLUDE" && ! "$FINAL_INCLUDE" =~ ^[a-zA-Z0-9_.-]+$ ]]; then
echo "Error: Invalid characters in FINAL_INCLUDE" >&2
exit 1
fi
References
  1. When validating input that will be interpolated into shell commands, use a strict whitelist of allowed characters (e.g., alphanumeric and a limited set of safe symbols like '-', '_', '.') instead of a blacklist to effectively prevent command injection.

Comment thread scripts/mariabackup/mariabackup.sh Outdated
TABLES_PATTERN=""
TABLES_EXCLUDE_PATTERN=""
TABLES_FILE=""
MARIADB_OPTS=""

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Handling MARIADB_OPTS as a flat string will break if any option contains spaces or special characters (e.g., --password="my secret pass"). We should use a bash array instead.

Suggested change
MARIADB_OPTS=""
MARIADB_OPTS=()

Comment thread scripts/mariabackup/mariabackup.sh Outdated
Comment on lines +74 to +100
--user=*|--password=*|--host=*|--port=*|--socket=*)
# Connection options (long form) - pass through to mariadb client
MARIADB_OPTS="$MARIADB_OPTS $1"
shift
;;
-u|-p|-h|-P|-S)
# Short form. If the next argv looks like a value (not another
# flag), consume it as well — supports both `-u USER` and bare
# `-p` (which mariadb client treats as a password prompt).
if [[ -n "${2-}" && "$2" != -* ]]; then
MARIADB_OPTS="$MARIADB_OPTS $1 $2"
shift 2
else
MARIADB_OPTS="$MARIADB_OPTS $1"
shift
fi
;;
-u*|-p*|-h*|-P*|-S*)
# Bundled short form: -uUSER, -pSECRET, -hHOST, ...
MARIADB_OPTS="$MARIADB_OPTS $1"
shift
;;
--defaults-file=*|--defaults-extra-file=*)
# Config file options - pass through to mariadb client
MARIADB_OPTS="$MARIADB_OPTS $1"
shift
;;

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

We should append connection options to the MARIADB_OPTS array instead of concatenating them into a flat string to correctly handle arguments with spaces or special characters.

        --user=*|--password=*|--host=*|--port=*|--socket=*)
            # Connection options (long form) - pass through to mariadb client
            MARIADB_OPTS+=("$1")
            shift
            ;;
        -u|-p|-h|-P|-S)
            # Short form. If the next argv looks like a value (not another
            # flag), consume it as well — supports both `-u USER` and bare
            # `-p` (which mariadb client treats as a password prompt).
            if [[ -n "${2-}" && "$2" != -* ]]; then
                MARIADB_OPTS+=("$1" "$2")
                shift 2
            else
                MARIADB_OPTS+=("$1")
                shift
            fi
            ;;
        -u*|-p*|-h*|-P*|-S*)
            # Bundled short form: -uUSER, -pSECRET, -hHOST, ...
            MARIADB_OPTS+=("$1")
            shift
            ;;
        --defaults-file=*|--defaults-extra-file=*)
            # Config file options - pass through to mariadb client
            MARIADB_OPTS+=("$1")
            shift
            ;;

Comment thread sql/sql_backup.cc Outdated
}

#ifndef _WIN32
int dir= open(target.str, O_DIRECTORY);

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

To be fully standard and portable, open with O_DIRECTORY should explicitly include the access mode O_RDONLY.

Suggested change
int dir= open(target.str, O_DIRECTORY);
int dir= open(target.str, O_RDONLY | O_DIRECTORY);

Comment thread storage/maria/ma_backup.cc Outdated
explicit Aria_backup(THD *thd, Target target) noexcept
: target(target)
#ifndef _WIN32
, datadir_fd(open(maria_data_root, O_DIRECTORY))

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

To be fully standard and portable, open with O_DIRECTORY should explicitly include the access mode O_RDONLY.

    , datadir_fd(open(maria_data_root, O_RDONLY | O_DIRECTORY))

@grooverdan

Copy link
Copy Markdown
Member

Hey @Thirunarayanan , have you thought of how this is going to end up in packaging? It is replacing the original mariadb-backup?

It needs some INSTALL/INSTALL_SCRIPT cmake directives around this to give it a install location and a cmake component. debian installation/packaging would need the relevant debian/{package}.install to include te script.

@dr-m dr-m left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please rebase this on the current branch.

@@ -0,0 +1,82 @@
--source include/have_mariabackup_wrapper.inc

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this be implemented in a way that is compatible with the .combinations logic?

I would like most of --suite=mariabackup to be run both with the genuine mariadb-backup and with the BACKUP SERVER based wrapper scripts. Currently, this is the only test that exercises the wrappers, and it fails to test the original executables to demonstrate compatibility between them and the wrappers.

Comment on lines +30 to +40
--echo #
--echo # --stream=mbstream emits a valid tar archive to stdout
--echo #
--let $targetdir=$MYSQLTEST_VARDIR/tmp/bk_stream
--let $streamfile=$MYSQLTEST_VARDIR/tmp/bk_stream.tar
--exec $XTRABACKUP $defaults --backup --target-dir=$targetdir --stream=mbstream > $streamfile 2>$logfile
--exec tar -tf $streamfile > /dev/null
--let SEARCH_PATTERN=Creating tar stream
--source include/search_pattern_in_file.inc
--rmdir $targetdir
--remove_file $streamfile

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I’m not sure if this should be in the current scope. We don’t have any MDEV-38362 streaming backup yet. What should be more in the scope is demonstrating that the mbstream wrapper script is syntactically compatible with its namesake utility.

Comment thread scripts/mariabackup/mbstream.sh Outdated
@@ -0,0 +1,19 @@
#!/bin/bash

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we write this for the POSIX shell? Bourne Again Shell (bash) is not available in all environments, for good reasons: reduced resource usage as well as attack surface; remember https://en.wikipedia.org/wiki/Shellshock_(software_bug)

Comment thread scripts/mariabackup/mbstream.sh Outdated
Comment on lines +7 to +13
-p|--parallel)
SKIP_NEXT=1
;;
-p*)
;;
--parallel=*)
;;

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are necessary but not sufficient rules. mbstream -xp4 would be a valid invocation of the original utility (equivalent to -x -p4), but not of GNU tar.

I did test that mbstream indeed treats anything after -p as a single argument. For example, -p4t and -p4x would report the following, respectively:

Warning: option 'parallel': signed value 4398046511104 adjusted to 2147483647
Unknown suffix 'x' used for variable 'parallel' (value '4x'). Legal suffix characters are: K, M, G, T, P, E
mbstream: Error while setting value '4x' to 'parallel'

Please add a test file that covers both the real mbstream and this wrapper. Both with some invalid and valid invocation. For example, mbstream -t is not allowed, while tar -t is.

Comment thread scripts/mariabackup/mariabackup.sh Outdated
Comment on lines +78 to +80
--parallel=*|--throttle=*|--no-lock|--safe-slave-backup)
# Handled server-side by BACKUP SERVER.
shift ;;

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The --parallel option will have to be passed through to non-streaming backup.

8077134 in #4817 implements multi-threaded backup to a mounted file system. By default, only 1 thread will be used.

For streaming backup (which is not implemented yet except as stubs), I think that we must limit the backup to a single thread, because unlike the old xbstream or mbstream format, BACKUP SERVER will generate one stream per thread. So, the only way to end up with one backup stream is to use a single thread.

@dr-m dr-m left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now that #4817 includes support for streaming (see the test backup.backup_stream, which passes on Linux, FreeBSD, Microsoft Windows and Apple macOS), it would be good to add support for streaming backup.

When I tested the merge of this to the current #4817, I got one test failure:

CURRENT_TEST: mariabackup.wrapper_basic
NOT FOUND /Executing: BACKUP SERVER TO/ in wrapper.log
mysqltest: In included file "./include/search_pattern_in_file.inc": 
included from /mariadb/main/mysql-test/suite/mariabackup/wrapper_basic.test at line 19:
At line 65: command "perl" failed with error: 255  my_errno: 0  errno: 0

I couldn’t figure this out. At the very end of scripts/mariabackup/mariabackup.sh we are writing a configuration file, and that is what happens. I even added set -eux or set -eu to the start of the script, and it did not complain about any uninitialized variables. I think that set -eu would be a good safety measure.

Note: I am going to squash #4817 shortly.

Comment on lines +30 to +33
if ($errno)
{
--skip mariabackup.sh wrapper unavailable (script or sh missing)
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/bin/sh can’t possibly be missing on a POSIX system; it is what system(3), popen(3) and friends will invoke.

#
# $XTRABACKUP — now points at mariabackup.sh

--source include/linux.inc

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we only want to exclude non-POSIX environments, more specifically, Microsoft Windows. For that, I would suggest the following:

--source include/not_windows.inc

or if it cannot be used for some reason:

if ($MARIADB_UPGRADE_EXE) {
--skip Need POSIX
}

@dr-m dr-m force-pushed the MDEV-14992 branch 2 times, most recently from 87036f8 to 4769a43 Compare June 25, 2026 13:18
This adds a shell script that lets users keep using their existing
mariadb-backup commands while the real work is done by the new
server-side BACKUP SERVER command. The goal is "drop-in": users should
not have to change their backup scripts.

scripts/mariabackup/mariabackup.sh (plain POSIX sh) understands the usual
mariadb-backup modes and translates each one. A companion helper,
scripts/mariabackup/mbstream.sh, lets streamed backups be unpacked by
pipelines that expect the mbstream CLI. Both are documented in
scripts/mariabackup/README.md.

--backup
========
Connects with the mariadb client and runs "BACKUP SERVER TO '<dir>'".
Connection options (--user, --host, --port, --socket, --defaults-file,
ssl, ...) are passed through to the client; --parallel=N becomes the
"<N> CONCURRENT" clause.
After the backup it writes backup-prepare.cnf into the backup
directory, recording what --prepare needs later: where
mariadbd lives, the InnoDB parameters (page size, data file path,
undo tablespaces, checksum algorithm, log file size), and if
the server is encrypted then how to reload the encryption key
plugin (the file_key_management variables),
so an encrypted backup can be prepared without extra input.

--backup --stream
=================
Runs "BACKUP SERVER WITH [N CONCURRENT] '<command>'": the server feeds
each stream's tar to <command>, the wrapper collects the parts, writes
them to stdout, then appends backup-prepare.cnf as a final tar. The
output is several tar archives concatenated, so extract it with
"tar --ignore-zeros".
Two properties follow from how BACKUP SERVER streams,
both differing from mariadb-backup:
 - local: the stream command runs inside the server, so the wrapper
   must share its filesystem;
 - tar only: any --stream=<format> (including xbstream) yields tar.
--target-dir is optional in stream mode (scratch for the per-stream
parts; a mktemp dir is used otherwise).

mbstream.sh maps the mbstream CLI onto tar. Extraction uses
"tar --ignore-zeros" to unpack all the concatenated archives. So
existing "mbstream -x"/"-c" pipelines keep working on the wrapper's
stream. mbstream-only flags (-p/--parallel, ...)
are accepted and ignored.

--prepare
=========
Starts "mariadbd --bootstrap" on the backup directory using
backup-prepare.cnf as its defaults file, replays the archived redo
log between the start and target LSN read from backup.cnf,
then builds a fresh ib_logfile0 so a normal server can start
on the directory. mariadbd is taken from the path recorded in
backup-prepare.cnf if that binary exists, otherwise from PATH.
User --defaults-file/-extra-file and encryption options are
layered onto the bootstrap.

--copy-back / --move-back
=========================
Copy or move a prepared backup into the datadir. The datadir
is created if missing, a non-empty datadir is refused unless
--force-non-empty-directories is given, and a chown
reminder is printed.

If --aria-log-dir-path is given, the Aria logs (aria_log_control,
aria_log.*) are relocated into that directory.

Packaging
=========
The wrapper is not installed by default and never replaces the
real mariadb-backup / mbstream binaries.
1. cmake -DWITH_MARIABACKUP_WRAPPER=ON (default OFF) controls it.
2. When ON, the scripts install as /usr/bin/mariadb-backup-server
and /usr/bin/mbstream-server, tagged COMPONENT Backup so they
ship in the mariadb-backup package.
3. RPM: nothing extra to do. the component handles it.
4. DEB: not wired. debian/rules uses --fail-missing and does not
enable the option, so the -server binaries are not listed.
To ship via DEB, make a paired change: add
-DWITH_MARIABACKUP_WRAPPER=ON in debian/rules and list both
usr/bin/mariadb-backup-server and
usr/bin/mbstream-server in debian/mariadb-backup.install together.
5. The real mariadb-backup/mbstream binaries and the
mariabackup symlink are left untouched; opt in via an alias or a
symlink early in PATH.

Limitations (not supported yet)
===============================
1) Incremental backup & prepare (--incremental-basedir,
   --incremental-dir, --apply-log-only)
2) --rollback-xa
3) Partial backup (--databases, --tables, --tables-file)
4) Output compression and encryption (--compress, --encrypt)
5) --export is accepted but only warns and runs a plain recovery
6) --extra-lsndir is ignored
7) Windows: POSIX sh only, not installed on Windows

Behaviour differences from native mariadb-backup
================================================
 - The wrapper needs the mariadb client on PATH for
--backup, and mariadbd on PATH (or recorded in backup-prepare.cnf)
for --prepare
 - BACKUP SERVER refuses an already-existing target directory
 - BACKUP SERVER does copy the data file as raw pages without
checksum validation, so a corrupted table is not detected
at backup time
 - --prepare only works on a wrapper-made backup. It
needs backup-prepare.cnf)
 - --stream is tar, not xbstream, and local-only

Tests
=====
include/have_mariabackup_wrapper.inc redirects $XTRABACKUP to
mariabackup.sh and $XBSTREAM to mbstream.sh, skipping when a
wrapper or the mariadb client is unavailable.
include/have_mariabackup_combination.inc runs a test under both the
[CLIENT] mariadb-backup binary and the [SERVER] wrapper.
BACKUP SERVER scanned maria_data_root (= aria_log_dir_path) for both
table files and logs, so when aria_log_dir_path differed from the
datadir the Aria tables were missed. Scan the server datadir for
.MAD or .MAI files and aria_log_dir_path only for the logs/control file.

@dr-m dr-m left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here are some initial comments. I did not check the mariabackup.sh in detail, but I think that this is a very good start.

Comment on lines +119 to +127
########################################################################
# mariadb-backup-server: BACKUP SERVER-compatible shell wrapper
########################################################################
# A drop-in mariadb-backup-compatible POSIX-sh wrapper that translates the
# CLI into server-side BACKUP SERVER SQL. Experimental; OFF by default
# Installed as a mariadb-backup-server; clients opt in via symlink/alias
# (see scripts/mariabackup/README.md).
OPTION(WITH_MARIABACKUP_WRAPPER
"Install the BACKUP SERVER shell wrapper (mariadb-backup-server)" OFF)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

scripts/mariabackup/CMakeLists.txt would be a less surprising place for this. Would it work? If not, then I would suggest moving scripts/mariabackup to extra/mariabackup/scripts.

-v|--verbose) ;;

--) shift; break ;;
-*) ;;

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn’t we throw an error for any unknown option? For example, -b 1 works in GNU tar, but certainly not in mbstream.

Comment on lines +48 to +52
# The wrapper streams several tar archives
# back to back (one per stream, plus backup-prepare.cnf),
# so --ignore-zeros is required to extract them all rather
# than stopping at the first archive's end-of-file marker.
x) exec tar --ignore-zeros -x -f - -C "$dir" ;;

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are not currently adding any NUL block at the end of each stream, so the GNU tar specific option --ignore-zeros should not be needed. If you install libarchive-tools, you should get a bsdtar that is compatible with the tar.exe that is shipped with Microsoft Windows. That executable does not recognize an --ignore-zeros option.

I think that we should try to make this script compatible also with BSD tar.

Comment on lines +151 to +152
if [ -n "$_enc" ]; then
echo "plugin-load-add=$_enc"

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two tests would fail unless we extend this with a patch that you shared with me:

        _plugin_dir=$(ask "SELECT @@global.plugin_dir")
        [ -n "$_plugin_dir" ] && echo "plugin-dir=$_plugin_dir"

esac
# Where "rr record" stores traces, so we can point at it on a crash.
rr_trace_dir=${_RR_TRACE_DIR:-${XDG_DATA_HOME:-$HOME/.local/share}/rr}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we really want it this complicated, or could we do the following:

: ${MARIADBD=mariadbd}

Then, to have the invoked server run under rr record, one would execute (for example)

_RR_TRACE_DIR=/dev/shm/rr MARIADBD='rr record -h` mariabackup.sh

Also, should we rather call this script mariadb-backup-server? The name mariabackup was deprecated and replaced with mariadb-backup some years ago.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Development

Successfully merging this pull request may close these issues.

4 participants