diff --git a/config/samples/quickstart_fan_out_replication_source.yml b/config/samples/quickstart_fan_out_replication_source.yml
new file mode 100644
index 0000000..4ec5964
--- /dev/null
+++ b/config/samples/quickstart_fan_out_replication_source.yml
@@ -0,0 +1,59 @@
+jobs:
+# Separate job for snapshots and pruning
+- name: snapshots
+  type: snap
+  filesystems:
+    'tank<': true # all filesystems
+  snapshotting:
+    type: periodic
+    prefix: zrepl_
+    interval: 10m
+  pruning:
+    keep:
+    # Keep non-zrepl snapshots
+    - type: regex
+      negate: true
+      regex: '^zrepl_'
+    # Time-based snapshot retention
+    - type: grid
+      grid: 1x1h(keep=all) | 24x1h | 30x1d | 12x30d
+      regex: '^zrepl_'
+
+# Source job for target B
+- name: target_b
+  type: source
+  serve:
+    type: tls
+    listen: :8888
+    ca: /etc/zrepl/b.example.com.crt
+    cert: /etc/zrepl/a.example.com.crt
+    key: /etc/zrepl/a.example.com.key
+    client_cns:
+      - b.example.com
+  filesystems:
+    'tank<': true # all filesystems
+  # Snapshots are handled by the separate snap job
+  snapshotting:
+    type: manual
+
+# Source job for target C
+- name: target_c
+  type: source
+  serve:
+    type: tls
+    listen: :8889
+    ca: /etc/zrepl/c.example.com.crt
+    cert: /etc/zrepl/a.example.com.crt
+    key: /etc/zrepl/a.example.com.key
+    client_cns:
+      - c.example.com
+  filesystems:
+    'tank<': true # all filesystems
+  # Snapshots are handled by the separate snap job
+  snapshotting:
+    type: manual
+
+# Source jobs for remaining targets. Each one should listen on a different port
+# and reference the correct certificate and client CN.
+# - name: target_d
+#   ...
diff --git a/config/samples/quickstart_fan_out_replication_target.yml b/config/samples/quickstart_fan_out_replication_target.yml
new file mode 100644
index 0000000..15b216f
--- /dev/null
+++ b/config/samples/quickstart_fan_out_replication_target.yml
@@ -0,0 +1,30 @@
+jobs:
+# Pull from source server A
+- name: source_a
+  type: pull
+  connect:
+    type: tls
+    # Use the correct port for this specific client (e.g. B is 8888, C is 8889, etc.)
+    address: a.example.com:8888
+    ca: /etc/zrepl/a.example.com.crt
+    # Use the correct key pair for this specific client
+    cert: /etc/zrepl/b.example.com.crt
+    key: /etc/zrepl/b.example.com.key
+    server_cn: a.example.com
+  root_fs: pool0/backup
+  interval: 10m
+  pruning:
+    keep_sender:
+    # Source does the pruning in its snap job
+    - type: regex
+      regex: '.*'
+    # Receiver-side pruning can be configured as desired on each target server
+    keep_receiver:
+    # Keep non-zrepl snapshots
+    - type: regex
+      negate: true
+      regex: '^zrepl_'
+    # Time-based snapshot retention
+    - type: grid
+      grid: 1x1h(keep=all) | 24x1h | 30x1d | 12x30d
+      regex: '^zrepl_'
diff --git a/docs/configuration/overview.rst b/docs/configuration/overview.rst
index 2230cc0..adb0ae3 100644
--- a/docs/configuration/overview.rst
+++ b/docs/configuration/overview.rst
@@ -248,7 +248,7 @@ Limitations
 
 Multiple Jobs & More than 2 Machines
 ------------------------------------
-The quick-start guides focus on simple setups with a single sender and a single receiver.
+Most users are well served with a single sender and a single receiver job.
 This section documents considerations for more complex setups.
 
 .. ATTENTION::
@@ -288,11 +288,46 @@ This section might be relevant to users who wish to *fan-in* (N machines replica
 
 **Working setups**:
 
-* N ``push`` identities, 1 ``sink`` (as long as the different push jobs have a different :ref:`client identity `)
+* **Fan-in: N servers replicated to one receiver, disjoint dataset trees.**
 
-  * ``sink`` constrains each client to a disjoint sub-tree of the sink-side dataset hierarchy ``${root_fs}/${client_identity}``.
+  * This is the common use case of a centralized backup server.
+
+  * Implementation:
+
+    * N ``push`` jobs (one per sender server), 1 ``sink`` job (as long as the different push jobs have a different :ref:`client identity `)
+    * N ``source`` jobs (one per sender server), N ``pull`` jobs on the receiver server (unique names, disjoint ``root_fs``)
+
+  * The ``sink`` job automatically constrains each client to a disjoint sub-tree of the sink-side dataset hierarchy ``${root_fs}/${client_identity}``.
+    Therefore, the different clients cannot interfere.
+  * A ``pull`` job only pulls from one host, so it is up to the zrepl user to ensure that the different ``pull`` jobs don't interfere.
+
+.. _fan-out-replication:
+
+* **Fan-out: 1 server replicated to N receivers**
+
+  * Can be implemented either in a pull or push fashion.
+
+    * **pull setup**: 1 ``pull`` job on each receiver server, each with a corresponding **unique** ``source`` job on the sender server.
+    * **push setup**: 1 ``sink`` job on each receiver server, each with a corresponding **unique** ``push`` job on the sender server.
+
+  * It is critical to have one sending-side job (``source``, ``push``) per receiver.
+    The reason is that :ref:`zrepl's ZFS abstractions ` (``zrepl zfs-abstraction list``) include the name of the ``source``/``push`` job, but not the receive-side job name or client identity (see :issue:`380`).
+    As a counter-example, suppose we used multiple ``pull`` jobs with only one ``source`` job.
+    All ``pull`` jobs would share the same :ref:`replication cursor bookmark ` and trip over each other, quickly breaking the incremental replication guarantees.
+    The analogous problem exists for 1 ``push`` job to N ``sink`` jobs.
+
+  * The ``filesystems`` matched by the sending-side jobs (``source``, ``push``) need not necessarily be disjoint.
+    For this to work, we need to avoid interference between the snapshotting and pruning of the different sending jobs.
+    The solution is to centralize sender-side snapshot management in a separate ``snap`` job.
+    Snapshotting in the ``source``/``push`` jobs should then be disabled (``type: manual``).
+    Sender-side pruning (``keep_sender``) likewise needs to be disabled on the active side (``pull`` / ``push``), e.g. with a single keep-all rule, since pruning is handled by the ``snap`` job.
+
+  * **Restore limitations**: when restoring from one of the ``pull`` targets (e.g., using ``zfs send -R``), the replication cursor bookmarks do not exist on the restored system.
+    This can break incremental replication to all other receivers after the restore.
+
+  * See :ref:`the fan-out replication quick-start guide <quickstart-fan-out-replication>` for an example of this setup.
+
 
 **Setups that do not work**:
diff --git a/docs/quickstart.rst b/docs/quickstart.rst
index f7f5cda..9739262 100644
--- a/docs/quickstart.rst
+++ b/docs/quickstart.rst
@@ -33,6 +33,14 @@ Keep the :ref:`full config documentation ` handy if a config
 
    quickstart/continuous_server_backup
    quickstart/backup_to_external_disk
+   quickstart/fan_out_replication
 
 Use ``zrepl configcheck`` to validate your configuration.
 No output indicates that everything is fine.
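+
+For example, with the config file at its default location:
+
+.. code-block:: bash
+
+   # no output means the configuration is valid
+   zrepl configcheck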
diff --git a/docs/quickstart/fan_out_replication.rst b/docs/quickstart/fan_out_replication.rst
new file mode 100644
index 0000000..6c546e9
--- /dev/null
+++ b/docs/quickstart/fan_out_replication.rst
@@ -0,0 +1,112 @@
+.. include:: ../global.rst.inc
+
+.. _quickstart-fan-out-replication:
+
+Fan-out replication
+===================
+
+This quick-start example demonstrates how to implement a fan-out replication setup where datasets on a server (A) are replicated to multiple targets (B, C, etc.).
+
+This example uses multiple ``source`` jobs on server A and ``pull`` jobs on the target servers.
+
+.. WARNING::
+
+   Before implementing this setup, please see the caveats listed in the :ref:`fan-out replication configuration overview <fan-out-replication>`.
+
+Overview
+--------
+
+On the source server (A), there should be:
+
+* A ``snap`` job
+
+  * Creates the snapshots
+  * Handles the pruning of snapshots
+
+* A ``source`` job for target B
+
+  * Accepts connections from server B, and from B only
+
+* Further ``source`` jobs for each additional target (C, D, etc.)
+
+  * Each listens on a unique port
+  * Each accepts connections from its specific target only
+
+On each target server, there should be:
+
+* A ``pull`` job that connects to the corresponding ``source`` job on A
+
+  * ``keep_sender`` should keep all snapshots, since A's ``snap`` job handles the pruning
+  * ``keep_receiver`` can be configured as appropriate on each target server
+
+Generate TLS Certificates
+-------------------------
+
+Mutual TLS via the :ref:`TLS client authentication transport ` can be used to secure the connections between the servers. In this example, a self-signed certificate is created for each server without setting up a CA.
+
+.. code-block:: bash
+
+   source=a.example.com
+   targets=(
+       b.example.com
+       c.example.com
+       # ...
+   )
+
+   for server in "${source}" "${targets[@]}"; do
+       openssl req -x509 -sha256 -nodes \
+           -newkey rsa:4096 \
+           -days 365 \
+           -keyout "${server}.key" \
+           -out "${server}.crt" \
+           -addext "subjectAltName = DNS:${server}" \
+           -subj "/CN=${server}"
+   done
+
+   # Distribute each host's keypair
+   for server in "${source}" "${targets[@]}"; do
+       ssh root@"${server}" mkdir -p /etc/zrepl
+       scp "${server}".{crt,key} root@"${server}":/etc/zrepl/
+   done
+
+   # Distribute target certificates to the source
+   scp "${targets[@]/%/.crt}" root@"${source}":/etc/zrepl/
+
+   # Distribute source certificate to the targets
+   for server in "${targets[@]}"; do
+       scp "${source}.crt" root@"${server}":/etc/zrepl/
+   done
+
+Configure source server A
+-------------------------
+
+.. literalinclude:: ../../config/samples/quickstart_fan_out_replication_source.yml
+
+Configure each target server
+----------------------------
+
+.. literalinclude:: ../../config/samples/quickstart_fan_out_replication_target.yml
+
+Once both sides are configured, see the verification sketch at the end of this page.
+
+Go Back To Quickstart Guide
+---------------------------
+
+:ref:`Click here ` to go back to the quickstart guide.
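+
+Verify the Setup
+----------------
+
+The following is a minimal verification sketch; it assumes a systemd-based install with a service named ``zrepl`` (adjust to your platform). The job name ``source_a`` is the ``pull`` job from the target sample config above.
+
+.. code-block:: bash
+
+   # pick up the new configuration
+   systemctl restart zrepl
+
+   # live view of all jobs: replication, snapshotting, pruning, errors
+   zrepl status
+
+   # on a target server: trigger an immediate run of the pull job
+   # instead of waiting for the next 10m interval
+   zrepl signal wakeup source_a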