.. zrepl documentation master file, created by
   sphinx-quickstart on Wed Nov 8 22:28:10 2017.
   You can adapt this file completely to your liking, but it should at least
   contain the root `toctree` directive.

.. include:: global.rst.inc

|GitHub license| |Language: Go| |Twitter| |Donate via Patreon| |Donate via GitHub Sponsors| |Donate via Liberapay| |Donate via PayPal|

zrepl - ZFS replication
-----------------------
**zrepl** is a one-stop, integrated solution for ZFS replication.

.. raw:: html

   <div style="margin-bottom: 1em; background: #2e3436; color: white; font-size: 0.8em; min-height: 6em; max-width: 100%; overflow: auto;">
   <pre>
   Job: prod_to_backups
       Type: push
       Replication:
           Attempt #1
           Status: fan-out-filesystems
           Progress: [=========================\----] 246.7 MiB / 264.7 MiB @ 11.5 MiB/s
             zroot              STEPPING (step 1/2, 624 B/1.2 KiB) next: @a => @b
             zroot/ROOT         DONE (step 2/2, 1.2 KiB/1.2 KiB)
             zroot/ROOT/default STEPPING (step 1/2, 123.4 MiB/129.3 MiB) next: @a => @b
             zroot/tmp          STEPPING (step 1/2, 29.9 KiB/44.2 KiB) next: @a => @b
             zroot/usr          STEPPING (step 1/2, 624 B/1.2 KiB) next: @a => @b
             zroot/usr/home     STEPPING (step 1/2, 123.3 MiB/135.3 MiB) next: @a => @b
             zroot/var          STEPPING (step 1/2, 624 B/1.2 KiB) next: @a => @b
             zroot/var/audit    DONE (step 2/2, 1.2 KiB/1.2 KiB)
             zroot/var/crash    DONE (step 2/2, 1.2 KiB/1.2 KiB)
             zroot/var/log      STEPPING (step 1/2, 22.0 KiB/29.2 KiB) next: @a => @b
             zroot/var/mail     STEPPING (step 1/2, 624 B/1.2 KiB) next: @a => @b
       Pruning Sender:
           ...
       Pruning Receiver:
   </pre>
   </div>

Getting started
~~~~~~~~~~~~~~~
The :ref:`10 minute quick-start guides <quickstart-toc>` give you a first impression.

Main Features
~~~~~~~~~~~~~
* **Filesystem replication**

    * [x] Pull & Push mode
    * [x] Multiple :ref:`transport modes <transport>`: TCP, TCP + TLS client auth, SSH

    * Advanced replication features

        * [x] Automatic retries for temporary network errors
        * [x] Automatic resumable send & receive
        * [x] Automatic ZFS holds during send & receive
        * [x] Automatic bookmark & hold management for guaranteed incremental send & recv
        * [x] Encrypted raw send & receive to untrusted receivers (OpenZFS native encryption)
        * [x] Properties send & receive
        * [x] Compressed send & receive
        * [x] Large blocks send & receive
        * [x] Embedded data send & receive
        * [x] Resume state send & receive

* **Automatic snapshot management**

    * [x] Periodic :ref:`filesystem snapshots <job-snapshotting-spec>`
    * [x] Support for :ref:`pre- and post-snapshot hooks <job-snapshotting-hooks>` with builtins for MySQL & Postgres
    * [x] Flexible :ref:`pruning rule system <prune>`

        * [x] Age-based fading (grandfathering scheme)
        * [x] Bookmarks to avoid divergence between sender and receiver

* **Sophisticated Monitoring & Logging**

    * [x] Live progress reporting via ``zrepl status`` :ref:`subcommand <usage>`
    * [x] Comprehensive, structured :ref:`logging <logging>`

        * ``human``, ``logfmt`` and ``json`` formatting
        * stdout, syslog and TCP (+TLS client auth) outlets

    * [x] Prometheus :ref:`monitoring <monitoring>` endpoint

* **Maintainable implementation in Go**

    * [x] Cross platform
    * [x] Dynamic feature checking
    * [x] Type safe & testable code

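To make the feature list above more concrete, here is a minimal sketch of what a push job combining periodic snapshots, a TCP transport, and pruning rules might look like.
The job name is borrowed from the status output at the top of this page.
The receiver address, the dataset filter, and all retention values are placeholders chosen for illustration, not recommendations.
Treat the keys as an approximation and check them against the configuration chapter for the zrepl version you run.

.. code-block:: yaml

    # Illustrative sketch only: host name, dataset filter and retention
    # values are assumptions, not taken from a real deployment.
    jobs:
    - name: prod_to_backups
      type: push
      connect:
        type: tcp
        address: "backups.example.com:8888"  # placeholder receiver address
      filesystems:
        "zroot<": true                       # zroot and all of its children
      snapshotting:
        type: periodic
        prefix: zrepl_
        interval: 10m
      pruning:
        keep_sender:
        - type: not_replicated               # never prune what has not been sent yet
        - type: last_n
          count: 10
        keep_receiver:
        - type: grid                         # age-based fading on the receiving side
          grid: 1x1h(keep=all) | 24x1h | 30x1d
          regex: "^zrepl_"

The quick-start guides walk through complete, tested configurations, including the matching job on the receiving side.
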
.. ATTENTION::

    zrepl as well as this documentation is still under active development.
    There is no stability guarantee on the RPC protocol or configuration format,
    but we do our best to document breaking changes in the :ref:`changelog`.

Contributing
~~~~~~~~~~~~
We are happy about any help we can get!

* :ref:`Financial Support <supporters>`
* Explore the codebase

  * These docs live in the ``docs/`` subdirectory

* Document any non-obvious / confusing / plain broken behavior you encounter when setting up zrepl for the first time
* Check the *Issues* and *Projects* sections for things to do.
  The `good first issues <https://github.com/zrepl/zrepl/labels/good%20first%20issue>`_ and `docs <https://github.com/zrepl/zrepl/labels/docs>`_ are suitable starting points.

.. admonition:: Development Workflow
    :class: note

    The `GitHub repository <https://github.com/zrepl/zrepl>`_ is where all development happens.
    Make sure to read the `Developer Documentation section <https://github.com/zrepl/zrepl>`_ and open new issues or pull requests there.

Table of Contents
~~~~~~~~~~~~~~~~~
.. toctree::
   :maxdepth: 2
   :caption: Contents:

   quickstart
   installation
   configuration
   usage
   pr
   changelog
   GitHub Repository & Issue Tracker <https://github.com/zrepl/zrepl>
   supporters