docs: add warning on replication lag & retention grid.

This commit is contained in:
Christian Schwarz 2017-11-04 13:04:32 +01:00
parent 1f266d02ce
commit ff10a71f3a
2 changed files with 15 additions and 0 deletions


@@ -37,6 +37,11 @@ This is temporary and being worked on {{< zrepl-issue 24 >}}.
A source job is the counterpart to a [pull job]({{< relref "#pull" >}}).
Note that the prune policy determines the maximum replication lag:
a pull job may stop replication due to link failure, misconfiguration or administrative action.
The source prune policy will eventually destroy the last common snapshot between source and pull job, requiring full replication.
Make sure you read the [prune policy documentation]({{< relref "configuration/prune.md" >}}).
Example: {{< sampleconflink "pullbackup/productionhost.yml" >}}
### Pull


@@ -48,3 +48,13 @@ The following procedure happens during pruning:
1. the contained snapshot list is sorted by creation.
1. snapshots from the list, oldest first, are destroyed until the specified `keep` count is reached.
1. all remaining snapshots on the list are kept.
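The per-interval procedure above can be sketched as follows. This is an illustrative Python sketch, not zrepl's actual implementation; the names `Snapshot` and `prune_interval` are hypothetical:

```python
from dataclasses import dataclass

@dataclass
class Snapshot:
    name: str
    creation: int  # creation timestamp, e.g. unix epoch seconds

def prune_interval(snapshots, keep):
    """Return (kept, destroyed) for the snapshots of one retention-grid interval."""
    # 1. sort the interval's snapshot list by creation time
    ordered = sorted(snapshots, key=lambda s: s.creation)
    # a keep=all interval never destroys anything
    if keep == "all":
        return ordered, []
    # 2. destroy oldest-first until only `keep` snapshots remain
    n_destroy = max(0, len(ordered) - keep)
    # 3. everything else on the list is kept
    return ordered[n_destroy:], ordered[:n_destroy]
```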
{{% notice info %}}
The configuration of the first interval (`1x1h(keep=all)` in the example) determines the **maximum allowable replication lag** between source and destination.
After the first interval, source and destination likely have different retention settings.
This means source and destination may prune different snapshots, preventing incremental replication from snapshots that are not in the first interval.
**Always** configure the first interval to **`1x?(keep=all)`**, substituting `?` with the maximum time replication may fail due to downtimes, maintenance, connectivity issues, etc.
After outages longer than `?` you may be required to perform **full replication** again.
{{% / notice %}}
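As a hedged illustration, a grid prune policy following this advice might look like the fragment below, here with `?` = 24h. Field names follow the 2017-era sample configs and may differ in other zrepl versions:

```yaml
prune:
  policy: grid
  # first interval keeps all snapshots for 24h, bounding the allowable replication lag
  grid: 1x24h(keep=all) | 24x1h | 35x1d | 6x30d
```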