Commit Graph

126 Commits

Author SHA1 Message Date
Zeyad Tamimi
fe8bc50527 Fixed docs compilation errors 2024-12-08 14:14:42 -08:00
Zeyad Tamimi
4533918d90 Factored out include documentation into a separate file 2024-12-08 13:59:05 -08:00
Zeyad Tamimi
38731a6810 Incorporated documentation feedback
As per some of the discussion items I modified the proposal to have a
special top level includes key as well as added references to handling
of specific file includes.
2024-12-08 13:28:15 -08:00
Zeyad Tamimi
50cc2fdb77 Added distributed job YAML file support
**Note: This is a WIP PR that presently only adds the documentation for
the features so that proposal can be discussed.**

Background
==========

While trying to use zrepl in my ansible driven home lab deployment I ran
into an interesting problem. My ZFS based servers subscribe to different
sometimes overlapping roles.

Example role distribution between two servers:
```
serverA:
  - common
  - web
  - file

serverB:
  - common
  - git
  - file
```

Each role wants to create and manage a ZFS dataset with its own
replication / backup policies:
   - web: pool/web
   - git: pool/git
   - file: pool/file

At present, creating a ZFS dataset from each role is easy, and so is
creating the basic zrepl configuration file from the "common" role.

However, when each role tries to register its job(s) into the single
zrepl configuration file, things get tricky.

I could try adding a role at the end that hardcodes the datasets that
need to be backed up but that seems a bit hacky.

I could also use ansible's `lineinfile` task to try to idempotently add
each dataset's snapshot jobs to the zrepl configuration file, but that
causes a problem:

Every time the "common" role runs, the basic zrepl configuration file
gets re-created, causing the web, git, and file roles to all register
changes as they have to re-insert all jobs back into the single
configuration file.

Proposed Solution
=================

The proposed solution is to allow zrepl job definitions to be
distributed across multiple YAML files that can be included from the
main zrepl configuration file.

```
global: ...
jobs:
  include: jobs.d
```

This directive would only be accepted in the main configuration file
and is mutually exclusive with any other job definitions in the file.

To keep things lean, no conflict resolution will be provided to users:
job names must be unique across all included job YAML files.

With this feature, the above problem becomes much simpler:
   - Common: Sets up the global zrepl configuration and the include
     directive
   - web/git/file: Each manage their own datasets and create their
                   jobs.d/web.yml, jobs.d/git.yml, and jobs.d/file.yml.
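
The uniqueness rule described above could be enforced with a check along
the following lines. This is a minimal sketch in Python for illustration
only, not the implementation proposed for zrepl; the helper name
`merge_included_jobs` and the idea of passing in already-parsed job lists
keyed by file name are assumptions.

```python
def merge_included_jobs(jobs_by_file):
    """Merge job lists from several included files into one list.

    jobs_by_file maps a file name (e.g. "jobs.d/web.yml") to the list of
    job dicts parsed from that file. Hypothetical helper, not zrepl code.
    """
    merged = []
    defined_in = {}  # job name -> file that first defined it
    for file_name in sorted(jobs_by_file):  # deterministic order
        for job in jobs_by_file[file_name]:
            name = job["name"]
            if name in defined_in:
                # No conflict resolution: a duplicate name is an error.
                raise ValueError(
                    f"job {name!r} in {file_name} "
                    f"already defined in {defined_in[name]}"
                )
            defined_in[name] = file_name
            merged.append(job)
    return merged


jobs = merge_included_jobs({
    "jobs.d/web.yml": [{"name": "web", "type": "snap"}],
    "jobs.d/git.yml": [{"name": "git", "type": "snap"}],
})
print([job["name"] for job in jobs])
```

Failing on the first duplicate keeps the behavior predictable: no role
can silently override a job registered by another role.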
2024-12-01 10:37:20 -08:00
Christian Schwarz
b9b9ad10cf
snapshotting: ability to specify timestamp location != UTC (#801)
This PR adds a new optional field `timestamp_location` that allows
the user to specify a timezone different than the default UTC for use in
the snapshot suffix.

I took @mjasnik 's PR https://github.com/zrepl/zrepl/pull/785 and
refactored+extended it as follows:
* move all formatting logic into its own package
* disallow `dense` and `human` with formats != UTC to protect users from
stupidity
* document behavior more clearly
* regression test for existing users
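
The restriction on `dense` and `human` can be pictured as a small
validation step. The sketch below is in Python for illustration only and
is not zrepl's code; the helper name `check_timestamp_settings` is
hypothetical, and it reads "formats != UTC" as "a `timestamp_location`
other than UTC", which is an assumption. The format name `iso-8601` and
the location `Europe/Berlin` are example values.

```python
def check_timestamp_settings(timestamp_format, timestamp_location="UTC"):
    """Reject `dense` and `human` when the location is not UTC.

    Hypothetical helper: returns the accepted (format, location) pair
    or raises ValueError for a disallowed combination.
    """
    if timestamp_location != "UTC" and timestamp_format in ("dense", "human"):
        raise ValueError(
            f"timestamp_format {timestamp_format!r} requires "
            f"timestamp_location UTC, got {timestamp_location!r}"
        )
    return timestamp_format, timestamp_location


# Existing users who never set a location keep the UTC default.
print(check_timestamp_settings("dense"))
# A different format combined with a non-UTC location is accepted.
print(check_timestamp_settings("iso-8601", "Europe/Berlin"))
```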
2024-10-18 15:12:41 +02:00
wxiaoguang
affe00aefe
docs: draw attention to risks of not_replicated (#810)
Co-authored-by: Christian Schwarz <me@cschwarz.com>
2024-09-05 23:56:59 +02:00
Denis Shaposhnikov
27012e5623
Allow same root_fs for different jobs: sinks and so on (#752)
Because some jobs add client identity to root_fs and other jobs don't do
that, we can't reliably detect overlapping filesystems. At the same
time, we need the ability to use equal or overlapping root_fs for
different jobs. For instance, see this config:

```
  - name: "zdisk"
    type: "sink"
    root_fs: "zdisk/zrepl"
    serve:
      type: "local"
      listener_name: "zdisk"
```
and
```
  - name: "remote-to-zdisk"
    type: "pull"
    connect:
      type: "tls"
    root_fs: "zdisk/zrepl/remote"
```

As you can see, the two jobs have overlapping root_fs, but the datasets
do not actually overlap, because job `zdisk` saves everything under
`zdisk/zrepl/localhost`, since it adds the client identity. So they
actually use two different filesystems: `zdisk/zrepl/localhost` and
`zdisk/zrepl/remote`. We can't detect this situation during the config
check. So let's just remove this check, because it's the admin's duty
to configure correct root_fs's.
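
The difference between the configured and the effective paths can be
sketched as follows. This is Python for illustration only, not the check
that was removed from zrepl; the helpers `overlaps` and `effective_root`
are hypothetical, and the client identity `localhost` is taken from the
example above.

```python
def overlaps(a, b):
    """True if dataset a equals b or one is an ancestor of the other."""
    parts_a, parts_b = a.split("/"), b.split("/")
    shorter = min(len(parts_a), len(parts_b))
    return parts_a[:shorter] == parts_b[:shorter]


def effective_root(root_fs, client_identity=None):
    """Path a job actually writes below (hypothetical helper)."""
    return f"{root_fs}/{client_identity}" if client_identity else root_fs


# What a config-time check sees: the configured root_fs values overlap.
configured = overlaps("zdisk/zrepl", "zdisk/zrepl/remote")

# What happens at runtime: the sink job appends the client identity.
effective = overlaps(
    effective_root("zdisk/zrepl", "localhost"),
    effective_root("zdisk/zrepl/remote"),
)
print(configured, effective)
```

The client identity is only known once a client connects, which is why a
purely static comparison of root_fs values reports a false conflict here.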

---------

Co-authored-by: Christian Schwarz <me@cschwarz.com>
2023-11-01 00:12:54 +01:00
InsanePrawn
1a72edea5d docs/jobs: add replication- conflict_resolution-options to active job types 2023-01-26 00:09:28 +01:00
Christian Schwarz
6be133f55d remove unused JobDebugSettings along with docs
For this kind of debugging, we switched to env vars a while ago.
For example, ZREPL_RPC_DEBUG.

I don't think we have a substitute for the RPCLog stuff.
However, NetConnLogger is still in the codebase.

obsoletes https://github.com/zrepl/zrepl/pull/661
2022-12-22 18:13:45 +01:00
Christian Schwarz
5ffd470596 docs: update comment on overriding mountpoint properties during zfs recv of ZVOLs
fixes https://github.com/zrepl/zrepl/issues/430
2022-12-10 12:53:24 +01:00
Christian Schwarz
a3379d6785 docs: finalize 0.6 changelog 2022-10-27 00:19:06 +02:00
Yannick Dylla
1da8f848f2 snapper: support custom timestamp format
fixes https://github.com/zrepl/zrepl/issues/465
closes https://github.com/zrepl/zrepl/pull/639
2022-10-27 00:19:06 +02:00
Christian Schwarz
c743c7b03f refactor snapper & support cron-based snapshotting
fixes https://github.com/zrepl/zrepl/issues/554
refs https://github.com/zrepl/zrepl/discussions/547#discussioncomment-1936126
2022-09-25 19:23:44 +02:00
Christian Schwarz
206d359dcd docs: sendrecvoptions: fix heading level for section on placeholders 2022-09-25 18:23:54 +02:00
jtagcat
c7771f98f5 docs: improve overview
There were and still are too many words. It has a very white-paper vibe.
Docs need to be briefer, more exact, and on point.

closes https://github.com/zrepl/zrepl/pull/618
2022-07-31 15:50:53 +02:00
jtagcat
299f1c906e docs: overview: clarify configs _are_ ordered
Previously, the unordered list and the phrase 'are considered'
left it unclear whether one or all files are 'considered'.
In reality, the first valid one is used, so an ordered list and
perhaps better wording communicate this fact.

refs https://github.com/zrepl/zrepl/pull/618
2022-07-31 15:33:23 +02:00
Christian Schwarz
53f9bd6d88 docs: update CLI usage to --mode raw & remove outdated "Limitations" section
fixes https://github.com/zrepl/zrepl/issues/609
2022-06-28 00:17:34 +02:00
JMoVS
43c2a0d9b0 docs: clarity on the section that covers more complex setups
closes https://github.com/zrepl/zrepl/pull/596
2022-06-27 22:41:12 +02:00
Christian Schwarz
2642c64303 make initial replication policy configurable (most_recent, all, fail)
Config:

```
- type: push
  ...
  conflict_resolution:
    initial_replication: most_recent | all | fail
```

The ``initial_replication`` option determines which snapshots zrepl
replicates if the filesystem has not been replicated before.
If ``most_recent`` (the default), the initial replication will only
transfer the most recent snapshot, while ignoring previous snapshots.
If all snapshots should be replicated, specify ``all``.
Use ``fail`` to make replication of the filesystem fail in case
there is no corresponding filesystem on the receiver.
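
The three policies can be summarized in a few lines. The sketch below is
in Python for illustration only and is not zrepl's implementation; the
function name `snapshots_for_initial_replication` is hypothetical, and it
assumes the sender's snapshots are given oldest first.

```python
def snapshots_for_initial_replication(snapshots, policy="most_recent"):
    """Pick the snapshots to send when the receiver has no filesystem yet.

    snapshots: snapshot names ordered oldest first (assumption).
    """
    if policy == "most_recent":
        return snapshots[-1:]  # only the newest snapshot
    if policy == "all":
        return list(snapshots)  # full history
    if policy == "fail":
        raise RuntimeError("filesystem has not been replicated before")
    raise ValueError(f"unknown initial_replication policy {policy!r}")


snaps = ["zrepl_1", "zrepl_2", "zrepl_3"]
print(snapshots_for_initial_replication(snaps))
print(snapshots_for_initial_replication(snaps, "all"))
```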

Code-Level Changes, apart from the obvious:
- Rework IncrementalPath()'s return signature.
  Now returns an error for initial replications as well.
- Rename & rework its consumer, resolveConflict().

Co-authored-by: Graham Christensen <graham@grahamc.com>

Fixes https://github.com/zrepl/zrepl/issues/550
Fixes https://github.com/zrepl/zrepl/issues/187
Closes https://github.com/zrepl/zrepl/pull/592
2022-06-26 14:36:59 +02:00
JMoVS
1acafabb5b docs: Fix typo in disjoing to disjoint
Signed-off-by: Justin Scholz <git@justinscholz.de>
2022-05-07 22:13:56 +02:00
Christian Schwarz
459508c9d9 docs: sendrecvoptions: placeholders: fix wrong link name and add summarizing config snippet for recv.placeholders
fixes https://github.com/zrepl/zrepl/issues/573
2022-02-05 10:59:33 +01:00
Andrew Gunnerson
556fac3002 docs: document fan-out replication & add quick-start guide
closes https://github.com/zrepl/zrepl/pull/552
fixes https://github.com/zrepl/zrepl/issues/551

Signed-off-by: Andrew Gunnerson <chillermillerlong@hotmail.com>
Co-authored-by: Christian Schwarz <me@cschwarz.com>
2022-01-09 12:45:09 +01:00
Christian Schwarz
fb6a9be954 fix encrypt-on-receive with placeholders
fixes https://github.com/zrepl/zrepl/issues/504

Problem:
  plain send + recv with root_fs encrypted + placeholders causes plain recvs
  whereas user would expect encrypt-on-recv
Reason:
  We create placeholder filesystems with -o encryption=off.
  Thus, children received below those placeholders won't inherit
  encryption of root_fs.
Fix:
  We'll have three values for `recv.placeholders.encryption: unspecified (default) | off | inherit`.
  When we create a placeholder, we will fail the operation if  `recv.placeholders.encryption = unspecified`.
  The exception is if the placeholder filesystem is to encode the client identity ($root_fs/$client_identity) in a pull job.
  Those are created in `inherit` mode if the config field is `unspecified` so that users who don't need
  placeholders are not bothered by these details.
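
  The decision described in the fix can be written down as a small
  function. This is a sketch in Python for illustration only, not zrepl's
  code; the function name `placeholder_encryption` and its parameter
  names are hypothetical.

```python
def placeholder_encryption(configured="unspecified",
                           encodes_client_identity=False):
    """Return the encryption mode to use when creating a placeholder.

    configured: value of recv.placeholders.encryption.
    encodes_client_identity: True for the $root_fs/$client_identity
    placeholder, which is the exception described above.
    """
    if configured in ("off", "inherit"):
        return configured  # explicit user choice wins
    if configured != "unspecified":
        raise ValueError(f"invalid value {configured!r}")
    if encodes_client_identity:
        return "inherit"  # exception: no explicit setting required
    # Any other placeholder needs an explicit decision from the user.
    raise RuntimeError("recv.placeholders.encryption must be set")


print(placeholder_encryption("inherit"))
print(placeholder_encryption(encodes_client_identity=True))
```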

Future Work:
  Automatically warn existing users of encrypt-on-recv about the problem
  if they are affected.
  The problem that I hit during implementation of this is that the
  `encryption` prop's `source` doesn't quite behave like other props:
  `source` is `default` for `encryption=off` and `-` when `encryption=on`.
  Hence, we can't use `source` to distinguish the following 2x2 cases:
  (1) placeholder created with explicit -o encryption=off
  (2) placeholder created without specifying -o encryption
  with
  (A) an encrypted parent at creation time
  (B) an unencrypted parent at creation time
2021-12-18 15:12:47 +01:00
Samy Mahmoudi
1850a332ed docs: prune: improve docs for 'grid' rule
- Substitute full words for both string name 'gridspec' and short form 'grid spec'
- Fix alignment and make spacing more consistent
- Fix fall of snapshots into buckets for the example to really reflect right-exclusiveness

closes https://github.com/zrepl/zrepl/pull/535
2021-11-14 17:34:32 +01:00
Christian Schwarz
20ff9717bc fix mis-spelled send option for embedded data
fixes https://github.com/zrepl/zrepl/issues/522
2021-11-14 17:34:32 +01:00
Christian Schwarz
1f0f2f8569 pruner + docs: less confusing type names, some comments, better docs for keep: not_replicated
fixes https://github.com/zrepl/zrepl/issues/524
2021-10-10 21:11:38 +02:00
Christian Schwarz
f5f269bfd5 send/recv: job-level bandwidth limiting
Sponsored-by: Prominic.NET, Inc.

fixes #339
2021-09-12 20:08:43 +02:00
Christian Schwarz
009bd410af docs: prune: improve grid example 2021-07-08 19:46:24 +02:00
Lapo Luchini
3b5a1a8b9a docs/monitoring: change suggested prometheus port to 9811
Change to 9811 as registered with the prometheus project now.

Closes #444.
2021-03-28 18:18:02 +02:00
InsanePrawn
b2c6e51a43 client/signal: Revert "add signal 'snapshot', rename existing signal 'wakeup' to 'replication'"
This was merged to master prematurely as the job components are not decoupled well enough
for these signals to be useful yet.

This reverts commit 2c8c2cfa14.

closes #452
2021-03-25 22:26:17 +01:00
InsanePrawn
8d678eed19 docs: add note about zfs recv -x mountpoint with ZVOLs
refs #430
2021-03-14 20:26:39 +01:00
Calistoc
2c8c2cfa14 add signal 'snapshot', rename existing signal 'wakeup' to 'replication' 2021-03-14 18:16:23 +01:00
Christian Schwarz
0ceea1b792 replication: simplify parallel replication variables & expose them in config
closes #140
2021-03-14 17:30:10 +01:00
InsanePrawn
393fc10a69 [#285] support setting zfs send / recv flags in the config (send: -wLcepbS, recv: -ox)
Co-authored-by: Christian Schwarz <me@cschwarz.com>
Signed-off-by: InsanePrawn <insane.prawny@gmail.com>

closes #285
closes #276
closes #24
2021-02-20 17:20:45 +01:00
Rafał Bugajewski
96d5288667
docs: fix typos 2020-12-17 12:00:29 +01:00
Jeremy Bryan Smith
bb5ef0c8b2 docs: fix link to template.sh sample hook file 2020-11-01 10:45:17 +01:00
Christian Schwarz
8839ed1f95 docs: update multi-job & multi-host setup section 2020-09-05 17:45:18 +02:00
Christian Schwarz
41b4038ad5 docs: add example setup 'local disk backup' to jobs overview table 2020-09-05 17:44:46 +02:00
Christian Schwarz
b1f8cdf385 [#373] pruning: add optional regex field to last_n rule
fixes #373
2020-09-02 22:45:44 +02:00
Christian Schwarz
428a60870a pruning: cleanup retention grid impl + tests + correct docs
package is now at 95% code coverage and the additional tests codify
all behavior specified in the docs

There is a slight change in behavior:
Intervals are now [duration) instead of (duration].
If the leftmost interval is not keep=all, the most recently created
snapshot will be destroyed if there are other snapshots within
that first interval.
Since we recommend keep=all all over the docs, and zrepl 0.3
will put holds on that snapshot if it is being replicated,
I feel like this is an acceptable change in behavior.

refs #292
fixup of 0bbe2befce
2020-09-02 22:45:44 +02:00
Christian Schwarz
7f1695c457 docs: transport: fix easyrsa script (fixup of 6b4c6fc) 2020-08-23 20:36:43 +02:00
Christian Schwarz
6b4c6fc062 [#357] docs: update quickstart + tls transport to produce keypairs with subject alternative names
fixes #357
2020-08-22 03:05:30 +02:00
InsanePrawn
0bbe2befce docs: prune: add prune interval visualisation
fixes #122

Co-Authored-By: Christian Schwarz <me@cschwarz.com>

Signed-off-by: InsanePrawn <insane.prawny@gmail.com>
2020-08-21 22:05:05 +02:00
Christian Schwarz
30cdc1430e replication + endpoint: replication guarantees: guarantee_{resumability,incremental,nothing}
This commit

- adds a configuration in which no step holds, replication cursors, etc. are created
- removes the send.step_holds.disable_incremental setting
- creates a new config option `replication` for active-side jobs
- adds the replication.protection.{initial,incremental} settings, each
  of which can have values
    - `guarantee_resumability`
    - `guarantee_incremental`
    - `guarantee_nothing`
  (refer to docs/configuration/replication.rst for semantics)

The `replication` config from an active side is sent to both endpoint.Sender and endpoint.Receiver
for each replication step. Sender and Receiver then act accordingly.

For `guarantee_incremental`, we add the new `tentative-replication-cursor` abstraction.
The necessity for that abstraction is outlined in https://github.com/zrepl/zrepl/issues/340.

fixes https://github.com/zrepl/zrepl/issues/340
2020-07-26 20:32:35 +02:00
Brian Candler
dbc8bbeb6a docs: config: prune: example: keep manual snapshots on receiver
Fixes #335
closes #336

Signed-off-by: Christian Schwarz <me@cschwarz.com>
2020-06-22 12:32:03 +02:00
Christian Schwarz
a827894274 docs: add backup-to-external-disk quick-start guide and convert existing tutorial to quick-start guide
refs #219
fixes #329
2020-06-14 15:26:05 +02:00
Christian Schwarz
1c270b7e39 add option to disable step holds for incremental sends
This is a stop-gap solution until we re-write the pruner to support
rules for removing step holds.

Note that disabling step holds for incremental sends does not affect
zrepl's guarantee that incremental replication is always possible:

Suppose you yank the external drive during an incremental @from -> @to step:

* restarting that step or future incrementals @from -> @to_later will be possible
  because the replication cursor bookmark points to @from until the step is complete
* resuming @from -> @to will work as long as the pruner on your internal pool doesn't come around to destroy @to.
    * in that case, the replication algorithm should determine that the resumable state
      on the receiving side is useless because @to no longer exists on the sending side,
      and consequently clear it, and restart an incremental step @from -> @to_later

refs #288
2020-06-14 15:26:05 +02:00
Christian Schwarz
1b39e9d03c docs: update & extend replication overview wrt step holds + bookmarks 2020-06-14 15:21:36 +02:00
Christian Schwarz
655a2e5404 docs/configuration/overview.rst: fix wrong headline hierarchy 2020-06-14 15:21:36 +02:00
Bruce Smith
2fbd9d8f8c transport/tcp: support for CIDR-mask based ACLs + client-identities
Co-authored-by: Christian Schwarz <me@cschwarz.com>

fixes #235
close #265
2020-05-15 21:17:01 +02:00