mirror of
https://github.com/zrepl/zrepl.git
synced 2024-12-21 22:51:09 +01:00
docs: initial port of hugo to sphinx, including rtd theme
This commit is contained in:
parent
c3af267f48
commit
df181108b4
@@ -84,7 +84,7 @@ todo_include_todos = True
 # The theme to use for HTML and HTML Help pages. See the documentation for
 # a list of builtin themes.
 #
-html_theme = 'alabaster'
+html_theme = 'sphinx_rtd_theme'
 
 # Theme options are theme-specific and customize the look and feel of a theme
 # further. For a list of options available for each theme, see the
13
docs/configuration.rst
Normal file
@@ -0,0 +1,13 @@
*************
Configuration
*************

.. toctree::

   configuration/jobs
   configuration/transports
   configuration/map_filter_syntax
   configuration/prune
   configuration/logging
   configuration/misc
129
docs/configuration/jobs.rst
Normal file
@@ -0,0 +1,129 @@
Job Types
=========

A *job* is the unit of activity tracked by the zrepl daemon and configured in the [configuration file]({{< relref "install/_index.md#configuration-files" >}}).

Every job has a unique `name`, a `type` and type-dependent fields which are documented on this page.

Check out the [tutorial]({{< relref "tutorial/_index.md" >}}) and {{< sampleconflink >}} for examples of how job types are actually used.

.. ATTENTION::

   Currently, zrepl does not replicate filesystem properties.
   When receiving a filesystem, it is never mounted (`-u` flag) and `mountpoint=none` is set.
   This is temporary and being worked on, see {{< zrepl-issue 24 >}}.
Source Job
----------

::

   |Parameter|Default|Description / Example|
   |-----|-------|-------|
   |`type`||`source`|
   |`name`||unique name of the job|
   |`serve`||{{< zrepl-transport "serve transport" >}} specification|
   |`datasets`||{{< zrepl-filter >}} for datasets to expose to client|
   |`snapshot_prefix`||prefix for ZFS snapshots taken by this job|
   |`interval`||snapshotting interval|
   |`prune`||{{< zrepl-prune >}} policy for datasets in `datasets` with prefix `snapshot_prefix`|
* Snapshotting Task (every `interval`, {{% zrepl-job-patient %}})

  1. A snapshot of filesystems matched by `datasets` is taken every `interval` with prefix `snapshot_prefix`.
  2. The `prune` policy is triggered on datasets matched by `datasets` with snapshots matched by `snapshot_prefix`.

* Serve Task

  * Wait for connections from pull jobs using `serve`.

A source job is the counterpart to a [pull job]({{< relref "#pull" >}}).

Note that the prune policy determines the maximum replication lag:
a pull job may stop replication due to link failure, misconfiguration or administrative action.
The source prune policy will eventually destroy the last common snapshot between source and pull job, requiring full replication.
Make sure you read the [prune policy documentation]({{< relref "configuration/prune.md" >}}).

Example: {{< sampleconflink "pullbackup/productionhost.yml" >}}
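Putting the parameter table together, a source job definition might look like the following sketch. All values are made up for this illustration and only the field names follow the parameter table above; see the linked sample config for an authoritative version:

::

   jobs:
   - name: pull_backup
     type: source
     serve:
       type: stdinserver
       client_identity: backup-srv.example.com
     datasets: {
       "zroot/var/db<": "ok",
     }
     snapshot_prefix: zrepl_
     interval: 10m
     prune:
       policy: grid
       grid: 1x1d(keep=all) | 14x1d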
Pull Job
--------

::

   |Parameter|Default|Description / Example|
   |-----|-------|-------|
   |`type`||`pull`|
   |`name`||unique name of the job|
   |`connect`||{{< zrepl-transport "connect transport" >}} specification|
   |`interval`||interval between pull attempts|
   |`mapping`||{{< zrepl-mapping >}} for remote to local filesystems|
   |`initial_repl_policy`|`most_recent`|initial replication policy|
   |`snapshot_prefix`||prefix filter used for replication & pruning|
   |`prune`||{{< zrepl-prune >}} policy for local filesystems reachable by `mapping`|
* Main Task (every `interval`, {{% zrepl-job-patient %}})

  #. A connection to the remote source job is established using the strategy in `connect`.
  #. `mapping` maps filesystems presented by the remote side to local *target filesystems*.
  #. Those remote filesystems with a local *target filesystem* are replicated:

     #. Only snapshots with prefix `snapshot_prefix` are replicated.
     #. If possible, incremental replication takes place.
     #. If the local target filesystem does not exist, `initial_repl_policy` is used.
     #. On conflicts, an error is logged, but replication of the other mapped filesystems continues.

  #. The `prune` policy is triggered for all *target filesystems*.

A pull job is the counterpart to a [source job]({{< relref "#source" >}}).

Example: {{< sampleconflink "pullbackup/backuphost.yml" >}}
Local Job
---------

::

   |Parameter|Default|Description / Example|
   |-----|-------|-------|
   |`type`||`local`|
   |`name`||unique name of the job|
   |`mapping`||{{< zrepl-mapping >}} from source to target filesystem (both local)|
   |`snapshot_prefix`||prefix for ZFS snapshots taken by this job|
   |`interval`||snapshotting & replication interval|
   |`initial_repl_policy`|`most_recent`|initial replication policy|
   |`prune_lhs`||pruning policy on the left-hand side (source)|
   |`prune_rhs`||pruning policy on the right-hand side (target)|
* Main Task (every `interval`, {{% zrepl-job-patient %}})

  1. Evaluate `mapping` for local filesystems; those with a *target filesystem* are called *mapped filesystems*.
  2. Snapshot *mapped filesystems* with `snapshot_prefix`.
  3. Replicate *mapped filesystems* to their respective *target filesystems*:

     1. Only snapshots with prefix `snapshot_prefix` are replicated.
     2. If possible, incremental replication takes place.
     3. If the *target filesystem* does not exist, `initial_repl_policy` is used.
     4. On conflicts, an error is logged, but replication of the other *mapped filesystems* continues.

  4. The `prune_lhs` policy is triggered for all *mapped filesystems*.
  5. The `prune_rhs` policy is triggered for all *target filesystems*.

A local job is a combination of a source & pull job executed on the same machine.

Example: {{< sampleconflink "localbackup/host1.yml" >}}
Terminology
-----------

task
    A job consists of one or more tasks, and a task consists of one or more steps.
    Some tasks may be periodic while others wait for an event to occur.

patient task
    A patient task is supposed to execute its work every `interval`.
    We call the start of the task an *invocation*.

    * If the task completes in less than `interval`, the next invocation starts at `last_invocation + interval`.
    * Otherwise, a patient task

      * logs a warning as soon as the task exceeds its configured `interval`
      * waits for the last invocation to finish
      * logs a warning with the effective task duration
      * immediately starts a new invocation of the task
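The patient-task timing rules can be condensed into a one-line model. This is a sketch of the documented behavior only, not zrepl's implementation:

```python
def next_invocation_start(last_invocation, task_duration, interval):
    """When does the next invocation of a patient task start?

    If the task finishes within `interval`, the next invocation starts
    at last_invocation + interval; otherwise it starts immediately
    after the overlong task finishes (following the logged warnings).
    All arguments are in the same time unit, e.g. seconds.
    """
    finished_at = last_invocation + task_duration
    return max(last_invocation + interval, finished_at)
```

For example, a task started at t=0 that takes 15 time units against a 10-unit interval delays the next invocation to t=15.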
148
docs/configuration/logging.rst
Normal file
@@ -0,0 +1,148 @@
Logging
=======

zrepl uses structured logging to provide users with easily processable log messages.

Configuration
-------------

Logging outlets are configured in the `global` section of the [configuration file]({{< relref "install/_index.md#configuration-files" >}}).
Check out {{< sampleconflink "random/logging.yml" >}} for an example of how to configure multiple outlets:

::

   global:
     logging:

       - outlet: OUTLET_TYPE
         level: MINIMUM_LEVEL
         format: FORMAT

       - outlet: OUTLET_TYPE
         level: MINIMUM_LEVEL
         format: FORMAT

       ...

   jobs: ...
Default Configuration
~~~~~~~~~~~~~~~~~~~~~

By default, the following logging configuration is used:

::

   global:
     logging:

       - outlet: "stdout"
         level: "warn"
         format: "human"

.. ATTENTION::

   Output to **stderr** should always be considered a **critical error**.
   Only errors in the logging infrastructure itself, e.g. IO errors when writing to an outlet, are sent to stderr.
Building Blocks
---------------

The following sections document the semantics of the different log levels, formats and outlet types.

Levels
~~~~~~

::

   | Level | SHORT | Description |
   |-------|-------|-------------|
   |`error`|`ERRO` | immediate action required |
   |`warn` |`WARN` | symptoms of misconfiguration, soon-expected failure, etc. |
   |`info` |`INFO` | explains what happens without too much detail |
   |`debug`|`DEBG` | tracing information, state dumps, etc.; useful for debugging |

Incorrectly classified messages are considered a bug and should be reported.
Formats
~~~~~~~

::

   | Format | Description |
   |--------|-------------|
   |`human` | emphasizes context by putting job, task, step and other context variables into brackets before the actual message, followed by the remaining fields in logfmt style |
   |`logfmt`| [logfmt](https://brandur.org/logfmt) output. zrepl uses [github.com/go-logfmt/logfmt](https://github.com/go-logfmt/logfmt). |
   |`json`  | JSON output. Each line is a valid JSON document. Fields are marshaled by `encoding/json.Marshal()`, which is particularly useful for processing in log aggregation or when processing state dumps. |
Outlets
~~~~~~~

Outlets are ... well ... outlets for log entries into the world.

`stdout`
^^^^^^^^

::

   | Parameter | Default | Comment |
   |-----------|---------|---------|
   |`outlet`   | *none*  | required |
   |`level`    | *none*  | minimum [log level](#levels), required |
   |`format`   | *none*  | output [format](#formats), required |

Writes all log entries with minimum level `level`, formatted by `format`, to stdout.

Can only be specified once.
`syslog`
^^^^^^^^

::

   | Parameter | Default | Comment |
   |-----------|---------|---------|
   |`outlet`   | *none*  | required |
   |`level`    | *none*  | minimum [log level](#levels), required, usually `debug` |
   |`format`   | *none*  | output [format](#formats), required |
   |`retry_interval`| 0  | interval between reconnection attempts to syslog |

Writes all log entries formatted by `format` to syslog.
On normal setups, you should not need to change the `retry_interval`.

Can only be specified once.
`tcp`
^^^^^

::

   | Parameter | Default | Comment |
   |-----------|---------|---------|
   |`outlet`   | *none*  | required |
   |`level`    | *none*  | minimum [log level](#levels), required |
   |`format`   | *none*  | output [format](#formats), required |
   |`net`      | *none*  | `tcp` in most cases |
   |`address`  | *none*  | remote network address, e.g. `logs.example.com:10202` |
   |`retry_interval`| *none* | interval between reconnection attempts to `address` |
   |`tls`      | *none*  | TLS config (see below) |

Establishes a TCP connection to `address` and sends log messages with minimum level `level`, formatted by `format`.

If `tls` is not specified, an unencrypted connection is established.

If `tls` is specified, the TCP connection is secured with TLS + client authentication.
This is particularly useful in combination with log aggregation services that run on another machine.

::

   | Parameter | Description |
   |-----------|-------------|
   |`ca`  | PEM-encoded certificate authority that signed the remote server's TLS certificate |
   |`cert`| PEM-encoded client certificate identifying this zrepl daemon toward the remote server |
   |`key` | PEM-encoded, unencrypted client private key identifying this zrepl daemon toward the remote server |

.. NOTE::

   zrepl uses Go's `crypto/tls` and `crypto/x509` packages and leaves all but the required fields in `tls.Config` at their default values.
   In case of a security defect in these packages, zrepl has to be rebuilt because Go binaries are statically linked.
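Putting the two tables above together, a TCP outlet with TLS might be configured like the following sketch. The address is taken from the table; the certificate file paths are invented for illustration:

::

   global:
     logging:
       - outlet: tcp
         level: debug
         format: json
         net: tcp
         address: logs.example.com:10202
         retry_interval: 10s
         tls:
           ca: /etc/zrepl/logs-ca.crt
           cert: /etc/zrepl/logs-client.crt
           key: /etc/zrepl/logs-client.key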
101
docs/configuration/map_filter_syntax.rst
Normal file
@@ -0,0 +1,101 @@
Mapping & Filter Syntax
=======================

For various job types, a filesystem `mapping` or `filter` needs to be specified.

Both have in common that they take a filesystem path (in the ZFS filesystem hierarchy) as a parameter and return something.
Mappings return a *target filesystem* and filters return a *filter result*.

The pattern syntax is the same for mappings and filters and is documented in the following section.

Common Pattern Syntax
---------------------

A mapping / filter is specified as a **YAML dictionary** with patterns as keys and results as values.
The following rules determine which result is chosen for a given filesystem path:

* More specific path patterns win over less specific ones
* Non-wildcard patterns (full path patterns) win over *subtree wildcards* (`<` at end of pattern)

The **subtree wildcard** `<` means "*the dataset left of `<` and all its children*".
Example
~~~~~~~

::

   # Rule number and its pattern
   1: tank<          # tank and all its children
   2: tank/foo/bar   # full path pattern (no wildcard)
   3: tank/foo<      # tank/foo and all its children

   # Which rule applies to a given path?
   tank/foo/bar/loo => 3
   tank/bar         => 1
   tank/foo/bar     => 2
   zroot            => NO MATCH
   tank/var/log     => 1
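The precedence rules can be captured in a short sketch. This models the documented rules only; it is not zrepl's actual matching code:

```python
def match(rules, path):
    """Return the result of the most specific matching pattern, or None.

    Longer prefixes are more specific, and a full-path (non-wildcard)
    pattern beats a subtree wildcard of the same length.
    """
    best_key, best_value = None, None
    for pattern, value in rules.items():
        if pattern.endswith("<"):
            # subtree wildcard: the dataset left of '<' and all its children
            prefix = pattern[:-1]
            if prefix == "" or path == prefix or path.startswith(prefix + "/"):
                key = (len(prefix), 0)  # 0: wildcard loses ties to exact
            else:
                continue
        elif path == pattern:
            key = (len(pattern), 1)  # 1: exact match wins ties
        else:
            continue
        if best_key is None or key > best_key:
            best_key, best_value = key, value
    return best_value

# the example rules from above
rules = {"tank<": 1, "tank/foo/bar": 2, "tank/foo<": 3}
```

Running the documented example paths through `match` reproduces the rule numbers shown above, e.g. `match(rules, "tank/foo/bar/loo")` yields `3`.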
Mappings
--------

Mappings map a *source filesystem path* to a *target filesystem path*.
Per pattern, either a target filesystem path or `"!"` is specified as the result.

* If no pattern matches, there exists no target filesystem (`NO MATCH`).
* If the result is `"!"`, there exists no target filesystem (`NO MATCH`).
* If the pattern is a non-wildcard pattern, the source path is mapped to the target path on the right.
* If the pattern ends with a *subtree wildcard* (`<`), the source path is **prefix-trimmed** by the path specified left of `<`.

  * Note: this means that only for *wildcard-only* patterns (pattern=`<`) is the source path simply appended to the target path.

The following example is from the {{< sampleconflink "localbackup/host1.yml" >}} example config.
::

   jobs:
   - name: mirror_local
     type: local
     mapping: {
       "zroot/var/db<":           "storage/backups/local/zroot/var/db",
       "zroot/usr/home<":         "storage/backups/local/zroot/usr/home",
       "zroot/usr/home/paranoid": "!", # don't backup paranoid user
       "zroot/poudriere/ports<":  "!", # don't backup the ports trees
     }
   ...

Results in the following mappings:

::

   zroot/var/db                 => storage/backups/local/zroot/var/db
   zroot/var/db/a/child         => storage/backups/local/zroot/var/db/a/child
   zroot/usr/home               => storage/backups/local/zroot/usr/home
   zroot/usr/home/paranoid      => NOT MAPPED
   zroot/usr/home/bob           => storage/backups/local/zroot/usr/home/bob
   zroot/usr/src                => NOT MAPPED
   zroot/poudriere/ports/2017Q3 => NOT MAPPED
   zroot/poudriere/ports/HEAD   => NOT MAPPED
Filters
-------

Valid filter results: `ok` or `!`.

The example below shows the source job from the [tutorial]({{< relref "tutorial/_index.md#configure-app-srv" >}}):

The client is allowed access to `zroot/var/db` and `zroot/usr/home` + children, except `zroot/usr/home/paranoid`.

::

   jobs:
   - name: pull_backup
     type: source
     ...
     filesystems: {
       "zroot/var/db": "ok",
       "zroot/usr/home<": "ok",
       "zroot/usr/home/paranoid": "!",
     }
   ...
61
docs/configuration/misc.rst
Normal file
@@ -0,0 +1,61 @@
Miscellaneous
=============

Runtime Directories & UNIX Sockets
----------------------------------

The zrepl daemon creates various UNIX sockets to allow communicating with it:

* the `stdinserver` transport connects to a socket named after the `client_identity` parameter
* the `control` subcommand connects to a defined control socket

There is no further authentication on these sockets.
Therefore, we have to make sure they can only be created and accessed by `zrepl daemon`.

In fact, `zrepl daemon` will not bind a socket to a path in a world-accessible directory.

The directories can be configured in the main configuration file:

::

   global:
     control:
       sockpath: /var/run/zrepl/control
     serve:
       stdinserver:
         sockdir: /var/run/zrepl/stdinserver
Durations & Intervals
---------------------

Interval & duration fields in job definitions, pruning configurations, etc. must match the following regex:

::

   var durationStringRegex *regexp.Regexp = regexp.MustCompile(`^\s*(\d+)\s*(s|m|h|d|w)\s*$`)
   // s = second, m = minute, h = hour, d = day, w = week (7 days)
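For illustration, the same check transcribed to Python; the unit-to-seconds mapping follows the comment above, while zrepl's own parsing code may differ in details:

```python
import re

# the regex from the zrepl source above, transcribed to Python
DURATION_RE = re.compile(r"^\s*(\d+)\s*(s|m|h|d|w)\s*$")

# s = second, m = minute, h = hour, d = day, w = week (7 days)
UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400, "w": 604800}

def parse_duration(s):
    """Parse a zrepl-style duration string into seconds; raise on invalid input."""
    m = DURATION_RE.match(s)
    if m is None:
        raise ValueError("invalid duration: %r" % s)
    return int(m.group(1)) * UNIT_SECONDS[m.group(2)]
```

For example, `parse_duration("10m")` returns `600`; whitespace around the number and unit is tolerated, just as in the regex.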
Super-Verbose Job Debugging
---------------------------

You have probably landed here because you opened an issue on GitHub and a developer told you to do this...
So just read the annotated comments ;)

::

   jobs:
   - name: ...
     ...
     # JOB DEBUGGING OPTIONS
     # should be equal for all job types, but each job implements the debugging itself
     debug:
       conn: # debug the io.ReadWriteCloser connection
         read_dump: /tmp/connlog_read   # dump results of Read() invocations to this file
         write_dump: /tmp/connlog_write # dump results of Write() invocations to this file
       rpc: # debug the RPC protocol implementation
         log: true # log output from the rpc layer to the job log

.. ATTENTION::

   Connection dumps will almost certainly contain your or others' private data. Do not share them in a bug report.
59
docs/configuration/prune.rst
Normal file
@@ -0,0 +1,59 @@
Snapshot Pruning
================

In zrepl, *pruning* means *destroying snapshots by some policy*.

A *pruning policy* takes a list of snapshots and - for each snapshot - decides whether it should be kept or destroyed.

The job context defines which snapshots are even considered for pruning, for example through the `snapshot_prefix` variable.
Check the [job definition]({{< relref "configuration/jobs.md" >}}) for details.

Currently, the retention grid is the only supported pruning policy.
Retention Grid
--------------

::

   jobs:
   - name: pull_app-srv
     ...
     prune:
       policy: grid
       grid: 1x1h(keep=all) | 24x1h | 35x1d | 6x30d
             │                │
             │                └─ 24 adjacent one-hour intervals
             └─ one hour interval
The retention grid can be thought of as a time-based sieve:

The `grid` field specifies a list of adjacent time intervals:
the left edge of the leftmost (first) interval is the `creation` date of the youngest snapshot.
All intervals to its right describe time intervals further in the past.

Each interval carries a maximum number of snapshots to keep.
It is specified via `(keep=N)`, where `N` is either `all` (all snapshots are kept) or a positive integer.
The default value is **1**.

The following procedure happens during pruning:

1. The list of snapshots eligible for pruning is sorted by `creation`.
2. The left edge of the first interval is aligned to the `creation` date of the youngest snapshot.
3. A list of buckets is created, one for each interval.
4. The list of snapshots is split up into the buckets.
5. For each bucket:

   1. The contained snapshot list is sorted by `creation`.
   2. Snapshots from the list, oldest first, are destroyed until the specified `keep` count is reached.
   3. All remaining snapshots on the list are kept.
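The procedure above can be sketched in a few lines. This is a simplified illustration of the bucket-splitting logic, not zrepl's implementation; in particular, the handling of snapshots at exact interval boundaries and of snapshots older than the whole grid is an assumption here:

```python
import bisect

def grid_prune(snapshot_times, intervals):
    """Return creation times of snapshots a grid policy would destroy.

    snapshot_times: `creation` timestamps in seconds.
    intervals: (length_seconds, keep) pairs, left to right;
    keep=None stands for keep=all.
    """
    if not snapshot_times:
        return []
    youngest = max(snapshot_times)
    # right edges of the buckets, as age relative to the youngest snapshot
    edges, acc = [], 0
    for length, _ in intervals:
        acc += length
        edges.append(acc)
    buckets = [[] for _ in intervals]
    for t in snapshot_times:
        i = bisect.bisect_left(edges, youngest - t)
        if i < len(buckets):  # snapshots older than the grid are left alone here
            buckets[i].append(t)
    destroy = []
    for (_, keep), bucket in zip(intervals, buckets):
        if keep is None:
            continue  # keep=all
        bucket.sort()
        excess = len(bucket) - keep
        if excess > 0:
            destroy.extend(bucket[:excess])  # destroy oldest first
    return destroy
```

With a two-interval grid `[(10, None), (10, 1)]` and snapshots at times 0, 5, 12, 15 and 20, the first bucket keeps everything and the second bucket keeps only its newest snapshot, so only the snapshot at time 0 is destroyed.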
.. ATTENTION::

   The configuration of the first interval (`1x1h(keep=all)` in the example) determines the **maximum allowable replication lag** between source and destination.
   After the first interval, source and destination likely have different retention settings.
   This means source and destination may prune different snapshots, prohibiting incremental replication from snapshots that are not in the first interval.

   **Always** configure the first interval to **`1x?(keep=all)`**, substituting `?` with the maximum time replication may fail due to downtimes, maintenance, connectivity issues, etc.
   After outages longer than `?`, you may be required to perform a **full replication** again.
103
docs/configuration/transports.rst
Normal file
@@ -0,0 +1,103 @@
.. highlight:: bash

Transports
==========

A transport provides an authenticated [`io.ReadWriteCloser`](https://golang.org/pkg/io/#ReadWriteCloser) to the RPC layer.
(An `io.ReadWriteCloser` is essentially a bidirectional, reliable communication channel.)

Currently, only the `ssh+stdinserver` transport is supported.

`ssh+stdinserver`
-----------------

The way the `ssh+stdinserver` transport works is inspired by [git shell](https://git-scm.com/docs/git-shell) and [Borg Backup](https://borgbackup.readthedocs.io/en/stable/deployment.html).
It is implemented in the Go package `github.com/zrepl/zrepl/sshbytestream`.
The config excerpts below are taken from the [tutorial]({{< relref "tutorial/_index.md" >}}), which you should complete before reading further.
`serve`
~~~~~~~

::

   jobs:
   - name: pull_backup
     type: source
     serve:
       type: stdinserver
       client_identity: backup-srv.example.com
     ...

The serving job opens a UNIX socket named after `client_identity` in the runtime directory, e.g. `/var/run/zrepl/stdinserver/backup-srv.example.com`.

On the same machine, the :code:`zrepl stdinserver $client_identity` command connects to that socket.
For example, `zrepl stdinserver backup-srv.example.com` connects to the UNIX socket `/var/run/zrepl/stdinserver/backup-srv.example.com`.

It then passes its stdin and stdout file descriptors to the zrepl daemon via *cmsg(3)*.
zrepl daemon in turn combines them into an `io.ReadWriteCloser`:
a `Write()` turns into a write to stdout, a `Read()` turns into a read from stdin.

Interactive use of the `stdinserver` subcommand does not make much sense.
However, we can force its execution when a user with a particular SSH pubkey connects via SSH.
This can be achieved with an entry in the `authorized_keys` file of the serving zrepl daemon.
::

   # for OpenSSH >= 7.2
   command="zrepl stdinserver CLIENT_IDENTITY",restrict CLIENT_SSH_KEY
   # for older OpenSSH versions
   command="zrepl stdinserver CLIENT_IDENTITY",no-port-forwarding,no-X11-forwarding,no-pty,no-agent-forwarding,no-user-rc CLIENT_SSH_KEY

* CLIENT_IDENTITY is substituted with `backup-srv.example.com` in our example
* CLIENT_SSH_KEY is substituted with the public part of the SSH keypair specified in the `connect` directive on the connecting host

.. NOTE::

   You may need to adjust the `PermitRootLogin` option in `/etc/ssh/sshd_config` to `forced-commands-only` or higher for this to work.
   Refer to sshd_config(5) for details.

To recap, this is how client authentication works with the `ssh+stdinserver` transport:

* Connections to the `client_identity` UNIX socket are blindly trusted by zrepl daemon.
* Thus, the runtime directory must be private to the zrepl user (checked by zrepl daemon).
* The admin of the host with the serving zrepl daemon controls the `authorized_keys` file.
* Thus, the administrator controls the mapping `PUBKEY -> CLIENT_IDENTITY`.
`connect`
~~~~~~~~~

::

   jobs:
   - name: pull_app-srv
     type: pull
     connect:
       type: ssh+stdinserver
       host: app-srv.example.com
       user: root
       port: 22
       identity_file: /etc/zrepl/ssh/identity
       options: # optional
       - "Compression=on"

The connecting zrepl daemon

1. creates a pipe,
2. forks,
3. in the forked process

   1. replaces the forked stdin and stdout with the corresponding pipe ends,
   2. executes the `ssh` binary found in `$PATH`:

      1. the identity file (`-i`) is set to `$identity_file`,
      2. the remote user, host and port correspond to those configured,
      3. further options can be specified using the `options` field, which appends each entry in the list to the command line using `-o $entry`,

4. wraps the pipe ends in an `io.ReadWriteCloser` and uses it for RPC.
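Given the `connect` configuration above, the resulting `ssh` invocation is roughly equivalent to the following sketch; it is reconstructed from the steps listed, not taken from zrepl's source:

::

   ssh -i /etc/zrepl/ssh/identity -p 22 -o Compression=on root@app-srv.example.com

On the remote host, the forced command in `authorized_keys` then runs `zrepl stdinserver $client_identity`, wiring the SSH session to the serving daemon's UNIX socket.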
As discussed in the section above, the connecting zrepl daemon expects that `zrepl stdinserver $client_identity` is executed automatically via an `authorized_keys` file entry.

.. NOTE::

   The environment variables of the underlying SSH process are cleared. `$SSH_AUTH_SOCK` will not be available.
   It is suggested to create a separate, unencrypted SSH key solely for that purpose.
58
docs/implementation.rst
Normal file
@@ -0,0 +1,58 @@
Implementation Overview
=======================

.. WARNING::

   Incomplete / under construction

The following design aspects may convince you that `zrepl` is superior to a hacked-together shell script solution.

Testability & Performance
-------------------------

zrepl is written in Go, a real programming language with type safety,
reasonable performance, testing infrastructure and an (opinionated) idea of
software engineering.

* key parts & algorithms of zrepl are covered by unit tests (work in progress)
* zrepl is noticeably faster than comparable shell scripts

RPC protocol
------------

While it is tempting to just issue a few `ssh remote 'zfs send ...' | zfs recv`, this has a number of drawbacks:

* The snapshot streams need to be compatible.
* Communication is still unidirectional. Thus, you will most likely

  * either not take advantage of features such as *compressed send & recv*,
  * or issue additional `ssh` commands in advance to figure out what features are supported on the other side.

* Advanced logic in shell scripts is ugly to read, poorly testable and a pain to maintain.

zrepl takes a different approach:

* Define an RPC protocol.
* Establish an encrypted, authenticated, bidirectional communication channel...
* ... with zrepl running at both ends of it.

This has several obvious benefits:

* No blank root shell access is given to the other side.
* Instead, an *authenticated* peer can *request* filesystem lists, snapshot streams, etc.
* Requests are then checked against job-specific ACLs, limiting a client to the filesystems it is actually allowed to replicate.
* The {{< zrepl-transport "transport mechanism" >}} is decoupled from the remaining logic, keeping it extensible.

Protocol Implementation
~~~~~~~~~~~~~~~~~~~~~~~

zrepl implements its own RPC protocol.
This is mostly due to the fact that existing solutions do not provide efficient means to transport large amounts of data.

Package [`github.com/zrepl/zrepl/rpc`](https://github.com/zrepl/zrepl/tree/master/rpc) builds special-case handling around returning an `io.Reader` as part of a unary RPC call.

Measurements show that only a single memory-to-memory copy of a snapshot stream is made using `github.com/zrepl/zrepl/rpc`, and there is still potential for further optimizations.

Logging & Transparency
----------------------

zrepl comes with [rich, structured and configurable logging]({{< relref "configuration/logging.md" >}}), allowing administrators to understand what the software is actually doing.
@ -3,18 +3,68 @@
|
||||
You can adapt this file completely to your liking, but it should at least
|
||||
contain the root `toctree` directive.
|
||||
|
||||
Welcome to zrepl's documentation!
|
||||
=================================
|
||||
zrepl - ZFS replication
|
||||
-----------------------
|
||||
|
||||
.. ATTENTION::
|
||||
zrepl as well as this documentation is still under active development.
|
||||
It is neither feature complete nor is there a stability guarantee on the configuration format.
|
||||
Use & test at your own risk ;)
|
||||
|
||||
Getting started
|
||||
~~~~~~~~~~~~~~~
|
||||
|
||||
The [5 minute tutorial setup]({{< relref "tutorial/_index.md" >}}) gives you a first impression.
|
||||
|
||||
Main Features
|
||||
~~~~~~~~~~~~~
|
||||
|
||||
* Filesystem Replication
|
||||
* [x] Local & Remote
|
||||
* [x] Pull mode
|
||||
* [ ] Push mode
|
||||
* [x] Access control checks when pulling datasets
|
||||
* [x] [Flexible mapping]({{< ref "configuration/map_filter_syntax.md" >}}) rules
|
||||
* [x] Bookmarks support
|
||||
* [ ] Feature-negotiation for
|
||||
* Resumable `send & receive`
|
||||
* Compressed `send & receive`
|
||||
* Raw encrypted `send & receive` (as soon as it is available)
|
||||
* Automatic snapshot creation
|
||||
* [x] Ensure fixed time interval between snapshots
|
||||
* Automatic snapshot [pruning]({{< ref "configuration/prune.md" >}})
|
||||
* [x] Age-based fading (grandfathering scheme)
|
||||
* Flexible, detailed & structured [logging]({{< ref "configuration/logging.md" >}})
|
||||
* [x] `human`, `logfmt` and `json` formatting
|
||||
* [x] stdout, syslog and TCP (+TLS client auth) outlets
|
||||
* Maintainable implementation in Go
|
||||
* [x] Cross platform
|
||||
* [x] Type safe & testable code
|
||||
|
||||
Contributing
|
||||
~~~~~~~~~~~~
|
||||
|
||||
We are happy about any help we can get!
|
||||
|
||||
* Explore the codebase
|
||||
* These docs live in the `docs/` subdirectory
|
||||
* Document any non-obvious / confusing / plain broken behavior you encounter when setting up zrepl for the first time
|
||||
* Check the *Issues* and *Projects* sections for things to do
|
||||
|
||||
{{% panel header="<i class='fa fa-github'></i> Development Workflow"%}}
|
||||
[The <i class='fa fa-github'></i> GitHub repository](https://github.com/zrepl/zrepl) is where all development happens.<br />
|
||||
Make sure to read the [Developer Documentation section](https://github.com/zrepl/zrepl) and open new issues or pull requests there.
|
||||
{{% /panel %}}
|
||||
|
||||
Table of Contents
|
||||
~~~~~~~~~~~~~~~~~
|
||||
|
||||
.. toctree::
|
||||
:maxdepth: 2
|
||||
:caption: Contents:
|
||||
|
||||
|
||||
|
||||
Indices and tables
|
||||
==================
|
||||
|
||||
* :ref:`genindex`
|
||||
* :ref:`modindex`
|
||||
* :ref:`search`
|
||||
tutorial
|
||||
installation
|
||||
configuration
|
||||
implementation
|
||||
pr
|
||||
|
88
docs/installation.rst
Normal file
@ -0,0 +1,88 @@
Installation
============

.. TIP::

    Check out the [tutorial]({{< relref "tutorial/_index.md" >}}) if you want a first impression of zrepl.

User Privileges
---------------

It is possible to run zrepl as an unprivileged user in combination with
[ZFS delegation](https://www.freebsd.org/doc/handbook/zfs-zfs-allow.html).

There is also the possibility to run it in a jail on FreeBSD by delegating a dataset to the jail.

However, until we get around to documenting those setups, you will have to run zrepl as root or experiment yourself :)

Installation
------------

zrepl is currently not packaged for any operating system. Signed & versioned releases are planned but not available yet.

Check out the sources yourself, fetch dependencies using `dep`, compile, and install the binary to the zrepl user's `$PATH`.

**Note**: if the zrepl binary is not in `$PATH`, you will have to adjust the examples in the [tutorial]({{< relref "tutorial/_index.md" >}}).

::

    # NOTE: you may want to check out & build as an unprivileged user
    cd /root
    git clone https://github.com/zrepl/zrepl.git
    cd zrepl
    dep ensure
    go build -o zrepl
    cp zrepl /usr/local/bin/zrepl
    rehash
    # see if it worked
    zrepl help

Configuration Files
-------------------

zrepl searches for its main configuration file in the following locations (in that order):

* `/etc/zrepl/zrepl.yml`
* `/usr/local/etc/zrepl/zrepl.yml`

Alternatively, use CLI flags to specify a config location.

Copy a config from the [tutorial]({{< relref "tutorial/_index.md" >}}) or the `cmd/sampleconf` directory to one of these locations and customize it to your setup.

Runtime Directories
-------------------

Check the [configuration documentation]({{< relref "configuration/misc.md#runtime-directories-unix-sockets" >}}) for more information.
For default settings, the following should do the trick:

::

    mkdir -p /var/run/zrepl/stdinserver
    chmod -R 0700 /var/run/zrepl

Running the Daemon
------------------

All actual work zrepl does is performed by a daemon process.

Logging is configurable via the config file. Please refer to the [logging documentation]({{< relref "configuration/logging.md" >}}).

::

    zrepl daemon

There are no *rc(8)* or *systemd.service(5)* service definitions yet. Note the *daemon(8)* utility on FreeBSD.

.. ATTENTION::

    Make sure to actually monitor the error-level output of zrepl: some configuration errors will not make the daemon exit.
    Example: if the daemon cannot create the [stdinserver]({{< relref "configuration/transports.md#stdinserver" >}}) sockets
    in the runtime directory, it will emit an error message but not exit, because other tasks such as periodic snapshots & pruning are of equal importance.

Restarting
~~~~~~~~~~

The daemon handles SIGINT and SIGTERM for graceful shutdown.

Graceful shutdown means, at worst, that a job will not be rescheduled for the next interval.

The daemon exits as soon as all jobs have reported shut down.
5
docs/pr.rst
Normal file
@ -0,0 +1,5 @@
Talks & Presentations
=====================

* Talk at EuroBSDCon2017 FreeBSD DevSummit ([Slides](https://docs.google.com/presentation/d/1EmmeEvOXAWJHCVnOS9-TTsxswbcGKmeLWdY_6BH4w0Q/edit?usp=sharing), [Event](https://wiki.freebsd.org/DevSummit/201709))

183
docs/tutorial.rst
Normal file
@ -0,0 +1,183 @@
Tutorial
========

This tutorial shows how zrepl can be used to implement a ZFS-based pull backup.
We assume the following scenario:

* Production server `app-srv` with filesystems to back up:

  * `zroot/var/db`
  * `zroot/usr/home` and all its child filesystems
  * **except** `zroot/usr/home/paranoid`, belonging to a user doing backups themselves

* Backup server `backup-srv` with

  * filesystem `storage/zrepl/pull/app-srv` + children dedicated to backups of `app-srv`

Our backup solution should fulfill the following requirements:

* Periodically snapshot the filesystems on `app-srv` *every 10 minutes*
* Incrementally replicate these snapshots to `storage/zrepl/pull/app-srv/*` on `backup-srv`
* Keep only very few snapshots on `app-srv` to save disk space
* Keep a fading history (24 hourly, 30 daily, 6 monthly) of snapshots on `backup-srv`

Analysis
--------

We can model this situation as two jobs:

* A **source job** on `app-srv`

  * Creates the snapshots
  * Keeps a short history of snapshots to enable incremental replication to `backup-srv`
  * Accepts connections from `backup-srv`

* A **pull job** on `backup-srv`

  * Connects to the `zrepl daemon` process on `app-srv`
  * Pulls the snapshots to `storage/zrepl/pull/app-srv/*`
  * Fades out snapshots in `storage/zrepl/pull/app-srv/*` as they age

Why doesn't the **pull job** create the snapshots before pulling?

As is the case with all distributed systems, the link between `app-srv` and `backup-srv` might be down for an hour or two.
We do not want to sacrifice our required backup resolution of 10-minute intervals to a temporary connection outage.

When the link comes up again, `backup-srv` will happily catch up on the 12 snapshots taken by `app-srv` in the meantime (two hours at a 10-minute interval), without
a gap in our backup history.

Install zrepl
-------------

Follow the [OS-specific installation instructions]({{< relref "install/_index.md" >}}) and come back here.

Configure `backup-srv`
----------------------

We define a **pull job** named `pull_app-srv` in the [main configuration file]({{< relref "install/_index.md#configuration-files" >}})::

    jobs:
    - name: pull_app-srv
      type: pull
      connect:
        type: ssh+stdinserver
        host: app-srv.example.com
        user: root
        port: 22
        identity_file: /etc/zrepl/ssh/identity
      interval: 10m
      mapping: {
        "<":"storage/zrepl/pull/app-srv"
      }
      initial_repl_policy: most_recent
      snapshot_prefix: zrepl_pull_backup_
      prune:
        policy: grid
        grid: 1x1h(keep=all) | 24x1h | 35x1d | 6x30d
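As a rough mental model for the `grid` retention policy (an illustrative sketch, not zrepl's actual implementation): time before *now* is divided into consecutive buckets of the given lengths, and within each bucket only the newest snapshot survives.

```python
from datetime import datetime, timedelta

def grid_keep(snapshot_times, now, buckets):
    """buckets: list of (bucket_length, count) pairs, e.g. 24 one-hour
    buckets. Keeps the newest snapshot per bucket; snapshots older than
    the whole grid are dropped."""
    keep = set()
    edge = now
    for length, count in buckets:
        for _ in range(count):
            lo, hi = edge - length, edge
            in_bucket = [t for t in snapshot_times if lo < t <= hi]
            if in_bucket:
                keep.add(max(in_bucket))  # newest snapshot in the bucket survives
            edge = lo
    return keep

now = datetime(2017, 10, 1)
# snapshots every 10 minutes over the last two hours
snaps = [now - timedelta(minutes=10 * i) for i in range(12)]
# sketch of the "24x1h" part of the spec: 24 one-hour buckets
kept = grid_keep(snaps, now, [(timedelta(hours=1), 24)])
print(len(kept))  # 2: one snapshot survives per populated one-hour bucket
```

This shows the fading effect: dense recent snapshots thin out as they age into coarser buckets.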

The `connect` section instructs the zrepl daemon to use the `stdinserver` transport:
`backup-srv` will connect to the specified SSH server and expect `zrepl stdinserver CLIENT_IDENTITY` instead of a shell on the other side.

It uses the private key specified at `connect.identity_file`, which we still need to create::

    cd /etc/zrepl
    mkdir -p ssh
    chmod 0700 ssh
    ssh-keygen -t ed25519 -N '' -f /etc/zrepl/ssh/identity

Note that most use cases do not benefit from separate keypairs per remote endpoint.
It is thus sufficient to create one keypair and use it for all `connect` directives on one host.

Learn more about [stdinserver]({{< relref "configuration/transports.md#ssh-stdinserver" >}}) and the [**pull job** format]({{< relref "configuration/jobs.md#pull" >}}).

Configure `app-srv`
-------------------

We define a corresponding **source job** named `pull_backup` in the [main configuration file]({{< relref "install/_index.md#configuration-files" >}})
`zrepl.yml`::

    jobs:
    - name: pull_backup
      type: source
      serve:
        type: stdinserver
        client_identity: backup-srv.example.com
      filesystems: {
        "zroot/var/db": "ok",
        "zroot/usr/home<": "ok",
        "zroot/usr/home/paranoid": "!",
      }
      snapshot_prefix: zrepl_pull_backup_
      interval: 10m
      prune:
        policy: grid
        grid: 1x1d(keep=all)
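The `filesystems` map above uses zrepl's filter syntax: a trailing `<` matches a filesystem and its entire subtree, and the most specific matching pattern decides whether a dataset is included (`"ok"`) or excluded (`"!"`). A rough sketch of that matching rule (illustrative only, with a made-up helper name, not zrepl's code):

```python
def filter_allows(fs, rules):
    """Most specific matching pattern wins; '!' denies, 'ok' allows."""
    best = None  # (specificity, action)
    for pattern, action in rules.items():
        if pattern.endswith("<"):
            base = pattern[:-1]
            # subtree wildcard: matches base itself and all child datasets
            if fs == base or fs.startswith(base + "/"):
                match_len = len(base)
            else:
                continue
        else:
            if fs != pattern:
                continue
            match_len = len(pattern) + 1  # exact match beats a same-base wildcard
        if best is None or match_len > best[0]:
            best = (match_len, action)
    return best is not None and best[1] == "ok"

rules = {
    "zroot/var/db": "ok",
    "zroot/usr/home<": "ok",
    "zroot/usr/home/paranoid": "!",
}
print(filter_allows("zroot/var/db", rules))             # True
print(filter_allows("zroot/usr/home/alice", rules))     # True
print(filter_allows("zroot/usr/home/paranoid", rules))  # False
print(filter_allows("zroot/tmp", rules))                # False
```

This is why the `paranoid` dataset stays excluded even though its parent's subtree is allowed: the exact pattern is more specific than the wildcard.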

The `serve` section corresponds to the `connect` section in the configuration of `backup-srv`.

We now want to authenticate `backup-srv` before allowing it to pull data.
This is done by limiting SSH connections from `backup-srv` to executing the `stdinserver` subcommand.

Open `/root/.ssh/authorized_keys` and add either of the following lines::

    # for OpenSSH >= 7.2
    command="zrepl stdinserver backup-srv.example.com",restrict CLIENT_SSH_KEY
    # for older OpenSSH versions
    command="zrepl stdinserver backup-srv.example.com",no-port-forwarding,no-X11-forwarding,no-pty,no-agent-forwarding,no-user-rc CLIENT_SSH_KEY

.. ATTENTION::

    Replace CLIENT_SSH_KEY with the contents of `/etc/zrepl/ssh/identity.pub` from `app-srv`.
    Mind the trailing `.pub` in the filename.
    The entries **must** be on a single line, including the replaced CLIENT_SSH_KEY.

.. HINT::

    You may need to adjust the `PermitRootLogin` option in `/etc/ssh/sshd_config` to `forced-commands-only` or higher for this to work.
    Refer to sshd_config(5) for details.

The argument `backup-srv.example.com` is the client identity of `backup-srv` as defined in `jobs.serve.client_identity`.

Again, both [stdinserver]({{< relref "configuration/transports.md#ssh-stdinserver" >}}) and the [**source job** format]({{< relref "configuration/jobs.md#source" >}}) are documented.

Apply Configuration Changes
---------------------------

We need to restart the zrepl daemon on **both** `app-srv` and `backup-srv`.

This is [OS-specific]({{< relref "install/_index.md#restarting" >}}).

Watch it Work
-------------

A common setup is to `watch` the log output and the `zfs list` of snapshots on both machines.

If you like tmux, here is a handy script that works on FreeBSD::

    pkg install gnu-watch tmux
    tmux new-window
    tmux split-window "tail -f /var/log/zrepl.log"
    tmux split-window "gnu-watch 'zfs list -t snapshot -o name,creation -s creation | grep zrepl_pull_backup_'"
    tmux select-layout tiled

The Linux equivalent might look like this::

    # make sure tmux is installed & let's assume you use systemd + journald
    tmux new-window
    tmux split-window "journalctl -f -u zrepl.service"
    tmux split-window "watch 'zfs list -t snapshot -o name,creation -s creation | grep zrepl_pull_backup_'"
    tmux select-layout tiled

Summary
-------

Congratulations, you have a working pull backup. Where to go next?

* Read more about [configuration format, options & job types]({{< relref "configuration/_index.md" >}})
* Learn about the [implementation details]({{< relref "impl/_index.md" >}}) of zrepl.
