Because some jobs add client identity to root_fs and other jobs don't do
that,
we can't reliable detect overlapping of filesystems. And and the same
time we
need an ability to use equal or overlapped root_fs for different jobs.
For
instance see this config:
```
- name: "zdisk"
type: "sink"
root_fs: "zdisk/zrepl"
serve:
type: "local"
listener_name: "zdisk"
```
and
```
- name: "remote-to-zdisk"
type: "pull"
connect:
type: "tls"
root_fs: "zdisk/zrepl/remote"
```
As you can see, two jobs have overlapped root_fs, but actually datasets
are not
overlapped, because job `zdisk` save everything under
`zdisk/zrepl/localhost`,
because it adds client identity. So they actually use two different
filesystems:
`zdisk/zrepl/localhost` and `zdisk/zrepl/remote`. And we can't detect
this
situation during config check. So let's just remove this check, because
it's
admin's duty to configure correct root_fs's.
---------
Co-authored-by: Christian Schwarz <me@cschwarz.com>
Before this change, resuming from an unencrypted dataset with
send.raw=true specified wouldn't work with zrepl due to overly
restrictive resume token checking.
An initial PR to fix this was made in https://github.com/zrepl/zrepl/pull/503
but it didn't address the core of the problem.
The core of the problem was that zrepl assumed that if a resume token
contained `rawok=true, compressok=true`, the resulting send would be
encrypted. But if the sender dataset was unencrypted, such a resume would
actually result in an unencrypted send.
Which could be totally legitimate but zrepl failed to recognize that.
BACKGROUND
==========
The following snippets of OpenZFS code are insightful regarding how the
various ${X}ok values in the resume token are handled:
- 6c3c5fcfbe/module/zfs/dmu_send.c (L1947-L2012)
- 6c3c5fcfbe/module/zfs/dmu_recv.c (L877-L891)
- https://github.com/openzfs/zfs/blob/6c3c5fc/lib/libzfs/libzfs_sendrecv.c#L1663-L1672
Basically, some zfs send flags make the DMU send code set some DMU send
stream featureflags, although it's not a pure mapping, i.e, which DMU
send stream flags are used depends somewhat on the dataset (e.g., is it
encrypted or not, or, does it use zstd or not).
Then, the receiver looks at some (but not all) feature flags and maps
them to ${X}ok dataset zap attributes.
These are funnelled back to the sender 1:1 through the resume_token.
And the sender turns them into lzc flags.
As an example, let's look at zfs send --raw.
if the sender requests a raw send on an unencrypted dataset, the send
stream (and hence the resume token) will not have the raw stream
featureflag set, and hence the resume token will not have the rawok
field set. Instead, it will have compressok, embedok, and depending
on whether large blocks are present in the dataset, largeblockok set.
WHAT'S ZREPL'S ROLE IN THIS?
============================
zrepl provides a virtual encrypted sendflag that is like `raw`,
but further ensures that we only send encrypted datasets.
For any other resume token stuff, it shoudn't do any checking,
because it's a futile effort to keep up with ZFS send/recv features
that are orthogonal to encryption.
CHANGES MADE IN THIS COMMIT
===========================
- Rip out a bunch of needless checking that zrepl would do during
planning. These checks were there to give better error messages,
but actually, the error messages created by the endpoint.Sender.Send
RPC upon send args validation failure are good enough.
- Add platformtests to validate all combinations of
(Unencrypted/Encrypted FS) x (send.encrypted = true | false) x (send.raw = true | false)
for cases both non-resuming and resuming send.
Additional manual testing done:
1. With zrepl 0.5, setup with unencrypted dataset, send.raw=true specified, no send.encrypted specified.
2. Observe that regular non-resuming send works, but resuming doesn't work.
3. Upgrade zrepl to this change.
4. Observe that both regular and resuming send works.
closes https://github.com/zrepl/zrepl/pull/613
It may be desirable to check that a config is valid without checking for
the existence of certificate files (e.g. when validating a config inside
a sandbox without access to the cert files).
This will be very useful for NixOS so that we can check the config file
at nix-build time (e.g. potentially without proper permissions to read cert
files for a TLS connection).
fixes https://github.com/zrepl/zrepl/issues/467
closes https://github.com/zrepl/zrepl/pull/587
Config:
```
- type: push
...
conflict_resolution:
initial_replication: most_recent | all | fali
```
The ``initial_replication`` option determines which snapshots zrepl
replicates if the filesystem has not been replicated before.
If ``most_recent`` (the default), the initial replication will only
transfer the most recent snapshot, while ignoring previous snapshots.
If all snapshots should be replicated, specify ``all``.
Use ``fail`` to make replication of the filesystem fail in case
there is no corresponding fileystem on the receiver.
Code-Level Changes, apart from the obvious:
- Rework IncrementalPath()'s return signature.
Now returns an error for initial replications as well.
- Rename & rework it's consumer, resolveConflict().
Co-authored-by: Graham Christensen <graham@grahamc.com>
Fixes https://github.com/zrepl/zrepl/issues/550
Fixes https://github.com/zrepl/zrepl/issues/187
Closes https://github.com/zrepl/zrepl/pull/592
fixes https://github.com/zrepl/zrepl/issues/504
Problem:
plain send + recv with root_fs encrypted + placeholders causes plain recvs
whereas user would expect encrypt-on-recv
Reason:
We create placeholder filesytems with -o encryption=off.
Thus, children received below those placeholders won't inherit
encryption of root_fs.
Fix:
We'll have three values for `recv.placeholders.encryption: unspecified (default) | off | inherit`.
When we create a placeholder, we will fail the operation if `recv.placeholders.encryption = unspecified`.
The exception is if the placeholder filesystem is to encode the client identity ($root_fs/$client_identity) in a pull job.
Those are created in `inherit` mode if the config field is `unspecified` so that users who don't need
placeholders are not bothered by these details.
Future Work:
Automatically warn existing users of encrypt-on-recv about the problem
if they are affected.
The problem that I hit during implementation of this is that the
`encryption` prop's `source` doesn't quite behave like other props:
`source` is `default` for `encryption=off` and `-` when `encryption=on`.
Hence, we can't use `source` to distinguish the following 2x2 cases:
(1) placeholder created with explicit -o encryption=off
(2) placeholder created without specifying -o encryption
with
(A) an encrypted parent at creation time
(B) an unencrypted parent at creation time
zrepl daemon panics when the snap job triggers
fixup for f5f269bfd5 (bandwidth limiting)
fixes#521
Oct 01 16:14:56 cstp zrepl[56563]: panic: invalid config`BandwidthLimit` field invalid: BucketCapacity must not be zero
Oct 01 16:14:56 cstp zrepl[56563]: panic: end span: span still has active child spans
Oct 01 16:14:56 cstp zrepl[56563]: goroutine 38 [running]:
Oct 01 16:14:56 cstp zrepl[56563]: github.com/zrepl/zrepl/daemon/logging/trace.WithSpan.func2()
Oct 01 16:14:56 cstp zrepl[56563]: /home/cs/zrepl/zrepl/daemon/logging/trace/trace.go:341 +0x2ea
Oct 01 16:14:56 cstp zrepl[56563]: github.com/zrepl/zrepl/daemon/logging/trace.WithTaskAndSpan.func1()
Oct 01 16:14:56 cstp zrepl[56563]: /home/cs/zrepl/zrepl/daemon/logging/trace/trace_convenience.go:40 +0x2e
Oct 01 16:14:56 cstp zrepl[56563]: panic(0xcee9c0, 0xc000676730)
Oct 01 16:14:56 cstp zrepl[56563]: /home/cs/go1.16.6/src/runtime/panic.go:965 +0x1b9
Oct 01 16:14:56 cstp zrepl[56563]: github.com/zrepl/zrepl/endpoint.NewSender(0xf5bbc0, 0xc0003840c0, 0xc0000b2c90, 0x4, 0xc0002c5958, 0x0, 0x0, 0x0, 0xc000068cf8)
Oct 01 16:14:56 cstp zrepl[56563]: /home/cs/zrepl/zrepl/endpoint/endpoint.go:68 +0x1ec
Oct 01 16:14:56 cstp zrepl[56563]: github.com/zrepl/zrepl/daemon/job.(*SnapJob).doPrune(0xc00039e000, 0xf6e3b8, 0xc0006541b0)
Oct 01 16:14:56 cstp zrepl[56563]: /home/cs/zrepl/zrepl/daemon/job/snapjob.go:179 +0x198
Oct 01 16:14:56 cstp zrepl[56563]: github.com/zrepl/zrepl/daemon/job.(*SnapJob).Run(0xc00039e000, 0xf6e3b8, 0xc0001d83c0)
Oct 01 16:14:56 cstp zrepl[56563]: /home/cs/zrepl/zrepl/daemon/job/snapjob.go:127 +0x329
Oct 01 16:14:56 cstp zrepl[56563]: github.com/zrepl/zrepl/daemon.(*jobs).start.func1(0xc0006a4100, 0xf6e3b8, 0xc00022a0f0, 0xf72d18, 0xc00039e000)
Oct 01 16:14:56 cstp zrepl[56563]: /home/cs/zrepl/zrepl/daemon/daemon.go:255 +0x15b
Oct 01 16:14:56 cstp zrepl[56563]: created by github.com/zrepl/zrepl/daemon.(*jobs).start
Oct 01 16:14:56 cstp zrepl[56563]: /home/cs/zrepl/zrepl/daemon/daemon.go:251 +0x425
Oct 01 16:14:56 cstp systemd[1]: zrepl.service: Main process exited, code=exited, status=2/INVALIDARGUMENT
Oct 01 16:14:56 cstp systemd[1]: zrepl.service: Failed with result 'exit-code'.
This was merged to master prematurely as the job components are not decoupled well enough
for these signals to be useful yet.
This reverts commit 2c8c2cfa14.
closes#452
From:
github.com/golang/protobuf v1.3.2
google.golang.org/grpc v1.17.0
To:
github.com/golang/protobuf v1.4.3
google.golang.org/grpc v1.35.0
google.golang.org/protobuf v1.25.0
About the two protobuf packages:
https://developers.google.com/protocol-buffers/docs/reference/go/faq
> Version v1.4.0 and higher of github.com/golang/protobuf wrap the new
implementation and permit programs to adopt the new API incrementally. For
example, the well-known types defined in github.com/golang/protobuf/ptypes are
simply aliases of those defined in the newer module. Thus,
google.golang.org/protobuf/types/known/emptypb and
github.com/golang/protobuf/ptypes/empty may be used interchangeably.
Notable Code Changes in zrepl:
- generate protobufs now contain a mutex so we can't copy them by value
anymore
- grpc.WithDialer is deprecated => use grpc.WithContextDialer instead
Go1.12 is now actually required by some of the dependencies.
This commit
- adds a configuration in which no step holds, replication cursors, etc. are created
- removes the send.step_holds.disable_incremental setting
- creates a new config option `replication` for active-side jobs
- adds the replication.protection.{initial,incremental} settings, each
of which can have values
- `guarantee_resumability`
- `guarantee_incremental`
- `guarantee_nothing`
(refer to docs/configuration/replication.rst for semantics)
The `replication` config from an active side is sent to both endpoint.Sender and endpoint.Receiver
for each replication step. Sender and Receiver then act accordingly.
For `guarantee_incremental`, we add the new `tentative-replication-cursor` abstraction.
The necessity for that abstraction is outlined in https://github.com/zrepl/zrepl/issues/340.
fixes https://github.com/zrepl/zrepl/issues/340
This is a stop-gap solution until we re-write the pruner to support
rules for removing step holds.
Note that disabling step holds for incremental sends does not affect
zrepl's guarantee that incremental replication is always possible:
Suppose you yank the external drive during an incremental @from -> @to step:
* restarting that step or future incrementals @from -> @to_later` will be possible
because the replication cursor bookmark points to @from until the step is complete
* resuming @from -> @to will work as long as the pruner on your internal pool doesn't come around to destroy @to.
* in that case, the replication algorithm should determine that the resumable state
on the receiving side isuseless because @to no longer exists on the sending side,
and consequently clear it, and restart an incremental step @from -> @to_later
refs #288
- drop HintMostRecentCommonAncestor rpc call
- it is wrong to put faith into the active side of the replication to always make that call
(we might not trust it, ref pull setup)
- clean up step holds + step bookmarks + replication cursor bookmarks on
send RPC instead
- this makes it symmetric with Receive RPC
- use a cache (endpoint.sendAbstractionsCache) to avoid the cost of
listing the on-disk endpoint abstractions state on every step
The "create" methods for endpoint abstractions (CreateReplicationCursor, HoldStep) are now fully
idempotent and return an Abstraction.
Notes about endpoint.sendAbstractionsCache:
- fills lazily from disk state on first `Get` operation
- fill from disk is generally only attempted once
- unless the `ListAbstractions` fails, in which case the fill from
disk is retried on next `Get` (the current `Get` will observe a
subset of the actual on-disk abstractions)
- the `Invalidate` method is called
- it is a global (zrepl process-wide) cache
fixes#316
package trace:
- introduce the concept of tasks and spans, tracked as linked list within ctx
- see package-level docs for an overview of the concepts
- **main feature 1**: unique stack of task and span IDs
- makes it easy to follow a series of log entries in concurrent code
- **main feature 2**: ability to produce a chrome://tracing-compatible trace file
- either via an env variable or a `zrepl pprof` subcommand
- this is not a CPU profile, we already have go pprof for that
- but it is very useful to visually inspect where the
replication / snapshotter / pruner spends its time
( fixes#307 )
usage in package daemon/logging:
- goal: every log entry should have a trace field with the ID stack from package trace
- make `logging.GetLogger(ctx, Subsys)` the authoritative `logger.Logger` factory function
- the context carries a linked list of injected fields which
`logging.GetLogger` adds to the logger it returns
- `logging.GetLogger` also uses package `trace` to get the
task-and-span-stack and injects it into the returned logger's fields
- **Resumable Send & Recv Support**
No knobs required, automatically used where supported.
- **Hold-Protected Send & Recv**
Automatic ZFS holds to ensure that we can always resume a replication step.
- **Encrypted Send & Recv Support** for OpenZFS native encryption.
Configurable at the job level, i.e., for all filesystems a job is responsible for.
- **Receive-side hold on last received dataset**
The counterpart to the replication cursor bookmark on the send-side.
Ensures that incremental replication will always be possible between a sender and receiver.
Design Doc
----------
`replication/design.md` doc describes how we use ZFS holds and bookmarks to ensure that a single replication step is always resumable.
The replication algorithm described in the design doc introduces the notion of job IDs (please read the details on this design doc).
We reuse the job names for job IDs and use `JobID` type to ensure that a job name can be embedded into hold tags, bookmark names, etc.
This might BREAK CONFIG on upgrade.
Protocol Version Bump
---------------------
This commit makes backwards-incompatible changes to the replication/pdu protobufs.
Thus, bump the version number used in the protocol handshake.
Replication Cursor Format Change
--------------------------------
The new replication cursor bookmark format is: `#zrepl_CURSOR_G_${this.GUID}_J_${jobid}`
Including the GUID enables transaction-safe moving-forward of the cursor.
Including the job id enables that multiple sending jobs can send the same filesystem without interfering.
The `zrepl migrate replication-cursor:v1-v2` subcommand can be used to safely destroy old-format cursors once zrepl has created new-format cursors.
Changes in This Commit
----------------------
- package zfs
- infrastructure for holds
- infrastructure for resume token decoding
- implement a variant of OpenZFS's `entity_namecheck` and use it for validation in new code
- ZFSSendArgs to specify a ZFS send operation
- validation code protects against malicious resume tokens by checking that the token encodes the same send parameters that the send-side would use if no resume token were available (i.e. same filesystem, `fromguid`, `toguid`)
- RecvOptions support for `recv -s` flag
- convert a bunch of ZFS operations to be idempotent
- achieved through more differentiated error message scraping / additional pre-/post-checks
- package replication/pdu
- add field for encryption to send request messages
- add fields for resume handling to send & recv request messages
- receive requests now contain `FilesystemVersion To` in addition to the filesystem into which the stream should be `recv`d into
- can use `zfs recv $root_fs/$client_id/path/to/dataset@${To.Name}`, which enables additional validation after recv (i.e. whether `To.Guid` matched what we received in the stream)
- used to set `last-received-hold`
- package replication/logic
- introduce `PlannerPolicy` struct, currently only used to configure whether encrypted sends should be requested from the sender
- integrate encryption and resume token support into `Step` struct
- package endpoint
- move the concepts that endpoint builds on top of ZFS to a single file `endpoint/endpoint_zfs.go`
- step-holds + step-bookmarks
- last-received-hold
- new replication cursor + old replication cursor compat code
- adjust `endpoint/endpoint.go` handlers for
- encryption
- resumability
- new replication cursor
- last-received-hold
- client subcommand `zrepl holds list`: list all holds and hold-like bookmarks that zrepl thinks belong to it
- client subcommand `zrepl migrate replication-cursor:v1-v2`