Commit Graph

52 Commits

Author SHA1 Message Date
Christian Schwarz
f8200a6386 replication/driver: regulate per-filesystem step planning through stepQueue
before this patch, we'd have unbounded parallelism for ListFilesystemVersions RPCs
2020-02-14 21:42:03 +01:00
Christian Schwarz
f9c7766073 replication/logic: fix race when reading byte counter pointer for report
fixes #214
2019-09-28 16:16:19 +02:00
Christian Schwarz
77d3a1ad4d build: drop go Dep, switch to modules, support Go 1.13
bump enumer to v1.1.1
bump golangci-lint to v1.17.1

no `go mod tidy` because 1.13 and 1.12 seem to alter each other's output

fixes #112
2019-09-14 13:36:44 +02:00
Christian Schwarz
000d8bba66 hotfix: limit concurrency of zfs send & recv commands
ATM, the replication logic sends all dry-run requests in parallel,
which might overwhelm the ZFS pool on the sending side.
Since we use rpc/dataconn for dry sends, this also opens one TCP
connection per dry-run request.

Use a sempahore to limit the degree of concurrency where we know it is a
problem ATM.
As indicated by the comments, the cleaner solution would involve some
kind of 'resource exhaustion' error code.

refs #161
refs #164
2019-03-28 22:17:12 +01:00
Christian Schwarz
5b97953bfb run golangci-lint and apply suggested fixes 2019-03-27 13:12:26 +01:00
Christian Schwarz
afed762774 format source tree using goimports 2019-03-22 19:41:12 +01:00
Christian Schwarz
86fdcfc437 replication: stepqueue: make TestPqNotconcurrent less flaky 2019-03-19 18:50:12 +01:00
Christian Schwarz
edcd258cc9 replication: more elaborate messages for Conflict errors 2019-03-13 18:46:04 +01:00
Christian Schwarz
d50e553ebb handle changes to placeholder state correctly
We assumed that `zfs recv -F FS` would basically replace FS inplace, leaving its children untouched.
That is in fact not the case, it only works if `zfs send -R` is set, which we don't do.

Thus, implement the required functionality manually.

This solves a `zfs recv` error that would occur when a filesystem previously created as placeholder on the receiving side becomes a non-placeholder filesystem (likely due to config change on the sending side):

  zfs send pool1/foo@1 | zfs recv -F pool1/bar
  cannot receive new filesystem stream:
  destination has snapshots (eg. pool1/bar)
  must destroy them to overwrite it
2019-03-13 18:46:04 +01:00
Christian Schwarz
1eb0f12a61 replication: add diff test case 2019-03-13 18:45:40 +01:00
Christian Schwarz
8129ed91f1 zfs + replication: migrate dead zfs/diff_test.go to replication/logic/diff
(and remove the dead code from package zfs)
2019-03-13 16:39:10 +01:00
Christian Schwarz
c87759affe replication/driver: automatic retries on connectivity-related errors 2019-03-13 15:00:40 +01:00
Christian Schwarz
07b43bffa4 replication: refactor driving logic (no more explicit state machine) 2019-03-13 15:00:40 +01:00
Christian Schwarz
796c5ad42d rpc rewrite: control RPCs using gRPC + separate RPC for data transfer
transport/ssh: update go-netssh to new version
    => supports CloseWrite and Deadlines
    => build: require Go 1.11 (netssh requires it)
2019-03-13 13:53:48 +01:00
Christian Schwarz
3535b251ab freeze Go build dependencies in Gopkg.lock
* use pseudo-depdencies in build/build.go to convince dep
* update Travis, Dockerfile and Docs
* build.Dockerfile image now contains the Go build dependencies
* => faster builds
* bump pdu file after protoc update

fixes #106
2018-12-01 14:36:40 +01:00
Christian Schwarz
3472145df6 pruner + proto change: better handling of missing replication cursor
- don't treat missing replication cursor as an error in protocol
- treat it as a per-fs planning error instead
2018-11-16 12:21:54 +01:00
Christian Schwarz
1691839c6b replication: handle context cancellation errors as GlobalError 2018-10-21 19:06:35 +02:00
Christian Schwarz
94427d334b replication + pruner + watchdog: adjust timeouts based on practical experience 2018-10-21 18:37:57 +02:00
Christian Schwarz
b2844569c8 replication: rewrite error handling + simplify state machines
* Remove explicity state machine code for all but replication.Replication
* Introduce explicit error types that satisfy interfaces which provide
  sufficient information for replication.Replication to make intelligent
  retry + queuing decisions

  * Temporary()
  * LocalToFS()

* Remove the queue and replace it with a simple array that we sort each
  time (yay no generics :( )
2018-10-21 18:37:57 +02:00
Christian Schwarz
fffda09f67 replication + pruner: progress markers during planning 2018-10-21 17:50:08 +02:00
Christian Schwarz
45373168ad replication: fix retry wait behavior
An fsrep.Replication is either Ready, Retry or in a terminal state.
The queue prefers Ready over Retry:

Ready is sorted by nextStepDate to progress evenly..
Retry is sorted by error count, to de-prioritize filesystems that fail
often. This way we don't get stuck with individual filesystems
and lose other working filesystems to the watchdog.

fsrep.Replication no longer blocks in Retry state, we have
replication.WorkingWait for that.
2018-10-19 17:23:00 +02:00
Christian Schwarz
69bfcb7bed daemon/active: implement watchdog to handle stuck replication / pruners
ActiveSide.do() can only run sequentially, i.e. we cannot run
replication and pruning in parallel. Why?

* go-streamrpc only allows one active request at a time
(this is bad design and should be fixed at some point)
* replication and pruning are implemented independently, but work on the
same resources (snapshots)

A: pruning might destroy a snapshot that is planned to be replicated
B: replication might replicate snapshots that should be pruned

We do not have any resource management / locking for A and B, but we
have a use case where users don't want their machine fill up with
snapshots if replication does not work.
That means we _have_ to run the pruners.

A further complication is that we cannot just cancel the replication
context after a timeout and move on to the pruner: it could be initial
replication and we don't know how long it will take.
(And we don't have resumable send & recv yet).

With the previous commits, we can implement the watchdog using context
cancellation.
Note that the 'MadeProgress()' calls can only be placed right before
non-error state transition. Otherwise, we could end up in a live-lock.
2018-10-19 17:23:00 +02:00
Christian Schwarz
4ede99b08c replication: simpler PermanentError state + handle context cancellation 2018-10-19 17:23:00 +02:00
Christian Schwarz
3c06235dca replication + zfs: leave From field instead of To field empty for initial send 2018-10-14 13:06:23 +02:00
Christian Schwarz
59a4e2db5f replication: regenerate pdu.pb with new protoc-gen-go 2018-10-13 17:23:39 +02:00
Christian Schwarz
af3d96dab8 use enumer generate tool for state strings 2018-10-12 22:10:49 +02:00
Christian Schwarz
cb83a26c90 replication: wakeup + retry handling: make wakeups work in retry wait states
- handle wakeups in Planning state
- fsrep.Replication yields immediately in RetryWait
- once the queue only contains fsrep.Replication in retryWait:
transition replication.Replication into WorkingWait state
- handle wakeups in WorkingWait state, too
2018-10-12 13:12:28 +02:00
Christian Schwarz
d17ecc3b5c replication/fsrep: report Pending[0] problem as fsrep problem in RetryWait state 2018-10-12 12:45:37 +02:00
Christian Schwarz
2990193512 replication: export SleepUntil in report 2018-09-24 19:23:53 +02:00
Christian Schwarz
fa47667f31 bring back prometheus metrics, with new metrics for replication state machine 2018-09-07 22:22:34 -07:00
Christian Schwarz
975fdee217 replication & pruning: ditch replicated-property, use bookmark as cursor instead
A bookmark with a well-known name is used to track which version was
last successfully received by the receiver.
The createtxg that can be retrieved from the bookmark using `zfs get` is
used to set the Replicated attribute of each snap on the sender:
If the snap's CreateTXG > the cursor's, it is not yet replicated,
otherwise it has been.

There is an optional config option to change the behvior to
`CreateTXG >= the cursor's`, and the implementation defaults to that.

The reason: While things work just fine with `CreateTXG > the cursor's`,
ZFS does not provide size estimates in a `zfs send` dry run
(see acd2418).
However, to enable the use case of keeping the snapshot only around for
the replication, the config flag exists.
2018-09-05 19:51:06 -07:00
Christian Schwarz
acd2418803 handle DryRun send size estimate errors with bookmarks 2018-09-05 17:41:25 -07:00
Christian Schwarz
8eade3d20a replication/pdu: fix broken test 2018-09-04 17:01:46 -07:00
Christian Schwarz
be57d6ce8e replication/diff: replace invalid comparison of CreateTXG with Creation 2018-09-04 14:01:48 -07:00
Christian Schwarz
0c4a3f8dc4 pruning/history: properly communicate via rpc if snapshot does not exist 2018-09-04 14:01:48 -07:00
Christian Schwarz
ad28fd1ecb replication: diff does not need special case for receiver/sender == nil 2018-09-02 15:46:42 -07:00
Christian Schwarz
b95e983d0d bump go-streamrpc to 0.2, cleanup logging
logging should be user-friendly in INFO mode
2018-09-02 15:45:18 -07:00
Anton Schirg
f387e23214 fix: at least two snapshots were needed to start replication 2018-08-30 19:20:18 +02:00
Anton Schirg
48feaff054 fix some status display alignment 2018-08-30 15:21:07 +02:00
Anton Schirg
b5957aca37 do dry runs in planning stage to estimate size of all sends 2018-08-30 12:59:16 +02:00
Anton Schirg
98f3f3dfd8 show expected size of current send
Needs to be changed to send sizes for all planned steps
2018-08-30 12:58:13 +02:00
Anton Schirg
6ca11a7391 byte counter for status 2018-08-30 12:54:30 +02:00
Christian Schwarz
22ca80eb7e remote snapshot destruction & replication status zfs property 2018-08-30 11:51:47 +02:00
Christian Schwarz
a2aa8e7bd7 finish pruner implementation 2018-08-29 19:00:45 +02:00
Christian Schwarz
ee5445777d logging format 'human': continue printing prefixed fields if some are missing 2018-08-26 19:13:09 +02:00
Christian Schwarz
7ff72fb6d9 replication: document most important aspects of Endpoint interface 2018-08-26 15:12:43 +02:00
Christian Schwarz
cf01086df5 build: pin protoc version and update protobuf + regenerate 2018-08-26 14:35:18 +02:00
Christian Schwarz
71203ab325 move various timeouts to package-level variables 2018-08-25 22:30:16 +02:00
Christian Schwarz
88de8ba8bb initial repl policy: get rid of unimplemented options 2018-08-25 22:23:47 +02:00
Christian Schwarz
e30ae972f4 gofmt 2018-08-25 21:30:25 +02:00