Commit Graph

118 Commits

Author SHA1 Message Date
Christian Schwarz
6ab05ee1fa reimplement io.ReadWriteCloser based RPC mechanism
The existing ByteStreamRPC requires writing RPC stub + server code
for each RPC endpoint. Does not scale well.

Goal: adding a new RPC call should

- not require writing an RPC stub / handler
- not require modifications to the RPC lib

The wire format is inspired by HTTP2, the API by net/rpc.

Frames are used for framing messages, i.e. a message is made of multiple
frames which are glued together using a frame-bridging reader / writer.
This roughly corresponds to HTTP2 streams, although we're happy with
just one stream at any time and the resulting non-need for flow control,
etc.

Frames are typed using a header. The two most important types are
'Header' and 'Data'.
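
A minimal sketch of the frame layer described above, assuming a hypothetical 6-byte frame header (type, last-frame flag, payload length). The names FrameType, frameHeader, writeMessage, readMessage and frameRoundTrip are illustrative and not taken from the commit, and unlike the frame-bridging reader / writer mentioned above, this version buffers the whole message in memory for brevity:

```go
package main

import (
	"bytes"
	"encoding/binary"
	"errors"
	"fmt"
	"io"
)

// FrameType tags each frame; the concrete values are assumptions.
type FrameType uint8

const (
	FrameTypeHeader FrameType = 1
	FrameTypeData   FrameType = 2
)

// frameHeader is an assumed fixed-size wire header, not the real layout.
type frameHeader struct {
	Type   FrameType
	Last   uint8  // 1 on the final frame of a message
	Length uint32 // number of payload bytes that follow
}

// writeMessage splits msg into frames of at most maxPayload bytes each.
func writeMessage(w io.Writer, t FrameType, msg []byte, maxPayload int) error {
	if maxPayload <= 0 {
		return errors.New("maxPayload must be positive")
	}
	for {
		n := len(msg)
		if n > maxPayload {
			n = maxPayload
		}
		h := frameHeader{Type: t, Length: uint32(n)}
		if n == len(msg) {
			h.Last = 1
		}
		if err := binary.Write(w, binary.BigEndian, h); err != nil {
			return err
		}
		if _, err := w.Write(msg[:n]); err != nil {
			return err
		}
		if h.Last == 1 {
			return nil
		}
		msg = msg[n:]
	}
}

// readMessage glues frames of the expected type back into one message.
func readMessage(r io.Reader, want FrameType) ([]byte, error) {
	var msg bytes.Buffer
	for {
		var h frameHeader
		if err := binary.Read(r, binary.BigEndian, &h); err != nil {
			return nil, err
		}
		if h.Type != want {
			return nil, fmt.Errorf("unexpected frame type %d, want %d", h.Type, want)
		}
		if _, err := io.CopyN(&msg, r, int64(h.Length)); err != nil {
			return nil, err
		}
		if h.Last == 1 {
			return msg.Bytes(), nil
		}
	}
}

// frameRoundTrip sends payload through an in-memory "wire" and reads it back.
func frameRoundTrip(payload string, maxPayload int) (string, error) {
	var wire bytes.Buffer
	if err := writeMessage(&wire, FrameTypeData, []byte(payload), maxPayload); err != nil {
		return "", err
	}
	got, err := readMessage(&wire, FrameTypeData)
	return string(got), err
}
```

Calling frameRoundTrip("hello frames", 5) sends three data frames and reassembles the original string on the reading side.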

The RPC protocol is built on top of this:

- Client sends a header         => multiple frames of type 'header'
- Client sends request body     => multiple frames of type 'data'
- Server reads a header         => multiple frames of type 'header'
- Server reads request body     => multiple frames of type 'data'
- Server sends response header  => ...
- Server sends response body    => ...

An RPC header is serialized JSON and always the same structure.
The body is of the type specified in the header.
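
To illustrate the "serialized JSON, always the same structure" header, here is a small sketch. The field names (Endpoint, DataType, Error) and the helper names are assumptions made for the example; the commit does not spell out the actual header fields:

```go
package main

import "encoding/json"

// rpcHeader is a hypothetical fixed-structure header. The encoded bytes
// would be the payload carried by frames of type 'header'.
type rpcHeader struct {
	Endpoint string `json:"endpoint"`        // which handler to invoke
	DataType string `json:"data_type"`       // how to interpret the body
	Error    string `json:"error,omitempty"` // set on failed responses
}

// encodeHeader serializes a header for transmission.
func encodeHeader(h rpcHeader) ([]byte, error) {
	return json.Marshal(h)
}

// decodeHeader parses a header received from the peer.
func decodeHeader(payload []byte) (rpcHeader, error) {
	var h rpcHeader
	err := json.Unmarshal(payload, &h)
	return h, err
}

// headerRoundTrip encodes and decodes a header, returning both forms.
func headerRoundTrip(endpoint, dataType string) (string, rpcHeader, error) {
	wire, err := encodeHeader(rpcHeader{Endpoint: endpoint, DataType: dataType})
	if err != nil {
		return "", rpcHeader{}, err
	}
	h, err := decodeHeader(wire)
	return string(wire), h, err
}
```

Because the header always has the same structure, the receiver can decode it before it knows anything about the body, and then use the decoded fields to pick the body type.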

The RPC server and client use some semi-fancy reflection techniques to
automatically infer the data type of the request/response body based on
the method signature of the server handler or the client parameters,
respectively.
This boils down to a special case for io.Reader values, which are just
dumped into a series of data frames as efficiently as possible.
All other types are (de)serialized using encoding/json.
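
The reflection idea can be sketched roughly as follows, assuming handlers of the shape func(T) R. callHandler, listRequest, describe, countBytes and reflectDemo are hypothetical names for this example, not the API introduced by the commit:

```go
package main

import (
	"encoding/json"
	"errors"
	"io"
	"reflect"
	"strings"
)

// readerType is the reflect.Type of the io.Reader interface itself.
var readerType = reflect.TypeOf((*io.Reader)(nil)).Elem()

// callHandler inspects the handler's single parameter: an io.Reader gets
// the raw body stream, any other type is decoded from JSON first.
func callHandler(handler interface{}, body io.Reader) (interface{}, error) {
	fn := reflect.ValueOf(handler)
	if fn.Kind() != reflect.Func {
		return nil, errors.New("handler must be a function")
	}
	t := fn.Type()
	if t.NumIn() != 1 || t.NumOut() != 1 {
		return nil, errors.New("handler must have the shape func(T) R")
	}
	if body == nil {
		return nil, errors.New("body must not be nil")
	}
	in := t.In(0)
	var arg reflect.Value
	if in == readerType {
		arg = reflect.ValueOf(body) // special case: pass the stream through
	} else {
		ptr := reflect.New(in) // allocate a *T and decode JSON into it
		if err := json.NewDecoder(body).Decode(ptr.Interface()); err != nil {
			return nil, err
		}
		arg = ptr.Elem()
	}
	out := fn.Call([]reflect.Value{arg})
	return out[0].Interface(), nil
}

// listRequest and the two handlers below are example endpoints.
type listRequest struct {
	Dataset string `json:"dataset"`
}

func describe(req listRequest) string { return "list " + req.Dataset }

func countBytes(r io.Reader) int64 {
	n, _ := io.Copy(io.Discard, r)
	return n
}

// reflectDemo invokes one JSON-typed and one stream-typed handler.
func reflectDemo() (string, int64, error) {
	a, err := callHandler(describe, strings.NewReader(`{"dataset":"zroot/var"}`))
	if err != nil {
		return "", 0, err
	}
	b, err := callHandler(countBytes, strings.NewReader("raw stream"))
	if err != nil {
		return "", 0, err
	}
	return a.(string), b.(int64), nil
}
```

Adding a new endpoint in this sketch only requires writing another function like describe; the dispatch code stays unchanged, which matches the goal stated at the top of this message.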

The RPC layer and Frame Layer log some arbitrary messages that proved
useful during debugging. By default, they log to a non-logger, which
should not have a big impact on performance.
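
A "non-logger" default could look like the following sketch. The Logger interface, discardLogger, memLogger and frameLayer are assumptions for illustration and not the types used in the commit:

```go
package main

import "fmt"

// Logger is the minimal logging interface assumed by this sketch.
type Logger interface {
	Printf(format string, args ...interface{})
}

// discardLogger drops every message; it is the default.
type discardLogger struct{}

func (discardLogger) Printf(format string, args ...interface{}) {}

// memLogger records messages in memory, e.g. while debugging.
type memLogger struct{ lines []string }

func (m *memLogger) Printf(format string, args ...interface{}) {
	m.lines = append(m.lines, fmt.Sprintf(format, args...))
}

// frameLayer stands in for a layer that emits debug messages.
type frameLayer struct{ log Logger }

// newFrameLayer starts with the discarding logger so callers never
// have to nil-check before logging.
func newFrameLayer() *frameLayer {
	return &frameLayer{log: discardLogger{}}
}

// SetLogger swaps in a real logger; nil restores the default.
func (f *frameLayer) SetLogger(l Logger) {
	if l == nil {
		l = discardLogger{}
	}
	f.log = l
}

// sendFrame pretends to send a frame and logs what it did.
func (f *frameLayer) sendFrame(payloadLen int) {
	f.log.Printf("sent frame with %d payload bytes", payloadLen)
}
```

Note that even with a discarding logger the arguments are still evaluated and boxed on every call, so the cost is small rather than zero.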

pprof analysis shows the implementation spends its CPU time
        60% waiting for syscalls
        30% in memmove
        10% ...

On an Intel(R) Core(TM) i7-6600U CPU @ 2.60GHz running Linux 4.12, the
implementation achieved ~3.6GiB/s.

Future optimization may include splice(2) / vmsplice(2) on Linux, although
this doesn't fit so well with the heavy use of io.Reader / io.Writer
throughout the codebase.

The existing hackaround for local calls was re-implemented to fit the
new interface of RPCServer and RPCClient.
The 'R'PC method invocation is a bit slower because reflection is
involved in between, but otherwise performance should be no different.

The RPC code currently does not support multipart requests and thus does
not support the equivalent of a POST.

Thus, the switch to the new rpc code had the following fallout:

- Move request objects + constants from rpc package to main app code
- Sacrifice the hacky 'push = pull me' way of doing push
-> need to further extend RPC to support multipart requests or
     something to implement this properly with additional interfaces
-> should be done after replication is abstracted better than separate
     algorithms for doPull() and doPush()
2017-09-01 19:24:53 +02:00
Christian Schwarz
e5b713ce5b docs: pattern syntax: more precise terminology 2017-08-11 18:45:39 +02:00
Christian Schwarz
64baa3915f docs: bump theme 2017-08-11 18:44:53 +02:00
Christian Schwarz
d9064d46f6 docs: improve welcome page 2017-08-09 23:42:50 +02:00
Christian Schwarz
cd9bfbff6c docs: bump theme version 2017-08-09 23:36:39 +02:00
Christian Schwarz
73d586f305 diff: actually fix publish.sh script 2017-08-09 22:05:29 +02:00
Christian Schwarz
9e9f464de7 docs: remove GH pages repo as submodule, adjust publish.sh
Would need to bump zrepl main repo for every publish otherwise...
2017-08-09 21:41:46 +02:00
Christian Schwarz
44b77a8ef9 rpc: always log goodbye 2017-08-09 21:03:12 +02:00
Christian Schwarz
676ac41677 fix leaking channel when closing connection 2017-08-09 21:03:05 +02:00
Christian Schwarz
ca1a482e9e sshbytestream & IOCommand: fix handling of dead child process
SSH catches SIGTERM, tears down its connection, then exits with
platform-specific exit code.
2017-08-09 21:01:06 +02:00
Christian Schwarz
e2bbd4287e docs: include GH pages repo as submodule 2017-08-09 16:18:13 +02:00
Christian Schwarz
c1e792dc51 docs: initial commit 2017-08-09 16:13:34 +02:00
Christian Schwarz
4e45b4090b pull log output: optimize to be readable by humans 2017-08-06 18:28:05 +02:00
Christian Schwarz
cba083cadf Make zfs.DatasetPath json.Marshaler and json.Unmarshaler
Had to resort to using pointers to zfs.DatasetPath everywhere... Should
find a better solution for that.
2017-08-06 16:22:15 +02:00
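
For the zfs.DatasetPath commit above (cba083cadf), a simplified sketch of a path type implementing json.Marshaler and json.Unmarshaler. The internal representation and the helper names (NewDatasetPath, pullRequest, pathRoundTrip) are assumptions. One plausible reason for the pointers the message mentions is that UnmarshalJSON needs a pointer receiver to modify its target, but the commit does not confirm this:

```go
package main

import (
	"encoding/json"
	"strings"
)

// DatasetPath is a simplified stand-in for zfs.DatasetPath.
type DatasetPath struct {
	comps []string // unexported, so encoding/json cannot see it by default
}

// NewDatasetPath splits a "pool/fs/child" string into components.
func NewDatasetPath(s string) *DatasetPath {
	if s == "" {
		return &DatasetPath{}
	}
	return &DatasetPath{comps: strings.Split(s, "/")}
}

// String joins the components back into the usual slash-separated form.
func (p *DatasetPath) String() string { return strings.Join(p.comps, "/") }

// MarshalJSON encodes the path as a plain JSON string.
func (p *DatasetPath) MarshalJSON() ([]byte, error) {
	return json.Marshal(p.String())
}

// UnmarshalJSON decodes a JSON string into the receiver.
func (p *DatasetPath) UnmarshalJSON(b []byte) error {
	var s string
	if err := json.Unmarshal(b, &s); err != nil {
		return err
	}
	*p = *NewDatasetPath(s)
	return nil
}

// pullRequest is a hypothetical request object holding a path pointer.
type pullRequest struct {
	Filesystem *DatasetPath `json:"filesystem"`
}

// pathRoundTrip marshals a request and unmarshals it again.
func pathRoundTrip(path string) (string, string, error) {
	wire, err := json.Marshal(pullRequest{Filesystem: NewDatasetPath(path)})
	if err != nil {
		return "", "", err
	}
	var back pullRequest
	if err := json.Unmarshal(wire, &back); err != nil {
		return "", "", err
	}
	return string(wire), back.Filesystem.String(), nil
}
```

With pointer receivers on both methods, only *DatasetPath satisfies the two interfaces, which is why the struct field in this sketch is declared as a pointer.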
Christian Schwarz
2ce07c9342 rework filters & mappings
config defines a single data structure that can act both as a Map and as a Filter
(DatasetMapFilter)

Clean up wildcard syntax along the way (also changes semantics).
2017-08-06 16:21:54 +02:00
Christian Schwarz
3fac6a67df extract PullACL check into function 2017-08-06 16:21:54 +02:00
Christian Schwarz
4732fdd4cc Implement placeholder filesystems.
Note the docs on the placeholder user property introduced with this
commit. The solution is not really satisfying, but I couldn't think of a
better one off the top of my head.
2017-08-06 16:21:54 +02:00
Christian Schwarz
8eb4a2ba44 Rudimentary progress reporting on send / recv side. 2017-08-06 16:21:54 +02:00
Christian Schwarz
d1999fc17c Remove months as a possible time interval unit as it is too volatile.
Thanks to @erdgeist for pointing that out.

refs #2
2017-07-09 00:38:16 +02:00
Dirk Engling
5afbedbd87 Shrink the 'monthly' interval from 32 weeks to 32 days 2017-07-09 00:11:02 +02:00
Christian Schwarz
9ab6f18f82 zfs: fix/update tests for diffs for createtxg & guid 2017-07-09 00:08:50 +02:00
Christian Schwarz
4b373fbd95 zfs & replication: explicit conflict types for FilesystemDiff + handling in repl 2017-07-08 13:13:16 +02:00
Christian Schwarz
8e378d76b9 scratchpad: repeat: run a command in a certain interval or as soon as it finishes 2017-07-08 12:08:34 +02:00
Christian Schwarz
2c13fbe6ec config: rename 'pools' section to 'remotes' 2017-07-08 12:08:34 +02:00
Christian Schwarz
e951beaef5 Simplify CLI by requiring explicit job names.
Job names are derived from job type + user-defined name in config file.
CLI now has subcommands corresponding 1:1 to the config file sections:

        push,pull,autosnap,prune

A subcommand always expects a job name, thus executes exactly one job.

Dict-style syntax also used for PullACL and Sink sections.

jobrun package is currently only used for autosnap, all others need to
be invoked repeatedly via external tool.
Plan is to re-integrate jobrun in an explicit daemon-mode (subcommand).
2017-07-08 11:13:50 +02:00
Christian Schwarz
b44a005bbb Switch to using https://github.com/spf13/cobra for CLI.
Use opportunity to structure project by subcommands.
2017-07-06 13:36:55 +02:00
Christian Schwarz
655b3ab55f implement automatic snapshotting feature 2017-07-02 00:02:33 +02:00
Christian Schwarz
8c8a6ee905 implement snapshot pruning feature 2017-07-02 00:02:33 +02:00
Christian Schwarz
e0d39ddf11 Implement RetentionGrid structure. 2017-07-01 23:19:31 +02:00
Christian Schwarz
c7f140a00f zfs: support destroy 2017-07-01 23:19:31 +02:00
Christian Schwarz
c22190e981 zfs: extract filesystem version code to separate file & add filtering support 2017-07-01 23:19:31 +02:00
Christian Schwarz
2b6f3ece6b jobrun: fix timing issue and minor printing issues
The code used to add an offset of 1 sec
(possibly to avoid being scheduled just a little bit too early).

Turns out this leads to delays for jobs with interval < 2s :)
2017-07-01 23:19:31 +02:00
Christian Schwarz
2c50c8fd63 cmd: run: flag for running jobs only once 2017-07-01 23:19:31 +02:00
Christian Schwarz
4f86fa8332 cmd: support for pprof over http 2017-07-01 23:19:31 +02:00
Christian Schwarz
af2aa9dfe1 cmd/jobrun: repeat strategies as part of jobrun 2017-07-01 23:19:25 +02:00
Christian Schwarz
93d098162e cmd: run: select job to run 2017-06-09 20:54:01 +02:00
Christian Schwarz
d8adce6110 zfs: Support foo/bar/* globs 2017-05-20 19:50:24 +02:00
Christian Schwarz
5f84d30972 util/ReadWriteCloserLogger: handle unset readlog | writelog 2017-05-20 19:39:32 +02:00
Christian Schwarz
3b1cac1ea2 cmd: make --logfile global parameter 2017-05-20 18:17:08 +02:00
Christian Schwarz
35dcfc234e Implement push support.
Pushing is achieved by inverting the roles on the established
connection, i.e. the client tells the server what data it should pull
from the client (PullMeRequest).

Role inversion is achieved by moving the server loop to the serverLoop
function of ByteStreamRPC, which can be called from both the Listen()
function (server-side) and the PullMeRequest() client-side function.

A downside of this PullMe approach is that the replication policies
become part of the rpc, because the puller must follow the policy.
2017-05-20 18:17:08 +02:00
Christian Schwarz
c7161cf8e6 handler: remove PushMapping, rename PullMapping to PullACL 2017-05-20 17:43:49 +02:00
Christian Schwarz
3c7f782dac rpc: remove FilesystemRequest.Direction (unused) 2017-05-20 17:43:49 +02:00
Christian Schwarz
40fe7e643d cmd: Move replication logic to separate file. 2017-05-20 17:29:37 +02:00
Christian Schwarz
7ad2ed5956 Rename sink -> stdinserver subcommand. 2017-05-16 16:43:39 +02:00
Christian Schwarz
f36ef41c39 scratchpad/chunker: fix api breakage 2017-05-14 15:58:33 +02:00
Christian Schwarz
defe134c8b sshbytestream: default ServerAliveInterval 2017-05-14 14:16:12 +02:00
Christian Schwarz
b1a3a57623 cmd: close RPC with timeout 2017-05-14 14:11:19 +02:00
Christian Schwarz
48a4e8033a rpc: close outgoing SSH connection on exit. 2017-05-14 14:11:19 +02:00
Christian Schwarz
04206ebd8b util.IOCommand: Close() gracefully via SIGTERM 2017-05-14 14:11:19 +02:00
Christian Schwarz
ee570bb060 refactor: consolidate ForkReader-like implementations to IOCommand 2017-05-14 12:27:15 +02:00