Commit Graph

367 Commits

Author SHA1 Message Date
Christian Schwarz
301c7b2dd5 restructure and rename, making mainfsm the replication package itself 2018-08-22 00:14:12 +02:00
Christian Schwarz
2f205d205b remove EndpointPair abstraction 2018-08-21 22:15:00 +02:00
Christian Schwarz
38532abf45 enforce encapsulation by breaking up replication package into packages
not perfect yet, public shouldn't be required to use 'common' package to
use replication package
2018-08-16 21:05:21 +02:00
Christian Schwarz
c7d28fee8f gofmt 2018-08-16 14:02:33 +02:00
Christian Schwarz
bf1e626b9a proper queue abstraction 2018-08-16 14:02:16 +02:00
Christian Schwarz
93929b61e4 proper locking on FSReplication 2018-08-16 12:01:51 +02:00
Christian Schwarz
5479463783 always use ReplicationState, and have a map from that to the rsfs 2018-08-16 11:02:34 +02:00
Christian Schwarz
094eced2c7 WIP: states with updater func instead of direct locking 2018-08-16 01:26:09 +02:00
Christian Schwarz
991f13a3da Reporting 2018-08-15 20:29:49 +02:00
Christian Schwarz
7303d91abf WIP state-machine based replication 2018-08-11 12:19:10 +02:00
Christian Schwarz
c1f3076eb3 WIP2 logging done somewhat 2018-08-10 17:06:00 +02:00
Christian Schwarz
74445a0017 fixup 2018-08-08 13:12:50 +02:00
Christian Schwarz
a0b320bfeb streamrpc now requires net.Conn => use it instead of rwc everywhere 2018-08-08 13:09:51 +02:00
Christian Schwarz
1826535e6f WIP 2018-07-15 17:36:53 +02:00
Christian Schwarz
1a8d2c5ebe replication: context support and proper closing of stale readers 2018-07-08 23:31:46 +02:00
Christian Schwarz
8cca0a8547 Initial working version
Summary:
* Logging is still bad
* test output in a lot of places
* FIXMEs everywhere

Test Plan: None, just review

Differential Revision: https://phabricator.cschwarz.com/D2
2018-06-24 10:44:00 +02:00
Christian Schwarz
fa6426f803 WIP: zfs: hacky resume token parsing 2018-05-02 21:26:56 +02:00
Christian Schwarz
0918ef6815 WIP: diffing and replication algorithm 2018-05-02 21:26:24 +02:00
Christian Schwarz
181875a89b build: add dependency on prometheus client_golang to Gopkg.toml 2018-04-14 11:41:43 +02:00
Christian Schwarz
67743d2a66 docs: promote monitoring on front page 2018-04-14 11:30:48 +02:00
Christian Schwarz
386d3b19b2 docs: fix missing slash in sampleconf link text 2018-04-14 11:25:31 +02:00
Christian Schwarz
9d7110eaad config: fix shadowed error return values 2018-04-14 11:25:12 +02:00
Christian Schwarz
82ea535692 daemon: expose prometheus in new global.monitoring config section + document it
refs #67
2018-04-14 11:24:47 +02:00
Christian Schwarz
a4da029105 cmd: prometheus job type and Task instrumentation
refs #67
2018-04-13 23:37:53 +02:00
Christian Schwarz
aa3865d0a3 daemon: Job types as dedicated type
refs #67
2018-04-05 22:22:55 +02:00
Christian Schwarz
0895e02844 daemon: Task: track relation to parent job
refs #67
2018-04-05 22:18:22 +02:00
Christian Schwarz
0764f8824e zfs: prometheus metrics
refs #67
2018-04-05 22:12:25 +02:00
Christian Schwarz
30057d4e59 build: fix warning for cached builds with Go 1.10 2018-04-01 17:53:51 +02:00
Christian Schwarz
75fd21e454 make generate: stringer was updated and now uses strconv instead of fmt
bd4635fd25 (diff-0415b5286e4cf3e373f349d917e5e039)
2018-04-01 15:30:04 +02:00
Christian Schwarz
0d2f73d728 docs: tutorial: minor refinements 2018-04-01 14:58:12 +02:00
Christian Schwarz
9b803aad2d docs: tutorial: document known_hosts file setup
fixes #64
2018-04-01 14:58:04 +02:00
Christian Schwarz
fb74addc1e bump go-rwccmd to support ssh error messages
this is a follow-up to ccd062e

fixes #65
2018-04-01 14:34:05 +02:00
Christian Schwarz
7f89372cfa docs: fix enumeration in ssh+stdinserver docs 2018-03-04 17:20:08 +01:00
Christian Schwarz
26b436463d ssh+stdinserver: connect: dial_timeout
This is a follow-up to ccd062e
2018-03-04 17:19:41 +01:00
Christian Schwarz
61af396fdd build: render release artifacts into subdirectory
* reproducible tarball
* includes go version
* sha512sum

The sha512 sum file should be signed manually, don't want that in the
Makefile because we may build in docker.
2018-02-18 16:46:54 +01:00
Christian Schwarz
792c1a23b2 build: track dependency on go-netssh explicitly in Gopkg.toml 2018-02-18 15:26:48 +01:00
Christian Schwarz
7464e967c8 docs: changelog remove senseless headline 2018-02-18 13:35:57 +01:00
Christian Schwarz
921deb43f5 docs: changelog for 0.0.3 2018-02-18 13:35:40 +01:00
Christian Schwarz
4cf910874d rpc: make DataType a stringer, fixing debug messages 2018-02-18 13:33:53 +01:00
Christian Schwarz
3ba3648f0f zfs: use channel as iterator for ZFSList results
The old approach with ZFSList would keep the two-dimensional array of
lines and their fields in memory (for a short time), which could easily
consume 100s of MiB with > 10000 snapshots / bookmarks (see #34)

fixes #61
2018-02-18 13:28:46 +01:00
Christian Schwarz
aa92261ea7 bookmarking: prune policy for bookmarks
refs #34
2018-02-17 20:48:31 +01:00
Christian Schwarz
8e34843eb1 autosnap: do not treat zero fs filter results as fatal 2018-02-17 19:27:00 +01:00
Christian Schwarz
bfaf6fdfbb daemon: fix missing newline on parse error 2018-02-17 17:43:55 +01:00
Christian Schwarz
f992fed968 control pprof rewrite: expose pprof metrics via HTTP server controlled from CLI 2018-02-17 16:20:10 +01:00
Christian Schwarz
94967b596c docs: document changes to ssh+stdinserver transport implementation: ccd062e 2018-02-17 15:16:29 +01:00
Christian Schwarz
759dae4552 build: further fixups of ccd062e: remove ref to deleted sshbytestream subpkg 2018-02-17 14:28:04 +01:00
Christian Schwarz
f3d3a7f5f8 stdinserver: fixup ccd062e: assert socket is in private directory 2018-02-17 14:12:44 +01:00
Christian Schwarz
ccd062e238 ssh+stdinserver: dump sshbytestream for github.com/problame/go-netssh
Cleaner abstractions + underlying go-rwccmd package does proper handling
of asynchronous exits, etc.
2018-02-17 01:08:15 +01:00
Christian Schwarz
fc1c46ffd7 logger: fix ReplaceWith: would cause parent field to be nil
Now WithField and ReplaceWith are wrappers around a common
forkLogger routine

regression introduced in 51377a8
2018-02-16 21:19:15 +01:00
Christian Schwarz
6b5bd0a43c job pull + source: fix broken connection teardown
Issue #56 shows zombie SSH processes.
We fix this by actually Close()ing the RWC in job pull.
If this fixes #56 it also fixes #6 --- it's the same issue.

Additionally, debugging around this revealed another issue: just
Close()ing the sshbytestream in job source will apparently outpace the
normal data stream of stdin and stdout (URG or PUSH flags?), leading
to ugly errors in the logs.
With proper TCP connections, we would simply set the connection to
linger and close it, letting the kernel handle the final timeout. Meh.

refs #56
refs #6
2018-02-16 20:57:27 +01:00