Christian Schwarz
a0f72b585b
remove JobStatus, Task abstraction and 'control status' subcommand
...
Control status will be replaced by job-specific output at some point.
Task was not useful anymore with state machine, may reintroduce
something similar at a later point, but consider alternatives:
- opentracing.io
- embedding everything in ctx
- activity stack would work easily
- log entries via proxy logger.Logger object
- progress reporting should be in status reports of individual jobs
2018-08-26 19:08:30 +02:00
Christian Schwarz
7ff72fb6d9
replication: document most important aspects of Endpoint interface
2018-08-26 15:12:43 +02:00
Christian Schwarz
f6be5b776b
cmd: clean up usage of contextKeyLog through getter and setter functions
2018-08-26 14:58:57 +02:00
Christian Schwarz
666ead2646
make go vet happy
2018-08-26 14:51:20 +02:00
Christian Schwarz
ea0e3a29e4
fixup 88de8ba8bb: gofmt
2018-08-25 22:30:44 +02:00
Christian Schwarz
71203ab325
move various timeouts to package-level variables
2018-08-25 22:30:16 +02:00
Christian Schwarz
88de8ba8bb
initial repl policy: get rid of unimplemented options
2018-08-25 22:23:47 +02:00
Christian Schwarz
861e5f8313
special logging fields: from now on only 'job', 'task', 'subsystem'
2018-08-25 22:15:37 +02:00
Christian Schwarz
e30ae972f4
gofmt
2018-08-25 21:30:25 +02:00
Christian Schwarz
e082816de5
fixup d677cde6d0: unused import
2018-08-25 15:16:38 +02:00
Christian Schwarz
f46d1bc338
fixup 70aad0940f: fix broken config_test.go
2018-08-25 13:02:38 +02:00
Christian Schwarz
51cfcfe79b
job source: do not stop listener on accept() errors
...
refs #77
2018-08-25 13:00:51 +02:00
Christian Schwarz
d677cde6d0
implement tcp and tcp+tls transports
2018-08-25 12:58:17 +02:00
Christian Schwarz
54c9dcb7c1
move replication policy constants to package replication
2018-08-22 10:11:14 +02:00
Christian Schwarz
9b537ec704
simplify naming in endpoint package
2018-08-22 10:05:21 +02:00
Christian Schwarz
70aad0940f
cmd: move replication endpoints into subpackage
2018-08-22 00:43:58 +02:00
Christian Schwarz
7b3a84e2a3
move replication package to project root (independent of cmd package)
2018-08-22 00:19:03 +02:00
Christian Schwarz
301c7b2dd5
restructure and rename, making mainfsm the replication package itself
2018-08-22 00:14:12 +02:00
Christian Schwarz
2f205d205b
remove EndpointPair abstraction
2018-08-21 22:15:00 +02:00
Christian Schwarz
38532abf45
enforce encapsulation by breaking up replication package into packages
...
not perfect yet: users shouldn't be required to import the 'common' package
in order to use the replication package
2018-08-16 21:05:21 +02:00
Christian Schwarz
c7d28fee8f
gofmt
2018-08-16 14:02:33 +02:00
Christian Schwarz
bf1e626b9a
proper queue abstraction
2018-08-16 14:02:16 +02:00
Christian Schwarz
93929b61e4
proper locking on FSReplication
2018-08-16 12:01:51 +02:00
Christian Schwarz
5479463783
always use ReplicationState, and have a map from that to the rsfs
2018-08-16 11:02:34 +02:00
Christian Schwarz
094eced2c7
WIP: states with updater func instead of direct locking
2018-08-16 01:26:09 +02:00
Christian Schwarz
991f13a3da
Reporting
2018-08-15 20:29:49 +02:00
Christian Schwarz
7303d91abf
WIP state-machine based replication
2018-08-11 12:19:10 +02:00
Christian Schwarz
c1f3076eb3
WIP2 logging done somewhat
2018-08-10 17:06:00 +02:00
Christian Schwarz
74445a0017
fixup
2018-08-08 13:12:50 +02:00
Christian Schwarz
a0b320bfeb
streamrpc now requires net.Conn => use it instead of rwc everywhere
2018-08-08 13:09:51 +02:00
Christian Schwarz
1826535e6f
WIP
2018-07-15 17:36:53 +02:00
Christian Schwarz
1a8d2c5ebe
replication: context support and proper closing of stale readers
2018-07-08 23:31:46 +02:00
Christian Schwarz
8cca0a8547
Initial working version
...
Summary:
* Logging is still bad
* test output in a lot of places
* FIXMEs everywhere
Test Plan: None, just review
Differential Revision: https://phabricator.cschwarz.com/D2
2018-06-24 10:44:00 +02:00
Christian Schwarz
0918ef6815
WIP: diffing and replication algorithm
2018-05-02 21:26:24 +02:00
Christian Schwarz
9d7110eaad
config: fix shadowed error return values
2018-04-14 11:25:12 +02:00
Christian Schwarz
82ea535692
daemon: expose prometheus in new global.monitoring config section + document it
...
refs #67
2018-04-14 11:24:47 +02:00
Christian Schwarz
a4da029105
cmd: prometheus job type and Task instrumentation
...
refs #67
2018-04-13 23:37:53 +02:00
Christian Schwarz
aa3865d0a3
daemon: Job types as dedicated type
...
refs #67
2018-04-05 22:22:55 +02:00
Christian Schwarz
0895e02844
daemon: Task: track relation to parent job
...
refs #67
2018-04-05 22:18:22 +02:00
Christian Schwarz
26b436463d
ssh+stdinserver: connect: dial_timeout
...
This is a follow-up to ccd062e
2018-03-04 17:19:41 +01:00
Christian Schwarz
aa92261ea7
bookmarking: prune policy for bookmarks
...
refs #34
2018-02-17 20:48:31 +01:00
Christian Schwarz
8e34843eb1
autosnap: do not treat zero fs filter results as fatal
2018-02-17 19:27:00 +01:00
Christian Schwarz
bfaf6fdfbb
daemon: fix missing newline on parse error
2018-02-17 17:43:55 +01:00
Christian Schwarz
f992fed968
control pprof rewrite: expose pprof metrics via HTTP server controlled from CLI
2018-02-17 16:20:10 +01:00
Christian Schwarz
94967b596c
docs: document changes to ssh+stdinserver transport implementation: ccd062e
2018-02-17 15:16:29 +01:00
Christian Schwarz
f3d3a7f5f8
stdinserver: fixup ccd062e: assert socket is in private directory
2018-02-17 14:12:44 +01:00
Christian Schwarz
ccd062e238
ssh+stdinserver: dump sshbytestream for github.com/problame/go-netssh
...
Cleaner abstractions + underlying go-rwccmd package does proper handling
of asynchronous exits, etc.
2018-02-17 01:08:15 +01:00
Christian Schwarz
6b5bd0a43c
job pull + source: fix broken connection teardown
...
Issue #56 shows zombie SSH processes.
We fix this by actually Close()ing the RWC in job pull.
If this fixes #56 it also fixes #6 --- it's the same issue.
Additionally, debugging around this revealed another issue: just
Close()ing the sshbytestream in job source will apparently outpace the
normal data stream of stdin and stdout (URG or PUSH flags?), leading
to ugly errors in the logs.
With proper TCP connections, we would simply set the connection to
linger and close it, letting the kernel handle the final timeout. Meh.
refs #56
refs #6
2018-02-16 20:57:27 +01:00
Christian Schwarz
921bccb960
job source: use task logger
2018-02-15 23:51:57 +01:00
Christian Schwarz
5f2c14adab
zfs: use custom datatype to pass ZFS properties in ZFSSet
...
refs #55
2018-01-05 18:42:10 +01:00