Control status will be replaced by job-specific output at some point.
Task was not useful anymore with state machine, may reintroduce
something similar at a later point, but consider alternatives:
- opentracing.io
- embedding everything in ctx
- activity stack would work easily
- log entries via proxy logger.Logger object
- progress reporting should be in status reports of individial jobs
Summary:
* Logging is still bad
* test output in a lot of placed
* FIXMEs every where
Test Plan: None, just review
Differential Revision: https://phabricator.cschwarz.com/D2
Issue #56 shows zombie SSH processes.
We fix this by actually Close()ing the RWC in job pull.
If this fixes#56 it also fixes#6 --- it's the same issue.
Additionally, debugging around this revealed another issue: just
Close()ing the sshbytestream in job source will apparently outpace the
normal data stream of stdin and stdout (URG or PUSH flags?). leading
to ugly errors in the logs.
With proper TCP connections, we would simply set the connection to
linger and close it, letting the kernel handle the final timeout. Meh.
refs #56
refs #6
While filesystems is also not the right term (since it excludes ZVOLs),
we want to stay consistent with comments & terminology used in docs.
BREAK CONFIG
fixes#17
We lost the nice context-stack [jobname][taskname][...] at the beginning
of each log line when switching to logrus.
Define some field names that define these contexts.
Write a human-friendly formatter that presents these field names like
the solution we had before logrus.
Write some other formatters for logfmt and json output along the way.
Limit ourselves to stdout logging for now.
Implement
* pruning on source side
* local job
* test subcommand for doing a dry-run of a prune policy
* use a non-blocking callback from autosnap to trigger the depending
jobs -> avoids races, looks saner in the debug log
Done:
* implement autosnapper that asserts interval between snapshots
* implement pruner
* job pull: pulling + pruning
* job source: autosnapping + serving
TODO
* job source: pruning
* job local: everything
* fatal errors such as serve that cannot bind socket must be more
visible
* couldn't things that need a snapshotprefix just use a interface
Prefixer() instead? then we could have prefixsnapshotfilter and not
duplicate it every time...
* either go full context.Context or not at all...? just wait because
community climate around it isn't that great and we only need it for
cancellation? roll our own?
Don't use jobrun for daemon, just call JobDo() once, the job must
organize stuff itself.
Sacrifice all the oneshot commands, they will be reintroduced as
client-calls to the daemon.