Christian Schwarz
7303d91abf
WIP state-machine based replication
2018-08-11 12:19:10 +02:00
Christian Schwarz
c1f3076eb3
WIP2 logging done somewhat
2018-08-10 17:06:00 +02:00
Christian Schwarz
74445a0017
fixup
2018-08-08 13:12:50 +02:00
Christian Schwarz
a0b320bfeb
streamrpc now requires net.Conn => use it instead of rwc everywhere
2018-08-08 13:09:51 +02:00
Christian Schwarz
1826535e6f
WIP
2018-07-15 17:36:53 +02:00
Christian Schwarz
1a8d2c5ebe
replication: context support and proper closing of stale readers
2018-07-08 23:31:46 +02:00
Christian Schwarz
8cca0a8547
Initial working version
...
Summary:
* Logging is still bad
* test output in a lot of places
* FIXMEs everywhere
Test Plan: None, just review
Differential Revision: https://phabricator.cschwarz.com/D2
2018-06-24 10:44:00 +02:00
Christian Schwarz
0918ef6815
WIP: diffing and replication algorithm
2018-05-02 21:26:24 +02:00
Christian Schwarz
9d7110eaad
config: fix shadowed error return values
2018-04-14 11:25:12 +02:00
Christian Schwarz
82ea535692
daemon: expose prometheus in new global.monitoring config section + document it
...
refs #67
2018-04-14 11:24:47 +02:00
Christian Schwarz
a4da029105
cmd: prometheus job type and Task instrumentation
...
refs #67
2018-04-13 23:37:53 +02:00
Christian Schwarz
aa3865d0a3
daemon: Job types as dedicated type
...
refs #67
2018-04-05 22:22:55 +02:00
Christian Schwarz
0895e02844
daemon: Task: track relation to parent job
...
refs #67
2018-04-05 22:18:22 +02:00
Christian Schwarz
26b436463d
ssh+stdinserver: connect: dial_timeout
...
This is a follow-up to ccd062e
2018-03-04 17:19:41 +01:00
Christian Schwarz
aa92261ea7
bookmarking: prune policy for bookmarks
...
refs #34
2018-02-17 20:48:31 +01:00
Christian Schwarz
8e34843eb1
autosnap: do not treat zero fs filter results as fatal
2018-02-17 19:27:00 +01:00
Christian Schwarz
bfaf6fdfbb
daemon: fix missing newline on parse error
2018-02-17 17:43:55 +01:00
Christian Schwarz
f992fed968
control pprof rewrite: expose pprof metrics via HTTP server controlled from CLI
2018-02-17 16:20:10 +01:00
Christian Schwarz
94967b596c
docs: document changes to ssh+stdinserver transport implementation: ccd062e
2018-02-17 15:16:29 +01:00
Christian Schwarz
f3d3a7f5f8
stdinserver: fixup ccd062e: assert socket is in private directory
2018-02-17 14:12:44 +01:00
Christian Schwarz
ccd062e238
ssh+stdinserver: dump sshbytestream for github.com/problame/go-netssh
...
Cleaner abstractions + underlying go-rwccmd package does proper handling
of asynchronous exits, etc.
2018-02-17 01:08:15 +01:00
Christian Schwarz
6b5bd0a43c
job pull + source: fix broken connection teardown
...
Issue #56 shows zombie SSH processes.
We fix this by actually Close()ing the RWC in job pull.
If this fixes #56 it also fixes #6 --- it's the same issue.
Additionally, debugging around this revealed another issue: just
Close()ing the sshbytestream in job source will apparently outpace the
normal data stream of stdin and stdout (URG or PUSH flags?), leading
to ugly errors in the logs.
With proper TCP connections, we would simply set the connection to
linger and close it, letting the kernel handle the final timeout. Meh.
refs #56
refs #6
2018-02-16 20:57:27 +01:00
Christian Schwarz
921bccb960
job source: use task logger
2018-02-15 23:51:57 +01:00
Christian Schwarz
5f2c14adab
zfs: use custom datatype to pass ZFS properties in ZFSSet
...
refs #55
2018-01-05 18:42:10 +01:00
Christian Schwarz
787675aee8
control status command: only show verbose logs on user request
2017-12-30 13:53:19 +01:00
Christian Schwarz
01e0519b7b
control status subcommand: fix typo in usage
2017-12-30 13:44:55 +01:00
Christian Schwarz
8742b7f763
handler: fix typo in log message
2017-12-30 13:29:04 +01:00
Christian Schwarz
56f13741f9
test pattern subcommand: better example command
2017-12-29 22:45:38 +01:00
Christian Schwarz
61842988b9
Task & TaskStatus: DeepCopy(): actually copy lastUpdate field
...
otherwise, only changes to activity level would update TaskStatus
LastUpdate field
refs #10
2017-12-29 21:43:12 +01:00
Christian Schwarz
be7176bee7
Puller: fix wrong filesystem log field usage
...
was introduced in 9465b593
2017-12-29 21:25:42 +01:00
Christian Schwarz
839eccf513
logger.Outlet: WriteEntry must not block
...
- make TCPOutlet fully asynchronous, dropping messages if connection is
not fast enough
- syslog is just fine for now, local anyways
- stdout same thing
refs #26
2017-12-29 17:21:58 +01:00
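The non-blocking WriteEntry behaviour described in 839eccf513 can be illustrated with a buffered channel and a select with a default case. This is only a minimal sketch, not the code from the commit: the AsyncOutlet type, its queue field, and the boolean return value are hypothetical names chosen for illustration.

```go
package main

import "fmt"

// AsyncOutlet is a hypothetical outlet that hands entries to a slow writer
// (for example a TCP connection) through a bounded queue.
type AsyncOutlet struct {
	queue chan string
}

// NewAsyncOutlet creates an outlet whose queue holds at most capacity entries.
func NewAsyncOutlet(capacity int) *AsyncOutlet {
	return &AsyncOutlet{queue: make(chan string, capacity)}
}

// WriteEntry never blocks the caller. If the queue is full, the entry is
// dropped and false is returned.
func (o *AsyncOutlet) WriteEntry(entry string) bool {
	select {
	case o.queue <- entry:
		return true
	default:
		return false
	}
}

func main() {
	// No consumer is running here, so the third entry does not fit.
	// A real outlet would drain o.queue in a separate goroutine.
	o := NewAsyncOutlet(2)
	for _, e := range []string{"first", "second", "third"} {
		fmt.Printf("%s queued=%v\n", e, o.WriteEntry(e))
	}
}
```

The trade-off is the one named in the commit message: a slow connection loses log messages instead of stalling the code that logs.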
Christian Schwarz
acd9aedb98
cmd control status: unify job logs, option to show only one job & always show logs
...
refs #10
2017-12-27 18:34:24 +01:00
Christian Schwarz
835cf6b12f
cmd control status: warn about inactive tasks
...
refs #10
2017-12-27 18:34:24 +01:00
Christian Schwarz
4b3d83ec1f
TaskStatus: add LastUpdate field
...
refs #10
2017-12-27 18:34:24 +01:00
Christian Schwarz
d13c6e3fc3
job local: refactor + use Task API
...
refs #10
2017-12-27 18:34:24 +01:00
Christian Schwarz
63fa7a67e9
job source: refactor + use Task API
...
refs #10
2017-12-27 18:34:24 +01:00
Christian Schwarz
7d89d1fb00
job pull: refactor + use Task API
...
refs #10
2017-12-27 18:34:24 +01:00
Christian Schwarz
b69089a527
Puller: refactor + use Task API
...
* drop rx byte count functionality
* will be re-added to Task as necessary
refs #10
2017-12-27 14:39:47 +01:00
Christian Schwarz
59e34942d1
Puller: make main interface public
...
refs #10
2017-12-27 14:39:46 +01:00
Christian Schwarz
91c4a97f72
Pruner: refactor + use Task API
...
refs #10
2017-12-27 14:39:46 +01:00
Christian Schwarz
13562b48ed
IntervalAutosnap: refactor + use Task API
...
refs #10
2017-12-27 14:39:46 +01:00
Christian Schwarz
58ee796394
adopt Task API: infect datastructures
...
refs #10
2017-12-27 14:39:46 +01:00
Christian Schwarz
ce351146cf
job control: implement JobStatus
2017-12-27 14:39:46 +01:00
Christian Schwarz
14b8d69a63
cmd control status + expose DaemonStatus via control API
...
refs #10
2017-12-27 14:39:46 +01:00
Christian Schwarz
8c7e373049
daemon: DaemonStatus + JobStatus + dummy implementation
...
refs #10
2017-12-27 14:39:46 +01:00
Christian Schwarz
2c87b15e83
daemon: Task abstraction + TaskStatus
...
An instance of Task tracks a single thread of activity that is part of a Job.
While the docs already use this terminology of jobs being composed of tasks,
the code did not have an object to represent these semantics.
Now it does:
* A task t is initialized with a root activity, which is its name
* t can t.Enter() and t.Finish() an activity, building
a stack of activities
* t's code can get a logger t.Log() whose logTaskField is set to the
concatenated stack of activities
* t's code can update IO progress it made since leaving idle state
* t's code's log output via t.Log() is captured since leaving idle
state
* FIXME: find a way to bound that buffer
refs #10
refs #48
2017-12-27 14:39:46 +01:00
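The activity stack described in 2c87b15e83 can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the daemon's actual Task type: the "." separator, the LogField method, and the rule that the root activity is never popped are all choices made for this sketch, and IO progress tracking and log capture are left out.

```go
package main

import (
	"fmt"
	"strings"
)

// Task is a hypothetical, stripped-down stand-in for the Task abstraction.
type Task struct {
	activities []string // stack; element 0 is the root activity (the task's name)
}

// NewTask initializes a task with its root activity.
func NewTask(name string) *Task {
	return &Task{activities: []string{name}}
}

// Enter pushes an activity onto the stack.
func (t *Task) Enter(activity string) {
	t.activities = append(t.activities, activity)
}

// Finish pops the most recently entered activity.
// In this sketch the root activity is never popped.
func (t *Task) Finish() {
	if len(t.activities) > 1 {
		t.activities = t.activities[:len(t.activities)-1]
	}
}

// LogField returns the concatenated stack of activities, i.e. the value a
// logger obtained from the task would carry in its task field.
func (t *Task) LogField() string {
	return strings.Join(t.activities, ".")
}

func main() {
	task := NewTask("pull")
	task.Enter("replicate")
	task.Enter("send")
	fmt.Println(task.LogField())
	task.Finish()
	fmt.Println(task.LogField())
}
```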
Christian Schwarz
d7f3fb93ae
bash completions: hidden subcommand + integrate into Makefile
2017-12-27 14:39:46 +01:00
Christian Schwarz
ebf209427a
logging: support ignoring fields in HumanFormatter
...
should be moved into the logger one day so the implementation of ignoring
is not duplicated in each outlet.
refs #10
2017-12-27 14:39:46 +01:00
Christian Schwarz
261d095108
logger: support forking of outlets
...
refs #10
2017-12-27 13:50:07 +01:00
Christian Schwarz
583a63a68f
refactor: encapsulate pulling in a struct
...
refs #10
2017-12-24 15:23:28 +01:00
Christian Schwarz
896f31bbf3
'zrepl version' and 'zrepl control version' subcommand + maintainer README
...
Version is autodetected on build using git
If it cannot be detected with git, an override must be provided.
For traceability of distros, the distro packagers should override it as
well, which is why I added a README entry for package maintainers.
refs #35
2017-11-18 21:12:48 +01:00
Christian Schwarz
bfbab9382e
fixup: remove unused StdoutOutlet function
...
refs #28
2017-11-17 00:36:48 +01:00
Christian Schwarz
2bfcfa5be8
logging: first outlet receives logger error message
...
Abandons stderr special-casing:
* looks weird on shell and IO redirection to same file because of
interleaving of stdout and stderr
* better than a separate dedicated outlet because it does not require
additional configuration
fixes #28
BREAK SEMANTICS CONFIG
2017-11-17 00:25:38 +01:00
Christian Schwarz
a7f70a566d
logger: write internal / outlet errors to an error outlet
...
refs #28
2017-11-16 23:49:47 +01:00
Christian Schwarz
b576253ea8
logging: fixup 4763486: implementation would parse 'date' instead of 'time' field in config
2017-11-15 11:14:20 +01:00
Christian Schwarz
476348689a
logging: stdout outlet: include time in output if tty or forced through config
2017-11-15 11:04:34 +01:00
Christian Schwarz
ed68bffea5
bookmark every snapshot
...
replication logic already supports bookmarks \o/
refs #34
2017-11-13 10:59:46 +01:00
Christian Schwarz
51af880701
refactor: parametrize PrefixFilter VersionType check
...
refs #34
2017-11-13 10:59:22 +01:00
Christian Schwarz
cef63ac176
logging: stdout formatter: use logfmt package to format non-special stdout fields + handle errors
...
refs #40
2017-11-13 10:58:07 +01:00
Christian Schwarz
f3433df617
cmd/sampleconf/zrep.yml: remove it, it's from the stone ages
2017-10-05 21:48:18 +02:00
Christian Schwarz
161ce3b3c3
autosnap: fix log level when fs filter does not match any fs
2017-10-05 21:22:17 +02:00
Christian Schwarz
83bb97a845
control job: wrong error on context done
2017-10-05 21:20:01 +02:00
Christian Schwarz
40919d06c2
source job: fix erroneous log message when accept() on closed listener
2017-10-05 21:19:42 +02:00
Christian Schwarz
c48069ce88
retention grid: interval length monotonicity: exception for keep=all
...
fixes #6
2017-10-05 20:34:35 +02:00
Christian Schwarz
72d288567e
mappings: fix aliasing bug with '<' wildcards
...
In contrast to any 'something<' mapping, a '<' mapping cannot be unique.
Thus, '<' mappings are just an append to target, which is exactly
what we get when trimming empty prefix ''.
Otherwise, given mapping
{ "<": "storage/backups/app-srv" }
Before (clearly a conflict)
zroot => storage/backups/app-srv
storage => storage/backups/app-srv
After:
zroot => storage/backups/app-srv/zroot
storage => storage/backups/app-srv/storage
However, mapping directly with subtree wildcard is still possible, just
not with the root wildcard
{
"<": "storage/backups/app-srv"
"zroot/var/db<": "storage/db_replication/app-srv"
}
fixes #22
2017-10-05 20:10:05 +02:00
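The mapping behaviour described in 72d288567e (trim the mapping's prefix from the filesystem path, then append the remainder to the target) can be sketched as below. This is an illustration only, not the project's implementation: mapSubtree is a hypothetical helper, and choosing the most specific mapping among several entries is not shown.

```go
package main

import (
	"fmt"
	"strings"
)

// mapSubtree applies a single "prefix<" mapping to fs. An empty prefix stands
// for the root wildcard "<". It returns the mapped path and whether the
// mapping matched at all.
func mapSubtree(prefix, target, fs string) (string, bool) {
	if prefix == "" {
		// Root wildcard: nothing to trim, so fs is appended to target as a whole.
		return target + "/" + fs, true
	}
	if fs == prefix {
		// The subtree root maps directly onto the target.
		return target, true
	}
	if strings.HasPrefix(fs, prefix+"/") {
		// Children keep their path relative to the subtree root.
		return target + "/" + strings.TrimPrefix(fs, prefix+"/"), true
	}
	return "", false
}

func main() {
	for _, fs := range []string{"zroot", "storage"} {
		mapped, _ := mapSubtree("", "storage/backups/app-srv", fs)
		fmt.Printf("%s => %s\n", fs, mapped)
	}
	mapped, _ := mapSubtree("zroot/var/db", "storage/db_replication/app-srv", "zroot/var/db")
	fmt.Printf("zroot/var/db => %s\n", mapped)
}
```

With the root wildcard, two different source filesystems can no longer land on the same target path, which is the conflict shown in the "Before" example of the commit message.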
Christian Schwarz
b5d46e2ec3
impl: don't reference m.entries again
2017-10-05 18:55:02 +02:00
Christian Schwarz
83d450b1f2
config: support days (d) and weeks (w) in durations
...
fixes #18
2017-10-05 15:17:37 +02:00
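Go's time.ParseDuration has no units larger than hours, so day and week support as in 83d450b1f2 needs a custom parser. The following is a minimal sketch under assumptions: the accepted syntax (a non-negative integer followed by one of s, m, h, d, w) and the names parseDuration, durationRE, and unitLength are hypothetical, a day is taken as exactly 24 hours, and overflow for very large numbers is not handled.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
	"time"
)

// durationRE accepts strings such as "30s", "12h", "7d" or "2w".
var durationRE = regexp.MustCompile(`^(\d+)(s|m|h|d|w)$`)

// unitLength maps each unit suffix to its length.
var unitLength = map[string]time.Duration{
	"s": time.Second,
	"m": time.Minute,
	"h": time.Hour,
	"d": 24 * time.Hour,
	"w": 7 * 24 * time.Hour,
}

// parseDuration converts a string like "2w" into a time.Duration.
func parseDuration(s string) (time.Duration, error) {
	m := durationRE.FindStringSubmatch(s)
	if m == nil {
		return 0, fmt.Errorf("cannot parse duration %q", s)
	}
	n, err := strconv.ParseInt(m[1], 10, 64)
	if err != nil {
		return 0, err
	}
	return time.Duration(n) * unitLength[m[2]], nil
}

func main() {
	for _, s := range []string{"36h", "3d", "2w", "1y"} {
		d, err := parseDuration(s)
		fmt.Println(s, d, err)
	}
}
```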
Christian Schwarz
3e647c14c0
config: source job: rename field 'datasets' to 'filesystems'
...
While filesystems is also not the right term (since it excludes ZVOLs),
we want to stay consistent with comments & terminology used in docs.
BREAK CONFIG
fixes #17
2017-10-05 13:39:05 +02:00
Christian Schwarz
b95260f4b5
config: logging: defaults + definition as list
...
* Stdout logger as default logger
* Clearer keyword / value separation
* Allows multiple outlet definitions
BREAK CONFIG
fixes #20
fixes #19
2017-10-05 13:31:16 +02:00
Christian Schwarz
e6d08149ef
docs: update 'mapping & filter syntax' + more elaborate sampleconf
2017-10-02 18:29:58 +02:00
Christian Schwarz
45670a7e5d
make vet happy: 'don't leak contexts'
2017-09-30 16:39:52 +02:00
Christian Schwarz
aab43af27c
tcp outlet: fix error handling on write failure
...
Also: clarify semantics of RetryInterval
2017-09-30 16:38:48 +02:00
Christian Schwarz
0cbee78b40
fix unreachable code & missing stringer-generated code
2017-09-30 16:31:55 +02:00
Christian Schwarz
03955196a9
cmd: config: build identity map
...
not necessary with one cert but good practice
2017-09-24 16:25:41 +02:00
Christian Schwarz
54b391f77c
tcp outlet: add newline after each entry
...
otherwise tools like graylog don't parse it
2017-09-24 16:24:43 +02:00
Christian Schwarz
c1a5b04065
TLS support for TCP logger
2017-09-24 14:34:50 +02:00
Christian Schwarz
d5df354e64
sampleconf for supported logging
2017-09-24 02:10:29 +02:00
Christian Schwarz
fae34f5927
implement logfmt formatter
2017-09-24 02:09:50 +02:00
Christian Schwarz
c4c38d5b23
add syslog outlet
2017-09-24 02:05:41 +02:00
Christian Schwarz
e0e362c4ff
dump logrus and roll our own logger instead
2017-09-24 00:57:52 +02:00
Christian Schwarz
c31ec8c646
convert more code to structured logging
2017-09-23 17:52:29 +02:00
Christian Schwarz
83edcb3889
experimental TCP hook for logrus
2017-09-23 12:58:13 +02:00
Christian Schwarz
9465b593f9
cmd: configurable logrus formatters
...
We lost the nice context-stack [jobname][taskname][...] at the beginning
of each log line when switching to logrus.
Define some field names that define these contexts.
Write a human-friendly formatter that presents these field names like
the solution we had before logrus.
Write some other formatters for logfmt and json output along the way.
Limit ourselves to stdout logging for now.
2017-09-23 11:24:36 +02:00
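The human-friendly format described in 9465b593f9 (designated context fields rendered as a bracketed prefix, other fields appended after the message) could look roughly like this. It is a sketch that does not depend on logrus; the field names "job" and "task", the formatHuman function, and the key="value" rendering of the remaining fields are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// contextFields lists the field names rendered as the bracketed prefix, in
// this order. The names are assumptions made for this sketch.
var contextFields = []string{"job", "task"}

// formatHuman renders one log entry as "[job][task]: message key=value ...".
func formatHuman(msg string, fields map[string]string) string {
	var b strings.Builder
	isContext := make(map[string]bool, len(contextFields))
	for _, name := range contextFields {
		isContext[name] = true
		if v, ok := fields[name]; ok {
			fmt.Fprintf(&b, "[%s]", v)
		}
	}
	if b.Len() > 0 {
		b.WriteString(": ")
	}
	b.WriteString(msg)

	// Remaining fields are appended in sorted order for stable output.
	rest := make([]string, 0, len(fields))
	for k := range fields {
		if !isContext[k] {
			rest = append(rest, k)
		}
	}
	sort.Strings(rest)
	for _, k := range rest {
		fmt.Fprintf(&b, " %s=%q", k, fields[k])
	}
	return b.String()
}

func main() {
	fmt.Println(formatHuman("start replication", map[string]string{
		"job":  "pull_app",
		"task": "replicate",
		"fs":   "zroot/var/db",
	}))
}
```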
Christian Schwarz
3ff9e6d2f7
structured logging for control job
2017-09-23 11:07:08 +02:00
Christian Schwarz
bfcba7b281
cmd: logging using logrus
2017-09-22 17:01:54 +02:00
Christian Schwarz
a459f0a0f6
go-yaml: direct dependency on github repo
2017-09-22 15:29:54 +02:00
Christian Schwarz
e87ce3f7cf
cmd: no context + logging for config parsing
2017-09-22 14:13:30 +02:00
Christian Schwarz
458c28e1d0
cmd: UNIX sockets: try to autoremove stale sockets
2017-09-18 00:16:28 +02:00
Christian Schwarz
eaed271a00
cmd: config: remove annoying parser logs
2017-09-18 00:16:28 +02:00
Christian Schwarz
3eaba92025
cmd: introduce control socket & subcommand
...
Move pprof debugging there.
2017-09-18 00:16:28 +02:00
Christian Schwarz
aea62a9d85
cmd: extract listening on a UNIX socket in a private directory into a helper func
2017-09-17 23:41:51 +02:00
Christian Schwarz
1a62d635a6
cmd: test: would always run testCmdGlobalInit
2017-09-17 23:40:40 +02:00
Christian Schwarz
9cd83399d3
cmd: remove global state in main.go
...
* refactoring
* Now supporting default config locations
2017-09-17 18:32:00 +02:00
Christian Schwarz
4ac7e78e2b
cmd: config: was using wrong reference to config
2017-09-17 17:45:02 +02:00
Christian Schwarz
71650819d3
cmd: remove stderrFile option
2017-09-17 17:25:24 +02:00
Christian Schwarz
6a05e101cf
WIP daemon:
...
Implement
* pruning on source side
* local job
* test subcommand for doing a dry-run of a prune policy
* use a non-blocking callback from autosnap to trigger the depending
jobs -> avoids races, looks saner in the debug log
2017-09-16 21:13:19 +02:00
Christian Schwarz
b168274048
fixup dmf tests
2017-09-16 20:32:01 +02:00
Christian Schwarz
cd4e09ebb3
cmd: handler: privatise & rename variables
2017-09-16 20:27:08 +02:00
Christian Schwarz
e3ec093d53
cmd: handler: check FilesystemVersionFilter as part of ACL
2017-09-16 20:24:46 +02:00
Christian Schwarz
dc3378e890
cmd: daemon: use closure-local variable when starting job
2017-09-16 20:21:05 +02:00