774 Commits

Author SHA1 Message Date
Christian Schwarz
5e17d7ba80 docs: add recent supporters 2019-11-26 00:45:13 +01:00
Christian Schwarz
0261dbfe3d docs: 0.2.1 changelog 2019-11-20 20:16:41 +01:00
Christian Schwarz
4301f741db dist/systemd: remove @privileged from SystemCallFilter + cleanup comments
fixes #237
2019-11-20 18:44:14 +01:00
Christian Schwarz
7e743c74dc docs + samples: adjust ssh 'Compression' arg in examples 2019-11-20 18:19:16 +01:00
Christian Schwarz
ad0b055245 daemon/prometheus: fix crash if listener cannot be created
refs #238

  zrepl version=v0.2.0-11-gdc39c81 GOOS=linux GOARCH=amd64 Compiler=gc
 starting daemon
 [pull_source]: starting job
 [_prometheus]: starting job
 [connection_loss_tidyup]: starting job
 [connection_loss_tidyup]: wait for wakeups
 [_control]: starting job
 [_prometheus]: cannot listen err="listen tcp 10.0.0.200:9091: bind: cannot assign requested add
 [_prometheus]: job exited
 panic: runtime error: invalid memory address or nil pointer dereference
         panic: runtime error: invalid memory address or nil pointer dereference
 [signal SIGSEGV: segmentation violation code=0x1 addr=0x28 pc=0x81ea4d]
 goroutine 25 [running]:
 net/http.(*onceCloseListener).close(...)
         /usr/local/go/src/net/http/server.go:3330
 sync.(*Once).doSlow(0xc00018b060, 0xc0000c7bc0)
         /usr/local/go/src/sync/once.go:66 +0xe3
 sync.(*Once).Do(...)
         /usr/local/go/src/sync/once.go:57
 net/http.(*onceCloseListener).Close(0xc00018b050, 0xc0003a6000, 0xe0)
         /usr/local/go/src/net/http/server.go:3326 +0x77
 panic(0xb1d1c0, 0x11d7d90)
         /usr/local/go/src/runtime/panic.go:679 +0x1b2
 net/http.(*onceCloseListener).Accept(0xc00018b050, 0xc000120020, 0xb0fd20, 0x11d7ce0, 0xbee6e0)
         <autogenerated>:1 +0x32
 net/http.(*Server).Serve(0xc0003a6000, 0x0, 0x0, 0x0, 0x0)
         /usr/local/go/src/net/http/server.go:2896 +0x286
 net/http.Serve(...)
         /usr/local/go/src/net/http/server.go:2468
 github.com/zrepl/zrepl/daemon.(*prometheusJob).Run(0xc000109940, 0xd19ec0, 0xc00018a930)
         /go/src/github.com/zrepl/zrepl/daemon/prometheus.go:75 +0x23e
 github.com/zrepl/zrepl/daemon.(*jobs).start.func1(0xc0000f68c0, 0xd22c40, 0xc000116ee0, 0xd1ba4
         /go/src/github.com/zrepl/zrepl/daemon/daemon.go:220 +0x121
 created by github.com/zrepl/zrepl/daemon.(*jobs).start
         /go/src/github.com/zrepl/zrepl/daemon/daemon.go:216 +0x52e
 zrepl.service: Main process exited, code=exited, status=2/INVALIDARGUMENT
 zrepl.service: Failed with result 'exit-code'.
2019-11-16 22:11:13 +01:00
Christian Schwarz
27db3e6f70 docs: supporters: update & add viz for different kinds of support 2019-11-16 22:11:07 +01:00
Christian Schwarz
4994b7a9ea rpc/dataconn + build: support GOOS={solaris,illumos} 2019-11-16 22:07:47 +01:00
Christian Schwarz
080f2c0616 build: Makefile: refactor cross-builds + release, add i386 targets 2019-11-16 22:07:47 +01:00
Christian Schwarz
d469cc04b6 transports/ssh: bump go-netssh to improve dial errors
from go-netssh changelog:

    dial: better error handling if ssh command exits with non-zero exit status

    SSHError.Error() relied on go-rwccmd behavior of returning io.EOF if the
    ssh binary exited with status code 0.

    We no longer use go-rwccmd => capture Stderr ourselves using zrepl's
    circlog (depending on zrepl is not pretty, but since this package is supposedly
    only used by zrepl ATM, this is fine)

    refs https://github.com/zrepl/zrepl/issues/237
2019-11-16 22:07:47 +01:00
Christian Schwarz
e0a25d04ac build: Makefile: set GO111MODULE=on for all go commands 2019-11-16 22:07:47 +01:00
Christian Schwarz
b0f2c79944 build: go mods: split build deps into subgomod, bump prometheus to 1.2.1, tweaked go mod tidy
tweaked go mod tidy: see comment in go.mod
2019-11-16 22:07:47 +01:00
Christian Schwarz
d2bc40f78d docs: transports: ssh: better copy-pastable connect section 2019-11-16 22:07:47 +01:00
Christian Schwarz
9e54f11960 dist/systemd: fix ssh-transport: create stdinserver runtime directory
tested to work on Debian Stretch

refs #237
2019-11-16 22:07:38 +01:00
Andy Fiddaman
6eda1f743f Fix typo in tutorial.rst 2019-11-05 09:57:30 -08:00
Andy Fiddaman
6787decef1 Add OmniOS (illumos distribution) to list of OSs 2019-11-05 09:49:57 -08:00
Christian Schwarz
d56d45a2ab docs: install: apt: fix snippet display & link to packaging repo 2019-10-21 16:35:23 +02:00
Christian Schwarz
fcf16a163a docs: install: apt snippet: idempotent, bash compat, multiarch compat
Co-authored-by: Janis Streib <me@janis-streib.de>
Co-authored-by: Christian Schwarz <me@cschwarz.com>
2019-10-21 16:21:51 +02:00
Christian Schwarz
dc39c819a3 docs: add debian + ubuntu installation 2019-10-18 20:18:42 +02:00
Christian Schwarz
1048b09487 build: include config examples and dist in noarch tarball 2019-10-18 20:18:42 +02:00
Richard Poettler
3806e97404 docs: add copr repo for Fedora/CentOS
closes #229
2019-10-16 10:46:02 +02:00
chenhao
c396f9508a zfs: replace hard coded zfs command in ZFSDestroy
fixes #231
2019-10-16 10:22:53 +02:00
Christian Schwarz
b9933f6cb2 platformtest: add zfsGet bookmark handling & replicationCursor tests
This encodes the observation made in issue #230:
In the ZFS version shipped in Ubuntu 16.04,
`zfs get someprop a#bookmark` does not work.
2019-10-14 17:54:14 +02:00
Christian Schwarz
0ba4b5eda6 zfs: helper for ZFSGet guid and createtxg 2019-10-14 17:54:14 +02:00
Christian Schwarz
18d2c350de platformtest: harness: -failure.stop-and-keep-pool mode, prettier logging 2019-10-14 17:54:14 +02:00
Christian Schwarz
f8f9fd11cd platformtest: logging-related refactorings 2019-10-14 17:32:58 +02:00
John Ramsden
b422e6f12e docs: installation: add Arch Linux 'from source' package 2019-10-13 12:33:27 +02:00
Christian Schwarz
f8d5082bdd docs: remove outdated implementation references + remove 0.2-rc* from published docs 2019-10-13 12:26:39 +02:00
Christian Schwarz
ffe677e55a docs: snapshotting: command hook type: not the only hook type anymore 2019-10-13 12:16:31 +02:00
Christian Schwarz
84eefa57bc rpc/grpcclientidentity: remove hard-coded deadline in listener adapter causing crash
Verified once again that grpc.DialContext is indeed non-blocking.
However, it checks in a defer statement whether the passed dial context is already done (ctx.Done()).
Hitting that case is highly unusual if the dial is non-blocking.
But it might still happen, maybe because of a machine suspend during the function call and before the defer statement is executed.

panic:
context deadline exceeded
goroutine 49 [running]:
github.com/zrepl/zrepl/rpc/grpcclientidentity/grpchelper.ClientConn(0x1906ea0, 0xc0003ea1e0, 0x1921620, 0xc0002da660, 0x0)
        /gopath/src/github.com/zrepl/zrepl/rpc/grpcclientidentity/grpchelper/authlistener_grpc_adaptor_wrapper.go:49 +0x38c
github.com/zrepl/zrepl/rpc.NewClient(0x1906f00, 0xc0002d60f0, 0x1921620, 0xc0002da640, 0x1921620, 0xc0002da660, 0x1921620, 0xc0002da6a0, 0x1921620)
        /gopath/src/github.com/zrepl/zrepl/rpc/rpc_client.go:53 +0x199
github.com/zrepl/zrepl/daemon/job.(*modePush).ConnectEndpoints(0xc0000d1e90, 0x1921620, 0xc0002da640, 0x1921620, 0xc0002da660, 0x1921620, 0xc0002da6a0, 0x1906f00, 0xc0002d60f0)
        /gopath/src/github.com/zrepl/zrepl/daemon/job/active.go:105 +0x15d
github.com/zrepl/zrepl/daemon/job.(*ActiveSide).do(0xc0000d6120, 0x1918720, 0xc00020f170)
        /gopath/src/github.com/zrepl/zrepl/daemon/job/active.go:356 +0x236
github.com/zrepl/zrepl/daemon/job.(*ActiveSide).Run(0xc0000d6120, 0x1918720, 0xc00009c660)
        /gopath/src/github.com/zrepl/zrepl/daemon/job/active.go:347 +0x289
github.com/zrepl/zrepl/daemon.(*jobs).start.func1(0xc0000fc880, 0x1921620, 0xc0002da120, 0x191a320, 0xc0000d6120, 0x1918720, 0xc0002d6a80)
2019-10-10 14:02:12 +02:00
Juergen Hoetzel
ad77371e38 docs: include Arch Linux installation 2019-10-06 20:38:00 +02:00
Juergen Hoetzel
c524acb2df Fix invalid comment syntax 2019-10-06 16:23:20 +02:00
Christian Schwarz
3edfe535c6 docs: fix typo on index page 2019-10-05 14:59:51 +02:00
Juergen Hoetzel
d3b99e8e39 Fix typo 2019-10-05 14:58:49 +02:00
Christian Schwarz
3c03f21419 docs: SEPA hint, supporters, fix publish script 2019-10-03 11:57:19 +02:00
Christian Schwarz
5c95c21727 transport/local: configurable dial_timeout for connect, default 2s 2019-09-29 19:05:54 +02:00
Christian Schwarz
a6b578b648 rpc/dataconn/stream: Conn: handle concurrent Close calls + goroutine leak fix
* Add Close() in closeState to identify the first closer
* Non-first closers get an error
* Reads and Writes from the Conn get an error if the conn was closed
  while the Read / Write was running
* The first closer starts a _separate_ goroutine draining the c.frameReads channel
* The first closer then waits for the goroutine that fills c.frameReads
  to exit

refs 3bfe0c16d0
fixes #174

readFrames would block on `reads <-`
   but only after that would stream.Conn.readFrames close c.waitReadFramesDone
   which was too late because stream.Conn.Close would wait for c.waitReadFramesDone to be closed before draining the channel
                              ^^^^^^ (not frameconn.Conn, that closed successfully)

   195 @ 0x1032ae0 0x1006cab 0x1006c81 0x1006a65 0x15505be 0x155163e 0x1060bc1
           0x15505bd       github.com/zrepl/zrepl/rpc/dataconn/stream.readFrames+0x16d             github.com/zrepl/zrepl/rpc/dataconn/stream/stream.go:220
           0x155163d       github.com/zrepl/zrepl/rpc/dataconn/stream.(*Conn).readFrames+0xbd      github.com/zrepl/zrepl/rpc/dataconn/stream/stream_conn.go:71

   195 @ 0x1032ae0 0x10078c8 0x100789e 0x100758b 0x1552678 0x1557a4b 0x1556aec 0x1060bc1
           0x1552677       github.com/zrepl/zrepl/rpc/dataconn/stream.(*Conn).Close+0x77           github.com/zrepl/zrepl/rpc/dataconn/stream/stream_conn.go:191
           0x1557a4a       github.com/zrepl/zrepl/rpc/dataconn.(*Server).serveConn.func1+0x5a      github.com/zrepl/zrepl/rpc/dataconn/dataconn_server.go:93
           0x1556aeb       github.com/zrepl/zrepl/rpc/dataconn.(*Server).serveConn+0x87b           github.com/zrepl/zrepl/rpc/dataconn/dataconn_server.go:176
2019-09-29 19:05:54 +02:00
Christian Schwarz
8af824df41 docs: promote monetary support in changelog 2019-09-29 19:04:53 +02:00
Christian Schwarz
58ab25919e platformtest: dedicated pool per test, Makefile target, maintainer notice
fixes #216
fixes #211
2019-09-29 18:48:44 +02:00
Christian Schwarz
215848f476 docs: 0.2 changelog 2019-09-28 17:50:07 +02:00
Christian Schwarz
f9c7766073 replication/logic: fix race when reading byte counter pointer for report
fixes #214
2019-09-28 16:16:19 +02:00
Christian Schwarz
f976212ec9 config: validate presence of port in addresses
fixes #213
2019-09-28 14:25:14 +02:00
Christian Schwarz
8c88e168c1 rpc/dataconn/client: ReqRecv to log level Debug
reported by @avg-l
2019-09-28 11:49:20 +02:00
Christian Schwarz
a78c854404 rpc/dataconn/frameconn: mask ECONNRESET error on Close()
fixes #190
2019-09-28 11:49:20 +02:00
Christian Schwarz
8a5af2f80e build/circleci: apt update before installing
hope this fixes the spurious apt install failures
2019-09-27 21:31:05 +02:00
Christian Schwarz
f7aa26d418 zrepl status: follow up c4be60c: import screen terminfo
This is $TERM on FreeBSD and FreeNAS.

fixes #204
ref https://github.com/gdamore/tcell/issues/252
2019-09-27 21:31:05 +02:00
Christian Schwarz
b5ff1a9926 snapper + client/status: snapshotting reports 2019-09-27 21:31:00 +02:00
Ross Williams
729c83ee72 pre- and post-snapshot hooks
* stack-based execution model, documented in documentation
* circbuf for capturing hook output
* built-in hooks for postgres and mysql
* refactor docs, too much info on the jobs page, too difficult
  to discover snapshotting & hooks

Co-authored-by: Ross Williams <ross@ross-williams.net>
Co-authored-by: Christian Schwarz <me@cschwarz.com>

fixes #74
2019-09-27 21:25:59 +02:00
Ross Williams
00434f4ac9 daemon/logging: format human: treat 'subsystem' field prefixed
logging.Subsystem != string
=> typecast failed

(Also, nobody guarantees that e.Fields contains `field`)
2019-09-27 20:39:43 +02:00
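The two observations in the commit above (a named string type does not satisfy a `.(string)` assertion, and a field may be absent from the map) can be illustrated with a short sketch. The `Subsystem` type here is a local stand-in for `logging.Subsystem`, and `fieldString` is a hypothetical helper, not the function the commit changed.

```go
package main

import "fmt"

// Subsystem is a local stand-in for a named string type such as
// logging.Subsystem.
type Subsystem string

// fieldString is a hypothetical helper that looks up key and converts
// the value to a string without panicking on a missing key or an
// unexpected dynamic type.
func fieldString(fields map[string]interface{}, key string) (string, bool) {
	v, ok := fields[key]
	if !ok {
		return "", false // nobody guarantees that the field exists
	}
	switch s := v.(type) {
	case string:
		return s, true
	case Subsystem:
		return string(s), true
	default:
		return "", false
	}
}

func main() {
	fields := map[string]interface{}{"subsystem": Subsystem("repl")}

	// A plain string assertion does not match the named type.
	_, ok := fields["subsystem"].(string)
	fmt.Println(ok) // false

	s, ok := fieldString(fields, "subsystem")
	fmt.Println(s, ok) // repl true

	_, ok = fieldString(fields, "missing")
	fmt.Println(ok) // false
}
```

Using the two-value form of the assertion, or a type switch as above, keeps a mismatch from turning into a runtime panic.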
Christian Schwarz
2cd9173bfb transport/local: hard fail, more aggressive connect timeout 2019-09-14 13:43:46 +02:00
Christian Schwarz
7ba3ae077f client/status: job filter flag 2019-09-14 13:43:46 +02:00