We had too many spurious test failures in the past.
But on a developer machine, the tests don't usually fail because the
system isn't loaded as much.
So, only disable test on CircleCI.
From:
github.com/golang/protobuf v1.3.2
google.golang.org/grpc v1.17.0
To:
github.com/golang/protobuf v1.4.3
google.golang.org/grpc v1.35.0
google.golang.org/protobuf v1.25.0
About the two protobuf packages:
https://developers.google.com/protocol-buffers/docs/reference/go/faq
> Version v1.4.0 and higher of github.com/golang/protobuf wrap the new
implementation and permit programs to adopt the new API incrementally. For
example, the well-known types defined in github.com/golang/protobuf/ptypes are
simply aliases of those defined in the newer module. Thus,
google.golang.org/protobuf/types/known/emptypb and
github.com/golang/protobuf/ptypes/empty may be used interchangeably.
Notable Code Changes in zrepl:
- generate protobufs now contain a mutex so we can't copy them by value
anymore
- grpc.WithDialer is deprecated => use grpc.WithContextDialer instead
Go1.12 is now actually required by some of the dependencies.
GO111MODULE=on golangci-lint run ./...
endpoint/endpoint.go:487:9: S1039: unnecessary use of fmt.Sprintf (gosimple)
panic(fmt.Sprintf("ClientIdentityKey context value must be set"))
^
platformtest/platformtest_ops.go:259:41: S1039: unnecessary use of fmt.Sprintf (gosimple)
return nil, &LineError{scan.Text(), fmt.Sprintf("unexpected tokens at EOL")}
^
platformtest/platformtest_ops.go:266:41: S1039: unnecessary use of fmt.Sprintf (gosimple)
return nil, &LineError{scan.Text(), fmt.Sprintf("unexpected tokens at EOL")}
^
util/optionaldeadline/optionaldeadline_test.go:97:50: SA1029: should not use built-in type string as key for value; define your own type to avoid collisions (staticcheck)
pctx := context.WithValue(context.Background(), "key", "value")
^
rpc/rpc_debug.go:8:5: var `debugEnabled` is unused (unused)
rpc/dataconn/dataconn_debug.go:8:5: var `debugEnabled` is unused (unused)
rpc/dataconn/frameconn/frameconn.go:42:9: S1039: unnecessary use of fmt.Sprintf (gosimple)
panic(fmt.Sprintf("frame header is 8 bytes long"))
^
platformtest/platformtest_ops.go:322:40: S1039: unnecessary use of fmt.Sprintf (gosimple)
return nil, &LineError{scan.Text(), fmt.Sprintf("unexpected tokens at EOL")}
- `timeoutconn` handles state, yet calls to Read/Write make a copy of
that state (non-pointer receiver) so any outbound calls will not have
the state updated
- Even without the copy issue, the renew methods can in edge cases set a
new deadline _after_ DisableTimeouts have been called, consider the
following racy behavior:
1. `renewReadDeadline` is called, checks `renewDeadlinesDisabled`
(not disabled)
2. `DisableTimeouts` is called, sets `renewDeadlinesDisabled`
3. `DisableTimeouts` invokes `c.SetDeadline`
4. `renewReadDeadline` invokes `c.SetReadDeadline`
To fix the above, the `Conn` receiver was made to be a pointer
everywhere and access to renewDeadlinesDisabled is now guarded
by an RWMutex instead of using atomics.
closes#415
* The SendStream.Close() was not called by dataconn.Server, which left the zfs send process dangling.
* When the source job's ctx interceptor closed the task, the dangling zfs send was detect by the trace package and panicked.
020-07-25T19:54:41-04:00 [ERRO][latitude][rpc.data][cyZj$J3Ca$J3Ca.CJwB]: cannot write send stream err="frameconn: shutting down"
panic: end task: 1 active child tasks: end task: task still has active child tasks
goroutine 196966 [running]:
github.com/zrepl/zrepl/daemon/logging/trace.WithTask.func1.1(0xc000320680, 0xcde000)
/home/jeremy/go/src/github.com/zrepl/zrepl/daemon/logging/trace/trace.go:221 +0x2f7
github.com/zrepl/zrepl/daemon/logging/trace.WithTask.func1()
/home/jeremy/go/src/github.com/zrepl/zrepl/daemon/logging/trace/trace.go:237 +0x38
github.com/zrepl/zrepl/daemon/logging/trace.WithTaskAndSpan.func1()
/home/jeremy/go/src/github.com/zrepl/zrepl/daemon/logging/trace/trace_convenience.go:41 +0x37
github.com/zrepl/zrepl/daemon/job.(*PassiveSide).Run.func1(0xdcf780, 0xc0000a3560, 0xdc65a0, 0xc00035e620, 0xc0000a34d0)
/home/jeremy/go/src/github.com/zrepl/zrepl/daemon/job/passive.go:194 +0x2e7
github.com/zrepl/zrepl/rpc.NewServer.func3(0xdcf780, 0xc0001ce4b0, 0xdc65e0, 0xc00035e600, 0xc0000a34d0)
/home/jeremy/go/src/github.com/zrepl/zrepl/rpc/rpc_server.go:82 +0xd5
github.com/zrepl/zrepl/rpc/dataconn.(*Server).serveConn(0xc0000a2ba0, 0xc00018eca0)
/home/jeremy/go/src/github.com/zrepl/zrepl/rpc/dataconn/dataconn_server.go:149 +0x3be
github.com/zrepl/zrepl/rpc/dataconn.(*Server).Serve.func3(0xc0000b8180, 0xc0000a2ba0, 0xc00018eca0)
/home/jeremy/go/src/github.com/zrepl/zrepl/rpc/dataconn/dataconn_server.go:108 +0x5d
created by github.com/zrepl/zrepl/rpc/dataconn.(*Server).Serve
/home/jeremy/go/src/github.com/zrepl/zrepl/rpc/dataconn/dataconn_server.go:106 +0x24a
2020-07-25T19:58:55-04:00 [ERRO][latitude][rpc.data][Pt4F$gCWT$gCWT.fzhc]: cannot write send stream err="frameconn: shutting down"
panic: end task: 1 active child tasks: end task: task still has active child tasks
fixes#348
package trace:
- introduce the concept of tasks and spans, tracked as linked list within ctx
- see package-level docs for an overview of the concepts
- **main feature 1**: unique stack of task and span IDs
- makes it easy to follow a series of log entries in concurrent code
- **main feature 2**: ability to produce a chrome://tracing-compatible trace file
- either via an env variable or a `zrepl pprof` subcommand
- this is not a CPU profile, we already have go pprof for that
- but it is very useful to visually inspect where the
replication / snapshotter / pruner spends its time
( fixes#307 )
usage in package daemon/logging:
- goal: every log entry should have a trace field with the ID stack from package trace
- make `logging.GetLogger(ctx, Subsys)` the authoritative `logger.Logger` factory function
- the context carries a linked list of injected fields which
`logging.GetLogger` adds to the logger it returns
- `logging.GetLogger` also uses package `trace` to get the
task-and-span-stack and injects it into the returned logger's fields
- prior to this patch, context cancellation would leave rpc.Server open
- did not make problems because context was only cancelled by SIGINT,
which was immediately followed by os.Exit
* Add Close() in closeState to identify the first closer
* Non-first closers get an error
* Reads and Writes from the Conn get an error if the conn was closed
during the Read / Write was running
* The first closer starts _separate_ goroutine draining the c.frameReads channel
* The first closer then waits for the goroutine that fills c.frameReads
to exit
refs 3bfe0c16d0fixes#174
readFrames would block on `reads <-`
but only after that would stream.Conn.readFrames close c.waitReadFramesDone
which was too late because stream.Conn.Close would wait for c.waitReadFramesDone to be closed before draining the channel
^^^^^^ (not frameconn.Conn, that closed successfully)
195 @ 0x1032ae0 0x1006cab 0x1006c81 0x1006a65 0x15505be 0x155163e 0x1060bc1
0x15505bd github.com/zrepl/zrepl/rpc/dataconn/stream.readFrames+0x16d github.com/zrepl/zrepl/rpc/dataconn/stream/stream.go:220
0x155163d github.com/zrepl/zrepl/rpc/dataconn/stream.(*Conn).readFrames+0xbd github.com/zrepl/zrepl/rpc/dataconn/stream/stream_conn.go:71
195 @ 0x1032ae0 0x10078c8 0x100789e 0x100758b 0x1552678 0x1557a4b 0x1556aec 0x1060bc1
0x1552677 github.com/zrepl/zrepl/rpc/dataconn/stream.(*Conn).Close+0x77 github.com/zrepl/zrepl/rpc/dataconn/stream/stream_conn.go:191
0x1557a4a github.com/zrepl/zrepl/rpc/dataconn.(*Server).serveConn.func1+0x5a github.com/zrepl/zrepl/rpc/dataconn/dataconn_server.go:93
0x1556aeb github.com/zrepl/zrepl/rpc/dataconn.(*Server).serveConn+0x87b github.com/zrepl/zrepl/rpc/dataconn/dataconn_server.go:176