* The SendStream.Close() was not called by dataconn.Server, which left the zfs send process dangling.
* When the source job's ctx interceptor closed the task, the dangling zfs send was detect by the trace package and panicked.
020-07-25T19:54:41-04:00 [ERRO][latitude][rpc.data][cyZj$J3Ca$J3Ca.CJwB]: cannot write send stream err="frameconn: shutting down"
panic: end task: 1 active child tasks: end task: task still has active child tasks
goroutine 196966 [running]:
github.com/zrepl/zrepl/daemon/logging/trace.WithTask.func1.1(0xc000320680, 0xcde000)
/home/jeremy/go/src/github.com/zrepl/zrepl/daemon/logging/trace/trace.go:221 +0x2f7
github.com/zrepl/zrepl/daemon/logging/trace.WithTask.func1()
/home/jeremy/go/src/github.com/zrepl/zrepl/daemon/logging/trace/trace.go:237 +0x38
github.com/zrepl/zrepl/daemon/logging/trace.WithTaskAndSpan.func1()
/home/jeremy/go/src/github.com/zrepl/zrepl/daemon/logging/trace/trace_convenience.go:41 +0x37
github.com/zrepl/zrepl/daemon/job.(*PassiveSide).Run.func1(0xdcf780, 0xc0000a3560, 0xdc65a0, 0xc00035e620, 0xc0000a34d0)
/home/jeremy/go/src/github.com/zrepl/zrepl/daemon/job/passive.go:194 +0x2e7
github.com/zrepl/zrepl/rpc.NewServer.func3(0xdcf780, 0xc0001ce4b0, 0xdc65e0, 0xc00035e600, 0xc0000a34d0)
/home/jeremy/go/src/github.com/zrepl/zrepl/rpc/rpc_server.go:82 +0xd5
github.com/zrepl/zrepl/rpc/dataconn.(*Server).serveConn(0xc0000a2ba0, 0xc00018eca0)
/home/jeremy/go/src/github.com/zrepl/zrepl/rpc/dataconn/dataconn_server.go:149 +0x3be
github.com/zrepl/zrepl/rpc/dataconn.(*Server).Serve.func3(0xc0000b8180, 0xc0000a2ba0, 0xc00018eca0)
/home/jeremy/go/src/github.com/zrepl/zrepl/rpc/dataconn/dataconn_server.go:108 +0x5d
created by github.com/zrepl/zrepl/rpc/dataconn.(*Server).Serve
/home/jeremy/go/src/github.com/zrepl/zrepl/rpc/dataconn/dataconn_server.go:106 +0x24a
2020-07-25T19:58:55-04:00 [ERRO][latitude][rpc.data][Pt4F$gCWT$gCWT.fzhc]: cannot write send stream err="frameconn: shutting down"
panic: end task: 1 active child tasks: end task: task still has active child tasks
fixes#348
package trace:
- introduce the concept of tasks and spans, tracked as linked list within ctx
- see package-level docs for an overview of the concepts
- **main feature 1**: unique stack of task and span IDs
- makes it easy to follow a series of log entries in concurrent code
- **main feature 2**: ability to produce a chrome://tracing-compatible trace file
- either via an env variable or a `zrepl pprof` subcommand
- this is not a CPU profile, we already have go pprof for that
- but it is very useful to visually inspect where the
replication / snapshotter / pruner spends its time
( fixes#307 )
usage in package daemon/logging:
- goal: every log entry should have a trace field with the ID stack from package trace
- make `logging.GetLogger(ctx, Subsys)` the authoritative `logger.Logger` factory function
- the context carries a linked list of injected fields which
`logging.GetLogger` adds to the logger it returns
- `logging.GetLogger` also uses package `trace` to get the
task-and-span-stack and injects it into the returned logger's fields
- prior to this patch, context cancellation would leave rpc.Server open
- did not make problems because context was only cancelled by SIGINT,
which was immediately followed by os.Exit