One-stop ZFS backup & replication solution
Goran Mekic bc5e1ede04
metric to detect filesystems rules that don't match any local dataset (#653)
This PR adds a Prometheus counter called
`zrepl_zfs_list_unmatched_user_specified_dataset_count`.
Monitor for increases of the counter to detect filesystem filter rules that
have no effect because they don't match any local filesystem.

An example use case for this is the following story:
1. Someone sets up zrepl with `filesystems` filter for `zroot/pg14<`.
2. During the upgrade to Postgres 15, they rename the dataset to `zroot/pg15`,
   but forget to update the zrepl `filesystems` filter.
3. zrepl will not snapshot / replicate the `zroot/pg15<` datasets.

Since `filesystems` rules are always evaluated on the side that has the datasets,
we can smuggle this functionality into the `zfs` module's `ZFSList` function that
is used by all jobs with a `filesystems` filter.
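
The idea behind the counter can be sketched in a few lines of Go. This is only an illustration under simplifying assumptions, not the code added to `ZFSList`: the helper names `ruleMatches` and `unmatchedRules` are hypothetical, and the matching logic only handles exact names and the `<` subtree suffix.

```go
package main

import (
	"fmt"
	"strings"
)

// ruleMatches reports whether a filter rule matches a dataset name.
// Simplifying assumption: a trailing "<" selects the named dataset and all
// of its descendants; any other rule must equal the dataset name exactly.
func ruleMatches(rule, dataset string) bool {
	if strings.HasSuffix(rule, "<") {
		root := strings.TrimSuffix(rule, "<")
		if root == "" {
			return true // a bare "<" selects every dataset
		}
		return dataset == root || strings.HasPrefix(dataset, root+"/")
	}
	return rule == dataset
}

// unmatchedRules returns the rules that match none of the local datasets,
// in the order in which the rules were given.
func unmatchedRules(rules, datasets []string) []string {
	var unmatched []string
	for _, rule := range rules {
		matched := false
		for _, ds := range datasets {
			if ruleMatches(rule, ds) {
				matched = true
				break
			}
		}
		if !matched {
			unmatched = append(unmatched, rule)
		}
	}
	return unmatched
}

func main() {
	// The stale pg14 rule from the story above, next to a rule that still matches.
	rules := []string{"zroot/pg14<", "zroot/home<"}
	datasets := []string{"zroot", "zroot/home", "zroot/home/alice", "zroot/pg15", "zroot/pg15/data"}
	fmt.Println(unmatchedRules(rules, datasets)) // prints [zroot/pg14<]
}
```

In this sketch, each entry returned by `unmatchedRules` corresponds to one increment of the counter.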

Dashboard changes:
- histogram with increase in $__interval, one row per job
- table with increase in $__range
- explainer text box, so people know what the previous two are about
We had to re-arrange some panels, hence the Git diff isn't great.

closes https://github.com/zrepl/zrepl/pull/653

Co-authored-by: Christian Schwarz <me@cschwarz.com>
Co-authored-by: Goran Mekić <meka@tilda.center>
2023-05-02 22:13:52 +02:00
.circleci circleci: use orb circlci/go for module caching 2023-02-26 13:08:05 +01:00
.github docs: GitHub Sponsors link 2020-06-14 15:26:05 +02:00
build build: use go 1.19 for testing & release builds 2022-10-27 00:19:06 +02:00
cli Reformat all files with make format. 2020-08-31 23:57:45 +02:00
client zrepl status: hide progress bar once all filesystems reach terminal state (#674) 2023-05-02 19:28:56 +02:00
config remove unused JobDebugSettings along with docs 2022-12-22 18:13:45 +01:00
daemon metric to detect filesystems rules that don't match any local dataset (#653) 2023-05-02 22:13:52 +02:00
dist metric to detect filesystems rules that don't match any local dataset (#653) 2023-05-02 22:13:52 +02:00
docs docs: address setup with two or more external disks (#695) 2023-05-02 18:57:26 +02:00
endpoint metric to detect filesystems rules that don't match any local dataset (#653) 2023-05-02 22:13:52 +02:00
logger logger: fix go-1.15-discovered conversion from int to string 2020-08-12 21:44:02 +02:00
packaging fix make deb-docker for all platforms but amd64 2021-09-13 22:54:21 +02:00
platformtest platformtest: fix logmockzfs wrapper script / make test-platform for Go 1.19 2023-02-26 13:08:05 +01:00
pruning pruning/keep_last_n: correctly handle the case where count > matching snaps 2021-03-25 22:42:01 +01:00
replication zrepl status: hide progress bar once all filesystems reach terminal state (#674) 2023-05-02 19:28:56 +02:00
rpc run go1.19 gofmt and make adjustments as needed 2022-10-24 22:22:41 +02:00
tlsconf build: go1.14 + address tlsconf deprecation notice 2020-03-27 12:40:57 +01:00
transport Add --skip-cert-check flag to zrepl configcheck to prevent checking cert files 2022-07-08 20:18:41 +02:00
util snapper: fix delayed snapshots caused by system suspend/resume 2022-10-27 00:19:06 +02:00
version prometheus: convert zrepl_version_daemon to zrepl_start_time metric 2022-01-20 19:33:18 +01:00
zfs metric to detect filesystems rules that don't match any local dataset (#653) 2023-05-02 22:13:52 +02:00
.gitignore Rudimentary Makefile specifying requirements for a release 2017-09-30 16:40:39 +02:00
.gitmodules docs: move hugo docs to old directory 2017-11-11 23:25:12 +01:00
.golangci.yml lint: allow empty else branches 2022-09-25 17:10:53 +02:00
build.Dockerfile build: parametrized pipeline (no more approval for release build), embed exact go version info in artifacts 2020-09-06 15:39:34 +02:00
build.installprotoc.bash build: Linux arm64 support 2019-06-23 15:25:26 +02:00
debian build: rpm + deb targets, build-in-docker targets, CircleCI pipeline rewrite 2020-09-02 21:34:52 +02:00
go.mod refactor snapper & support cron-based snapshotting 2022-09-25 19:23:44 +02:00
go.sum refactor snapper & support cron-based snapshotting 2022-09-25 19:23:44 +02:00
lazy.sh Update to protobuf v1.25 and grpc 1.35; bump CI to go1.12 2021-01-25 00:39:01 +01:00
LICENSE LICENSE + docs: adjust copyright 2018-10-13 17:34:05 +02:00
main.go implement new 'zrepl status' 2021-03-14 18:24:25 +01:00
Makefile build: fix deb-docker performance on newer Docker 2022-10-27 00:47:12 +02:00
README.md docs: badges & links to Matrix chat room 2022-01-09 12:05:19 +01:00


zrepl

zrepl is a one-stop ZFS backup & replication solution.

User Documentation

User Documentation can be found at zrepl.github.io.

Bug Reports

  1. If the issue is reproducible, enable debug logging, reproduce and capture the log.
  2. Open an issue on GitHub, with logs pasted as GitHub gists / inline.

Feature Requests

  1. Does your feature request require default values / some kind of configuration? If so, think of an expressive configuration example.
  2. Think of at least one use case that generalizes from your concrete application.
  3. Open an issue on GitHub with example conf & use case attached.
  4. Optional: Post a bounty on the issue, or contact Christian Schwarz for contract work.

The above does not apply if you have already implemented everything. Check out the Contributing Code section below for details.

Building, Releasing, Downstream-Packaging

This section provides an overview of the zrepl build & release process. Check out docs/installation/compile-from-source.rst for build-from-source instructions.

Overview

zrepl is written in Go and uses Go modules to manage dependencies. The documentation is written in reStructuredText using the Sphinx framework.

Install build dependencies using ./lazy.sh devsetup. lazy.sh uses python3-pip to fetch the build dependencies for the docs - you might want to use a venv. If you just want to install the Go dependencies, run ./lazy.sh godep.

The test suite is split into pure Go tests (make test-go) and platform tests that interact with ZFS and thus generally require root privileges (sudo make test-platform). Platform tests run on their own pool with the name zreplplatformtest, which is created using the file vdev in /tmp.

For a full code coverage profile, run make test-go COVER=1 && sudo make test-platform && make cover-merge. An HTML report can be generated using make cover-html.

Code generation is triggered by make generate. Generated code is committed to the source tree.

Build & Release Process

The Makefile caters to the needs of developers & CI, not distro packagers. It provides phony targets for

  • local development (building, running tests, etc)
  • building a release in Docker (used by the CI & release management)
  • building .deb and .rpm packages out of the release artifacts.

Build tooling & dependencies are documented as code in lazy.sh. Go dependencies are then fetched by the go command and pip dependencies are pinned through a requirements.txt.

We use CircleCI for continuous integration. There are two workflows:

  • ci runs for every commit / branch / tag pushed to GitHub. It is supposed to run very fast (<5min) and provide quick feedback to developers. It runs formatting checks, lints and tests on the most important OSes / architectures. Artifacts are published to minio.cschwarz.com (see GitHub Commit Status).

  • release runs

    • on manual triggers through the CircleCI API (in order to produce a release)
    • periodically on master

    Artifacts are published to minio.cschwarz.com (see GitHub Commit Status).

Releases are issued via Git tags + GitHub Releases feature. The procedure to issue a release is as follows:

  • Issue the source release:
    • Git tag the release on the master branch.
    • Push the tag.
    • Run ./docs/publish.sh to re-build & push zrepl.github.io.
  • Issue the official binary release:
    • Run the release pipeline (triggered via CircleCI API)
    • Download the artifacts to the release manager's machine.
    • Create a GitHub release, edit the changelog, upload all the release artifacts, including .rpm and .deb files.
    • Issue the GitHub release.
    • Add the .rpm and .deb files to the official zrepl repos, publish those.

Official binary releases are not re-built when Go receives an update. If the Go update is critical to zrepl (e.g. a Go security update that affects zrepl), we'd issue a new source release. The rationale is that distros provide a mechanism for re-builds ($zrepl_source_release-$distro_package_revision), whereas GitHub Releases doesn't. We would have to update the existing GitHub release's assets, which nobody would notice (no RSS feed updates, etc.). Downstream packagers can read the changelog to determine whether they want to push that minor release into their distro or simply skip it.

Additional Notes to Distro Package Maintainers

  • Run the platform tests (Docs -> Usage -> Platform Tests) on a test system to validate that zrepl's abstractions on top of ZFS work with the system ZFS.
  • Ship a default config that adheres to your distro's hier and logging system.
  • Ship a service manager file and please try to upstream it to this repository.
    • dist/systemd contains a Systemd unit template.
  • Ship other material provided in ./dist, e.g. in /usr/share/zrepl/.
  • Have a look at the Makefile's ZREPL_VERSION variable and how it is passed to Go's ldFlags. This is how zrepl version knows what version number to show. Your build system should set the ldFlags appropriately and add a prefix or suffix that indicates that the given zrepl binary is a distro build, not an official one.
  • Make sure you are informed about new zrepl versions, e.g. by subscribing to GitHub's release RSS feed.
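
The ldFlags mechanism mentioned above can be illustrated with a minimal, self-contained Go sketch. The variable name `version`, the `main` package path, the helper `distroVersion` and the version strings are assumptions made for this example; check the Makefile for the variable and package path that zrepl actually uses.

```go
package main

import "fmt"

// version is a hypothetical package-level variable for this example.
// The Go linker can overwrite string variables at build time via -X:
//
//	go build -ldflags "-X main.version=0.6.0-debian1" .
//
// Without that flag, the default below is what the binary reports.
var version = "unknown"

// distroVersion joins an upstream source release and a distro package
// revision, following the $zrepl_source_release-$distro_package_revision
// scheme. An empty revision yields the plain source release.
func distroVersion(sourceRelease, packageRevision string) string {
	if packageRevision == "" {
		return sourceRelease
	}
	return sourceRelease + "-" + packageRevision
}

func main() {
	fmt.Println("built-in version:", version)
	// Illustrative values only; they do not refer to an actual release.
	fmt.Println("value a packager might pass:", distroVersion("0.6.0", "debian1"))
}
```

A distro suffix such as the one produced by `distroVersion` makes it easy to tell from a bug report whether a binary is an official build or a distro build.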

Contributing Code

  • Open an issue when starting to hack on a new feature
  • Commits should reference the issue they are related to
  • Docs improvements not documenting new features do not require an issue.

Breaking Changes

Backward-incompatible changes must be documented in the git commit message and are listed in docs/changelog.rst.

Glossary & Naming Inconsistencies

In ZFS, the term dataset covers filesystems, ZVOLs and snapshots.
However, we need a word that covers filesystems and ZVOLs but excludes snapshots, bookmarks, etc.

Toward the user, the following terminology is used:

  • filesystem: a ZFS filesystem or a ZVOL
  • filesystem version: a ZFS snapshot or a bookmark

Sadly, the zrepl implementation is inconsistent in its use of these words: variables and types are often named dataset when they in fact refer to a filesystem.

There will not be a big refactoring (an attempt was made, but it's destroying too much history without much gain).

However, new contributions & patches should fix naming without further notice in the commit message.
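
As a rough illustration of the user-facing terminology, the following Go sketch classifies ZFS object names. The type `Kind` and the function `classify` are hypothetical names for this example and do not exist in zrepl under these names. The sketch relies only on the ZFS naming convention that snapshot names contain `@` and bookmark names contain `#`; a filesystem and a ZVOL cannot be told apart by name alone, which fits the fact that both are called filesystem here.

```go
package main

import (
	"fmt"
	"strings"
)

// Kind is the user-facing term for a ZFS object (hypothetical type).
type Kind string

const (
	Filesystem        Kind = "filesystem"         // a ZFS filesystem or a ZVOL
	FilesystemVersion Kind = "filesystem version" // a ZFS snapshot or a bookmark
)

// classify maps a ZFS object name to the user-facing term.
// Snapshots are named "fs@snap" and bookmarks "fs#bookmark";
// every other name is treated as a filesystem (or ZVOL).
func classify(name string) Kind {
	if strings.ContainsAny(name, "@#") {
		return FilesystemVersion
	}
	return Filesystem
}

func main() {
	// The snapshot and bookmark names are made up for illustration.
	for _, name := range []string{"zroot/pg15", "zroot/pg15@before_upgrade", "zroot/pg15#cursor"} {
		fmt.Printf("%s: %s\n", name, classify(name))
	}
}
```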