1118 Commits

Author SHA1 Message Date
Christian Schwarz
b8f55a97ba build: circleci: use large class for release-build job 2024-09-08 23:19:45 +00:00
Christian Schwarz
82adb2b9f5 build: circleci: remove obsolete script
The binary packaging workflow has long since
been moved to this repo (I don't think the external
workflow work ever completed).
2024-09-08 23:19:45 +00:00
Christian Schwarz
7b6adab6b1 build: circleci: only archive artifacts/release 2024-09-08 23:19:45 +00:00
Christian Schwarz
e390aa0c5a build: circleci: update VM image used for release builds
Doesn't matter much because everything happens inside Docker.
2024-09-08 23:19:45 +00:00
Christian Schwarz
5a8f0b9a24 build: make release: check toolchain GOVERSION matches expectations (and refactor/extend Makefile a bit) 2024-09-08 23:19:45 +00:00
Christian Schwarz
3cb1865909 chore: trace spans: use crypto/rand for generating them
math/rand.Read is deprecated in newer Go versions.

Also, it appears that crypto/rand is faster when used from multiple
goroutines: https://gist.github.com/problame/0699acd6f99db4163f26f0b8a61569f3
2024-09-08 23:19:45 +00:00
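The switch described in 3cb1865909 can be sketched roughly as follows. This is a minimal illustration, not the code from the commit: the function name `newSpanID` and the 8-byte ID length are assumptions made for the example.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
)

// newSpanID returns a random identifier rendered as 16 hex characters.
// The name and the 8-byte length are hypothetical, chosen for illustration.
func newSpanID() (string, error) {
	var b [8]byte
	// crypto/rand.Read fills the slice with cryptographically secure random
	// bytes and is safe for concurrent use from multiple goroutines.
	if _, err := rand.Read(b[:]); err != nil {
		return "", err
	}
	return hex.EncodeToString(b[:]), nil
}

func main() {
	id, err := newSpanID()
	if err != nil {
		panic(err)
	}
	fmt.Println(id)
}
```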
Christian Schwarz
0ab92d4861 build: avoid compiling platformtest test list generator
This also fixes a deprecation warning.
2024-09-08 23:19:45 +00:00
Christian Schwarz
740ab4b1b2 chore: io/ioutil has been deprecated 2024-09-08 23:19:45 +00:00
Christian Schwarz
48c5b60024 chore: grpc.DialContext has been deprecated 2024-09-08 23:19:45 +00:00
Christian Schwarz
40fd700855 chore: newer staticcheck complains about useless fmt.Sprintf 2024-09-08 20:57:09 +00:00
Christian Schwarz
def510abfd chore: require go 1.22/1.23, upgrade protobuf, upgrade all deps
Go upgrade:
- Go 1.23 is current => use that for release builds
- Go 1.22 is less than one year old, so it's desirable to support it.
- The [`Go Toolchains`](https://go.dev/doc/toolchain) stuff is available
  in both of these (would also be in Go 1.21). That is quite nice stuff,
  but required some changes to how we specify the versions we use in CircleCI and
  the `release-docker` Makefile target.

Protobuf upgrade:
- Go to protobuf GH release website
- Download latest locally
- run `sha256sum`
- replace existing pinned hashes
- `make generate`

Deps upgrade:
- `go get -t -u all`
- repository moves aren't handled well automatically, fix manually
- repeat until no changes
2024-09-08 20:49:09 +00:00
Christian Schwarz
08769a8752 fix: accidental use of wrong logging package 2024-09-08 12:57:58 +00:00
Christian Schwarz
5615f4929a
fix: replication of placeholder filesystems (#744)
fixes https://github.com/zrepl/zrepl/issues/742

Before this PR, when chaining replication from
A => B => C, if B had placeholders and the `filesystems`
included these placeholders, we'd incorrectly
fail the planning phase with error
`sender does not have any versions`.

The non-placeholder child filesystems of these placeholders
would then fail to replicate because of the
initial-replication-dependency-tracking that we do, i.e.,
their parent failed to initially replicate, hence
they fail to replicate as well
(`parent(s) failed during initial replication`).

We can do better than that because we have the information
whether a sender-side filesystem is a placeholder.
This PR makes the planner act on that information.
The outcome is that placeholders are replicated as
placeholders (albeit the receiver remains in control
of how these placeholders are created, i.e., `recv.placeholders`).
The mechanism to do it is:
1. Don't plan any replication steps for filesystems that
   are placeholders on the sender.
2. Ensure that, if a receiving-side filesystem exists, it
   is indeed a placeholder.

Check (2) may seem overly restrictive, but, the goal here
is not just to mirror all non-placeholder filesystems, but
also to mirror the hierarchy.

Testing performed:
- [x] confirm with issue reporter that this PR fixes their issue
- [x] add a regression test that fails without the changes in this PR
2024-09-05 23:26:42 +02:00
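The two-step mechanism described in 5615f4929a can be sketched as a small decision function. This is a simplified illustration under stated assumptions: the type `fsState`, the function `shouldPlanSteps`, and the error value are hypothetical names and do not come from the zrepl planner.

```go
package main

import (
	"errors"
	"fmt"
)

// fsState is a hypothetical, simplified view of one filesystem on one side.
type fsState struct {
	Exists        bool
	IsPlaceholder bool
}

var errReceiverNotPlaceholder = errors.New(
	"sender filesystem is a placeholder but receiver filesystem is not")

// shouldPlanSteps reports whether replication steps should be planned
// for a filesystem, given its state on the sender and on the receiver.
func shouldPlanSteps(sender, receiver fsState) (bool, error) {
	if !sender.IsPlaceholder {
		return true, nil // regular filesystem: plan as usual
	}
	// Step 2: an existing receiver-side filesystem must be a placeholder too.
	if receiver.Exists && !receiver.IsPlaceholder {
		return false, errReceiverNotPlaceholder
	}
	// Step 1: never plan replication steps for sender-side placeholders.
	return false, nil
}

func main() {
	placeholder := fsState{Exists: true, IsPlaceholder: true}
	regular := fsState{Exists: true}

	fmt.Println(shouldPlanSteps(regular, fsState{}))
	fmt.Println(shouldPlanSteps(placeholder, placeholder))
	fmt.Println(shouldPlanSteps(placeholder, regular))
}
```

Returning an error in the mismatch case, rather than silently skipping, reflects the stated goal of mirroring the hierarchy and not just the non-placeholder filesystems.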
Logan Pulley
440b07443f
Remove dead Bountysource link (#806)
Bountysource is dead.


https://www.theblockchain-group.com/wp-content/uploads/2023/11/TBG-CP17112023.pdf
2024-07-28 20:04:07 +02:00
Florian
e2fcf9ff5b
docs: add missing newline for codeblock in docs/compile-from-source.rst (#768)
to make it render correctly
2024-07-13 20:42:50 +02:00
Christian Schwarz
a5f6bc3697
github: disable dependabot for docs (#800) 2024-07-13 18:22:25 +02:00
Christian Schwarz
9c63736489
treat empty jobs & empty YAML as valid & ship empty jobs in deb/rpm (#788)
fixes https://github.com/zrepl/zrepl/issues/784
obsoletes https://github.com/zrepl/zrepl/pull/787
2024-05-14 19:18:22 +02:00
Fermín Olaiz
830536715e
docs: use $zrepl_apt_repo_file on installation snippet (#783) 2024-05-08 00:30:50 +02:00
Denis Shaposhnikov
ebc46cf1c0
Fix last_n keep rule (#691) (#750)
From https://github.com/zrepl/zrepl/issues/691

The last_n prune rule keeps everything, regardless of whether it matches
the regex, if there are fewer than `count` snapshots. The expectation
would be to never keep snapshots that don't match the regex, regardless
of their number.
2023-12-22 13:38:14 +01:00
Denis Shaposhnikov
27012e5623
Allow same root_fs for different jobs: sinks and so on (#752)
Because some jobs add the client identity to root_fs and other jobs don't,
we can't reliably detect overlapping filesystems. At the same time, we
need the ability to use equal or overlapping root_fs for different jobs.
For instance, see this config:

```
  - name: "zdisk"
    type: "sink"
    root_fs: "zdisk/zrepl"
    serve:
      type: "local"
      listener_name: "zdisk"
```
and
```
  - name: "remote-to-zdisk"
    type: "pull"
    connect:
      type: "tls"
    root_fs: "zdisk/zrepl/remote"
```

As you can see, the two jobs have overlapping root_fs, but the datasets
do not actually overlap, because job `zdisk` saves everything under
`zdisk/zrepl/localhost`, since it adds the client identity. So they
actually use two different filesystems: `zdisk/zrepl/localhost` and
`zdisk/zrepl/remote`. We can't detect this situation during the config
check, so let's just remove this check; it's the admin's duty to
configure correct root_fs values.

---------

Co-authored-by: Christian Schwarz <me@cschwarz.com>
2023-11-01 00:12:54 +01:00
Christian Schwarz
30faaec26a
build: ci: fix quickcheck-docs for external PRs (#763)
fixes https://github.com/zrepl/zrepl/issues/762
2023-11-01 00:12:23 +01:00
Christian Schwarz
21e0ae63a6 build: fix rpm builds, broken by ef9a63b: support package revisions 2023-10-07 18:46:28 +00:00
Christian Schwarz
370f40881d build: wrap-and-checksum didn't include .deb files
fixup of 9d5c892023
2023-10-07 17:03:27 +00:00
Christian Schwarz
fb71a7e4b0 build: forward ZREPL_VERSION and ZREPL_PACKAGE_RELEASE to docker targets 2023-10-07 16:36:43 +00:00
Christian Schwarz
ef9a63b075 build: support package revisions 2023-10-07 16:36:43 +00:00
Christian Schwarz
faef059edf build: get rid of bins-all target special case, bring back test vet lint steps of release target 2023-10-07 16:36:43 +00:00
Christian Schwarz
ad9fbf7b6d build: generic _impl target to run a make target for all GOOS/GOARCH combinations 2023-10-07 16:26:53 +00:00
Christian Schwarz
3bd17b8069 build: remove GO_SUPPORTS_ILLUMOS cruft
illumos is supported by all Go versions that can build zrepl
2023-10-07 16:26:53 +00:00
Christian Schwarz
99bf1487ae build: make release only build the binaries 2023-10-07 16:26:53 +00:00
Christian Schwarz
c3b4f01c44 build: CGO_ENABLED=0 for all builds 2023-10-07 16:26:50 +00:00
Christian Schwarz
9d5c892023 build: tooling to use CircleCI artifacts for releasing
Also, include RPMs and DEBs in the sha256sum.txt
2023-10-01 15:33:03 +00:00
Christian Schwarz
d8d1d25ec2 docs: changelog for 0.6.1 2023-09-10 11:13:14 +00:00
Christian Schwarz
d02d7e5e1d address updated golangci-lint errors: S1011 (gosimple)
should replace loop with `copy.outs[level] = append(copy.outs[level], os.outs[level]...)` (gosimple)
2023-09-10 11:12:46 +00:00
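The gosimple S1011 finding addressed in d02d7e5e1d concerns the following pattern. The sketch below uses a plain `[]int` and hypothetical function names rather than the `outs` slices from the commit.

```go
package main

import "fmt"

// copyWithLoop appends element by element; gosimple reports this as S1011.
func copyWithLoop(src []int) []int {
	var dst []int
	for _, v := range src {
		dst = append(dst, v)
	}
	return dst
}

// copyWithAppend is the replacement suggested by the linter:
// a single variadic append.
func copyWithAppend(src []int) []int {
	var dst []int
	dst = append(dst, src...)
	return dst
}

func main() {
	src := []int{1, 2, 3}
	fmt.Println(copyWithLoop(src), copyWithAppend(src)) // prints [1 2 3] [1 2 3]
}
```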
Christian Schwarz
39f8ff62f0 address updated golangci-lint errors: ineffectual assignment to err (ineffassign) 2023-09-10 11:12:46 +00:00
Christian Schwarz
9a434b0e54 go1.21: update golangci-lint (current version panics on go 1.21)
Command used:

```
cd build
go get -u github.com/golangci/golangci-lint/cmd/golangci-lint
go mod tidy
```

Further, golangci-lint requires go 1.20 to build, so, use that as the lowest version in the CI.
2023-09-10 11:12:46 +00:00
Christian Schwarz
b5053d2659 build: use Go 1.21 2023-09-10 10:19:23 +00:00
Christian Schwarz
0fe2ac6b90 debian packaging: make it work on non-x86_64 hosts (arm64 builder, specifically) 2023-09-10 10:12:20 +00:00
Christian Schwarz
95c924968a circleci: migrate to scheduled pipelines
https://circleci.com/docs/migrate-scheduled-workflows-to-scheduled-pipelines/
2023-09-09 11:55:04 +00:00
Christian Schwarz
523a3bb26b build: address breakage by golang:1.19 Docker image switching to bookworm
Debian bookworm apparently _requires_ pip to be used in venv, at least
when we use it inside the build.Dockerfile.

So, do that.
2023-09-09 11:55:04 +00:00
Christian Schwarz
96396b2e86 circleci: fixup bc92660: docs/publish.sh script -P option didn't work
forgot to add it to getopt
2023-09-09 11:08:03 +00:00
Sven Kirmess
8749f0bd3d docs: talks: add note on keep_bookmarks option (#687)
Mention that the keep_bookmarks option was removed since the talks were given.
2023-09-09 10:47:28 +00:00
Christian Schwarz
bc92660e09
circleci: ensure docs/publish.sh works as part of pre-merge ci workflow (#736) 2023-09-09 12:41:06 +02:00
Christian Schwarz
8b0637ddcc
docs: switch to sphinx-multiversion for multi-versioned docs (#734)
The sphinxcontrib-versioning seems unmaintained and I can't
get the fork that we used before this PR working on Python 3.10.

The situation wrt maintenance doesn't seem much better for
sphinx-multiversion, but, at least I could get it to work
with current sphinx versions.

The main problem with sphinx-multiversion is that it doesn't render
anything at `/`. I.e., `https://zrepl.github.io/configuration.html` will
be 404.
That's different from `sphinxcontrib-versioning`, and thus switching
to sphinx-multiversion would break URLs.
We host on GitHub pages and don't control the webserver,
so, we can't use webserver-level redirects to keep the URLs working.
We could create JS-level redirects, or `http-equiv`, but that's ugly as
well.
The simplest solution was to fork sphinx-multiversion and hard-code
zrepl's specific needs into that fork.
The fork is based off v0.2.4 and pinned via requirements.txt.
Here are its unique commits:
https://github.com/Holzhaus/sphinx-multiversion/compare/master...zrepl:sphinx-multiversion:zrepl

We should revisit `sphinx-polyversion` in the future once its docs
improve.
See
https://github.com/Holzhaus/sphinx-multiversion/issues/88#issuecomment-1606221194

This PR updates the various Python packages, as I couldn't get
sphinx-multiversion to work with the (very old) versions that were
pinned in `requirements.txt` prior to this PR.
This PR's `requirements.txt` is from a clean Python 3.10 venv on Ubuntu
22.10 after running

```
pip install sphinx sphinx-rtd-theme
pip install 'git+https://github.com/zrepl/sphinx-multiversion/@52c915d7ad898d9641ec48c8bbccb7d4f079db93#egg=sphinx_multiversion'
```
2023-09-09 12:21:25 +02:00
Christian Schwarz
bbdc6f5465
fix handling of tentative cursor presence if protection strategy doesn't use it (#714)
Before this PR, we would panic in the `check` phase of `endpoint.Send()`'s `TryBatchDestroy` call in the following cases: the current protection strategy does NOT produce a tentative replication cursor AND
  * `FromVersion` is a tentative cursor bookmark
  * `FromVersion` is a snapshot, and there exists a tentative cursor bookmark for that snapshot
  * `FromVersion` is a bookmark != tentative cursor bookmark, but there exists a tentative cursor bookmark for the same snapshot as the `FromVersion` bookmark

In those cases, the `check` concluded that we would delete `FromVersion`.
It came to that conclusion because the tentative cursor isn't part of `obsoleteAbs` if the protection strategy doesn't produce a tentative replication cursor.

The scenarios above can happen if the user changes the protection strategy from "with tentative cursor" to one "without tentative replication cursor", while there is a tentative replication cursor on disk.
The workaround was to rename the tentative cursor.

In all cases above, `TryBatchDestroy` would have destroyed the tentative cursor.

In case 1, that would fail the `Send` step and potentially break replication if the cursor is the last common bookmark. The `check` conclusion was correct.

In cases 2 and 3, deleting the tentative cursor would have been fine because `FromVersion` was a different entity than the tentative cursor. So, destroying the tentative cursor would be the right call.

The solution in this PR is as follows:
* add the `FromVersion` to the `liveAbs` set of live abstractions
* rewrite the `check` closure to use the full dataset path (`fullpath`) to identify the concrete ZFS object instead of the `zfs.FilesystemVersionEqualIdentity`, which is only identified by matching GUID.
  * Holds have no dataset path and are not the `FromVersion` in any case, so disregard them.

fixes #666
2023-07-04 20:21:48 +02:00
Goran Mekic
bc5e1ede04
metric to detect filesystems rules that don't match any local dataset (#653)
This PR adds a Prometheus counter called
`zrepl_zfs_list_unmatched_user_specified_dataset_count`.
Monitor for increases of the counter to detect filesystem filter rules that
have no effect because they don't match any local filesystem.

An example use case for this is the following story:
1. Someone sets up zrepl with `filesystems` filter for `zroot/pg14<`.
2. During the upgrade to Postgres 15, they rename the dataset to `zroot/pg15`,
   but forget to update the zrepl `filesystems` filter.
3. zrepl will not snapshot / replicate the `zroot/pg15<` datasets.

Since `filesystems` rules are always evaluated on the side that has the datasets,
we can smuggle this functionality into the `zfs` module's `ZFSList` function that
is used by all jobs with a `filesystems` filter.

Dashboard changes:
- histogram with increase in $__interval, one row per job
- table with increase in $__range
- explainer text box, so, people know what the previous two are about
We had to re-arrange some panels, hence the Git diff isn't great.

closes https://github.com/zrepl/zrepl/pull/653

Co-authored-by: Christian Schwarz <me@cschwarz.com>
Co-authored-by: Goran Mekić <meka@tilda.center>
2023-05-02 22:13:52 +02:00
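The idea behind the counter added in bc5e1ede04 can be sketched without Prometheus. This is a simplified illustration: `ruleMatches` and `unmatchedRules` are hypothetical helpers, the matching only handles exact names and the `<` subtree suffix, and the real implementation lives in the `zfs` module's `ZFSList` function and increments a metric instead of returning a slice.

```go
package main

import (
	"fmt"
	"strings"
)

// ruleMatches is a simplified stand-in for the `filesystems` filter:
// a rule ending in "<" matches the named dataset and everything below it,
// any other rule matches only the exact dataset name.
func ruleMatches(rule, dataset string) bool {
	if root, ok := strings.CutSuffix(rule, "<"); ok {
		if root == "" {
			return true // a bare "<" matches every dataset
		}
		return dataset == root || strings.HasPrefix(dataset, root+"/")
	}
	return dataset == rule
}

// unmatchedRules returns the rules that match none of the local datasets.
// Each returned rule corresponds to one increment of the counter.
func unmatchedRules(rules, datasets []string) []string {
	var unmatched []string
	for _, r := range rules {
		matched := false
		for _, d := range datasets {
			if ruleMatches(r, d) {
				matched = true
				break
			}
		}
		if !matched {
			unmatched = append(unmatched, r)
		}
	}
	return unmatched
}

func main() {
	rules := []string{"zroot/pg14<", "zroot/home<"}
	datasets := []string{"zroot", "zroot/pg15", "zroot/pg15/data", "zroot/home"}
	fmt.Println(unmatchedRules(rules, datasets)) // prints [zroot/pg14<]
}
```

The example mirrors the story in the commit message: after the dataset is renamed to `zroot/pg15`, the stale `zroot/pg14<` rule no longer matches anything and shows up as unmatched.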
Tercio Filho
2b3daaf9f1
zrepl status: hide progress bar once all filesystems reach terminal state (#674)
* Added `IsTerminal` method
* Made rendering of progress bar conditional based on IsTerminal
2023-05-02 19:28:56 +02:00
Sebastian Jäger
2b3df7e342
docs: address setup with two or more external disks (#695) 2023-05-02 18:57:26 +02:00
Christian Schwarz
5e4d4188f4 circleci: use orb circleci/go for module caching 2023-02-26 13:08:05 +01:00
Christian Schwarz
1e8ffe4486 circleci: run platform tests in CircleCI 2023-02-26 13:08:05 +01:00
Christian Schwarz
59389b84a2 platformtest: fix logmockzfs wrapper script / make test-platform for Go 1.19
See the comment in the script.

refs https://github.com/golang/go/issues/53962

 used by make test-platform breaks the test on Go 1.19
2023-02-26 13:08:05 +01:00