In
05f128868f azureblob: add --azureblob-no-head-object
we incorrectly parsed the size of the object as the Content-Length of
the returned header. This is incorrect in the presense of Range
requests.
This fixes the problem by parsing the Content-Range header if
avaialble to read the correct length from if a Range request was
issued.
See: #5734
This reverts commit
dc06973796 Revert "s3: use rclone's low level retries instead of AWS SDK to fix listing retries"
Which in turn reverted
5470d34740 "backend/s3: use low-level-retries as the number of SDK retries"
So we are back where we started.
It then modifies it to set the AWS SDK to `--low-level-retries`
retries, but set the rclone retries to 2 so that directory listings
can be retried.
Before this change the cleanup routine exited on the first deletion
error.
This change counts any errors on deletion and exits when the iteration
is complete with an error showing the number of deletion failures.
Deletion failures will be logged.
Before this change we uses limit/offset paging for directories in the
main directory listing routine and in the trash cleanup listing.
This switches to the new scheme of limit/marker which is more reliable
on a directory which is continuously changing. It has the disadvantage
that it doesn't tell us the total number of items available, however
that wasn't information rclone uses.
This changes the interface to NewObject so that if NewObject is called
on a directory then it should return fs.ErrorIsDir if possible without
doing any extra work, otherwise fs.ErrorObjectNotFound.
Tested on integration test server with:
go run integration-test.go -tests backend -run TestIntegration/FsMkdir/FsPutFiles/FsNewObjectDir -branch fix-stat -maxtries 1
The egress charges while using a CloudFront CDN url is cheaper when
compared to accessing the file directly from S3. So added a download
URL advanced option, which when set downloads the file using it.
Before this patch the md5all option would skip creating metadata with
hashsum if base filesystem provided md5, in hope to pass it through.
However, if base hash is slow (for example on local fs), chunker passed
slow md5 but never reported this fact in features.
This patch makes chunker snapshot base hashsum in metadata when md5all is
set and base hashsum is slow since chunker was intended to provide only
instant hashsums from the start.
Fixes#5508
Before this change, when uploading to a crypt, the ObjectInfo
accidentally used the encrypted size, not the unencrypted size when
--crypt-no-data-encryption was set.
Fixes#5498
In presence of no_data_encryption the Crypt's Put method used to over-optimize
and returned base object. This patch makes it return Crypt-wrapped object now.
Fixes#5498
I discovered that `rclone` always upload in chunks of 16MiB whenever
uploading a file smaller than `--drive-upload-cutoff`. This is
undesirable since the purpose of the flag `--drive-upload-cutoff` is
to *prevent* chunking below a certain file size.
I realized that it wasn't `rclone` forcing the 16MiB chunks. The
`google-api-go-client` forces a chunk size default of
[`googleapi.DefaultUploadChunkSize`](32bf29c2e1/googleapi/googleapi.go (L55-L57))
bytes for resumable type uploads. This means that all requests that
use `*drive.Service` directly for upload without specifying a
`googleapi.ChunkSize` will be forced to use a *`resumable`*
`uploadType` (rather than `multipart`) for files less than
`googleapi.DefaultUploadChunkSize`. This is also noted directly in the
Drive API client documentation [here](https://pkg.go.dev/google.golang.org/api/drive/v3@v0.44.0#FilesUpdateCall.Media).
This fixes the problem by passing `googleapi.ChunkSize(0)` to
`Media()` method calls, which is the only way to disable chunking
completely. This is mentioned in the API docs
[here](https://pkg.go.dev/google.golang.org/api/googleapi@v0.44.0#ChunkSize).
The other alternative would be to pass
`googleapi.ChunkSize(f.opt.ChunkSize)` -- however, I'm *strongly* in
favor of *not* doing this for performance reasons. By not explicitly
passing a `googleapi.ChunkSize(0)`, we effectively allow
[`PrepareUpload()`](https://pkg.go.dev/google.golang.org/api/internal/gensupport@v0.44.0#PrepareUpload)
to create a
[`NewMediaBuffer`](https://pkg.go.dev/google.golang.org/api/internal/gensupport@v0.44.0#NewMediaBuffer)
that copies the original `io.Reader` passed to `Media()` in order to
check that its size is less than `ChunkSize`, which will unnecessarily
consume time and memory.
`minChunkSize` is also changed to be `googleapi.MinUploadChunkSize`,
as it is something specified we have no control over.
Google Drive API allows for clauses like "modifiedTime > '2012-06-04T12:00:00'"
in the query param, so the filter flags --max-age and --min-age can be applied
directly at the directory listing phase rather than in a filter.
This is extremely helpful when we want to do an incremental backup of a remote
drive with many files but the number of recently changed file is small.
Co-authored-by: fotile96 <fotile96@users.noreply.github.com>
This replaces built-in os.MkdirAll with a patched version that stops the recursion
when reaching the volume part of the path. The original version would continue recursion,
and for extended length paths end up with \\? as the top-level directory, and the error
message would then be something like:
mkdir \\?: The filename, directory name, or volume label syntax is incorrect.
Before this change the union's feature flags were a strict AND of the
underlying remotes. This means that a union of a local disk (which can
Move but not Copy) and a bucket based remote (which can Copy but not
Move) could neither Move nor Copy.
This fix advertises Move in the union if all the remotes can Move or
Copy. It also implements Move as Copy+Delete (like rclone does
normally) if the underlying union does not support Move.
This enables renames to work with unions of local disk and bucket
based remotes expected.
Fixes#5632
This was started in
3626f10f26 pcloud: add sha256 support - fixes#5496
But this support turned out to be incomplete and caused the
integration tests to fail.
After updating rclone's dependencies these tests started failing on
windows/386
- TestInternalDoubleWrittenContentMatches
- TestInternalMaxChunkSizeRespected
The failures look like this. The root cause is unknown. The `Wait(n=1)
would exceed context deadline` errors come from golang.org/x/time/rate
but it isn't clear what is calling them.
2021/08/20 21:57:16 ERROR : worker-0 <one>: object open failed 0: rate: Wait(n=1) would exceed context deadline
[snip ~10 duplicates]
2021/08/20 21:57:56 ERROR : tidwcm1629496636/one: (0/26) error (chunk not found 0) response
2021/08/20 21:58:02 ERROR : worker-0 <one>: object open failed 0: rate: Wait(n=1) would exceed context deadline
--- FAIL: TestInternalDoubleWrittenContentMatches (45.77s)
cache_internal_test.go:310:
Error Trace: cache_internal_test.go:310
Error: Not equal:
expected: "one content updated double"
actual : ""
Diff:
--- Expected
+++ Actual
@@ -1 +1 @@
-one content updated double
+
Test: TestInternalDoubleWrittenContentMatches
2021/08/20 21:58:03 original size: 23592960
2021/08/20 21:58:03 updated size: 12
In this commit the config system was re-arranged
94dbfa4ea fs: change Config callback into state based callback #3455
This passed the password as a temporary config parameter but forgot to
reveal it in the API call.
At some point some google docs files started having sizes returned in
their listing information.
This then caused rclone to treat the docs as files which caused
downloads to fail.
The API docs now state that google docs may have sizes (whereas I'm
pretty sure it didn't earlier).
This fix removes the check for size, so google docs are identified
solely by not having an MD5 checksum.
This change fixes the bug described below:
if a file is removed while the local backend List() runs,
the call will flag an accounting error.
The bug manifests itself if local backend is the Sync target
due to intrinsic concurrency.
The odds to hit this bug depend on --checkers and --transfers.
Chunker over local backend is affected even more because
updating a composite object with a smaller size content
translates into removing chunks on the underlying file system
and involves a number of List() calls.
- Unify all hash names as lowercase alphanumerics without punctuation.
- Legacy names continue to work but disappear from docs, they can be depreciated or dropped later.
- Make rclone hashsum print supported hash list in case of wrong spelling.
- Update documentation.
Fixes#5071Fixes#4841
Before this change, rclone would always check the root to see if it
was an object.
This change doesn't check to see if the root is an object if the path
ends with a /
This avoids a transaction where rclone HEADs the path to see if it
exists.
See #4990
macOS stores files in NFD form and transferring them like this to some
systems causes the Korean language to display incorrectly.
This adds the flag --local-unicode-normalization to optionally
normalize the file names to NFC.
This also removes the (long deprecated) --local-no-unicode-normalization flag
See: https://forum.rclone.org/t/support-for-korean-jaso-conversion/19435
This is a very large change which turns the post Config function in
backends into a state based call and response system so that
alternative user interfaces can be added.
The existing config logic has been converted, but it is quite
complicated and folloup commits will likely be needed to fix it!
Follow up commits will add a command line and API based way of using
this configuration system.
It was discovered on some Android systems, the stat size of a symlink
is different to the size that readlink returns.
This was giving errors like this
transport connection broken: http: ContentLength=30 with Body length 28
There are enough exceptions to the size of readlink being different to
the size of stat that this patch now always does readlink to work out
the size of a symlink.
Since symlinks are relatively uncommon this shouldn't affect
performance too much and will mean that the size is always correct.
This deprecates the --local-zero-size-links flag which is now
effectively always enabled.
See: https://forum.rclone.org/t/problem-with-symlinks-and-links/23840/
Includes adding support for additional size input suffix Mi and MiB, treated equivalent to M.
Extends binary suffix output with letter i, e.g. Ki and Mi.
Centralizes creation of bit/byte unit strings.
v1.4.6 of uplink allows us to do a negative offset from the end of the
file. This removes a round trip when requesting the last N bytes of a
file.
Previous to v1.4.6 of uplink it wasn't possible to do a negative offset
on download. This meant that to fulfill the semantics of http range
headers it was necessary to first fetch the size of the object via a
stat call and compute absolute offset and length.
Restructuring of config code in v1.55 resulted in config
file being loaded early at process startup. If configuration
file is encrypted this means user will need to supply the password,
even when running commands that does not use config.
This also lead to an issue where mount with --deamon failed to
decrypt the config file when it had to prompt user for passord.
Fixes#5236Fixes#5228
Including the bucket name as part of the `fileNamePrefix` passed to
`b2_get_download_authorization` results in a link valid for objects that
have the bucket name as part of the object path; e.g.,
rclone link :b2:some-bucket/some-file
would result in a public link valid for the object
`some-bucket/some-file` in the `some-bucket` bucket (in rclone-remote
parlance, `:b2:some-bucket/some-bucket/some-file`). This will almost
certainly result in a broken link.
The B2 docs don't explicitly specify this behavior, but the example
given for `fileNamePrefix` provides some clarification.
See https://www.backblaze.com/b2/docs/b2_get_download_authorization.html.
This code removes the code added in
15d19131bd s3: use aws web identity role provider
This code no longer works because it doesn't initialise the
tokenFetcher - leading to a nil pointer crash.
The proper way to initialise this is with the
NewWebIdentityCredentials but it isn't clear where to get the other
parameters: roleARN, roleSessionName, path.
In the linked issue a user reports rclone working with EKS anyway, so
perhaps this code is no longer needed.
If it is needed, hopefully someone who knows AWS better will come
along and fix it!
See: https://forum.rclone.org/t/add-support-for-aws-sso/23569
Betweeen rclone v1.54 and v1.55 there was an approx 3x performance
regression when transferring to distant SFTP servers (in particular
rsync.net).
This turned out to be due to the library github.com/pkg/sftp rclone
uses. Concurrent writes used to be enabled in this library by default
(for v1.12.0 as used in rclone v1.54) but they are no longer enabled
(for v1.13.0 as used in rclone v1.55) for safety reasons and it is
necessary to enable them specifically.
The safety concerns are due to the uncertainty as to whether writes
come in order and whether a half completed file might have holes in
it. This isn't a problem for rclone since a) it doesn't restart
uploads and b) it has a post-transfer checksum test.
This change introduces a new flag `--sftp-disable-concurrent-writes`
to control the feature which defaults to false, meaning that
concurrent writes are enabled as in v1.54.
However this isn't quite enough to fix the problem as the sftp library
needs to be able to sniff the size of the stream from the reader
passed in, so this also adds a `Size` interface to the reader to
enable this. This involved a patch to the library.
The library was reverted to v1.12.0 for v1.55.1 - this patch installs
v1.13.0+master to fix the Size interface problem.
See: https://github.com/pkg/sftp/issues/426
Before this change, rclone checked to see if an object existed before
doing an upload by listing the destination directory. This was very
inefficient, especially with large directories.
After this change rclone uses the pre upload check API call which
checks to see if it is OK to upload an object, and also returns the ID
of an existing object which saves rclone having to do a directory
listing.
OneDrive randomly returns the error message: "InvalidAuthenticationToken: Unable to initialize RPS". These unexpected errors typically caused the entire rclone command to fail.
This work around recognizes these errors and marks them for a low level retry, that mostly succeeds. This will make rclone commands complete without being noticeable affected.
Fixes: #5270
With the file version format standardized in lib/version, `crypt` can
now treat the version strings separately from the encrypted/decrypted
file names. This allows --b2-versions to work with `crypt`.
Fixes#1627
Co-authored-by: Luc Ritchie <luc.ritchie@gmail.com>
Before this change rclone would auth over https even when the server
was configured with http.
Authing over http obviously isn't ideal, however this type of server
is on-premise and doesn't work over https.
PR #4266 modified ftpConnection to make ftp library into using
a custom dial function which is QoS aware and takes care of TLS.
However the ServerConn.Login function from the ftp library also needs
TLS config passed explicitly as a trigger for sending PSBZ and PROT
options to FTP server. This was not taken care of resulting in
failure to connect via FTP with implicit TLS.
This PR fixes that.
Fixes#5210
In
a3fcadddc8 sftp: close idle connections after --sftp-idle-timeout (1m by default)
Idle SFTP connections were closed after 1 minute. However due to the
way SSH multiplexes connections over a single SSH connection this
meant that if uploads or downloads went on for more than one minute
they failed with "EOF errors" as their underlying connection was
closed.
This fixes the problem by not clearing idle connections if there are
any transfers in progress.
Fixes#5197
This reverts the library update done in this commit.
713f8f357d sftp: fix "file not found" errors for read once servers
Reverting this commit triples the performance to a far away sftp server.
See: https://github.com/pkg/sftp/issues/426
Before this change when the context was cancelled (due to
--max-duration for example) this could deadlock when uploading
multipart uploads.
This change fixes the problem by introducing another go routine to
monitor the context and close the pipe with an error when the context
errors.
When reading files from B2 via cloudflare using --b2-download-url
cloudflare strips the Content-Length headers (presumably so it can
inject stuff into the body).
This caused rclone to think the file was corrupted as the length
didn't match.
The patch uses the old length read from the listing if there is no
Content-Length.
See: https://forum.rclone.org/t/b2-cloudflare-error-directory-not-found/23026
This commit broke the initialisation of the union backend
f17d7c0012 union: refactor to use fspath.SplitFs instead of fs.ParseRemote #4996
This patch fixes it.
Box recently changed their API, changing the case of returned API items
> On May 10th, 2021, as part of our continued infrastructure upgrade,
> Box's API response headers will standardize to return in a case
> insensitive manner, in line with industry best practices and our API
> documentation. Applications that are using these headers, such as
> "location" and "retry-after", will need to verify that their
> applications are checking for these headers in a case-insensitive
> fashion.
Rclone was reading the raw headers from the `http.Header` and not
using the `Get` accessor method which meant that it was sensitive to
case changes.
This fixes the problem by using the `Get` accessor method.
See: https://forum.rclone.org/t/box-backend-incompatible-with-box-api-changes-being-deployed/22972
If you exceed rate limits, dropbox tells you to wait for 300 seconds -
this is rather a long time for the user to be waiting for rclone to
finish, so emit a NOTICE level log instead of a DEBUG.
Some sftp servers don't allow the user to access the file after upload.
In this case the error message indicates that using
--sftp-set-modtime=false would fix the problem. However it doesn't
because SetModTime does a stat call which can't be disabled.
Update SetModTime failed: SetModTime stat failed: object not found
After upload this patch checks for an `object not found` error if
set_modtime == false and ignores it, returning the expected size of
the object instead.
It also makes SetModTime do nothing if set_modtime = false
https://forum.rclone.org/t/sftp-update-setmodtime-failed/22873
These were added by accident in
d9959b0271 drive: pass context on to drive SDK - this will help with cancellation
Which added lots of new Context() calls but duplicated some existing
ones.
Before this we just failed if the ftp connection or login failed.
This change adds a pacer just for the ftp connect and retries if the
connection failed to Dial or the login returns a 421 error.
This is implemented as a state machine parser so it can emit sensible
error messages.
It does not use the connection strings elsewhere in rclone yet - see
subsequent commits.
An optional fuzzer is implemented for the Parse function.
Before this change, when using an all create method with one of the
upstreams being read only, if there was an existing file on the read
only remote, it was impossible to update it.
This change detects that situation and creates the file on a
read/write upstream. This file will shadow the file on the read/only
upstream. If it is deleted the read only upstream file will be visible
again.
Fixes#4929
Before this fix using the epff policy could double close a channel.
The fix refactors the code to make that impossible and cancels any
running queries when the first query is found.
Before this change the config file needed to be explicitly reloaded.
This coupled the config file implementation with the backends
needlessly.
This change stats the config file to see if it needs to be reloaded on
every config file operation.
This allows us to remove calls to
- config.SaveConfig
- config.GetFresh
Which now makes the the only needed interface to the config file be
that provided by configmap.Map when rclone is not being configured.
This also adds tests for configfile
It introduces a new flag --sftp-disable-concurrent-reads to stop the
problematic behaviour in the SFTP library for read-once servers.
This upgrades the sftp library to v1.13.0 which has the fix.
This change checks the context whenever rclone might retry, and
doesn't retry if the current context has an error.
This fixes the pathological behaviour of `--max-duration` refusing to
exit because all the context deadline exceeded errors were being
retried.
This unfortunately meant changing the shouldRetry logic in every
backend and doing a lot of context propagation.
See: https://forum.rclone.org/t/add-flag-to-exit-immediately-when-max-duration-reached/22723