This commit ports a fast C implementation from https://github.com/namazso/QuickXorHash
It uses new crypto/subtle code from go1.20 to avoid the use of unsafe.
Typical speedups are about 25x when using go1.20
goos: linux
goarch: amd64
cpu: Intel(R) Celeron(R) N5105 @ 2.00GHz
QuickXorHash-Before 2.49ms 422MB/s ±11% 100.00%
QuickXorHash-Subtle 87.9µs 11932MB/s ± 5% +2730.83% + 42.17%
Co-Author: @namazso
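As a rough illustration of the crypto/subtle approach, the sketch below XORs one byte slice into another without unsafe. It assumes the go1.20 addition referred to above is `subtle.XORBytes`; `xorInto` is a hypothetical helper and not the function from the port.

```go
package main

import (
	"crypto/subtle"
	"fmt"
)

// xorInto XORs src into dst in place and returns the number of bytes
// processed. It is a hypothetical helper for illustration only.
func xorInto(dst, src []byte) int {
	n := len(dst)
	if len(src) < n {
		n = len(src)
	}
	// XORBytes sets dst[i] = x[i] ^ y[i] for i < min(len(x), len(y)).
	return subtle.XORBytes(dst[:n], dst[:n], src[:n])
}

func main() {
	state := []byte{0x0f, 0xf0, 0xaa}
	xorInto(state, []byte{0xff, 0xff})
	fmt.Printf("%x\n", state) // prints f00faa
}
```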
Uploading 100 files of 1 MB each took 20 seconds before. With the above fix it takes around 2 seconds.
10x time improvement in line with pacer's sleep reduction from 100ms to 10ms
Before this change, when uploading files bigger than 1TiB, the chunk
calculator would work out that the chunk size needed to be bigger than
the default 100 MiB to fit within the 10,000 parts limit.
However the uploader was still using the memory pool for the old chunk
size and this caused errors like
panic: runtime error: slice bounds out of range [:122683392] with capacity 100663296
The fix for this is to make a temporary pool with the larger chunk
size and use it during the upload of the large file.
See: https://forum.rclone.org/t/rclone-cannot-complete-upload-to-b2-restarts-upload-frequently/35617/
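The idea of the fix can be sketched as follows. This is a minimal illustration under assumptions, not the backend's code: `chunkSizeFor`, `newPool` and `poolFor` are hypothetical names, the pool is a plain `sync.Pool`, and the 100 MiB and 10,000 values are taken from the description above.

```go
package main

import (
	"fmt"
	"sync"
)

// chunkSizeFor returns defaultChunk unless the file would need more than
// maxParts parts, in which case it returns the smallest chunk size that fits.
func chunkSizeFor(fileSize, defaultChunk, maxParts int64) int64 {
	chunk := defaultChunk
	if fileSize > chunk*maxParts {
		chunk = (fileSize + maxParts - 1) / maxParts
	}
	return chunk
}

// newPool returns a pool of buffers of exactly chunkSize bytes.
func newPool(chunkSize int64) *sync.Pool {
	return &sync.Pool{New: func() any { return make([]byte, chunkSize) }}
}

// poolFor returns the shared pool when the default chunk size is in use
// and a temporary pool sized for this upload otherwise.
func poolFor(shared *sync.Pool, defaultChunk, chunkSize int64) *sync.Pool {
	if chunkSize == defaultChunk {
		return shared
	}
	return newPool(chunkSize)
}

func main() {
	const defaultChunk, maxParts = 100 << 20, 10000
	shared := newPool(defaultChunk)
	fileSize := int64(2) << 40 // 2 TiB
	chunk := chunkSizeFor(fileSize, defaultChunk, maxParts)
	buf := poolFor(shared, defaultChunk, chunk).Get().([]byte)
	// The buffer is now large enough for buf[:chunk] to be in range.
	fmt.Println(chunk > defaultChunk, int64(len(buf)) == chunk)
}
```

Slicing a buffer from the shared pool to the larger chunk size is what produced the "slice bounds out of range" panic. Sizing the pool from the computed chunk size avoids it.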
Before this change we were sending webdav requests to the go http
FileServer. In go1.20 these (rightly) started returning errors which
caused the tests to fail.
The test has been changed to properly mock up an About query and
response so an end to end test of adding headers is possible.
Passwords for encrypted libraries are kept in memory in the server
and flushed after an hour.
This MR fixes an issue that occurs when the library password expires after 1 hour.
rclone sync erroneously deleted folders renamed to a different case on
crypts where directory name encryption was disabled and the underlying
remote was case insensitive.
Example: Renaming the folder Test to tEST before a sync to a crypt having
remote=OneDrive:crypt and directory_name_encryption=false could result in
the folder and all its content being deleted. The following sync would
correctly create the tEST folder and upload all of the content.
Additional tests have revealed other potential issues when using
filename_encryption=off or directory_name_encryption=false on case
insensitive remotes. The documentation has been updated to warn about
potential problems when using these combinations.
This error was caused by rclone supplying an empty
`x-ms-blob-public-access:` header when creating a container for
private access, rather than omitting it completely.
This is a valid way of specifying containers should be private, but if
the storage account has the flag "Blob public access" unset then it
gives "409 Public access is not permitted on this storage account".
This patch fixes the problem by only supplying the header if the
access is set.
Fixes #6645
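A minimal sketch of the behaviour described above, using only net/http. `containerHeaders` is a hypothetical helper and does not reflect how the Azure SDK builds its requests.

```go
package main

import (
	"fmt"
	"net/http"
)

// containerHeaders builds the headers for a create-container request.
// The public access header is only added when an access level is set,
// so private containers send no x-ms-blob-public-access header at all.
func containerHeaders(access string) http.Header {
	h := http.Header{}
	if access != "" {
		h.Set("x-ms-blob-public-access", access)
	}
	return h
}

func main() {
	fmt.Println(len(containerHeaders("")))     // no header for private access
	fmt.Println(len(containerHeaders("blob"))) // header present when set
}
```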
Storj switched to a single global s3 endpoint backed by BGP routing.
We want to stop advertising the former regional endpoints and have the
global one as the only option.
Before this change, when a new object was created s3 returned its
versionID (on a versioned bucket) and rclone recorded it in the
object.
This means that when rclone came to delete the object it would delete
it with the versionID.
However it is common to forbid actions with versionIDs on buckets so
as to preserve the historical record and these operations would fail
whereas they succeeded in pre-v1.60.0 versions.
This patch fixes the problem by not recording versions of objects
supplied by the S3 API on upload unless `--s3-versions` or
`--s3-version-at` is used. This makes rclone behave as it did before
v1.60.0 when version support was introduced.
See: https://forum.rclone.org/t/s3-and-intermittent-403-errors-with-file-renames-and-drag-and-drop-operations-in-windows-explorer/34773
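The decision can be sketched as below. The `options` struct and `versionToRecord` are hypothetical stand-ins for the backend's option handling, with `Versions` and `VersionAt` representing `--s3-versions` and `--s3-version-at`.

```go
package main

import (
	"fmt"
	"time"
)

// options is a hypothetical stand-in for the backend options.
type options struct {
	Versions  bool      // set by --s3-versions
	VersionAt time.Time // set by --s3-version-at, zero if unset
}

// versionToRecord returns the version ID to store on the object. The ID
// returned by the API on upload is dropped unless versions are in use.
func versionToRecord(opt options, fromAPI *string) *string {
	if !opt.Versions && opt.VersionAt.IsZero() {
		return nil
	}
	return fromAPI
}

func main() {
	id := "example-version-id"
	fmt.Println(versionToRecord(options{}, &id) == nil)
	fmt.Println(versionToRecord(options{Versions: true}, &id) == &id)
}
```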
This patch implements --use-server-modtime for the Azureblob backend.
It does this by not reading the time from the metadata if the global
flag is set.
When the SDK was upgraded it started delivering metadata where the
keys were not in lower case as per the old SDK.
Rclone normalises the case of the keys for storage in the Object, but
the directory marker check was being done with the unnormalised keys
as it needs to be done before the Object is created.
This fixes the directory marker check to do a case insensitive compare
of the metadata keys.
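A case insensitive check of this kind might look like the sketch below. The metadata key `hdi_isfolder` and the plain `map[string]string` type are assumptions made for illustration and are not taken from this commit.

```go
package main

import (
	"fmt"
	"strings"
)

// dirMarkerKey is an assumed metadata key used to mark directories.
const dirMarkerKey = "hdi_isfolder"

// isDirectoryMarker reports whether the raw, unnormalised metadata
// contains the marker key with the value "true", ignoring key case.
func isDirectoryMarker(metadata map[string]string) bool {
	for k, v := range metadata {
		if strings.EqualFold(k, dirMarkerKey) && v == "true" {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(isDirectoryMarker(map[string]string{"Hdi_isfolder": "true"}))
	fmt.Println(isDirectoryMarker(map[string]string{"mtime": "true"}))
}
```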
Before this change, we were taking the version ID straight from the
XML blob returned by the SDK and thus pinning the XML into memory
which bulked up the average memory per object from about 400 bytes to
4k.
Copying the string fixes the excess memory usage.
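One way to copy a string so that it no longer shares memory with a larger buffer is `strings.Clone` (go1.18 and later). The sketch below assumes that approach. The `object` type and `setVersionID` are hypothetical, and the commit may copy the string differently.

```go
package main

import (
	"fmt"
	"strings"
)

// object is a hypothetical, cut down stand-in for the backend's Object.
type object struct {
	versionID *string
}

// setVersionID stores a fresh copy of the version ID so that the object
// does not keep the buffer the SDK parsed it from reachable.
func setVersionID(o *object, fromSDK *string) {
	if fromSDK == nil {
		o.versionID = nil
		return
	}
	v := strings.Clone(*fromSDK)
	o.versionID = &v
}

func main() {
	id := "example-version-id"
	o := &object{}
	setVersionID(o, &id)
	fmt.Println(*o.versionID == id, o.versionID != &id)
}
```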
This reverts commit 4f386a1ccd.
It turns out that Alibaba OSS does support list v2 and the detection
code was wrong.
This means that users of the gov version of Alibaba will have to add
`list_version 1` to their config files.
See #6600
In this commit
ab849b3613 s3: fix listing loop when using v2 listing on v1 server
The ContinuationToken was tested for existence, but it is the
NextContinuationToken that we are interested in.
See: #6600
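The corrected check can be sketched as follows. The `page` type is a hypothetical, simplified stand-in for the SDK's list v2 response, and the error text is illustrative only.

```go
package main

import (
	"errors"
	"fmt"
)

// page is a hypothetical, simplified list v2 response.
type page struct {
	IsTruncated           bool
	ContinuationToken     *string // echo of the token sent in the request
	NextContinuationToken *string // token to use for the next request
}

// nextToken returns the token for the next request, or "" when the
// listing is complete. A truncated page without a NextContinuationToken
// is treated as a v1 server answering a v2 request.
func nextToken(p page) (string, error) {
	if !p.IsTruncated {
		return "", nil
	}
	if p.NextContinuationToken == nil || *p.NextContinuationToken == "" {
		return "", errors.New("truncated listing without NextContinuationToken")
	}
	return *p.NextContinuationToken, nil
}

func main() {
	tok := "abc"
	_, err := nextToken(page{IsTruncated: true, ContinuationToken: &tok})
	fmt.Println(err != nil)
}
```

Testing ContinuationToken instead would miss the problem on every page after the first, because that field only echoes the token from the request.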
Previously it was limited to plain ASCII (0-9, A-Z, a-z).
This is implemented by adding \p{L}\p{N} alongside the \w in the regex.
Even though these overlap, keeping the \w means we can be sure it is 100%
backwards compatible.
Fixes #6618
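The effect of adding the Unicode classes can be seen in the sketch below. The pattern is a simplified, hypothetical one and not the regex from the source. In Go's regexp package \w matches ASCII only, so \p{L} and \p{N} are what admit non-ASCII letters and digits.

```go
package main

import (
	"fmt"
	"regexp"
)

// nameRe is a hypothetical pattern: \w is kept for backwards
// compatibility, \p{L} and \p{N} add Unicode letters and numbers.
var nameRe = regexp.MustCompile(`^[\w\p{L}\p{N}]+$`)

func main() {
	for _, name := range []string{"remote1", "données", "日本語", "a b"} {
		fmt.Println(name, nameRe.MatchString(name))
	}
}
```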
This was caused by
a9bd0c8de6 s3: reduce memory consumption for s3 objects
Which assumed that the StorageClass would always be set, but it isn't
set for Versions.
This updates the authentication to include
- Auth from the environment
    1. Environment Variables
    2. Managed Service Identity Credentials
    3. Azure CLI credentials (as used by the az tool)
- Account and Shared Key
- SAS URL
- Service principal with client secret
- Service principal with certificate
- User with username and password
- Managed Service Identity Credentials
And rationalises the auth order.
Normally rclone will check the container exists before uploading if it
hasn't listed the container yet.
Often rclone will be running with a limited set of permissions which
means rclone can't create the container anyway, so this stops the
check.
This will save a transaction.
This commit switches from using the old Azure go modules
github.com/Azure/azure-pipeline-go/pipeline
github.com/Azure/azure-storage-blob-go/azblob
github.com/Azure/go-autorest/autorest/adal
to the new SDK
github.com/Azure/azure-sdk-for-go/
This stops rclone using deprecated code and enables the full range of
authentication with Azure.
See #6132 and #5284
Before this change, rclone would enter a listing loop if it used v2
listing on a v1 server and the list exceeded 1000 items.
This change detects the problem and gives the user a helpful message.
Fixes #6600
This copies the storageClass string instead of using a pointer to the original string.
This prevents the Go garbage collector from keeping large amounts of
XMLNode structs and references in memory, created by xmlutil.XMLToStruct()
from the aws-sdk-go.
This commit uses the MLST command (where available) to get the status
for single files rather than listing the parent directory and looking
for the file. This makes actions such as using `--files-from` much quicker.
* use getEntry to lookup remote files when supported
* findItem now expects the full path directly
It makes the expected argument similar to the getInfo method, the
difference now is that one is returning a FileInfo whereas
the other is returning an ftp Entry.
Fixes #6225
Co-authored-by: Nick Craig-Wood <nick@craig-wood.com>
Before this fix a chain compress -> crypt -> s3 was giving errors
BadDigest: The Content-MD5 you specified did not match what we received.
This was because the crypt backend was encrypting the underlying local
object to calculate the hash rather than the contents of the metadata
stream.
It did this because the crypt backend incorrectly identified the
object as a local object.
This fixes the problem by making sure the crypt backend does not
unwrap anything but fs.OverrideRemote objects.
See: https://forum.rclone.org/t/not-encrypting-or-compressing-before-upload/32261/10
Before this change we were putting connections which held a local
context into the connection pool.
This meant that when the operation had finished the context was
cancelled and the connection became unusable.
See: https://forum.rclone.org/t/failed-to-sync-context-canceled/34017/
Before this change, when using --server-side-across-configs rclone
would direct Move/Copy/DirMove to the destination server.
However this should be directed to the source server. This is a little
unclear in the RFC, but the name of the parameter "Destination:" seems
clear and this is how dCache and Rucio have implemented it.
See: https://forum.rclone.org/t/webdav-copy-request-implemented-incorrectly/34072/
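The corrected direction can be sketched with net/http as below: the COPY request goes to the source URL and the target is named in the "Destination:" header. `newCopyRequest` and the example.com URLs are hypothetical, and any other headers a real request would carry are omitted.

```go
package main

import (
	"fmt"
	"net/http"
)

// newCopyRequest builds a WebDAV COPY request. The request is addressed
// to the source server and names the target in the Destination header.
func newCopyRequest(srcURL, dstURL string) (*http.Request, error) {
	req, err := http.NewRequest("COPY", srcURL, nil)
	if err != nil {
		return nil, err
	}
	req.Header.Set("Destination", dstURL)
	return req, nil
}

func main() {
	req, err := newCopyRequest(
		"https://src.example.com/dav/a.txt",
		"https://dst.example.com/dav/a.txt",
	)
	if err != nil {
		panic(err)
	}
	fmt.Println(req.Method, req.URL.Host, req.Header.Get("Destination"))
}
```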