Some storage providers e.g. S3 don't have an efficient rename operation.
Before this change, when chunker finished an upload, the server-side copy
and delete operations that renamed temporary chunks to their final names
could take a significant amount of time.
This PR records transaction identifier (versioning) in the metadata of
chunker composite objects striving to remove the need for rename
operations on such backends.
This approach will be triggered be the new "transactions" configuration
option, which can be "rename" (the default) or "norename".
We implement the new approach for uploads (Put operations).
The chunker Move operation still uses the rename operation of
underlying backend. Filling this gap is left for a later PR.
Co-authored-by: Ivan Andreev <ivandeex@gmail.com>
Before this change, if folder level access permissions policy was in
use, with trailing `/` marking the folders then rclone would HEAD the
path without a trailing `/` to work out if it was a file or a folder.
This returned a permission denied error, which rclone returned to the
user.
Failed to create file system for "s3:bucket/path/": Forbidden: Forbidden
status code: 403, request id: XXXX, host id:
Previous to this change
53aa03cc44 s3: complete sse-c implementation
rclone would assume any errors when HEAD-ing the object implied it
didn't exist and this test would not fail.
This change reverts the functionality of the test to work as it did
before, meaning any errors on HEAD will make rclone assume the object
does not exist and the path is referring to a directory.
Fixes#4990
This implements polling support for the Dropbox backend. The Dropbox SDK dependency had to be updated due to an auth issue, which was fixed on Jan 12 2021. A secondary internal Dropbox service was created to handle unauthorized SDK requests, as is necessary when using the ListFolderLongpoll function/endpoint. The config variable was renamed to cfg to avoid potential conflicts with the imported config package.
Sharepoint 2016 returns status 204 to the purge request
even if the directory to purge does not really exist.
This change adds an extra check to detect this condition
and returns a proper error code.
The go-ntlmssp NTLM negotiator has to try various authentication methods.
Intermediate responses from Sharepoint have status code 401, only the
final one is different. When rclone runs a large operation in parallel
goroutines according to --checkers or --transfers, one of threads can
receive intermediate 401 response targeted for another one and returns
the 401 authentication error to the user.
This patch fixes that.
On-premises Sharepoint returns HTTP errors 400 or 500 in
reply to attempts to use file names with special characters
like hash, percent, tilde, invalid UTF-7 and so on.
This patch activates transparent encoding of such characters.
As per Microsoft documentation, Windows authentication
(NTLM/Kerberos/Negotiate) is not supported with HTTP/2.
This patch disables transparent HTTP/2 support when the
vendor setting is "sharepoint-ntlm". Otherwise connections
to IIS/10.0 can fail with HTTP_1_1_REQUIRED.
Co-authored-by: Georg Neugschwandtner <georg.neugschwandtner@gmx.net>
The most popular keyword for the Sharepoint in-house or company
installations is "On-Premises".
"Microsoft OneDrive account" is in fact just a Microsoft account.
Co-authored-by: Georg Neugschwandtner <georg.neugschwandtner@gmx.net>
Add new option option "sharepoint-ntlm" for the vendor setting.
Use it when your hosted Sharepoint is not tied to the OneDrive
accounts and uses NTLM authentication.
Also add documentation and integration test.
Fixes: #2171
S3 backend shared_credentials_file option wasn't working neither from
config option nor from command line option. This was caused cause
shared_credentials_file_provider works as part of chain provider, but in
case user haven't specified access_token and access_key we had removed
(set nil) to credentials field, that may contain actual credentials got
from ChainProvider.
AWS_SHARED_CREDENTIALS_FILE env varible as far as i understood worked,
cause aws_sdk code handles it as one of default auth options, when
there's not configured credentials.
This change adds the scopes rclone wants during the oauth request.
Previously rclone left these blank to get a default set.
This allows rclone to add the "members.read" scope which is necessary
for "impersonate" to work, but only when it is in use as it require
authorisation from a Team Admin.
See: https://forum.rclone.org/t/dropbox-no-members-read/22223/3
Some virtual filesystems (such as Google Drive File Stream) may
incorrectly set the actual file size equal to the preallocated space,
causing checksum and file size checks to fail.
This flag can be used to disable preallocation for local backends of
this type.
Before this change, if an application key limited to a prefix was in
use, with trailing `/` marking the folders then rclone would HEAD the
path without a trailing `/` to work out if it was a file or a folder.
This returned a permission denied error, which rclone returned to the
user.
Failed to create file system for "b2:bucket/path/":
failed to HEAD for download: Unknown 401 (401 unknown)
This change assumes any errors on HEAD will make rclone assume the
object does not exist and the path is referring to a directory.
See: https://forum.rclone.org/t/b2-error-on-application-key-limited-to-a-prefix/22159/
Before this change, if --b2-chunk-size was raised above 200M then this
error would be produced:
b2: upload cutoff: 200M is less than chunk size 1G
This change automatically reaises --b2-upload-cutoff to be the value
of --b2-chunk-size if it is below it, which stops this error being
generated.
Fixes#4475
Before this change, running
rclone backend copyid drive: ID file.txt
Failed with the error
command "copyid" failed: failed copying "ID" "file.txt": can't use empty string as a path
This fixes the problem.
This makes sure that partially uploaded large files are removed
unless the `--swift-leave-parts-on-error` flag is supplied.
- refactor swift.go
- add unit test for swift with chunk
- add unit test for large object with fail case
- add "-" to white list char during encode.
Assume the Stat size of links is zero (and read them instead)
On some virtual filesystems (such ash LucidLink), reading a link size via a
Stat call always returns 0.
However, on unix it reads as the length of the text in the link. This may
cause errors like this when syncing:
Failed to copy: corrupted on transfer: sizes differ 0 vs 13
Setting this flag causes rclone to read the link and use that as the size of
the link instead of 0 which in most cases fixes the problem.
Fixes#4950
Signed-off-by: Riccardo Iaconelli <riccardo@kde.org>
This patch changes to using the default page limit for listing
unfinished multpart uploads rather than 1000. 1000 is the maximum
specified in the docs, but setting anything larger than 200 gives an
error.
- fix test case FsNewObjectCaseInsensitive (PR #4830)
- continue PR #4917, add comments in metadata detection code
- add warning about metadata detection in user documentation
- change metadata size limits, make room for future development
- hide critical chunker parameters from command line
Before this patch chunker required that there is at least one
data chunk to start checking for a composite object.
Now if chunker finds at least one potential temporary or control
chunk, it marks found files as a suspected composite object.
When later rclone tries a concrete operation on the object,
it performs postponed metadata read and decides: is this a native
composite object, incompatible metadata or just garbage.
This includes an HDFS docker image to use with the integration tests.
Co-authored-by: Ivan Andreev <ivandeex@gmail.com>
Co-authored-by: Nick Craig-Wood <nick@craig-wood.com>
The current authentication scheme works without creating
a public download endpoint for a private bucket as in the B2 official blog.
On the contrary, if the existing authorization header gets duplicated
in the Cloudflare Workers script, one might receive 401 Unauthorized errors.
Before this change the webdav backend didn't truncate Range requests
to the size of the object. Most webdav providers are OK with this (it
is RFC compliant), but it causes 4shared to return 500 internal error.
Because Range requests are used in mounting, this meant that mounting
didn't work for 4shared.
This change truncates the Range request to the size of the object.
See: https://forum.rclone.org/t/cant-copy-use-files-on-webdav-mount-4shared-that-have-foreign-characters/21334/
Before this change, attempting to update an archive tier blob failed
with a 409 error message:
409 This operation is not permitted on an archived blob.
This change detects if we are overwriting a blob and either generates
the error (if `--azureblob-archive-tier-delete` is not set):
can't update archive tier blob without --azureblob-archive-tier-delete
Or deletes the blob first before uploading it again (if
`--azureblob-archive-tier-delete` is set).
Fixes#4819
Before this change if you attempted to list a remote set up with a SAS
URL outside its container then it would crash the Azure SDK.
A check is done to make sure the root is inside the container when
starting the backend which is usually enough, but when two SAS URL
based remotes are mounted in a union, the union backend attempts to
read paths outside the named container. This was causing a mysterious
crash in the Azure SDK.
This fixes the problem by checking to see if the container in the
listing is the one in the SAS URL before listing the directory and
returning directory not found if it isn't.
Before this change when NewObject was called the b2 backend would list
the directory that the object was in in order to find it.
Unfortunately list calls are Class C transactions and cost more.
This patch switches to using HEAD requests instead. These are Class B
transactions. It is then necessary to parse the headers from response
back into the data that we get from the listing. However B2 returns
exactly the same data, just in a different form.
Rclone will use the old directory listing method when looking for
files with versions as these can't be found via a HEAD request.
This change will particularly benefit --files-from, rclone serve
restic but most operations will see some benefit.
Starting September 30th, 2021, the Dropbox OAuth flow will no longer
return long-lived access tokens. It will instead return short-lived
access tokens, and optionally return refresh tokens.
This patch adds the token_access_type=offline parameter which causes
dropbox to return short lived tokens now.
Before this change rclone would upload the whole of multipart files
before receiving a message from dropbox that the path was too long.
This change hard codes the 255 rune limit and checks that before
uploading any files.
Fixes#4805
Before this change, rclone would retry files with filenames that were
too long again and again.
This changed recognises the malformed_path error that is returned and
marks it not to be retried which stops unnecessary retrying of the file.
See #4805
Before this change rclone was using the copy endpoint to copy large objects.
This can fail for large objects with this error:
Error 413: Copy spanning locations and/or storage classes could
not complete within 30 seconds. Please use the Rewrite method
This change makes Copy use the Rewrite method as suggested by the
error message which should be good for any size of copy.
Yandex appears to ignore mime types set as part of the PUT request or
as part of a PATCH request.
The docs make no mention of being able to set a mime type, so set
WriteMimeType=false indicating the backend can't set mime types on
uploaded files.
This is done by making fs.Config private and attaching it to the
context instead.
The Config should be obtained with fs.GetConfig and fs.AddConfig
should be used to get a new mutable config that can be changed.
Before this change, small objects uploaded with SSE-AWS/SSE-C would
not have MD5 sums.
This change adds metadata for these objects in the same way that the
metadata is stored for multipart uploaded objects.
See: #1824#2827
If rclone is configured for server side encryption - either aws:kms or
sse-c (but not sse-s3) then don't treat the ETags returned on objects
as MD5 hashes.
This fixes being able to upload small files.
Fixes#1824
This shouldn't be read as encouraging the use of math/rand instead of
crypto/rand in security sensitive contexts, rather as a safer default
if that does happen by accident.
For some reason the API started returning some integers as strings in
JSON. This is probably OK in Javascript but it upsets Go.
This is easily fixed with the `json:"name,size"` struct tag.
Before this change a circular symlink would cause rclone to error out from the listings.
After this change rclone will skip a circular symlink and carry on the listing,
producing an error at the end.
Fixes#4743
This adds a context.Context parameter to NewFs and related calls.
This is necessary as part of reading config from the context -
backends need to be able to read the global config.
It seems that when doing chunked uploads to onedrive, if the chunks
take more than 3 minutes or so to upload then they may timeout with
error 504 Gateway Timeout.
This change produces an error (just once) suggesting lowering
`--onedrive-chunk-size` or decreasing `--transfers`.
This is easy to replicate with:
rclone copy -Pvv --bwlimit 0.05M 20M onedrive:20M
See: https://forum.rclone.org/t/default-onedrive-chunk-size-does-not-work/20010/
Minor wording change to help for explicit and implicit FTPS flags. More consistent between flags. Add 's' to request because only one 'client' mentioned.
As reported in
https://github.com/rclone/rclone/issues/4660#issuecomment-705502792
After switching to a password callback function, if the ssh connection
aborts and needs to be reconnected then the user is-reprompted for their
password. Instead we now remember the password they entered and just give
that back. We do lose the ability for them to correct mistakes, but that's
the situation from before switching to callbacks. We keep the benefits
of not asking for passwords until the SSH connection succeeds (right
known_hosts entry, for example).
This required a small refactor of how `f := &Fs{}` was built, so we can
store the saved password in the Fs object
Before this change rclone returned the size from the Stat call of the
link. On Windows this reads as 0 always, however on unix it reads as
the length of the text in the link. This caused errors like this when
syncing:
Failed to copy: corrupted on transfer: sizes differ 0 vs 13
This change causes Windows platforms to read the link and use that as
the size of the link instead of 0 which fixes the problem.
Based on Issue 4087
https://github.com/rclone/rclone/issues/4087
Current behaviour is insecure. If the user specifies this value then we
switch to validating the server hostkey and so can detect server changes
or MITM-type attacks.
This allows files to be copied by ID from google drive. These can be
copied to any rclone remote and if the remote is a google drive then
server side copy will be attempted.
Fixes#3625
This type of error is unlikely to be an error that can be resolved by a retry,
and is triggered in #2296 by files with a timestamp before the unix epoch.
The maximum value for the --s3--copy-cutoff should be 5GiB as tested
with AWS S3.
However b2 have implemented this as 5GB rather than 5GiB so having the
default at 5 GiB makes the b2s3 server side copy of a large file by
default.
This patch sets the default to 4768 MiB which is slightly less than
5GB.
This should have very little effect on anything.
If in future rclone can lower this limit more if Copy can multithread.
See: https://forum.rclone.org/t/copying-files-within-a-b2-bucket/16680/76
Before this change the s3 multipart server side copy was not
preserving the metadata of the object. This was most noticeable
because the modtime was not preserved.
This change fetches the metadata from the object before starting the
copy and overwrites it if requires.
It will also mean any other metadata is preserved.
See: https://forum.rclone.org/t/copying-files-within-a-b2-bucket/16680/70
Before this change, when the above backends created a new backend they
didn't put it into the backend cache.
This meant that rc commands acting on those backends did not work.
This was fixed by making sure the backends use the backend cache.
See: https://forum.rclone.org/t/rclone-rc-backend-command-not-working-as-expected/18834
Before this change writing with the all policy deadlocked while
uploading.
This change fixes the problem by fixing the multi reader, closing the
pipes at the correct time with the correct error. This is factored
into a new function as it was used twice.
This patch also adds a new test which tests the all policies.
Before this fix we were reading the hash from the upload using the
string "ETag", however the go runtime normalises the tag into "Etag"
so we were in fact always reading an empty string.
This bug was introduced in
aeea4430d5 swift: efficiency: slim Object and reduce requests on upload
It was spotted by the integration tests.
The fix was just to use the canonical form "Etag" instead of "ETag".
In this commit
a2afa9aadd fs: Add directory to optional Purge interface
We failed to encrypt the directory name so the Purge failed.
This was spotted by the integration tests.
In this commit:
cbf3d43561 drive: fix missing items when listing using --fast-list / ListR
We introduced a bug where under specific circumstances it could cause
a "panic: send on closed channel".
This was caused by:
- rclone engaging the workaround from the commit above
- one of the listing routines returning an error
- this caused the `in` channel to be closed to stop the readers
- however the workaround was recycling stuff into the `in` channel at the time
- hence the panic on closed channel
This fix factors out the sending to the `in` channel into `sendJob`
and calls this both from the master go routine and the list
runners. `sendJob` detects the `in` channel being closed properly and
also deals correctly with contention on the `in` channel.
Fixes#4511
When using `rclone authorize` the hostname doesn't get set in the
config file.
This commit allows it to be set in the configurator and gives the user
a hint that it needs setting.
This reverts part of
151f03378f s3: fix upload of single files into buckets without create permission
This erroneously assumed that a HEAD request on a non existent object
would return "NotFound" if the bucket was found. In fact it returns
"NotFound" when the bucket isn't found also.
This will break the fix for #4297 - however that can be made to work
using the new --s3-assume-bucket-exists flag
Before this change, rclone was looking for the file without the
extension to see if it existed which meant that it never did.
This change checks the destination file exists firsts, before removing
the extension.
Google drive appears to no longer be copying the modification time of
google docs.
Setting the mod time immediately after the copy doesn't work either,
so this patch copies the object, waits for 1 second and then sets the
modtime.
Fixes#4517
This was only working for files in the root directory and wasn't
looking at the encoding.
This is fixed to use NewObject which takes both things into account
and it makes the share by ID instead of by path.
This problem was spotted by the integration tests.
Before this change we errored out if one upstream errored in Purge or
About.
This change checks for fs.ErrorDirNotFound and skips that backend in
this case.
1. adds SharedOptions data structure to oauthutil
2. adds config.ConfigToken option to oauthutil.SharedOptions
3. updates the backends that have oauth functionality
Fixes#2849
After uploading a multipart object, rclone deletes any unused parts.
Probably as part of the listing unification, the detection of the
parts beloning to the current upload was failing and calling Update
was deleting the parts for the current object.
This change fixes the detection and deletes all the old parts but none
of the new ones now.
Fixes#4075