When doing a multipart upload or copy, if a InvalidBlobOrBlock error
is received, it can mean that there are uncomitted blocks from a
previous failed attempt with a different length of ID.
This patch makes rclone attempt to clear the uncomitted blocks and
retry if it receives this error.
Before this change, if a multipart upload was aborted, then rclone
would leave uncommitted blocks lying around. Azure has a limit of
100,000 uncommitted blocks per storage account, so when you then try
to upload other stuff into that account, or simply the same file
again, you can run into this limit. This causes errors like the
following:
BlockCountExceedsLimit: The uncommitted block count cannot exceed the
maximum limit of 100,000 blocks.
This change removes the uncommitted blocks if a multipart upload is
aborted or fails.
If there was an existing destination file, it takes care not to
overwrite it by recomitting already comitted blocks.
This means that the scheme for allocating block IDs had to change to
make them different for each block and each upload.
Fixes#5583
It was reported that rclone copy occasionally uploaded corrupted data
to azure blob.
This turned out to be a race condition updating the block count which
caused blocks to be duplicated.
This bug was introduced in this commit in v1.64.0 and will be fixed in v1.65.2
0427177857 azureblob: implement OpenChunkWriter and multi-thread uploads #7056
This race only seems to happen if `--checksum` is used but can happen otherwise.
Unfortunately Azure blob does not check the MD5 that we send them so
despite sending incorrect data this corruption is not detected. The
corruption is detected when rclone tries to download the file, so
attempting to copy the files back to local disk will result in errors
such as:
ERROR : file.pokosuf5.partial: corrupted on transfer: md5 hash differ "XXX" vs "YYY"
This adds a check to test the blocklist we upload is as we expected
which would have caught the problem had it been in place earlier.
This implements the OpenChunkWriter interface for azureblob which
enables multi-thread uploads.
This makes the memory controls of the s3 backend inoperative; they are
replaced with the global ones.
--azureblob-memory-pool-flush-time
--azureblob-memory-pool-use-mmap
By using the buffered reader this fixes excessive memory use when
uploading large files as it will share memory pages between all
readers.
This commit switches from using the old Azure go modules
github.com/Azure/azure-pipeline-go/pipeline
github.com/Azure/azure-storage-blob-go/azblob
github.com/Azure/go-autorest/autorest/adal
To the new SDK
github.com/Azure/azure-sdk-for-go/
This stops rclone using deprecated code and enables the full range of
authentication with Azure.
See #6132 and #5284