Before this change if we copied files of unknown size, then they lost
their metadata.
This was particularly noticeable using --s3-decompress.
This change adds metadata to Rcat and RcatSized and changes Copy to
pass the metadata in when it calls Rcat for an unknown sized input.
Fixes#6546
Before this patch the md5all option would skip creating metadata with
hashsum if base filesystem provided md5, in hope to pass it through.
However, if base hash is slow (for example on local fs), chunker passed
slow md5 but never reported this fact in features.
This patch makes chunker snapshot base hashsum in metadata when md5all is
set and base hashsum is slow since chunker was intended to provide only
instant hashsums from the start.
Fixes#5508
Includes adding support for additional size input suffix Mi and MiB, treated equivalent to M.
Extends binary suffix output with letter i, e.g. Ki and Mi.
Centralizes creation of bit/byte unit strings.
Some storage providers e.g. S3 don't have an efficient rename operation.
Before this change, when chunker finished an upload, the server-side copy
and delete operations that renamed temporary chunks to their final names
could take a significant amount of time.
This PR records transaction identifier (versioning) in the metadata of
chunker composite objects striving to remove the need for rename
operations on such backends.
This approach will be triggered be the new "transactions" configuration
option, which can be "rename" (the default) or "norename".
We implement the new approach for uploads (Put operations).
The chunker Move operation still uses the rename operation of
underlying backend. Filling this gap is left for a later PR.
Co-authored-by: Ivan Andreev <ivandeex@gmail.com>
- fix test case FsNewObjectCaseInsensitive (PR #4830)
- continue PR #4917, add comments in metadata detection code
- add warning about metadata detection in user documentation
- change metadata size limits, make room for future development
- hide critical chunker parameters from command line
Note: chunker implements many irrelevant methods (UserInfo, Disconnect etc),
but they are required by TestIntegration/FsCheckWrap and cannot be removed.
Dropped API methods: MergeDirs DirCacheFlush PublicLink UserInfo Disconnect OpenWriterAt
Meta formats:
- renamed old simplejson format to wdmrcompat.
- new simplejson format supports hash sums and verification of chunk size/count.
Change list:
- split-chunking overlay for mailru
- add to all
- fix linter errors
- fix integration tests
- support chunks without meta object
- fix package paths
- propagate context
- fix formatting
- implement new required wrapper interfaces
- also test large file uploads
- simplify options
- user friendly name pattern
- set default chunk size 2G
- fix building with golang 1.9
- fix ci/cd on a separate branch
- fix updated object name (SyncUTFNorm failed)
- fix panic in Box overlay
- workaround: Box rename failed if name taken
- enhance comments in unit test
- fix formatting
- embed wrapped remote rather than inherit
- require wrapped remote to support move (or copy)
- implement 3 (keep fstest)
- drop irrelevant file system interfaces
- factor out Object.mainChunk
- refactor TestLargeUpload as InternalTest
- add unit test for chunk name formats
- new improved simplejson meta format
- tricky case in test FsIsFile (fix+ignore)
- remove debugging print
- hide temporary objects from listings
- fix bugs in chunking reader:
- return EOF immediately when all data is sent
- handle case when wrapped remote puts by hash (bug detected by TestRcat)
- chunked file hashing (feature)
- server-side copy across configs (feature)
- robust cleanup of temporary chunks in Put
- linear download strategy (no read-ahead, feature)
- fix unexpected EOF in the box multipart uploader
- throw error if destination ignores data