In c5ac96e9e7 we made --files-from only read the objects specified and
not scan directories.
This caused problems with Google Drive (very slow) and B2
(excessive API consumption) so it was decided to make the old
behaviour (traversing the directories) the default with --files-from
and use the existing --no-traverse flag (which has exactly the right
semantics) to enable the new non-scanning behaviour.
See: https://forum.rclone.org/t/using-files-from-with-drive-hammers-the-api/8726
Fixes #3102
Fixes #3095
Before the fix we were only de-duping the ListR batches.
Afterwards we dedupe everything.
This will have the consequence that rclone uses more memory as it will
build a map of all the directory names, not just the names in a given
directory.
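A minimal sketch of de-duping across every batch rather than within
each one. The `deduper` type and its method are hypothetical names used
for illustration, not the code in this commit.

```go
package main

import "fmt"

// deduper remembers every name seen so far across all batches.
// Hypothetical type for illustration only.
type deduper struct {
	seen map[string]struct{}
}

func newDeduper() *deduper {
	return &deduper{seen: make(map[string]struct{})}
}

// filter returns the names in batch which have not been seen in this
// or any earlier batch, preserving their order.
func (d *deduper) filter(batch []string) []string {
	var out []string
	for _, name := range batch {
		if _, dup := d.seen[name]; dup {
			continue
		}
		d.seen[name] = struct{}{}
		out = append(out, name)
	}
	return out
}

func main() {
	d := newDeduper()
	fmt.Println(d.filter([]string{"a", "b", "a"}))
	// "b" is dropped here because it was seen in the first batch.
	fmt.Println(d.filter([]string{"b", "c"}))
}
```

The map grows with the total number of names, which is the memory cost
mentioned above.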
This dramatically increases the speed (7x in my tests) of the de-dupe
as Google Drive supports ListR directly and dedupe did not work with
`--fast-list`.
Fixes #2902
This will increase speed for backends which support ListR and will not
have the memory overhead of using --fast-list.
It also means that errors are queued until the end so as much of the
remote will be listed as possible before returning an error.
Commands affected are:
- lsf
- ls
- lsl
- lsjson
- lsd
- md5sum/sha1sum/hashsum
- size
- delete
- cat
- settier
It otherwise has nearly the same interface as walk.Walk which it
will fall back to if it can't use ListR.
Using walk.ListR will speed up file system operations by default. It
uses much less memory and starts immediately, compared to supplying
--fast-list.
Make the pacer package more flexible by extracting the pace calculation
functions into a separate interface. This also allows moving features
that require the fs package like logging and custom errors into the fs
package.
Also add a RetryAfterError sentinel error that can be used to signal a
desired retry time to the Calculator.
This brings it up to par with lsjson.
This commit also reworks the framework to use ListJSON internally
which removes duplicated code and makes testing easier.
This puts a shim on the reader opened by Copy so that if an error is
returned, the reader is re-opened at the correct seek point.
This should make downloading very large files more reliable.
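A minimal sketch of the re-opening idea, assuming a hypothetical
`openFunc` which can open the source at a byte offset. The retry limit
and the `flaky` reader are illustration only.

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"strings"
)

// openFunc opens the source starting at the given byte offset.
// Hypothetical signature for illustration.
type openFunc func(offset int64) (io.Reader, error)

// reopenReader re-opens the source at the current offset when a read
// fails with anything other than io.EOF.
type reopenReader struct {
	open    openFunc
	rc      io.Reader
	offset  int64
	retries int
}

func newReopenReader(open openFunc, maxRetries int) (*reopenReader, error) {
	rc, err := open(0)
	if err != nil {
		return nil, err
	}
	return &reopenReader{open: open, rc: rc, retries: maxRetries}, nil
}

func (r *reopenReader) Read(p []byte) (int, error) {
	for {
		n, err := r.rc.Read(p)
		r.offset += int64(n)
		if err == nil || err == io.EOF || r.retries <= 0 {
			return n, err
		}
		r.retries--
		rc, openErr := r.open(r.offset)
		if openErr != nil {
			return n, err // give up and report the original error
		}
		r.rc = rc
		if n > 0 {
			return n, nil
		}
	}
}

// flaky fails with an error once limit bytes have been read.
type flaky struct {
	in    io.Reader
	limit int
}

func (f *flaky) Read(p []byte) (int, error) {
	if f.limit <= 0 {
		return 0, errors.New("connection reset")
	}
	if len(p) > f.limit {
		p = p[:f.limit]
	}
	n, err := f.in.Read(p)
	f.limit -= n
	return n, err
}

// demo reads a string through a source which breaks after 5 bytes.
func demo() (string, error) {
	const data = "hello, world"
	opens := 0
	open := func(offset int64) (io.Reader, error) {
		opens++
		r := strings.NewReader(data[offset:])
		if opens == 1 {
			return &flaky{in: r, limit: 5}, nil
		}
		return r, nil
	}
	rr, err := newReopenReader(open, 3)
	if err != nil {
		return "", err
	}
	out, err := io.ReadAll(rr)
	return string(out), err
}

func main() {
	fmt.Println(demo())
}
```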
This replaces the `sync.Pool` allocator with lib/pool. This
implements a pool of buffers of up to 64MB which can be re-used but is
flushed every 5 seconds.
If `--use-mmap` is set then rclone will use mmap for memory
allocations which is much better at returning memory to the OS.
* drive: don't run teamdrive config if auto confirm set
* onedrive: don't run extra config if auto confirm set
* make Confirm results customisable by config
Fixes #1010
This fixes several things wrong with the layout of the stats.
Transfers which haven't started are printed in the same format as
those which have so the stats with `--progress` don't show horrible
artifacts.
Checkers and transfers now get a ": checkers" and ": transfers" label
on the end of the stats line. Once a transfer has started, the line
shows the transfer stats instead of the label.
There was a bug in the routine which shortened the file names (it
always produced strings 1 too long). This is now fixed with a test.
The formatting string was wrong with a fixed width of 45 - this is now
replaced with the value of `--stats-file-name-length`.
This also meant that there were unnecessary leading spaces in the file
names. So the default `--stats-file-name-length` was raised to 45
from 40.
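A sketch of shortening a name to an exact number of runes and padding
it with a width taken from a variable instead of a fixed one. The
function name `shortenName` and the choice to keep both ends of the
name are assumptions for illustration.

```go
package main

import "fmt"

// shortenName returns in cut down to at most size runes, replacing
// the middle with a single ellipsis rune. Names which already fit
// (or a size below 1) are returned unchanged.
func shortenName(in string, size int) string {
	name := []rune(in)
	if size < 1 || len(name) <= size {
		return in
	}
	keep := size - 1 // leave room for the ellipsis rune
	suffixLen := keep / 2
	prefixLen := keep - suffixLen
	out := make([]rune, 0, size)
	out = append(out, name[:prefixLen]...)
	out = append(out, '…')
	out = append(out, name[len(name)-suffixLen:]...)
	return string(out)
}

func main() {
	width := 8 // stands in for the value of --stats-file-name-length
	name := shortenName("a-very-long-file-name.txt", width)
	// %*s takes the field width from the argument list.
	fmt.Printf("[%*s]\n", width, name)
}
```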
Cookies are handled in memory by cookiejar in the fshttp module for
the entire session.
One useful scenario is an HTTP storage system where the index server
adds an authentication cookie while redirecting to a CDN for the actual
files.
Also, it can be helpful to reuse fshttp in other storage systems
which require cookies.
This means that rclone will pick up tokens from concurrently running
rclones. This helps for Box which only allows each refresh token to
be used once.
Without this fix, rclone caches the refresh token at the start of the
run, then when the token expires the refresh token may have been used
already by a concurrently running rclone.
This will also retry the oauth up to 5 times at 1 second intervals.
See: https://forum.rclone.org/t/box-token-refresh-timing/8175
Before this change rclone would read the list of files from the
files-from parameter and check they existed one at a time. This could
take a very long time for lots of files.
After this change, rclone will check up to --checkers in parallel.
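A minimal sketch of checking a list of names with a bounded number of
workers. The `checkAll` function and its `exists` callback are
hypothetical stand-ins for the real lookup.

```go
package main

import (
	"fmt"
	"sync"
)

// checkAll runs exists for every name using at most checkers
// goroutines and returns the names which exist, in input order.
func checkAll(names []string, checkers int, exists func(string) bool) []string {
	if checkers < 1 {
		checkers = 1
	}
	in := make(chan int)
	found := make([]bool, len(names))
	var wg sync.WaitGroup
	for w := 0; w < checkers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := range in {
				// Each index is written by exactly one worker.
				found[i] = exists(names[i])
			}
		}()
	}
	for i := range names {
		in <- i
	}
	close(in)
	wg.Wait()
	var out []string
	for i, ok := range found {
		if ok {
			out = append(out, names[i])
		}
	}
	return out
}

func main() {
	present := map[string]bool{"a.txt": true, "c.txt": true}
	names := []string{"a.txt", "b.txt", "c.txt"}
	fmt.Println(checkAll(names, 8, func(name string) bool { return present[name] }))
}
```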
The --no-traverse flag was not implemented when the new sync routines
(using the march package) were implemented.
This re-implements --no-traverse in march by trying to find a match
for each object with NewObject rather than from a directory listing.
Before this change TestPurge would remove a container and subsequent
tests would fail because the container was still being deleted so
couldn't be created.
This was fixed by introducing an fstest.NewRunIndividual() test runner
for TestPurge which causes the test to be run on a new container.
If `--rc-user` or `--rc-pass` is set then the URL that is opened with
`--rc-files` will have the authorization in the URL in the
`http://user:pass@localhost/` style.
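Building a URL in this style with the standard library might look like
the following sketch. The `rcURL` helper is a hypothetical name.

```go
package main

import (
	"fmt"
	"net/url"
)

// rcURL builds the URL to open, embedding the credentials if either
// the user or the password is set.
func rcURL(host, user, pass string) string {
	u := url.URL{Scheme: "http", Host: host, Path: "/"}
	if user != "" || pass != "" {
		u.User = url.UserPassword(user, pass)
	}
	return u.String()
}

func main() {
	fmt.Println(rcURL("localhost", "user", "pass"))
	fmt.Println(rcURL("localhost", "", ""))
}
```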
Before this change, Purge on the fallback path would try to delete
directories starting from the root rather than the dir passed in.
Rmdirs would also attempt to delete the root.
Before this change using --files-from would scan all the directories
that the files could possibly be in causing rclone to do more work
than was necessary.
After this change, rclone constructs an in-memory tree using the
--fast-list mechanism but from all of the files in the --files-from
list and without scanning any directories.
Any objects that are not found in the --files-from list are ignored
silently.
This mechanism is used for sync/copy/move (march) and all of the
listing commands ls/lsf/md5sum/etc (walk).
Use the same function to join the root paths for the wrapping remotes
alias, cache and crypt.
The new function fspath.JoinRootPath is equivalent to path.Join, but if
the first non-empty element starts with "//", this is preserved to allow
Windows network paths to be used in these remotes.
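The behaviour described can be sketched roughly as below. This
`joinRootPath` is an illustration written from the description above
and may differ from the real fspath.JoinRootPath.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// joinRootPath joins the elements like path.Join, but keeps a leading
// "//" on the first non-empty element.
func joinRootPath(elem ...string) string {
	for i, e := range elem {
		if e == "" {
			continue
		}
		joined := path.Join(elem[i:]...)
		if strings.HasPrefix(e, "//") {
			// path.Join collapses "//" to "/", so put one back.
			return "/" + joined
		}
		return joined
	}
	return ""
}

func main() {
	fmt.Println(joinRootPath("//server/share", "dir"))
	fmt.Println(joinRootPath("", "a", "b"))
}
```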
Before this change remotes without server side Move (eg swift, s3,
gcs) would not be able to rename files.
After this change nearly all remotes will be able to rename files on
rclone mount with the notable exceptions of b2 and yandex.
This change checks to see if the remote can do Move or Copy then
calls `operations.Move` to do the actual move. This will do a server
side Move or Copy but won't download and re-upload the file.
It also checks to see if the destination exists first which avoids
conflicts or duplicates.
Fixes #1965
Fixes #2569
Before this change the moving average for the individual file stats
would start at 0 and only converge to the correct value over 15-30
seconds.
This change starts the weighting period as 1 and moves it up once per
sample which gets the average to a better value instantly.
Prior to this change the package used for this,
github.com/VividCortex/ewma, took a 0 average to mean reset the
statistics. This happens quite often when transferring files through a
buffer.
Replace that implementation with a simple home grown one (with about
the same constant), without that feature.
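A sketch of a moving average whose weighting period starts at 1 and
grows by one per sample up to a maximum. The type and field names are
hypothetical, and a 0 sample is treated like any other value.

```go
package main

import "fmt"

// average is a moving average with a warm-up: the weighting period
// starts at 1 and rises by one per sample until it reaches maxPeriod.
type average struct {
	value     float64
	period    int
	maxPeriod int
}

// add folds a new sample into the average.
func (a *average) add(sample float64) {
	if a.period < a.maxPeriod {
		a.period++
	}
	n := float64(a.period)
	a.value = (a.value*(n-1) + sample) / n
}

func main() {
	a := &average{maxPeriod: 15}
	for _, s := range []float64{100, 100, 0, 100} {
		a.add(s)
		fmt.Printf("%.2f\n", a.value)
	}
}
```

The first sample sets the average directly, so there is no slow climb
from 0.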
This change allows remotes to be created on the fly without a config
file by using the remote type prefixed with a : as the remote name, eg
:s3: to make an s3 remote.
This assumes the user is supplying the backend config via command line
flags or environment variables.
The race detector currently detects a race with len(chan) against
close(chan).
See: https://github.com/golang/go/issues/27070
Skip the tests which trip this bug under the race detector.
--max-backlog controls the queue length.
Add statistics for the check/upload/rename queues.
This means that checking can complete before the uploads which will
give rclone the ability to show exactly what is outstanding.
This unifies the 3 methods of reading config
* command line
* environment variable
* config file
And allows them all to be configured in all places. This is done by
making the []fs.Option in the backend registration be the master
source of what the backend options are.
The backend changes are:
* Use the new configmap.Mapper parameter
* Use configstruct to parse it into an Options struct
* Add all config to []fs.Option including defaults and help
* Remove all uses of pflag
* Remove all uses of config.FileGet
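One way to picture a unified lookup is a chain of getters tried in
turn. The sketch below is an illustration only: the `Getter` shape, the
`chain` type and the precedence order (command line, then environment,
then config file, then defaults) are all assumptions.

```go
package main

import "fmt"

// Getter looks up a config value by key (assumed interface).
type Getter interface {
	Get(key string) (value string, ok bool)
}

// simple is a Getter backed by a map.
type simple map[string]string

func (s simple) Get(key string) (string, bool) {
	v, ok := s[key]
	return v, ok
}

// chain tries each Getter in order and returns the first value found.
type chain []Getter

func (c chain) Get(key string) (string, bool) {
	for _, g := range c {
		if v, ok := g.Get(key); ok {
			return v, true
		}
	}
	return "", false
}

func main() {
	flags := simple{"chunk_size": "16M"}
	env := simple{"chunk_size": "8M", "region": "eu-west-1"}
	file := simple{"region": "us-east-1", "endpoint": "example.com"}
	defaults := simple{"chunk_size": "5M", "acl": "private"}

	// Assumed precedence: flags > environment > config file > defaults.
	cfg := chain{flags, env, file, defaults}
	for _, key := range []string{"chunk_size", "region", "endpoint", "acl"} {
		v, _ := cfg.Get(key)
		fmt.Println(key, "=", v)
	}
}
```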
Before this change if rclone was running in an environment which
couldn't find the HOME directory, it would print a warning about
supplying a --config flag even if the user had done so.
Before this change copyto would parse Windows paths incorrectly.
This change moves the parsing code into fspath and makes sure
fspath.Split calls fspath.Parse which does the parsing correctly for
This also renames fspath.RemoteParse to fspath.Parse for consistency.
The --one-way argument will check that all files on the source match the
files on the destination, but not the other way around. For example, files
present on the destination but not on the source will not trigger an error.
Fixes: #1526
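A minimal sketch of the one-way comparison, using maps of name to hash
as a stand-in for the two remotes. The `check` function and its message
strings are hypothetical.

```go
package main

import (
	"fmt"
	"sort"
)

// check compares src against dst and returns a sorted list of
// problems. With oneWay set, files only present on dst are ignored.
func check(src, dst map[string]string, oneWay bool) []string {
	var problems []string
	for name, hash := range src {
		dstHash, ok := dst[name]
		switch {
		case !ok:
			problems = append(problems, name+": missing on destination")
		case dstHash != hash:
			problems = append(problems, name+": differs")
		}
	}
	if !oneWay {
		for name := range dst {
			if _, ok := src[name]; !ok {
				problems = append(problems, name+": missing on source")
			}
		}
	}
	sort.Strings(problems)
	return problems
}

func main() {
	src := map[string]string{"a.txt": "h1"}
	dst := map[string]string{"a.txt": "h1", "extra.txt": "h2"}
	fmt.Println(len(check(src, dst, false)), "problem(s) without --one-way")
	fmt.Println(len(check(src, dst, true)), "problem(s) with --one-way")
}
```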
A deadlock could occur since we have now put a mutex on GetBytes from
StatsInfo.String (s.mu) - progress (acc.statmu) and read (acc.statmu)
- GetBytes (s.mu).
Fix this by giving stringSet its own locking and excluding the call
which caused the deadlock from the mutex in StatsInfo.String.
- make Close permanent and return errors afterwards
- use RangeSeek from the wrapped reader if present
- add a limit to chunk growth
- correct RangeSeek interface behavior
- add tests
Unfortunately this commit attempts to create every directory rather
than just the empty ones, so will need re-working.
Removing this feature for the 1.41 release
This reverts commit 0daced29db.
Somehow in the code reorganisation of
11da2a6c9b the check for --min-age and
--max-age got switched around. This commit fixes that and means you
can use --min-age and --max-age together.
* Implement about for:
* local, crypt, cache, drive, swift, hubic, onedrive, pcloud, dropbox
* Implement `--json` and `--full` flag for `rclone about`
* change About interface to return a Usage structure
* Remove operations.About as it is too thin an interface
* Implement Integration test
Relates to #1138 and #1564
This introduces a method of making provider specific configuration
within a remote. This is useful particularly in s3.
This commit does the basic configuration in S3 for IBM COS.
This problem was introduced with eca99b33c0. It seems Box is the only
remote which converts time zones, so if you give it a GMT time zone,
it returns a PST time zone which represents the same instant.
This implements a remote control protocol activated with the --rc flag
and a new command `rclone rc` to use that interface.
Still to do
* docs - need finishing
* tests
This is a problem when syncing a file which just needed its modtime
set with dropbox which can't set the mod time of a file without
re-uploading it.
Before this change we would delete the file, then the server side move
would fail moving the file to the backup-dir because it no longer
existed.
After this change the destination file is moved to the backup-dir
instead of being deleted and the new file is uploaded.
Fixes #2134
This removes the old system of part accounting and replaces it with a
system of popping off the accounting reader and wrapping up new ones
as necessary.
This makes it much easier to carry the context down the chain of
wrapped readers and get the limiting as near as possible to the
output. This makes the accounting more accurate and the bandwidth
limiting smoother.
Fixes #2029 and Fixes #1443
A Range request can never request 0 bytes however this change was made
to make a clearer signal that the limit means read to the end.
Add test and more documentation and fixup uses
The purpose of this is to make it easier to maintain and eventually to
allow the rclone backends to be re-used in other projects without
having to use the rclone configuration system.
The new code layout is documented in CONTRIBUTING.
Before this if the client_id/client_secret was edited it would
disappear when asking for the new token.
This means the post config is done after the user has confirmed the
config is OK which can't be helped.
RepeatableReaderSized has a pre-allocated buffer which should help
with memory usage - before it grew the buffer. Since we know the size
of the chunks, pre-allocating it should be much more efficient.
RepeatableReaderBuffer uses the buffer passed in.
RepeatableLimit* are convenience functions for wrapping a reader in
an io.LimitReader and then a RepeatableReader with the same buffer
size.
Now --dump-flag is written as --dump flag. This is a comma separated list which can contain
* headers - HTTP headers as before
* bodies - HTTP bodies as before
* requests - HTTP request bodies
* responses - HTTP response bodies
* auth - HTTP auth
* filters - Filter regexps
Leave --dump-headers and --dump-bodies for the time being but remove
the other --dump-* flags as they aren't used very often.
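Parsing a comma separated list like this into a set of bit flags could
look like the sketch below. The `dumpFlags` type and `parseDump`
function are hypothetical names. Only the flag names come from the
list above.

```go
package main

import (
	"fmt"
	"strings"
)

// dumpFlags is a bit set of the things to dump.
type dumpFlags uint

const (
	dumpHeaders dumpFlags = 1 << iota
	dumpBodies
	dumpRequests
	dumpResponses
	dumpAuth
	dumpFilters
)

var dumpNames = map[string]dumpFlags{
	"headers":   dumpHeaders,
	"bodies":    dumpBodies,
	"requests":  dumpRequests,
	"responses": dumpResponses,
	"auth":      dumpAuth,
	"filters":   dumpFilters,
}

// parseDump turns a comma separated list into a dumpFlags value.
func parseDump(s string) (dumpFlags, error) {
	var flags dumpFlags
	if s == "" {
		return flags, nil
	}
	for _, part := range strings.Split(s, ",") {
		name := strings.TrimSpace(part)
		f, ok := dumpNames[name]
		if !ok {
			return 0, fmt.Errorf("unknown dump flag %q", name)
		}
		flags |= f
	}
	return flags, nil
}

func main() {
	flags, err := parseDump("headers,auth")
	fmt.Println(flags&dumpHeaders != 0, flags&dumpBodies != 0, err)
}
```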
This was leaking goroutines in the short file case because it wasn't
calling Close() on the Account object. This became apparent when
testing with mount.
Previously config sub commands were manually parsed rather than using
cobra.
Make config command have the following sub commands:
* create Create a new remote with name, type and options.
* delete Delete an existing remote <name>.
* dump Dump the config file as JSON.
* edit Enter an interactive configuration session.
* file Show path of configuration file in use.
* providers List in JSON format all the providers and options.
* show Print (decrypted) config file, or the config for a single remote.
* update Update options in an existing remote.
The following changes were made to existing commands
* listproviders was renamed to providers
* listoptions was removed in favour of providing the output in providers
* jsonconfig was renamed to create
* an optional parameter was added to the show command