Commit Graph

33 Commits

Author SHA1 Message Date
88e30eecbf march: fix deadlock when using --no-traverse - fixes #8656
This ocurred whenever there were more than 100 files in the source due
to the output channel filling up.

The fix is not to use list.NewSorter but take more care to output the
dst objects in the same order the src objects are delivered. As the
src objects are delivered sorted, no sorting is needed.

In order not to cause another deadlock, we need to send nil dst
objects which is safe since this adjusts the termination conditions
for the channels.

Thanks to @jeremy for the test script the Go tests are based on.
2025-07-04 14:52:28 +01:00
013c563293 lib/transform: add transform library and --name-transform flag
lib/transform adds the transform library, supporting advanced path name
transformations for converting and renaming files and directories by applying
prefixes, suffixes, and other alterations.

It also adds the --name-transform flag for use with sync, copy, and move.

Multiple transformations can be used in sequence, applied in the order they are
specified on the command line.

By default --name-transform will only apply to file names. The means only the leaf
file name will be transformed. However some of the transforms would be better
applied to the whole path or just directories. To choose which which part of the
file path is affected some tags can be added to the --name-transform:

file	Only transform the leaf name of files (DEFAULT)
dir	Only transform name of directories - these may appear anywhere in the path
all	Transform the entire path for files and directories

Example syntax:
--name-transform file,prefix=ABC
--name-transform dir,prefix=DEF
2025-06-04 17:24:07 +01:00
41a407dcc9 march: split src and dst
splits m.key into separate functions for src and dst to prepare for
lib/transform which will want to do transforms on the src side only.

Co-Authored-By: Nick Craig-Wood <nick@craig-wood.com>
2025-06-04 17:24:07 +01:00
5173ca0454 march: fix syncing with a duplicate file and directory
As part of the out of memory syncing code, in this commit

0148bd4668 march: Implement callback based syncing

we changed the syncing method to use a sorted stream of directory
entries.

Unfortunately as part of this change the sort order of files and
directories became undefined.

This meant that if there existed both a file `foo` and a directory
`foo` in the same directory (as is common on object storage systems)
then these could be matched up incorrectly.

They could be matched up correctly like this

- `foo` (directory) - `foo` (directory)
- `foo` (file)      - `foo` (file)

Or incorrectly like this (one of many possibilities)

- no match          - `foo` (file)
- `foo` (directory) - `foo` (directory)
- `foo` (file)      - no match

Just depending on how the input listings were ordered.

This in turn made container based syncing with a duplicated file and
directory name erratic, deleting files when it shouldn't.

This patch ensures that directories always sync before files by adding
a suffix to the sort key depending on whether the entry was a file or
directory.
2025-06-04 10:54:31 +01:00
385465bfa9 sync: implement --list-cutoff to allow on disk sorting for reduced memory use
Before this change, rclone had to load an entire directory into RAM in
order to sort it so it could be synced.

With directories with millions of entries, this used too much memory.

This fixes the probem by using an on disk sort when there are more
than --list-cutoff entries in a directory.

Fixes #7974
2025-04-08 18:02:24 +01:00
0148bd4668 march: Implement callback based syncing
This changes the syncing method to take callbacks for directory
listings rather than being passed the entire directory listing at
once.

This will enable out of memory syncing.
2025-04-08 18:02:24 +01:00
401cf81034 build: modernize Go usage
This commit modernizes Go usage. This was done with:

go run golang.org/x/tools/gopls/internal/analysis/modernize/cmd/modernize@latest -fix -test ./...

Then files needed to be `go fmt`ed and a few comments needed to be
restored.

The modernizations include replacing

- if/else conditional assignment by a call to the built-in min or max functions added in go1.21
- sort.Slice(x, func(i, j int) bool) { return s[i] < s[j] } by a call to slices.Sort(s), added in go1.21
- interface{} by the 'any' type added in go1.18
- append([]T(nil), s...) by slices.Clone(s) or slices.Concat(s), added in go1.21
- loop around an m[k]=v map update by a call to one of the Collect, Copy, Clone, or Insert functions from the maps package, added in go1.21
- []byte(fmt.Sprintf...) by fmt.Appendf(nil, ...), added in go1.19
- append(s[:i], s[i+1]...) by slices.Delete(s, i, i+1), added in go1.21
- a 3-clause for i := 0; i < n; i++ {} loop by for i := range n {}, added in go1.22
2025-02-28 11:31:14 +00:00
8a6fc8535d accounting: fix global error acounting
fs.CountError is called when an error is encountered. The method was
calling GlobalStats().Error(err) which incremented the error at the
global stats level. This led to calls to core/stats with group= filter
returning an error count of 0 even if errors actually occured.

This change requires the context to be provided when calling
fs.CountError. Doing so, we can retrieve the correct StatsInfo to
increment the errors from.

Fixes #5865
2024-09-30 17:20:42 +01:00
88bd80c1fa march: Fix excessive parallelism when using --no-traverse
When using `--no-traverse` the march routines call NewObject on each
potential object in the destination.

The concurrency limiter was accidentally arranged so that there were
`--checkers` * `--checkers` NewObject calls going on at once.

This became obvious when using the sftp backend which used too many
connections.

Fixes #5824
2023-11-20 17:36:31 +00:00
bd787e8f45 filter: Fix incorrect filtering with UseFilter context flag and wrapping backends
In this commit

8d1fff9a82 local: obey file filters in listing to fix errors on excluded files

We started using filters in the local backend so the user could short
circuit troublesome files/directories at a low level.

However this caused a number of integration tests to fail. This turned
out to be in backends wrapping the local backend. For example the
combine backend test failed because it changes the paths passed to the
local backend so they no longer match the paths in the current filter.

To fix this, a new feature flag `FilterAware` was added and the
UseFilter context flag is only passed to backends which support it. As
the wrapping backends don't support the flag, this fixes the problems
in the integration tests.

In future the wrapping backends could modify the active filters to
match the path modifications and then they could set the FilterAware
flag.

See #6376
2022-09-05 16:19:50 +01:00
e43b5ce5e5 Remove github.com/pkg/errors and replace with std library version
This is possible now that we no longer support go1.12 and brings
rclone into line with standard practices in the Go world.

This also removes errors.New and errors.Errorf from lib/errors and
prefers the stdlib errors package over lib/errors.
2021-11-07 11:53:30 +00:00
a2545066e2 drive: constrain list by filter #5023
Google Drive API allows for clauses like "modifiedTime > '2012-06-04T12:00:00'"
in the query param, so the filter flags --max-age and --min-age can be applied
directly at the directory listing phase rather than in a filter.
This is extremely helpful when we want to do an incremental backup of a remote
drive with many files but the number of recently changed file is small.

Co-authored-by: fotile96 <fotile96@users.noreply.github.com>
2021-10-07 22:11:22 +03:00
1773717a47 fs/march: improve errors when root source/destination doesn't exist
See: https://forum.rclone.org/t/rclone-attempts-to-read-files-in-the-destination-directory-when-the-source-doesnt-exist/23412
2021-05-17 16:38:03 +01:00
a223b78872 fs: support multi-threads to head dst object
Signed-off-by: zhuc <zhucan.k8s@gmail.com>
2020-12-02 16:26:37 +00:00
c22d04aa30 filter: deglobalise to put filter config into the context #4685 2020-11-27 17:28:42 +00:00
2e21c58e6a fs: deglobalise the config #4685
This is done by making fs.Config private and attaching it to the
context instead.

The Config should be obtained with fs.GetConfig and fs.AddConfig
should be used to get a new mutable config that can be changed.
2020-11-26 16:40:12 +00:00
899c8e0697 march: added flag to allow Unicode filenames to remain unique
If your filenames contain two near-identical Unicode characters,
rclone will normalize these, making them identical. This flag
gives you the ability to keep them unique. This might
create unintended side effects, such as duplicating files that
contain certain Unicode characters, when downloading them from
certain cloud providers to a macOS filesystem.

Fixes #4228
2020-05-15 12:28:01 +01:00
207474abab sync: add --no-check-dest flag - fixes #3616 2019-12-29 16:47:57 +00:00
75a6c49f87 Fix error counter - fixes #3650
For few commands, RClone counts a error multiple times. This was fixed by
creating a new error type which keeps a flag to remember if the error has
already been counted or not. The CountError function now wraps the original
error eith the above new error type and returns it.
2019-11-18 14:13:02 +00:00
af05e290cf Fix --files-from without --no-traverse doing a recursive scan
In a28239f005 we made --files-from obey --no-traverse.  In the
process this caused --files-from without --no-traverse to do a
complete recursive scan unecessarily.

This was only noticeable in users of fs/march, so sync/copy/move/etc
not in ls/lsf/etc.

This fix makes sure that we use conventional directory listings in
fs/march unless `--files-from` and `--no-traverse` is set or
`--fast-list` is active.

Fixes #3619
2019-10-15 19:51:01 +01:00
6812844b3d march: Fix checking sub-directories when using --no-traverse 2019-08-13 19:30:56 +01:00
57d5de6fba build: fix up package paths after repo move
git grep -l github.com/ncw/rclone | xargs -d'\n' perl -i~ -lpe 's|github.com/ncw/rclone|github.com/rclone/rclone|g'
goimports -w `find . -name \*.go`
2019-07-28 18:47:38 +01:00
bc70bff125 fs/dirtree: factor DirTree out of fs/walk and add tests 2019-07-02 15:26:55 +01:00
cc0800a72e march: return errors when listing dirs
Partially fixes #3172
2019-07-01 15:32:46 +01:00
f78cd1e043 Add context propagation to rclone
- Change rclone/fs interfaces to accept context.Context
- Update interface implementations to use context.Context
- Change top level usage to propagate context to lover level functions

Context propagation is needed for stopping transfers and passing other
request-scoped values.
2019-06-19 11:59:46 +01:00
4b27c6719b fs: Allow sync of a file and a directory with the same name
When sorting fs.DirEntries we sort by DirEntry type and
when synchronizing files let the directories be before objects,
so when the destintation fs doesn't support duplicate names,
we will only lose duplicated object instead of whole directory.

The enables synchronisation to work with a file and a directory of the same name
which is reasonably common on bucket based remotes.
2019-06-09 15:57:05 +01:00
1124c423ee fs: add --ignore-case-sync for forced case insensitivity - fixes #2773 2019-06-03 21:12:10 +01:00
107293c80e copy,move: restore --no-traverse flag
The --no-traverse flag was not implemented when the new sync routines
(using the march package) was implemented.

This re-implements --no-traverse in march by trying to find a match
for each object with NewObject rather than from a directory listing.
2018-12-02 20:28:13 +00:00
e3c4ebd59a march: factor calling parameters into a structure 2018-12-02 18:07:26 +00:00
c5ac96e9e7 Make --files-from only read the objects specified and don't scan directories
Before this change using --files-from would scan all the directories
that the files could possibly be in causing rclone to do more work
that was necessary.

After this change, rclone constructs an in memory tree using the
--fast-list mechanism but from all of the files in the --files-from
list and without scanning any directories.

Any objects that are not found in the --files-from list are ignored
silently.

This mechanism is used for sync/copy/move (march) and all of the
listing commands ls/lsf/md5sum/etc (walk).
2018-10-20 18:13:31 +01:00
d178233e74 sync,march: check the cancel context on every channel send and receive
This fixes a deadlock on sync when all the copying channels receive a
Fatal Error.
2018-05-05 12:58:28 +01:00
80588a5a6b Replace "golang.org/x/net/context" with "context" for go1.7+ #2154 2018-04-07 11:42:08 +01:00
11da2a6c9b Break the fs package up into smaller parts.
The purpose of this is to make it easier to maintain and eventually to
allow the rclone backends to be re-used in other projects without
having to use the rclone configuration system.

The new code layout is documented in CONTRIBUTING.
2018-01-15 17:51:14 +00:00