80 Commits

Author SHA1 Message Date
Nick Craig-Wood
2a817e21cb vfs: fix excess CPU used by VFS cache cleaner looping
Before this change the VFS cache cleaner would loop indefinitely while
the cache was above quota. This used up all the CPU.

This fix prevents the cache cleaner from looping. It will be kicked on
ENOSPC and run at its scheduled time otherwise so this should be
sufficient.

See: https://forum.rclone.org/t/vfs-keeps-checking-same-files/32120
2022-08-04 10:19:47 +01:00
Nick Craig-Wood
a07d376fb1 vfs: reduce memory usage by re-ordering commonly used structures 2022-08-04 10:19:47 +01:00
Nick Craig-Wood
e749bc58f4 vfs: reduce memory use by embedding sync.Cond 2022-08-04 10:19:47 +01:00
Nick Craig-Wood
bc705e14d8 vfscache: fix fatal error: sync: unlock of unlocked mutex error
This message is a double panic and was actually caused by an assertion
panic in:

vfs/vfscache/downloaders/downloaders.go

This is triggered by the code added relatively recently to fix a bug
with renaming files:

ec72432cec vfs: fix failed to _ensure cache internal error: downloaders is nil error

So it appears that item.o may be nil at this point.

This patch detects item.o being nil and fetches it again with NewObject.

Fixes #6190 Fixes #6235
2022-06-21 14:28:53 +01:00
albertony
ec117593f1 Fix lint issues reported by staticcheck
Used staticcheck 2022.1.2 (v0.3.2)

See: staticcheck.io
2022-06-13 21:13:50 +02:00
albertony
0e77072dcc vfs: error strings should not be capitalized 2022-05-16 12:43:43 +02:00
albertony
2437eb3cce vfs: fix incorrect detection of root in parent directory utility function
Unlike path.Dir, filepath.Dir returns os.PathSeparator instead of a slash
when the path consists entirely of separators.

Also fixed casing of the function name, use OS in all caps instead of Os
as recommended here: https://github.com/golang/go/wiki/CodeReviewComments#initialisms
2022-05-16 12:43:43 +02:00
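The root detection described in 2437eb3cce can be sketched roughly as follows. This is a minimal illustration, not rclone's actual code: the name osFindParent is hypothetical, and the only library behaviour relied on is that filepath.Dir returns "." for a bare name and a single os.PathSeparator for a path made up entirely of separators.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// osFindParent is a hypothetical helper for this example only.
// It returns the parent directory of an OS-native path, or "" when
// the path has no parent (a bare name or the root).
func osFindParent(name string) string {
	parent := filepath.Dir(name)
	// Compare against os.PathSeparator rather than a hard-coded "/",
	// so the root is also detected where the separator is a backslash.
	if parent == "." || parent == string(os.PathSeparator) {
		return ""
	}
	return parent
}

func main() {
	fmt.Printf("%q\n", osFindParent(filepath.Join("dir", "file.txt")))
	fmt.Printf("%q\n", osFindParent("file.txt"))
	fmt.Printf("%q\n", osFindParent(string(os.PathSeparator)))
}
```

Comparing against "/" alone would miss the root on Windows, which appears to be the kind of mistake the commit describes.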
Nick Craig-Wood
d4da9b98d6 vfs: add --vfs-fast-fingerprint for less accurate but faster fingerprints 2022-03-22 16:33:24 +00:00
Nick Craig-Wood
ec72432cec vfs: fix failed to _ensure cache internal error: downloaders is nil error
This error was caused by renaming an open file.

When the file was renamed in the cache, the downloaders were cleared,
however the downloaders were not re-opened when needed again, instead
this error was generated.

This fix re-opens the downloaders if they have been closed by renaming
the file.

Fixes #5984
2022-03-03 17:43:29 +00:00
Bumsu Hyeon
4b99e84242 vfs/cache: fix handling of special characters in file names (#5875) 2022-01-13 13:23:25 +01:00
Nick Craig-Wood
d252816706 vfs: add vfs/stats remote control to show statistics - fixes #5816 2021-11-23 18:00:21 +00:00
Nick Craig-Wood
e43b5ce5e5 Remove github.com/pkg/errors and replace with std library version
This is possible now that we no longer support go1.12 and brings
rclone into line with standard practices in the Go world.

This also removes errors.New and errors.Errorf from lib/errors and
prefers the stdlib errors package over lib/errors.
2021-11-07 11:53:30 +00:00
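As a rough illustration of what the move in e43b5ce5e5 from github.com/pkg/errors to the standard library looks like in practice, the sketch below wraps an error with fmt.Errorf and the %w verb and tests for it with errors.Is. The names errNotInCache, openItem and isNotInCache are hypothetical and do not come from rclone.

```go
package main

import (
	"errors"
	"fmt"
)

// errNotInCache is a hypothetical sentinel error for this example.
var errNotInCache = errors.New("item not in cache")

// openItem is a hypothetical function that adds context to the
// sentinel error using the %w verb, which keeps it unwrappable.
func openItem(name string) error {
	return fmt.Errorf("failed to open %q: %w", name, errNotInCache)
}

// isNotInCache reports whether err wraps errNotInCache anywhere
// in its chain.
func isNotInCache(err error) bool {
	return errors.Is(err, errNotInCache)
}

func main() {
	err := openItem("a.txt")
	fmt.Println(err)
	fmt.Println(isNotInCache(err))
}
```

The %w verb and errors.Is were added in go1.13, which fits the commit's note that dropping go1.12 support made the change possible.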
Atílio Antônio
c08d48a50d docs: improve grammar and fix typos (#5361)
This alters some comments in source files, but mainly concerns documentation files and help messages.
2021-11-04 12:50:43 +01:00
albertony
f3e71f129c config: convert --cache-dir value to an absolute path 2021-10-11 15:08:35 +02:00
albertony
fbc7f2e61b lib/file: improve error message when attempting to create dir on nonexistent drive on windows
This replaces built-in os.MkdirAll with a patched version that stops the recursion
when reaching the volume part of the path. The original version would continue recursion,
and for extended length paths end up with \\? as the top-level directory, and the error
message would then be something like:
mkdir \\?: The filename, directory name, or volume label syntax is incorrect.
2021-10-01 23:18:39 +02:00
albertony
3a2f748aeb vfs: ensure names used in cache path are legal on current os
Fixes #5360
2021-08-19 20:14:50 +02:00
albertony
18be4ad10d vfs: fix issue where empty dirs would build up in cache meta dir 2021-08-19 20:14:50 +02:00
Nick Craig-Wood
c86a55c798 vfs: fix duplicates on rename - fixes #5469
Before this change, if there was an existing file being uploaded when
a file was renamed on top of it, then both would be uploaded. This
causes a duplicate in Google Drive as both files get uploaded at the
same time. This was triggered reliably by LibreOffice saving doc
files.

This fix removes any duplicates in the upload queue on rename.
2021-07-30 19:31:02 +01:00
Nick Craig-Wood
16d1da2c1e vfs: remove item.metaDirty as it was confusing and not used
See discussion in #5277
2021-04-28 09:33:22 +01:00
Nick Craig-Wood
00a0ee1899 vfs: fix modtime changing when reading file into cache - fixes #5277
Before this change but after:

aea8776a43 vfs: fix modtimes not updating when writing via cache #4763

When a file was opened read-only the modtime was read from the cached
file. However this modtime wasn't correct leading to an incorrect
result.

This change fixes the definition of `item.IsDirty` to be true only
when the data is dirty. This fixes the problem as a read only file
isn't considered dirty.
2021-04-28 09:33:22 +01:00
albertony
2925e1384c Use binary prefixes for size and rate units
Includes adding support for additional size input suffix Mi and MiB, treated equivalent to M.
Extends binary suffix output with letter i, e.g. Ki and Mi.
Centralizes creation of bit/byte unit strings.
2021-04-27 02:25:52 +03:00
Leo Luan
8f23cae1c0 vfs: Add cache reset for --vfs-cache-max-size handling at cache poll interval
The vfs-cache-max-size parameter is probably confusing to many users.
The cache cleaner checks cache size periodically at the --vfs-cache-poll-interval
(default 60 seconds) interval and removes cache items in the following order.

(1) cache items that are not in use and with age > vfs-cache-max-age
(2) if the cache space used at this time still is larger than
vfs-cache-max-size, the cleaner continues to remove cache items that are
not in use.

The cache cleaning process does not remove cache items that are currently in use.
If the total space consumed by in-use cache items exceeds vfs-cache-max-size, the
periodical cache cleaner thread does not do anything further and leaves the in-use
cache items alone with a total space larger than vfs-cache-max-size.

A cache reset feature was introduced in 1.53 which resets in-use (but not dirty,
i.e., not being updated) cache items when additional cache data incurs an ENOSPC
error.  But this code was not activated in the periodical cache cleaning thread.

This patch adds the cache reset step in the cache cleaner thread during cache
poll to reset cache items until the total size of the remaining cache items is
below vfs-cache-max-size.
2021-04-26 17:55:52 +01:00
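The cleaning order described in 8f23cae1c0 can be modelled in a few lines. This is a simplified sketch under stated assumptions, not the rclone implementation: the item type, the clean function and the sizes in example are all hypothetical, and a "reset" is modelled simply as setting the item's size to zero.

```go
package main

import (
	"fmt"
	"sort"
	"time"
)

// item is a simplified, hypothetical stand-in for a cache item.
type item struct {
	name     string
	size     int64
	lastUsed time.Time
	inUse    bool
	dirty    bool
}

// clean applies the three steps in order and returns the items that
// survive together with the space they still use.
func clean(items []*item, maxAge time.Duration, maxSize int64, now time.Time) ([]*item, int64) {
	// Work on a copy sorted oldest first (LRU order).
	sorted := append([]*item(nil), items...)
	sort.Slice(sorted, func(i, j int) bool {
		return sorted[i].lastUsed.Before(sorted[j].lastUsed)
	})
	var used int64
	for _, it := range sorted {
		used += it.size
	}

	// Step 1: remove items that are not in use and older than maxAge.
	var afterAge []*item
	for _, it := range sorted {
		if !it.inUse && now.Sub(it.lastUsed) > maxAge {
			used -= it.size
			continue
		}
		afterAge = append(afterAge, it)
	}

	// Step 2: while over quota, remove further items that are not in use.
	var kept []*item
	for _, it := range afterAge {
		if used > maxSize && !it.inUse {
			used -= it.size
			continue
		}
		kept = append(kept, it)
	}

	// Step 3: while still over quota, reset in-use items that are not dirty.
	for _, it := range kept {
		if used <= maxSize {
			break
		}
		if it.inUse && !it.dirty {
			used -= it.size
			it.size = 0
		}
	}
	return kept, used
}

// example runs clean on made-up data with a quota of 60 and a max age of 1h.
func example() ([]*item, int64) {
	now := time.Date(2021, 4, 26, 12, 0, 0, 0, time.UTC)
	items := []*item{
		{"old", 10, now.Add(-2 * time.Hour), false, false},
		{"idle", 20, now.Add(-30 * time.Minute), false, false},
		{"reading", 50, now.Add(-10 * time.Minute), true, false},
		{"writing", 40, now.Add(-5 * time.Minute), true, true},
	}
	return clean(items, time.Hour, 60, now)
}

func main() {
	kept, used := example()
	for _, it := range kept {
		fmt.Println(it.name, it.size)
	}
	fmt.Println("used:", used)
}
```

In this made-up data set "old" goes in step 1, "idle" in step 2, and "reading" is reset in step 3, while the dirty "writing" item is never touched.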
Nick Craig-Wood
2a40f00077 vfs: fix a code path which allows dirty data to be removed causing data loss
Before this change the VFS layer could remove a locally cached file
even if it had data which needed to be written back, thus causing data loss.

See: https://forum.rclone.org/t/rclone-1-55-doesnt-save-file-changes-if-the-file-has-been-reopened-during-upload-google-drive-mount/23646
2021-04-20 16:36:38 +01:00
Nick Craig-Wood
a4c4ddf052 vfs: rename files in cache and cancel uploads on directory rename
Before this change rclone did not cancel any uploads or rename the
cached files in the directory cache when a directory was renamed.

This caused issues with uploads arriving in the wrong place on bucket
based file systems.

See: https://forum.rclone.org/t/after-a-directory-renmane-using-mv-files-are-not-visible-any-longer/22797
2021-03-22 09:07:01 +00:00
Nick Craig-Wood
aea8776a43 vfs: fix modtimes not updating when writing via cache - fixes #4763
This reads modtime from a dirty cache item if it exists. This mirrors
the way reading the size works.

This fixes the mod time not updating when the file is written, only
when the writeback completes.

See: https://forum.rclone.org/t/rclone-mount-and-changing-timestamps-after-writes/22629
2021-03-16 13:31:47 +00:00
Nick Craig-Wood
687a3b1832 vfs: fix data race discovered by the race detector
This fixes a place where we read from item.o without the item.mu held.
2021-03-15 19:22:07 +00:00
albertony
459cc70a50 vfs: fix invalid cache path on windows when using :backend: as remote
The initial ':' is included in the ad-hoc remote name, but is an illegal character
in Windows paths. It is replaced with '^', which is legal in filesystems but illegal
in regular remote names, so name conflicts are avoided.

Fixes #4544
2021-01-30 16:18:15 +00:00
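The substitution described in 459cc70a50 might look something like the sketch below. The function name cacheDirName is hypothetical and the real code path in rclone may differ. The sketch only shows the idea of swapping a leading ':' for '^'.

```go
package main

import (
	"fmt"
	"strings"
)

// cacheDirName is a hypothetical helper for this example only.
// It maps a remote name to a name that is safe to use as a
// directory in the cache path.
func cacheDirName(remoteName string) string {
	// Ad-hoc remotes such as ":sftp" start with ':', which is not
	// allowed in Windows paths, so swap it for '^'.
	if strings.HasPrefix(remoteName, ":") {
		return "^" + remoteName[1:]
	}
	return remoteName
}

func main() {
	fmt.Println(cacheDirName(":sftp"))
	fmt.Println(cacheDirName("gdrive"))
}
```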
Nick Craig-Wood
b80d498304 vfs: fix file leaks with --vfs-cache-mode full and --buffer-size 0
Before this change using --vfs-cache-mode full and --buffer-size 0
together caused the vfs downloader to open more and more downloaders.

This is fixed by introducing a minimum size of 1M for the window to
look for an existing downloader.

Fixes #4892
2021-01-21 18:35:04 +00:00
Nick Craig-Wood
4f8ee736b1 vfs: make cache dir absolute before using it to fix path too long errors
If --cache-dir is passed in as a relative path, then rclone will not
be able to turn it into a UNC path under Windows, which means that
file names longer than 260 chars will fail when stored in the cache.

This patch makes the --cache-dir path absolute before using it.

See: https://forum.rclone.org/t/handling-of-long-paths-on-windows-260-characters/20913
2020-12-11 10:00:51 +00:00
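A minimal sketch of the idea in 4f8ee736b1, assuming nothing more than the standard library: the relative directory is resolved with filepath.Abs before it is used. The names absCacheDir and isAbs are hypothetical, and the later conversion to a UNC path on Windows is not shown.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// absCacheDir is a hypothetical helper for this example only.
// It resolves dir against the current working directory.
func absCacheDir(dir string) (string, error) {
	abs, err := filepath.Abs(dir)
	if err != nil {
		return "", fmt.Errorf("failed to make cache dir absolute: %w", err)
	}
	return abs, nil
}

// isAbs is a thin wrapper so callers can check the result.
func isAbs(p string) bool {
	return filepath.IsAbs(p)
}

func main() {
	abs, err := absCacheDir(filepath.Join("cache", "vfs"))
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(abs, isAbs(abs))
}
```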
Nick Craig-Wood
2e21c58e6a fs: deglobalise the config #4685
This is done by making fs.Config private and attaching it to the
context instead.

The Config should be obtained with fs.GetConfig and fs.AddConfig
should be used to get a new mutable config that can be changed.
2020-11-26 16:40:12 +00:00
Nick Craig-Wood
2347762b0d vfs: fix "file already exists" error for stale cache files
Before this change if a file was uploaded through a mount, then
deleted externally, trying to upload that file again could give EEXIST
"file already exists".

This was because the file already existing in the cache was confusing
rclone into thinking it already had the file.

The fix is to check whether rclone has a stale cache file and, if so,
to ignore it in this situation.

See: https://forum.rclone.org/t/rclone-cant-reuse-filenames/20400
2020-11-13 10:32:21 +00:00
Nick Craig-Wood
1fb6ad700f accounting: add context.Context #3257 #4685 2020-11-09 18:05:54 +00:00
Nick Craig-Wood
d846210978 fs: Add context to NewFs #3257 #4685
This adds a context.Context parameter to NewFs and related calls.

This is necessary as part of reading config from the context -
backends need to be able to read the global config.
2020-11-09 18:05:54 +00:00
Josh Soref
d0888edc0a Spelling fixes
Fix spelling of: above, already, anonymous, associated,
authentication, bandwidth, because, between, blocks, calculate,
candidates, cautious, changelog, cleaner, clipboard, command,
completely, concurrently, considered, constructs, corrupt, current,
daemon, dependencies, deprecated, directory, dispatcher, download,
eligible, ellipsis, encrypter, endpoint, entrieslist, essentially,
existing writers, existing, expires, filesystem, flushing, frequently,
hierarchy, however, implementation, implements, inaccurate,
individually, insensitive, longer, maximum, metadata, modified,
multipart, namedirfirst, nextcloud, obscured, opened, optional,
owncloud, pacific, passphrase, password, permanently, persimmon,
positive, potato, protocol, quota, receiving, recommends, referring,
requires, revisited, satisfied, satisfies, satisfy, semver,
serialized, session, storage, strategies, stringlist, successful,
supported, surprise, temporarily, temporary, transactions, unneeded,
update, uploads, wrapped

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>
2020-10-14 15:21:31 +01:00
Leo Luan
c5c56cda02 vfs: Add a missed update of used cache space
The missed update can cause incorrect before-cleaning cache stats
and a premature condition broadcast in purgeOld before the cache
space use is reduced below the quota.
2020-10-06 16:35:23 +01:00
Leo Luan
2295123cad vfs: Add exponential backoff during ENOSPC retries
Add an exponentially increasing delay during retries upon ENOSPC errors
to avoid exhausting the 10 retries too soon when the cache space
recovery from item resets is not available from the file system yet
or consumed by other large cache writes.
2020-10-06 16:35:23 +01:00
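The retry scheme described in 2295123cad can be sketched as below. This is an illustration only: retryWithBackoff, simulate, errNoSpace and the 2ms starting delay are assumptions made for the example, and only the limit of 10 retries comes from the commit message. The sleep function is passed in so that the example can record the delays instead of waiting for them.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errNoSpace is a hypothetical stand-in for an ENOSPC error.
var errNoSpace = errors.New("no space left on device")

// retryWithBackoff calls op up to maxTries times, doubling the delay
// after every failed try. No delay follows the final try.
func retryWithBackoff(maxTries int, initial time.Duration, sleep func(time.Duration), op func() error) error {
	delay := initial
	var err error
	for try := 1; try <= maxTries; try++ {
		if err = op(); err == nil {
			return nil
		}
		if try < maxTries {
			sleep(delay)
			delay *= 2
		}
	}
	return fmt.Errorf("giving up after %d tries: %w", maxTries, err)
}

// simulate runs an operation that fails the given number of times
// before succeeding, and returns the delays that were requested.
func simulate(failures int) ([]time.Duration, error) {
	var delays []time.Duration
	calls := 0
	err := retryWithBackoff(10, 2*time.Millisecond,
		func(d time.Duration) { delays = append(delays, d) },
		func() error {
			calls++
			if calls <= failures {
				return errNoSpace
			}
			return nil
		})
	return delays, err
}

func main() {
	delays, err := simulate(3)
	fmt.Println(delays, err)
}
```

Doubling the delay spreads the 10 tries over a much longer period than a fixed delay would, which gives the cache cleaner time to free space.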
Leo Luan
ff0280c0cb vfs: Fix missed concurrency control between some item operations and reset
Item reset is invoked by cache cleaner for synchronous recovery
from ENOSPC errors. The reset operation removes the cache file and
closes/reopens the downloaders.  Although most parts of reset and
other item operations are done with the item mutex held, the mutex
is released during fd.WriteAt and downloaders calls. We used preAccess
and postAccess calls to serialize Reset, ReadAt, and Open, but missed
some other item operations. The patch adds preAccess/postAccess
calls in Sync, Truncate, Close, WriteAt, and rename.
2020-10-06 16:35:23 +01:00
Leo Luan
64d736a57b vfs: Fix a race condition in retryFailedResets
A failed item reset is saved in the errItems for retryFailedResets
to process.  If the item gets closed before the retry, the item may
have been removed from the c.item array. Previous code did not
account for this condition. This patch adds the check for the
existence of the retry items in retryFailedResets.
2020-10-06 16:35:23 +01:00
Leo Luan
5f1d5a1897 vfs: Fix a deadlock vulnerability in downloaders.Close
The downloaders.Close() call acquires the downloaders' mutex before
calling the wait group wait and the main downloaders thread has a
periodical (5 seconds interval) call to kick its waiters and the
waiter dispatch function tries to get the mutex. So a deadlock can
occur if the Close() call starts, gets the mutex, while the main
downloader thread already got the timer's tick and proceeded to
call kickWaiters. The deadlock happens when the Close call gets
the mutex after the timer has ticked but before the main downloader
thread has acquired the mutex. This is a very short window, which
probably explains why the problem has not surfaced: maybe
something like tens of nanoseconds out of 5 seconds (~10^-8).
It took 5 days of continued stressing the Close calls for the
deadlock to appear.
2020-10-06 16:35:23 +01:00
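The hazard described in 5f1d5a1897 is holding a mutex while waiting on a wait group whose goroutine also needs that mutex. One common way to avoid it, sketched here with hypothetical types and not necessarily how rclone fixed it, is to release the mutex before calling Wait.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// downloaders is a simplified, hypothetical model for this example.
type downloaders struct {
	mu     sync.Mutex
	wg     sync.WaitGroup
	quit   chan struct{}
	closed bool
	kicks  int
}

// newDownloaders starts a background goroutine that kicks waiters on
// every tick until the quit channel is closed.
func newDownloaders(tick time.Duration) *downloaders {
	d := &downloaders{quit: make(chan struct{})}
	d.wg.Add(1)
	go func() {
		defer d.wg.Done()
		ticker := time.NewTicker(tick)
		defer ticker.Stop()
		for {
			select {
			case <-ticker.C:
				d.kickWaiters()
			case <-d.quit:
				return
			}
		}
	}()
	return d
}

// kickWaiters needs the mutex, like the waiter dispatch in the commit.
func (d *downloaders) kickWaiters() {
	d.mu.Lock()
	d.kicks++
	d.mu.Unlock()
}

// Close signals the goroutine to stop and waits for it. The mutex is
// released before wg.Wait so a kickWaiters call that is already in
// progress can take the mutex and finish.
func (d *downloaders) Close() {
	d.mu.Lock()
	if d.closed {
		d.mu.Unlock()
		return
	}
	d.closed = true
	close(d.quit)
	d.mu.Unlock()
	d.wg.Wait()
}

// closeMany opens and closes n downloaders and returns how many closed.
func closeMany(n int) int {
	closed := 0
	for i := 0; i < n; i++ {
		d := newDownloaders(50 * time.Microsecond)
		time.Sleep(100 * time.Microsecond)
		d.Close()
		closed++
	}
	return closed
}

func main() {
	fmt.Println("closed without deadlock:", closeMany(100))
}
```

If Close kept the mutex across wg.Wait, a tick that had already fired would block in kickWaiters forever and Wait would never return.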
Hekmon
66def93373 mount cmd: update systemd status with cache stats 2020-10-06 16:21:30 +01:00
Nick Craig-Wood
18ccf0f871 vfs: detect and recover from a file being removed externally from the cache
Before this change if a file was removed from the cache while rclone
was running then rclone would not notice and would proceed to re-create it
full of zeros.

This change notices when files that we expect to have data in go missing
and, if they do, logs an ERROR and recovers.

Deleting files from the cache manually while rclone is running isn't
recommended!

See: https://forum.rclone.org/t/corrupted-data-streaming-after-vfs-meta-files-removed/18997
Fixes #4602
2020-09-18 10:30:02 +01:00
Nick Craig-Wood
6a56ac1032 vfs,local: Log an ERROR if we fail to set the file to be sparse
See: https://forum.rclone.org/t/rclone-1-53-release/18880/73
2020-09-11 15:36:47 +01:00
Nick Craig-Wood
27b9ae4fc3 vfs: fix spurious error "vfs cache: failed to _ensure cache EOF"
Before this change the error message was produced for every file which
was confusing users.

After this change we check for EOF and return from ReadAt at that
point.

See: https://forum.rclone.org/t/rclone-1-53-release/18880/10
2020-09-03 10:25:00 +01:00
Sam Edwards
23b2c58018 vfs: Quiet removeNotInUse logging to debug when not removing 2020-09-02 11:55:20 +01:00
Leo Luan
c665201b85 vfs: support synchronous cache space recovery upon ENOSPC
This patch provides the support of synchronous cache space recovery
to allow read threads to recover from ENOSPC errors when cache space
can be recovered from cache items that are not in use or safe to be
reset/emptied.

The patch complements the existing cache cleaning process in two ways.

Firstly, the existing cache cleaning process is time-driven and runs
periodically. The cache space can run out while the cache cleaner
thread is still waiting for its next scheduled run. The io threads
encountering ENOSPC return an internal error to the applications
in this case even when cache space can be recovered to avoid this
error. This patch addresses this problem by having the read threads
kick the cache cleaner thread in this condition to recover cache
space preventing unnecessary ENOSPC errors from being seen by the
applications.

Secondly, this patch enhances the cache cleaner to support cache
item reset. Currently the cache purge process removes cache
items that are not in use. This may not be sufficient when the
total size of the working set exceeds the cache directory's
capacity. Like in the current code, this patch starts the purge
process by removing cache files that are not in use. Cache items
whose access times are older than vfs-cache-max-age are removed first.
After that, other not-in-use items are removed in LRU order until
vfs-cache-max-size is reached. If the vfs-cache-max-size (the quota)
is still not reached at this time, this patch adds a cache reset
step to reset/empty cache files that are still in use but not
dirtied.  This enables application processes to continue without
seeing an error even when the working set depletes the cache space
as long as there is not a large write working set hoarding the
entire cache space.

By design this patch does not add ENOSPC error recovery for write
IOs. Rclone does not empty a write cache item until the file data
is written back to the backend upon close. Allowing more cache
space to be consumed by dirty cache items when the cache space is
already running low would increase the risk of exhausting the cache
space in a way that the vfs mount becomes unreadable.
2020-08-25 21:12:06 +01:00
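The first point in c665201b85, a cleaner that runs on a timer but can also be kicked synchronously by a reader that hits ENOSPC, could be modelled roughly as follows. The cleaner type, startCleaner, Kick, Stop and demo are hypothetical names for this sketch, and the real cleaner does far more work per pass.

```go
package main

import (
	"fmt"
	"time"
)

// cleaner is a hypothetical model: it runs clean on every tick and
// also whenever it is kicked.
type cleaner struct {
	kick chan chan struct{} // each kick carries a channel closed when the pass ends
	quit chan struct{}
	done chan struct{}
}

// startCleaner launches the cleaner goroutine.
func startCleaner(interval time.Duration, clean func()) *cleaner {
	c := &cleaner{
		kick: make(chan chan struct{}),
		quit: make(chan struct{}),
		done: make(chan struct{}),
	}
	go func() {
		defer close(c.done)
		ticker := time.NewTicker(interval)
		defer ticker.Stop()
		for {
			select {
			case <-ticker.C: // scheduled pass
				clean()
			case reply := <-c.kick: // pass requested by a reader
				clean()
				close(reply)
			case <-c.quit:
				return
			}
		}
	}()
	return c
}

// Kick requests an immediate pass and blocks until it has finished,
// so the caller can retry its write straight afterwards.
// It must not be called after Stop.
func (c *cleaner) Kick() {
	reply := make(chan struct{})
	c.kick <- reply
	<-reply
}

// Stop shuts the cleaner down and waits for the goroutine to exit.
func (c *cleaner) Stop() {
	close(c.quit)
	<-c.done
}

// demo kicks the cleaner twice with a long timer interval, so only
// the kicked passes run, and returns the number of passes.
func demo() (passes int) {
	c := startCleaner(time.Hour, func() { passes++ })
	c.Kick()
	c.Kick()
	c.Stop()
	return passes
}

func main() {
	fmt.Println("passes:", demo())
}
```

Making Kick wait for the pass to finish is what makes the recovery synchronous: the reader only retries once space has had a chance to be freed.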
Nick Craig-Wood
94a0991584 vfs: set the modtime of the cache file immediately
Before this change we set the modtime of the cache file when all
writers had finished.

This has the unfortunate effect that the file is uploaded with the
wrong modtime, which means that on backends which can only set modtimes
when uploading files the modtime is wrong.

This change sets the modtime of the cache file immediately in the
cache and in turn sets the modtime in the file info.
2020-08-20 16:24:04 +01:00
Nick Craig-Wood
4d7f91309b vfs: fix download threads timing out
Before this fix, download threads would fill up the buffer and then
timeout even though data was still being read from them. If the client
was streaming slower than network speed this caused the downloader to
stop and be restarted continuously. This caused more potential for
skips in the download and unnecessary network transactions.

This patch fixes that behaviour - as long as a downloader is being
read from more often than once every 5 seconds, it won't timeout.

This was done by:

- kicking the downloader whenever ensureDownloader is called
- making the downloader loop if it has already downloaded past the maxOffset
- making setRange() always kick the downloader
2020-08-06 17:26:18 +01:00
Nick Craig-Wood
109b695621 vfs: add --vfs-read-ahead parameter for use with --vfs-cache-mode full
This parameter causes extra read-ahead over --buffer-size which is not
buffered in memory but on disk.
2020-08-06 17:26:18 +01:00
Nick Craig-Wood
421585dd72 accounting: add context to Account and propagate changes #3257
This is preparation for getting the Accounting to check the context,
but first we need to get it in place. Since this is one of those
changes that makes lots of noise, this is in a separate commit.
2020-07-28 16:41:17 +01:00
Nick Craig-Wood
b2f4f52b64 vfs cache: make logging consistent and remove some debug logging 2020-07-06 17:32:53 +01:00