Commit Graph

64 Commits

Author SHA1 Message Date
albertony
478434ffef vfs: fix issue where empty dirs would build up in cache meta dir 2021-09-18 12:20:47 +01:00
Nick Craig-Wood
d0de426500 vfs: fix duplicates on rename - fixes #5469
Before this change, if there was an existing file being uploaded when
a file was renamed on top of it, then both would be uploaded. This
causes a duplicate in Google Drive as both files get uploaded at the
same time. This was triggered reliably by LibreOffice saving doc
files.

This fix removes any duplicates in the upload queue on rename.
2021-07-28 16:45:26 +01:00
Nick Craig-Wood
16d1da2c1e vfs: remove item.metaDirty as it was confusing and not used
See discussion in #5277
2021-04-28 09:33:22 +01:00
Nick Craig-Wood
00a0ee1899 vfs: fix modtime changing when reading file into cache - fixes #5277
Before this change but after:

aea8776a43 vfs: fix modtimes not updating when writing via cache #4763

When a file was opened read-only the modtime was read from the cached
file. However this modtime wasn't correct leading to an incorrect
result.

This change fixes the definition of `item.IsDirty` to be true only
when the data is dirty. This fixes the problem as a read only file
isn't considered dirty.
2021-04-28 09:33:22 +01:00
albertony
2925e1384c Use binary prefixes for size and rate units
Includes adding support for additional size input suffix Mi and MiB, treated equivalent to M.
Extends binary suffix output with letter i, e.g. Ki and Mi.
Centralizes creation of bit/byte unit strings.
2021-04-27 02:25:52 +03:00
Leo Luan
8f23cae1c0 vfs: Add cache reset for --vfs-cache-max-size handling at cache poll interval
The vfs-cache-max-size parameter is probably confusing to many users.
The cache cleaner checks cache size periodically at the --vfs-cache-poll-interval
(default 60 seconds) interval and remove cache items in the following order.

(1) cache items that are not in use and with age > vfs-cache-max-age
(2) if the cache space used at this time still is larger than
vfs-cache-max-size, the cleaner continues to remove cache items that are
not in use.

The cache cleaning process does not remove cache items that are currently in use.
If the total space consumed by in-use cache items exceeds vfs-cache-max-size, the
periodical cache cleaner thread does not do anything further and leaves the in-use
cache items alone with a total space larger than vfs-cache-max-size.

A cache reset feature was introduced in 1.53 which resets in-use (but not dirty,
i.e., not being updated) cache items when additional cache data incurs an ENOSPC
error.  But this code was not activated in the periodical cache cleaning thread.

This patch adds the cache reset step in the cache cleaner thread during cache
poll to reset cache items until the total size of the remaining cache items is
below vfs-cache-max-size.
2021-04-26 17:55:52 +01:00
Nick Craig-Wood
2a40f00077 vfs: fix a code path which allows dirty data to be removed causing data loss
Before this change the VFS layer could remove a locally cached file
even if it had data which needed to be written back, thus causing data loss.

See: https://forum.rclone.org/t/rclone-1-55-doesnt-save-file-changes-if-the-file-has-been-reopened-during-upload-google-drive-mount/23646
2021-04-20 16:36:38 +01:00
Nick Craig-Wood
a4c4ddf052 vfs: rename files in cache and cancel uploads on directory rename
Before this change rclone did not cancel an uploads or rename the
cached files in the directory cache when a directory was renamed.

This caused issues with uploads arriving in the wrong place on bucket
based file systems.

See: https://forum.rclone.org/t/after-a-directory-renmane-using-mv-files-are-not-visible-any-longer/22797
2021-03-22 09:07:01 +00:00
Nick Craig-Wood
aea8776a43 vfs: fix modtimes not updating when writing via cache - fixes #4763
This reads modtime from a dirty cache item if it exists. This mirrors
the way reading the size works.

This fixes the mod time not updating when the file is written, only
when the writeback completes.

See: https://forum.rclone.org/t/rclone-mount-and-changing-timestamps-after-writes/22629
2021-03-16 13:31:47 +00:00
Nick Craig-Wood
687a3b1832 vfs: fix data race discovered by the race detector
This fixes a place where we read from item.o without the item.mu held.
2021-03-15 19:22:07 +00:00
albertony
459cc70a50 vfs: fix invalid cache path on windows when using :backend: as remote
The initial ':' is included in the ad-hoc remote name, but is illegal character
in Windows path. Replacing it with '^', which is legal in filesystems but illegal
in regular remote names, so name conflict is avoided.

Fixes #4544
2021-01-30 16:18:15 +00:00
Nick Craig-Wood
b80d498304 vfs: fix file leaks with --vfs-cache-mode full and --buffer-size 0
Before this change using --vfs-cache-mode full and --buffer-size 0
together caused the vfs downloader to open more and more downloaders.

This is fixed by introducing a minimum size of 1M for the window to
look for an existing downloader.

Fixes #4892
2021-01-21 18:35:04 +00:00
Nick Craig-Wood
4f8ee736b1 vfs: make cache dir absolute before using it to fix path too long errors
If --cache-dir is passed in as a relative path, then rclone will not
be able to turn it into a UNC path under Windows, which means that
file names longer than 260 chars will fail when stored in the cache.

This patch makes the --cache-dir path absolute before using it.

See: https://forum.rclone.org/t/handling-of-long-paths-on-windows-260-characters/20913
2020-12-11 10:00:51 +00:00
Nick Craig-Wood
2e21c58e6a fs: deglobalise the config #4685
This is done by making fs.Config private and attaching it to the
context instead.

The Config should be obtained with fs.GetConfig and fs.AddConfig
should be used to get a new mutable config that can be changed.
2020-11-26 16:40:12 +00:00
Nick Craig-Wood
2347762b0d vfs: fix "file already exists" error for stale cache files
Before this change if a file was uploaded through a mount, then
deleted externally, trying to upload that file again could give EEXIST
"file already exists".

This was because the file already existing in the cache was confusing
rclone into thinking it already had the file.

The fix is to check that if rclone has a stale cache file then to
ignore it in this situation.

See: https://forum.rclone.org/t/rclone-cant-reuse-filenames/20400
2020-11-13 10:32:21 +00:00
Nick Craig-Wood
1fb6ad700f accounting: add context.Context #3257 #4685 2020-11-09 18:05:54 +00:00
Nick Craig-Wood
d846210978 fs: Add context to NewFs #3257 #4685
This adds a context.Context parameter to NewFs and related calls.

This is necessary as part of reading config from the context -
backends need to be able to read the global config.
2020-11-09 18:05:54 +00:00
Josh Soref
d0888edc0a Spelling fixes
Fix spelling of: above, already, anonymous, associated,
authentication, bandwidth, because, between, blocks, calculate,
candidates, cautious, changelog, cleaner, clipboard, command,
completely, concurrently, considered, constructs, corrupt, current,
daemon, dependencies, deprecated, directory, dispatcher, download,
eligible, ellipsis, encrypter, endpoint, entrieslist, essentially,
existing writers, existing, expires, filesystem, flushing, frequently,
hierarchy, however, implementation, implements, inaccurate,
individually, insensitive, longer, maximum, metadata, modified,
multipart, namedirfirst, nextcloud, obscured, opened, optional,
owncloud, pacific, passphrase, password, permanently, persimmon,
positive, potato, protocol, quota, receiving, recommends, referring,
requires, revisited, satisfied, satisfies, satisfy, semver,
serialized, session, storage, strategies, stringlist, successful,
supported, surprise, temporarily, temporary, transactions, unneeded,
update, uploads, wrapped

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>
2020-10-14 15:21:31 +01:00
Leo Luan
c5c56cda02 vfs: Add a missed update of used cache space
The missed update can cause incorrect before-cleaning cache stats
and a pre-mature condition broadcast in purgeOld before the cache
space use is reduced below the quota.
2020-10-06 16:35:23 +01:00
Leo Luan
2295123cad vfs: Add exponential backoff during ENOSPC retries
Add an exponentially increasing delay during retries up ENOSPC error
to avoid exhausting the 10 retries too soon when the cache space
recovery from item resets is not available from the file system yet
or consumed by other large cache writes.
2020-10-06 16:35:23 +01:00
Leo Luan
ff0280c0cb vfs: Fix missed concurrency control between some item operations and reset
Item reset is invoked by cache cleaner for synchronous recovery
from ENOSPC errors. The reset operation removes the cache file and
closes/reopens the downloaders.  Although most parts of reset and
other item operations are done with the item mutex held, the mutex
is released during fd.WriteAt and downloaders calls. We used preAccess
and postAccess calls to serialize Reset, ReadAt, and Open, but missed
some other item operations. The patch adds preAccess/postAccess
calls in Sync, Truncate, Close, WriteAt, and rename.
2020-10-06 16:35:23 +01:00
Leo Luan
64d736a57b vfs: Fix a race condition in retryFailedResets
A failed item reset is saved in the errItems for retryFailedResets
to process.  If the item gets closed before the retry, the item may
have been removed from the c.item array. Previous code did not
account for this condition. This patch adds the check for the
exitence of the retry items in retryFailedResets.
2020-10-06 16:35:23 +01:00
Leo Luan
5f1d5a1897 vfs: Fix a deadlock vulnerability in downloaders.Close
The downloaders.Close() call acquires the downloaders' mutex before
calling the wait group wait and the main downloaders thread has a
periodical (5 seconds interval) call to kick its waiters and the
waiter dispatch function tries to get the mutex. So a deadlock can
occur if the Close() call starts, gets the mutex, while the main
downloader thread already got the timer's tick and proceeded to
call kickWaiters. The deadlock happens when the Close call gets
the mutex between the timer's kick and the main downloader thread
gets the mutex first. So it's a pretty short period of time and
it probably explains why the problem has not surfaced, maybe
something like tens of nanoseconds out of 5 seconds (~10^^-8).
It took 5 days of continued stressing the Close calls for the
deadlock to appear.
2020-10-06 16:35:23 +01:00
Hekmon
66def93373 mount cmd: update systemd status with cache stats 2020-10-06 16:21:30 +01:00
Nick Craig-Wood
18ccf0f871 vfs: detect and recover from a file being removed externally from the cache
Before this change if a file was removed from the cache while rclone
is running then rclone would not notice and proceed to re-create it
full of zeros.

This change notices files that we expect to have data in going missing
and if they do logs an ERROR recovers.

It isn't recommended deleting files from the cache manually with
rclone running!

See: https://forum.rclone.org/t/corrupted-data-streaming-after-vfs-meta-files-removed/18997
Fixes #4602
2020-09-18 10:30:02 +01:00
Nick Craig-Wood
6a56ac1032 vfs,local: Log an ERROR if we fail to set the file to be sparse
See: https://forum.rclone.org/t/rclone-1-53-release/18880/73
2020-09-11 15:36:47 +01:00
Nick Craig-Wood
27b9ae4fc3 vfs: fix spurious error "vfs cache: failed to _ensure cache EOF"
Before this change the error message was produced for every file which
was confusing users.

After this change we check for EOF and return from ReadAt at that
point.

See: https://forum.rclone.org/t/rclone-1-53-release/18880/10
2020-09-03 10:25:00 +01:00
Sam Edwards
23b2c58018 vfs: Quiet removeNotInUse logging to debug when not removing 2020-09-02 11:55:20 +01:00
Leo Luan
c665201b85 vfs: support synchronous cache space recovery upon ENOSPC
This patch provides the support of synchronous cache space recovery
to allow read threads to recover from ENOSPC errors when cache space
can be recovered from cache items that are not in use or safe to be
reset/emptied .

The patch complements the existing cache cleaning process in two ways.

Firstly, the existing cache cleaning process is time-driven that runs
periodically. The cache space can run out while the cache cleaner
thread is still waiting for its next scheduled run. The io threads
encountering ENOSPC return an internal error to the applications
in this case even when cache space can be recovered to avoid this
error. This patch addresses this problem by having the read threads
kick the cache cleaner thread in this condition to recover cache
space preventing unnecessary ENOSPC errors from being seen by the
applications.

Secondly, this patch enhances the cache cleaner to support cache
item reset. Currently the cache purge process removes cache
items that are not in use. This may not be sufficient when the
total size of the working set exceeds the cache directory's
capacity. Like in the current code, this patch starts the purge
process by removing cache files that are not in use. Cache items
whose access times are older than vfs-cache-max-age are removed first.
After that, other not-in-use items are removed in LRU order until
vfs-cache-max-size is reached. If the vfs-cache-max-size (the quota)
is still not reached at this time, this patch adds a cache reset
step to reset/empty cache files that are still in use but not
dirtied.  This enables application processes to continue without
seeing an error even when the working set depletes the cache space
as long as there is not a large write working set hoarding the
entire cache space.

By design this patch does not add ENOSPC error recovery for write
IOs. Rclone does not empty a write cache item until the file data
is written back to the backend upon close. Allowing more cache
space to be consumed by dirty cache items when the cache space is
already running low would increase the risk of exhausting the cache
space in a way that the vfs mount becomes unreadable.
2020-08-25 21:12:06 +01:00
Nick Craig-Wood
94a0991584 vfs: set the modtime of the cache file immediately
Before this change we set the modtime of the cache file when all
writers had finished.

This has the unfortunate effect that the file is uploaded with the
wrong modtime which means on backends which can't set modtimes except
when uploading files it is wrong.

This change sets the modtime of the cache file immediately in the
cache and in turn sets the modtime in the file info.
2020-08-20 16:24:04 +01:00
Nick Craig-Wood
4d7f91309b vfs: fix download threads timing out
Before this fix, download threads would fill up the buffer and then
timeout even though data was still being read from them. If the client
was streaming slower than network speed this caused the downloader to
stop and be restarted continuously. This caused more potential for
skips in the download and unecessary network transactions.

This patch fixes that behaviour - as long as a downloader is being
read from more often than once every 5 seconds, it won't timeout.

This was done by:

- kicking the downloader whenever ensureDownloader is called
- making the downloader loop if it has already downloaded past the maxOffset
- making setRange() always kick the downloader
2020-08-06 17:26:18 +01:00
Nick Craig-Wood
109b695621 vfs: add --vfs-read-ahead parameter for use with --vfs-cache-mode full
This parameter causes extra read-ahead over --buffer-size which is not
buffered in memory but on disk.
2020-08-06 17:26:18 +01:00
Nick Craig-Wood
421585dd72 accounting: add context to Account and propagate changes #3257
This is preparation for getting the Accounting to check the context,
buf first we need to get it in place. Since this is one of those
changes that makes lots of noise, this is in a seperate commit.
2020-07-28 16:41:17 +01:00
Nick Craig-Wood
b2f4f52b64 vfs cache: make logging consistent and remove some debug logging 2020-07-06 17:32:53 +01:00
Nick Craig-Wood
c65ed26a7e vfs: vfscache: Fix renaming of items while they are being uploaded
Previous to the fix, if an item was being uploaded and it was renamed,
the upload would fail with missing checksum errors.

This change cancels any uploads in progress if the file is renamed.
2020-07-06 17:32:53 +01:00
Nick Craig-Wood
df5dbaf49b vfs: writeback: add Rename call for renaming items in the writeback queue 2020-07-06 17:32:53 +01:00
Nick Craig-Wood
63ebe4ca8d vfs: Reduce logging of metadata expiry to debug 2020-07-02 14:52:12 +01:00
Nick Craig-Wood
8301a72453 vfs: Fix over downloading with --vfs-cache-mode full and --buffer-size 0
This was caused by the signal to stop buffering being ignored when
there was no buffer!

This is fixed by explicitly checking for no buffering and stopping.
2020-06-30 12:03:39 +01:00
Nick Craig-Wood
15402e46c9 vfs: Add recovered items on cache reload to directory listings
Before this change, if we restarted an upload after a restart then the
file would get uploaded but never added to the directory listings.

This change makes sure we add virtual items to the directory cache
when reloading the cache so that they show up properly.
2020-06-30 12:03:39 +01:00
Nick Craig-Wood
939860eb85 vfs: vfscache make TestCacheCleaner test more reliable 2020-06-30 12:03:39 +01:00
Nick Craig-Wood
530dc77cde vfs: Fix race condition in vfscache 2020-06-30 12:03:39 +01:00
Nick Craig-Wood
143abe39f2 vfs: add tests for downloaders 2020-06-30 12:03:39 +01:00
Nick Craig-Wood
ee04732cbb vfs: factor writeback and downloaders into their own packages 2020-06-30 12:03:39 +01:00
Nick Craig-Wood
79455cc71e vfs: downloaders: remove unused osPath 2020-06-30 12:03:39 +01:00
Nick Craig-Wood
042e5fe097 vfs: downloader: limit the reader to 10 errors before giving up 2020-06-30 12:03:39 +01:00
Nick Craig-Wood
d273a9d82d vfs: remove items from writeback when dirty, don't just cancel the upload
This stops open items continually trying to be uploaded
2020-06-30 12:03:39 +01:00
Nick Craig-Wood
ed32a759ed vfs: add test for writeBack.cancelUpload 2020-06-30 12:01:36 +01:00
Nick Craig-Wood
ef2d036884 vfs: make writeback heap sort in insertion order if expiry times equal
This makes the tests 100% consistent on platforms which have a lower
resolution timer like Windows.
2020-06-30 12:01:36 +01:00
Nick Craig-Wood
746c41f527 vfs: fix race in writeback tests 2020-06-30 12:01:36 +01:00
Nick Craig-Wood
b0fb457746 vfs: add tests for writeback 2020-06-30 12:01:36 +01:00