The vfs-cache-max-size parameter is probably confusing to many users.
The cache cleaner checks cache size periodically at the --vfs-cache-poll-interval
(default 60 seconds) interval and remove cache items in the following order.
(1) cache items that are not in use and with age > vfs-cache-max-age
(2) if the cache space used at this time still is larger than
vfs-cache-max-size, the cleaner continues to remove cache items that are
not in use.
The cache cleaning process does not remove cache items that are currently in use.
If the total space consumed by in-use cache items exceeds vfs-cache-max-size, the
periodical cache cleaner thread does not do anything further and leaves the in-use
cache items alone with a total space larger than vfs-cache-max-size.
A cache reset feature was introduced in 1.53 which resets in-use (but not dirty,
i.e., not being updated) cache items when additional cache data incurs an ENOSPC
error. But this code was not activated in the periodical cache cleaning thread.
This patch adds the cache reset step in the cache cleaner thread during cache
poll to reset cache items until the total size of the remaining cache items is
below vfs-cache-max-size.
Betweeen rclone v1.54 and v1.55 there was an approx 3x performance
regression when transferring to distant SFTP servers (in particular
rsync.net).
This turned out to be due to the library github.com/pkg/sftp rclone
uses. Concurrent writes used to be enabled in this library by default
(for v1.12.0 as used in rclone v1.54) but they are no longer enabled
(for v1.13.0 as used in rclone v1.55) for safety reasons and it is
necessary to enable them specifically.
The safety concerns are due to the uncertainty as to whether writes
come in order and whether a half completed file might have holes in
it. This isn't a problem for rclone since a) it doesn't restart
uploads and b) it has a post-transfer checksum test.
This change introduces a new flag `--sftp-disable-concurrent-writes`
to control the feature which defaults to false, meaning that
concurrent writes are enabled as in v1.54.
However this isn't quite enough to fix the problem as the sftp library
needs to be able to sniff the size of the stream from the reader
passed in, so this also adds a `Size` interface to the reader to
enable this. This involved a patch to the library.
The library was reverted to v1.12.0 for v1.55.1 - this patch installs
v1.13.0+master to fix the Size interface problem.
See: https://github.com/pkg/sftp/issues/426
This reverts the library update done in this commit.
713f8f357d sftp: fix "file not found" errors for read once servers
Reverting this commit triples the performance to a far away sftp server.
See: https://github.com/pkg/sftp/issues/426
It introduces a new flag --sftp-disable-concurrent-reads to stop the
problematic behaviour in the SFTP library for read-once servers.
This upgrades the sftp library to v1.13.0 which has the fix.
This implements polling support for the Dropbox backend. The Dropbox SDK dependency had to be updated due to an auth issue, which was fixed on Jan 12 2021. A secondary internal Dropbox service was created to handle unauthorized SDK requests, as is necessary when using the ListFolderLongpoll function/endpoint. The config variable was renamed to cfg to avoid potential conflicts with the imported config package.
Add new option option "sharepoint-ntlm" for the vendor setting.
Use it when your hosted Sharepoint is not tied to the OneDrive
accounts and uses NTLM authentication.
Also add documentation and integration test.
Fixes: #2171
Instead of only adding SCSU, add it as an existing table.
Allow direct SCSU and add a, perhaps, reasonable table as well.
Add byte interfaces that doesn't base64 encode the URL as well with `EncodeBytes` and `DecodeBytes`.
Fuzz tested and decode tests added.
This includes an HDFS docker image to use with the integration tests.
Co-authored-by: Ivan Andreev <ivandeex@gmail.com>
Co-authored-by: Nick Craig-Wood <nick@craig-wood.com>
Uplink v1.4.1 provides two important improvements for rclone:
* Fix for a connection handling issue where an open project could
potentially become unusable because the underlying connection had
failed.
* Fix for concurrent use issue in drpc.
This patch provides the support of synchronous cache space recovery
to allow read threads to recover from ENOSPC errors when cache space
can be recovered from cache items that are not in use or safe to be
reset/emptied .
The patch complements the existing cache cleaning process in two ways.
Firstly, the existing cache cleaning process is time-driven that runs
periodically. The cache space can run out while the cache cleaner
thread is still waiting for its next scheduled run. The io threads
encountering ENOSPC return an internal error to the applications
in this case even when cache space can be recovered to avoid this
error. This patch addresses this problem by having the read threads
kick the cache cleaner thread in this condition to recover cache
space preventing unnecessary ENOSPC errors from being seen by the
applications.
Secondly, this patch enhances the cache cleaner to support cache
item reset. Currently the cache purge process removes cache
items that are not in use. This may not be sufficient when the
total size of the working set exceeds the cache directory's
capacity. Like in the current code, this patch starts the purge
process by removing cache files that are not in use. Cache items
whose access times are older than vfs-cache-max-age are removed first.
After that, other not-in-use items are removed in LRU order until
vfs-cache-max-size is reached. If the vfs-cache-max-size (the quota)
is still not reached at this time, this patch adds a cache reset
step to reset/empty cache files that are still in use but not
dirtied. This enables application processes to continue without
seeing an error even when the working set depletes the cache space
as long as there is not a large write working set hoarding the
entire cache space.
By design this patch does not add ENOSPC error recovery for write
IOs. Rclone does not empty a write cache item until the file data
is written back to the backend upon close. Allowing more cache
space to be consumed by dirty cache items when the cache space is
already running low would increase the risk of exhausting the cache
space in a way that the vfs mount becomes unreadable.
Uplink v1.2.0 comes with two improvements related to rclone:
* Fix for resource leak in uploads.
* The socket dialer comes with better congestion control in some
environments. On Linux environments, if a congestion controller named
'ledbat' is installed, it will be used. Consider installing
https://github.com/silviov/TCP-LEDBAT
Allows to compress short arbitrary strings and returns a string using base64 url encoding.
Generator for tables included and a few samples has been added. Add more to init.go
Tested with fuzzing for crash resistance and symmetry, see fuzz.go
This uses the refactored goftp library which doesn't include the minio
driver. This reduces the binary size by 1.5MB
See: https://gitea.com/goftp/server/pulls/120
This fixes a regression in the rclone tests from the v1.0.6 upgrade of
uplink. The failure was due to an improperly converted error resulting
in the wrong type of error.