
3446 Commits

Author SHA1 Message Date
Michał Matczuk
f396550934 backend/local: Avoid polluting page cache when uploading local files to remote backends
This patch makes rclone keep Linux page cache usage under control when
uploading local files to remote backends. When opening a file it issues
FADV_SEQUENTIAL to configure the read-ahead strategy. While reading
the file it issues FADV_DONTNEED every 128kB to free the page cache of
already consumed pages.

```
fadvise64(5, 0, 0, POSIX_FADV_SEQUENTIAL) = 0
read(5, "\324\375\251\376\213\361\240\224>\t5E\301\331X\274^\203oA\353\303.2'\206z\177N\27fB"..., 32768) = 32768
read(5, "\361\311\vW!\354_\317hf\276t\307\30L\351\272T\342C\243\370\240\213\355\210\v\221\201\177[\333"..., 32768) = 32768
read(5, ":\371\337Gn\355C\322\334 \253f\373\277\301;\215\n\240\347\305\6N\257\313\4\365\276ANq!"..., 32768) = 32768
read(5, "\312\243\360P\263\242\267H\304\240Y\310\367sT\321\256\6[b\310\224\361\344$Ms\234\5\314\306i"..., 32768) = 32768
fadvise64(5, 0, 131072, POSIX_FADV_DONTNEED) = 0
read(5, "m\251\7a\306\226\366-\v~\"\216\353\342~0\fht\315DK0\236.\\\201!A#\177\320"..., 32768) = 32768
read(5, "\7\324\207,\205\360\376\307\276\254\250\232\21G\323n\255\354\234\257P\322y\3502\37\246\21\334^42"..., 32768) = 32768
read(5, "e{*\225\223R\320\212EG:^\302\377\242\337\10\222J\16A\305\0\353\354\326P\336\357A|-"..., 32768) = 32768
read(5, "n\23XA4*R\352\234\257\364\355Y\204t9T\363\33\357\333\3674\246\221T\360\226\326G\354\374"..., 32768) = 32768
fadvise64(5, 131072, 131072, POSIX_FADV_DONTNEED) = 0
read(5, "SX\331\251}\24\353\37\310#\307|h%\372\34\310\3070YX\250s\2269\242\236\371\302z\357_"..., 32768) = 32768
read(5, "\177\3500\236Y\245\376NIY\177\360p!\337L]\2726\206@\240\246pG\213\254N\274\226\303\357"..., 32768) = 32768
read(5, "\242$*\364\217U\264]\221Y\245\342r\t\253\25Hr\363\263\364\336\322\t\325\325\f\37z\324\201\351"..., 32768) = 32768
read(5, "\2305\242\366\370\203tM\226<\230\25\316(9\25x\2\376\212\346Q\223 \353\225\323\264jf|\216"..., 32768) = 32768
fadvise64(5, 262144, 131072, POSIX_FADV_DONTNEED) = 0
```
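The access pattern in the trace above can be sketched in Go roughly as follows. This is a minimal illustration, not the code from the patch: `adviseReader`, `dropEvery`, `adviseOffsets` and the `advise` callback are hypothetical names. On Linux the callback would presumably wrap a call such as `unix.Fadvise(fd, offset, length, unix.FADV_DONTNEED)` from `golang.org/x/sys/unix`; here it only records the offsets so the sketch runs anywhere.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
)

// dropEvery is the number of bytes consumed between two hints
// (128 KiB, matching the trace above).
const dropEvery = 128 * 1024

// adviseReader wraps a reader and reports every fully consumed
// 128 KiB window through advise. Hypothetical helper for illustration.
type adviseReader struct {
	r       io.Reader
	advise  func(offset, length int64) // assumed to wrap FADV_DONTNEED in real use
	offset  int64                      // total bytes read so far
	dropped int64                      // bytes already reported to advise
}

func (a *adviseReader) Read(p []byte) (int, error) {
	n, err := a.r.Read(p)
	a.offset += int64(n)
	// Report each complete window consumed since the last hint.
	for a.offset-a.dropped >= dropEvery {
		a.advise(a.dropped, dropEvery)
		a.dropped += dropEvery
	}
	return n, err
}

// adviseOffsets reads size zero bytes in 32 KiB chunks and returns the
// offsets at which a hint was issued.
func adviseOffsets(size int) []int64 {
	var offsets []int64
	ar := &adviseReader{
		r:      bytes.NewReader(make([]byte, size)),
		advise: func(offset, length int64) { offsets = append(offsets, offset) },
	}
	buf := make([]byte, 32*1024)
	for {
		if _, err := ar.Read(buf); err != nil {
			break // io.EOF ends the loop
		}
	}
	return offsets
}

func main() {
	// 384 KiB of input gives three complete 128 KiB windows.
	fmt.Println(adviseOffsets(384 * 1024)) // [0 131072 262144]
}
```

Passing the hint as a callback keeps the windowing logic independent of the platform-specific system call, which is one way to keep such code testable on systems without `fadvise`.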

Page cache consumption per file can be checked with tools like [pcstat](https://github.com/tobert/pcstat).

This patch does not have a performance impact. Below are the results
of an experiment comparing a local copy of a 1GB file with and without
this patch.

With the patch:

```
(mmt/fadvise)$ pcstat 1GB.bin.1
+-----------+----------------+------------+-----------+---------+
| Name      | Size (bytes)   | Pages      | Cached    | Percent |
|-----------+----------------+------------+-----------+---------|
| 1GB.bin.1 | 1073741824     | 262144     | 0         | 000.000 |
+-----------+----------------+------------+-----------+---------+
(mmt/fadvise)$ taskset -c 0 /usr/bin/time -v ./rclone copy 1GB.bin.1 /var/empty/rclone
        Command being timed: "./rclone copy 1GB.bin.1 /var/empty/rclone"
        User time (seconds): 13.19
        System time (seconds): 1.12
        Percent of CPU this job got: 96%
        Elapsed (wall clock) time (h:mm:ss or m:ss): 0:14.81
        Average shared text size (kbytes): 0
        Average unshared data size (kbytes): 0
        Average stack size (kbytes): 0
        Average total size (kbytes): 0
        Maximum resident set size (kbytes): 27660
        Average resident set size (kbytes): 0
        Major (requiring I/O) page faults: 0
        Minor (reclaiming a frame) page faults: 2212
        Voluntary context switches: 5755
        Involuntary context switches: 9782
        Swaps: 0
        File system inputs: 4155264
        File system outputs: 2097152
        Socket messages sent: 0
        Socket messages received: 0
        Signals delivered: 0
        Page size (bytes): 4096
        Exit status: 0
(mmt/fadvise)$ pcstat 1GB.bin.1
+-----------+----------------+------------+-----------+---------+
| Name      | Size (bytes)   | Pages      | Cached    | Percent |
|-----------+----------------+------------+-----------+---------|
| 1GB.bin.1 | 1073741824     | 262144     | 0         | 000.000 |
+-----------+----------------+------------+-----------+---------+
```

Without the patch:

```
(master)$ taskset -c 0 /usr/bin/time -v ./rclone copy 1GB.bin.1 /var/empty/rclone
        Command being timed: "./rclone copy 1GB.bin.1 /var/empty/rclone"
        User time (seconds): 14.46
        System time (seconds): 0.81
        Percent of CPU this job got: 93%
        Elapsed (wall clock) time (h:mm:ss or m:ss): 0:16.41
        Average shared text size (kbytes): 0
        Average unshared data size (kbytes): 0
        Average stack size (kbytes): 0
        Average total size (kbytes): 0
        Maximum resident set size (kbytes): 27600
        Average resident set size (kbytes): 0
        Major (requiring I/O) page faults: 0
        Minor (reclaiming a frame) page faults: 2228
        Voluntary context switches: 7190
        Involuntary context switches: 1980
        Swaps: 0
        File system inputs: 2097152
        File system outputs: 2097152
        Socket messages sent: 0
        Socket messages received: 0
        Signals delivered: 0
        Page size (bytes): 4096
        Exit status: 0
(master)$ pcstat 1GB.bin.1
+-----------+----------------+------------+-----------+---------+
| Name      | Size (bytes)   | Pages      | Cached    | Percent |
|-----------+----------------+------------+-----------+---------|
| 1GB.bin.1 | 1073741824     | 262144     | 262144    | 100.000 |
+-----------+----------------+------------+-----------+---------+
```
2019-08-08 23:41:52 +01:00
Nick Craig-Wood
6f87267b34 accounting: fix locking in Transfer to avoid deadlock with --progress
Before this change, using -P occasionally deadlocked on the transfer
mutex and the stats mutex since they call each other via the progress
printing.

This is fixed by shortening the locking windows and converting the
mutex to a RW mutex.
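
The general idea of shortening the locking window and using a RW mutex can be sketched as below. This is an assumption-laden illustration rather than the accounting code itself: the `stats` and `transfer` types, their fields and methods are hypothetical. The point is that each lock is released before the other structure is called, so the two mutexes are never held at the same time.

```go
package main

import (
	"fmt"
	"sync"
)

// stats and transfer are hypothetical stand-ins for two structures
// that call each other while printing progress.
type stats struct {
	mu        sync.RWMutex
	transfers []*transfer
}

type transfer struct {
	mu    sync.RWMutex
	name  string
	bytes int64
}

// snapshot copies the fields under a read lock and returns the copies,
// so callers never hold the transfer lock while doing other work.
func (t *transfer) snapshot() (string, int64) {
	t.mu.RLock()
	defer t.mu.RUnlock()
	return t.name, t.bytes
}

// add takes the write lock only for the duration of the update.
func (t *transfer) add(n int64) {
	t.mu.Lock()
	t.bytes += n
	t.mu.Unlock()
}

// progress copies the transfer list under the stats read lock, releases
// it, and only then queries each transfer.
func (s *stats) progress() string {
	s.mu.RLock()
	list := append([]*transfer(nil), s.transfers...)
	s.mu.RUnlock()

	out := ""
	for _, t := range list {
		name, n := t.snapshot()
		out += fmt.Sprintf("%s: %d bytes\n", name, n)
	}
	return out
}

func main() {
	s := &stats{transfers: []*transfer{{name: "a.bin"}, {name: "b.bin"}}}
	var wg sync.WaitGroup
	for _, t := range s.transfers {
		wg.Add(1)
		go func(t *transfer) {
			defer wg.Done()
			for i := 0; i < 1000; i++ {
				t.add(1)
				_ = s.progress() // concurrent progress printing
			}
		}(t)
	}
	wg.Wait()
	fmt.Print(s.progress())
}
```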
2019-08-08 15:46:46 +01:00
Nick Craig-Wood
9d1fb2f4e7 Revert "cmd: shorten the locking window when using --progress to avoid deadlock"
This reverts commit fdef567da6.

The problem turned out to be elsewhere.
2019-08-08 15:19:41 +01:00
Nick Craig-Wood
99b3154abd Revert "filter: Add BoundedRecursion method"
This reverts commit 047f00a411.

It turns out that BoundedRecursion is the wrong thing to measure.
2019-08-08 14:15:50 +01:00
Nick Craig-Wood
6c38bddf3e walk: fix listing with filters listing whole remote
Prior to this fix, a request such as

    rclone lsf -R --include "/dir/**" remote:

would use ListR, which is very inefficient as it lists the whole remote
for one directory.

This changes it to use recursive walking if the filters imply any
directory filtering.  So `--include *.jpg` and `--exclude *.jpg` will
still use ListR whereas `--include "/dir/**"` will not.
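
The decision described above can be sketched as follows. This is a deliberately simplified illustration: the rule "a pattern containing `/` implies directory filtering" is an assumption made for the example, and `usesDirectoryFilters` and `listingStrategy` are hypothetical functions, not the filter package's actual implementation.

```go
package main

import (
	"fmt"
	"strings"
)

// usesDirectoryFilters is a simplified, hypothetical check: it assumes
// that any pattern containing a path separator restricts which
// directories need to be listed.
func usesDirectoryFilters(patterns []string) bool {
	for _, p := range patterns {
		if strings.Contains(p, "/") {
			return true
		}
	}
	return false
}

// listingStrategy picks recursive walking when directories can be
// skipped, and a single recursive listing (ListR) otherwise.
func listingStrategy(patterns []string) string {
	if usesDirectoryFilters(patterns) {
		return "walk"
	}
	return "ListR"
}

func main() {
	fmt.Println(listingStrategy([]string{"*.jpg"}))   // ListR
	fmt.Println(listingStrategy([]string{"/dir/**"})) // walk
}
```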
2019-08-08 14:15:50 +01:00
Nick Craig-Wood
a00a0471a8 filter: Add UsesDirectoryFilters method 2019-08-08 14:15:50 +01:00
Nick Craig-Wood
9e81fc343e swift: fix upload when using no_chunk to return the correct size
When using the VFS with swift and --swift-no-chunk, PutStream was
returning objects with size -1 which was causing corrupted transfer
messages.

This was fixed by counting the bytes transferred in a streamed file
and updating the metadata with that.
2019-08-08 12:41:46 +01:00
Nick Craig-Wood
fdef567da6 cmd: shorten the locking window when using --progress to avoid deadlock
Before this change, using -P occasionally deadlocked on the progress
mutex and the stats mutex since they call each other.

This is fixed by shortening the locking window in the progress routine
so as not to include the stats calculation.
2019-08-08 12:37:50 +01:00
Nick Craig-Wood
d377842395 vfs: make write without cache more efficient
This updates the out of sequence write code to be more efficient using
a conditional lock with a timeout.
2019-08-08 12:37:50 +01:00
Nick Craig-Wood
c014b2e66b rcat: fix slowdown on systems with multiple hashes
Before this fix rclone calculated all the hashes on transfer.  This
was particularly slow for the local backend.

After the fix we just calculate one hash which is enough for data
integrity.
2019-08-08 12:37:50 +01:00
Nick Craig-Wood
62b769a0a7 serve sftp: fix spurious debugs on server close 2019-08-08 12:37:50 +01:00
Nick Craig-Wood
84b5da089e serve sftp: fix detection of whether server is authorized 2019-08-08 12:37:50 +01:00
Nick Craig-Wood
d0c65b4c5e copyurl: fix copying files that return HTTP errors 2019-08-07 22:29:44 +01:00
Nick Craig-Wood
e502be475a azureblob/b2/dropbox/gcs/koofr/qingstor/s3: fix 0 length files
In 0386d22cc9 we introduced a test for 0 length files read the
way mount does.

This test failed on these backends which we fix up here.
2019-08-06 15:18:08 +01:00
negative0
27a075e9fc rcd: Removed the shorthand for webgui. Shorthand is reserved for rsync compatibility. 2019-08-06 12:50:31 +01:00
Nick Craig-Wood
5065c422b4 lib/random: unify random string generation into random.String
This was factored from fstest as we were including the testing
environment into the main binary because of it.

This was causing opening the browser to fail because of 8243ff8bc8.
2019-08-06 12:44:08 +01:00
Nick Craig-Wood
72d5b11d1b serve restic: rename test file to avoid it being linked into main binary 2019-08-06 12:42:52 +01:00
Nick Craig-Wood
526a3347ac rcd: Fix permissions problems on cache directory with web gui download 2019-08-06 12:06:57 +01:00
Nick Craig-Wood
23910ba53b servetest: add tests for --auth-proxy 2019-08-06 11:43:42 +01:00
Nick Craig-Wood
ee7101e6af serve: factor out common testing parts for ftp, sftp and webdav tests 2019-08-06 11:43:42 +01:00
Nick Craig-Wood
36c1b37dd9 serve webdav: support --auth-proxy 2019-08-06 11:43:42 +01:00
Nick Craig-Wood
72782bdda6 serve ftp: implement --auth-proxy 2019-08-06 11:43:42 +01:00
Nick Craig-Wood
b94eef16c1 serve ftp: refactor to bring into line with other serve commands 2019-08-06 11:43:42 +01:00
Nick Craig-Wood
d75fbe4852 serve sftp: implement auth proxy 2019-08-06 11:43:42 +01:00
Nick Craig-Wood
e6ab237fcd serve: add auth proxy infrastructure 2019-08-06 11:43:42 +01:00
Nick Craig-Wood
a7eec91d69 vfs: add Fs() method to return underlying fs.Fs 2019-08-06 11:43:42 +01:00
Nick Craig-Wood
b3e94b018c cache: factor fs cache into lib/cache 2019-08-06 11:43:42 +01:00
Nick Craig-Wood
ca0e9ea55d build: add Azure Pipelines build status to README 2019-08-06 10:46:36 +01:00
Nick Craig-Wood
53e3c2e263 build: add azure pipelines build 2019-08-06 10:31:32 +01:00
Nick Craig-Wood
02eb747d71 serve http/webdav/restic: implement --prefix - fixes #3398
--prefix enables the servers to serve from a non root prefix.  This
enables easier proxying.
2019-08-06 10:30:48 +01:00
Chaitanya Bankanhal
d51a970932 rcd: Change URL after webgui move to rclone organization 2019-08-05 16:22:40 +01:00
Nick Craig-Wood
a9438cf364 build: add .gitattributes to mark generated files
This makes sure that GitHub ignores the auto generated documentation
files for language detection and diffs.

See: https://github.com/github/linguist#overrides for more info
2019-08-04 15:20:15 +01:00
Nick Craig-Wood
5ef3c988eb bin: add script to test all commits compile for git bisect 2019-08-04 13:29:59 +01:00
Nick Craig-Wood
78150e82a2 docs: update bugs and limitations document 2019-08-04 12:33:39 +01:00
Nick Craig-Wood
6f0cc51eeb Add Chaitanya Bankanhal to contributors 2019-08-04 12:33:39 +01:00
Chaitanya Bankanhal
84e2806c4b rc: Rclone-WebUI integration with rclone
This adds experimental support for web gui integration so that rclone can fetch and run a web based GUI using the --rc-web-ui and related flags.

It downloads and caches a webui zip file which it then unpacks and opens in the browser.
2019-08-04 12:32:37 +01:00
Nick Craig-Wood
0386d22cc9 vfs: add test for 0 length files read in the way mount does 2019-08-03 18:25:44 +01:00
Nick Craig-Wood
0be14120e4 swift: use FixRangeOption to fix 0 length files via the VFS 2019-08-03 18:25:44 +01:00
Nick Craig-Wood
95af1f9ccf fs: fix FixRangeOption so it works with 0 length files 2019-08-03 18:25:44 +01:00
Nick Craig-Wood
629b7eacd8 b2: fix integration tests after accounting changes
In 53a1a0e3ef we started returning non nil from NewObject when
an object isn't found.  This breaks the integration tests and the API
expected of a backend.

This fixes it.
2019-08-03 13:30:31 +01:00
yparitcher
d3149acc32 b2: link sharing 2019-08-03 13:30:31 +01:00
Aleksandar Jankovic
6a3e301303 accounting: add call to clear stats
- Make calls more consistent by changing path to kebab case.
- Add stacktrace information to job panics
2019-08-02 16:56:19 +01:00
Nick Craig-Wood
5be968c0ca drive: update API for teamdrive use - fixes #3348 2019-08-02 16:06:23 +01:00
Nick Craig-Wood
f1a687c540 Add justina777 to contributors 2019-08-02 15:57:25 +01:00
justina777
94ee43fe54 log: add object and objectType to json logs 2019-08-02 15:57:09 +01:00
Nick Craig-Wood
c2635e39cc build: fix appveyor secure variables after project move 2019-07-28 22:46:26 +01:00
Nick Craig-Wood
8c511ec9cd docs: fix star count on website 2019-07-28 20:58:21 +01:00
Nick Craig-Wood
ac0dce78d0 cmd: fix up stats printing on macOS after accounting change 2019-07-28 20:38:20 +01:00
Nick Craig-Wood
f347514f62 build: fix up CI and CI badges after repo move 2019-07-28 20:07:04 +01:00
Nick Craig-Wood
57d5de6fba build: fix up package paths after repo move
git grep -l github.com/ncw/rclone | xargs -d'\n' perl -i~ -lpe 's|github.com/ncw/rclone|github.com/rclone/rclone|g'
goimports -w `find . -name \*.go`
2019-07-28 18:47:38 +01:00