rclone/docs/content/commands/rclone_serve_s3.md
Nick Craig-Wood 7060777d1d docs: fix hugo warning: found no layout file for "html" for kind "term"
Hugo has been making this warning for a while

WARN found no layout file for "html" for kind "term": You should
create a template file which matches Hugo Layouts Lookup Rules for
this combination.

This turned out to be the addition of the `groups:` keyword to the
command frontmatter. Hugo is doing something with this keyword though
this isn't documented in the frontmatter documentation.

The fix was removing the `groups:` keyword from the frontmatter since
it was never used by hugo.
2024-06-15 12:59:49 +01:00

30 KiB

title description status versionIntroduced
rclone serve s3 Serve remote:path over s3. Experimental v1.65

rclone serve s3

Serve remote:path over s3.

Synopsis

serve s3 implements a basic s3 server that serves a remote via s3. This can be viewed with an s3 client, or you can make an s3 type remote to read and write to it with rclone.

serve s3 is considered Experimental so use with care.

S3 server supports Signature Version 4 authentication. Just use --auth-key accessKey,secretKey and set the Authorization header correctly in the request. (See the AWS docs).

--auth-key can be repeated for multiple auth pairs. If --auth-key is not provided then serve s3 will allow anonymous access.

Please note that some clients may require HTTPS endpoints. See the SSL docs for more information.

This command uses the VFS directory cache. All the functionality will work with --vfs-cache-mode off. Using --vfs-cache-mode full (or writes) can be used to cache objects locally to improve performance.

Use --force-path-style=false if you want to use the bucket name as a part of the hostname (such as mybucket.local)

Use --etag-hash if you want to change the hash uses for the ETag. Note that using anything other than MD5 (the default) is likely to cause problems for S3 clients which rely on the Etag being the MD5.

Quickstart

For a simple set up, to serve remote:path over s3, run the server like this:

rclone serve s3 --auth-key ACCESS_KEY_ID,SECRET_ACCESS_KEY remote:path

For example, to use a simple folder in the filesystem, run the server with a command like this:

rclone serve s3 --auth-key ACCESS_KEY_ID,SECRET_ACCESS_KEY local:/path/to/folder

The rclone.conf for the server could look like this:

[local]
type = local

The local configuration is optional though. If you run the server with a remote:path like /path/to/folder (without the local: prefix and without an rclone.conf file), rclone will fall back to a default configuration, which will be visible as a warning in the logs. But it will run nonetheless.

This will be compatible with an rclone (client) remote configuration which is defined like this:

[serves3]
type = s3
provider = Rclone
endpoint = http://127.0.0.1:8080/
access_key_id = ACCESS_KEY_ID
secret_access_key = SECRET_ACCESS_KEY
use_multipart_uploads = false

Note that setting disable_multipart_uploads = true is to work around a bug which will be fixed in due course.

Bugs

When uploading multipart files serve s3 holds all the parts in memory (see #7453). This is a limitaton of the library rclone uses for serving S3 and will hopefully be fixed at some point.

Multipart server side copies do not work (see #7454). These take a very long time and eventually fail. The default threshold for multipart server side copies is 5G which is the maximum it can be, so files above this side will fail to be server side copied.

For a current list of serve s3 bugs see the serve s3 bug category on GitHub.

Limitations

serve s3 will treat all directories in the root as buckets and ignore all files in the root. You can use CreateBucket to create folders under the root, but you can't create empty folders under other folders not in the root.

When using PutObject or DeleteObject, rclone will automatically create or clean up empty folders. If you don't want to clean up empty folders automatically, use --no-cleanup.

When using ListObjects, rclone will use / when the delimiter is empty. This reduces backend requests with no effect on most operations, but if the delimiter is something other than / and empty, rclone will do a full recursive search of the backend, which can take some time.

Versioning is not currently supported.

Metadata will only be saved in memory other than the rclone mtime metadata which will be set as the modification time of the file.

Supported operations

serve s3 currently supports the following operations.

  • Bucket
    • ListBuckets
    • CreateBucket
    • DeleteBucket
  • Object
    • HeadObject
    • ListObjects
    • GetObject
    • PutObject
    • DeleteObject
    • DeleteObjects
    • CreateMultipartUpload
    • CompleteMultipartUpload
    • AbortMultipartUpload
    • CopyObject
    • UploadPart

Other operations will return error Unimplemented.

Server options

Use --addr to specify which IP address and port the server should listen on, eg --addr 1.2.3.4:8000 or --addr :8080 to listen to all IPs. By default it only listens on localhost. You can use port :0 to let the OS choose an available port.

If you set --addr to listen on a public or LAN accessible IP address then using Authentication is advised - see the next section for info.

You can use a unix socket by setting the url to unix:///path/to/socket or just by using an absolute path name. Note that unix sockets bypass the authentication - this is expected to be done with file system permissions.

--addr may be repeated to listen on multiple IPs/ports/sockets.

--server-read-timeout and --server-write-timeout can be used to control the timeouts on the server. Note that this is the total time for a transfer.

--max-header-bytes controls the maximum number of bytes the server will accept in the HTTP header.

--baseurl controls the URL prefix that rclone serves from. By default rclone will serve from the root. If you used --baseurl "/rclone" then rclone would serve from a URL starting with "/rclone/". This is useful if you wish to proxy rclone serve. Rclone automatically inserts leading and trailing "/" on --baseurl, so --baseurl "rclone", --baseurl "/rclone" and --baseurl "/rclone/" are all treated identically.

TLS (SSL)

By default this will serve over http. If you want you can serve over https. You will need to supply the --cert and --key flags. If you wish to do client side certificate validation then you will need to supply --client-ca also.

--cert should be a either a PEM encoded certificate or a concatenation of that with the CA certificate. --key should be the PEM encoded private key and --client-ca should be the PEM encoded client certificate authority certificate.

--min-tls-version is minimum TLS version that is acceptable. Valid values are "tls1.0", "tls1.1", "tls1.2" and "tls1.3" (default "tls1.0").

VFS - Virtual File System

This command uses the VFS layer. This adapts the cloud storage objects that rclone uses into something which looks much more like a disk filing system.

Cloud storage objects have lots of properties which aren't like disk files - you can't extend them or write to the middle of them, so the VFS layer has to deal with that. Because there is no one right way of doing this there are various options explained below.

The VFS layer also implements a directory cache - this caches info about files and directories (but not the data) in memory.

VFS Directory Cache

Using the --dir-cache-time flag, you can control how long a directory should be considered up to date and not refreshed from the backend. Changes made through the VFS will appear immediately or invalidate the cache.

--dir-cache-time duration   Time to cache directory entries for (default 5m0s)
--poll-interval duration    Time to wait between polling for changes. Must be smaller than dir-cache-time. Only on supported remotes. Set to 0 to disable (default 1m0s)

However, changes made directly on the cloud storage by the web interface or a different copy of rclone will only be picked up once the directory cache expires if the backend configured does not support polling for changes. If the backend supports polling, changes will be picked up within the polling interval.

You can send a SIGHUP signal to rclone for it to flush all directory caches, regardless of how old they are. Assuming only one rclone instance is running, you can reset the cache like this:

kill -SIGHUP $(pidof rclone)

If you configure rclone with a remote control then you can use rclone rc to flush the whole directory cache:

rclone rc vfs/forget

Or individual files or directories:

rclone rc vfs/forget file=path/to/file dir=path/to/dir

VFS File Buffering

The --buffer-size flag determines the amount of memory, that will be used to buffer data in advance.

Each open file will try to keep the specified amount of data in memory at all times. The buffered data is bound to one open file and won't be shared.

This flag is a upper limit for the used memory per open file. The buffer will only use memory for data that is downloaded but not not yet read. If the buffer is empty, only a small amount of memory will be used.

The maximum memory used by rclone for buffering can be up to --buffer-size * open files.

VFS File Caching

These flags control the VFS file caching options. File caching is necessary to make the VFS layer appear compatible with a normal file system. It can be disabled at the cost of some compatibility.

For example you'll need to enable VFS caching if you want to read and write simultaneously to a file. See below for more details.

Note that the VFS cache is separate from the cache backend and you may find that you need one or the other or both.

--cache-dir string                     Directory rclone will use for caching.
--vfs-cache-mode CacheMode             Cache mode off|minimal|writes|full (default off)
--vfs-cache-max-age duration           Max time since last access of objects in the cache (default 1h0m0s)
--vfs-cache-max-size SizeSuffix        Max total size of objects in the cache (default off)
--vfs-cache-min-free-space SizeSuffix  Target minimum free space on the disk containing the cache (default off)
--vfs-cache-poll-interval duration     Interval to poll the cache for stale objects (default 1m0s)
--vfs-write-back duration              Time to writeback files after last use when using cache (default 5s)

If run with -vv rclone will print the location of the file cache. The files are stored in the user cache file area which is OS dependent but can be controlled with --cache-dir or setting the appropriate environment variable.

The cache has 4 different modes selected by --vfs-cache-mode. The higher the cache mode the more compatible rclone becomes at the cost of using disk space.

Note that files are written back to the remote only when they are closed and if they haven't been accessed for --vfs-write-back seconds. If rclone is quit or dies with files that haven't been uploaded, these will be uploaded next time rclone is run with the same flags.

If using --vfs-cache-max-size or --vfs-cache-min-free-size note that the cache may exceed these quotas for two reasons. Firstly because it is only checked every --vfs-cache-poll-interval. Secondly because open files cannot be evicted from the cache. When --vfs-cache-max-size or --vfs-cache-min-free-size is exceeded, rclone will attempt to evict the least accessed files from the cache first. rclone will start with files that haven't been accessed for the longest. This cache flushing strategy is efficient and more relevant files are likely to remain cached.

The --vfs-cache-max-age will evict files from the cache after the set time since last access has passed. The default value of 1 hour will start evicting files from cache that haven't been accessed for 1 hour. When a cached file is accessed the 1 hour timer is reset to 0 and will wait for 1 more hour before evicting. Specify the time with standard notation, s, m, h, d, w .

You should not run two copies of rclone using the same VFS cache with the same or overlapping remotes if using --vfs-cache-mode > off. This can potentially cause data corruption if you do. You can work around this by giving each rclone its own cache hierarchy with --cache-dir. You don't need to worry about this if the remotes in use don't overlap.

--vfs-cache-mode off

In this mode (the default) the cache will read directly from the remote and write directly to the remote without caching anything on disk.

This will mean some operations are not possible

  • Files can't be opened for both read AND write
  • Files opened for write can't be seeked
  • Existing files opened for write must have O_TRUNC set
  • Files open for read with O_TRUNC will be opened write only
  • Files open for write only will behave as if O_TRUNC was supplied
  • Open modes O_APPEND, O_TRUNC are ignored
  • If an upload fails it can't be retried

--vfs-cache-mode minimal

This is very similar to "off" except that files opened for read AND write will be buffered to disk. This means that files opened for write will be a lot more compatible, but uses the minimal disk space.

These operations are not possible

  • Files opened for write only can't be seeked
  • Existing files opened for write must have O_TRUNC set
  • Files opened for write only will ignore O_APPEND, O_TRUNC
  • If an upload fails it can't be retried

--vfs-cache-mode writes

In this mode files opened for read only are still read directly from the remote, write only and read/write files are buffered to disk first.

This mode should support all normal file system operations.

If an upload fails it will be retried at exponentially increasing intervals up to 1 minute.

--vfs-cache-mode full

In this mode all reads and writes are buffered to and from disk. When data is read from the remote this is buffered to disk as well.

In this mode the files in the cache will be sparse files and rclone will keep track of which bits of the files it has downloaded.

So if an application only reads the starts of each file, then rclone will only buffer the start of the file. These files will appear to be their full size in the cache, but they will be sparse files with only the data that has been downloaded present in them.

This mode should support all normal file system operations and is otherwise identical to --vfs-cache-mode writes.

When reading a file rclone will read --buffer-size plus --vfs-read-ahead bytes ahead. The --buffer-size is buffered in memory whereas the --vfs-read-ahead is buffered on disk.

When using this mode it is recommended that --buffer-size is not set too large and --vfs-read-ahead is set large if required.

IMPORTANT not all file systems support sparse files. In particular FAT/exFAT do not. Rclone will perform very badly if the cache directory is on a filesystem which doesn't support sparse files and it will log an ERROR message if one is detected.

Fingerprinting

Various parts of the VFS use fingerprinting to see if a local file copy has changed relative to a remote file. Fingerprints are made from:

  • size
  • modification time
  • hash

where available on an object.

On some backends some of these attributes are slow to read (they take an extra API call per object, or extra work per object).

For example hash is slow with the local and sftp backends as they have to read the entire file and hash it, and modtime is slow with the s3, swift, ftp and qinqstor backends because they need to do an extra API call to fetch it.

If you use the --vfs-fast-fingerprint flag then rclone will not include the slow operations in the fingerprint. This makes the fingerprinting less accurate but much faster and will improve the opening time of cached files.

If you are running a vfs cache over local, s3 or swift backends then using this flag is recommended.

Note that if you change the value of this flag, the fingerprints of the files in the cache may be invalidated and the files will need to be downloaded again.

VFS Chunked Reading

When rclone reads files from a remote it reads them in chunks. This means that rather than requesting the whole file rclone reads the chunk specified. This can reduce the used download quota for some remotes by requesting only chunks from the remote that are actually read, at the cost of an increased number of requests.

These flags control the chunking:

--vfs-read-chunk-size SizeSuffix        Read the source objects in chunks (default 128M)
--vfs-read-chunk-size-limit SizeSuffix  Max chunk doubling size (default off)

Rclone will start reading a chunk of size --vfs-read-chunk-size, and then double the size for each read. When --vfs-read-chunk-size-limit is specified, and greater than --vfs-read-chunk-size, the chunk size for each open file will get doubled only until the specified value is reached. If the value is "off", which is the default, the limit is disabled and the chunk size will grow indefinitely.

With --vfs-read-chunk-size 100M and --vfs-read-chunk-size-limit 0 the following parts will be downloaded: 0-100M, 100M-200M, 200M-300M, 300M-400M and so on. When --vfs-read-chunk-size-limit 500M is specified, the result would be 0-100M, 100M-300M, 300M-700M, 700M-1200M, 1200M-1700M and so on.

Setting --vfs-read-chunk-size to 0 or "off" disables chunked reading.

VFS Performance

These flags may be used to enable/disable features of the VFS for performance or other reasons. See also the chunked reading feature.

In particular S3 and Swift benefit hugely from the --no-modtime flag (or use --use-server-modtime for a slightly different effect) as each read of the modification time takes a transaction.

--no-checksum     Don't compare checksums on up/download.
--no-modtime      Don't read/write the modification time (can speed things up).
--no-seek         Don't allow seeking in files.
--read-only       Only allow read-only access.

Sometimes rclone is delivered reads or writes out of order. Rather than seeking rclone will wait a short time for the in sequence read or write to come in. These flags only come into effect when not using an on disk cache file.

--vfs-read-wait duration   Time to wait for in-sequence read before seeking (default 20ms)
--vfs-write-wait duration  Time to wait for in-sequence write before giving error (default 1s)

When using VFS write caching (--vfs-cache-mode with value writes or full), the global flag --transfers can be set to adjust the number of parallel uploads of modified files from the cache (the related global flag --checkers has no effect on the VFS).

--transfers int  Number of file transfers to run in parallel (default 4)

VFS Case Sensitivity

Linux file systems are case-sensitive: two files can differ only by case, and the exact case must be used when opening a file.

File systems in modern Windows are case-insensitive but case-preserving: although existing files can be opened using any case, the exact case used to create the file is preserved and available for programs to query. It is not allowed for two files in the same directory to differ only by case.

Usually file systems on macOS are case-insensitive. It is possible to make macOS file systems case-sensitive but that is not the default.

The --vfs-case-insensitive VFS flag controls how rclone handles these two cases. If its value is "false", rclone passes file names to the remote as-is. If the flag is "true" (or appears without a value on the command line), rclone may perform a "fixup" as explained below.

The user may specify a file name to open/delete/rename/etc with a case different than what is stored on the remote. If an argument refers to an existing file with exactly the same name, then the case of the existing file on the disk will be used. However, if a file name with exactly the same name is not found but a name differing only by case exists, rclone will transparently fixup the name. This fixup happens only when an existing file is requested. Case sensitivity of file names created anew by rclone is controlled by the underlying remote.

Note that case sensitivity of the operating system running rclone (the target) may differ from case sensitivity of a file system presented by rclone (the source). The flag controls whether "fixup" is performed to satisfy the target.

If the flag is not provided on the command line, then its default value depends on the operating system where rclone runs: "true" on Windows and macOS, "false" otherwise. If the flag is provided without a value, then it is "true".

The --no-unicode-normalization flag controls whether a similar "fixup" is performed for filenames that differ but are canonically equivalent with respect to unicode. Unicode normalization can be particularly helpful for users of macOS, which prefers form NFD instead of the NFC used by most other platforms. It is therefore highly recommended to keep the default of false on macOS, to avoid encoding compatibility issues.

In the (probably unlikely) event that a directory has multiple duplicate filenames after applying case and unicode normalization, the --vfs-block-norm-dupes flag allows hiding these duplicates. This comes with a performance tradeoff, as rclone will have to scan the entire directory for duplicates when listing a directory. For this reason, it is recommended to leave this disabled if not needed. However, macOS users may wish to consider using it, as otherwise, if a remote directory contains both NFC and NFD versions of the same filename, an odd situation will occur: both versions of the file will be visible in the mount, and both will appear to be editable, however, editing either version will actually result in only the NFD version getting edited under the hood. --vfs-block- norm-dupes prevents this confusion by detecting this scenario, hiding the duplicates, and logging an error, similar to how this is handled in rclone sync.

VFS Disk Options

This flag allows you to manually set the statistics about the filing system. It can be useful when those statistics cannot be read correctly automatically.

--vfs-disk-space-total-size    Manually set the total disk space size (example: 256G, default: -1)

Alternate report of used bytes

Some backends, most notably S3, do not report the amount of bytes used. If you need this information to be available when running df on the filesystem, then pass the flag --vfs-used-is-size to rclone. With this flag set, instead of relying on the backend to report this information, rclone will scan the whole remote similar to rclone size and compute the total used space itself.

WARNING. Contrary to rclone size, this flag ignores filters so that the result is accurate. However, this is very inefficient and may cost lots of API calls resulting in extra charges. Use it as a last resort and only with caching.

rclone serve s3 remote:path [flags]

Options

      --addr stringArray                       IPaddress:Port or :Port to bind server to (default [127.0.0.1:8080])
      --allow-origin string                    Origin which cross-domain request (CORS) can be executed from
      --auth-key stringArray                   Set key pair for v4 authorization: access_key_id,secret_access_key
      --baseurl string                         Prefix for URLs - leave blank for root
      --cert string                            TLS PEM key (concatenation of certificate and CA certificate)
      --client-ca string                       Client certificate authority to verify clients with
      --dir-cache-time Duration                Time to cache directory entries for (default 5m0s)
      --dir-perms FileMode                     Directory permissions (default 0777)
      --etag-hash string                       Which hash to use for the ETag, or auto or blank for off (default "MD5")
      --file-perms FileMode                    File permissions (default 0666)
      --force-path-style                       If true use path style access if false use virtual hosted style (default true) (default true)
      --gid uint32                             Override the gid field set by the filesystem (not supported on Windows) (default 1000)
  -h, --help                                   help for s3
      --key string                             TLS PEM Private key
      --max-header-bytes int                   Maximum size of request header (default 4096)
      --min-tls-version string                 Minimum TLS version that is acceptable (default "tls1.0")
      --no-checksum                            Don't compare checksums on up/download
      --no-cleanup                             Not to cleanup empty folder after object is deleted
      --no-modtime                             Don't read/write the modification time (can speed things up)
      --no-seek                                Don't allow seeking in files
      --poll-interval Duration                 Time to wait between polling for changes, must be smaller than dir-cache-time and only on supported remotes (set 0 to disable) (default 1m0s)
      --read-only                              Only allow read-only access
      --server-read-timeout Duration           Timeout for server reading data (default 1h0m0s)
      --server-write-timeout Duration          Timeout for server writing data (default 1h0m0s)
      --uid uint32                             Override the uid field set by the filesystem (not supported on Windows) (default 1000)
      --umask int                              Override the permission bits set by the filesystem (not supported on Windows) (default 2)
      --vfs-block-norm-dupes                   If duplicate filenames exist in the same directory (after normalization), log an error and hide the duplicates (may have a performance cost)
      --vfs-cache-max-age Duration             Max time since last access of objects in the cache (default 1h0m0s)
      --vfs-cache-max-size SizeSuffix          Max total size of objects in the cache (default off)
      --vfs-cache-min-free-space SizeSuffix    Target minimum free space on the disk containing the cache (default off)
      --vfs-cache-mode CacheMode               Cache mode off|minimal|writes|full (default off)
      --vfs-cache-poll-interval Duration       Interval to poll the cache for stale objects (default 1m0s)
      --vfs-case-insensitive                   If a file name not found, find a case insensitive match
      --vfs-disk-space-total-size SizeSuffix   Specify the total space of disk (default off)
      --vfs-fast-fingerprint                   Use fast (less accurate) fingerprints for change detection
      --vfs-read-ahead SizeSuffix              Extra read ahead over --buffer-size when using cache-mode full
      --vfs-read-chunk-size SizeSuffix         Read the source objects in chunks (default 128Mi)
      --vfs-read-chunk-size-limit SizeSuffix   If greater than --vfs-read-chunk-size, double the chunk size after each chunk read, until the limit is reached ('off' is unlimited) (default off)
      --vfs-read-wait Duration                 Time to wait for in-sequence read before seeking (default 20ms)
      --vfs-refresh                            Refreshes the directory cache recursively in the background on start
      --vfs-used-is-size rclone size           Use the rclone size algorithm for Used size
      --vfs-write-back Duration                Time to writeback files after last use when using cache (default 5s)
      --vfs-write-wait Duration                Time to wait for in-sequence write before giving error (default 1s)

Filter Options

Flags for filtering directory listings.

      --delete-excluded                     Delete files on dest excluded from sync
      --exclude stringArray                 Exclude files matching pattern
      --exclude-from stringArray            Read file exclude patterns from file (use - to read from stdin)
      --exclude-if-present stringArray      Exclude directories if filename is present
      --files-from stringArray              Read list of source-file names from file (use - to read from stdin)
      --files-from-raw stringArray          Read list of source-file names from file without any processing of lines (use - to read from stdin)
  -f, --filter stringArray                  Add a file filtering rule
      --filter-from stringArray             Read file filtering patterns from a file (use - to read from stdin)
      --ignore-case                         Ignore case in filters (case insensitive)
      --include stringArray                 Include files matching pattern
      --include-from stringArray            Read file include patterns from file (use - to read from stdin)
      --max-age Duration                    Only transfer files younger than this in s or suffix ms|s|m|h|d|w|M|y (default off)
      --max-depth int                       If set limits the recursion depth to this (default -1)
      --max-size SizeSuffix                 Only transfer files smaller than this in KiB or suffix B|K|M|G|T|P (default off)
      --metadata-exclude stringArray        Exclude metadatas matching pattern
      --metadata-exclude-from stringArray   Read metadata exclude patterns from file (use - to read from stdin)
      --metadata-filter stringArray         Add a metadata filtering rule
      --metadata-filter-from stringArray    Read metadata filtering patterns from a file (use - to read from stdin)
      --metadata-include stringArray        Include metadatas matching pattern
      --metadata-include-from stringArray   Read metadata include patterns from file (use - to read from stdin)
      --min-age Duration                    Only transfer files older than this in s or suffix ms|s|m|h|d|w|M|y (default off)
      --min-size SizeSuffix                 Only transfer files bigger than this in KiB or suffix B|K|M|G|T|P (default off)

See the global flags page for global options not listed here.

SEE ALSO