---
title: "rclone dedupe"
description: "Interactively find duplicate filenames and delete/rename them."
slug: rclone_dedupe
url: /commands/rclone_dedupe/
# autogenerated - DO NOT EDIT, instead edit the source code in cmd/dedupe/ and as part of making a release run "make commanddocs"
---

# rclone dedupe

Interactively find duplicate filenames and delete/rename them.

## Synopsis

By default `dedupe` interactively finds files with duplicate
names and offers to delete all but one or rename them to be
different. This is known as deduping by name.

Deduping by name is only useful with backends like Google Drive which
can have duplicate file names. It can be run on wrapping backends
(e.g. crypt) if they wrap a backend which supports duplicate file
names.

However, if `--by-hash` is passed in, `dedupe` will instead find files
with duplicate hashes, which works on any backend which supports
at least one hash. This can be used to find files with duplicate
content. This is known as deduping by hash.
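
As a rough illustration of the idea behind deduping by hash, the sketch below groups files by the MD5 of their content and reports the groups with more than one member. The `files` mapping and the `group_by_hash` helper are hypothetical stand-ins for a remote listing; this is not rclone's API or implementation.

```python
import hashlib
from collections import defaultdict


def group_by_hash(files):
    """Group file names by the MD5 of their content.

    `files` maps a name to its bytes. Both the mapping and this helper are
    hypothetical and only model the idea described in the text.
    """
    groups = defaultdict(list)
    for name, data in files.items():
        groups[hashlib.md5(data).hexdigest()].append(name)
    # Only hashes shared by two or more files count as duplicates.
    return {h: sorted(names) for h, names in groups.items() if len(names) > 1}


# Made-up example data: two files share content under different names.
files = {
    "a/report.txt": b"quarterly numbers",
    "b/copy-of-report.txt": b"quarterly numbers",
    "c/notes.txt": b"something else",
}
duplicates = group_by_hash(files)
```

Here the two report files end up in one group even though their names differ, which is the case that deduping by name would not catch.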

If deduping by name, rclone will first merge directories with the same
name. It will do this iteratively until all the identically named
directories have been merged.

Next, if deduping by name, for every group of duplicate file names /
hashes, it will delete all but one of the identical files it finds
without confirmation. This means that for most duplicated files the
`dedupe` command will not be interactive.

`dedupe` considers files to be identical if they have the
same file path and the same hash. If the backend does not support
hashes (e.g. crypt wrapping Google Drive) then they will never be
found to be identical. If you use the `--size-only` flag then files
will be considered identical if they have the same size (any hash
will be ignored). This can be useful on crypt backends which do not
support hashes.
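
The rule above can be modelled with a small sketch. The `Entry` record and the `identical` function are hypothetical names used only for illustration, and a missing hash is represented as `None` by assumption; this is not how rclone itself is written.

```python
from collections import namedtuple

# Hypothetical record for one listed object; not an rclone type.
Entry = namedtuple("Entry", "path size md5")


def identical(a, b, size_only=False):
    """Return True if two entries count as identical in this sketch."""
    if a.path != b.path:
        return False
    if size_only:
        # Models --size-only: compare sizes and ignore any hash.
        return a.size == b.size
    # Without a hash (None) two entries can never be matched.
    return a.md5 is not None and a.md5 == b.md5


# Sizes and hashes taken from the example session in this page.
big_1 = Entry("one.txt", 6048320, "1eedaa9fe86fd4b8632e2ac549403b36")
big_2 = Entry("one.txt", 6048320, "1eedaa9fe86fd4b8632e2ac549403b36")
small = Entry("one.txt", 564374, "7594e7dc9fc28f727c42ee3e0749de81")

# Made-up entries from a backend that reports no hash.
no_hash_1 = Entry("photo.jpg", 1000, None)
no_hash_2 = Entry("photo.jpg", 1000, None)
```

With these values, `identical(big_1, big_2)` holds, `identical(big_1, small)` does not, and the two hashless entries only match when `size_only=True` is passed.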

Next rclone will resolve the remaining duplicates. Exactly which
action is taken depends on the dedupe mode. By default rclone will
interactively query the user for each one.

**Important**: Since this can cause data loss, test first with the
`--dry-run` or the `--interactive`/`-i` flag.

Here is an example run.

Before - with duplicates

    $ rclone lsl drive:dupes
      6048320 2016-03-05 16:23:16.798000000 one.txt
      6048320 2016-03-05 16:23:11.775000000 one.txt
       564374 2016-03-05 16:23:06.731000000 one.txt
      6048320 2016-03-05 16:18:26.092000000 one.txt
      6048320 2016-03-05 16:22:46.185000000 two.txt
      1744073 2016-03-05 16:22:38.104000000 two.txt
       564374 2016-03-05 16:22:52.118000000 two.txt

Now the `dedupe` session

    $ rclone dedupe drive:dupes
    2016/03/05 16:24:37 Google drive root 'dupes': Looking for duplicates using interactive mode.
    one.txt: Found 4 files with duplicate names
    one.txt: Deleting 2/3 identical duplicates (MD5 "1eedaa9fe86fd4b8632e2ac549403b36")
    one.txt: 2 duplicates remain
      1:      6048320 bytes, 2016-03-05 16:23:16.798000000, MD5 1eedaa9fe86fd4b8632e2ac549403b36
      2:       564374 bytes, 2016-03-05 16:23:06.731000000, MD5 7594e7dc9fc28f727c42ee3e0749de81
    s) Skip and do nothing
    k) Keep just one (choose which in next step)
    r) Rename all to be different (by changing file.jpg to file-1.jpg)
    s/k/r> k
    Enter the number of the file to keep> 1
    one.txt: Deleted 1 extra copies
    two.txt: Found 3 files with duplicate names
    two.txt: 3 duplicates remain
      1:       564374 bytes, 2016-03-05 16:22:52.118000000, MD5 7594e7dc9fc28f727c42ee3e0749de81
      2:      6048320 bytes, 2016-03-05 16:22:46.185000000, MD5 1eedaa9fe86fd4b8632e2ac549403b36
      3:      1744073 bytes, 2016-03-05 16:22:38.104000000, MD5 851957f7fb6f0bc4ce76be966d336802
    s) Skip and do nothing
    k) Keep just one (choose which in next step)
    r) Rename all to be different (by changing file.jpg to file-1.jpg)
    s/k/r> r
    two-1.txt: renamed from: two.txt
    two-2.txt: renamed from: two.txt
    two-3.txt: renamed from: two.txt
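
The naming pattern seen in the rename step (`file.jpg` becoming `file-1.jpg`) can be sketched as follows. The `renamed` helper is hypothetical and deliberately simple: it assumes numbering starts at 1 and does not check whether a generated name is already taken on the remote.

```python
import os.path


def renamed(name, count):
    """Return `count` distinct names by inserting -1, -2, ... before the extension.

    Hypothetical helper for illustration; no collision checks are done.
    """
    stem, ext = os.path.splitext(name)
    return [f"{stem}-{i}{ext}" for i in range(1, count + 1)]


new_names = renamed("two.txt", 3)
```

For the three copies of `two.txt` in the session above, this yields `two-1.txt`, `two-2.txt` and `two-3.txt`.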

The result being

    $ rclone lsl drive:dupes
      6048320 2016-03-05 16:23:16.798000000 one.txt
       564374 2016-03-05 16:22:52.118000000 two-1.txt
      6048320 2016-03-05 16:22:46.185000000 two-2.txt
      1744073 2016-03-05 16:22:38.104000000 two-3.txt

Dedupe can be run non-interactively using the `--dedupe-mode` flag or by passing the mode as an extra parameter with the same value:

  * `--dedupe-mode interactive` - interactive as above.
  * `--dedupe-mode skip` - removes identical files then skips anything left.
  * `--dedupe-mode first` - removes identical files then keeps the first one.
  * `--dedupe-mode newest` - removes identical files then keeps the newest one.
  * `--dedupe-mode oldest` - removes identical files then keeps the oldest one.
  * `--dedupe-mode largest` - removes identical files then keeps the largest one.
  * `--dedupe-mode smallest` - removes identical files then keeps the smallest one.
  * `--dedupe-mode rename` - removes identical files then renames the rest to be different.
  * `--dedupe-mode list` - lists duplicate dirs and files only and changes nothing.
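
To make the "keep one" modes concrete, here is a minimal sketch of how a survivor might be chosen from the duplicates that remain after identical files are removed. `Entry` and `pick_keeper` are hypothetical names, and "first" is assumed to mean the first entry in listing order; none of this is rclone's actual code.

```python
from collections import namedtuple
from datetime import datetime

# Hypothetical record for one remaining duplicate; not an rclone type.
Entry = namedtuple("Entry", "size modified")


def pick_keeper(entries, mode):
    """Return the entry to keep for the given mode; the rest would be deleted."""
    if mode == "first":
        return entries[0]  # assumed: first in listing order
    if mode == "newest":
        return max(entries, key=lambda e: e.modified)
    if mode == "oldest":
        return min(entries, key=lambda e: e.modified)
    if mode == "largest":
        return max(entries, key=lambda e: e.size)
    if mode == "smallest":
        return min(entries, key=lambda e: e.size)
    raise ValueError(f"mode {mode!r} does not keep a single file")


# The three two.txt duplicates from the example session, in the order shown.
two_txt = [
    Entry(564374, datetime(2016, 3, 5, 16, 22, 52, 118000)),
    Entry(6048320, datetime(2016, 3, 5, 16, 22, 46, 185000)),
    Entry(1744073, datetime(2016, 3, 5, 16, 22, 38, 104000)),
]
```

With this data, `newest` and `smallest` both select the 564374 byte copy, `largest` selects the 6048320 byte copy, and `oldest` selects the 1744073 byte copy.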

For example, to rename all the identically named photos in your Google Photos directory, do

    rclone dedupe --dedupe-mode rename "drive:Google Photos"

Or

    rclone dedupe rename "drive:Google Photos"


```
rclone dedupe [mode] remote:path [flags]
```

## Options

```
      --by-hash              Find identical hashes rather than names
      --dedupe-mode string   Dedupe mode interactive|skip|first|newest|oldest|largest|smallest|rename. (default "interactive")
  -h, --help                 help for dedupe
```

See the [global flags page](/flags/) for global options not listed here.

## SEE ALSO

* [rclone](/commands/rclone/) - Show help for rclone commands, flags and backends.