[Lilux-help] Duplicate files

Brent Frère brent at bfrere.net
Sun Feb 20 12:19:38 CET 2005


Patrick Useldinger wrote:

> Brent Frère wrote:
>
>> Sorry. This looks interesting to me but I don't fully understand what 
>> you are trying to do.
>
> Find duplicate files (i.e. same length and content, not necessarily 
> same name) on a single file-system.

I have no time to write the script properly, but this is indeed
interesting, especially when you run rsync and a folder on the source
side has been renamed. It would be great to have the renamed folder
detected and renamed on the destination side instead of re-copying all
the files...

Here is the idea.
Run find. For each file, compute an md5sum. Sort the result. Detect the
sets of files that share the same md5sum. Do a binary compare of each
pair of such files. If they match, you have found a duplicate!

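Roughly speaking (uniq -w32 -D is GNU-specific: it keeps every line whose
first 32 characters, the md5sum, occur more than once):

# find . -type f -exec md5sum {} \; | sort > md5sum.lst
# uniq -w32 -D md5sum.lst > md5sum.dup
# while read -r sum file; do
 >    [ "$sum" = "$prev" ] && cmp -s "$prevfile" "$file" && echo "$prevfile = $file"
 >    prev=$sum; prevfile=$file
 >    done < md5sum.dup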
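
The same steps (find, md5sum, sort, then cmp on files whose checksums
match) can be wrapped in a small shell function. This is only a minimal
sketch, not a finished tool: the name find_dups is made up for
illustration, and it assumes GNU md5sum and file names that contain no
newlines or backslashes.

```shell
#!/bin/sh
# find_dups DIR: print "fileA = fileB" for each pair of adjacent files
# (in checksum order) under DIR that are byte-for-byte identical.
# Hypothetical helper; assumes GNU md5sum and file names without
# newlines or backslashes.
find_dups() {
    prev_sum=''
    prev_file=''
    # md5sum prints "<32 hex digits>  <path>"; sorting groups equal sums.
    find "${1:-.}" -type f -exec md5sum {} + | LC_ALL=C sort |
    while read -r sum file; do
        # Same checksum as the previous line: confirm with a binary compare.
        if [ "$sum" = "$prev_sum" ] && cmp -s "$prev_file" "$file"; then
            printf '%s = %s\n' "$prev_file" "$file"
        fi
        prev_sum=$sum
        prev_file=$file
    done
}
```

Calling find_dups /some/dir prints one line per confirmed pair. Only
neighbouring entries are compared, so three identical files a, b and c
are reported as "a = b" and "b = c".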

-- 
Brent Frère

Private e-mail:  Brent at BFrere.net

Postal address: 5, rue de Mamer
                L-8280 Kehlen
                Grand-Duchy of Luxembourg
                European Union

Mobile: +352-021/29.05.98
Fax:    +352-26.30.05.96
Home:   +352-307.341
URL:    http://BFrere.net

If you have a problem with my digital signature, please install the appropriate authority certificate by browsing to https://www.cacert.org/certs/root.crt.
