Couple of tools for Duplicate Photos and Media Titles

Despite saving nearly all of my photos online straight from my phone, I still have many photos and other media stored locally. This is partly because I have photos from other cameras, taken now and over the years, and other media (my music library, for example).

However, curation is a bit of a headache, so I only try to clean up every once in a while. Here are a few of the commands that I use to remove duplicates, fix titles or make other changes.

Music and Movie Titles

To check titles on media files like movies or music, I print out the filename (easy to find via ls!) and the title using exiftool like this (the line break between File Name and Title is important to grep -F):

ls -1 | while IFS= read -r a; do echo "$a"; exiftool "$a" | grep -F "File Name
Title" ; echo " "; done

If I want to update the title:

exiftool -Title=Zombie_Island my_homemade_zombie_movie.m4v

Duplicate Photos

To find duplicate photos, you could rely on the filename, but you might find that the same filename (image_001.jpg) was used for a number of images. So, the best thing to do is compare checksums on the files:

find . -type f -exec md5sum {} \; | sort | uniq --all-repeated=separate -w 32 > dupes.txt
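
Here's a small sketch of why checksums beat filenames. It builds a scratch directory with made-up file names (they are only for illustration) and runs the same pipeline on it. It assumes GNU coreutils, since --all-repeated and -w are GNU uniq options, and it compares the full 32 characters of the MD5.

```shell
# Sketch only: the scratch directory and file names are hypothetical.
demo=$(mktemp -d)
mkdir "$demo/a" "$demo/b"
printf 'same pixels\n'  > "$demo/a/image_001.jpg"
printf 'same pixels\n'  > "$demo/b/holiday.jpg"     # same content, different name
printf 'other pixels\n' > "$demo/b/image_001.jpg"   # same name, different content

# Checksum every file, sort so equal checksums sit next to each other,
# then keep only the lines whose first 32 characters (the MD5) repeat.
dupes=$(cd "$demo" && find . -type f -exec md5sum {} \; | sort | uniq --all-repeated=separate -w 32)

# Lists ./a/image_001.jpg and ./b/holiday.jpg; ./b/image_001.jpg is not a duplicate.
printf '%s\n' "$dupes"
```

The two files called image_001.jpg are not reported as a pair, while the renamed copy is caught.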

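and find the duplicates to keep (the first file in each group; the extra blank line up front makes grep pick up the first group too, since uniq only puts blank lines between groups):
{ echo; cat dupes.txt; } | grep -A1 '^$' | grep / > dup-keep.txt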
wc -l dup-keep.txt #to check how many

Find the files to move to a holding directory before deleting them. Moving first is safer than deleting straight away, in case something goes wrong:
grep -vxFf dup-keep.txt dupes.txt > dup-move.txt # -xF matches whole lines literally, so dots or brackets in filenames are not read as regex; slow if there are many...
grep -xFf dup-keep.txt dup-move.txt  # should be empty
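
As an alternative to the two grep passes, the keep and move lists can be written in one pass with awk. This is only a sketch: split_dupes is a made-up helper name, and it assumes the input has the dupes.txt layout (md5sum lines, with groups separated by blank lines).

```shell
# Sketch: split a grouped checksum listing into a keep list and a move list.
# split_dupes is a hypothetical helper: $1 = grouped listing, $2 = keep list, $3 = move list.
split_dupes() {
    : > "$2"    # make sure both output files exist, even if nothing is written
    : > "$3"
    awk -v keep="$2" -v move="$3" '
        BEGIN { first = 1 }             # the very first line starts a group
        /^$/  { first = 1; next }       # blank line: the next entry starts a new group
        first {                         # first entry of a group is the one to keep
            print > keep
            first = 0
            next
        }
        { print > move }                # every other entry is a duplicate to move
    ' "$1"
}
```

Usage would be: split_dupes dupes.txt dup-keep.txt dup-move.txt. The move list it writes has no blank lines in it.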

Remove the empty lines and check that the counts match:
grep -v '^$' dup-move.txt > dup-m1.txt
grep -c / dup-move.txt dup-m1.txt
mv dup-m1.txt dup-move.txt

Test out "moving" the duplicates into the duplicates directory. I say "moving" as it's really a copy and delete:

mkdir duplicates

cut -d" " -f3- dup-move.txt | head | cpio -pvd duplicates
cut -d" " -f3- dup-move.txt | head | while IFS= read -r a; do echo "$a"; \rm "$a" ; done

If that looks good, then do the rest:
cut -d" " -f3- dup-move.txt | cpio -pvd duplicates
cut -d" " -f3- dup-move.txt | while IFS= read -r a; do echo "$a"; \rm "$a" ; done
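
If cpio isn't available, a plain mkdir and mv loop is another option. This is a sketch of a different technique, not the copy-and-delete above: move_dupes is a made-up helper name, and it assumes the list is in md5sum format (hash, two spaces, path) with relative paths.

```shell
# Sketch: move each listed file into a holding directory, keeping its relative path.
# move_dupes is a hypothetical helper: $1 = list in md5sum format, $2 = holding directory.
move_dupes() {
    list=$1
    dest=$2
    cut -d" " -f3- "$list" | while IFS= read -r path; do
        [ -n "$path" ] || continue                 # skip blank separator lines
        # Only move the file once its target directory exists.
        mkdir -p "$dest/$(dirname "$path")" &&
            mv -v -- "$path" "$dest/$path"
    done
}
```

Usage would be: move_dupes dup-move.txt duplicates. A file only leaves its original location if the mv itself succeeds, so there is no separate delete step.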

Check for duplicates in the local directory again:

mv duplicates ../
find . -type f -exec md5sum {} \; | sort | uniq --all-repeated=separate -w 32 > dupes.txt
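
Before finally deleting the holding directory, it can be worth checking that every file in it still has a copy with the same checksum in the main tree. Here is a rough sketch of that check. missing_twins is a made-up helper name, and it assumes filenames without backslashes or newlines (md5sum prefixes those lines with a backslash, which would throw off the cut).

```shell
# Sketch: print checksums of held files that have no same-content copy left in the tree.
# missing_twins is a hypothetical helper: $1 = holding directory, $2 = main tree.
missing_twins() {
    held=$1
    tree=$2
    tmp=$(mktemp -d) || return 1
    # Collect the unique MD5s (first 32 characters of each md5sum line) on each side.
    find "$held" -type f -exec md5sum {} \; | cut -c1-32 | sort -u > "$tmp/held"
    find "$tree" -type f -exec md5sum {} \; | cut -c1-32 | sort -u > "$tmp/tree"
    # comm -23 prints lines that appear only in the first file.
    comm -23 "$tmp/held" "$tmp/tree"
    rm -r "$tmp"
}
```

Usage would be: missing_twins ../duplicates . and empty output means every held file still has a twin in the current directory.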

