How do I detect duplicates?

flossie
Posts: 23
Joined: 08 May 12 10:25
Location: Christchurch, New Zealand

How do I detect duplicates?

Post by flossie » 15 Nov 13 10:26

Hi folks

I'm wondering if there's a nice easy way looking me right in the face to detect duplicates?

I've had to reimport a bunch of images a few times due to accidentally assigning the wrong renaming profile to them. It would be GREAT if I could easily clean up my mess.

Thanks!
Laura.

Hert
Posts: 21124
Joined: 13 Sep 03 7:24

Re: How do I detect duplicates?

Post by Hert » 15 Nov 13 13:00

Hi Laura,

I figured that it would be simple for me to script something together, but then I ran into a limitation. So I have released a minor update that gives the scripter a few more instruments to work with databases.

Attached is a script that will give you duplicates, based on the FileSize. That may give a few false hits in the rare case where you have different files with exactly the same file size.
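
The attached laura.psc is not reproduced in this thread, so the following is only a rough Python sketch of the same idea, working on files on disk rather than on the catalog database: group files by their size in bytes and report every size shared by more than one file. The function name `group_by_size` is made up for illustration and is not part of the script or of the Scripter API.

```python
import os
from collections import defaultdict


def group_by_size(paths):
    """Return {size_in_bytes: [paths]} for every size shared by 2+ files.

    Hypothetical helper: it only mirrors the "same FileSize" test
    described above and knows nothing about the catalog database.
    """
    by_size = defaultdict(list)
    for path in paths:
        by_size[os.path.getsize(path)].append(path)
    # Keep only sizes that occur more than once; those are the candidates.
    return {size: sorted(group) for size, group in by_size.items() if len(group) > 1}
```

Every group returned is only a set of candidates, since two different files can still happen to share a size.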

BUT: first upgrade to build 177. The script won't run without that update. You can download the 177 update from the website or your Control Panel at https://cp.idimager.com

After installing build 177, save the attached script and then open it in Tools -> Scripter. There, click the Run button to execute the script.

Hope that helps

Hert
Attachments
laura.psc
(2.87 KiB) Downloaded 135 times
This is a User-to-User forum which means that users post questions here for other users.
Feature requests, change suggestions, or bugs can be logged in the ticketing system

David Grundy
Posts: 326
Joined: 13 May 07 16:40
Location: Hong Kong

Re: How do I detect duplicates?

Post by David Grundy » 15 Nov 13 15:03

** OT **

Gosh. I'm kinda hoping that Laura will now say that she has many Raw files all the same size, and we'll see some other weird scripts (although I admit I can't see how to use any part of this one in a more general context!)

I had a slightly surreal Google experience just now when I googled "ADs.openset" expecting to find something about Delphi or SQL and instead found
"Can a Closed Set Be Open? Can an Open Set Be Closed?"

Hert
Posts: 21124
Joined: 13 Sep 03 7:24

Re: How do I detect duplicates?

Post by Hert » 15 Nov 13 16:45

David,

Closed sets open...very deep :)

This script is very low level; in fact, FileSize can be replaced with idSignature to find dups based on file signatures. But then I found out that the file name is part of the signature, which makes it useless for detecting duplicate files with different names. So I fell back to the file size. Multiple files with identical sizes? Small chance.

flossie
Posts: 23
Joined: 08 May 12 10:25
Location: Christchurch, New Zealand

Re: How do I detect duplicates?

Post by flossie » 15 Nov 13 22:30

Great, thank you very much Hert! I'm sure others will find this useful too!

I can't be the only one who has multiple rename profiles and doesn't quite choose the right one 100% of the time :)

Cheers,
Laura.

*OT*

David,
Good luck with that. BTW the answer is yes. Clopen sets. Apparently. Some interesting wee concepts to visualise for the morning, thanks to you and Wikipedia! I figure if the analogy was colours and not 'geometry, topology or related branches of (boring old) mathematics', then,
Rainbow = an open set
PMS chart = a closed set
(And perhaps an optical illusion a clopen set?)

:)

george
Posts: 1162
Joined: 24 Jun 07 15:57
Location: USA

Re: How do I detect duplicates?

Post by george » 15 Nov 13 22:31

As a former math major, I can state definitively that sets can be open, closed, both or neither. I'll leave finding examples as an exercise for the reader.
George

David Grundy
Posts: 326
Joined: 13 May 07 16:40
Location: Hong Kong

Re: How do I detect duplicates?

Post by David Grundy » 16 Nov 13 5:37

Thanks Hert, interesting commentary.

I agree it is unlikely to find a coincidental file size match these days.

Just for the record then, some older raw formats were standard sizes. For example, just checking for crw files on my local drive now, I find that I had many .crw files all at 9,219,600 bytes, taken with a Canon A620 (running CHDK to get raw capability), before I converted most of them to DNG a while ago. I probably also have lots of other old crw files from various different cameras, but they're all DNGs now so I can't easily check the original crw filesizes. But I do remember being used to seeing screen upon screen of file lists with identical file sizes when transferring pictures from CF cards in the (relatively distant) past.
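
For fixed-size raw files like these, a size match alone cannot separate true duplicates from coincidences. One way around that, sketched here in Python as an assumption and not as anything the script in this thread does, is to treat equal size as a cheap first filter and then compare a hash of the file contents. The content hash ignores the file name, so renamed copies still match. The names `sha256_of` and `confirmed_duplicates` are hypothetical.

```python
import hashlib
import os
from collections import defaultdict


def sha256_of(path, chunk_size=1 << 20):
    """Hash the file contents in chunks so large raw files are not loaded whole."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def confirmed_duplicates(paths):
    """Return lists of paths whose size AND contents are identical."""
    by_size = defaultdict(list)
    for path in paths:
        by_size[os.path.getsize(path)].append(path)

    groups = []
    for same_size in by_size.values():
        if len(same_size) < 2:
            continue  # a unique size cannot be a duplicate
        by_hash = defaultdict(list)
        for path in same_size:
            by_hash[sha256_of(path)].append(path)
        groups.extend(sorted(g) for g in by_hash.values() if len(g) > 1)
    return groups
```

Hashing only the files that already share a size keeps the expensive step small in a catalog where most sizes are unique.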

... David

David Grundy
Posts: 326
Joined: 13 May 07 16:40
Location: Hong Kong

Re: How do I detect duplicates?

Post by David Grundy » 16 Nov 13 6:07

** OT - clopen sets etc **

Seeing the question took me back unexpectedly to my younger days; I once knew lots of set theory without having to think hard about it. More recently I've spent a lot of time developing and using stochastic cashflow projection models, and have rather wished I'd focussed on probability and statistics rather than Pure Math at uni. Still, it's good to be reminded of this stuff.

... David

mphillips
Posts: 2035
Joined: 31 May 07 12:02
Location: Parkwood,Johannesburg,South Africa

Re: How do I detect duplicates?

Post by mphillips » 18 Nov 13 6:17

Hi There

I am feeling a little slow this morning :-)

I ran the script and came up with about 2200 files in my database.

Now I want to see the filesize of the "offending" pictures - and I can't, for the life of me, remember where to see an image's file size. I tried the Image Details, Technical - Nothing, I tried the Info Button but there are no "file properties" displayed there, I tried grid view - no joy either. I know that I can do a thumbs script but surely I should not have to resort to that to see basic file props ?

Also how do I sort by file size ? There is no column in grid view that I could see.

Thanks

MikeP
Mike Phillips
http://www.mikeandmorag.co.za
D800, CNX2, Supreme

Hert
Posts: 21124
Joined: 13 Sep 03 7:24

Re: How do I detect duplicates?

Post by Hert » 18 Nov 13 7:21

Mike,

The file size is displayed in the Info Panel.

Hope that helps

Hert

[update; double checked and the file size is not in the Grid by default]

mphillips
Posts: 2035
Joined: 31 May 07 12:02
Location: Parkwood,Johannesburg,South Africa

Re: How do I detect duplicates?

Post by mphillips » 18 Nov 13 8:34

Hi Hert

Thanks for the fast response.

Yes - There is an "approximate and rounded" file size in the Info Panel - I guess I was looking for more details / exact - maybe in image details.

Also I cannot see the column to tick in the Grid View.
[Attached image: 18-11-2013 09-31-16.jpg]
Thanks

MikeP

Hert
Posts: 21124
Joined: 13 Sep 03 7:24

Re: How do I detect duplicates?

Post by Hert » 18 Nov 13 8:39

Mike,
mphillips wrote:Also I cannot see the column to tick in the Grid View.
I think that I updated my post around the same time. It's not in the Grid indeed.

And indeed it is rounded. For now, to see the file size in detail, you can add %ImageFileSize as a custom field

Hert

mphillips
Posts: 2035
Joined: 31 May 07 12:02
Location: Parkwood,Johannesburg,South Africa

Re: How do I detect duplicates?

Post by mphillips » 18 Nov 13 8:39

Hi Hert

Just a follow up.

I added some Custom Fields to the Info Panel and got a strange result on file size versus file size short versus Windows Explorer.

Please see attached:
[Attached image: 18-11-2013 09-37-02.jpg]
Should these not all be approximately the same?

Thanks

MikeP

Hert
Posts: 21124
Joined: 13 Sep 03 7:24

Re: How do I detect duplicates?

Post by Hert » 18 Nov 13 8:48

Mike,

A file size of 9150002 bytes

equals

9150002/1024 ≈ 8936 KB

equals

9150002/(1024*1024) ≈ 8.7 MB
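
The same arithmetic as a small Python sketch, for anyone who wants to check other sizes. The function name `size_in_units` is made up for illustration; the rounding (whole KB, one decimal for MB) is chosen to match the figures above and is not taken from the program itself.

```python
def size_in_units(size_bytes):
    """Convert a byte count to (KB, MB) using 1024-based units."""
    kb = size_bytes / 1024            # 1 KB = 1024 bytes
    mb = size_bytes / (1024 * 1024)   # 1 MB = 1024 KB
    return round(kb), round(mb, 1)


print(size_in_units(9150002))  # prints (8936, 8.7)
```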

Hert

mphillips
Posts: 2035
Joined: 31 May 07 12:02
Location: Parkwood,Johannesburg,South Africa

Re: How do I detect duplicates?

Post by mphillips » 18 Nov 13 9:22

Hi Hert

You are quite correct !

How do I perform a calculation in a Custom Field, e.g. %filesize/1024, to return the same result as Windows Explorer?

Thanks

MikeP
