delete unwanted orphaned keywords from files

peter wijn
Posts: 37
Joined: 24 Oct 11 18:55

Post by peter wijn »

Hi there,

Perhaps some of you could help me on the following.

I read:
http://forum.idimager.com/viewtopic.php ... s&start=15
which I think could be very instructive for me, except that I don't quite understand it. Non native English has something to do with that, probably.
Of course, I am trying to help myself by reading again and trying, but some tiny hints from people with more in-depth knowledge will surely speed things up.

I have a very nice set of hierarchical keywords in my database. I guess they should be called hierarchical catalog labels? It works. I do find my photos easily and rather fast.

When looking into the files and the database, I see that the entire hierarchy of labels is not always entered. Sometimes only the most detailed ones are, which is probably due to reorganising the structure at later stages. For instance, 'zonsopkomst' (sunrise) used to be a child of 'moments' in 'Others'; now it is related to 'licht' (light) in 'Styles'.
Problem 1): When I drag 'zonsopkomst' to 'licht', photos that were previously assigned to 'zonsopkomst' don't get the tag 'licht'; only newly assigned ones do (yes, remembering to select all of these and to check the label off and on again solves this in future use).
Problem 2): Simply dragging 'zonsopkomst' to 'licht' leaves 'moments' as a checked catalog label on the photo (same thing about remembering to uncheck). (I guess this could be called a 'widow' label / keyword.)
1) Is there a fast way to get the actual hierarchy of labels set to all the photos in the database? I do nature photography, with all kinds of plants, birds, insects, so I've got quite a few entries to run through.

Next, like Harald, I keep finding a series of old and faulty keywords in the photos, which are not related to my desired PSU catalog labels (that is the difference from Harald's case). I am seeing those again now while looking for a substitute for Capture NX2 as a raw converter.
I thought I had cleaned these up years ago while building up an ID-Imager database from keywords written by iView / Microsoft Expression Media (that took quite some effort). But it appears they are still there (at least partly). While typing this, I realise this could be due to putting back old files from a back-up after a HD crash (Ouff, with quite some luck I was 100% safe, except for some work...). However, I am not really sure that I cleaned the files, my focus being on the catalogue at the time.
2) What would be the fastest way to delete orphaned keywords from the files? I hope this can be done without needing to mess up my quite nice and rather tidy database by reading the keywords from the files again and entering them as new catalog labels.
2a) Would running through all directories and applying 'synchronise XMP' be enough? Or, 2b) is it better/faster to run some other program to clean the EXIF and rewrite the desired labels back to the files as keywords after that? If so, 2c) which software would you advise for Nikon NEFs (99%), JPEGs and TIFFs?
(Yes, I am very confident that these old tags are rather a pain and I am sure I want to get rid of all of them. Some snowy Dutch pictures are tagged Oman and klauwier.)

3) What am I doing wrong in the following: when I change some catalog label from 'A, B' to 'A B' (without the ','), I still find a lot of 'A, B' tags in the files - and 'A B' is missing - while synchronisation is on, and has been running and snoring wildly on the process? It seems synchronisation doesn't always keep up? This also happens when updating catalogue labels like 'firstname' to 'Firstname Lastname'. Database ok, (some) files not ok.

4) Please do advise me to make a back-up in advance ... ;)

(... Hey, in case 2a) is not enough, perhaps I could - 2d) - work in a copy of my database? Read all keywords from EXIF into catalog labels in that copy (probably for each directory separately), and subsequently delete all catalog labels with synchronisation on. That could be an extremely fast operation when deleting the parent catalog labels 2e), but it should probably be done per directory to avoid new orphans 2f) (also considering point 3)? And, after that, delete the copy of the catalog, open the good old catalog and synchronise the directories again to load the correct labels into keywords... This 2c) and 2f) would imply running through the directories twice, so I just hope 2a) turns out to be enough...)

My first action will be: check again if 2a) works after all on a directory where I find orphaned keywords tagged to the files. I'll report in a few hours.
Next, I'd like to understand point 3 first, since this could be crucial in the process.

thanks for reading, and possibly your reply
HaraldE
Posts: 267
Joined: 29 Apr 07 21:30
Location: Bålsta, Sweden

Post by HaraldE »

Hello Peter,

Just wanted to say I will probably not be of much help for a long while. As you say, my case is easier since I want to have a clean get-out and then start from scratch.
The easy part is to get rid of the words from the catalog; this I can do from within the Catalog (By Category). The trickier part is to make sure they are also removed from the metadata in the files.
And I have not yet decided on the best way here. I need time for some careful testing.

Also, I did get this valid comment from Mike:
"consider determining your overall cataloging convention before tidying up your previous efforts."
This is also part of my thinking and to be honest I will probably not have time for this till next year.

Good luck, Harald
vlad
Posts: 895
Joined: 01 Sep 08 14:20

Post by vlad »

Hi Peter,
peter wijn wrote:Hi there,

Perhaps some of you could help me on the following.
Ok, I'm going to try helping you with some ideas, hopefully at least some of them will work.
peter wijn wrote:I read:
http://forum.idimager.com/viewtopic.php ... s&start=15
which I think could be very instructive for me, except that I don't quite understand it. Non native English has something to do with that, probably.
Rest assured, your non native English is excellent. (It's up to you to take this as a compliment or not, though, as I'm a non native English speaker myself :wink: )
peter wijn wrote:Of course, I am trying to help myself by reading again and trying, but some tiny hints from people with more in-depth knowledge will surely speed things up.

I have a very nice set of hierarchical keywords in my database. I guess they should be called hierarchical catalog labels? It works. I do find my photos easily and rather fast.

When looking into the files and the database, I see that the entire hierarchy of labels is not always entered. Sometimes only the most detailed ones are, which is probably due to reorganising the structure at later stages. For instance, 'zonsopkomst' (sunrise) used to be a child of 'moments' in 'Others'; now it is related to 'licht' (light) in 'Styles'.
When you say "it is related", I assume you actually mean that 'zonsopkomst' has been relocated under 'licht'. Also, I am inferring from the rest of your post that the "Also assign its parents" setting is enabled in the 'zonsopkomst' label. Is that correct?
peter wijn wrote:Problem 1): When I drag 'zonsopkomst' to 'licht', photos that were previously assigned to 'zonsopkomst' don't get the tag 'licht'; only newly assigned ones do (yes, remembering to select all of these and to check the label off and on again solves this in future use).
AFAIK, this is by design - the reasoning being that a label relocation (or a change in the label details, for that matter) does (should?) not automatically trigger the label reassignment (along with the updated parent chain). I believe there is room here for one or more feature requests and/or custom relocation scripts - in the meantime, the solution is indeed to manually reassign the relocated label.
peter wijn wrote:Problem 2): Simply dragging 'zonsopkomst' to 'licht' leaves 'moments' as a checked catalog label on the photo (same thing about remembering to uncheck). (I guess this could be called a 'widow' label / keyword.)
Yep, the same reasoning applies: updating the label's parent does not automatically unassign the old parent (which errs on the side of caution, but requesting and acting on the user's input would be very nice).
peter wijn wrote:1) Is there a fast way to get the actual hierarchy of labels set to all the photos in the database? I do nature photography, with all kinds of plants, birds, insects, so I've got quite a few entries to run through.
Well...maybe: there is a very old (September 2008) IdImager script for fixing the parent assign chain. (As a historical detail, I believe it was kindly provided by Hert just after I hit the same problem as you, shortly after I had started using IdImager.) I have no idea if it still works (correctly) in Photo Supreme - if it does not, someone (myself included) might be able to update it accordingly. Let us know if you try it. (However, keep in mind that it will not magically unassign the old parent chain, IIRC.)
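For illustration only, the idea behind such a parent-chain fix can be sketched in Python. This is not Hert's script and it does not use any Photo Supreme or IdImager API: the label tree (`PARENT`), the function names, and the choice to include the top-level category are all assumptions made for the example. It only shows how the missing parents of an assigned label could be computed.

```python
from typing import Dict, List, Optional, Set

# Hypothetical label tree: child -> parent (None marks a top-level category).
PARENT: Dict[str, Optional[str]] = {
    "zonsopkomst": "licht",
    "licht": "Styles",
    "Styles": None,
    "moments": "Others",
    "Others": None,
}


def ancestors(label: str, parent: Dict[str, Optional[str]]) -> List[str]:
    """Return the chain of parents for a label, nearest parent first."""
    chain: List[str] = []
    seen = {label}  # guards against accidental cycles in the tree
    current = parent.get(label)
    while current is not None and current not in seen:
        chain.append(current)
        seen.add(current)
        current = parent.get(current)
    return chain


def with_parents(assigned: Set[str], parent: Dict[str, Optional[str]]) -> Set[str]:
    """Return the assigned labels plus every parent in their current chains."""
    result = set(assigned)
    for label in assigned:
        result.update(ancestors(label, parent))
    return result


if __name__ == "__main__":
    # A photo tagged before the move: it still carries the old parent.
    print(sorted(with_parents({"zonsopkomst", "moments"}, PARENT)))
```

Note that the old parent 'moments' (and its own parent) stays in the result: the sketch only adds the current chain and never unassigns anything, which matches the caveat above.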
peter wijn wrote:Next, like Harald, I keep finding a series of old and faulty keywords in the photos, which are not related to my desired PSU catalog labels (that is the difference from Harald's case).
You're saying "desired PSU catalog labels", but I'm inferring (again) that what you mean is that there are currently no PSU catalog labels (desired or not) defined for those keywords. Is that correct?
peter wijn wrote:2) What would be the fastest way to delete orphaned keywords from the files? I hope this can be done without needing to mess up my quite nice and rather tidy database by reading the keywords from the files again and entering them as new catalog labels.
What you could do, I believe, is to set Preferences -> Write settings -> Keywords processing to "Replace keywords with Catalog labels". (I'm inferring - yet again - that you don't have it already set; is that correct?) I don't know if this setting causes all images in the Catalog to be marked out-of-sync - if it does not, you may want to uncheck the "Only update out-of-sync images" write setting and then save the metadata for all cataloged images (or only for the relevant collection of images with orphaned keywords, if you are able to identify it). (Not sure if "Tools -> Save Metadata to File for All Out-Of-Sync Images" would work in this case or not.)
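As a rough illustration of what replacing file keywords with catalog labels amounts to for a single image, here is a small Python sketch. It does not call Photo Supreme and says nothing about how the program implements the setting; the function name `plan_keyword_rewrite` and the sample keywords 'sneeuw' and 'winter' are made up for the example (Oman and klauwier are the stray tags mentioned in this thread).

```python
from typing import Dict, Iterable, List


def plan_keyword_rewrite(
    file_keywords: Iterable[str], catalog_labels: Iterable[str]
) -> Dict[str, List[str]]:
    """Compare the keywords found in a file with the labels in the catalog.

    'remove' lists orphaned keywords (in the file, unknown to the catalog),
    'add' lists labels missing from the file, 'keep' lists the overlap.
    """
    file_set = set(file_keywords)
    catalog_set = set(catalog_labels)
    return {
        "remove": sorted(file_set - catalog_set),
        "add": sorted(catalog_set - file_set),
        "keep": sorted(file_set & catalog_set),
    }


if __name__ == "__main__":
    plan = plan_keyword_rewrite(
        file_keywords=["Oman", "klauwier", "sneeuw"],
        catalog_labels=["sneeuw", "winter"],
    )
    for action, keywords in plan.items():
        print(action, keywords)
```

The same comparison also exposes the renaming case from point 3: a stale 'A, B' in the file would show up under "remove" and the new 'A B' under "add".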
peter wijn wrote:2a) Would running through all directories and applying 'synchronise XMP' be enough?
I'm not sure if that would be either useful or enough. If (and only if) your entire metadata in the catalog is up-to-date (that is: it reflects the desired state), I would just save the entire metadata, as previously explained.
peter wijn wrote:Or, 2b) is it better/faster to run some other program to clean the EXIF and rewrite the desired labels back to the files as keywords after that? If so, 2c) which software would you advise for Nikon NEFs (99%), JPEGs and TIFFs?
I can't advise you here. As a DAM application, I would say PSU should be ideal for achieving precisely the job you're describing - if it is not, then please explain why and/or report the encountered issues and shortcomings into Mantis.
peter wijn wrote:(Yes, I am very confident that these old tags are rather a pain and I am sure I want to get rid of all of them. Some snowy Dutch pictures are tagged Oman and klauwier.)
C'mon, are you sure it never snows in Oman? :lol:
peter wijn wrote:3) What am I doing wrong in the following: when I change some catalog label from 'A, B' to 'A B' (without the ','), I still find a lot of 'A, B' tags in the files - and 'A B' is missing - while synchronisation is on, and has been running and snoring wildly on the process? It seems synchronisation doesn't always keep up? This also happens when updating catalogue labels like 'firstname' to 'Firstname Lastname'. Database ok, (some) files not ok.
No idea. But I would try to isolate and reproduce the issue with a single image and a single label change. More specifically: pick such a file with stale metadata - is it marked out-of-sync? Does forcing a write-sync update the metadata correctly? What is the write setting for keyword processing?
peter wijn wrote:4) Please do advise me to make a back-up in advance ... ;)
Ok, here it is on record: I do advise you to make a back-up in advance! ;)
peter wijn wrote:(... Hey, in case 2a) is not enough, perhaps I could - 2d) - work in a copy of my database? Read all keywords from EXIF into catalog labels in that copy (probably for each directory separately), and subsequently delete all catalog labels with synchronisation on. That could be an extremely fast operation when deleting the parent catalog labels 2e), but it should probably be done per directory to avoid new orphans 2f) (also considering point 3)? And, after that, delete the copy of the catalog, open the good old catalog and synchronise the directories again to load the correct labels into keywords... This 2c) and 2f) would imply running through the directories twice, so I just hope 2a) turns out to be enough...)
This is getting (too) complicated... Maybe proceed along the lines I've suggested and see how it goes before implementing Plans B or C? (Of course, the database backup always comes before Plan A ;))
peter wijn wrote:My first action will be: check again if 2a) works after all on a directory where I find orphaned keywords tagged to the files. I'll report in a few hours.
Next, I'd like to understand point 3 first, since this could be crucial in the process.
Any result about 2a?

Hope that helps,
Vlad
Mike Buckley
Posts: 1194
Joined: 10 Jul 08 13:18

Post by Mike Buckley »

peter wijn wrote:Problem 1): When I drag 'zonsopkomst' to 'licht', photos that were previously assigned to 'zonsopkomst' don't get the tag 'licht'; only newly assigned ones do (yes, remembering to select all of these and to check the label off and on again solves this in future use). Problem 2): Simply dragging 'zonsopkomst' to 'licht' leaves 'moments' as a checked catalog label on the photo (same thing about remembering to uncheck). (I guess this could be called a 'widow' label / keyword.)
If I understand you correctly (and I may not), I wonder if your problems will be solved if you use a different method of making that happen. Instead of dragging and dropping, try editing the catalog label to change the parent. When I do that with automatic synching enabled, the change is made in the catalog and the image file just fine. I haven't tried dragging and dropping.
peter wijn
Posts: 37
Joined: 24 Oct 11 18:55

Post by peter wijn »

Thanks Mike, Vlad, Harald.
I shall do some homework with your recommendations.
vlad wrote:... What you could do, I believe, is to set Preferences -> Write settings -> Keywords processing to "Replace keywords with Catalog labels". (I'm inferring - yet again - that you don't have it already set; is that correct?) I don't know if this setting causes all images in the Catalog to be marked out-of-sync - if it does not, you may want to uncheck the "Only update out-of-sync images" write setting and then save the metadata for all cataloged images (or only for the relevant collection of images with orphaned keywords, if you are able to identify it). (Not sure if "Tools -> Save Metadata to File for All Out-Of-Sync Images" would work in this case or not.)

I'll start by checking out this one: "Replace keywords with Catalog labels"; I'm not sure about it. I'll also check whether the XMP sync under the right-click menu, which I have been using, is the same as the "save metadata to file..." thing you mention.
(Somewhere, I think I read from Hert that one shouldn't expect this to work when applied to an entire catalogue at once?)
(what I mean with 'non native' is also a difficulty in understanding of the exact meaning and implication in a real world workflow of words like tag, keyword, catalogue label, XMP-data, EXIF-information, IPTC-data, meta-data... I did read a bit about metadata, but a large part of it is unnecessary for my use. Perhaps I should switch my version of PSU to English so that it correlates directly with the advice in this forum.)
vlad
Posts: 895
Joined: 01 Sep 08 14:20

Post by vlad »

vlad wrote:you may want to uncheck the "Only update out-of-sync images" write setting and then save the metadata for all cataloged images
I have realized there is no quick way to select all cataloged images (or am I simply missing it?) - I have therefore submitted FR #2679.
Mike Buckley wrote:Instead of dragging and dropping, try editing the catalog label to change the parent. When I do that with automatic synching enabled, the change is made in the catalog and the image file just fine.
Mike, what is the exact change that you're talking about?

Even if the label has "Also assign its parents" enabled, changing its parent (whether in the details editor or by dragging and dropping) does not update the label assignments and the standard keyword field (xmp:dc:subject), although it does update the hierarchical info inside lr:hierarchicalSubject (assuming hierarchical keywords are enabled, of course). I assume Peter was talking about the former field, although I'm not 100% sure.
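To make the difference between the two fields concrete, here is a small Python sketch of what each could contain for one label. It is an illustration under stated assumptions, not a description of how Photo Supreme writes metadata: the helper names are hypothetical, the '|' separator is the one Lightroom uses in lr:hierarchicalSubject, and whether the top-level category ('Styles') is written at all is assumed here.

```python
from typing import List, Sequence


def flat_keywords(path: Sequence[str], assign_parents: bool) -> List[str]:
    """Keywords for the flat xmp:dc:subject field.

    With parent assignment, every level of the path becomes its own keyword;
    without it, only the label itself (the last element) is written.
    """
    return list(path) if assign_parents else [path[-1]]


def hierarchical_keyword(path: Sequence[str]) -> str:
    """Single pipe-delimited entry for lr:hierarchicalSubject."""
    return "|".join(path)


if __name__ == "__main__":
    path = ["Styles", "licht", "zonsopkomst"]
    print("dc:subject (parents assigned):", flat_keywords(path, assign_parents=True))
    print("dc:subject (label only):      ", flat_keywords(path, assign_parents=False))
    print("lr:hierarchicalSubject:       ", hierarchical_keyword(path))
```

The hierarchical entry always carries the full current path, so it follows a relocation by itself; the flat list only contains parents that were individually assigned at some point.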
vlad
Posts: 895
Joined: 01 Sep 08 14:20

Post by vlad »

peter wijn wrote:(what I mean with 'non native' is also a difficulty in understanding of the exact meaning and implication in a real world workflow of words like tag, keyword, catalogue label, XMP-data, EXIF-information, IPTC-data, meta-data...
Well, I think this is less about being a native English speaker and more about being a native speaker of the Adobe, DAM & IdImager jargon! (But, then again, I don't know anyone who is - although one may never know with the techie toddlers these days...) :lol:
Mke
Posts: 675
Joined: 15 Jun 14 14:39

Post by Mke »

vlad wrote: I have realized there is no quick way to select all cataloged images (or am I simply missing it?) - I have therefore submitted FR #2679.
You can get all cataloged images via "folders", but only if all your images are in one folder tree.
Mike Buckley
Posts: 1194
Joined: 10 Jul 08 13:18

Post by Mike Buckley »

Mike Buckley wrote:Instead of dragging and dropping, try editing the catalog label to change the parent. When I do that with automatic synching enabled, the change is made in the catalog and the image file just fine.
vlad wrote:Mike, what is the exact change that you're talking about? Even if the label has "Also assign its parents" enabled, changing its parent (whether in the details editor or by dragging and dropping) does not update the label assignments and the standard keyword field (xmp:dc:subject), although it does update the hierarchical info inside lr:hierarchicalSubject (assuming hierarchical keywords are enabled, of course). I assume Peter was talking about the former field, although I'm not 100% sure.
I'm referring to the hierarchy of Lightroom keywords. That's because if they're being used, if delimited keywords are not being used and if parents are not assigned, there is no hierarchy displayed in the xmp:dc:subject field. That's fine with me, as I don't see any practical benefit to displaying a hierarchy in the xmp:dc:subject field whether by assigning parents or using delimited keywords so long as the hierarchy of Lightroom keywords is being used.

Now that I realize the complexity of the situation Peter is enduring (I had not previously been aware of it), it seems to me that he might want to consider using those configurations I explained. The current hassle he is experiencing would be eliminated.
peter wijn
Posts: 37
Joined: 24 Oct 11 18:55

Post by peter wijn »

In the meantime, the first main problem is solved.

It's a combination of things written here:
- "As a DAM application, I would say PSU should be ideal for achieving precisely the job you're describing"
Realising this is important :). It should be possible.

- "set Preferences -> Write settings -> Keywords processing to 'Replace keywords with Catalog labels'"

- "save the metadata for all cataloged images"
I didn't use precisely that one for now, but right-clicked on a small set of selected images, to be in total control.

It turned out that I had repeated myself in some of my catalog labels. For instance, the label "Ophrys lupercalis Bruine Spiegelorchis" contained the synonyms "Ophrys lupercalis" and "bruine spiegelorchis", and its parent, the group "bruine spiegelorchis", still contained Ophrys sulcata (the first of this group I found; there are at least a dozen of these brown Ophrys, which are almost impossible to tell apart). All in all, these words occur three times among the keywords of the photo file. This looks like very old, inconsistent, I-am-not-in-control crap from Expression Media, but it is in fact more recent, self-produced crap ;). Sorry, Expression Media.
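A quick way to spot this kind of repetition, sketched in Python purely as an illustration: collapse a photo's keyword list case-insensitively and see what disappears. The function name and the sample list are assumptions for the example; the list only imitates the label, synonym and parent combination described above and is not read from any real file.

```python
from typing import Iterable, List


def dedupe_keywords(keywords: Iterable[str]) -> List[str]:
    """Drop repeated keywords, ignoring case and surrounding spaces.

    The first spelling encountered wins and the original order is kept.
    """
    seen = set()
    unique: List[str] = []
    for keyword in keywords:
        cleaned = keyword.strip()
        key = cleaned.casefold()
        if key and key not in seen:
            seen.add(key)
            unique.append(cleaned)
    return unique


if __name__ == "__main__":
    written = [
        "Ophrys lupercalis Bruine Spiegelorchis",  # the label itself
        "Ophrys lupercalis",                       # synonym
        "bruine spiegelorchis",                    # synonym
        "Bruine Spiegelorchis",                    # parent group, same words again
    ]
    print(dedupe_keywords(written))
```

This only catches whole keywords that repeat; a long label that merely contains the words of a shorter one, as in the first entry, still needs a manual look at the label and its synonyms.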

Remaining problems:
- assigning the new parents to photos when a catalog label has been moved from one parent to another. Well, these catalog labels are not all as consistent and nice and tidy as I thought them to be, so I've planned to do a bit of revisiting all of them. Which involves automated synchronisation now.
- the 'widow' parents of catalog labels. Well, these shouldn't be so big a problem, since they describe something that used to be relevant to the photos.

At work, and let's see ...
thanks a lot

And then I reread Mike's comment...
peter wijn
Posts: 37
Joined: 24 Oct 11 18:55

Post by peter wijn »

Mike Buckley wrote:... I'm referring to the hierarchy of Lightroom keywords. That's because if they're being used, if delimited keywords are not being used and if parents are not assigned, there is no hierarchy displayed in the xmp:dc:subject field. That's fine with me, as I don't see any practical benefit to displaying a hierarchy in the xmp:dc:subject field whether by assigning parents or using delimited keywords so long as the hierarchy of Lightroom keywords is being used.

Now that I realize the complexity of the situation Peter is enduring (I had not previously been aware of it), it seems to me that he might want to consider using those configurations I explained. The current hassle he is experiencing would be eliminated.
I think you wrote something which is valuable for me to understand. However, my 'non-nativeness' was hitting me hard on this one, Mike. I didn't have a clue.
So I tried googling "delimited keywords" and read the first entry:
http://manual.idimager.com/keywording.htm

It's a very good manual. I didn't understand it when I read it before when I installed ID-imager v5, but now I do (I think I do).

thanks.
Mike Buckley
Posts: 1194
Joined: 10 Jul 08 13:18

Post by Mike Buckley »

Peter,

For greater clarity about why you would choose to use Lightroom keywords and why you would choose not to use delimited keywords, review the PhotoSupreme QuickStart Cataloging manual beginning with the last sentence on page 20 and continuing on page 21.

Keep in mind, though, the other very relevant decision pertaining to whether to assign parent labels. That decision is not discussed in the information you or I are citing. Your current implementation of assigned parents is causing you issues. That's fine if there is a real benefit to having the parents assigned that makes it worthwhile having to deal with those issues. For me and my use, it's not at all worthwhile, so I don't assign the parent labels.
peter wijn
Posts: 37
Joined: 24 Oct 11 18:55

Post by peter wijn »

Humm, ah, even the normal present quickstart manual... Happily it starts at page 17 on my version, since page 21 and 22 are not present.
I couldn't read this before... (partly because my eyes stopped resolving my screen's resolution some time ago... I changed the size of the letters only recently...)
And because the translation of hierarchical keywords is 'geneste keywords', which I didn't match.
That's all solved.
Adding hierarchical/geneste keywords now probably offers the best future interchangeability.
Mike Buckley
Posts: 1194
Joined: 10 Jul 08 13:18

Post by Mike Buckley »

peter wijn wrote:Humm, ah, even the normal present quickstart manual... Happily it starts at page 17 on my version, since page 21 and 22 are not present.
It didn't help that I typed the incorrect page numbers pertaining to my version. I should have mentioned pages 20 and 21 (now corrected). My apologies!
lippe
Posts: 296
Joined: 12 Aug 06 11:26
Location: Wondelgem (Belgium)

Post by lippe »

peter wijn wrote:And because the translation of hierarchical keywords is 'geneste keywords', which I didn't match.
Peter, 'Geneste trefwoordtags' is the term used for hierarchical keywords in the Dutch documentation for Adobe Lightroom.
(Write hierarchical keywords = Geneste trefwoordtags schrijven)
Photo Supreme V6, LR6, darktable, FPV, PSE14 - vaio i5 @ 2.5GHz + 8GB , 850 EVO 500GB - WD 1TB - Windows 10 Pro 64 bits- DS216play - EOS 600D