Is the database size increase after face area detection plausible?

Robosoc
Posts: 38
Joined: 11 Apr 10 8:56
Location: Germany

Post by Robosoc »

I just finished automatic face recognition with the external tool TagThatPhoto (TTP), writing the area descriptions into the files (JPG), and ran a quick folder scan in PSU so that PSU would notice the changes in more than 20,000 images. I then read all the changed images into PSU to synchronize them.

In TTP I write faces and tags to the image metadata with a setting that adds both "Microsoft face regions" and "MWG face regions". I know this may be unnecessary duplicate information in the JPG files, but I wasn't sure what works best with PSU, and in a test with only 3 files it looked to me as if PSU worked well with this setting.

Before that import my database had a size of 566.233 MB; after the import, and after compacting, it shows 1,850.406 MB (about 1.85 GB). More than triple!
That does not seem explainable by area information and additional tags alone, does it? (Is compacting really doing its job?)
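As a quick sanity check of the growth, here is a small sketch of the arithmetic. It assumes the two figures above are megabytes (about 0.57 GB and 1.85 GB) and uses 20,000 images as a lower bound, so the per-image value is an upper bound, not a measurement.

```python
# Catalog sizes quoted in the post, read as megabytes (assumption about the notation).
size_before_mb = 566.233
size_after_mb = 1850.406

# Lower bound on the number of changed images (">20.000 images" in the post).
image_count = 20_000

growth_factor = size_after_mb / size_before_mb
added_mb = size_after_mb - size_before_mb

# With at least 20,000 images, this is the most each image can have added on average.
added_kb_per_image = added_mb * 1024 / image_count

print(f"growth factor: {growth_factor:.2f}x")
print(f"added: {added_mb:.0f} MB, at most about {added_kb_per_image:.0f} KB per image")
```

So the catalog grew by roughly 1.28 GB, which works out to a few tens of KB per image at most. Whether that is a plausible amount for face regions plus tags depends on what PSU stores per image, which this calculation cannot tell.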

I did the work in TTP in a couple of steps and mass-synchronized PSU two or three times over the past days, assuming that at the end I would always have the latest state (I don't doubt that is the case). But unfortunately I did not watch the database size during those steps.

I do have backups both of my photos from before the TTP work and of the PSU database from before importing the new information.

Is there any way I can check whether something is wrong?

I could build a completely new database from the images, of course, but then I would lose folder labeling and stack information, which I would really dislike.

I am on PSU6 V3641 and performed compacting in this version and in the version I had installed before (at least 6.0.0.3635, maybe a version in between, but I don't think so).
Hert
Posts: 7870
Joined: 13 Sep 03 6:24

Re: Is increase of database after area detection comprehensible?

Post by Hert »

I think your images were not all fully imported to the database. 500MB for a >20K images database sounds very low. Maybe this was an upgraded database from an older version?
1.8GB sounds more like what I would expect for images with average metadata content. You're good. No need to reimport everything.
This is a user-to-user forum. If you have suggestions, requests or need support then please send a message

Post by Robosoc »

I tried to double-check this. I am still pretty sure that all my ~24,700 images had been fully imported before, and all of them seemed to have a thumbnail (I reloaded the backed-up database, compacted it, and checked for unknown thumbs, with 0 results).

I also checked whether the size of the images themselves had increased dramatically after the TTP activity, and I can hardly see any difference in the size of my image folders (both roughly 87.8 GB).

Just to make sure I did not mess something up with my catalog, I had better build a brand-new catalog by importing all images again... but first I need to figure out a way to save my stacks somehow, so that I will have less work reproducing them. Maybe I will give each stack a tag like
Stack01, Stack02, Stack03...

Post by Hert »

Robosoc wrote: "but first I need to figure out a way to save my stacks somehow"
Stacks are part of ICS and are written to metadata if you have ICS writing ON (the default). ICS is your extra "catalog backup".

Before importing your new images, make sure to enable ICS reading. Then, after all images are imported, don't forget to disable ICS reading again. But keep ICS writing enabled; it's your extra backup that will save you when everything else fails. ICS reading is only needed to "restore a catalog", so it is not needed during normal operation. Therefore you should keep it switched OFF after the restore.

Again, I see no reason to rebuild your catalog. Save yourself the time and effort. But of course that's up to you :D

Post by Robosoc »

Thank you, Hert, for the hint about the ICS scheme. Just to be sure: is this the right setting (it looks correct to me)?
[Attachments: 1.jpg, 2.jpg]
I intend to use two computers with standalone PSU and plan to synchronize the database with GoodSync in automatic mode (automatic synchronization on file change, with a delay...) via WLAN. So I would be happy if the catalog were as small as possible; triple the size means roughly triple the sync time, which increases the chance of failures and conflicts. Therefore I really want to make sure whether 1.8 GB is the best I can get, as I still have not seen what was missing before, when the database was <600 MB.

I do understand that my chances are low, but it is worth a try in my eyes.

Post by Robosoc »

Got a result... but it does not make me really happy and gives me less certainty rather than more :-(

1) I set up a new catalog in a completely new folder.
2) I opened it and kept all settings at their defaults, but enabled "Read IDimager ICS scheme (if available)".
3) I imported the one root folder that contains all my images.
4) I waited several hours until all services had finished (sync and thumbs).

The new database seems to have included all 24,668 current images.
The size of the catalog is now 1.057 GB (so a third, quite different result, in between the other two).

Many statistics do not match at all. For example, the "not Catalog labeled" info under States shows
- 5484 images in the biggest catalog (1.85 GB)
- only 4051 images in the newly built catalog (1.057 GB).

Stacked images have not been rebuilt from the ICS scheme; the new catalog does not show a single stacked image.

When I moved to PSU V6 I started building my catalog completely new from the images and did not use the ICS scheme then. In the V6 world I decided to stick with English installations right from the start, to be sure not to mix up translated root categories such as People (in English) and Menschen (in German).
Anyway, the new catalog from yesterday (built from XMP and ICS, as I understand it) found 14,994 labeled images in "Menschen" and 10,838 in "People". "Menschen" has not existed in my catalogs since I started using V6, and all catalogs have been fully in sync (no out-of-sync images even after "Verify folder all"); that is definitely the case for the two last backed-up catalogs, before TTP face recognition (0.566 GB) and after TTP face recognition (1.85 GB).

All this really drives me crazy, and things like that were the reason why I decided to start with a completely new catalog in V6. But it seems to me that even "Verify folder all" does not look for 100% of the differences between the current catalog and the image metadata.

Post by Robosoc »

Step by step I will now try to understand what is messed up in my images.
One issue I found in the 1.85 GB (after TTP) catalog is that there are images with areas and names linked to the area, where the label of this area is not in the catalog label list:
[Attachment: Screenshot 2021-04-11 092505.jpg]
I believe that happened while working in PSU and is not related to TTP; I will focus on that a little more below.

The very same picture, in sync in both catalogs, is recognized as follows in the newly built catalog:
[Attachment: Screenshot 2021-04-11 093421.jpg]
So where does this come from, in my view? (I will have to write this later, as I have family responsibilities now... coming back later.)

Post by Robosoc »

When I had finished the job in TTP, recognizing about 240 different people that I gave a name to (yeah, what a job) plus thousands of automatic face detections, I brought that information into PSU with "Verify folder quick" and then read all the metadata.

That worked out very well, but of course it led to a very flat list of many people who had not been in the catalog before, collected directly under the label People. All faces belonging to people that I already had in my hierarchical catalog (such as People.Family.Sven -> myself) were merged automatically, though.

I then had to structure those people who were new to PSU. Some were not new, but I had them in the catalog under a slightly different name, so I had to merge some of the people. For this structuring process I decided to set PSU to travel mode, so that syncing would be processed after finishing the structuring work. I did so to avoid syncing images with more than one person twice or even more often.

The label "Ammelie", which you see in the example images of my last post, is a good example of this, because the label Ammelie was produced only by TTP and was new to PSU then. When I did the work in TTP I wasn't sure about the name of this girl and called her Ammelie; later I remembered her name was Emmalie, and I used this new name from then on. So while in travel mode I merged something like this: merge People.Ammelie into a label People.Friends.Vacationfriends.Emmalie. Something must have gone wrong in processes like this one.

But the result currently is as follows:

Catalog hierarchy of the 1.85 GB catalog, processed with a lot of semi-automatic TTP and manual PSU work:
[Attachment: Screenshot 2021-04-11 150959.jpg]
As already mentioned: no image out of sync, no result from a "Verify folder all" search.
So my expectation is that PSU wrote the latest information into the files (XMP and ICS).


Catalog hierarchy of the newly built 1.057 GB catalog:
[Attachment: Screenshot 2021-04-11 151254.jpg]
So, what I don't understand:

Why are there still names like Ammelie in the image metadata, which PSU V6 finds on import, when no out-of-sync state was detected before in a catalog that no longer contains this name?
It would be correct if the name Ammelie did not appear in either catalog; the name "Emmalie" should have been linked to the face of the girl on the left. And the name Leonie should be labeled only once, from the PEOPLE branch and not from the MENSCHEN branch.

Post by Hert »

Robosoc wrote: "So my expectation is that PSU wrote the latest information into the files (XMP and ICS)."
The out-of-sync indicator is something that PSU keeps. If metadata in the file changes afterwards then PSU has no knowledge about that.
If you want to be sure that what PSU has in its catalog is in the metadata then you must write to the file first.

Check that particular file's metadata (right click -> Run from Repository -> Metadata -> Full Exif dump) and check where it has the Ammelie in the metadata. All I can do is guess, which won't help you at all.
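As a complement to the Exif dump, the same check can be sketched outside PSU. The helper below is hypothetical (not part of PSU or TTP): it scans a JPEG for embedded XMP packets, which are plain XML text between `<x:xmpmeta` and `</x:xmpmeta>`, and returns the lines that mention a given name. It is a rough sketch that assumes the name is stored in XMP; it does not look at IPTC or binary Exif fields.

```python
from pathlib import Path


def find_name_in_xmp(image_path: str, name: str) -> list[str]:
    """Return the lines of embedded XMP packets that mention `name`."""
    data = Path(image_path).read_bytes()
    hits = []
    start = 0
    while True:
        # Locate the next XMP packet in the raw file bytes.
        begin = data.find(b"<x:xmpmeta", start)
        if begin == -1:
            break
        end = data.find(b"</x:xmpmeta>", begin)
        if end == -1:
            break
        packet = data[begin:end].decode("utf-8", errors="replace")
        # Keep only the lines that contain the name being searched for.
        hits.extend(line.strip() for line in packet.splitlines() if name in line)
        start = end + 1
    return hits


# Example call (the file name is a placeholder):
# for line in find_name_in_xmp("example.jpg", "Ammelie"):
#     print(line)
```

The tag on each returned line shows which metadata field carries the name, for example a face-region field versus a keyword or label field, which is the information needed to see why an import picks it up.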

Post by Robosoc »

Sorry, Hert; in trying to explain and show exactly what puzzles me, I surely gave too much information.

Let me try to narrow one of my problems down:
In an existing catalog I would expect "Verify folder all" to find any difference in metadata between what is stored in the catalog and what is stored in the metadata of the image itself. After Verify I should be able to decide whether I want to read the image data into the catalog or write it the other way around to get those images in sync. If I abort this decision, the image should be marked out of sync in PSU.

This is my understanding of PSU and therefore my expected behavior.

The complete story above proves that this is not the case, at least not for that image, although PSU is able to find all the information within the image when I import the image from scratch. So "Verify" does not find all differences... how can I scan already imported folders for all changes?

Post by Hert »

Robosoc wrote: "In an existing catalog I would expect "Verify folder all" to find any difference in metadata between what is stored in the catalog and what is stored in the metadata of the image itself. After Verify I should be able to decide whether I want to read the image data into the catalog or write it the other way around to get those images in sync. If I abort this decision, the image should be marked out of sync in PSU."
"Verify Files All" finds the files that are changed compared to what PSU has stored in its database. It does so by comparing the binary signature of the file with the binary signature that PSU keeps in the catalog.
If, for a changed file, you decide to read metadata from the file then PSU will read metadata to the catalog using your existing Sync-Read settings...exactly the same as a manual right click -> Metadata -> Read Metadata from File.

If you think this file is incorrect, then use right click -> Metadata -> Read Metadata from File to reproduce this. The result should be exactly the same. Afterwards check what is in its metadata (as explained in my previous reply) to analyze the result.
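To illustrate the idea of a signature-based check, here is a minimal sketch. It is not PSU's implementation: the hash algorithm (SHA-256), the `stored` dictionary standing in for the catalog, and both function names are assumptions made for the example.

```python
import hashlib
from pathlib import Path


def file_signature(path: Path) -> str:
    """Hash the raw bytes of a file (SHA-256 is an assumption, not PSU's algorithm)."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def changed_files(folder: Path, stored: dict[str, str]) -> list[str]:
    """List JPEGs whose current signature differs from the stored one.

    `stored` maps relative paths to the signature recorded at the last sync.
    Files missing from `stored` are reported as changed too.
    """
    changed = []
    for path in sorted(folder.rglob("*.jpg")):
        key = path.relative_to(folder).as_posix()
        if stored.get(key) != file_signature(path):
            changed.append(key)
    return changed
```

A check of this kind only answers "did the file's bytes change since the signature was recorded?". It does not compare individual metadata fields with catalog fields, so a catalog edit that never reached the file, or one that was written and recorded as in sync, produces no hit.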

Post by Robosoc »

So with my limited (but by far no longer beginner-level) knowledge I checked and analyzed my situation (including reading the full Exif dump) and still feel that there is something inconsistent inside PSU, and I think I can even prove part of it now:

Again, I use one example picture from my catalog that shows two areas with linked labels in the preview, but only one label in the labeling assistant on the right side of the screen.
[Attachment: 1.jpg]
  • right click -> Metadata -> Read Metadata from File does not change anything; the name Ammelie does not appear in blue in the "Selected Labels" box!
  • (optional step, same result in the end) right click -> Metadata -> Save Metadata to File; now I would expect a new catalog to interpret the metadata in the same way
  • built a completely new catalog
  • imported only this same single picture, and the result is:
[Attachment: 2.jpg]
Conclusion: on import PSU detects "Ammelie" as a label, while "Read Metadata from File" on a picture that is already in the catalog does not! So importing and Read Metadata do not give the same result.

So now I am pretty sure that I lost the label "Ammelie" in the first place through catalog work inside PSU and not because of faulty behavior of TTP.

I am still not able to reproduce the "losing" of a label 100%, but at least I seem to be able to reproduce how a labeled area without a selected label comes about:

In the catalog from which I took the second screenshot, which contains only this one image, I do the following:
  • it does not matter whether travel mode is active or not (I checked both, because I thought it was related to travel mode, but found that it is not)
  • right click the catalog label -> Details... -> Change name
That changes the name of the label, but PSU does not recognize that this label was linked to an area (which it really should, in my view); it no longer has a label for this area name, but it keeps the area name.
[Attachment: 3.jpg]
Unfortunately I believe I have done a lot of this by now! So I surely have thousands of pictures with a hierarchical area label that is no longer in the same place in the hierarchy or even has a changed name. And this is one of the reasons why there is no consistency at all between my "groomed" catalog and the catalog that would be built from a complete import of all my images.
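The kind of inconsistency described above can be pictured with a toy model. This is purely illustrative: the data layout, the label and file names, and the function are assumptions made for the example and say nothing about how PSU stores labels or areas internally.

```python
def orphaned_area_names(catalog_labels, areas_by_image):
    """Return, per image, the area names that match no catalog label.

    Labels are dotted paths; only the last part is compared with the area name.
    """
    leaf_names = {label.rsplit(".", 1)[-1] for label in catalog_labels}
    orphans = {}
    for image, area_names in areas_by_image.items():
        missing = [name for name in area_names if name not in leaf_names]
        if missing:
            orphans[image] = missing
    return orphans


# Toy data: the label was renamed from Ammelie to Emmalie,
# but the area on the image still carries the old name.
labels = {"People.Friends.Vacationfriends.Emmalie", "People.Family.Sven"}
areas = {"img_001.jpg": ["Ammelie", "Sven"]}

print(orphaned_area_names(labels, areas))
```

In this model a rename that updates only the label list leaves "Ammelie" behind as an area name with no label, which is the state shown in the screenshots.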

Post by Robosoc »

@Hert, may I ask you for your opinion on my findings described within my last post?

Post by Hert »

A manual read of metadata (right click -> Metadata -> Read Metadata from file) is completely identical to importing. The results depend on your sync settings.
You state that you created a new database. Preferences are stored in the catalog and your new catalog therefore has the default preferences.
You probably have tweaked your Sync setting causing the difference.
Best to reset your sync settings to the default. There's a button for that.