Search Text Truncated After Recent Search Overhaul

aaronj
Posts: 22
Joined: 03 Sep 21 18:41

Post by aaronj »

Maybe this is specific to my setup, but the recent update that changed the structure of the search data seems to have truncated search values at characters such as spaces, underscores, and hyphens.

E.g. there are 54 items in the catalog with names that begin with 20170917, followed by an underscore and 6 digits indicating the time. The search box can only return them using the portion of the reference up to the first problem character (an underscore), because the searchable value for each item is truncated at that point.
[Attachment: 2022-08-15 12_19_39-Window.png]
While you can at least reach the items in that first example, that is not the case for images with an "IMG_" prefix: the only way to return any of them by reference is as the entire group bearing that prefix, using the search "IMG".

Presumably arbitrary truncation of the search values isn't the intended behavior, so either there is a bug or something did not work properly during my upgrade. I should note that I watched the upgrade/search update process run to completion, and there were no errors or disruptions.

Also note that I'm on the PostgreSQL version, running PostgreSQL 13.
aaronj
Posts: 22
Joined: 03 Sep 21 18:41

Re: Search Text Truncated After Recent Search Overhaul

Post by aaronj »

It occurred to me after making the first post that the missing content might be separately searchable, and sure enough, it is. You can find each portion of the reference separately (along with any other items that share it), but you cannot find the specific combined reference.
E.g. the item with reference 20170917_152405 can be returned by searching 20170917 (along with 53 other items that I don't want) or 152405 (along with three other items that I don't want), but not by searching its full reference, 20170917_152405.
Breaking some fields into individual words for searching may be a good idea, but the image reference should retain at least one version that is 100% intact, because the components of a reference with the joining characters stripped away are not as useful for searching as the full combined reference.
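
To illustrate what I think is happening, here is a minimal Python sketch of a token-only index. It is not the application's actual code; the separator set and the names `tokenize` and `build_index` are my own assumptions, based only on the characters that appear to split my references.

```python
import re

# Assumed separator set: spaces, underscores, and hyphens (from what I observed).
SEPARATORS = re.compile(r"[ _\-]+")

def tokenize(value):
    """Split a stored value into search tokens at the assumed separators."""
    return [t for t in SEPARATORS.split(value) if t]

def build_index(references):
    """Map each token to the set of references that contain it."""
    index = {}
    for ref in references:
        for token in tokenize(ref):
            index.setdefault(token, set()).add(ref)
    return index

refs = ["20170917_152405", "20170917_152411", "20180101_152405"]
index = build_index(refs)

# Each part finds the item, plus neighbours that share the part.
print(sorted(index.get("20170917", set())))
print(sorted(index.get("152405", set())))
# The full reference was never stored as a key, so it finds nothing.
print(sorted(index.get("20170917_152405", set())))
```

If the index only ever holds the split tokens, the intact reference simply has no entry to hit, which matches what I see in the search box.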
Hert
Posts: 7870
Joined: 13 Sep 03 6:24

Re: Search Text Truncated After Recent Search Overhaul

Post by Hert »

The underscore is a separator.

You can join searches with a plus sign, e.g.
20170917+152405
aaronj
Posts: 22
Joined: 03 Sep 21 18:41

Re: Search Text Truncated After Recent Search Overhaul

Post by aaronj »

That can get you the specific item in this example, but it is still less useful (points 1 and 2 below) than having the full reference as a separate term, and it isn't a practical substitute (point 3) for a term containing the exact characters of the reference/filename.
1. You have to know which characters in a reference are treated as whitespace and replace them with a +, so you can't just copy/paste the reference even when you have the exact text.
2. If you know only part of the reference, or are typing it manually, you don't get real-time suggestions in the drop-down to complete it as you type. Instead you have to type the whole thing and run the search to find out whether it returns anything.
3. Perhaps most importantly, performance takes somewhere between a substantial and a massive hit compared with returning the reference as a single term.
Joining multiple searches with a "+" is inherently more resource-intensive, and the cost is compounded by common prefixes and by the numbering used by many cameras, which resets at 10,000 and so makes the numbers far from unique as well. E.g. IMG_0001 should require a single index seek, but searching for IMG+0001 requires two seeks plus a join.
In my case the IMG+0001 search takes 96 seconds to complete, and it still returns things I wasn't targeting, like IMG_20200719_000122.
It isn't as bad when joining less common terms, like the earlier example 20170917+152405, which returns in 12 seconds, but that is still triple the 4 seconds it takes to return a reference that doesn't require a + because it contains no characters treated as whitespace.
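
A rough Python sketch of why the "+" search over-matches, assuming each term is matched as a prefix of any token and the per-term result sets are then intersected. The prefix matching is my guess from the IMG_20200719_000122 hit, and `tokens_of`, `match_term`, and `plus_search` are hypothetical names, not anything from the application.

```python
def tokens_of(ref):
    """Split a reference at the assumed separators (space, hyphen, underscore)."""
    return [t for t in ref.replace("-", "_").replace(" ", "_").split("_") if t]

def match_term(refs, term):
    """Assumed behaviour: a term matches if any token starts with it."""
    return {r for r in refs if any(tok.startswith(term) for tok in tokens_of(r))}

def plus_search(refs, query):
    """Evaluate 'a+b' as one lookup per term, then intersect the result sets."""
    result = None
    for term in query.split("+"):
        hits = match_term(refs, term)
        result = hits if result is None else result & hits
    return result or set()

refs = ["IMG_0001", "IMG_0002", "IMG_20200719_000122", "20170917_152405"]

# "IMG" matches three references and "0001" is a prefix of both "0001" and
# "000122", so the intersection contains an item that was never the target.
print(sorted(plus_search(refs, "IMG+0001")))
```

The more common each term is, the larger the sets that have to be fetched and intersected, whereas an exact lookup on an intact IMG_0001 value would touch a single entry.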
aaronj
Posts: 22
Joined: 03 Sep 21 18:41

Re: Search Text Truncated After Recent Search Overhaul

Post by aaronj »

I saw that this is addressed in build 4494, so for anyone reading this in the future:
"Build 4494; When searching using the Search Box then the "full data value" is now also searched (not only the search tokens generated from a search value) ; e.g. IMG_1337 will give a result now"

Thanks!
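
For future readers, a small Python sketch of the idea in that release note: index the intact value alongside its tokens so an exact reference has its own entry. This is only an illustration under my own assumptions (the separator set and the names `index_keys` and `build_full_index` are hypothetical), not how Build 4494 actually implements it.

```python
import re

# Assumed separator set, taken from the characters discussed in this thread.
SEPARATORS = re.compile(r"[ _\-]+")

def index_keys(value):
    """Return the search tokens for a value plus the intact value itself."""
    tokens = {t for t in SEPARATORS.split(value) if t}
    return tokens | {value}

def build_full_index(references):
    """Map every token and every full value to the references containing it."""
    index = {}
    for ref in references:
        for key in index_keys(ref):
            index.setdefault(key, set()).add(ref)
    return index

full_index = build_full_index(["IMG_1337", "IMG_1338", "IMG_20200719_001337"])

# The exact reference now resolves to a single item with one lookup.
print(full_index.get("IMG_1337", set()))
# Token searches keep working as before.
print(len(full_index.get("IMG", set())))
```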