Search Logic #3

David Grundy
Posts: 241
Joined: 13 May 07 16:40
Location: Hong Kong

Post by David Grundy » 03 Aug 13 11:30

I'm confused again. (Sounds familiar, I know.) I wonder whether I'm the only person who finds the search results hard to predict.

For example, in my test database, I have a label Test20130803 assigned to 4 image files out of 68.
Four searches, with the number of results:
1. AND(Test20130803) --> 4
2. OR(Test20130803) --> 4
3. AND(excluding (Test20130803*)) --> 64
4. OR(excluding (Test20130803*)) --> 0

Why do 3 and 4 produce different answers? I have spent some time thinking about this, and I haven't yet thought of a set of logic rules that would produce zero results for #4. Because of this sort of thing I don't trust OR searches: I just don't know what is supposed to be included in the results (except in the very simplest cases).

I expected compound Dynamic Searches to work like this.
  • Every item shown separately in the Dynamic Search pane is separately evaluated and notionally produces a list of files.
    (Exception: Where different kinds of star ratings are in the search pane and are all include or all exclude, I can see an argument for treating all the star terms as a single star ratings term combined with OR, before any other operation. Same for different colour labels.)
  • Then apply the AND or OR join to all of these lists of files.
    (So if there is only one item in the search window, AND and OR should always produce the same results. I expected this to be the case for #3 and #4.)
I'm pretty confident that PSu's Dynamic Search is doing something quite different from this.
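
A minimal sketch of the evaluation described in the bullets above, using Python sets. This is only a model of what the post says it expected, not Photo Supreme's actual code; the file IDs, `evaluate_term`, and `expected_search` are all hypothetical names invented for illustration, and an excluded term is assumed to mean "every file that does not match".

```python
# Hypothetical model of the expected behaviour; not the application's real logic.
ALL_FILES = frozenset(range(1, 69))   # 68 files in the test catalog (IDs invented)
LABELLED = frozenset({1, 2, 3, 4})    # the 4 files carrying the test label


def evaluate_term(matches, exclude=False):
    """Each term yields its own file list; an excluded term yields the complement."""
    return ALL_FILES - matches if exclude else frozenset(matches)


def expected_search(join, terms):
    """Evaluate every term separately, then join the lists with AND or OR."""
    lists = [evaluate_term(matches, exclude) for matches, exclude in terms]
    result = lists[0]
    for other in lists[1:]:
        result = result & other if join == "AND" else result | other
    return result


# With a single term the join never fires, so AND and OR agree.
print(len(expected_search("AND", [(LABELLED, True)])))  # 64
print(len(expected_search("OR", [(LABELLED, True)])))   # 64
```

Under this model searches #3 and #4 would both return 64 files, which is why the observed 0 for #4 is surprising.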

Can anyone explain the sequence of logical steps used in the Dynamic Search? I'm frustrated by not understanding what is going to happen with compound searches. When I do understand it perhaps I won't agree with the approach, but at least I might be able to work out how to specify a search that gets the results I am looking for.

Thanks
... David

jstartin
Posts: 400
Joined: 23 Aug 06 13:47
Location: UK

Re: Search Logic #3

Post by jstartin » 04 Aug 13 14:41

I know nothing about SQL, and only dimly remember a little general concept stuff about RDBMSs, but I will hazard a guess at this. I will be most interested to find out if I am anywhere near correct :wink: .

For completeness I mention that compound searches are about applying multiple selection criteria to produce a results set: "AND" narrows the search by requiring more than one criterion to be met for inclusion of a record in the results set (Janet AND John must be present); "OR" widens the search by adding multiple results sets together (Janet OR John must be present).

Now in your example you only have one explicit criterion for each search so AND/OR might not be strictly relevant, but as a general principle I would think that whenever AND/OR are used there must be an implicit starting point. AND is narrowing the results by excluding some of them and the "logical" starting point is the whole catalog. In contrast OR is widening the search to include more catalog entries and the "logical" starting point is an empty results set.

I imagine these differing starting points (if they are as I suppose) are preserved when the criterion is changed to NOT (exclude). If I am right about all this then your searches could be explained as something like:

1. Form a result set containing all images. Narrow this by making from it a smaller set of images labelled "Test20130803".
2. Take a set containing nothing. Merge the set of images labelled "Test20130803" with it.
3. Form a result set containing all images. Narrow this by removing images labelled "Test20130803".
4. Take a set containing nothing. Remove from this images labelled "Test20130803", still leaving nothing.
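
The four steps above can be sketched with Python sets. This models only the guess made in this post (AND starts from the whole catalog, OR starts from an empty set, and an excluded term always removes files); it is not taken from the application, and the file IDs and the name `starting_point_search` are hypothetical.

```python
# Hypothetical model of the "implicit starting point" guess; not real application code.
ALL_FILES = frozenset(range(1, 69))   # 68 files in the test catalog (IDs invented)
LABELLED = frozenset({1, 2, 3, 4})    # the 4 files labelled "Test20130803"


def starting_point_search(join, terms):
    """Apply terms one by one to a join-dependent starting set."""
    # AND narrows from everything; OR widens from nothing.
    result = ALL_FILES if join == "AND" else frozenset()
    for matches, exclude in terms:
        if exclude:
            result = result - matches      # exclusion always removes files
        elif join == "AND":
            result = result & matches      # narrow
        else:
            result = result | matches      # widen
    return result


counts = [
    len(starting_point_search("AND", [(LABELLED, False)])),  # search 1
    len(starting_point_search("OR", [(LABELLED, False)])),   # search 2
    len(starting_point_search("AND", [(LABELLED, True)])),   # search 3
    len(starting_point_search("OR", [(LABELLED, True)])),    # search 4
]
print(counts)  # [4, 4, 64, 0]
```

The model reproduces the four counts reported in the first post, which is consistent with the guess but does not prove it.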

The answers are all correct, but 4 is the answer to a silly question :wink:
Jim (Photo Supreme: AMD Quad-Core A8-5500 Accelerated Processor 3.2 GHz; internal AMD Radeon™ HD7560D; 4GB DDR3 SDRAM; Win10x64)

David Grundy

Re: Search Logic #3

Post by David Grundy » 04 Aug 13 15:20

Interesting thought, Jim. It makes sense although it creates some cases which feel odd to me.

An example of an odd case, if that's what's happening

Suppose I have the list of numbers 1..8.
What do I get with OR(odd numbers; exclude numbers greater than 4)?
--> Start with nothing; then get 1,3,5,7; then exclude 5..8 leaving result of 1,3

On the other hand with OR(exclude numbers greater than 4; odd numbers)
--> Start with nothing; exclude 5..8 leaving nothing; include 1,3,5,7 --> result of 1,3,5,7

So this logic creates a dependence on the order of the terms.
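
The two cases above can be checked with a short sketch. It assumes the "start with nothing, then add or remove" reading of an OR search discussed in this thread; `or_search` is a hypothetical helper, not anything from the application.

```python
# Hypothetical model of an OR search that starts from an empty set.
NUMBERS = frozenset(range(1, 9))
ODD = frozenset(n for n in NUMBERS if n % 2 == 1)
GREATER_THAN_4 = frozenset(n for n in NUMBERS if n > 4)


def or_search(terms):
    """Process terms in order: included terms add items, excluded terms remove them."""
    result = frozenset()
    for matches, exclude in terms:
        result = result - matches if exclude else result | matches
    return result


first = or_search([(ODD, False), (GREATER_THAN_4, True)])
second = or_search([(GREATER_THAN_4, True), (ODD, False)])
print(sorted(first))   # [1, 3]
print(sorted(second))  # [1, 3, 5, 7]
```

Swapping the two terms changes the result, so under this model the search is not commutative the way a plain Boolean OR would be.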

Ramble ...

I could see though that this could make sense in the context of a "simple" search experience. Although, that's not consistent with the fact that the dynamic search sorts the terms into groups of the same kind, so you can't completely specify the order of the terms. (For example, excluded terms seem to be sorted towards the end.) If the intention is to create a logic which works by saying: start with nothing, add a bit, add a bit, remove something, add a bit until you get what you need ... then it should let us completely specify the order of adding and removing bits!

And now I recall that Tom (I think) noted in a recent post that the order in which terms are added to Dynamic Search does actually make a difference to the result. So I really think you could be right.

It's different from the usual Boolean logic, but if I understand it perhaps I can work with it.

What about the implications for compound searches involving Exclusion and OR? I saw results a while ago which made me think that the meaning of the grouped expressions (stored in favourites) may change when they are put into an OR search instead of an AND search. That is, I suspect that the whole expression inside the stored favourite may have different logic when the item is added to an OR vs AND join. But I was chasing down another aspect of search behaviour at the time, and I did not make a note of that case. Perhaps when the outer join is an OR, all the inner expressions are evaluated as OR, regardless of whether they were originally stored as AND or OR. I note that the stored expressions in the Favourites don't specify whether they result from AND or OR joins.

BTW An aside: I saw something recently which made me suspect that it might be possible to put complex expressions into the text search box. If I can get the syntax right, that may be quicker than building complex searches in dynamic search. That's something for next week (or later).

... David

[slight edit for clarity, and to fix a minor error in the example I gave]
Last edited by David Grundy on 04 Aug 13 17:28, edited 2 times in total.

Jeff F
Posts: 22
Joined: 19 May 11 1:43

Re: Search Logic #3

Post by Jeff F » 04 Aug 13 17:13

Just wanted to add my thanks for putting this in simple terms.
Jim - "AND" narrows the search by requiring more than one criterion to be met for inclusion of a record in the results set (Janet AND John must be present); "OR" widens the search by adding multiple results sets together (Janet OR John must be present).
"AND" narrows
"OR" widens

Yes, I knew this, but somehow not quite as simply stated.

Thanks again,

Jeff

jstartin

Re: Search Logic #3

Post by jstartin » 04 Aug 13 22:20

David Grundy wrote: "So this logic creates a dependence on the order of the terms."
David
My observation is that the order in which terms appear in the dynamic search box does not correspond to the order in which I add them. Comparing the orders resulting from different orders of addition, they usually seem to be the same (usually but not always: if I add labels that have identical spellings for the final keyword but come from hierarchies under different categories, the label added last seems to go to the top of the box).

And then the application has to convert the search into an SQL statement and pass this to the SQLite "engine". Then, as I understand it, SQLite runs its own optimizer to decide how to get the required data from the database tables. Some part of what happens is not necessarily under the developer's direct control.
David Grundy wrote: "BTW An aside: I saw something recently which made me suspect that it might be possible to put complex expressions into the text search box. If I can get the syntax right, that may be quicker than building complex searches in dynamic search."
I understand that that is an option for those that like typing. The help text tells us that a space between terms is treated as OR, a "+" symbol as AND, and a "-" as AND NOT. There is no reference to an OR NOT, which might imply something. There is more syntax - try searching the forum for @PROP.

David Grundy

Re: Search Logic #3

Post by David Grundy » 05 Aug 13 15:26

Thanks Jim. Now that I think about it, I guess I did see the reference in the manual. I just couldn't remember where.
I think (although I'm not 100% certain) that the SQL logic implementation will be Boolean, so there needs to be a translation from the PSu search query to a Boolean expression for the SQL query.
However, I think the query logic you suggested can probably be translated into a Boolean expression fairly easily by working from the first term to the last term and adding each new term 'N' to the query as
([Expression resulting from all terms before N]) AND 'N'
or
([Expression resulting from all terms before N]) OR 'N'
except that if the next term is 'exclude N' then use
([everything already added]) AND NOT 'N'
regardless of whether the query is AND or OR.
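
A rough sketch of that term-by-term translation follows. The post does not say what the expression looks like before the first term, so seeding it with `TRUE` for an AND search and `FALSE` for an OR search is an assumption borrowed from Jim's "starting point" idea; `build_expression` and the term names are hypothetical, and the output is a plain Boolean string rather than real SQL for the application's schema.

```python
# Hypothetical translation of an ordered term list into one Boolean expression.
def build_expression(join, terms):
    """terms is an ordered list of (name, exclude) pairs; join is "AND" or "OR"."""
    # Assumed seed: everything for AND (TRUE), nothing for OR (FALSE).
    expr = "TRUE" if join == "AND" else "FALSE"
    for name, exclude in terms:
        # An excluded term is always attached with AND NOT, whatever the join.
        operator = "AND NOT" if exclude else join
        expr = f"({expr}) {operator} {name}"
    return expr


print(build_expression("OR", [("odd", False), ("gt4", True)]))
# ((FALSE) OR odd) AND NOT gt4
print(build_expression("OR", [("gt4", True), ("odd", False)]))
# ((FALSE) AND NOT gt4) OR odd
```

The nested parentheses preserve the order of the terms, so the two orderings yield different expressions, matching the 1..8 example earlier in the thread.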

That is probably just as easy as constructing a Boolean expression from the Dynamic Search logic I was originally expecting (as in my first post above).

I think I'm starting to understand the thinking behind the comment in the build 142 announcements, that "When you exclude terms in a Dynamic search, they are now excluded OR style instead of AND style." This really worried me - I thought it might have broken the logic of the search. Now I see that this is consistent with a search logic which proceeds as follows:

1. Group all ratings together and specify them as a single term meaning "any of these ratings" (ie an OR within the ratings group)
2. Group all excludes together, and specify this as an OR group of all the excluded terms (where previously this was evaluated as AND or OR depending on the form of the search)
3. Evaluate the remaining terms as an AND or an OR according to the logic you suggested above.

If this is right, then whenever there are grouped terms in a favourite, they will be re-evaluated as AND or OR depending on the search to which they are added. So the same item in Favourites might produce a different meaning depending on whether it is later added to an AND or an OR. I will test this later.

What I meant to say in the point about the text search box is that it's not immediately obvious that the text search will necessarily use the same logic specification as the Dynamic Search. I will try it out when I have time.

Lots of experiments for later ...
... David

David Grundy

Re: Search Logic #3

Post by David Grundy » 05 Aug 13 15:43

Anyway if in the end I don't like the search logic, there will usually be a workaround by using temporary (private) labels to designate the intermediate results of the search.

The label approach fails if I'm looking for specific versions within a set (unless one day we get version-specific labels). If looking for files within version sets, I could use Custom Input Fields to tag the intermediate search results, but I don't quite like it because that will trigger an out-of-sync on each designated file; and on the whole I prefer not to write to the files unnecessarily.

... David
