There are several reasons this is more efficient, some of them based on how early we filter, others based on internal caching behavior. I'll try to illustrate what's happening here.
There are two main dimensions we have to consider - how many object lookups we perform (find a fixlet or find a computer or find a result of a fixlet and computer), and how long each lookup takes.
There are specific computer science terms for these, but it's easier to think of it as 'how many times does the loop run' and 'how long does each iteration take'
In the 'sets' form, starting with
(
set of fixlets whose (fixlet flag of it and exists applicable computers of it)
of bes sites whose (display name of it contains "Patches for " OR display name of it contains "Enterprise " OR display name of it contains "updates ") /* set of fixlets of interest */
, union of subscribed computer sets
of bes sites whose (display name of it contains "Patches for " OR display name of it contains "Enterprise " OR display name of it contains "updates ") /* set of computers of interest */
)
I do the filtering on fixlets and computers early. By starting from the 'bes site', I only include the fixlets from the sites we care about. We can completely skip looking up all the fixlets that are not in the site. I also only include the fixlets that are Fixlets not Tasks (via 'fixlet flag of it') and only the fixlets that have applicable computers. This discards a lot of fixlets early on.
Likewise, for the Computers I limit to only the computers subscribed to those sites. Any computers not subscribed to those sites are ignored, we can skip those lookups.
If there are any other filters we can apply here, like fixlet severities or computer group memberships, it's more efficient to do those filterings early. We can filter out most of the computers and most of the fixlets before we perform a single 'bes fixlet result' lookup.
Because this is generating 'sets', 'item 0' is a singular item (a set of fixlets), and 'item 1' is a singular item (a set of computers). If we spent 100 ms looking up fixlets and 100 ms looking up computers, this evalution has taken 200 ms so far.
Sidebar on 'Sets' vs 'Plural Items / Crosss Products'
If we don't use 'sets', we end up with the first major problem of relevance performance - the 'cross product'
bes fixlets whose (filters)
, bes computers whose (filters)
then we create a 'cross product' - every 'bes fixlet' triggers a lookup of every 'bes computer'. Conceptually we could think of that as a nested loop -
for each fixlet:
if (fixlet matches filters:
for each computer:
if (computer matches filters):
return (fixlet, computer)
If the 'for each fixlet' lookup takes the same 100 ms to lookup 100 fixlets, and the 'for each computer' lookups also takes the same 100 ms to lookup 100 computers, the difference here is that the computer lookup gets repeated 100 times, once for each Fixlet. So now it's not 100 + 100 milliseconds, it's 100 X 100 millisecondss
Back to the problem at hand
The other issue with the original stems from how the caching works internally.
The first set of lookups
results from (bes computers) whose (relevant flag of it) of bes fixlets whose (type of it = "Fixlet" and display name of site of it contains "Patches for ")
suffers from the nested loop / cross-product issue - for every Fixlet, we repeat the lookup of every Computer, to get the results.
But then, having the 'bes fixlet result' objects, we get into retrieving the properties from it:
( name of computers of it, name of fixlets of it ) of results from (bes computers) whose (relevant flag of it) of bes fixlets whose (type of it = "Fixlet" and display name of site of it contains "Patches for ")
From the 'bes fixlet result', internally it now performs another lookup to find the result -> Computer -> Name, and another lookupt to find the result -> Fixlet -> Name. These aren't 'preloaded' with the bes fixlet result, so basically it's another database lookup.
In the 'sets' form, we are basically keeping 'in cache' the bes fixlet and the bes computer that we're evaluating - where item 0 is the fixlet, item 1 is the computer, and item 2 is the bes fixlet result of this computer and this fixlet. This saves another layer of lookups, since we already have the fixlet and the computer as items in our query, retrieving 'name of item 0' is much faster than retrieving 'name of fixlet of item 2 '.
How does this align to a real lookup?
In my form, on a real demo system, the 'sets' form retrieves 1,019 results in 164 milliseconds.
The original query (which returns fewer properties, ignoring the MIME fields retrieves 1,019 results in 746 ms.
In short, the 'sets' logic may be a bit trickier to understand, but once you practice it a while it makes more sense, and it's definitely worthwhile in terms of efficiency - especially on large deployments.