Speed Up Text Search in File

mbartosh · January 8, 2019, 12:20am

Would it be possible to speed up this file search?

following text of last "Model: " of lines whose (line number of it = maximum of (it as integer) of line numbers of lines whose (it starts with "Model:") of file "SREDKey.log" of folder "XXXXXX\SREDKey" of folder (value of variable "allusersprofile" of environment)) of file "SREDKey.log" of folder "XXXXXX\SREDKey" of folder (value of variable "allusersprofile" of environment)

I find that the search takes about 3000 ms when the file gets to 500 lines. I think on some slower machines this causes a timeout error. I am trying to reduce the number of lines in the file, but I was wondering whether there is a way to make the search faster.

JasonWalker · January 8, 2019, 5:27am

I don’t have a console handy and can’t test this, but what jumps out at me is the double use of ‘lines whose()’. With this construct, every line of the file is read and evaluated against the whose() condition.

In the “outer” version of this, the evaluator reads one line of the file; in the whose () condition, it is checking whether the line number of this line is equal to the maximum of the line numbers starting with "Model: ".

Suppose the file has five lines, with “Model:” on lines 1 and 5. The outer loop starts on line 1. To evaluate whether this line is the last line starting with “Model:”, the evaluator has to read every line to find the lines starting with “Model:”. It finds lines 1 and 5, determines that the current line number (1) is not the maximum (5), and then moves on to the next line in the “outer loop”, stepping to line 2.

Reading line 2, it must again determine if this line number is equal to the maximum of line numbers of lines starting with "Model: ", which again means reading every line of the file.

The result is that 25 line reads are required for this five-line file (five full scans of five lines each). The complexity is “n squared”. A 500-line file could incur 250,000 line reads.
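To make the counting argument concrete, here is a rough Python model of it. This is only a sketch of the reasoning above, not of how the BigFix evaluator is actually implemented; the function names and the sample lines are made up for illustration, and a "read" is counted once per line examined during a scan.

```python
def count_reads_nested(lines, prefix="Model:"):
    """Model of the nested 'lines whose' form: rescan the file for every line."""
    reads = 0
    result = None
    for number, line in enumerate(lines, start=1):
        # For each outer line, rescan everything to find the last matching line number.
        last = 0
        for inner_number, inner_line in enumerate(lines, start=1):
            reads += 1
            if inner_line.startswith(prefix):
                last = inner_number
        if number == last:
            result = line
    return result, reads


def count_reads_single(lines, prefix="Model:"):
    """Model of the single-scan form: one pass, then one direct line read."""
    reads = 0
    last = 0
    for number, line in enumerate(lines, start=1):
        reads += 1
        if line.startswith(prefix):
            last = number
    if last == 0:
        return None, reads
    reads += 1  # the direct read of the one line we want
    return lines[last - 1], reads


# Hypothetical five-line file with "Model:" on lines 1 and 5.
sample = ["Model: A", "x", "y", "z", "Model: B"]
nested_result, nested_reads = count_reads_nested(sample)
single_result, single_reads = count_reads_single(sample)
```

Under this model the five-line sample costs 25 reads with the nested form and 6 with the single scan, and a 500-line file costs 250,000 versus 501.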

Better to read the file once and then jump directly to the one line we want. Try this one …

following texts of lasts "Model: " of lines (maximum of line numbers of lines whose (it starts with "Model:") of it) of file "SREDKey.log" of folder "XXXXXX\SREDKey" of folder (value of variable "allusersprofile" of environment)
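For anyone who wants to sanity-check the result outside the console, the value this expression is after can be sketched in Python as follows. It is only an analogy under my own assumptions: `last_model_value` is a hypothetical helper, the sample log lines are invented, and it returns None where the plural relevance form would simply return nothing.

```python
def last_model_value(lines, prefix="Model:", marker="Model: "):
    """Return the text after the last 'Model: ' on the last line starting with 'Model:'."""
    last_line = None
    for line in lines:  # single pass over the file
        if line.startswith(prefix):
            last_line = line
    if last_line is None or marker not in last_line:
        return None  # no matching line, or no "Model: " marker on it
    # rsplit with maxsplit=1 keeps only the text after the last occurrence of the marker.
    return last_line.rsplit(marker, 1)[1]


# Invented log content for illustration only.
log_lines = ["Model: A100", "Serial: 1", "Model: B200", "Status: ok"]
value = last_model_value(log_lines)
```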

mbartosh · January 8, 2019, 8:37pm

Thank you so much, Jason. This has taken the calculation time from 225,000 ms to 47 ms for a 500 line file. Not only will this improve this analysis, it will improve the performance of our entire BigFix installation.