Performance of exist lines whose ( it contains ... )

Dear Team,
I am a little concerned about how long “qna” took to search for “GabHandle::push” in the 67 MB file. It finds “GabHandle” much faster.

By comparison, Linux “egrep” spends about the same amount of time on either of the two searches.

[root@host ~]# /opt/BESClient/bin/qna
Default masthead location, using /etc/opt/BESClient/actionsite.afxm
Q: exist lines whose ( it contains "GabHandle::push" ) of file "/var/VRTSvcs/log/engine_A.log"
A: False
T: 3385330

Q: exist lines whose ( it contains "GabHandle::push" ) of file "/var/VRTSvcs/log/engine_A.log"
A: False
T: 3252452

Q: exist lines whose ( it contains "GabHandle" ) of file "/var/VRTSvcs/log/engine_A.log"
A: True
T: 282

Q: exist lines whose ( it contains "GabHandle" ) of file "/var/VRTSvcs/log/engine_A.log"
A: True
T: 504

Q: ^C
[root@host ~]# ls -l "/var/VRTSvcs/log/engine_A.log"
-rw-r--r-- 1 root root 67659175 May 20 11:12 /var/VRTSvcs/log/engine_A.log
[root@host ~]#
[root@host ~]# time egrep "GabHandle" "/var/VRTSvcs/log/engine_A.log"
2019/01/22 13:15:09 VCS ERROR V-16-1-10116 GabHandle::open failed errno = 2
2019/01/22 13:15:22 VCS ERROR V-16-1-10116 GabHandle::open failed errno = 2
2019/01/22 13:15:53 VCS ERROR V-16-1-10116 GabHandle::open failed errno = 2
2019/01/22 13:16:36 VCS ERROR V-16-1-10116 GabHandle::open failed errno = 2
2019/01/22 13:17:31 VCS ERROR V-16-1-10116 GabHandle::open failed errno = 2
2019/01/22 13:18:38 VCS ERROR V-16-1-10116 GabHandle::open failed errno = 2

real 0m0.081s
user 0m0.065s
sys 0m0.017s
[root@host ~]# time egrep "GabHandle::push" "/var/VRTSvcs/log/engine_A.log"

real 0m0.088s
user 0m0.058s
sys 0m0.030s
[root@host ~]#

Thank you,
Aleksandr

The translation of a 67-MB file into “lines” is likely quite expensive.

It may be matching “GabHandle” on a much earlier line than it matches “GabHandle::push”, which could account for the faster search if fewer lines need to be retrieved.
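
One way to test this hypothesis is to see how far into the file the first match sits. Below is a minimal Python sketch, assuming the log can be read as text. `first_match_line` is a hypothetical helper written for illustration; it is not part of qna or BigFix and only models the "stop at the first matching line" behaviour.

```python
from typing import Optional


def first_match_line(path: str, needle: str) -> Optional[int]:
    """Return the 1-based number of the first line containing needle, or None."""
    # errors="replace" keeps the scan going if the log has undecodable bytes
    with open(path, "r", errors="replace") as fh:
        for number, line in enumerate(fh, start=1):
            if needle in line:
                return number  # stop early: later lines are never read
    return None  # every line was read without finding a match
```

Calling it with the log path and "GabHandle" would show how many lines had to be read before the first hit. A result of None means the whole file had to be read, which is the slow case.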

To check that “GabHandle::push” simply exists somewhere in the file, it can be much faster to skip the “lines” translation and just check:

content of file "whatever" contains "GabHandle::push"
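
For comparison outside of qna, the same idea (search the raw content without splitting it into lines) can be sketched in Python. This is only an analogy under stated assumptions: `file_contains` is a hypothetical helper, and nothing here describes how the BigFix client implements "content of file" internally.

```python
import mmap


def file_contains(path: str, needle: bytes) -> bool:
    """Return True if needle occurs anywhere in the file, ignoring line boundaries."""
    with open(path, "rb") as fh:
        # An empty file cannot be memory-mapped, so handle it up front.
        if fh.seek(0, 2) == 0:
            return False
        # Map the whole file read-only and search it as one byte string.
        with mmap.mmap(fh.fileno(), 0, access=mmap.ACCESS_READ) as view:
            return view.find(needle) != -1
```

For example, `file_contains("/var/VRTSvcs/log/engine_A.log", b"GabHandle::push")`. Note that a miss still has to scan the entire file; the saving comes only from not building per-line strings.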

Jason,
Thank you!

Lines with “GabHandle” can be found close to the beginning of the log, so the suggestion that “qna” can return the result without parsing the whole file makes sense.
There are no lines with “GabHandle::push”, so “qna” has to parse the whole 67 MB log file before returning “False”.

Switching from “lines of file” to “content of file” saves ~25% on the search that finds nothing (T drops from about 3.25M to about 2.41M):

Q: content of file "/var/VRTSvcs/log/engine_A.log" contains "GabHandle"
A: True
T: 487

Q: content of file "/var/VRTSvcs/log/engine_A.log" contains "GabHandle::push"
A: False
T: 2427209

Q: content of file "/var/VRTSvcs/log/engine_A.log" contains "GabHandle::push"
A: False
T: 2414451

Q: exist lines whose ( it contains "GabHandle::push" ) of file "/var/VRTSvcs/log/engine_A.log"
A: False
T: 3276022

Q: exist lines whose ( it contains "GabHandle::push" ) of file "/var/VRTSvcs/log/engine_A.log"
A: False
T: 3230051

Q: content of file "/var/VRTSvcs/log/engine_A.log" contains "GabHandle::push"
A: False
T: 2399374
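
For anyone who wants to check the ~25% figure, here is a small Python sketch that averages the T: values from the runs above for the “GabHandle::push” search. The variable names are made up for illustration, and only the timings from this comparison run are used.

```python
# T: values reported by qna for the "GabHandle::push" search (no match)
lines_t = [3276022, 3230051]              # exist lines whose (...) of file
content_t = [2427209, 2414451, 2399374]   # content of file ... contains ...


def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


# Fraction of time saved by the "content of file" form relative to "lines of file"
saving = 1 - mean(content_t) / mean(lines_t)
print(f"lines mean:   {mean(lines_t):.0f}")
print(f"content mean: {mean(content_t):.0f}")
print(f"saving:       {saving:.1%}")
```

With these five samples the saving comes out at roughly 26%, which is consistent with the ~25% estimate. It is a small sample, so treat it as indicative only.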