Last n lines of file

If you want to write an Analysis that shows only the last n lines of a file, use:

Example (last 5 lines of the file “/tmp/file.txt”):

lines (integers in (number of lines of file "/tmp/file.txt" - 4,number of lines of file "/tmp/file.txt")) of file "/tmp/file.txt"

One problem with this relevance is that it appears to calculate the number of lines of the file twice for every line of the file, so it is very inefficient, especially for files with a large number of lines.


I made a file with 10000 lines:

Q: lines (integers in (number of lines of files "C:\Windows\Temp\linesoffile.txt" - 5,number of lines of files "C:\Windows\Temp\linesoffile.txt")) of files "C:\Windows\Temp\linesoffile.txt"
A: 9995
A: 9996
A: 9997
A: 9998
A: 9999
A: 10000
T: 2021.247 ms
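
A side note on the output above: six lines come back rather than five, which is what you would expect if the integer range includes both endpoints. Here is a minimal Python sketch of the same arithmetic. The helper name `inclusive_range` is made up for illustration and is not part of relevance.

```python
def inclusive_range(low, high):
    """Return every integer from low to high, including both endpoints."""
    return list(range(low, high + 1))


total = 10000  # number of lines in the test file from the post above

# Subtracting 5 gives six line numbers: 9995 through 10000.
print(inclusive_range(total - 5, total))

# Subtracting 4 gives exactly five line numbers: 9996 through 10000.
print(inclusive_range(total - 4, total))
```

So to get exactly n lines with this style of query, subtract n - 1 rather than n.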

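Your relevance took 2 seconds to return, which is not very good performance. The relevance below was almost 4 times faster, and the gap grows quickly as the number of lines in the file increases. Note also that, because the integer range is inclusive at both ends, subtracting 5 returns six lines, as the output above shows.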


If you want to be efficient, it is much harder:

(item 1 of /* -> This "it" refers to the last 100 lines of the file -> */ it) whose( /* -> remove empty lines, which is why this relevance can return less than 100 lines per file -> */ it as trimmed string != "") of ( /* -> this is the number of lines of the file from the previous statement -> */ item 1 of it, (lines of /* -> the file object -> */ item 0 of it) ) /* -> This whose statement is responsible for filtering for only the last 100 lines of the file -> */ whose ( (line number of /* -> lines of the file -> */ item 1 of it) > ( /* -> number of lines of the file -> */ item 0 of it - 100 /* <- This is the number of lines to return, which is subtracted from the total # of lines <- */ ) ) of ( /* -> the parent file object itself -> */ it, number of lines of it) of files ...

Let me break this down with an example file with 10 lines:

Q: (it, number of lines of it) of files "C:\Windows\Temp\linesoffile.txt"
A: "linesoffile.txt" "" "" "" "", 10
T: 1.164 ms

This pairs the number of lines in the file with the contents of each line of the file:

Q: ( item 1 of it, (lines of item 0 of it) ) of (it, number of lines of it) of files "C:\Windows\Temp\linesoffile.txt"
A: 10, Line 1
A: 10, Line 2
A: 10, Line 3
A: 10, Line 4
A: 10, Line 5
A: 10, Line 6
A: 10, Line 7
A: 10, Line 8
A: 10, Line 9
A: 10, Line 10
T: 1.060 ms

The added whose statement keeps only the lines whose line number is greater than the total number of lines minus the number of lines to return (this also works if the number of lines to return is greater than the number of lines in the file):

Q: ( item 1 of it, (lines of item 0 of it) ) whose ( (line number of item 1 of it) > ( item 0 of it - 5 /* <- This is the number of lines to return, which is subtracted from the total # of lines <- */ ) ) of (it, number of lines of it) of files "C:\Windows\Temp\linesoffile.txt"
A: 10, Line 6
A: 10, Line 7
A: 10, Line 8
A: 10, Line 9
A: 10, Line 10
T: 1.139 ms
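
For anyone who finds the filter easier to read outside of relevance, here is a rough Python sketch of the same idea: count the lines once, then keep each line whose 1-based line number is greater than the total minus n. The function name `last_lines` and the sample data are assumptions made for this illustration. It shows the logic only and says nothing about how the client evaluates the relevance.

```python
def last_lines(lines, n):
    """Keep the lines whose 1-based line number is greater than total - n."""
    total = len(lines)  # counted once, like "number of lines of it"
    return [
        text
        for number, text in enumerate(lines, start=1)
        if number > total - n
    ]


# Stand-in for the 10-line example file used in this post.
sample = [f"Line {i}" for i in range(1, 11)]

print(last_lines(sample, 5))        # the last five lines
print(len(last_lines(sample, 25)))  # asking for more than exist returns all 10
```

When n is larger than the file, total - n is negative, so every line number passes the comparison and the whole file is returned.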

Taking items 1 of those results gives only the lines of the file:

Q: items 1 of ( item 1 of it, (lines of item 0 of it) ) whose ( (line number of item 1 of it) > ( item 0 of it - 5 /* <- This is the number of lines to return, which is subtracted from the total # of lines <- */ ) ) of (it, number of lines of it) of files "C:\Windows\Temp\linesoffile.txt"
A: Line 6
A: Line 7
A: Line 8
A: Line 9
A: Line 10
T: 1.050 ms

This relevance only calculates the total number of lines once, and only reads the file lines once, though there might be other improvements that could be made to this relevance.

Thank you, @jgstew! Your algorithm really is much better than mine.

Yours is much simpler and easier to understand; I just wish it were more efficient. I think simplicity and understandability are important, but in this case they can cause problems.

This example is not very efficient, but it gets the newest timestamp among the lines with certain contents by grabbing all lines of a file that contain the interesting event, then taking “the last one” of them:

(it as time) of preceding texts of firsts " bf:bfetl:debug" of following texts of lasts ", " of concatenations ", " of lines containing "bf:bfetl:debug Sending request for" of files "etl.log" of (folders it) of unique values of pathnames of folders "Logs" of folders "WebUI" of (it; parent folders of it) of parent folders of (folders it) of values of settings of client

This approach can be faster than trying to get the newest lines of the file first, though for very large log files, this approach isn’t the greatest either.
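
As a rough illustration of that "filter first, then take the last one" idea, here is a small Python sketch. The function name `newest_match_prefix` and the sample log lines (including their timestamp format) are invented for this example. The real etl.log format may differ, so the sketch returns the text before the marker as a string and does not parse it as a time.

```python
def newest_match_prefix(lines, needle, marker):
    """Return the text before `marker` in the last line containing `needle`.

    Returns None when no line contains `needle`.
    """
    matches = [line for line in lines if needle in line]  # every matching line
    if not matches:
        return None
    # The last match is the newest, assuming the log is written in time order.
    return matches[-1].split(marker, 1)[0]


# Hypothetical log content, made up for this sketch.
sample_log = [
    "2024-01-01 10:00:00 bf:bfetl:debug Sending request for A",
    "2024-01-01 10:01:00 bf:bfetl:info something else",
    "2024-01-01 10:05:00 bf:bfetl:debug Sending request for B",
    "2024-01-01 10:06:00 bf:bfetl:info another line",
]

print(newest_match_prefix(
    sample_log,
    "bf:bfetl:debug Sending request for",
    " bf:bfetl:debug",
))
```

Like the relevance above, this still looks at every line of the file, which is why it does not scale well to very large logs.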

There are many ways to achieve the same thing with relevance, and it is hard to tell which is best without experimentation.

When it comes to the relevance language, sometimes we see things that are utterly complex to write relevance for, that could be super simple to formulate if the language supported it.

For example as the subject of this thread suggests:

last 200 lines of file "/tmp/file.txt"

This would be fantastic, is it too much to hope for?

You could always submit an RFE

Good idea, I will actually do this.
I have had a few situations where this would be really valuable.

Exactly! It would be much better!

Link to your RFE here and we’ll vote for it.

My research led me past this existing thread on the way to an existing RFE, which needs votes: BFLCM-I-248

I voted for it

(is this 20 characters?, required to post)

+1 Vote here.

But for completeness, this post covers the use case.

+1 vote, and forwarded to my BigFix colleagues to vote for it. Such a fundamental feature should just be added!