Parse lines from a command

vhenry · June 11, 2019, 4:30pm

Hello,

I’m trying to parse certain lines from the systeminfo command.

If the output contains:

A: 1a
B: 2b
C: 3c
4c
5c

How do I parse the whole contents of C?

I was able to parse A using:

line whose (it starts with "A") of file ("file1.txt")

If I use the same query for C, I would only get 3c since only the first line gets printed. I need the entire contents of C.

Please help.

Thank you!

Best,
Jennifer

SLB · June 11, 2019, 4:48pm

I’m sure there is a better way than this, but my brain is failing me today. One approach, although it has no error trapping, could be along these lines:

Q: lines whose (line number of it >= ((line number of (lines whose (it starts with "C") of it)) of file "file1.txt") of it) of file "file1.txt"
A: C: 3c
A: 4c
A: 5c
T: 1.619 ms
I: plural file line
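
If the logic is easier to follow outside of Relevance, here is a rough Python sketch of the same idea: find the first line that starts with the label and keep everything from there to the end. The function name `lines_from_label` and the sample list are made up for illustration; this is not BigFix code.

```python
def lines_from_label(lines, label):
    """Return every line from the first one starting with `label` to the end."""
    for index, line in enumerate(lines):
        if line.startswith(label):
            return lines[index:]
    return []  # label not found; the Relevance query above has no such fallback


# Hypothetical sample mirroring the file contents in the question.
sample = ["A: 1a", "B: 2b", "C: 3c", "4c", "5c"]
print(lines_from_label(sample, "C"))
```

Like the query, this keeps everything after the matching line, whatever it contains.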

vhenry · June 11, 2019, 5:27pm

That definitely worked! Thank you.

But I want to propose a slightly different scenario.

If the output of the command is:
A: 1a
B: 2b
C: 3c
4c
5c
D: 4d
5d
6d

The query that you provided gives the following output for C:
C: 3c
4c
5c
D: 4d
5d
6d

But I want it to only contain contents of C. Also, is there a way that it can eliminate “C:” from the output and only give “3c 4c 5c” and not the whole “C: 3c 4c 5c”?

SLB · June 11, 2019, 8:50pm

I think the approach that is best for you is going to depend on how variable the text in your file is. Maybe this approach would be better suited to your needs:

Q: (if(exists lines whose (it contains "D:") of it) then (following text of first "C: " of preceding text of first "D: " of concatenation " " of lines of it) else (following text of first "C: " of concatenation " " of lines of it)) of files "file1.txt"
A: 3c 4c 5c
T: 0.891 ms
I: singular substring
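
For comparison, the join-then-slice idea can be sketched in Python as below. This is only an illustration under the assumption that both labels are known in advance; `text_between` and the sample data are hypothetical names, not part of BigFix.

```python
def text_between(lines, start_label, end_label):
    """Join the lines with spaces, then keep the text after start_label
    and before end_label (or to the end if end_label is absent)."""
    joined = " ".join(lines)
    _, found, after = joined.partition(start_label)
    if not found:
        return ""  # start label missing
    before_end, _, _ = after.partition(end_label)
    return before_end.strip()


# Hypothetical sample mirroring the second scenario in the thread.
sample = ["A: 1a", "B: 2b", "C: 3c", "4c", "5c", "D: 4d", "5d", "6d"]
print(text_between(sample, "C: ", "D: "))
```

The weak point is the same as in the query: you have to know which label comes next.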

JasonWalker · June 11, 2019, 9:16pm

I’m going to assume “C:” and “D:” are only examples, but hope that I understand the format correctly.

Assuming that only the lines showing a new result contain a colon, it can be parsed this way. This was a nice distraction, thanks for the challenge. I hope this illustrates several neat things we can do with line parsing.

First, a file (which has lines):

q: lines of file "c:\temp\test.txt" 
A: A: 1a
A: B: 2b
A: C: 3c
A: 4c
A: 5c
A: D: 4d
A: 5d
A: 6d
T: 9.062 ms

Find the line number of the line containing “C:”

q: (it, line numbers of lines whose (it contains "C:") of it) of file "c:\temp\test.txt" 
A: "test.txt" "" "" "" "", 3
T: 8.735 ms

Find the line numbers of the lines containing a “:”

q: (it, line numbers of lines containing ":" of it) of file "c:\temp\test.txt"
A: "test.txt" "" "" "" "", 1
A: "test.txt" "" "" "" "", 2
A: "test.txt" "" "" "" "", 3
A: "test.txt" "" "" "" "", 6
T: 7.893 ms

Because there is more than one and we’ll want to do comparisons, let’s make that a “set” of line numbers. Passing along a “set” of values is more efficient than passing along each value individually.

q: (it, set of line numbers of lines containing ":" of it) of file "c:\temp\test.txt"
E: This expression evaluates to an unrepresentable object of type "( file, integer set )"
T: 5.597 ms

That error is fine. The set can’t be displayed directly, but it can still be passed along to the next step.

Now let’s put them together and find the first line after the “C:” line that contains a colon:

q: (item 0 of it, item 1 of it, minimum of items 1 of (( item 1 of it, elements of item 2 of it) whose (item 0 of it < item 1 of it))) of (it, line numbers of lines containing "C:" of it, set of line numbers of lines containing ":" of it) of file "c:\temp\test.txt"
A: "test.txt" "" "" "" "", 3, 6
T: 5.215 ms

That’s a big jump and worth a little explanation. Where I have ‘minimum of items 1 of ((item 1 of it, elements of item 2 of it) whose (item 0 of it < item 1 of it))’, that’s looking for the smallest line number of a line containing “:” that is higher than the line number of the line containing “C:”. So our “C:” is on line 3, and our “D:” is on line 6.

Now we know the line number of the line containing “C:”, and the line number of the first line after that one containing a “:”. We can determine all the integers in-between as well. But that “6” is the first line we don’t want to include, so we’ll subtract 1 from it.

q: (item 0 of it, integers in (item 1 of it, item 2 of it - 1)) of (item 0 of it, item 1 of it, minimum of items 1 of (( item 1 of it, elements of item 2 of it) whose (item 0 of it < item 1 of it))) of (it, line numbers of lines containing "C:" of it, set of line numbers of lines containing ":" of it) of file "c:\temp\test.txt"
A: "test.txt" "" "" "" "", 3
A: "test.txt" "" "" "" "", 4
A: "test.txt" "" "" "" "", 5
T: 4.189 ms
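
The same line-number arithmetic, written as a small Python sketch for readers who want to check the reasoning outside of Relevance. `section_line_numbers` and the sample list are illustrative assumptions, and line numbers are 1-based to match the query output.

```python
def section_line_numbers(lines, label):
    """Line numbers from the `label` line up to, but not including,
    the next line that contains a colon."""
    numbered = list(enumerate(lines, start=1))
    # Line number of the first line containing the label (3 for "C:").
    start = next(n for n, line in numbered if label in line)
    # Line numbers of every line containing a colon (1, 2, 3, 6).
    colon_lines = {n for n, line in numbered if ":" in line}
    # Smallest colon line after the label line (6); min() raises
    # ValueError if there is none.
    end = min(n for n in colon_lines if n > start)
    return list(range(start, end))  # range() already stops at end - 1


# Hypothetical sample mirroring c:\temp\test.txt from this post.
sample = ["A: 1a", "B: 2b", "C: 3c", "4c", "5c", "D: 4d", "5d", "6d"]
print(section_line_numbers(sample, "C:"))
```

For the sample this gives the same three numbers as the query: 3, 4 and 5.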

These are the line numbers of the lines that we want. Keeping three separate numbers in the tuple would make it read the whole file three times, so let’s put those back into a ‘set’ object:

q: (item 0 of it, set of integers in (item 1 of it, item 2 of it - 1)) of (item 0 of it, item 1 of it, minimum of items 1 of (( item 1 of it, elements of item 2 of it) whose (item 0 of it < item 1 of it))) of (it, line numbers of lines containing "C:" of it, set of line numbers of lines containing ":" of it) of file "c:\temp\test.txt"
E: This expression evaluates to an unrepresentable object of type "( file, integer set )"
T: 2.091 ms

Now that we have the file, and the set of line numbers of lines to read from the file, let’s do it.

q: items 0 of (lines of item 0 of it, item 1 of it) whose (line number of item 0 of it is contained by item 1 of it) of (item 0 of it, set of integers in (item 1 of it, item 2 of it - 1)) of (item 0 of it, item 1 of it, minimum of items 1 of (( item 1 of it, elements of item 2 of it) whose (item 0 of it < item 1 of it))) of (it, line numbers of lines containing "C:" of it, set of line numbers of lines containing ":" of it) of file "c:\temp\test.txt"
A: C: 3c
A: 4c
A: 5c
T: 1.525 ms

Good so far, but we want to trim off the “C:” at the front and concatenate those into one result, right? Here, the singular ‘following text of first “:” of it’ gives an error for a line that doesn’t contain “:”; the pipe operator “|” traps that error and falls back to ‘it’. So we get either the text following the colon or, if there’s no colon, the whole original line. We trim that to remove leading/trailing spaces, then concatenate the lines together with spaces.

q: concatenation " " of (it as trimmed string) of (following text of first ":" of it | it) of items 0 of (lines of item 0 of it, set of integers in (item 1 of it, item 2 of it - 1)) whose (line number of item 0 of it is contained by item 1 of it) of (item 0 of it, item 1 of it, minimum of items 1 of (item 1 of it, elements of item 2 of it) whose (item 0 of it < item 1 of it)) of (it, line numbers of lines whose (it contains "C:") of it, set of line numbers of lines containing ":" of it) of file "c:\temp\test.txt"
A: 3c 4c 5c
T: 0.822 ms
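
As a rough analogy for the ‘following text of first ":" of it | it’ fallback, here is a Python sketch of the trim-and-join step. `value_part` is a hypothetical helper name, and the input list stands in for the lines selected by the query.

```python
def value_part(line):
    """Text after the first colon, or the whole line if there is no colon,
    with leading and trailing whitespace removed."""
    _, colon, rest = line.partition(":")
    return (rest if colon else line).strip()


# Hypothetical input: the three lines selected for "C:".
section = ["C: 3c", "4c", "5c"]
print(" ".join(value_part(line) for line in section))
```

Note that a continuation line that happens to contain its own colon would also lose everything before that colon, in the sketch and in the query alike.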

So that is hopefully a good result. As I am manipulating the line numbers, I also want to sanity-check against a result that doesn’t have multiple lines. So I’ll run it again checking for the “B:” result. The way I’ve written the query we only have to change one spot.

q: concatenation " " of (it as trimmed string) of (following text of first ":" of it | it) of items 0 of (lines of item 0 of it, set of integers in (item 1 of it, item 2 of it - 1)) whose (line number of item 0 of it is contained by item 1 of it) of (item 0 of it, item 1 of it, minimum of items 1 of (item 1 of it, elements of item 2 of it) whose (item 0 of it < item 1 of it)) of (it, line numbers of lines whose (it contains "B:") of it, set of line numbers of lines containing ":" of it) of file "c:\temp\test.txt"
A: 2b
T: 0.828 ms

Looks like it works to me. Hope this helps!

-Jason

JasonWalker · June 11, 2019, 11:08pm

Edge case! What if the value we’re retrieving is the last one (in this case “D:”)? In my earlier post, we’d get an error because there is no line containing ":" with a line number higher than the number of D:'s line.

We could error-trap our way out of that, but I think it’s simpler to coerce a higher line number. In this case, in addition to ‘line numbers of lines containing “:”’, I’ll also include ‘number of lines of the file + 1’. With that included, we can retrieve all lines from “D:” to the last line of the file:

q: concatenation " " of (it as trimmed string) of (following text of first ":" of it | it) of items 0 of (lines of item 0 of it, set of integers in (item 1 of it, item 2 of it - 1)) whose (line number of item 0 of it is contained by item 1 of it) of (item 0 of it, item 1 of it, minimum of items 1 of (item 1 of it, elements of item 2 of it) whose (item 0 of it < item 1 of it)) of (it, line numbers of lines whose (it contains "D:") of it, set of (line numbers of lines containing ":" of it;number of lines of it + 1)) of file "c:\temp\test.txt"
A: 4d 5d 6d
T: 0.968 ms
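
The sentinel trick in isolation, as a Python sketch: adding “number of lines + 1” to the boundary set guarantees there is always a boundary after the label line. `section_lines` and the sample list are illustrative assumptions, not BigFix code.

```python
def section_lines(lines, label):
    """Lines from the `label` line up to the next line containing a colon,
    or to the end of the input if the label starts the last section."""
    numbered = list(enumerate(lines, start=1))
    start = next(n for n, line in numbered if label in line)
    boundaries = {n for n, line in numbered if ":" in line}
    boundaries.add(len(lines) + 1)  # sentinel: one past the last line
    end = min(n for n in boundaries if n > start)  # never empty now
    return lines[start - 1:end - 1]  # convert 1-based numbers to a slice


# Hypothetical sample mirroring c:\temp\test.txt from this thread.
sample = ["A: 1a", "B: 2b", "C: 3c", "4c", "5c", "D: 4d", "5d", "6d"]
print(section_lines(sample, "D:"))
print(section_lines(sample, "C:"))
```

The sentinel only matters for the last section; for earlier labels a real colon line is always the smaller boundary.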

vhenry · June 12, 2019, 9:35pm

This is what I wanted!! Thank you!

vhenry · June 12, 2019, 9:36pm

Whoa, thank you, Jason, for taking the time to explain what every statement does. It definitely helped a lot! However, your query only printed the first line and left out the rest…

vhenry · June 12, 2019, 9:37pm

Let me test this out.