I’m going to assume “C:” and “D:” are only examples, but hope that I understand the format correctly.
Assuming that only the lines showing a new result contain a colon, it can be parsed this way. This was a nice distraction, thanks for the challenge. I hope this illustrates several neat things we can do with line parsing.
First, a file (which has lines):
q: lines of file "c:\temp\test.txt"
A: A: 1a
A: B: 2b
A: C: 3c
A: 4c
A: 5c
A: D: 4d
A: 5d
A: 6d
T: 9.062 ms
Find the line number of the line containing “C:”:
q: (it, line numbers of lines whose (it contains "C:") of it) of file "c:\temp\test.txt"
A: "test.txt" "" "" "" "", 3
T: 8.735 ms
Find the line numbers of the lines containing a “:”:
q: (it, line numbers of lines containing ":" of it) of file "c:\temp\test.txt"
A: "test.txt" "" "" "" "", 1
A: "test.txt" "" "" "" "", 2
A: "test.txt" "" "" "" "", 3
A: "test.txt" "" "" "" "", 6
T: 7.893 ms
Because there is more than one and we’ll want to do comparisons, let’s make that a “set” of line numbers. Passing along a “set” of values is more efficient than passing along each value individually.
q: (it, set of line numbers of lines containing ":" of it) of file "c:\temp\test.txt"
E: This expression evaluates to an unrepresentable object of type "( file, integer set )"
T: 5.597 ms
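That is fine. The error only means that a set can’t be displayed directly; it can still be used inside a larger expression.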
Now let’s put them together and find the first line after the “C:” line that contains a colon:
q: (item 0 of it, item 1 of it, minimum of items 1 of (( item 1 of it, elements of item 2 of it) whose (item 0 of it < item 1 of it))) of (it, line numbers of lines containing "C:" of it, set of line numbers of lines containing ":" of it) of file "c:\temp\test.txt"
A: "test.txt" "" "" "" "", 3, 6
T: 5.215 ms
That’s a big jump and worth a little explanation. Where I have ‘minimum of items 1 of (item 1 of it, elements of item 2 of it) whose (item 0 of it < item 1 of it)’, that’s looking for the minimum line number of a line containing “:” that is higher than the line number of the line containing “C:”. So our “C:” is on line 3, and our “D:” is on line 6.
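If it helps to see that step outside of relevance, here is a minimal Python sketch of the same idea. It is only an illustration; the variable names are hypothetical and the values are copied from the results above.

```python
# Hypothetical stand-ins for the values in the walkthrough (this is not relevance).
target_line = 3               # line number of the line containing "C:"
colon_lines = {1, 2, 3, 6}    # line numbers of every line containing ":"

# Keep only the colon lines strictly after the target, then take the smallest.
next_header = min(n for n in colon_lines if n > target_line)
print(next_header)  # 6
```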
Now we know the line number of the line containing “C:”, and the line number of the first line after that one containing a “:”. We can determine all the integers in-between as well. But that “6” is the first line we don’t want to include, so we’ll subtract 1 from it.
q: (item 0 of it, integers in (item 1 of it, item 2 of it - 1)) of (item 0 of it, item 1 of it, minimum of items 1 of (( item 1 of it, elements of item 2 of it) whose (item 0 of it < item 1 of it))) of (it, line numbers of lines containing "C:" of it, set of line numbers of lines containing ":" of it) of file "c:\temp\test.txt"
A: "test.txt" "" "" "" "", 3
A: "test.txt" "" "" "" "", 4
A: "test.txt" "" "" "" "", 5
T: 4.189 ms
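The off-by-one handling can be sketched in Python as well. This is only an illustration with hypothetical variable names; the one real difference is that Python's range() already excludes its stop value, while ‘integers in’ behaves inclusively in the results above, which is why the query subtracts 1.

```python
target_line = 3   # line number of the line containing "C:"
next_header = 6   # first later line containing ":" (must not be included)

# range() stops before next_header, so no explicit "- 1" is needed here.
wanted = list(range(target_line, next_header))
print(wanted)  # [3, 4, 5]
```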
These are the line numbers of the lines that we want. Keeping three separate numbers in the tuple would make it read the whole file three times, so let’s put those back into a ‘set’ object:
q: (item 0 of it, set of integers in (item 1 of it, item 2 of it - 1)) of (item 0 of it, item 1 of it, minimum of items 1 of (( item 1 of it, elements of item 2 of it) whose (item 0 of it < item 1 of it))) of (it, line numbers of lines containing "C:" of it, set of line numbers of lines containing ":" of it) of file "c:\temp\test.txt"
E: This expression evaluates to an unrepresentable object of type "( file, integer set )"
T: 2.091 ms
Now that we have the file, and the set of line numbers of lines to read from the file, let’s do it.
q: items 0 of (lines of item 0 of it, item 1 of it) whose (line number of item 0 of it is contained by item 1 of it) of (item 0 of it, set of integers in (item 1 of it, item 2 of it - 1)) of (item 0 of it, item 1 of it, minimum of items 1 of (( item 1 of it, elements of item 2 of it) whose (item 0 of it < item 1 of it))) of (it, line numbers of lines containing "C:" of it, set of line numbers of lines containing ":" of it) of file "c:\temp\test.txt"
A: C: 3c
A: 4c
A: 5c
T: 1.525 ms
Good so far, but we want to trim off the “C:” in the front and concatenate those into one result, right? Here, the singular ‘following text of first “:” of it’ will give an error for a line that doesn’t contain “:”; the pipe operator “|” traps that error and falls back to ‘it’. So we’ll either get the text following the colon or, if there’s no colon, the whole original line. We trim that to remove leading/trailing spaces, then concatenate the lines together with spaces.
q: concatenation " " of (it as trimmed string) of (following text of first ":" of it | it) of items 0 of (lines of item 0 of it, set of integers in (item 1 of it, item 2 of it - 1)) whose (line number of item 0 of it is contained by item 1 of it) of (item 0 of it, item 1 of it, minimum of items 1 of (item 1 of it, elements of item 2 of it) whose (item 0 of it < item 1 of it)) of (it, line numbers of lines whose (it contains "C:") of it, set of line numbers of lines containing ":" of it) of file "c:\temp\test.txt"
A: 3c 4c 5c
T: 0.822 ms
So that is hopefully a good result. As I am manipulating the line numbers, I also want to sanity-check against a result that doesn’t have multiple lines. So I’ll run it again checking for the “B:” result. The way I’ve written the query we only have to change one spot.
q: concatenation " " of (it as trimmed string) of (following text of first ":" of it | it) of items 0 of (lines of item 0 of it, set of integers in (item 1 of it, item 2 of it - 1)) whose (line number of item 0 of it is contained by item 1 of it) of (item 0 of it, item 1 of it, minimum of items 1 of (item 1 of it, elements of item 2 of it) whose (item 0 of it < item 1 of it)) of (it, line numbers of lines whose (it contains "B:") of it, set of line numbers of lines containing ":" of it) of file "c:\temp\test.txt"
A: 2b
T: 0.828 ms
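For anyone who wants to cross-check the logic outside of relevance, here is a rough Python sketch of the whole pipeline. The function name section_text and the in-memory sample list are hypothetical, and falling back to the end of the file when no later colon line exists is my assumption; the queries above don't demonstrate that case.

```python
def section_text(lines, label):
    """Join the text that belongs to `label` (e.g. "C:") into one string."""
    # Use 1-based line numbers to match the walkthrough.
    numbered = list(enumerate(lines, start=1))
    start = next(n for n, text in numbered if label in text)
    colon_lines = [n for n, text in numbered if ":" in text]
    # First colon line after the label.
    # Assumption: if there is none, read through the end of the file.
    end = min((n for n in colon_lines if n > start), default=len(lines) + 1)
    parts = []
    for n, text in numbered:
        if start <= n < end:
            # Text after the first ":" if there is one, else the whole line.
            _, sep, after = text.partition(":")
            parts.append((after if sep else text).strip())
    return " ".join(parts)


# Same content as the test file shown at the top.
sample = ["A: 1a", "B: 2b", "C: 3c", "4c", "5c", "D: 4d", "5d", "6d"]
print(section_text(sample, "C:"))  # 3c 4c 5c
print(section_text(sample, "B:"))  # 2b
```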
Looks like it works to me. Hope this helps!
-Jason