A really interesting query. There are several ways to do this, and my first inclination was to use regular expressions but I think we can build on what you already have to get there a bit more clearly.
First, a sample of the text file I'm using for testing:
q: lines of files whose (name of it as string as lowercase contains "test.txt") of folder "c:\temp"
A: line 1 * 1 ( APPROVED )
A: line
A: line
A: some line
A: some line
A: some line
A: some line
A: some line
A: some line
A: some line
A: some line
A: 2-Nov-2021 15:13:33,471 [ReqProc] ERROR [T:123456] C:/myfolder/logs - Output Responses (output.txt) :7=12345678 29=X12345678910111 6=abcd abcde 5=123456XXXXXX5678 4=12345 14=1 1=1 22=1 Street , Streeville , XX1 1XX 12=12345678 23=1234 - xxxxxx 13=12345678 8=12345678 3=1 21=123 9=123456 2=1 34=XXXXXX XXXXX XX XXXXXXX 33=XXXXXX XXXX XXXX XXXXXXX XXX XXXX XXXXXXX 36=12345 37=1234 38=9876 41=1 30=1 31=X1 12 32=12 34 56 78 90 59=0000000000000000000000000000000000000000 76=123456789 60=123 80=123456789101 98=98765432 99=1
A: some other line
A: line 1 * 1 ( APPROVED )
A: line
A: line
A: some line
A: some line
A: some line
A: some line
A: some line
A: some line
A: some line
A: some line
A: 3-Nov-2021 15:13:33,471 [ReqProc] ERROR [T:123456] C:/myfolder/logs - Output Responses (output.txt) :7=12345678 29=X12345678910111 6=abcd abcde 5=123456XXXXXX1234 4=12345 14=1 1=1 22=1 Street , Streeville , XX1 1XX 12=12345678 23=1234 - xxxxxx 13=12345678 8=12345678 3=1 21=123 9=123456 2=1 34=XXXXXX XXXXX XX XXXXXXX 33=XXXXXX XXXX XXXX XXXXXXX XXX XXXX XXXXXXX 36=12345 37=1234 38=1234 41=1 30=1 31=X1 12 32=12 34 56 78 90 59=0000000000000000000000000000000000000000 76=123456789 60=123 80=123456789101 98=12345678912345678912 99=1
A: some other line
T: 23.538 ms
I: plural file line
In your query, you're pulling "unique values of lines". I don't think that's worth doing: the lines appear to carry timestamps, so I expect every line to be unique anyway.
I'd start by filtering down to the two lines that we want, the ones that appear 11 lines after your "* 1 ( APPROVED )" string.
q: lines ((it + 11) of line numbers of lines whose (it contains "* 1 ( APPROVED )") of it) of files whose (name of it as string as lowercase contains "test.txt") of folder "c:\temp"
A: 2-Nov-2021 15:13:33,471 [ReqProc] ERROR [T:123456] C:/myfolder/logs - Output Responses (output.txt) :7=12345678 29=X12345678910111 6=abcd abcde 5=123456XXXXXX5678 4=12345 14=1 1=1 22=1 Street , Streeville , XX1 1XX 12=12345678 23=1234 - xxxxxx 13=12345678 8=12345678 3=1 21=123 9=123456 2=1 34=XXXXXX XXXXX XX XXXXXXX 33=XXXXXX XXXX XXXX XXXXXXX XXX XXXX XXXXXXX 36=12345 37=1234 38=9876 41=1 30=1 31=X1 12 32=12 34 56 78 90 59=0000000000000000000000000000000000000000 76=123456789 60=123 80=123456789101 98=98765432 99=1
A: 3-Nov-2021 15:13:33,471 [ReqProc] ERROR [T:123456] C:/myfolder/logs - Output Responses (output.txt) :7=12345678 29=X12345678910111 6=abcd abcde 5=123456XXXXXX1234 4=12345 14=1 1=1 22=1 Street , Streeville , XX1 1XX 12=12345678 23=1234 - xxxxxx 13=12345678 8=12345678 3=1 21=123 9=123456 2=1 34=XXXXXX XXXXX XX XXXXXXX 33=XXXXXX XXXX XXXX XXXXXXX XXX XXXX XXXXXXX 36=12345 37=1234 38=1234 41=1 30=1 31=X1 12 32=12 34 56 78 90 59=0000000000000000000000000000000000000000 76=123456789 60=123 80=123456789101 98=12345678912345678912 99=1
T: 20.534 ms
I: plural file line
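If you want to sanity-check the "marker line number + 11" idea outside of the debugger, here is a minimal Python sketch of the same selection. The function name and the tiny sample list are made up for illustration; the marker and offset are the ones from the query above.

```python
def lines_after_marker(lines, marker="* 1 ( APPROVED )", offset=11):
    """Return the line found `offset` lines after each line containing `marker`."""
    picked = []
    for index, line in enumerate(lines):
        target = index + offset
        # Skip markers that sit too close to the end of the file.
        if marker in line and target < len(lines):
            picked.append(lines[target])
    return picked


# Hypothetical miniature of the test file: marker, ten filler lines, then the data line.
sample = (
    ["line 1 * 1 ( APPROVED )"]
    + ["some line"] * 10
    + ["ERROR 38=9876 41=1", "some other line"]
)
selected = lines_after_marker(sample)
```

Like the relevance version, this depends on the data line always sitting exactly 11 lines below the marker.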
Next, you're pulling the strings between one "XX=" and the next "XX=". I'd start by making sure I can correctly find the start of the strings. Here are a couple of queries that demonstrate finding the starting marks and their numerical positions within the string. I ended up not using the positions, but they may be useful to think about conceptually. The grouping here demonstrates pulling multiple firsts "X" of it for each line of input: I get 3 matches on each of 2 lines, for six results total.
q: ((firsts "38=" of it; firsts "5=" of it; firsts "98=" of it)) of lines ((it + 11) of line numbers of lines whose (it contains "* 1 ( APPROVED )") of it) of files whose (name of it as string as lowercase contains "test.txt") of folder "c:\temp"
A: 38=
A: 5=
A: 98=
A: 38=
A: 5=
A: 98=
T: 17.311 ms
I: plural substring
q: (starts of (firsts "38=" of it; firsts "5=" of it; firsts "98=" of it)) of lines ((it + 11) of line numbers of lines whose (it contains "* 1 ( APPROVED )") of it) of files whose (name of it as string as lowercase contains "test.txt") of folder "c:\temp"
A: 380
A: 144
A: 505
A: 380
A: 144
A: 505
T: 14.004 ms
I: plural string position
Next, we can pull "everything after" those starting points. Again there are six strings in total, and each of these matches runs from the string we want to the end of the line.
q: (following texts of (firsts "38=" of it; firsts "5=" of it; firsts "98=" of it)) of lines ((it + 11) of line numbers of lines whose (it contains "* 1 ( APPROVED )") of it) of files whose (name of it as string as lowercase contains "test.txt") of folder "c:\temp"
A: 9876 41=1 30=1 31=X1 12 32=12 34 56 78 90 59=0000000000000000000000000000000000000000 76=123456789 60=123 80=123456789101 98=98765432 99=1
A: 123456XXXXXX5678 4=12345 14=1 1=1 22=1 Street , Streeville , XX1 1XX 12=12345678 23=1234 - xxxxxx 13=12345678 8=12345678 3=1 21=123 9=123456 2=1 34=XXXXXX XXXXX XX XXXXXXX 33=XXXXXX XXXX XXXX XXXXXXX XXX XXXX XXXXXXX 36=12345 37=1234 38=9876 41=1 30=1 31=X1 12 32=12 34 56 78 90 59=0000000000000000000000000000000000000000 76=123456789 60=123 80=123456789101 98=98765432 99=1
A: 98765432 99=1
A: 1234 41=1 30=1 31=X1 12 32=12 34 56 78 90 59=0000000000000000000000000000000000000000 76=123456789 60=123 80=123456789101 98=12345678912345678912 99=1
A: 123456XXXXXX1234 4=12345 14=1 1=1 22=1 Street , Streeville , XX1 1XX 12=12345678 23=1234 - xxxxxx 13=12345678 8=12345678 3=1 21=123 9=123456 2=1 34=XXXXXX XXXXX XX XXXXXXX 33=XXXXXX XXXX XXXX XXXXXXX XXX XXXX XXXXXXX 36=12345 37=1234 38=1234 41=1 30=1 31=X1 12 32=12 34 56 78 90 59=0000000000000000000000000000000000000000 76=123456789 60=123 80=123456789101 98=12345678912345678912 99=1
A: 12345678912345678912 99=1
T: 10.565 ms
I: plural substring
I notice from the field order that in each of your cases above you are looking for "the next field number", but I think it's easier to look for "the next space after the field I'm looking at". That way we don't need to know the number of the next field; we just know that this field ends at a space. (This only holds for fields whose values contain no spaces, which is true of 38, 5 and 98 in the sample; a field such as 22 would be cut off at its first space.)
q: (preceding texts of firsts " " of following texts of (firsts "38=" of it; firsts "5=" of it; firsts "98=" of it)) of lines ((it + 11) of line numbers of lines whose (it contains "* 1 ( APPROVED )") of it) of files whose (name of it as string as lowercase contains "test.txt") of folder "c:\temp"
A: 9876
A: 123456XXXXXX5678
A: 98765432
A: 1234
A: 123456XXXXXX1234
A: 12345678912345678912
T: 7.195 ms
I: plural substring
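The "from the label up to the next space" step can be mirrored in Python as a rough cross-check. This is only a sketch with a hypothetical helper name, and the sample line is a shortened version of the log line above. It returns nothing when there is no following space, which is my reading of how the plural firsts " " behaves in the query.

```python
def field_value(line, label):
    """Return the text between the first `label` and the next space, or None."""
    start = line.find(label)
    if start == -1:
        return None  # label not present on this line
    rest = line[start + len(label):]  # like "following text of first label"
    end = rest.find(" ")
    if end == -1:
        return None  # no space after the value (e.g. last field on the line)
    return rest[:end]  # like "preceding text of first space"


# Shortened, illustrative copy of one of the sample log lines.
line = "6=abcd abcde 5=123456XXXXXX5678 4=12345 37=1234 38=9876 41=1 98=98765432 99=1"
values = [field_value(line, label) for label in ("38=", "5=", "98=")]
```

Two things to watch for in either language: a short label like "5=" matches the first place those characters occur, so it would also hit inside a field such as "15=" if one came earlier on the line, and the last field on a line has no trailing space to stop at.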
And finally, we want to group all three results from each line together. We need to wrap our "line inspection" in another set of parentheses so the concatenation applies to all the matches within this line, instead of to all the matches found across all lines.
q: (concatenation "; " of (preceding texts of firsts " " of following texts of (firsts "38=" of it; firsts "5=" of it; firsts "98=" of it))) of lines ((it + 11) of line numbers of lines whose (it contains "* 1 ( APPROVED )") of it) of files whose (name of it as string as lowercase contains "test.txt") of folder "c:\temp"
A: 9876; 123456XXXXXX5678; 98765432
A: 1234; 123456XXXXXX1234; 12345678912345678912
T: 3.523 ms
I: plural string
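The per-line grouping is the part that tends to trip people up, so here is the same idea as a small Python sketch: the join happens inside the loop over lines, not once over all matches. The function names and the two shortened sample lines are illustrative only.

```python
def extract(line, labels):
    """Collect the value after each label, up to the next space, in label order."""
    values = []
    for label in labels:
        start = line.find(label)
        if start == -1:
            continue  # label missing on this line
        rest = line[start + len(label):]
        end = rest.find(" ")
        if end == -1:
            continue  # no space after the value, so nothing to return
        values.append(rest[:end])
    return values


def summarize(lines, labels, separator="; "):
    """Build one joined string per line rather than one string for all lines."""
    return [separator.join(extract(line, labels)) for line in lines]


# Shortened, illustrative versions of the two selected log lines.
lines = [
    "5=123456XXXXXX5678 4=12345 38=9876 41=1 98=98765432 99=1",
    "5=123456XXXXXX1234 4=12345 38=1234 41=1 98=12345678912345678912 99=1",
]
summary = summarize(lines, ["38=", "5=", "98="])
```

Joining across all lines at once would instead give a single string with six values, which is what the extra parentheses in the relevance avoid.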
What if we wanted to keep the field labels with the results? Going way back to the starts of firsts "X" query above, which showed that firsts "X" and lasts "X" return substrings that know their position within the line, we can modify this very slightly:
following texts of preceding texts of (firsts "38=" of it; firsts "5=" of it; firsts "98=" of it)
The preceding text of (firsts "38=") gives the substring from the beginning of the line up to "38="; the following text of that is the string that begins with "38=", so we can take from there up to the next space to get the field value including the "38=" label.
q: (concatenation "; " of (preceding texts of firsts " " of following texts of preceding texts of (firsts "38=" of it; firsts "5=" of it; firsts "98=" of it))) of lines ((it + 11) of line numbers of lines whose (it contains "* 1 ( APPROVED )") of it) of files whose (name of it as string as lowercase contains "test.txt") of folder "c:\temp"
A: 38=9876; 5=123456XXXXXX5678; 98=98765432
A: 38=1234; 5=123456XXXXXX1234; 98=12345678912345678912
T: 3.556 ms
I: plural string
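For comparison, keeping the label is the simpler case in most general-purpose languages, because you slice from the start of the label instead of from its end. A minimal Python sketch, with a hypothetical helper name and a shortened sample line:

```python
def labeled_field(line, label):
    """Return the label plus its value, up to the next space, or None."""
    start = line.find(label)
    if start == -1:
        return None  # label not present
    end = line.find(" ", start)  # first space at or after the label
    if end == -1:
        return None  # no space after the value
    return line[start:end]


line = "37=1234 38=9876 41=1"
labeled = labeled_field(line, "38=")
```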