I’m trying to update a relevance statement that uses RegEx to find IPs in /etc/pf.anchors/ files, but I’m getting mixed results. Here’s what the file contains:
table <ranges> {10.0.1.0/28 10.0.2.0/25 10.0.4.4 }
# Block external
block in on en0 proto tcp from any to any port 22
block in on en0 proto tcp from any to any port 5900
# Allow internal
pass in on en0 proto tcp from <ranges> to any port 22
pass in on en0 proto tcp from <ranges> to any port 5900
And my analysis uses this line:
(unique values of matches (regex "([0-9]{1,3}[.]){3}([0-9]{1,3}[/]{0,1}[0-9]{0,2}|[0-9]{1,3})") of lines whose (it does not start with "#" AND it as trimmed string != "") of file "/etc/pf.anchors/edu.psu.test")
The regex seems to work outside of QnA (testing here: http://regexr.com), but in QnA I receive more results than I expected:
Q: (unique values of matches (regex "([0-9]{1,3}[.]){3}([0-9]{1,3}[/]{0,1}[0-9]{0,2}|[0-9]{1,3})") of lines whose (it does not start with "#" AND it as trimmed string != "") of file "/etc/pf.anchors/edu.psu.test")
A: 0.0.1.0/28
A: 0.0.2.0/25
A: 0.0.4.4
A: 10.0.1.0/28
A: 10.0.2.0/25
A: 10.0.4.4
T: 3729
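For comparison, here is a minimal Python sketch of the same pattern run outside QnA. It is not the BigFix engine, and the sample text is just the file contents posted above. Python's re.finditer only reports non-overlapping matches, so it returns three values. Each extra QnA answer (0.0.1.0/28 and so on) is the same address with its first character dropped, which would be consistent with the relevance matches inspector also reporting overlapping matches. That explanation is an assumption and is not confirmed anywhere in this thread.

```python
import re

# The pattern from the question, unchanged.
PATTERN = re.compile(r"([0-9]{1,3}[.]){3}([0-9]{1,3}[/]{0,1}[0-9]{0,2}|[0-9]{1,3})")

# Sample contents copied from the post above.
SAMPLE = """table <ranges> {10.0.1.0/28 10.0.2.0/25 10.0.4.4 }
# Block external
block in on en0 proto tcp from any to any port 22
block in on en0 proto tcp from any to any port 5900
# Allow internal
pass in on en0 proto tcp from <ranges> to any port 22
pass in on en0 proto tcp from <ranges> to any port 5900
"""


def unique_matches(text):
    """Collect unique, non-overlapping matches from non-comment, non-blank lines."""
    seen = []
    for line in text.splitlines():
        # Same filter as the relevance: skip comments and empty lines.
        if line.startswith("#") or line.strip() == "":
            continue
        for match in PATTERN.finditer(line):
            if match.group(0) not in seen:
                seen.append(match.group(0))
    return seen


print(unique_matches(SAMPLE))
# ['10.0.1.0/28', '10.0.2.0/25', '10.0.4.4']
```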
This is tough, not exactly sure if it is PMR worthy or not. Certainly could try that, but this seems more in the regex help area, and I’m not sure how much IBM support will be able to help there.
Are the /'s expected to be there in all of the results?
Are the IPs always on a line starting with table <ranges> contained within brackets?
I don’t know much about this file type and format, so I’m trying to get a better understanding of what is expected to always be there and how the data should be pulled out. It might be helpful if you could provide links online to public examples of this file format.
I don’t usually recommend using RegEx in such a broad way. The more structured and consistent the data is, the more you can do with string parsing rather than RegEx alone. Even when RegEx is required, I try to use string parsing first to limit the scope of the RegEx.
Does this work?
(it as trimmed string) whose(it != "") of (preceding text of first "/" of it | it) of substrings separated by " " of preceding texts of lasts "}" of following texts of firsts "{" of "table <ranges> {10.0.1.0/28 10.0.2.0/25 10.0.4.4 }"
This is my attempt at applying this relevance to the file:
(it as trimmed string) whose(it != "") of (preceding text of first "/" of it | it) of substrings separated by " " of preceding texts of lasts "}" of following texts of firsts "{" of lines containing "{" whose(it contains "}") of files "/etc/pf.anchors/edu.psu.test"
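To make the parsing steps in that relevance easier to follow, here is a rough Python equivalent. It is only an illustration of the logic, not something the BigFix client runs, and the function name ips_in_braces is made up for this example. It takes the text between the first "{" and the last "}", splits on spaces, drops anything from the first "/" onward, trims, and discards empty strings.

```python
def ips_in_braces(line):
    """Return the addresses found between the first "{" and the last "}" of a line."""
    if "{" not in line:
        return []
    inner = line.split("{", 1)[1]        # following text of first "{"
    if "}" not in inner:
        return []
    inner = inner.rsplit("}", 1)[0]      # preceding text of last "}"

    addresses = []
    for token in inner.split(" "):       # substrings separated by " "
        address = token.split("/", 1)[0]  # preceding text of first "/" of it | it
        address = address.strip()         # it as trimmed string
        if address != "":                 # whose (it != "")
            addresses.append(address)
    return addresses


print(ips_in_braces("table <ranges> {10.0.1.0/28 10.0.2.0/25 10.0.4.4 }"))
# ['10.0.1.0', '10.0.2.0', '10.0.4.4']
```

Note that, like the relevance, this strips the CIDR suffix, so 10.0.1.0/28 is reported as 10.0.1.0.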
Not all results would have the /'s. I would expect to see standard IP and CIDR IP notations only.
No, the IPs can also be on a new line. They will typically be in a ‘table {}’, but could also be referenced directly in the rules.
The type of data I’m looking for are IP addresses, which can exist in a PF Table: OpenBSD PF: Tables
It’s also possible to have IPs directly added to rules: OpenBSD PF: Tables
My original post has an example of the content found in PF. I updated the file to include some additional test IPs:
table <ranges> {10.0.1.0/28 10.0.2.0/25 10.0.4.4 }
table <ranges> {192.168.1.0/28 192.168.2.0/25 10.0.4.4 }
# Block external
block in on en0 proto tcp from any to any port 22
block in on en0 proto tcp from any to any port 5900
# Allow internal
pass in on en0 proto tcp from <ranges> to any port 22
pass in on en0 proto tcp from <ranges> to any port 5900
The first relevance finds the 10. ranges.
Q: (it as trimmed string) whose(it != "") of (preceding text of first "/" of it | it) of substrings separated by " " of preceding texts of lasts "}" of following texts of firsts "{" of "table <ranges> {10.0.1.0/28 10.0.2.0/25 10.0.4.4 }"
A: 10.0.1.0
A: 10.0.2.0
A: 10.0.4.4
The relevance does find all the IPs!
Q: (it as trimmed string) whose(it != "") of (preceding text of first "/" of it | it) of substrings separated by " " of preceding texts of lasts "}" of following texts of firsts "{" of lines containing "{" whose(it contains "}") of files "/etc/pf.anchors/edu.psu.test"
A: 10.0.1.0
A: 10.0.2.0
A: 10.0.4.4
A: 192.168.1.0
A: 192.168.2.0
A: 10.0.4.4
The relevance I provided will only find IPs within { brackets } but not elsewhere in the file.
The relevance I provided should be significantly faster than the regex. If you really need to find IPs that are not within brackets, you might need regex for that case, though if the IPs always follow the add or delete keyword, that makes it easier.
This is more generalized to handle more cases, but it sounds like you don’t need it currently:
(it as trimmed string) whose(it != "") of (preceding text of first "/" of it | it) of substrings separated by " " of (preceding text of last "}" of it | it) of (following text of first "{" of it | it) of (following text of first "add" of it | it) of (following text of first "delete" of it | it) of lines containing "." whose( (it contains "{" AND it contains "}") OR (it as lowercase contains "add") OR (it as lowercase contains "delete")) of files "/etc/pf.anchors/edu.psu.test"
This should find BOTH those IPs within brackets as well as those after the add or delete keyword, though it will report those added and deleted the same way.
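Here is a loose Python sketch of how the generalized relevance behaves, in case it helps to see the fallback ("| it") steps spelled out. The helper names are invented for this example, and the pfctl-style line in the usage is a hypothetical input, not something taken from the file in this thread.

```python
def following_first(text, marker):
    """Text after the first marker, or the text unchanged if the marker is absent."""
    _, sep, tail = text.partition(marker)
    return tail if sep else text


def preceding_first(text, marker):
    """Text before the first marker, or the text unchanged if the marker is absent."""
    head, sep, _ = text.partition(marker)
    return head if sep else text


def preceding_last(text, marker):
    """Text before the last marker, or the text unchanged if the marker is absent."""
    head, sep, _ = text.rpartition(marker)
    return head if sep else text


def addresses(line):
    """Addresses inside braces or after an add/delete keyword on one line."""
    lowered = line.lower()
    if "." not in line:  # lines containing "."
        return []
    if not (("{" in line and "}" in line) or "add" in lowered or "delete" in lowered):
        return []
    # Applied in the order the relevance evaluates them (right to left).
    text = following_first(line, "delete")
    text = following_first(text, "add")
    text = following_first(text, "{")
    text = preceding_last(text, "}")
    tokens = (preceding_first(token, "/").strip() for token in text.split(" "))
    return [token for token in tokens if token != ""]


print(addresses("table <ranges> {192.168.1.0/28 192.168.2.0/25 10.0.4.4 }"))
# ['192.168.1.0', '192.168.2.0', '10.0.4.4']
print(addresses("pfctl -t ranges -T add 10.0.5.5"))  # hypothetical line
# ['10.0.5.5']
```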
One key thing in this relevance is lines containing ".", which causes the rest of the relevance to consider only lines of the file that contain a period. The first relevance you provided attempts the regex on every line of the file, even lines without a single ".", so it becomes much slower as the file grows. If you filter with lines containing first, it will still be somewhat slower on larger files, but not as significantly, since most lines are eliminated early in the process.
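The same pre-filtering idea can be sketched in Python: run a cheap substring test first and only hand the surviving lines to the regex. This only illustrates the ordering; it makes no claim about actual timings in the BigFix client, and find_ips is a made-up name.

```python
import re

# The pattern from the original question, unchanged.
PATTERN = re.compile(r"([0-9]{1,3}[.]){3}([0-9]{1,3}[/]{0,1}[0-9]{0,2}|[0-9]{1,3})")


def find_ips(lines):
    """Run the regex only on lines that contain a period."""
    found = []
    for line in lines:
        if "." not in line:  # cheap check, analogous to: lines containing "."
            continue
        found.extend(match.group(0) for match in PATTERN.finditer(line))
    return found


sample_lines = [
    "table <ranges> {10.0.1.0/28 10.0.2.0/25 10.0.4.4 }",
    "# Block external",
    "block in on en0 proto tcp from any to any port 22",
    "pass in on en0 proto tcp from <ranges> to any port 22",
]
print(find_ips(sample_lines))
# ['10.0.1.0/28', '10.0.2.0/25', '10.0.4.4']
```

In this sample only the table line contains a period, so the regex runs once instead of four times.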