File parsing - MongoDB config files (Linux only)

Hello all,

I need to check compliance for DISA STIGs by pulling content from MongoDB config files (mongod.conf) whose line layout can be inconsistent. Here’s a default example; notice there is no blank line before the “security” stanza. There could also be anywhere from 1 to 5 or so setting lines below each stanza header (4 under “tls”, for example).

tls:
   mode: requireTLS
   certificateKeyFile: "/u01/mongo/etc/ssl/hostname.va.gov.pem"
   CAFile: "/u01/mongo/etc/ssl/vapkirootssub1.pem"
   FIPSMode: true

storage:
   dbPath: "/u01/mongo/data"
   journal:
      enabled: true
security:
   keyfile: "/u01/mongo/keys/keyfile"
   authorization: enabled
   javascriptEnabled: false

For the most part there is one line underneath each stanza I need to return a boolean on. For example, I need to confirm if the “enabled: true” line underneath the storage stanza is present:

If (next line of it as lowercase contains "enabled: true" OR next line of next line of it as lowercase contains "enabled: true" OR next line of next line of next line of it as lowercase contains "enabled: true") of line containing "storage:" of file "/etc/mongod.conf" then "Compliant" else "Not Compliant"

Using a series of "next line of it …" works fine (as long as I include enough of them), but it’s hardly elegant.
Because of the possible inconsistencies noted above, I don’t see another solution. I was thinking of a regex, but I’m not that familiar with those and am unsure whether we could get one working anyway, given those inconsistencies.

Any ideas?

Thanks!

I added code tags to your post, so the indentations are presented. This is YAML markup, and the spacing before section names & parameters is required.

We don’t yet have native YAML inspectors, and I don’t have a solution for you yet, but I think it will be based on the spacing.

Thank you, Jason. This is very helpful. I’ve written a few relevance statements based on character position, so I’ll have to find those in my notes. I’m thinking now I can use preceding and following text based on that spacing.

There is an idea I submitted a while back for a YAML inspector, so please vote if you agree that it would be useful.

In the meantime, I wrote a piece of code and tested it against a dummy file. I think it works fine, but feel free to test further. It DOES go through the lines of the file several times to make it dynamic, so I wouldn’t recommend it for a very long YAML file (performance-wise it will get bad in a HURRY!!!), but for something small it should do the trick…

Code:
exists (set of ((item 1 of it as trimmed string) of (set of (integers in (tuple string item 0 of it as integer, tuple string item 1 of it as integer - 1)) of concatenation ", " of (item 0 of it as string) of ((line numbers of lines whose (it as trimmed string != "" and it does not start with " ") of it; maximum of line numbers of lines of it), line number of line starting with "<name_of_stanza>" of it) whose (item 1 of it <= item 0 of it) of it, lines whose (it as trimmed string != "") of it) whose (line number of item 1 of it is contained by item 0 of it) of file "mongod.conf" of folder "<folder_location>")) whose (exists elements whose (it starts with "<config_name>" and it contains "<config_value>") of it)

Testing examples:
q: (line number of it, it) of lines of file "mongod.conf" of folder "c:\temp"
A: 1, tls:
A: 2, mode: requireTLS
A: 3, certificateKeyFile: "/u01/mongo/etc/ssl/hostname.va.gov.pem"
A: 4, CAFile: "/u01/mongo/etc/ssl/vapkirootssub1.pem"
A: 5, FIPSMode: true
A: 6,
A: 7, storage:
A: 8, dbPath: "/u01/mongo/data"
A: 9, journal:
A: 10, enabled: true
A: 11, security:
A: 12, keyfile: "/u01/mongo/keys/keyfile"
A: 13, authorization: enabled
A: 14, javascriptEnabled: false
A: 15, test1:
A: 16, mode: requireTLS
A: 17, certificateKeyFile: "/u01/mongo/etc/ssl/hostname.va.gov.pem"
A: 18, CAFile: "/u01/mongo/etc/ssl/vapkirootssub1.pem"
A: 19, FIPSMode: true
A: 20,
A: 21, test2:
A: 22, dbPath: "/u01/mongo/data"
A: 23, journal:
A: 24, enabled: true
A: 25, test3:
A: 26, keyfile: "/u01/mongo/keys/keyfile"
A: 27, authorization: enabled
A: 28, javascriptEnabled: false
T: 9.852 ms
I: plural ( integer, file line )

q: exists (set of ((item 1 of it as trimmed string) of (set of (integers in (tuple string item 0 of it as integer, tuple string item 1 of it as integer - 1)) of concatenation ", " of (item 0 of it as string) of ((line numbers of lines whose (it as trimmed string != "" and it does not start with " ") of it; maximum of line numbers of lines of it), line number of line starting with "test2" of it) whose (item 1 of it <= item 0 of it) of it, lines whose (it as trimmed string != "") of it) whose (line number of item 1 of it is contained by item 0 of it) of file "mongod.conf" of folder "c:\temp")) whose (exists elements whose (it starts with "enabled" and it contains "true") of it)
A: True
T: 8.852 ms
I: singular boolean

q: exists (set of ((item 1 of it as trimmed string) of (set of (integers in (tuple string item 0 of it as integer, tuple string item 1 of it as integer - 1)) of concatenation ", " of (item 0 of it as string) of ((line numbers of lines whose (it as trimmed string != "" and it does not start with " ") of it; maximum of line numbers of lines of it), line number of line starting with "security" of it) whose (item 1 of it <= item 0 of it) of it, lines whose (it as trimmed string != "") of it) whose (line number of item 1 of it is contained by item 0 of it) of file "mongod.conf" of folder "c:\temp")) whose (exists elements whose (it starts with "javascriptEnabled" and it contains "true") of it)
A: False
T: 4.416 ms
I: singular boolean

Thanks, ageorgiev! I’ll try and find time to test it out.

For my problem, I think the key is isolating the text of each individual stanza, and using preceding and following text should do that. So if I wanted to return a boolean for whether the line “enabled: true” exists underneath the stanza header “storage:” …

Then I’d want to look at the lines that follow “storage:”, but precede “security”. The problem is there’s no guarantee the security stanza will follow the storage stanza, so need to figure out how to write the preceding text part based on character position, as in the next line that DOES NOT have an empty space as the first character, as “security” doesn’t.

Still working on it, but if someone has ideas would love to hear them!

The code I gave you should be doing just that. All you need to change is:

  1. folder path to the file
  2. the stanza header
  3. the sub-key you are checking
  4. value you are checking for…

The logic I built into the parsing is as follows:

  1. Breaks the file down into stanzas and their sub-keys and gets back the line numbers for each
  2. Finds the line number of the stanza you are looking for and generates an integer range covering all the lines of that stanza
  3. Checks whether any line in that range contains both the sub-key and the value you are after. If one does, it returns “True”, otherwise “False”
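
For anyone who finds the one-line relevance hard to follow, the three steps above can be sketched outside of relevance. This is only a rough Python illustration of the same logic, not the relevance itself: the function name `stanza_has_setting` and the shortened sample text are made up for the example, and unlike the relevance it lets the last stanza run to the end of the file.

```python
SAMPLE = """\
tls:
   mode: requireTLS
   FIPSMode: true

storage:
   dbPath: "/u01/mongo/data"
   journal:
      enabled: true
security:
   keyfile: "/u01/mongo/keys/keyfile"
   authorization: enabled
   javascriptEnabled: false
"""


def stanza_has_setting(text, stanza, key, value):
    """Hypothetical helper: True if `stanza` holds a line starting with `key` that contains `value`."""
    lines = text.splitlines()
    # Step 1: 1-based line numbers of top-level headers
    # (non-blank lines that do not start with a space).
    headers = [n for n, line in enumerate(lines, 1)
               if line.strip() and not line.startswith(" ")]
    # Step 2: locate the requested stanza and the line where it ends
    # (the next header, or one past the last line of the file).
    start = next((n for n in headers if lines[n - 1].startswith(stanza)), None)
    if start is None:
        return False
    later = [n for n in headers if n > start]
    end = later[0] if later else len(lines) + 1
    # Step 3: look only at the trimmed lines inside that range.
    body = [lines[n - 1].strip() for n in range(start + 1, end)]
    return any(s.startswith(key) and value in s for s in body)


print(stanza_has_setting(SAMPLE, "storage", "enabled", "true"))
print(stanza_has_setting(SAMPLE, "security", "javascriptEnabled", "true"))
```

The two calls mirror the two test queries above: the first looks for "enabled ... true" under "storage", the second for "javascriptEnabled ... true" under "security".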

Apologies, ageorgiev! It was more that I’m not at the level where I could break down your code. I know the individual elements, etc., but putting them into a “working” whole still eludes me sometimes.

And so far it looks great. I know you’ve already done this, but I’ve tested to ensure that other lines containing “enabled” and “true” outside the stanza I want to look at don’t throw a false positive, and they don’t.

Will test more but this looks outstanding (and I will vote).


Note that I am not happy with this yet but wanted to post some regex-based progress in case it’s useful or sparks any ideas.

q: ((items 0 of it of it as trimmed string) whose (it != "") , tuple string of ((substrings separated by "%0a" of item 1 of it as trimmed string) whose (it != ""))) of (item 0 of it, (preceding text of first match (regex("(%0a[[:alnum:]])")) of it | it) of item 1 of it) of (it, concatenation "%0a" of following text of it) whose (item 0 of it does not start with " ") of substrings separated by "%0a" of concatenation "%0a" of lines of file "c:\temp\mongo.txt"
A: tls:, ( mode: requireTLS, certificateKeyFile: "/u01/mongo/etc/ssl/hostname.va.gov.pem", CAFile: "/u01/mongo/etc/ssl/vapkirootssub1.pem", FIPSMode: true )
A: storage:, ( dbPath: "/u01/mongo/data", journal:, enabled: true )
A: security:, ( keyfile: "/u01/mongo/keys/keyfile", authorization: enabled, javascriptEnabled: false )
T: 4.409 ms
I: plural ( string, string )

q: tuple string items whose (it starts with "CAFile:") of items 1 of (it) whose (item 0 of it = "tls:") of ((items 0 of it of it as trimmed string) whose (it != "") , tuple string of ((substrings separated by "%0a" of item 1 of it as trimmed string) whose (it != ""))) of (item 0 of it, (preceding text of first match (regex("(%0a[[:alnum:]])")) of it | it) of item 1 of it) of (it, concatenation "%0a" of following text of it) whose (item 0 of it does not start with " ") of substrings separated by "%0a" of concatenation "%0a" of lines of file "c:\temp\mongo.txt"
A: CAFile: "/u01/mongo/etc/ssl/vapkirootssub1.pem"
T: 2.197 ms

Expanding out the first query for explanation -

(
  (items 0 of it of it as trimmed string) whose (it != "") /* discard any 'empty' stanza (blank line in origin) */
  , tuple string of ((substrings separated by "%0a" of item 1 of it as trimmed string) whose (it != "")) /* split the content of this stanza on newlines, and combine them back into a tuple string, discarding empty values */
) of 
(
  /* with item 0 being the line starting a stanza, 
  split the item 1 to be everything before the start of the next stanza, by finding the next line that does not start with a space(%0a followed by an alphanumeric character). If there is no 'next stanza' then match to the end of the string.
  */
  item 0 of it
  , (preceding text of first match (regex("(%0a[[:alnum:]])")) of it | it) of item 1 of it
) of (
	/* items 0 will be each line of the file filtered to those that do not start with a space,
     items 1 will be the following lines of that line (all the way to the end of the file!) concatenated (again) by %0a
  */
  it
  , concatenation "%0a" of following text of it
  ) whose (
    item 0 of it does not start with " "
    ) of 
/* 
   join all the lines of the file with %0a character (newline) so it's all one big string
   Then split the string into substrings on the %0a character.  Substrings allow 'preceding text of it' and 'following text of it' 
   needed later
 */
substrings separated by "%0a" 
of concatenation "%0a" 
of lines of file "c:\temp\mongo.txt"

Once that’s done, it’s easier to retrieve the contents where ‘item 0’ is the name of a stanza, and ‘item 1’ is a tuple string of all the items in that stanza.

items 1 of (it) whose (item 0 of it = "tls:") of [the long query]

--> returns all the contents of the ‘tls:’ stanza

A: tls:, ( mode: requireTLS, certificateKeyFile: "/u01/mongo/etc/ssl/hostname.va.gov.pem", CAFile: "/u01/mongo/etc/ssl/vapkirootssub1.pem", FIPSMode: true )

 tuple string items whose (it starts with "CAFile:") of items 1 of (it) whose (item 0 of it = "tls:")  of [the long query]

--> returns the ‘CAFile:’ value from the ‘tls:’ stanza:

A: CAFile: "/u01/mongo/etc/ssl/vapkirootssub1.pem"

I’m still not happy with this. I think it could be made simpler, and it doesn’t handle nesting of stanzas (it only recognizes the top-level indentations as stanzas). But it’s a work-in-progress.
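
To make the idea behind the regex easier to see, here is a rough Python sketch of the same split: cut the text wherever a newline is followed by an alphanumeric character, then trim the pieces and drop the empty ones. It is an illustration under assumptions, not a translation of the relevance: `split_stanzas` and the shortened sample are invented for the example, and like the query above it flattens nested stanzas.

```python
import re

SAMPLE = (
    "tls:\n"
    "   mode: requireTLS\n"
    '   CAFile: "/u01/mongo/etc/ssl/vapkirootssub1.pem"\n'
    "\n"
    "storage:\n"
    '   dbPath: "/u01/mongo/data"\n'
    "   journal:\n"
    "      enabled: true\n"
    "security:\n"
    "   authorization: enabled\n"
)


def split_stanzas(text):
    """Hypothetical helper: map each top-level header to its trimmed, non-blank lines."""
    stanzas = {}
    # Split at every newline that is followed by an alphanumeric character,
    # i.e. right before a line that does not start with a space.
    for chunk in re.split(r"\n(?=[A-Za-z0-9])", text):
        lines = [ln.strip() for ln in chunk.split("\n") if ln.strip()]
        if lines:
            stanzas[lines[0]] = lines[1:]
    return stanzas


stanzas = split_stanzas(SAMPLE)
print([s for s in stanzas["tls:"] if s.startswith("CAFile:")])
```

Keying the result by header plays the same role as ‘item 0’ in the query, and the list of settings plays the role of the tuple string in ‘item 1’.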


Here’s another approach that avoids regular expressions. I still don’t have a good way to track the nesting of sub-stanzas but this does break out the top-level stanzas.

First find the positions where we need to start & end a stanza - the first position of the string, the last position of the string, and the positions in between that are the start of a new stanza (the previous character is a “%0a” and the current character is not a space):

q: concatenation "%0a" of lines of file "c:\temp\mongo.txt"
A: tls:%0a   mode: requireTLS%0a   certificateKeyFile: "/u01/mongo/etc/ssl/hostname.va.gov.pem"%0a   CAFile: "/u01/mongo/etc/ssl/vapkirootssub1.pem"%0a   FIPSMode: true%0a%0astorage:%0a   dbPath: "/u01/mongo/data"%0a   journal:%0a      enabled: true%0asecurity:%0a   keyfile: "/u01/mongo/keys/keyfile"%0a   authorization: enabled%0a   javascriptEnabled: false
T: 16.739 ms

q: (((starts of ( characters (positions of it) whose (start of it = 0 or (it does not start with "%0a" and it does not start with " " and last 1 of preceding text of it="%0a" )) of it); length of it)) of it) of concatenation "%0a" of lines of file "c:\temp\mongo.txt"
A: 0
A: 159
A: 229
A: 330
T: 16.374 ms

We need to put those position integers into a ‘set’ so we can compare them to each other. We want to build a list of (start position, end position) for each stanza. Loop through each number (as a potential stanza start) and find the smallest number in the set that is greater than it; that is the end of the stanza:

q: (item 0 of it, item 1 of it) of ((elements of it, elements of it, it) whose (item 1 of it = minimum of items 1 of (item 0 of it, elements of item 2 of it) whose (item 1 of it > item 0 of it) ) of (set of (starts of ( characters (positions of it) whose (start of it = 0 or (it does not start with "%0a" and it does not start with " " and last 1 of preceding text of it="%0a" )) of it); length of it)) of it) of concatenation "%0a" of lines of file "c:\temp\mongo.txt"
A: 0, 159
A: 159, 229
A: 229, 330
T: 28.092 ms
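
The boundary pairing can be sketched in a few lines of Python as a sanity check on the logic. This is only an illustration: `stanza_spans` and the tiny sample string are made up, and the sketch sorts the positions and pairs neighbours instead of searching for a minimum, which gives the same (start, end) pairs.

```python
def stanza_spans(text):
    """Hypothetical helper: (start, end) character offsets of each top-level stanza."""
    # A stanza starts at offset 0, or at any character that is neither a
    # space nor a newline and directly follows a newline.
    starts = [i for i, ch in enumerate(text)
              if i == 0 or (ch not in " \n" and text[i - 1] == "\n")]
    # Add the total length as the final boundary, then pair each
    # boundary with the next larger one.
    bounds = sorted(set(starts + [len(text)]))
    return list(zip(bounds, bounds[1:]))


text = "a:\n  x: 1\n\nb:\n  y: 2\nc:\n  z: 3"
for start, end in stanza_spans(text):
    print(start, end, repr(text[start:end]))
```

Slicing `text[start:end]` corresponds to taking a substring with a start of ‘item 0’ and a length of ‘item 1 - item 0’.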

We can use the substring (<start>, <length>) of <string> inspector to retrieve the substrings between those positions, where the start is ‘item 0 of it’ and the length is ‘item 1 of it - item 0 of it’.

q: substrings ((item 0 of it, item 1 of it - item 0 of it) of ((elements of it, elements of it, it) whose (item 1 of it = minimum of items 1 of (item 0 of it, elements of item 2 of it) whose (item 1 of it > item 0 of it) ) of (set of (starts of ( characters (positions of it) whose (start of it = 0 or (it does not start with "%0a" and it does not start with " " and last 1 of preceding text of it="%0a" )) of it); length of it)) of it)) of concatenation "%0a" of lines of file "c:\temp\mongo.txt"
A: tls:%0a   mode: requireTLS%0a   certificateKeyFile: "/u01/mongo/etc/ssl/hostname.va.gov.pem"%0a   CAFile: "/u01/mongo/etc/ssl/vapkirootssub1.pem"%0a   FIPSMode: true%0a%0a
A: storage:%0a   dbPath: "/u01/mongo/data"%0a   journal:%0a      enabled: true%0a
A: security:%0a   keyfile: "/u01/mongo/keys/keyfile"%0a   authorization: enabled%0a   javascriptEnabled: false

Because I find ‘tuple strings’ to be easier to use in terms of looking up values in the string, I’ll split each of these results on the ‘%0a’ character and turn each into a tuple string result:

q: (tuple string of ((substrings separated by "%0a" of it) )) of substrings ((item 0 of it, item 1 of it - item 0 of it) of ((elements of it, elements of it, it) whose (item 1 of it = minimum of items 1 of (item 0 of it, elements of item 2 of it) whose (item 1 of it > item 0 of it) ) of (set of (starts of ( characters (positions of it) whose (start of it = 0 or (it does not start with "%0a" and it does not start with " " and last 1 of preceding text of it="%0a" )) of it); length of it)) of it)) of concatenation "%0a" of lines of file "c:\temp\mongo.txt"

A: tls:,    mode: requireTLS,    certificateKeyFile: "/u01/mongo/etc/ssl/hostname.va.gov.pem",    CAFile: "/u01/mongo/etc/ssl/vapkirootssub1.pem",    FIPSMode: true, , 

A: storage:,    dbPath: "/u01/mongo/data",    journal:,       enabled: true, 

A: security:,    keyfile: "/u01/mongo/keys/keyfile",    authorization: enabled,    javascriptEnabled: false
T: 10.679 ms

That ends up with some empty values (where there was a blank line in the file), and each item might also be preceded by spaces (where the value was indented in the file). We can remove those by trimming the substrings and discarding the empty ones:

q: (tuple string of ((substrings separated by "%0a" of it as trimmed string) whose (it != ""))) of substrings ((item 0 of it, item 1 of it - item 0 of it) of ((elements of it, elements of it, it) whose (item 1 of it = minimum of items 1 of (item 0 of it, elements of item 2 of it) whose (item 1 of it > item 0 of it) ) of (set of (starts of ( characters (positions of it) whose (start of it = 0 or (it does not start with "%0a" and it does not start with " " and last 1 of preceding text of it="%0a" )) of it); length of it)) of it)) of concatenation "%0a" of lines of file "c:\temp\mongo.txt"

A: tls:, mode: requireTLS, certificateKeyFile: "/u01/mongo/etc/ssl/hostname.va.gov.pem", CAFile: "/u01/mongo/etc/ssl/vapkirootssub1.pem", FIPSMode: true

A: storage:, dbPath: "/u01/mongo/data", journal:, enabled: true

A: security:, keyfile: "/u01/mongo/keys/keyfile", authorization: enabled, javascriptEnabled: false

Now that we have each stanza as its own result, and each result is represented in a tuple string, we can retrieve any specific value from any top-level stanza:

// retrieve 'CAFile:' from 'tls:'

q: tuple string items whose (it starts with "CAFile:") of it whose (tuple string item 0 of it starts with "tls:") of (tuple string of ((substrings separated by "%0a" of it as trimmed string) whose (it != ""))) of substrings ((item 0 of it, item 1 of it - item 0 of it) of ((elements of it, elements of it, it) whose (item 1 of it = minimum of items 1 of (item 0 of it, elements of item 2 of it) whose (item 1 of it > item 0 of it) ) of (set of (starts of ( characters (positions of it) whose (start of it = 0 or (it does not start with "%0a" and it does not start with " " and last 1 of preceding text of it="%0a" )) of it); length of it)) of it)) of concatenation "%0a" of lines of file "c:\temp\mongo.txt"
A: CAFile: "/u01/mongo/etc/ssl/vapkirootssub1.pem"
T: 8.788 ms

// retrieve 'keyfile:' from 'security:'

q: tuple string items whose (it starts with "keyfile:") of it whose (tuple string item 0 of it starts with "security:") of (tuple string of ((substrings separated by "%0a" of it as trimmed string) whose (it != ""))) of substrings ((item 0 of it, item 1 of it - item 0 of it) of ((elements of it, elements of it, it) whose (item 1 of it = minimum of items 1 of (item 0 of it, elements of item 2 of it) whose (item 1 of it > item 0 of it) ) of (set of (starts of ( characters (positions of it) whose (start of it = 0 or (it does not start with "%0a" and it does not start with " " and last 1 of preceding text of it="%0a" )) of it); length of it)) of it)) of concatenation "%0a" of lines of file "c:\temp\mongo.txt"
A: keyfile: "/u01/mongo/keys/keyfile"
T: 5.259 ms
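
As a cross-check on the last two steps (trim and drop empties, then pick a value out of one stanza), here is a small Python sketch. It assumes the per-stanza chunks have already been cut out as in the substring results above; `to_items` and `find_setting` are names invented for this example, and the chunks are shortened copies of that output.

```python
CHUNKS = [
    'tls:\n   mode: requireTLS\n   CAFile: "/u01/mongo/etc/ssl/vapkirootssub1.pem"\n   FIPSMode: true\n\n',
    'storage:\n   dbPath: "/u01/mongo/data"\n   journal:\n      enabled: true\n',
    'security:\n   keyfile: "/u01/mongo/keys/keyfile"\n   authorization: enabled\n   javascriptEnabled: false',
]


def to_items(chunk):
    """Split one stanza on newlines, trim each piece, and drop the empty ones."""
    return [part.strip() for part in chunk.split("\n") if part.strip()]


def find_setting(chunks, header, key):
    """Return the settings starting with `key` from the stanza whose first item starts with `header`."""
    for chunk in chunks:
        items = to_items(chunk)
        if items and items[0].startswith(header):
            # Only look at the settings that follow the header itself.
            return [s for s in items[1:] if s.startswith(key)]
    return []


print(find_setting(CHUNKS, "tls:", "CAFile:"))
print(find_setting(CHUNKS, "security:", "keyfile:"))
```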

Hey guys,

Thanks so much for the help; both solutions work. I did just find I have an additional requirement: along with looking for a parameter that matches a certain value, I also need to ensure a “filter” parameter does not exist.

For example the code at the bottom is looking for an element containing a parameter beneath the “auditlog” stanza that has the name and value pair of:

destination: syslog

It works perfectly, but when I add a check that a “filter:” parameter does not also exist, I get a parsing error. Clearly my syntax is off, as I can’t get the last “it” to point to the folder object. I’m still working on it and pretty sure this is an easy fix I’m just missing.

So if either of you (or anyone) can point out where I’m going wrong, it would probably save me a fair amount of time.

Thanks!

q: If exists (set of ((item 1 of it as trimmed string) of (set of (integers in (tuple string item 0 of it as integer, tuple string item 1 of it as integer - 1)) of concatenation ", " of (item 0 of it as string) of ((line numbers of lines whose (it as trimmed string != "" and it does not start with " ")of it; maximum of line numbers of lines of it), line number of line starting with "auditlog:" of it) whose (item 1 of it <= item 0 of it) of it, lines whose (it as trimmed string != "") of it) whose (line number of item 1 of it is contained by item 0 of it) of file "C:\test.txt")) whose (exists elements whose (it starts with "destination:" and it contains "syslog") of it) and whose (not exists elements whose (it starts with "filter:") of it) then "Compliant" else "Not Compliant"
E: class ParsingFailure

The problem is the “and whose (...)” after the first whose clause: a second whose can’t be chained on with “and”, so it gives the parsing error.

Since ‘it’ in the whose clause refers to the whole set of lines, I think you want whose (good condition exists AND bad condition does not exist), so probably something like

whose ( exists elements whose (it starts with "destination:" and it contains "syslog") of it and not exists elements whose (it starts with "filter:") of it ) then "Compliant" else "Not Compliant"
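
The shape of that combined condition (one thing must exist AND another must not, both tested against the same set of lines) can be sketched in Python. This is just an illustration of the boolean logic: `audit_compliant` is a made-up name, and the sample settings, including the filter value, are hypothetical.

```python
def audit_compliant(settings):
    """Hypothetical check on the trimmed lines of one stanza."""
    # Required: a "destination:" line that contains "syslog".
    has_syslog = any(s.startswith("destination:") and "syslog" in s
                     for s in settings)
    # Forbidden: any "filter:" line.
    has_filter = any(s.startswith("filter:") for s in settings)
    return "Compliant" if has_syslog and not has_filter else "Not Compliant"


# Hypothetical stanza contents for illustration only.
print(audit_compliant(["destination: syslog"]))
print(audit_compliant(["destination: syslog", "filter: '{ atype: \"authenticate\" }'"]))
```

Both tests sit inside a single condition on the same collection, which is the same idea as putting both “exists” checks inside one whose clause.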

I knew it. I essentially had what you did on a different attempt, but my parenthetical placement was off.

Thanks so much!
