Comparing two files

swap041 · September 15, 2015, 3:12pm

Hi Team,

I want to read the content of a file. The file looks like this:

"hostname","ID","Status"
"ABC123","12YT","Prod"
"DCV234","45TY","uat"
"VBG345","hg567","dev"

I want to append only the hostnames from the above file, which are the first item on each line.

I used substrings separated by "%22", but all the text comes out on the same line.

I want to end up with the file shown below:

hostname
ABC123
DCV234
VBG345

Please assist

jmaple · September 15, 2015, 3:23pm

If each entry for the hostname is the start of the line, you would want the preceding text of the second quote of the text following the first quote of each line.

preceding texts of firsts "%22" of following texts of firsts "%22" of lines of file
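For readers who want to check the logic outside BigFix, here is a minimal Python sketch of the same idea: take the text between the first and second double quote of each line. The sample lines are copied from the question; the function name first_quoted_field is made up for illustration and is not part of any BigFix tooling.

```python
def first_quoted_field(line):
    """Return the text between the first and second double quote, or None."""
    if line.count('"') < 2:
        return None  # no quoted field on this line, so skip it
    # Splitting on the quote gives: text before, first field, the rest.
    return line.split('"', 2)[1]

sample_lines = [
    '"hostname","ID","Status"',
    '"ABC123","12YT","Prod"',
    '"DCV234","45TY","uat"',
    '"VBG345","hg567","dev"',
]

# Keep only the lines that actually contain a quoted first field.
hostnames = [f for f in map(first_quoted_field, sample_lines) if f is not None]
print("\n".join(hostnames))
```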

swap041 · September 16, 2015, 6:26am

Hi,

Thanks, it worked. To append the result to a text file with each hostname on its own line, I put concatenation "%0d%0a" of in front of the expression.

Thanks once again

swap041 · September 16, 2015, 10:15am

Guys, I now have the files in the form shown above.

I need to compare two files, each with around 5000 entries. I want to compare them with (item 0 of it = item 1 of it), meaning find the lines that appear in both files.

I have the following relevance in place:
(item 0 of it) of (lines of file "test.txt", lines of file "text1") whose (item 0 of it = item 1 of it)

The Fixlet Debugger reports an out-of-memory error.

Aram · September 16, 2015, 2:15pm

Can you see if something like the following works better and more quickly?

elements of ((intersection of (item 0 of it;item 1 of it))) of (set of (lines of file "test.txt"),set of (lines of file "text1"))

swap041 · September 21, 2015, 7:10am

Hi Aram,

The comparison of the two files works, but it only finds the first equal string. For example:

File 1    File 2
31        45
34        31
36        67
45        78
67        36

The output is 31, meaning only the first matching value is identified, not all of them. I want all the equal values to be appended to a new file.

swap041 · September 23, 2015, 2:14pm

Guys, any help on this would be appreciated.

JasonWalker · September 24, 2015, 12:42am

@Aram’s relevance works for me. Sure you aren’t using a singular where you should be using a plural? An excerpt from your actual actionscript might help.

swap041 · September 24, 2015, 6:09am

I tried this relevance to get the data:

(item 0 of it) of (preceding texts of firsts "." of lines of file "E:\test\test1.txt", lines of file "E:\test\test2.txt") whose (item 0 of it = item 1 of it)

For files containing 10 to 15 lines it compares successfully, but when I try to compare files with 2000 entries it shows blank output.

Please suggest

JasonWalker · September 24, 2015, 7:57am

You should try that with a smaller data set and then tune it for efficiency. With 2,000 entries, you’re creating a record set of 4,000,000 records, then reducing it to “only” the matches. For each line of the first file, you’re reading every line of the second file to find the lines that match the first file, so the second file is being read 2,000 times.

@Aram’s relevance, on the other hand, reads each file 1 time, putting all the lines of the file into a set object, which can be quickly compared with other sets.

Here you can see that even for a small number of lines, @Aram’s check is twice as fast:

    q: lines of file "c:\temp\test1.txt"
    A: 1.abc
    A: 2.abc
    A: 3.abc
    A: 4.abc
    A: 5.abc
    T: 0.165 ms
    I: plural file line

    q: lines of file "c:\temp\test2.txt"
    A: 3.def
    A: 4.def
    A: 5.def
    A: 6.def
    A: 7.def
    A: 8.def
    T: 0.170 ms
    I: plural file line

    q: (item 0 of it) of (preceding texts of firsts "." of lines of file "c:\temp\test1.txt", preceding texts of firsts "." of lines of file "c:\temp\test2.txt") whose (item 0 of it = item 1 of it)
    A: 3
    A: 4
    A: 5
    T: 0.697 ms
    I: plural substring

    q: elements of ((intersection of (item 0 of it;item 1 of it))) of (set of (preceding texts of firsts "." of lines of file "c:\temp\test1.txt"),set of (preceding texts of firsts "." of lines of file "c:\temp\test2.txt"))
    A: 3
    A: 4
    A: 5
    T: 0.365 ms
    I: plural string

When I increase the file sizes to 2,000 lines (of that same sort of short sample line), the time delta is much more drastic:

    q: number of (preceding texts of firsts "." of lines of file "c:\temp\test1.txt", preceding texts of firsts "." of lines of file "c:\temp\test2.txt") whose (item 0 of it = item 1 of it)
    A: 1501
    T: 8831.301 ms
    I: singular integer

    q: size of ((intersection of (item 0 of it;item 1 of it))) of (set of (preceding texts of firsts "." of lines of file "c:\temp\test1.txt"),set of (preceding texts of firsts "." of lines of file "c:\temp\test2.txt"))
    A: 1501
    T: 11.381 ms
    I: singular integer
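The difference between the two approaches is not specific to BigFix. As a rough illustration only, the Python sketch below contrasts a pairwise comparison (every line of one file against every line of the other) with a set intersection. The sample data mirrors the short test files above; the function names are hypothetical, and unlike the relevance, prefix() keeps lines that contain no "." instead of dropping them.

```python
from itertools import product

def prefix(line):
    """Text before the first '.', or the whole line if it has no '.'."""
    return line.split(".", 1)[0]

def matches_pairwise(a, b):
    # Compares every entry of a with every entry of b:
    # len(a) * len(b) comparisons, e.g. 4,000,000 for two 2,000-line files.
    return [x for x, y in product(map(prefix, a), map(prefix, b)) if x == y]

def matches_by_set(a, b):
    # Builds one set per file (a single pass over each), then intersects them.
    return sorted(set(map(prefix, a)) & set(map(prefix, b)))

file1 = ["1.abc", "2.abc", "3.abc", "4.abc", "5.abc"]
file2 = ["3.def", "4.def", "5.def", "6.def", "7.def", "8.def"]

print(matches_pairwise(file1, file2))
print(matches_by_set(file1, file2))
```

Both functions return the same matches for this data; only the amount of work differs, and that gap grows quickly with file size.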

swap041 · September 25, 2015, 2:10pm

Hi Jason, thanks. It worked, but there is a problem.

File 1:

Test
Test1
Test2.xyz.com
Test3.vbh.com
Test4

File 2:
Test
Test6
Test2
Test3

When I compare both files with

elements of ((intersection of (item 0 of it;item 1 of it))) of (set of (preceding texts of firsts "." of lines of file "file.txt"),set of (lines of file "file1.txt"))

the output is:
Test
Test2
Test3

I want the output to be
Test
Test2.xyz.com
Test3.vbh.com

Please suggest

gearoid · September 25, 2015, 2:59pm

shot in the dark - try swapping over the references to the two files in the relevance ?

swap041 · September 26, 2015, 10:38am

No success. It will not work, because one side of the comparison is based on the preceding text and the other on the full lines of the file.

Please suggest …!!!

JasonWalker · September 27, 2015, 9:20pm

Since you’ve changed the question from “line matches” to “partial line matches”, you’ve made the problem considerably more complex. I don’t see a way to do set intersections anymore, so you’re back to string comparison clauses.

I would question whether you’re taking a correct approach, as I think you are either not using your data correctly, or BigFix is not the right tool for what you are trying to do.

Can you explain the use case for this?

swap041 · September 28, 2015, 8:07am

Hey Jason,

Here is what I want to achieve:

I have two lists of computers, one from our Asset Management tool and the other from IEM.
We pull the retired machines from the Asset Management tool, compare them with IEM, and remove them from IEM automatically.

Problem:
The computer name in Asset Management is the plain name, but IEM detects the FQDN.
Example:
Asset Mgmt tool    IEM tool
Test1              Test1.example.com
Test2              Test2
Test3              Test3.standard.com

So I want to compare both files, pull out the entries from IEM that match the Asset Management file, and then remove them.

I am done with the machines that have the same name in both files; the problem is with the FQDNs.

We can only put the computer name in a file and remove it using the computer removal utility.

I hope this describes exactly what I want to achieve.

Please suggest if you have any simpler way to do it.

gearoid · September 28, 2015, 8:23am

So … you want to delete computers from BigFix when they are in the list you’ve got from your Asset Management system ?

swap041 · September 28, 2015, 8:30am

Yes, but the Asset Management list should be compared with IEM, as some machines may not be present in IEM. Around 20% of machines have an FQDN as their computer name in IEM, while the same machines appear without the FQDN in the Asset Management tool (as it is maintained manually).

swap041 · September 29, 2015, 3:25pm

Any suggestions on this? Please suggest.

JasonWalker · September 30, 2015, 1:55am

Ah, well, I guess I’m invested in this question now. This could certainly be optimized for readability, and possibly for performance, but it performs much better than a tuple of lines from each file already.

Build three sets - one for the lines of the first file and one for the lines of the second file (each with the "." and everything after it removed), and a third for the original lines of the second file.

Use an intersection of the “trimmed” lines to find matches, then find the lines from the original second file that start with a matched value.

The performance on this should improve as the number of duplicates goes down. In my check test1.txt has 501 lines, test2.txt has 204 lines, there are 101 matches between them, and it completes in 7.098 ms.

q: items 1 of ((elements of item 0 of it, elements of item 1 of it) whose (item 1 of it starts with item 0 of it)) of (intersection of (item 0 of it;item 1 of it), item 2 of it) of ((set of (if it contains "." then preceding text of first "." of it else it) of lines of file "c:\temp\test\test1.txt"),(set of (if it contains "." then preceding text of first "." of it else it) of lines of file "c:\temp\test\test2.txt"), (set of lines of file "c:\temp\test\test2.txt"))
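As a way to sanity-check the three-set idea outside BigFix, here is a small Python sketch using the sample names from earlier in the thread. It is an illustration under assumptions, not a translation of the expression above: the function names are hypothetical, and the last step keeps an original line when its trimmed name is exactly one of the matches, whereas the relevance uses a "starts with" test, which would also accept a line such as Test1 when Test is among the matches.

```python
def trimmed(name):
    """Name with the first '.' and everything after it removed."""
    return name.split(".", 1)[0]

def lines_to_remove(asset_names, iem_names):
    # Sets 1 and 2: trimmed names from each list, intersected to find matches.
    matched = {trimmed(n) for n in asset_names} & {trimmed(n) for n in iem_names}
    # Set 3 is the original IEM lines; keep those whose trimmed name matched.
    # Note: exact match on the trimmed name, not a "starts with" test.
    return [n for n in iem_names if trimmed(n) in matched]

asset_names = ["Test", "Test6", "Test2", "Test3"]
iem_names = ["Test", "Test1", "Test2.xyz.com", "Test3.vbh.com", "Test4"]

print(lines_to_remove(asset_names, iem_names))
```

If short names in your environment can be prefixes of one another (Test and Test1, for example), it may be worth checking the relevance result against a list like this before feeding it to the removal utility.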

swap041 · September 30, 2015, 1:51pm

Thanks Jason!!! It worked.