If the hostname entry is at the start of each line, you want the text between the first and second quotes of the line: that is, the text preceding the first quote of the text that follows the line's first quote.
preceding texts of firsts "%22" of following texts of firsts "%22" of lines of file
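For readers less familiar with relevance string inspectors, here is a rough Python sketch of the same "text between the first and second quote" logic. The function name `quoted_hostname` and the sample line are hypothetical; only the quote-splitting idea comes from the relevance above.

```python
def quoted_hostname(line, quote='"'):
    """Return the text between the first and second quote, or None.

    Mirrors: preceding text of first quote of (following text of first quote).
    """
    # Everything after the first quote (nothing if the line has no quote).
    _, found, rest = line.partition(quote)
    if not found:
        return None
    # Everything before the next quote in what remains.
    host, found, _ = rest.partition(quote)
    return host if found else None


# Hypothetical sample line; real input files may be laid out differently.
print(quoted_hostname('"server01",10.0.0.1'))
```

Lines without two quotes give no result here, which is roughly how the plural relevance form behaves when a substring is missing.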
@Aram’s relevance works for me. Are you sure you aren’t using a singular where you should be using a plural? An excerpt from your actual action script might help.
(item 0 of it) of (preceding texts of firsts "." of lines of file "E:\test\test1.txt", lines of file "E:\test\test2.txt") whose (item 0 of it = item 1 of it)
For files containing 10-15 lines, it compares successfully. But when I try to compare files with 2,000 entries, it shows blank output.
You should try that with a smaller data set and then tune it for efficiency. With 2,000 entries, you’re creating a record set of 4,000,000 records, then reducing it to “only” the matches. For each line of the first file, you’re reading every line of the second file to find the lines that match the first file, so the second file is being read 2,000 times.
@Aram’s relevance, on the other hand, reads each file 1 time, putting all the lines of the file into a set object, which can be quickly compared with other sets.
Here you can see that even for a small number of lines, @Aram’s check is twice as fast:
q: lines of file "c:\temp\test1.txt"
A: 1.abc
A: 2.abc
A: 3.abc
A: 4.abc
A: 5.abc
T: 0.165 ms
I: plural file line
q: lines of file "c:\temp\test2.txt"
A: 3.def
A: 4.def
A: 5.def
A: 6.def
A: 7.def
A: 8.def
T: 0.170 ms
I: plural file line
q: (item 0 of it) of (preceding texts of firsts "." of lines of file "c:\temp\test1.txt", preceding texts of firsts "." of lines of file "c:\temp\test2.txt") whose (item 0 of it = item 1 of it)
A: 3
A: 4
A: 5
T: 0.697 ms
I: plural substring
q: elements of ((intersection of (item 0 of it;item 1 of it))) of (set of (preceding texts of firsts "." of lines of file "c:\temp\test1.txt"),set of (preceding texts of firsts "." of lines of file "c:\temp\test2.txt"))
A: 3
A: 4
A: 5
T: 0.365 ms
I: plural string
When I increase the file sizes to 2,000 lines (of that same sort of short sample line), the time delta is much more drastic:
q: number of (preceding texts of firsts "." of lines of file "c:\temp\test1.txt", preceding texts of firsts "." of lines of file "c:\temp\test2.txt") whose (item 0 of it = item 1 of it)
A: 1501
T: 8831.301 ms
I: singular integer
q: size of ((intersection of (item 0 of it;item 1 of it))) of (set of (preceding texts of firsts "." of lines of file "c:\temp\test1.txt"),set of (preceding texts of firsts "." of lines of file "c:\temp\test2.txt"))
A: 1501
T: 11.381 ms
I: singular integer
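The difference between the two queries above can be sketched outside of relevance. The Python below is only an illustration of the two strategies, not of how the BigFix evaluator works internally; the function names are made up, and the sample lines are the ones from the small test files above.

```python
def prefixes(lines):
    """Text before the first "." of each line; lines without a "." are skipped."""
    result = []
    for line in lines:
        head, dot, _ = line.partition(".")
        if dot:
            result.append(head)
    return result


def matches_by_pairs(lines1, lines2):
    """Tuple-style approach: compare every entry of file 1 with every entry of file 2."""
    a, b = prefixes(lines1), prefixes(lines2)
    # len(a) * len(b) comparisons: 2,000 x 2,000 lines means 4,000,000 pairs.
    return [x for x in a for y in b if x == y]


def matches_by_sets(lines1, lines2):
    """Set-style approach: build each set once, then intersect them."""
    return sorted(set(prefixes(lines1)) & set(prefixes(lines2)))


test1 = ["1.abc", "2.abc", "3.abc", "4.abc", "5.abc"]
test2 = ["3.def", "4.def", "5.def", "6.def", "7.def", "8.def"]
print(matches_by_pairs(test1, test2))  # ['3', '4', '5']
print(matches_by_sets(test1, test2))   # ['3', '4', '5']
```

Both give the same answer; the pairwise version just does far more work as the files grow, which is consistent with the timings shown.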
elements of ((intersection of (item 0 of it;item 1 of it))) of (set of (preceding texts of firsts "." of lines of file "file.txt"),set of (lines of file "file1.txt"))
Since you’ve changed the question from “line matches” to “partial line matches”, you’ve made the problem considerably more complex. I don’t see a way to do set intersections anymore, so you’re back to string comparison clauses.
I would question whether you’re taking a correct approach, as I think you are either not using your data correctly, or BigFix is not the right tool for what you are trying to do.
I have two lists of computers: one from our Asset Management tool and the other from IEM.
We pull the retired machines from the Asset Management tool, compare them with IEM, and remove them from IEM automatically.
Problem:
The computer name in the Asset Management tool is the short name, but IEM reports the FQDN.
Example:
Asset Mgmt tool    IEM tool
Test1              Test1.example.com
Test2              Test2
Test3              Test3.standard.com
Given data like this, I want to compare both files, pull out the entries from IEM that match the Asset Management file, and then remove them.
I have already handled the machines that have the same name in both files; the problem is with the FQDNs.
We can only put computer names in a file and remove them using the computer removal utility.
I hope this describes exactly what I want to achieve.
Please suggest a simpler way to do it if you have one.
Yes, but the Asset Management list should be compared with IEM, since some machines may not be present in IEM. Around 20% of the machines have an FQDN as their computer name in IEM, while the same machines in the Asset Management tool (which is maintained manually) are listed without the FQDN.
Ah, well, I guess I’m invested in this question now. This could certainly be optimized for readability, and possibly for performance, but it performs much better than a tuple of lines from each file already.
Build three sets - one for the lines of the first file, one for the lines of the second file (removing the “.” and everything after), and a third for the original lines of the second file.
Use an intersection of the “trimmed” lines to find matches, then find the lines from the original second file that start with a matched value.
The performance on this should improve as the number of duplicates goes down. In my check test1.txt has 501 lines, test2.txt has 204 lines, there are 101 matches between them, and it completes in 7.098 ms.
q: items 1 of ((elements of item 0 of it, elements of item 1 of it) whose (item 1 of it starts with item 0 of it)) of (intersection of (item 0 of it;item 1 of it), item 2 of it) of ((set of (if it contains "." then preceding text of first "." of it else it) of lines of file "c:\temp\test\test1.txt"),(set of (if it contains "." then preceding text of first "." of it else it) of lines of file "c:\temp\test\test2.txt"), (set of lines of file "c:\temp\test\test2.txt"))
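If it helps to see the three-set idea outside of relevance, here is a minimal Python sketch of the same steps. It is an illustration only: the function names are hypothetical, and the sample names are taken from the Asset Mgmt / IEM example earlier in the thread, plus two made-up non-matching entries (`Test4`, `Other.example.com`).

```python
def short_name(name):
    """Text before the first "."; names without a "." are returned unchanged."""
    return name.partition(".")[0]


def iem_names_to_remove(asset_names, iem_names):
    """Return the original IEM names whose short name also appears in the asset list."""
    asset_short = {short_name(n) for n in asset_names}  # set 1: trimmed first file
    iem_short = {short_name(n) for n in iem_names}      # set 2: trimmed second file
    iem_full = set(iem_names)                           # set 3: original second file
    matched = asset_short & iem_short                   # intersection of trimmed names
    # Like the "starts with" clause in the relevance, this is a prefix match.
    return sorted(
        full for full in iem_full for m in matched if full.startswith(m)
    )


asset = ["Test1", "Test2", "Test3", "Test4"]
iem = ["Test1.example.com", "Test2", "Test3.standard.com", "Other.example.com"]
print(iem_names_to_remove(asset, iem))
```

One thing to watch in both this sketch and the relevance: because the last step is a prefix match, a matched name such as Test1 would also pick up a machine named Test10.example.com if one exists. Comparing against the trimmed name rather than using "starts with" would avoid that, at the cost of a slightly longer expression.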