Well, can I suggest a potential security measure? Maybe additional advanced system settings that not only enable and configure this level of ClientIdentityMatch functionality but also let you configure the approved subnets from which to accept such registration requests. For example:
- ClientIdentityMatch=200
- ClientIdentityMatchCriteria=UUID,hostname,IP Address,MAC Address
- ClientIdentityMatchCIDR=1.2.3.0/24,2.3.4.0/24,3.4.5.0/24
This way, level 200 is only allowed for requests coming from those subnets, and requests from all other addresses are treated as ClientIdentityMatch=100.
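To make the proposal concrete, here is a minimal sketch of the decision I have in mind, written in Python purely for illustration. ClientIdentityMatchCIDR and ClientIdentityMatchCriteria are settings I am proposing, not existing BigFix options, and the function and constant names below are hypothetical.

```python
import ipaddress

# Hypothetical values mirroring the proposed settings above.
# ClientIdentityMatchCIDR is a suggested setting, not an existing BigFix option.
CLIENT_IDENTITY_MATCH = 200
CLIENT_IDENTITY_MATCH_CIDR = "1.2.3.0/24,2.3.4.0/24,3.4.5.0/24"
FALLBACK_MATCH_LEVEL = 100


def parse_cidr_list(value):
    """Turn a comma-separated CIDR string into a list of network objects."""
    return [
        ipaddress.ip_network(item.strip())
        for item in value.split(",")
        if item.strip()
    ]


def effective_match_level(source_ip,
                          configured_level=CLIENT_IDENTITY_MATCH,
                          cidr_list=CLIENT_IDENTITY_MATCH_CIDR):
    """Return the configured level only for requests from approved subnets."""
    address = ipaddress.ip_address(source_ip)
    approved = any(address in network for network in parse_cidr_list(cidr_list))
    return configured_level if approved else FALLBACK_MATCH_LEVEL
```

With these sample values, a registration request from 1.2.3.45 would be handled at level 200, while one from 10.0.0.1 would fall back to level 100.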
@JasonWalker - I used a tool called CredentialsFileView to view the credentials on my test PVS server. The credentials did not change after a restart. If you would like to work with me on my test environment, that would be great.
I performed several restarts on the PVS server yesterday and the Bigfix client ID did not change. I have no idea why I am not getting a new ID now.
As we are facing a similar issue, it would be interesting to know whether there has been any progress on this topic.
Yes, on the endpoint there’s a Citrix service that actually does restore the machine identity correctly, but BESClient was starting before that restore was complete. Adding a startup delay to the BES Client of several minutes (I think they used 15 minutes in that case) appears to resolve it.
Thanks, sounds good. We will give it a try.
We are still working to get it tested (it is just an internal resource prioritization thing), but the same workaround has been officially posted as a KB article.
I am still working with support on this issue. The workaround, which is to save the Bigfix registry to a persistent drive and then restore it on restart of the PVS worker server, works until there is a promotion on a master server. Then all of the workers generate a new ID. However, if you re-install the Bigfix agent on the master after the promotion and before restarting the worker servers, the ID is not recreated. Of course, the master servers get a new ID, but at least we avoid 150+ worker servers getting a new ID. Additionally, once the workers start getting a new ID, it does not stop until the Bigfix agent is re-installed on the master.
It all makes absolutely no sense to me, especially when Crowdstrike has such a simple answer. You install the Crowdstrike agent with a parameter that says to use the host name as the ID. There is no overhead of a persistent partition, shutdown scripts, and startup scripts.
Why do I keep hearing from people on the HCL Bigfix side that no one else in the world is having the problem I am having with the regeneration of client IDs in the Citrix PVS environment? From this thread, it sounds like others are having the same problem.
I have also heard that no one uses Bigfix in the Citrix PVS environment. I do not use it for delivering updates, but I do use Bigfix Inventory to capture the hardware and software inventory. Bigfix Inventory is dependent on the Bigfix agent working correctly.
Not true, we have exactly the same problem, but we are moving very slowly towards a resolution. In fact, we did some calculations, and because of this issue our BFEnterprise database is 4 times bigger than would be expected for an environment of our size. So it is not just the impact on managing these devices, which is bad enough, but a clear implication for the performance and stability of the environment. We have had positive test results on a few “clusters” but haven’t deployed it globally yet. Our Citrix team is migrating from PVS to MCS, which makes no difference as far as this problem is concerned, and is changing some of the other tooling around managing the devices, which is slowing things down significantly. But it is certainly a problem, and I am keeping an eye on your posts (much appreciated, by the way!). That said, we haven’t seen the issue after promotion of the master server, but it is possible that we just haven’t rolled out to a large enough group to notice it.
The issue with promotion of the master only seems to happen in our production environment where there are two groups and therefore two masters. In our DR environment, we are not seeing the issue.
Yes, can confirm. We have the same problem. We had a case with HCL that was going on for months. In the end, we just gave up, and now we do a manual cleanup every day, but we have hundreds of duplicates each day, all of them from our Citrix environment.
Until we have a working solution or any documentation, we’ll be doing this manually.
Adeilson,
Thank you so much for your response. I have been working on this issue for almost a year now.
I wanted to ask about something else I’ve noticed that is a bit inconsistent, to see whether others have experienced it or whether someone knows what is causing it. Two Citrix machines are provisioned off the exact same image with the exact same workaround scripts. Neither seems to be resetting (the computer ID is not actually changing), but the reporting I have around it flags one as “reset” and the other not.
The way that I have been “detecting” those client resets is by using a custom property with the relevance `subscribe time of current site`, which seems to be a pretty good estimate of when a machine first reports into an environment.
With the workaround, I see this relevance/property being flaky: both machines get an OS reset daily, and the property reports a date from May on one of the machines and the current date on the other. Again, the exact same workaround was implemented, with everything else identical, as the machines are clones of the same image… Any ideas what may be causing this? Or, if anyone has a better suggestion on how to establish the “first report time” of a client, I would be very grateful!
I leverage `minimum of subscribe times of sites` with some added relevance for local time zone and date format.
I tried that too. I even checked `(name of it, subscribe time of it) of sites` just to make sure it’s not only one site that is getting reset, but they all are. Really strange: two machines provisioned from the same image at the same time with the same scripts, and one is getting all sites resubscribed while the other is just picking up where it left off.
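For anyone wanting to run the same comparison outside the console, here is a rough Python sketch of the check described above: take the earliest subscribe time as the "first report" estimate and compare two per-site snapshots to see which sites were resubscribed. The function names, site names, and dates are hypothetical sample values, and this assumes the `(name of it, subscribe time of it) of sites` results have already been exported into dictionaries.

```python
from datetime import datetime, timezone


def first_report_estimate(subscribe_times):
    """Earliest subscribe time across all sites, as a 'first report' estimate."""
    return min(subscribe_times.values())


def resubscribed_sites(before, after):
    """Names of sites whose subscribe time differs between two snapshots."""
    return sorted(
        name for name, when in after.items()
        if name in before and when != before[name]
    )


def format_local(moment):
    """Render a timezone-aware timestamp in the local time zone."""
    return moment.astimezone().strftime("%Y-%m-%d %H:%M")


# Hypothetical sample data: site name -> subscribe time, one snapshot per day.
may = datetime(2023, 5, 10, 8, 0, tzinfo=timezone.utc)
today = datetime(2023, 9, 1, 6, 30, tzinfo=timezone.utc)

snapshot_day1 = {"actionsite": may, "BES Support": may}
snapshot_day2_retained = {"actionsite": may, "BES Support": may}
snapshot_day2_reset = {"actionsite": today, "BES Support": today}
```

Calling `resubscribed_sites(snapshot_day1, snapshot_day2_retained)` gives an empty list for the machine that picks up where it left off, while the same call with `snapshot_day2_reset` lists every site, which matches the "they all are" case described above.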
So, if I understand correctly, you’re already following the instructions and have set up the scripts to capture and restore the BES registry paths and KeyStorage folders, and you have a unique computer ID and certificate preserved for each of your running instances, right?
If you consistently have one of the two that is working as expected and the other is not, it’s possible that at some point the resetting one was initialized without running the restore script, and detected as a duplicate.
Once a machine is detected as a duplicate, that machine’s certificate is revoked on the server, and every time you re-initialize the machine with that revoked certificate it is going to reset itself again.
So with that in mind I’d check a few things to see whether this might be the problem:
- Are you capturing again the current state at each shutdown (picking up the new computer ID and certificate for the one that is resetting, after it has enrolled a new ID and certificate)?
- Is it consistently the same instance that is resetting?
- Is it only one instance that is resetting?
- Is either of the machines reporting issues (probably in the Event Log) around the Citrix services startup? The BigFix certificates are encrypted with some Windows crypto api key information that Citrix has to restore back to original before the BigFix client can load the client cert. If the client can’t open the certificate because cryptoapi info has changed, the client will reset itself and generate a new certificate & ID.
- Can you post a few lines of the client log before & after the client reset? Or open a support ticket and upload there? That can help determine whether the reset is because the server is detecting a duplicate ID, or because the client can’t load the client certificate.
@JasonWalker, sorry, I didn’t explain it well enough. As far as the ComputerID alone is concerned, both machines are working: the process retains the ComputerID after the OS reset, which is what the KB covers. The inconsistency is specifically in the way I detect/report resets, which essentially uses the subscribe times of sites:
- One of the machines reports - date in May and that date is retained post-resets
- The other machine - date that changes every day to the current date
So the question is: what causes the subscribe times of the client to reset on one of the machines but not on the other, if they use the exact same process as documented and both processes are technically working as expected?
Oh! So the computer ID is not being reset on the client that re-subscribes the existing sites?
I think we’d probably need to read the client log from the point at startup to see what’s happening with it.
But, no, I don’t think that’s expected behavior. Assuming the client has a good copy of __BESData, it should not need to reset when it starts up.
Yes, that’s what I would have thought too. I did open a support case and uploaded the log for analysis. But yes, the entire BES Client folder is on a separate persistent drive, and per the KB the only things being put back are the reg keys; everything else should be exactly as it was prior to the OS reset.