Any official doc to address duplicate computers caused by Citrix Provisioning Services?

yorkly1 · July 26, 2022, 6:47pm

I have validated that the VMware VMs (duplicate computers) created from the Citrix PVS model return the same results for the three queries below: same VMware UUID, same IP, same MAC, and same hostname. The only difference is the Client ID, which is newly created each time a new VM is spawned.

Q: virtual of hardware

Q: info of client

Q: concatenation "_" of elements of set of mac addresses of ipv4or6 interfaces whose ( exists mac address of it and length of mac address of it = 17 ) of adapters of network

mbartosh · July 26, 2022, 8:28pm

Hi @gus, I ran the relevance statements and they were all the same before and after a restart. The results of these statements do not change. This is basically the same as the statement I made earlier, and the one @yorkly1 made: the UUID, MAC address, IP address, and serial number remain the same. Only the computer ID changes.

Were you able to get your Citrix Provisioning Services working with BigFix?

JasonWalker · July 26, 2022, 9:37pm

@mbartosh - I think there’s a clue fairly early in the client log you posted.

RegisterOnce: Attempting secure registration with 'https://REDACTED:52311/cgi-bin/bfenterprise/clientregister.exe?RequestType=RegisterMe60&ClientVersion=10.0.1.41&Body=0&SequenceNumber=4&MinRelayVersion=7.1.1.0&CanHandleMVPings=1&Root=http://REDACTED%3a52311&AdapterInfo=00-50-56-bc-94-3c_REDACTED%2f24_REDACTED_0'

Comparing with my RegisterOnce messages after rebooting/restarting the BESClient, your URL’s parameter ‘Body=0’ should contain the BES Client ID. Also the SequenceNumber=4 is a very low sequence number, like what I’d expect with a newly-installed client.

For instance my message looks like
RegisterOnce: Attempting secure registration with 'https://REDACTED:52311/cgi-bin/bfenterprise/clientregister.exe?RequestType=RegisterMe60&ClientVersion=10.0.7.52&Body=1614594029&SequenceNumber=514&MinRelayVersion=7.1.1.0&CanHandleMVPings=1&Root=http://REDACTED%3a52311&AdapterInfo=00-15-5d-01-02-03_192.168.1.0%2f24_192.168.1.36_0&AdapterIpv6=00-15-5d-01-02-03%5efe80%3a%3ad12%3a3a9%3a860%3a3acf%2f64_0'
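To compare these two log lines without reading the URLs by eye, a small script can pull out the query parameters. This is only a sketch: the helper names `registration_params` and `looks_like_new_client` are made up for illustration, the sample line is a shortened copy of the first log message above, and the "Body=0 means no existing computer ID was sent" rule is just the observation from this thread, not documented behaviour.

```python
from urllib.parse import urlsplit, parse_qs


def registration_params(log_line: str) -> dict:
    """Extract the query parameters from a RegisterOnce log line."""
    # The URL is wrapped in single quotes in the log message.
    url = log_line.split("'")[1]
    query = urlsplit(url).query
    # parse_qs returns a list per parameter; keep the first value of each.
    return {k: v[0] for k, v in parse_qs(query, keep_blank_values=True).items()}


def looks_like_new_client(params: dict) -> bool:
    """Assumption from this thread: Body=0 means no existing ComputerID was sent."""
    return params.get("Body") == "0"


# Shortened copy of the first RegisterOnce message quoted above.
line = (
    "RegisterOnce: Attempting secure registration with "
    "'https://REDACTED:52311/cgi-bin/bfenterprise/clientregister.exe"
    "?RequestType=RegisterMe60&ClientVersion=10.0.1.41&Body=0&SequenceNumber=4'"
)
params = registration_params(line)
print(params["Body"], params["SequenceNumber"], looks_like_new_client(params))
```

Running the same function over the lines before and after a restart makes it easy to see whether `Body` and `SequenceNumber` carry over or start again from scratch.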

I’m still liking the idea that your BESClient service is sometimes starting before your Startup script restores the registry keys, which could also account for why it sometimes resets and sometimes does not (i.e. a timing issue between Machine Startup scripts executing & how long it takes for all the Services to start).

I’m going to try tampering with my client (removing the registry values, removing the client keys) to see whether I can reproduce the log symptoms at least.

Also, if you don’t mind, could you restart the client and then post the portion of the new log with the computer info on restart? My 10.0.7 client log reports the existing Computer ID before it tries to register to the Relay, while yours does not; I’m not sure whether that’s a logging change between 10.0.1 and now, or whether the computer ID wasn’t reported in your log because the computer didn’t have one yet. Restarting the client service on a working machine should give a clue there.

JasonWalker · July 26, 2022, 9:42pm

After removing only the “ComputerID” value at HKEY_LOCAL_MACHINE\SOFTWARE\WOW6432Node\BigFix\EnterpriseClient\GlobalOptions, my logs look a lot more like yours. The Computer ID is not reported in the client info before the RegisterOnce, and the RegisterOnce Body= parameter is a 0.

JasonWalker · July 26, 2022, 10:44pm

To be clear, after removing the ComputerID value in the registry, the computer reset itself when I started the BESClient service.

The next case I tried was where the ComputerID is retained in the registry, but the KeyStorage folder is deleted.
The client attempted to register with the existing ComputerID and generate a new certificate, but got a message from the server that a public certificate already existed for that computer ID, and the client reset itself.

Then I left the ComputerID and KeyStorage folder, but decremented the value in ‘RegCount’. This tracks the number of registrations the client has made, and decrementing it simulates ‘Rolling a VM to an earlier Snapshot’. When the client started, the server triggered a client reset.

Then I set ‘clientIdentityMatch=100’ on my root server and rolled back the RegCount on my client again. This time the client log reflects it reusing the existing identity -

   RegisterOnce: Attempting secure registration with 'https://redacted:52311/cgi-bin/bfenterprise/clientregister.exe?RequestType=RegisterMe60&ClientVersion=10.0.7.52&Body=15728897&SequenceNumber=1&MinRelayVersion=7.1.1.0&CanHandleMVPings=1&Root=http://redacted%3a52311&AdapterInfo=00-15-5d-01-02-03_192.168.1.0%2f24_192.168.1.36_0&AdapterIpv6=00-15-5d-01-02-03%5efe80%3a%3ad12%3a3a9%3a860%3a3acf%2f64_0'
   Unrestricted mode
   Configuring listener without wake-on-lan
   Registered with url 'https://redacted:52311/cgi-bin/bfenterprise/clientregister.exe?RequestType=RegisterMe60&ClientVersion=10.0.7.52&Body=15728897&SequenceNumber=1&MinRelayVersion=7.1.1.0&CanHandleMVPings=1&Root=http://redacted%3a52311&AdapterInfo=00-15-5d-01-02-03_192.168.1.0%2f24_192.168.1.36_0&AdapterIpv6=00-15-5d-01-02-03%5efe80%3a%3ad12%3a3a9%3a860%3a3acf%2f64_0'
   Registration Server version 10.0.7.52 , Relay version 10.0.7.52
   Relay does not require authentication.
   Client has an AuthenticationCertificate
   RegisterOnce: Server has detected a possible rollback and requests a site gather and report refresh

mbartosh · July 26, 2022, 10:51pm

After the restart of the server, and before the BigFix service is started, I can see that the BigFix client registry is restored to what it was before the restart. It has the same ComputerID as before the restart. I wait at least 5 minutes and then start the BigFix service. As soon as I start the service, the ComputerID goes to all zeros. Then it takes about 5 minutes for logging to start again. A .bkg log is created and a new log is started. The .bkg log does not have a computer ID, but says "Scheduling client reset: Computer id changed to 539661474". Then the new log has Computer ID 1082361227. Debug logging appears to stop as soon as the new computer ID is issued, even though the registry still has the EMsg settings.

yorkly1 · July 27, 2022, 9:47pm

Hi Jason, from my perspective, the procedures and logic for dealing with duplicate computers do not work as expected. On top of that, there are manual steps to preserve and restore information on each VM. Adding that workload is simply not feasible for the BigFix admin team. Could you please discuss this with the development team internally to come up with a simple automated solution? If a VM was restored from a snapshot or by some other method with the same UUID and MAC, technically speaking it is from the same VM image. UUID and MAC are the unique identifiers that identify and validate that this is the same server. So why not keep the same ComputerID instead of resetting it?

If the UUID and MAC information somehow changes, then reset the ComputerID. Hostname and IP information should not play a role in determining whether the ComputerID needs to change, as this information can be changed as required.

JasonWalker · July 27, 2022, 10:49pm

When VMs are rolled back to a snapshot, the ClientIdentityMatch logic does seem to work. At this point I think Citrix Provisioning Services is doing more than a rollback, and I’m zeroing in on an answer. I’ve been running some experiments for a couple of days now.

mbartosh · July 28, 2022, 4:42pm

Here is the latest on this issue from support and development.

When the client is installed, we use an OS-provided crypto key to encrypt the client private key.
When the Citrix instance is shut down, the OS key is destroyed. On startup of the new image, the OS crypto key is regenerated. This new key cannot be used to decrypt the client private key, which requires the old OS crypto key. Unless Citrix has a means to store and reuse the old key, this is a new design function and will therefore require an enhancement request.

I was wondering how SCCM works with Citrix PVS. The following KB mentions: ESX - The PVS server needs to have a self-signed cert with the vCenter server.

https://support.citrix.com/article/CTX205394/how-to-configure-pvs-vdisk-update-management-using-sccm

JasonWalker · July 28, 2022, 5:17pm

Yes, I’ve been talking with Support and Engineering on it, and running some experiments of my own (though I don’t have access to Citrix PVS itself I can kind-of simulate some of the things they’re doing).

When the BESClient generates its client authentication certificate we encrypt the private key using Windows DPAPI. In a Snapshot Rollback/DeepFreeze rollback-type scenario, the private key can still be decrypted by Windows because the key we have on-disk matches the DPAPI master key used by Windows.

Something in Citrix Provisioning Services is resetting the master key, leaving our BESClient private key unreadable. We can see this in log messages such as "Windows Error 0x8009000b: Key not valid for use in specified state." That appears in the BESClient log, but the message text itself comes from Windows’ DPAPI library; I can reproduce the exact same message in non-BigFix code. Resetting the master key would have other effects in Windows as well: Scheduled Tasks or Services logons that have stored passwords would be unusable, saved browser passwords would not be usable, and things along those lines.
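For anyone checking many clients, the messages quoted in this thread can be searched for in a BESClient log with a few lines of Python. This is a rough sketch: the marker strings are copied from the log excerpts posted above, while the function name `scan_log`, the marker labels, and the sample lines are invented for illustration. Real logs may word these messages differently between client versions.

```python
# Marker strings copied from the log excerpts quoted in this thread.
MARKERS = {
    "dpapi_key_invalid": "Windows Error 0x8009000b",
    "client_reset": "Scheduling client reset",
    "rollback_detected": "Server has detected a possible rollback",
}


def scan_log(lines):
    """Return {marker_name: [1-based line numbers]} for each marker."""
    hits = {name: [] for name in MARKERS}
    for number, line in enumerate(lines, start=1):
        for name, text in MARKERS.items():
            if text in line:
                hits[name].append(number)
    return hits


# Hypothetical sample assembled from messages quoted above.
sample = [
    "Client has an AuthenticationCertificate",
    "Windows Error 0x8009000b: Key not valid for use in specified state.",
    "Scheduling client reset: Computer id changed to 539661474",
]
hits = scan_log(sample)
print(hits)
```

To run it against a real log, pass an open file instead of the sample list, for example `scan_log(open(path, errors="replace"))`, where `path` is wherever your client writes its logs.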

I suspect that Citrix’s management of the local computer machine account password is related to losing access to the DPAPI master key in Windows, but we’re beginning to get into a bit of Windows Internals that I’m still learning about. Of the Citrix articles I could find, I think these may be related:

https://support.citrix.com/article/CTX132289/how-to-troubleshoot-provisioning-services-server-machine-account-password

https://developer-docs.citrix.com/projects/citrix-virtual-apps-desktops-sdk/en/latest/ADIdentity/Repair-AcctADAccount/

My theory would be that whatever they’re doing to change the LocalSystem account probably resets the password in a way where we can no longer decrypt secrets that were saved under the original keys.

In this state, where BESClient cannot unpack its existing certificate & key to authenticate as the original computer account, we don’t even get as far as the ClientIdentityMatch logic to try overwriting the old computer account. The hardware matching logic doesn’t even come into play. The most likely path forward that I see is to have Citrix support look at whatever is resetting the Windows DPAPI secrets.

mbartosh · July 29, 2022, 3:42pm

Did you find a way to display the DPAPI key? If I can display it, I can verify that it is changing after every restart.

JasonWalker · July 29, 2022, 4:32pm

I didn’t go down that path, but yes if you can retrieve the DPAPI key that would be useful to me, thanks!

mbartosh · August 3, 2022, 2:42pm

We have opened a case with Citrix. The initial contact was not hopeful. We are waiting for an escalated answer. I can’t believe that HCL cannot set up a Citrix PVS environment.

I am not sure BigFix Management realizes what a big problem this is. We have 300 of these Citrix PVS servers. We are going to have to remove the BigFix agent from these systems if this problem cannot be resolved. We cannot continue to create 300 duplicate instances every day. This means that the viability of BigFix is in question if it cannot manage all of our servers.

ageorgiev · August 3, 2022, 3:01pm

I still think a solution needs to be looked into. BigFix should be able to support such machines! This is where infrastructure technology is most likely going, with the “Infrastructure as code” concept making strides: machines will be immutable (cattle vs. pets) and will just be “rebuilt” on every release, rather than “patched”/“maintained”…

The above discussion is great in the sense of explaining WHY the workaround is not working with the current design of the agent, but that doesn’t take away from the fact that we need a permanent solution. Today it is Citrix PVS; tomorrow it will be ALL machines provisioned through CI/CD pipelines running on all virtualization platforms, and so on…

mbartosh · August 3, 2022, 4:21pm

@ageorgiev - you mentioned a workaround. What workaround are you referring to? Do you mean the persistent partition, with saving the BigFix registry before restart and then reapplying it at startup?

ageorgiev · August 3, 2022, 4:39pm

I meant what you guys are trying: restoring reg keys, certs, files, folders, etc. I don’t consider this a “solution” but more of a “workaround” under the current agent/root server design, even if you get it working. A solution would be native out-of-the-box support for this kind of technology, where we configure the minimum requirements to be matched on registration requests and the root server is smart enough to reuse client IDs rather than reset them. That will require a redesign of the processes.

yorkly1 · August 4, 2022, 2:25pm

HCL should allocate resources to build a lab environment to simulate and test the different virtualization technologies in use today and going forward. This would help with different issues and with understanding how the BigFix agent behaves, especially around duplicate computers, which is a big issue in virtualized environments. My technical and logical sense is that regardless of how the same virtual server with a preinstalled agent is restored, from a snapshot or by a system restore method, it is still the same virtual server (guy) with the same UUID, hostname, IP, and MAC. Why does the BigFix client need to reset itself and create a new ID? That is the part that does not make sense. Think of the bigger picture: a person moves out of a house and later moves back to the same house, or to a different house in a different city or country. Isn’t this person carrying the same identity information (security ID, same driver’s license number, etc.)? Why would you issue new identity information to this person? The same logic applies here when dealing with duplicate computers.

The workaround provided, which is to manually back up the registry key and other information to a backup location and then manually restore them before restarting the BESClient service to avoid duplication, is not a good workaround per se. Adding extra workload for someone to do these steps is not feasible, especially when dealing with hundreds of duplicate computers daily. Not a happy workaround.

yorkly1 · August 4, 2022, 2:28pm

HCL can’t really rely on its customers to test this workaround and hope it will work without first fully testing it in an HCL lab environment. After it is fully tested and verified to be 100% working, please document step by step how to get it working.

JasonWalker · August 4, 2022, 3:52pm

The main difficulty there is that the UUID, hostname, IP, and MAC are all reported by the client, and a malicious client could impersonate any other machine and report whatever values it wants up to the Relay/Root Server. Our protection against that is the client authentication certificate that is used to sign that report when it is made, and removing the certificate authentication reduces how much the server should trust the given client report.
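To illustrate the point about self-reported values, here is a toy sketch in Python. It is not how BigFix works internally: the real client signs with its authentication certificate (an asymmetric key pair), while this example swaps in an HMAC with a shared secret purely because it is available in the standard library. The keys, field values, and function names are all hypothetical.

```python
import hashlib
import hmac
import json


def sign_report(report: dict, key: bytes) -> str:
    """Sign a report with HMAC-SHA256 (a stand-in for a certificate signature)."""
    payload = json.dumps(report, sort_keys=True).encode()
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def verify_report(report: dict, signature: str, key: bytes) -> bool:
    """Check that the signature was produced with the expected key."""
    return hmac.compare_digest(sign_report(report, key), signature)


# Hypothetical values for illustration only.
real_key = b"secret-held-by-the-real-client"
report = {"uuid": "hypothetical-uuid", "mac": "00-15-5d-01-02-03", "hostname": "vm01"}

genuine = sign_report(report, real_key)
# An impostor can copy every self-reported field, but not the key.
forged = sign_report(report, b"impostor-key")

print(verify_report(report, genuine, real_key))
print(verify_report(report, forged, real_key))
```

The report contents are identical in both cases; only possession of the key distinguishes the real machine from the impostor. That is why matching on UUID, MAC, hostname, and IP alone gives the server less to trust.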

We’re still considering options, but must take care as there’s a security implication to any changes we make in this space.

As far as a tested/proven procedure, the ClientIdentityMatch option is a tested & proven procedure - but it’s a solution that solves a different problem, the VM Snapshot/Rollback or Disk Write Filter/DeepFreeze scenario. That doesn’t apply in the Citrix PVS scenario because the machine Citrix provisions really is a new operating system base every time, without preserving the system’s keys that make the endpoint unique.

I can’t commit to whether or when a workaround could be provided, but if there is one, it would be very specific to this scenario and would in some way reduce the security of the server infrastructure. Care would need to be taken to avoid applying such a workaround generally; one would only perform it on the specific systems on an as-needed basis.

itsmpro92 · August 4, 2022, 4:18pm

When we migrate a root server to new hardware, we use the ServerKeyTool to decrypt and save the otherwise encrypted configuration keys. I wonder if a similar approach could be developed for managing the client’s private key.