DSA Failing on secondary server after migration to dedicated SQL servers

Any suggestions on this one? Currently 2 BES Server instances, primary and secondary DSA structure. Previously the SQL databases for both instances were running locally on the same server as the BES installations. We migrated the SQL databases to dedicated SQL servers recently for performance and other reasons. Currently, I have the primary up and running as it should, it took some tweaking to get it to work after the migration, but it is now working completely on the primary. On the secondary, the actual instance itself is running as well, and is also able to connect to its own database, but the replication between primary and secondary databases is failing. To keep this initial post short(er), I will continue providing further details as replies, if anyone wishes to read further.

Essentially, we previously had just the 2 BES Servers, each with their own SQL databases running locally. Now, we have the 2 BES servers, each with their own dedicated SQL server, geographically located close to each other (primary BES and SQL servers located in production datacenter, secondary BES and SQL servers located at DR datacenter). Replication was working just fine previous to the migration to dedicated SQL servers, but now it is not. Viewing the Replication status in the BESAdmin tool shows error and “Cannot open database ‘BFEnterprise’ requested by the login. The login failed. (37000:4060) - …”

Replication is handled through the FillDB service. Check the logs at BES Server\FillDBData\Filldb.log.

Do you have SQL traffic open between each DB server and each DSA server? Each DSA will need to communicate with the other server’s SQL database. Are you using Integrated Authentication or SQL Auth on the database instances? If Integrated Authentication, you’ll need the computer accounts (or Service Account) to have permission to each other’s database, and if using SQL Auth there are registry keys on each server required to communicate to the remote databases.

Previously, both instances had all BES services running as local system, which was no problem while the SQL instances were still local. After migrating to dedicated SQL servers, I had to change all the services to run as AD service account (with full admin access to both BES and SQL instances) on both instances, which allowed each BES instance to start up and connect successfully to its own SQL database. Same service account across all instances. And yes SQL traffic is open between each DSA and DB server.

The FillDB.log shows essentially the same error as what shows in the BESAdmin replication tab:

Replication connection attempt failed for server ‘(besserver).(domain).net’: Database Error: [Microsoft][ODBC SQL Server Driver][SQL Server]Cannot open database “BFEnterprise” requested by the login. The login failed. (37000:4060)

Yep, that looks like a bad/missing credential or the login is not created in SQL server.

Confirmed all ODBC connections were updated to point to the new SQL instances, tested all ODBC connections successful. What is interesting is it is the exact same service account across all instances, all instances configured the same way and all joined to the same domain. They are each able to authenticate to their own SQL instance just fine, using that same service account, I just must be missing an additional permission that is blocking that replication. The BES FillDB services on both instances are using that same service account, and they are both able to write to their own databases just fine, which would seem to me that the existing permissions should be just fine for replication, obviously I’m missing something though. Will keep digging… Thanks for the input Jason!

Can you connect to each database (remotely) using SQL Studio using the service account? I also recall that you might need to setup both 32- and 64-bit odbc connections. Using both system32\odbcad32.exe and syswow64\odbcad32.exe

Sorry it’s been a long time since I configured our DSA servers

I was able to connect to both from either instance no problem. The only remaining caveat was that I still had SQL running on the primary BigFix server, hosting only the ilmt db, so I went ahead and migrated that off to the dedicated SQL server and removed SQL entirely from the Primary BigFix instance, so it was in exactly the same config as the replication server. Still no dice. So finally I followed the documentation to remove the replication server, that went well, then went through and completely removed all BigFix components from the replication server, rebooted then went through the process to add it back/reinstall server components. During that setup, I received errors, it failed completely to even create the necessary databases on the dedicated SQL server for the replication server instance, gave errors regarding inconsistent state. All permissions were exactly the same as for the primary BF server SQL instance, so I know that is not the problem. It is now leading me to believe that the SQL instance for the replication server is jacked up. So I’m going to completely wipe and rebuild both SQL and BF replication server instances, and hopefully get better results. Will post again with results.