9.5.x DSA ISSUES

Since the upgrade to 9.5.5, DSA has never finished replication. I’m hearing this from other BigFix customers.
Whilst I have given Support every opportunity to fix this, no one has to date. Debug logs indicate initial replication has started, but the DB size never grows beyond 8GB of the expected 50GB on the master.
Given that FillDB has been rewritten for 9.5.x and I’m seeing significant performance gains on the master server, something has broken replication, as I never had these issues on my 9.2.x deployment.
Anyone else?

//Extract from FillDB.log
Initial replication from my.server.com using to log on BFEnterprise
Tue, 06 Jun 2017 09:58:38 -0700 – 7592 – Replication failed for server ‘my.server.com’: A replication lock request for FIXLETRESULTS (Shared) timed out.
Tue, 06 Jun 2017 14:01:33 -0700 – 7592 – Replication failed for server ‘my.server.com’: A replication lock request for FIXLETRESULTS (Shared) timed out.
Tue, 06 Jun 2017 19:02:57 -0700 – 7592 – Replication failed for server ‘my.server.com’: A replication lock request for FIXLETRESULTS (Shared) timed out.
Tue, 06 Jun 2017 23:14:47 -0700 – 7592 – Replication failed for server ‘my.server.com’: Database Error: [Microsoft][ODBC SQL Server Driver][SQL Server]Transaction (Process ID 56) was deadlocked on lock resources with another process and has been chosen as the deadlock victim. Rerun the transaction. (40001: 1,205)
Tue, 06 Jun 2017 23:16:49 -0700 – 9824 – Error while recoding query results for encoding windows-949-2000. CharacterSetResult: UErrorCode=12 [U_ILLEGAL_CHAR_FOUND] in ucnv_convertEx ["\x82"]
Tue, 06 Jun 2017 23:16:49 -0700 – 9824 – Error while recoding query results for encoding windows-949-2000. CharacterSetResult: UErrorCode=12 [U_ILLEGAL_CHAR_FOUND] in ucnv_convertEx ["\x99"]
Tue, 06 Jun 2017 23:16:49 -0700 – 9824 – Error while recoding query results for encoding windows-949-2000. CharacterSetResult: UErrorCode=12 [U_ILLEGAL_CHAR_FOUND] in ucnv_convertEx ["\x82"]
Tue, 06 Jun 2017 23:16:49 -0700 – 9824 – Error while recoding query results for encoding windows-949-2000. CharacterSetResult: UErrorCode=12 [U_ILLEGAL_CHAR_FOUND] in ucnv_convertEx ["\x82"]
Tue, 06 Jun 2017 23:16:49 -0700 – 9824 – Error while recoding query results for encoding windows-949-2000. CharacterSetResult: UErrorCode=12 [U_ILLEGAL_CHAR_FOUND] in ucnv_convertEx ["\xA5"]
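
For anyone comparing notes, here is a rough Python sketch for tallying the three kinds of error in an excerpt like the one above (lock timeouts per table, deadlock victims, recoding errors). The patterns are based only on the lines pasted here, so they are an assumption about the log format rather than anything documented, and the function name `summarize` is just illustrative.

```python
import re
from collections import Counter

# Patterns inferred from the pasted FillDB.log lines (an assumption, not a documented format).
LOCK_RE = re.compile(r"replication lock request for (\w+) \((\w+)\) timed out", re.IGNORECASE)
DEADLOCK_RE = re.compile(r"deadlock victim", re.IGNORECASE)
RECODE_RE = re.compile(r"Error while recoding query results for encoding ([\w-]+)")


def summarize(lines):
    """Count replication-related errors by kind across an iterable of log lines."""
    counts = Counter()
    for line in lines:
        match = LOCK_RE.search(line)
        if match:
            table, mode = match.group(1), match.group(2)
            counts[f"lock timeout: {table} ({mode})"] += 1
            continue
        if DEADLOCK_RE.search(line):
            counts["deadlock victim"] += 1
            continue
        match = RECODE_RE.search(line)
        if match:
            counts[f"recoding error: {match.group(1)}"] += 1
    return counts
```

To run it against a real log, something like `summarize(open(path, errors="replace"))` should work, and `.most_common()` on the result lists the noisiest error first. It only counts messages; it says nothing about why a lock request times out.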

Not quite the same issue, but I upgraded to 9.5.4 some time ago, and just last week replication issues have appeared. I’ve gone through the usual troubleshooting with Support, but the main issue appears to be constant messages about the QUESTIONRESULTS table locks:

Replication failed for server ‘MY-DSA.mycompany.com’: A replication lock request for QUESTIONRESULTS (Exclusive) timed out.

I’ve seen the replication lock errors forever, but replication normally finishes once it gets past that. Now I see this error every time I open the BESAdmin Tool, as if some part of replication is genuinely stuck.

I see that actions and most computers are replicating, but some “Last Report Time” values, along with other Retrieved Property values, are stuck and not updating on either server.

Support is focused on what I feel are possible red herrings: health checklist items that we have been improving recently and that were far worse in previous years.

Anyone else experiencing replication/DSA issues with 9.5+?

I wonder if this is also related: 9.5.5 - Mailbox targetting issues

I’m not certain if this is related, but there is a potential issue with the FillDB parallelism introduced with 9.5.5.193+ which could affect DSA in some ways. The parallelism is on by default but could cause issues in some cases, particularly where the number of FillDB threads exceeds the server CPU count. I would recommend trying to turn it off to see if it helps:

IBM Documentation

I would recommend filing a PMR to contact IBM support about this issue.

This version should not be affected by the issue I mentioned above.