I’ve noticed that there appears to be a significant delay between clicking “Take Action” and the action actually starting on a mac.
On several Mac systems when an end-user clicks “Take Action”/“Take All Actions”, the window closes and “nothing happens” for a long time (5 to 26 minutes measured so far). There is nothing recorded in the log during this time. Then suddenly, the action will start, progress messages will popup for the user, and the log will indicate the action is starting. If I check the console the action start time that is reflected is the time (5 to 26) minutes after the user clicked “Take Action”.
The delay time appears to correlate to the size of the data downloaded… but maybe that’s a red haring.
This is not a delay, this how the product design works.
The action would take effect after the clients receive the notification of new actions and if them contain downloads, the client should first download packages prior to execution itself.
Sorry I used the term “I” too generically and it may have been too confusing.
When the end-user clicks “Take Action” on a pending action, then nothing happens for # minutes. The file has already downloaded before the end user click’s “Take action” Then after # minutes, the action actually starts.
A delay of 26 minutes is unacceptable as after a few minutes the end-user will assume that it must have completed… when it has not.
In many cases these clients are doing a lot of work and it may not remove the constraint of waiting for the action to be approved by the end user for a while.
Additionally its possible that the download had been deleted as the cache may have been used up or overflowed and thus the download had to happen again.
It would be hard to determine what is going on without knowing the exact combination of content/actions etc that the endpoint was working on. If the client is busy processing relevance or some long expression it may take a while to get around to determining that the constraint has been lifted.
During the 26 minutes there was very little network traffic (less than 10MB) passed by BESAgent so I know it is not re-downloading the 3GBs of data.
I’ll sample file access on the process next and see what the hell it’s doing.
Still, even if the client is busy, when a user clicks “Take Action” I think the end-user expectation is that the action will be processed right then. At least it should be communicated that the action is in-process/queued.
Thank you, it’s slightly related, however I’m talking about the delta time between GUI “take action” situations and not delta time of actions taken from the console.
I will use the tracing methods from the other post you linked to next too.
I did search the forum and I did not find any useful posts related to the time between an end-user clicking “Take Action” on the client UI and the actual action being processed.
as @gearoid suggested, I have used @jgstew’s BES Client Info - Universal to pull the data below about one of the endpoints that it took 21 minutes from the end-user clicking take action to the action starting in the log/user seeing the action progress window.
The gui (if you enable it and then click on the progress tab) and look it does indicate “Waiting for processing…”
Are there any wrong settings here, or can someone help me identify the misconfiguration?
# Failed Syncs 126
# of Client BKG Files 0
# of Client Log Files 28
_BESClient_ComputerType not set
_BESClient_Log_Days 30
_BESClient_UploadManager_BufferDirectory /Library/Application Support/BigFix/BES Agent/__BESData/__Global/Upload
Actual max % of CPU of Client 1
ArchiveManager_IntervalSeconds 86400
ArchiveManager_MaxArchiveSize (bytes) 1000000
ArchiveManager_OperatingMode 0
Bad Cache Files? The expression could not be evaluated:
BES Log Folder /Library/Application Support/BigFix/BES Agent/__BESData/__Global/Logs
BESData Folder /Library/Application Support/BigFix/BES Agent/__BESData
client settings
Department=XXX
_BESClient_ActionManager_HasMainDialog=1
_BESClient_ActionManager_TrayMode=AlwaysTray
_BESClient_Comm_CommandPollEnable=1
_BESClient_Comm_CommandPollIntervalSeconds=43200
_BESClient_Download_DownloadsCacheLimitMB=10000
_BESClient_Download_MinimumDiskFreeMB=2000
_BESClient_Download_NormalStageDiskLimitMB=20000
_BESClient_Download_PreCacheStageDiskLimitMB=5000
_BESClient_Download_RetryMinutes=1
_BESClient_Download_UtilitiesCacheLimitMB=200
_BESClient_LastShutdown_Reason=Service%20manager%20stop%20request
_BESClient_Log_Days=30
_BESClient_Report_MinimumInterval=30
_BESClient_Resource_StartupNormalSpeed=0
_BESClient_Upgrade_UTF8Settings=1
_BESClient_UploadManager_BufferDirectory=%2fLibrary%2fApplication%20Support%2fBigFix%2fBES%20Agent%2f__BESData%2f__Global%2fUpload
__LockState=false
__RelaySelect_Automatic=1
__RelayServer1=XXX
__Relay_Control_RootServer=XXX
__Relay_Control_Server1=XXX
CommandPollInterval 43200
Commands with an exit code != 0 <none>
Download_DownloadsCacheLimitMB 10000
Download_NormalStageDiskLimitMB 20000
Download_PreCacheStageDiskLimitMB 5000
Download_RetryLimit 6
Download_RetryMinutes 1
Download_UtilitiesCacheLimitMB 200
Eval % action 37.2
Eval % gather 4.2
Eval % property 0.5
Eval % Relevance 26.8
Eval % report 0.8
Evaluation Cycle Maximum 22
Evaluation Loop Average 00:01:04.539585
Failed Sync Sites 1, General transport failure. - BAD SERVERNAME
Folders of SystemRoot The operator "root folder" is not defined.
last command time (UDP) <none>
minimum of subscribe times of sites Fri, 18 Mar 2016 13:40:26 -0700
parent folder of client /Library/BESAgent
Register_IntervalSeconds <none>
RelaySelect_IntervalSeconds <none>
Report_MinimumAnalysisInterval <none>
Report_MinimumInterval 30
Resource_StartupNormalSpeed 0
SleepIdle 480
slow evals
opsite227.50704109:SubAction Constraints
opsite227.50704109:SubAction Constraints
opsite227.50703876:Background Evaluation
opsite227.50704106:Evaluate SubAction State
opsite227.50704106:Evaluate SubAction State
opsite227.50703294:Background Evaluation
opsite227.50703876:Background Evaluation
opsite227.50704111:SubAction Constraints
WorkIdle 10
WorkIdle+SleepIdle 490
This particular test computer is a fast one: MacBook Pro/SSD(50GB free)/Corei5/16GB ram. In this particular case there were three of six Office 2016 updates needed by the system (Excel, PowerPoint, and Word).
The interesting part was the file trace.
End-user hits “Take Action”
Write to the local “AcitonHistory” journal
Read the action group
Read the SHA1 index file (I assume to get the file name/location)
Read the entire Excel file once (I assume, to check the sha1 value of it)
Read the action group
Read the SHA1 PowerPoint index file (I assume to get the file name/location)
Read the entire PowerPoint file once (I assume to check the sha1 value of it)
Read the action group
Read the SHA1 Word index file (I assume to get the file name/location)
Read the entire Word file once (I assume to check the sha1 value of it) ~15:38 after the end-user has clicked “Take Action”
Read all /Applications pkginfo/info.plist/and some other files (I assume to get the version of the applications to check the individual fixlet relevances)
Read the Action group
Read application data checking the first fixlet (OneNote not relevant)
Read the Action group
Read application data checking the first fixlet (Outlook not relevant)
Read the Action group
Read application data checking the first fixlet (Word - Relevant, but listed after Excel)
Read the Action group
Read application data checking the first fixlet (Excel)
Read the Action group ~16:07 after the end-user has clicked “Take Action”
Read the SHA1 Excel index file (I assume to get the file name/location)
Read the entire Excel file once (I assume, to check the sha1 value of it… yes, again)
Read the SHA1 Excel index file (I assume to get the file name/location) ~20:35 after the end-user has clicked “Take Action”
Read & write (assumed) various bigfix action state info
Read the Action group Action start time logged GUI progress popup for user
Write to the local “AcitonHistory” journal
Write to Action group temp dat file
Write to Action group site file Excel fixlet tasks being processed ~20:38 after the end-user has clicked “Take Action”
Service Write loop starts, reads and writes all over Stopped logging ~24:18 after the end-user has clicked “Take Action” - Install of the three updates completes
So in summary:
User clicks take-action
Each prefetched file in a relevant fixlet in the action group is sha1 checked.
The prefetched file in the first relevant fixlet in the action group is sha1 checked again.
The action starts, user is prompted that the action is in progress, thre sub actions are fired off and everything completes quickly.
I am unsure if other prefetched files are sha1 checked again or not, but if they are it is done at a MUCH faster speed (taking 10 seconds rather than the 4 minutes 30 seconds).
The assumptions here is that the client does nothing but process the one action after the person clicks accept. This isn’t true however. The client time slices its work (except for hashing the files) so it may be doing other things as well. The client does duplicate the files into other locations as well so looking at the source file may be misleading.
I’d say if you want better answers we would probably need a PMR and a debug log to show you what the client is doing at this time.
I did notice other file activity, but it was sparse and rarely interjected in the flow that I was watching.
I’ll open a PMR, but I can tell you that the the assumptions are just on the why (e.g. SHA1 calculation), not the what (The fact that it reads the entire excel pkg file twice).
I’ll start capturing a debug log next, then open a PMR as there are four issues I see:
No “Take Action” time/actual task start time is logged (or easily visible in the console) for a given action (only the action time after SHA1 and SHA1 checks are completed).
The progress GUI only popups up after the second SHA1 file verification completes (this is the time that is recorded as the action start time).
SHA1 verification of prefetched items from actions that have been initiated appear to use very low priority and are very slow to complete for large files.
The first fixlet’s prefetched file(s) from a multiple action group get SHA1 checked again (at a low priority) after they are SHA1 checked as part of the multiple action group, but it appears that subsequent fixlets’ prefetched file(s) are either not SHA1 checked, or are checked with a higher priority and complete much faster – I was not logging that far, but can infer form time time difference of 4:30 vs 0:10.
I’ll also run a trace again on a slow test system and see what I get.= to make sure these results are not unique/getting combines with other actions.
Some of your observations are correct as until an action is actually started the client works at the Idle speeds which is generally at 2% but much faster once an action starts.
We have been talking about this as with large files this can take a while, and as the download is technically part of the action processing, but the prefetch issue would still need to be hashed at a lower speed as it is out of people’s action processing windows often.
The measurements are all well after the system has checked the mailbox, but UDP is enabled and that part works, but this is all on the end-user side, not the console-user side.