Take Action Delay - when user on endpoint approves actions

I’ve noticed that there appears to be a significant delay between clicking “Take Action” and the action actually starting on a mac.

On several Mac systems when an end-user clicks “Take Action”/“Take All Actions”, the window closes and “nothing happens” for a long time (5 to 26 minutes measured so far). There is nothing recorded in the log during this time. Then suddenly, the action will start, progress messages will popup for the user, and the log will indicate the action is starting. If I check the console the action start time that is reflected is the time (5 to 26) minutes after the user clicked “Take Action”.

The delay time appears to correlate to the size of the data downloaded… but maybe that’s a red haring.

Edit for clarity

This is not a delay, this how the product design works.

The action would take effect after the clients receive the notification of new actions and if them contain downloads, the client should first download packages prior to execution itself.

Sorry I used the term “I” too generically and it may have been too confusing.

When the end-user clicks “Take Action” on a pending action, then nothing happens for # minutes. The file has already downloaded before the end user click’s “Take action” Then after # minutes, the action actually starts.

A delay of 26 minutes is unacceptable as after a few minutes the end-user will assume that it must have completed… when it has not.

1 Like

In many cases these clients are doing a lot of work and it may not remove the constraint of waiting for the action to be approved by the end user for a while.

Additionally its possible that the download had been deleted as the cache may have been used up or overflowed and thus the download had to happen again.

It would be hard to determine what is going on without knowing the exact combination of content/actions etc that the endpoint was working on. If the client is busy processing relevance or some long expression it may take a while to get around to determining that the constraint has been lifted.

1 Like

During the 26 minutes there was very little network traffic (less than 10MB) passed by BESAgent so I know it is not re-downloading the 3GBs of data.

I’ll sample file access on the process next and see what the hell it’s doing.

Still, even if the client is busy, when a user clicks “Take Action” I think the end-user expectation is that the action will be processed right then. At least it should be communicated that the action is in-process/queued.

you might want to search for and look at other posts on the forum.
For example

Thank you, it’s slightly related, however I’m talking about the delta time between GUI “take action” situations and not delta time of actions taken from the console.

I will use the tracing methods from the other post you linked to next too.

I did search the forum and I did not find any useful posts related to the time between an end-user clicking “Take Action” on the client UI and the actual action being processed.

as @gearoid suggested, I have used @jgstew’s BES Client Info - Universal to pull the data below about one of the endpoints that it took 21 minutes from the end-user clicking take action to the action starting in the log/user seeing the action progress window.

The gui (if you enable it and then click on the progress tab) and look it does indicate “Waiting for processing…”

Are there any wrong settings here, or can someone help me identify the misconfiguration?

# Failed Syncs    126 
# of Client BKG Files    0 
# of Client Log Files    28 
_BESClient_ComputerType    not set 
_BESClient_Log_Days    30 
_BESClient_UploadManager_BufferDirectory    /Library/Application Support/BigFix/BES Agent/__BESData/__Global/Upload 
Actual max % of CPU of Client    1 
ArchiveManager_IntervalSeconds    86400 
ArchiveManager_MaxArchiveSize (bytes)    1000000 
ArchiveManager_OperatingMode    0 
Bad Cache Files?    The expression could not be evaluated:  
BES Log Folder    /Library/Application Support/BigFix/BES Agent/__BESData/__Global/Logs 
BESData Folder    /Library/Application Support/BigFix/BES Agent/__BESData 
client settings
    Department=XXX 
    _BESClient_ActionManager_HasMainDialog=1 
    _BESClient_ActionManager_TrayMode=AlwaysTray 
    _BESClient_Comm_CommandPollEnable=1 
    _BESClient_Comm_CommandPollIntervalSeconds=43200 
    _BESClient_Download_DownloadsCacheLimitMB=10000 
    _BESClient_Download_MinimumDiskFreeMB=2000 
    _BESClient_Download_NormalStageDiskLimitMB=20000 
    _BESClient_Download_PreCacheStageDiskLimitMB=5000 
    _BESClient_Download_RetryMinutes=1 
    _BESClient_Download_UtilitiesCacheLimitMB=200 
    _BESClient_LastShutdown_Reason=Service%20manager%20stop%20request 
    _BESClient_Log_Days=30 
    _BESClient_Report_MinimumInterval=30 
    _BESClient_Resource_StartupNormalSpeed=0 
    _BESClient_Upgrade_UTF8Settings=1 
    _BESClient_UploadManager_BufferDirectory=%2fLibrary%2fApplication%20Support%2fBigFix%2fBES%20Agent%2f__BESData%2f__Global%2fUpload 
    __LockState=false 
    __RelaySelect_Automatic=1 
    __RelayServer1=XXX
    __Relay_Control_RootServer=XXX
    __Relay_Control_Server1=XXX
CommandPollInterval    43200 
Commands with an exit code != 0    <none> 
Download_DownloadsCacheLimitMB    10000 
Download_NormalStageDiskLimitMB    20000 
Download_PreCacheStageDiskLimitMB    5000 
Download_RetryLimit    6 
Download_RetryMinutes    1 
Download_UtilitiesCacheLimitMB    200 
Eval % action    37.2 
Eval % gather    4.2 
Eval % property    0.5 
Eval % Relevance    26.8 
Eval % report    0.8 
Evaluation Cycle Maximum    22 
Evaluation Loop Average    00:01:04.539585 
Failed Sync Sites    1, General transport failure. - BAD SERVERNAME 
Folders of SystemRoot    The operator "root folder" is not defined. 
last command time (UDP)    <none> 
minimum of subscribe times of sites    Fri, 18 Mar 2016 13:40:26 -0700 
parent folder of client    /Library/BESAgent 
Register_IntervalSeconds    <none> 
RelaySelect_IntervalSeconds    <none> 
Report_MinimumAnalysisInterval    <none> 
Report_MinimumInterval    30 
Resource_StartupNormalSpeed    0 
SleepIdle    480 
slow evals
    opsite227.50704109:SubAction Constraints 
    opsite227.50704109:SubAction Constraints 
    opsite227.50703876:Background Evaluation 
    opsite227.50704106:Evaluate SubAction State 
    opsite227.50704106:Evaluate SubAction State 
    opsite227.50703294:Background Evaluation 
    opsite227.50703876:Background Evaluation 
    opsite227.50704111:SubAction Constraints 
WorkIdle    10 
WorkIdle+SleepIdle    490

This particular test computer is a fast one: MacBook Pro/SSD(50GB free)/Corei5/16GB ram. In this particular case there were three of six Office 2016 updates needed by the system (Excel, PowerPoint, and Word).

The interesting part was the file trace.

End-user hits “Take Action”
Write to the local “AcitonHistory” journal
Read the action group
Read the SHA1 index file (I assume to get the file name/location)
Read the entire Excel file once (I assume, to check the sha1 value of it)
Read the action group
Read the SHA1 PowerPoint index file (I assume to get the file name/location)
Read the entire PowerPoint file once (I assume to check the sha1 value of it)
Read the action group
Read the SHA1 Word index file (I assume to get the file name/location)
Read the entire Word file once (I assume to check the sha1 value of it)
~15:38 after the end-user has clicked “Take Action”
Read all /Applications pkginfo/info.plist/and some other files (I assume to get the version of the applications to check the individual fixlet relevances)
Read the Action group
Read application data checking the first fixlet (OneNote not relevant)
Read the Action group
Read application data checking the first fixlet (Outlook not relevant)
Read the Action group
Read application data checking the first fixlet (Word - Relevant, but listed after Excel)
Read the Action group
Read application data checking the first fixlet (Excel)
Read the Action group
~16:07 after the end-user has clicked “Take Action”
Read the SHA1 Excel index file (I assume to get the file name/location)
Read the entire Excel file once (I assume, to check the sha1 value of it… yes, again)
Read the SHA1 Excel index file (I assume to get the file name/location)
~20:35 after the end-user has clicked “Take Action”
Read & write (assumed) various bigfix action state info
Read the Action group
Action start time logged
GUI progress popup for user
Write to the local “AcitonHistory” journal
Write to Action group temp dat file
Write to Action group site file
Excel fixlet tasks being processed
~20:38 after the end-user has clicked “Take Action”
Service Write loop starts, reads and writes all over
Stopped logging
~24:18 after the end-user has clicked “Take Action” - Install of the three updates completes

So in summary:

  1. User clicks take-action
  2. Each prefetched file in a relevant fixlet in the action group is sha1 checked.
  3. The prefetched file in the first relevant fixlet in the action group is sha1 checked again.
  4. The action starts, user is prompted that the action is in progress, thre sub actions are fired off and everything completes quickly.
    I am unsure if other prefetched files are sha1 checked again or not, but if they are it is done at a MUCH faster speed (taking 10 seconds rather than the 4 minutes 30 seconds).

The assumptions here is that the client does nothing but process the one action after the person clicks accept. This isn’t true however. The client time slices its work (except for hashing the files) so it may be doing other things as well. The client does duplicate the files into other locations as well so looking at the source file may be misleading.

I’d say if you want better answers we would probably need a PMR and a debug log to show you what the client is doing at this time.

1 Like

I did notice other file activity, but it was sparse and rarely interjected in the flow that I was watching.
I’ll open a PMR, but I can tell you that the the assumptions are just on the why (e.g. SHA1 calculation), not the what (The fact that it reads the entire excel pkg file twice).

I’ll start capturing a debug log next, then open a PMR as there are four issues I see:

  1. No “Take Action” time/actual task start time is logged (or easily visible in the console) for a given action (only the action time after SHA1 and SHA1 checks are completed).
  2. The progress GUI only popups up after the second SHA1 file verification completes (this is the time that is recorded as the action start time).
  3. SHA1 verification of prefetched items from actions that have been initiated appear to use very low priority and are very slow to complete for large files.
  4. The first fixlet’s prefetched file(s) from a multiple action group get SHA1 checked again (at a low priority) after they are SHA1 checked as part of the multiple action group, but it appears that subsequent fixlets’ prefetched file(s) are either not SHA1 checked, or are checked with a higher priority and complete much faster – I was not logging that far, but can infer form time time difference of 4:30 vs 0:10.

I’ll also run a trace again on a slow test system and see what I get.= to make sure these results are not unique/getting combines with other actions.

Some of your observations are correct as until an action is actually started the client works at the Idle speeds which is generally at 2% but much faster once an action starts.

We have been talking about this as with large files this can take a while, and as the download is technically part of the action processing, but the prefetch issue would still need to be hashed at a lower speed as it is out of people’s action processing windows often.

2 Likes

What was the CPU on the slower mac?

On the device that was slow, does it get UDP notifications? That can definitely make a difference.

The slowest Mac was running a core i5 1.6 GHz.

The measurements are all well after the system has checked the mailbox, but UDP is enabled and that part works, but this is all on the end-user side, not the console-user side.