Stagger action start times over X minutes

A few patches ago we started seeing odd behaviour when setting this option.

When issuing patches to our test domain, I usually set “Stagger action start times over 60 minutes” so the servers don’t all reboot at the same time.

Does this setting have some kind of interaction with others which causes it to delay considerably? I issued a test action with the MS18-Dec patches at 11 AM and some of the clients still show “Waiting to satisfy temporal distribution time constraint.” 5.5 hours later.

Execution
This action will never expire.
It will run at any time of day, on any day of the week.
If the action becomes relevant after it has successfully executed, the action will be reapplied as a policy up to 3 times.
The action’s downloads will be started before action constraints are satisfied.
If the action fails, it will be retried up to 3 times, waiting 10 minutes between attempts.
The action’s downloads will be staggered over 60 minutes to reduce network load.
If a member action fails, the action group will continue to run.

This is anecdotal, but in my experience it seems like when configuring a baseline to stagger, each component may be staggered separately. So you could end up with 0 to 60 minutes before running the first component, then another 0 to 60 minutes for the second component, etc.
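To put rough numbers on that: a minimal sketch, assuming each component draws its own independent delay, uniformly between 0 and the stagger window. That distribution is a guess for illustration, not documented client behaviour, and the function names are made up for this example.

```python
import random


def total_stagger_minutes(components, window_minutes=60, rng=random):
    """Sum one random delay per component (assumed uniform in [0, window])."""
    return sum(rng.uniform(0, window_minutes) for _ in range(components))


def expected_total_minutes(components, window_minutes=60):
    """Average total delay under the same assumption: half the window per component."""
    return components * window_minutes / 2


if __name__ == "__main__":
    # A baseline with 8 components and a 60 minute stagger window.
    print("expected total:", expected_total_minutes(8), "minutes")
    print("one simulated run:", round(total_stagger_minutes(8)), "minutes")
```

Under that assumption a baseline with 8 components would average about 4 hours of total stagger, with a worst case of 8 hours, rather than the 0 to 60 minutes you would expect from the setting.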

I also recall reading about a bug that causes the action to stagger for up to twice as long as the value specified, but I don’t know what versions were affected or whether it has been resolved.

Once the action has started, the actions in the baseline all run in sequence with no delay. It’s just the initial delay for the action to begin that’s so odd.

Maybe I could look back and find it, but it definitely used to be very consistent, with all actions starting within that X minute setting. Now it’s incredibly delayed and somewhat unusable if you want a “predictable” delay.

The delay should be precise (apart from a bug where it was doubled), but the stagger is counted from when the action becomes relevant on the endpoint. So if machines come online several hours later, or become relevant later due to other constraints, up to 60 minutes of stagger delay is added at that point.

The client log is the best way to confirm as it will clearly show the stagger delay being added so you can see when that occurred and how long it delayed. If you see anything over 60 minutes in the message, make sure you are at 9.5.8 or higher to get the fix for
Issue 153443 - APAR IV99808 - SPECIFIED INTERVAL FOR STAGGERING ACTIONS NOT BEING HONORED.

If you are already at that version, report the bug via a PMR.


I’ll certainly open a PMR; I was just curious whether something simple I was doing caused it. The oddities may have started when we went to 9.5.8; everything is on 9.5.10 currently.

On the client that I looked at I see the following:

11:06:30 - Action is issued in the console
11:07:21 - Client Evaluates the Baseline for Relevance and downloads the content.

From there it seems Jason was indeed correct: each component action is staggered separately, which causes the observed delay:

11:07:58 - ActionLogMessage: (group:78008,action78009) Action temporal distribution - delay for 3330 seconds.
12:03:56 - ActionLogMessage: (group:78008,action78045) Action temporal distribution - delay for 866 seconds.
12:19:42 - ActionLogMessage: (group:78008,action78049) Action temporal distribution - delay for 1308 seconds.
12:41:58 - ActionLogMessage: (group:78008,action78055) Action temporal distribution - delay for 2512 seconds.

13:24:28 - ActionLogMessage: (group:78008,action78055) Distributed - time has arrived
13:24:28 - ActionLogMessage: (group:78008,action78063) Action temporal distribution - delay for 3206 seconds.

14:20:48 - ActionLogMessage: (group:78008,action78063) Distributed - time has arrived
14:20:48 - ActionLogMessage: (group:78008,action78077) Action temporal distribution - delay for 2146 seconds.

14:57:40 - ActionLogMessage: (group:78008,action78077) Distributed - time has arrived
14:57:40 - ActionLogMessage: (group:78008,action78081) Action temporal distribution - delay for 804 seconds.

15:13:41 - ActionLogMessage: (group:78008,action78081) Distributed - time has arrived
15:13:41 - ActionLogMessage: (group:78008,action78083) Action temporal distribution - delay for 1268 seconds.
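
For anyone wanting to check their own client log the same way, here is a rough sketch that pulls the stagger delays out of lines like the ones above, adds them up, and flags any single delay longer than the configured window. The regex is based only on the message format shown in this post, and the function names and the 3600 second default are assumptions for this example.

```python
import re

# Matches e.g. "(group:78008,action78009) Action temporal distribution - delay for 3330 seconds."
DELAY_RE = re.compile(
    r"action\s*(\d+)\) Action temporal distribution - delay for (\d+) seconds"
)

SAMPLE = """\
11:07:58 - ActionLogMessage: (group:78008,action78009) Action temporal distribution - delay for 3330 seconds.
12:03:56 - ActionLogMessage: (group:78008,action78045) Action temporal distribution - delay for 866 seconds.
12:19:42 - ActionLogMessage: (group:78008,action78049) Action temporal distribution - delay for 1308 seconds.
12:41:58 - ActionLogMessage: (group:78008,action78055) Action temporal distribution - delay for 2512 seconds.
13:24:28 - ActionLogMessage: (group:78008,action78063) Action temporal distribution - delay for 3206 seconds.
14:20:48 - ActionLogMessage: (group:78008,action78077) Action temporal distribution - delay for 2146 seconds.
14:57:40 - ActionLogMessage: (group:78008,action78081) Action temporal distribution - delay for 804 seconds.
15:13:41 - ActionLogMessage: (group:78008,action78083) Action temporal distribution - delay for 1268 seconds.
"""


def stagger_delays(log_text):
    """Return (action_id, delay_seconds) for every stagger message found."""
    return [(int(a), int(s)) for a, s in DELAY_RE.findall(log_text)]


def summarize(log_text, window_seconds=3600):
    """Return total stagger seconds and the delays that exceed the window."""
    delays = stagger_delays(log_text)
    total = sum(seconds for _, seconds in delays)
    over = [(a, s) for a, s in delays if s > window_seconds]
    return total, over


if __name__ == "__main__":
    total, over = summarize(SAMPLE)
    print("total stagger seconds:", total)
    print("delays over the window:", over)
```

On the eight delay messages quoted above this gives 15440 seconds in total, roughly 4 hours 17 minutes, and no individual delay over 3600 seconds. That fits per-component staggering rather than the doubled-delay bug.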