Best method for "kicking" a system that apparently missed the UDP Ping

system · February 4, 2010, 7:52pm

(imported topic written by MrFixit)

What is the best method to get a Client to wake up and start running the action that it should be running?

The best option I’ve found is to create a custom action that does nothing and issue it to the same node, but that requires the operators to be able to create custom content, which is not always available to users.

Recycling the BESClient works sometimes.

A right-click “Send Refresh” has no effect either.

I’m considering switching to a polling model, as this seems to occur way too often. So I would be interested to hear whether others decided to do that and what value seems to work best… anything that states there are performance implications makes me nervous.

thanks,

-Gary
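
On the polling question above, a rough way to size the load before committing to an interval is simple arithmetic. The sketch below is only that: it assumes every client polls once per interval and that the polls are spread evenly over time, and the function name and the example numbers are made up for illustration, not taken from any product documentation.

```python
def polls_per_second(clients: int, interval_seconds: float) -> float:
    """Average number of poll requests per second across all clients.

    Assumption: each client polls exactly once per interval and the
    polls are evenly spread, so this is an average, not a peak.
    """
    if clients < 0 or interval_seconds <= 0:
        raise ValueError("clients must be >= 0 and interval_seconds > 0")
    return clients / interval_seconds


if __name__ == "__main__":
    # Hypothetical deployment size and candidate intervals.
    for interval in (900, 3600, 14400):
        rate = polls_per_second(10_000, interval)
        print(f"{interval:>6}s interval: {rate:.2f} polls/sec on average")
```

A shorter interval shortens the worst-case wait for a client that missed the ping, at the cost of a proportionally higher request rate, so the value is a trade-off rather than a single right answer.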

system · February 4, 2010, 8:17pm

(imported comment written by SystemAdmin)

We really shouldn’t have to do things like this, but the way we handled it was actually creating a “blank” task. That way folks without the ability to create content could send it out as well.

BenKus · February 5, 2010, 5:56am

(imported comment written by BenKus)

Hey guys,

The fact that you notice a significant lag here indicates to me that maybe your agent is overly busy doing lots of background tasks and it gets distracted from your scheduled action…

There is probably room for optimizing your deployment. What does your Health Checks dashboard say? Do you have any issues?

Ben

system · February 5, 2010, 8:35pm

(imported comment written by MrFixit)

I regularly profile this environment and run debug mode on a couple of clients to see how the overall loop is running. Normally a node can get through a complete loop cycle in about 20 minutes, except when a number of properties come due for evaluation; then some longer-running relevance will pop up in the profiles.

The Client logs indicate that it never “hears” the UDP Ping and just continues processing the loop wherever it is. I’ve not done any network capturing to see if the packet is lost before it even gets there.

I’m doing some Oracle patch development today, so I’ll run some NetMon captures to see if I can catch a miss. I’m already in debug for my test system, so I’ll have both a capture and the log.
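
Alongside a packet capture, a quick check of whether a UDP datagram reaches a machine at all can be sketched in a few lines of Python. This is a minimal sketch under assumptions, not how the Client itself listens: the function names are hypothetical, the payload is arbitrary, and the port is whatever you choose to test. The listener cannot bind a port that the Client already holds, so it would have to run with the Client stopped or on a spare port.

```python
import socket


def wait_for_datagram(port, timeout=5.0, host="0.0.0.0", ready=None):
    """Bind a UDP socket and wait for one datagram.

    Returns (data, sender_address), or None if nothing arrives in time.
    `ready` is an optional threading.Event set once the socket is bound.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.bind((host, port))
        sock.settimeout(timeout)
        if ready is not None:
            ready.set()
        try:
            return sock.recvfrom(4096)
        except socket.timeout:
            return None


def send_datagram(host, port, payload=b"test"):
    """Send a single UDP datagram; UDP gives no delivery confirmation."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload, (host, port))
```

Run `wait_for_datagram` on the target machine and `send_datagram` from the server or relay side. A `None` result means nothing arrived within the timeout, which points at the network path or a firewall rather than at the Client's processing loop.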

system · February 6, 2010, 2:39am

(imported comment written by SystemAdmin)

I think what you are all describing is what I was terming “Clients Sleeping”. Either way, we have a similar issue, and each time we debug it, it seems to point to a different problem, which we resolve, although it then rears its ugly head again in some other way.

We’ve seen the same things in the logs: the client doesn’t see the ping. We’ve also seen the client fail to write anything to the log. Our clients can go up to an hour without reporting in.

Can’t wait to hear what you find.