How do I kill an unresponsive action?

idaubney · June 30, 2015, 7:13am

V 9.2
During the later stages of developing an action, I often test it on a single endpoint. Sometimes something goes wrong unexpectedly and the client reports Running in the console. (This could be something as simple as not providing the correct command-line switches, or worse.)

I am able to stop the action in the console, but I notice the client continues for quite some time before it gives up executing this action.

I am wondering how I can expedite the process of killing the action on the client? That way I can fix my script and retry it on the client in a shorter time span.

I am aware of the debugger, which in most situations works fine for relevance and early action development.

Any insight or tips would be greatly appreciated.

strawgate · June 30, 2015, 2:39pm

You could restart the service on the client!

strawgate · June 30, 2015, 2:42pm

We used these: http://support.bigfix.com/labs/customright.html

And added a custom one for remote restart of the client service. I think we used: https://technet.microsoft.com/en-gb/sysinternals/bb897542.aspx

Something like:
psservice restart BESClient

jgstew · June 30, 2015, 3:21pm

The only option is to connect to that system through some other path and kill the process that is hung, or restart the service as @strawgate mentions.

If you are just testing, you could use run instead of wait, in which case the client will not wait for the process to complete. This prevents the client from locking up when the process hangs, and you can kill the process with the client itself if it does not close. You can even do this in the same action with a pause while timer, so that you can set the maximum time the client will wait before killing the process.

idaubney · July 1, 2015, 1:47am

Thanks to @strawgate and @jgstew for your responses.

@strawgate Thanks for linking the console right-click customisation menu. It looks useful and exciting; I’ll have to play with it :)

I am able to stop and start the BES client from an elevated command line using the following command (usually the machine I am testing with is sitting beside me).

net stop besclient && net start besclient

I would have thought that stopping and then starting the agent would abort the current action. But I am not convinced. It seems I have to wait for a certain period of time before the console reports “Failed” on the action on the client in question.

My other thought is to implement a watchdog timer on the script to abort after x minutes of inactivity, where x is a time larger than the period of successful processing.

FYI: forgot to add Win7x64 and Win8.1x64 clients if that helps.

Kind Regards,

Ian.

strawgate · July 1, 2015, 2:01am

You should check the client logs but it shouldn’t take very long after service restart before the client reports failed on the previous action.

Though the status of the previous action isn’t really that important is it? What’s more important is how quickly the client starts a new action, right?

We implement a watchdog timer on processes that we know sometimes fail. For instance, we use a program called Secunia that does vulnerability scans on endpoints, and 1 in 1000 times it just hangs, so part of our script only lets it run for 60 minutes. On the Mac side it would look something like this:

appendfile #!/bin/sh
appendfile doalarm () {"%7B"} perl -e 'alarm shift; exec @ARGV' "$@"; {"%7D"} # define a helper function
appendfile doalarm 3600 "Path/To/executable" Arguments

Unfortunately I don’t have any examples for the Windows side at the moment.
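
For the Mac/Unix side, the same watchdog idea can also be sketched in plain sh without perl. This is only a minimal sketch: run_with_timeout is a hypothetical helper name, the one-second polling interval is arbitrary, and returning 124 on timeout is just a convention borrowed from GNU timeout. A process that ignores SIGTERM would need kill -9 instead.

```shell
#!/bin/sh
# run_with_timeout is a hypothetical helper, not part of BigFix.
# Usage: run_with_timeout SECONDS COMMAND [ARGS...]
run_with_timeout() {
    limit=$1
    shift
    "$@" &                          # start the command in the background
    pid=$!
    elapsed=0
    # Poll once per second while the process is still alive.
    while kill -0 "$pid" 2>/dev/null; do
        if [ "$elapsed" -ge "$limit" ]; then
            kill "$pid" 2>/dev/null     # send SIGTERM to the hung process
            wait "$pid" 2>/dev/null     # reap it so no zombie is left behind
            return 124                  # assumed convention: 124 means timed out
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    wait "$pid"                     # return the command's own exit status
}
```

Called as run_with_timeout 3600 "Path/To/executable" Arguments, it would give the same 60 minute limit as the doalarm example.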

jgstew · July 1, 2015, 3:46pm

If it takes a while for the client to report failure, it might be due to the WorkIdle setting being too low, or to delay in reports going from the client, through the relays, to the root server, and then being refreshed by the console. It could also be that your client is not getting UDP notifications, which could make things seem slow.

AlanM · July 1, 2015, 5:09pm

Stopping an active action on the console will stop the client’s processing of the action as soon as the gather can reach the endpoint, but we can’t stop anything that has been executed from trying to complete.

The client gathers the site that contained the stopped action. Because the action has been removed, action processing stops, and from the client’s standpoint the action is terminated. However, if the client was waiting in a “wait” command or similar, the executable it launched is still running.

idaubney · July 6, 2015, 1:17am

@strawgate said: Though the status of the previous action isn’t really that important is it? What’s more important is how quickly the client starts a new action, right?

Thanks, you are correct. I assumed that it would not be worth repeating the modified action until I was assured of the previous task status. I will go back to the drawing board.

For others coming across this thread, I’d imagine on Windows it could be done like so:

start program.exe
timeout /T 3600 /nobreak
taskkill /im program.exe

You could test whether program.exe is still a running task by filtering tasklist on the process name (IMAGENAME):

tasklist /FI "IMAGENAME eq program.exe"

idaubney · July 6, 2015, 1:19am

Thanks All for the better understanding that you have brought to this process.

jgstew · July 6, 2015, 2:08am

This is how to run something in actionscript that you think might hang, but have it be killed after a timeout.

run process.exe
parameter "start_1"="{now}"
pause while{ (now - time (parameter "start_1") < 360*second) AND (exists running application "process.exe") }
if { (exists running application "process.exe") }
    waithidden taskkill /im process.exe
endif

If the process hangs, the action will wait up to 6 minutes (360 seconds), then kill it. This will hold up action processing on the client for up to 6 minutes. You could instead do it as 2 separate actions: one that runs the process, and another that is relevant only if the process is still running after X amount of time and kills it.

Related: http://bigfix.me/fixlet/details/743