Hello community,
In our dedicated infrastructure (IEM 9.0.853) we use REST API calls to create actions or modify settings on endpoints. On the first call, the response takes about 20 seconds. After a few thousands of calls (60k~), the response time grows to as much as 4 minutes.
The only way we have found to get the fast response back is to restart the database.
Is anybody else creating actions this way?
If so, how do you resolve this behaviour without restarting the database?
Thanks!
The first thing that came to mind when I read “…after a few thousands of calls (60k~)…” was “Holy Action Spam, Batman!”.
Just off the top of my head, I would have concerns about the processes behind the REST API calls that are creating this many actions. I’m sure someone else will correct me and say that this is perfectly reasonable and that it can easily handle 10 or 100 times this number…but I’d be worried about what this is doing to your infrastructure in any case.
The fact that you say you restart the database (assuming it is separate from the IEM server) and that it improves performance makes me think there is some transaction, logging or indexing going on that can’t keep up with the number of requests that are constantly being created.
I am also very interested in hearing more detail about why all these actions get created and whether you need to consider grouping some of this work (e.g. for the same change required on a collection of machines, create an action to target that group).
Secondly, is anyone analysing the results of these actions to see whether the same actions are being created over and over again on endpoints where the action isn’t working (or seems to work, but doesn’t)? I can appreciate that the API allows for some slick automation and avoids a lot of manual intervention, but from my experience with BigFix (and from doing system mgmt the old way, using scripts/ssh), you need to manage your processes and be sure you’re not overloading your infrastructure (clients) or your tools (IEM).
I have to agree with @jwilkinson
60k calls seems like a lot. Is every one of these calls creating an action? If so, that is definitely not a great way to go. It is far better to group them. The response times likely have more to do with creating 60k actions than with the REST API itself.
Also, I’m wondering if you are using all of these calls in a way that is not needed. I’d need to know more about what you are doing in broad terms and why.
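To illustrate the grouping idea, here is a minimal sketch of collapsing per-endpoint requests into one entry per distinct change before anything is sent to the server. It assumes your on-demand requests can be queued for a short while and that identical work can be identified by some key; `change_id`, `group_requests` and the sample computer names are hypothetical, and the step that actually creates the multi-target action through the REST API is deliberately left out.

```python
from collections import defaultdict


def group_requests(requests):
    """Collapse (change_id, computer_name) pairs into one target list per change.

    change_id is a hypothetical key that identifies identical work,
    e.g. the same setting value or the same software package.
    """
    targets_by_change = defaultdict(set)
    for change_id, computer in requests:
        # A set drops duplicate requests for the same endpoint.
        targets_by_change[change_id].add(computer)
    # Sort the targets so the output is stable between runs.
    return {
        change: sorted(computers)
        for change, computers in targets_by_change.items()
    }


# Hypothetical queue of on-demand requests collected over a short window.
pending = [
    ("set-proxy", "pc-001"),
    ("install-7zip", "pc-002"),
    ("set-proxy", "pc-003"),
    ("set-proxy", "pc-001"),  # duplicate request, collapsed by the set
]

grouped = group_requests(pending)
for change, computers in grouped.items():
    print(change, "->", computers)
```

With this queue, four requests turn into two actions, each carrying a list of targets. How long requests can wait in the queue is a trade-off against the 'on demand' requirement.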
Mental note: ‘I must check after writing’
Of course, @jwilkinson, @jgstew, you’re right: it is not 60k~, but about 6k~ over the last two weeks.
All these actions apply settings, deploy software and run scripts, and most of them have completed successfully. I know we should use actions with multiple targets to reduce load, but right now we are working ‘on demand’, so changing that is hard.
Are you also cleaning up these actions after they expire, etc.? Consoles will attempt to load every action (expired/stopped/etc.), so this will eventually cause problems if that number grows too large, and if you are generating that many actions… it will.
You should delete any expired or stopped actions older than ?? days. In our case 30 days.
The deleted actions will not show up in the console, but they are still in the database and cause some load by being there, so as @AlanM points out, you must then run the Audit Trail Cleaner to remove them from the database and archive them (in case they are needed for auditing later).
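As a rough sketch of the "expired or stopped and older than 30 days" rule, the selection step could look like the following. It assumes you have already exported a list of actions with a state and an end time; the field names (`id`, `state`, `ended`), the state strings and the function `actions_to_delete` are hypothetical. The deletion itself and the Audit Trail Cleaner run are not shown.

```python
from datetime import datetime, timedelta


def actions_to_delete(actions, now, max_age_days=30):
    """Return the ids of expired or stopped actions that ended before the cutoff.

    Each action is a dict with hypothetical keys: id, state, ended.
    """
    cutoff = now - timedelta(days=max_age_days)
    return [
        action["id"]
        for action in actions
        if action["state"] in ("Expired", "Stopped") and action["ended"] < cutoff
    ]


# Hypothetical export of actions; a fixed "now" keeps the example repeatable.
now = datetime(2015, 6, 30)
actions = [
    {"id": 101, "state": "Expired", "ended": datetime(2015, 5, 1)},   # old, expired
    {"id": 102, "state": "Open", "ended": datetime(2015, 4, 1)},      # still open, kept
    {"id": 103, "state": "Stopped", "ended": datetime(2015, 6, 20)},  # too recent, kept
    {"id": 104, "state": "Stopped", "ended": datetime(2015, 3, 15)},  # old, stopped
]

stale = actions_to_delete(actions, now)
print(stale)
```

Only actions 101 and 104 are selected here. Keeping the selection separate from the deletion makes it easy to review the list before anything is removed.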