I accidentally targeted `All Computers` when testing

I recently screwed up while repeatedly testing an action in BigFix. Instead of targeting an automatic group of test machines, I ended up targeting All Computers. The action ends by opening notepad.exe as the current user. It could have been worse, but opening notepad.exe everywhere isn’t great.

Because BigFix is pretty fast due to UDP notifications, the action hit at least 2000 computers before I stopped it.

This raises the question: what are the best practices for testing new content in BigFix? What is the best way to limit the impact if All Computers is accidentally selected?

One of the best options is to put the new content in a site that only test computers subscribe to. If All Computers were selected in this case, then the impact would be limited to these test computers.

Another option is to set relevance for the content itself so that it is limited to a specific set of computer names or computer groups. This isn’t ideal as you have to remember to do this for every piece of new content, but this is also a very flexible option. It can start by being limited to a very small set of computers, but then grow in scope as the testing matures over time.

Another option is to have a separate operator account just for testing that only has management rights on some test computers. This is one of the better options as far as “least privilege” goes, since it strictly caps the maximum impact of anything done as this account. It is not ideal, though, because it can require switching accounts frequently, which takes quite a while when closing and reopening the console. Having the console open twice under 2 different accounts isn’t ideal either, as that requires separate computers.


A separate but related issue is that All Computers is selected by default in the Take Action dialog when targeting by property, even though All Computers is NEVER what should be targeted in my organization. Actions are always targeted to different Automatic Groups depending on the target audience.

Ideally, only master operators would be able to target All Computers. It would be useful if there was a way to configure BigFix to prevent All Computers from being used.

Also, it would be interesting if the Take Action dialog had nothing selected when choosing “Dynamically target by property”, so that you would have to actually click on All Computers if that is what you wanted, and it wouldn’t let you continue until you had selected something.

CC: @AlanM


I created this relevance to detect if notepad was opened during a specific time window:

exists creation times whose(it > "Thu, 31 Mar 2016 17:00:34 -0700" as time AND it < "Thu, 31 Mar 2016 17:49:34 -0700" as time) of processes "notepad.exe"
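For anyone who wants to sanity-check the window logic outside of BigFix, here is a rough Python sketch of the same comparison. It is not relevance and does not query any processes; the function name `started_in_window` is made up for illustration, and the creation times are assumed to be supplied by the caller as timezone-aware datetimes.

```python
from email.utils import parsedate_to_datetime

# Same window as in the relevance above (RFC 2822 style timestamps).
WINDOW_START = parsedate_to_datetime("Thu, 31 Mar 2016 17:00:34 -0700")
WINDOW_END = parsedate_to_datetime("Thu, 31 Mar 2016 17:49:34 -0700")


def started_in_window(creation_times):
    """Return True if any creation time falls strictly inside the window.

    Hypothetical helper: `creation_times` is an iterable of timezone-aware
    datetimes, standing in for the creation times of notepad.exe processes.
    """
    return any(WINDOW_START < t < WINDOW_END for t in creation_times)
```

Because the timestamps carry their UTC offset, a time reported in another zone still compares correctly against the window.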

@alexk came up with this as part of the solution to this screw up:

wait __Download\RunAsCurrentUser.exe --w --q C:\Windows\System32\taskkill.exe /FI "WINDOWTITLE eq Untitled - Notepad"

One possible way is, while you are testing, to put the group in the relevance itself, so that you can’t mess up the targeting.

Yes, that is what I meant as one of the possibilities of the 2nd option I mentioned.

Something like this I believe:

exists members whose(it = TRUE) of (groups 00000 of it; groups 00000 of it) of sites

For some reason, this doesn’t seem exactly right.

This warrants further discussion around content management best practices, but I agree with some of the approaches you put forward. In particular, leveraging Custom Sites (and associated subscriptions) for Content Development, Testing, QA, Production, as well as limiting scope with operators (which is easier to do with WebUI).

We had a major outage a couple of years ago because a CO operator did something like that, and it was a baseline that was removing computers from the domain and shutting them down. As a result we implemented several precaution/limitation measures, and we also have an open RFE (#35984) for a built-in peer-review system (IBM at the time promised us that it would be done soon, but with the emphasis on WebUI rather than the console, it hasn’t materialised just yet). Anyway, here are the measures we put in place to safeguard against something like this happening:

  1. Turned ON the Action Overview pop-up (Advanced System Option “gtsConfirmAction” = true), so that when submitting an action the user gets another chance to change their mind.
  2. Turned OFF the ability for CO operators to dynamically target machines (Advanced System Option “disableNmoDynamicTargeting” = true). There are already ways to limit the number of machines that can be targeted by selecting them manually or pasting a list, but nothing controls dynamic targeting, so we just disabled it for CO operators altogether.
  3. Built our own “workaround” peer-review system and applied it to selected “high risk” tasks/fixlets/baselines. The system uses the Action Settings Locks on the “Run only when” criteria, so when a user runs a high-risk action it goes to a status of “Constrained”, and then a second task has to be run to “approve” the initial one (the “approve” task just sets a client setting that matches the criteria from the original task). We further made the “approve” task available only to a limited set of higher-level operators (separate roles were set up), so not everybody can review/approve actions. Hopefully we will get a fully functional built-in peer-review system soon, because this is not ideal, but it does the job for now.
  4. Removed “write” and “create content” permissions for all CO operators on all custom sites and forced everybody to use a Development environment, with a handful of machines available in it, for all development and testing, so that untested and unproven content is not run on production machines.

With these 4 in place, we are confident that at least two people review each high-risk action, and even if both make a mistake and an erroneous action is run, it cannot have a big impact (no CO operator can run an action on more than 200 machines). Maybe some/all of those will be of use to you.
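To make the moving parts of measure 3 and the 200-machine cap a bit more concrete, here is a small Python model of the idea. It is only an illustrative sketch, not BigFix code or its API: the `Client` and `Action` classes, the `approval_setting` name, and the helper functions are all hypothetical stand-ins for the “Run only when” lock, the client setting, and the operator limit described above.

```python
from dataclasses import dataclass, field

MAX_TARGETS = 200  # per-action cap for CO operators, as mentioned above


@dataclass
class Client:
    name: str
    # Stands in for the client settings on an endpoint.
    settings: dict = field(default_factory=dict)


@dataclass
class Action:
    name: str
    approval_setting: str  # hypothetical setting checked by "Run only when"
    approval_value: str

    def status_on(self, client: Client) -> str:
        # The action stays "Constrained" until the client setting matches.
        if client.settings.get(self.approval_setting) == self.approval_value:
            return "Runnable"
        return "Constrained"


def approve(action: Action, client: Client, approver_is_reviewer: bool) -> None:
    # The "approve" task only sets the matching client setting, and only
    # operators in the reviewer role are allowed to run it.
    if not approver_is_reviewer:
        raise PermissionError("only reviewers may approve high-risk actions")
    client.settings[action.approval_setting] = action.approval_value


def check_target_count(targets: list) -> None:
    # Reject any action that would hit more machines than the cap allows.
    if len(targets) > MAX_TARGETS:
        raise ValueError(f"{len(targets)} targets exceeds cap of {MAX_TARGETS}")
```

The point of the model is that the first operator can only create a constrained action; it takes a second, more privileged operator to flip the setting that releases it.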

Thanks for these added details, they are definitely useful things to consider.

What is a CO operator? Do you mean console operator?

I can see how this would make sense in some cases, but for this particular task I was deploying, it had to be deployed dynamically in order for the relevance to work.

This is also a bit of an issue because most policy actions and provisioning baselines should always be targeted dynamically in our environment.

It would definitely be useful for me if some content could be marked to always be targeted dynamically, while others would be marked to never be targeted dynamically, while other content could be either.

It could be useful to prevent junior and new operators from targeting dynamically or targeting more than X machines, but it would not work for us to limit all non-master operators this way.

Yes, CO means Console Operator (as opposed to MO).

I understand where you are coming from and, as I said, IBM committed to releasing a fully-fledged peer-review system per the RFE mentioned above but has yet to deliver it. Once they do, those kinds of additional precautions can be considered as add-ons.

We have a large deployment and need to give access to a lot of operators who should only be able to impact 5 systems at a time. Is there any feature coming or way to limit this for some operators?

Is this for a helpdesk/support use case? Can you give more details?

CC: @dexdexdex

Hey @ageorgiev,

Where did you get these parameters?
How can I list these parameters?

I didn’t find any reference in the documentation or on the internet.

Well, those were two I specifically put in RFEs for, so once they were released IBM told us what they are and we tested them. I had a quick search and I can’t see them being documented by IBM, which is strange, but they do work (we have had them on ever since they were released in late 2013). Maybe they have some kind of internal documentation that is not published for the public or something.

The action confirmation dialog is available via the documented setting, requireConfirmAction. See https://www.ibm.com/support/knowledgecenter/SSQL82_9.5.0/com.ibm.bigfix.doc/Platform/Installation/c_list_of_advanced_options.html

The dynamic targeting option is not documented, but the WebUI provides this capability, by default. Only individual machine targeting and group targeting options are available in the WebUI. It also includes some more detailed permissions that allow you to limit the number of devices and/or pieces of content that an NMO can select and deploy to at one time.