BigFix root server on virtual server

We are looking into migrating our BigFix root server and our Inventory server from Red Hat 6 to Red Hat 7.

Our current root server is on a physical server, where we also have the DB2 database installed, holding the BigFix database and the Inventory database.

Current setup is Linux Root server < 2 physical Windows top relays < xx virtual Windows relays < agents

Sizing: 40,000 agents, of which 25,000 are workstations.

My question is: with this setup and sizing in mind, would it be possible to migrate the root server to a virtual server, and if so, what do we have to be aware of?

Thanks in advance

Yes, you can. Be sure to triple-check the I/O capabilities of your hypervisor to make sure you can get the IOPS needed for an environment of your size. I/O latency could potentially be an issue if you don’t plan carefully for it.


@Spirillen
You will find a wealth of sizing and performance information for your BigFix root server, and considerations for virtualizing it, in Mark’s fine guide here


Thanks for your answers, it really helps :smile:

What IOPS and I/O latency should we be looking for, in order to be on the safe side?

Per the capacity planning guide Brolly linked to:

3.1.4 Storage
• The BigFix storage requirement is for storage to offer in excess of 5,000 IOPS (IO Operations per Second) with 1 ms latency. This is especially critical for the database server volumes.
• This capability is easily achieved with local SSD with an AHCI interface.
• This capability can be massively over-achieved with flash-based storage appliances and local NVMe-based flash storage. For example, on benchmark systems we can manage in excess of 100,000 IOPS with 1 ms latency with a single NVMe-based flash drive. See the References section for further information on NVMe and the “storage revolution”.

The larger your environment, the more you should research. Become friends with your local SAN or Storage admin to review the capabilities available to you. Go with the best storage that your budget and SAN admin will permit.
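If you want a rough sanity check of a candidate volume against that 5,000 IOPS / 1 ms target before committing, a small probe like the following can help. This is only a minimal sketch, not the official benchmark: a proper fio run with realistic queue depth is the right tool, and the scratch-file path and sample counts here are assumptions.

```python
import os
import random
import statistics
import time

def sample_write_latency(path="./iops_probe.bin", size_mb=64, samples=500):
    """Time synchronous 4 KiB random writes to a scratch file.

    A rough, single-threaded stand-in for an fio randwrite job; the
    fsync on every write forces each block to stable storage.
    """
    block = os.urandom(4096)
    size = size_mb * 1024 * 1024
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        os.ftruncate(fd, size)
        latencies = []
        for _ in range(samples):
            # Seek to a random 4 KiB-aligned offset within the file.
            os.lseek(fd, random.randrange(size // 4096) * 4096, os.SEEK_SET)
            start = time.perf_counter()
            os.write(fd, block)
            os.fsync(fd)  # force the write to stable storage
            latencies.append(time.perf_counter() - start)
        return latencies
    finally:
        os.close(fd)
        os.remove(path)

lat = sample_write_latency()
p99 = statistics.quantiles(lat, n=100)[98]
print(f"median {statistics.median(lat) * 1000:.2f} ms, "
      f"p99 {p99 * 1000:.2f} ms, "
      f"~{1 / statistics.median(lat):.0f} single-threaded sync IOPS")
```

A single-threaded fsync loop understates what a device can do at higher queue depth, so treat this only as a latency smoke test; use fio with a realistic iodepth when validating the 5,000 IOPS figure itself.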


I’ll add that this also matters for FillDB in addition to the database.

The other issue is that you want the highest network speed and the lowest latency between the machines running the BigFix infrastructure: the root server, the WebUI, Web Reports, Inventory, Windows Console sessions, top-level relays, etc.

Going virtual often means these things suffer, and not just because of general virtualization overhead.

In a virtual environment, CPU, RAM, and bulk storage are generally the easiest resources to add when needed, while network and storage speed and latency are often what they are, with little way to improve them.
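To put numbers on the network-latency point, a minimal round-trip probe between components can be sketched as below. Assumptions are hedged: this demo bounces a byte off a throwaway loopback echo server; in practice you would run the client half against your real root server, relay, and database hosts.

```python
import socket
import statistics
import threading
import time

def rtt_samples(host, port, count=50):
    """Measure TCP round-trip latency by bouncing one byte off an echo service."""
    samples = []
    with socket.create_connection((host, port), timeout=5) as sock:
        # Disable Nagle's algorithm so each 1-byte send goes out immediately.
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
        for _ in range(count):
            start = time.perf_counter()
            sock.sendall(b"x")
            sock.recv(1)
            samples.append(time.perf_counter() - start)
    return samples

def _echo_server(srv):
    """Accept one connection and echo everything back until it closes."""
    conn, _ = srv.accept()
    with conn:
        while data := conn.recv(1):
            conn.sendall(data)

# Demo only: loopback echo server on an ephemeral port. Against real
# infrastructure, point rtt_samples() at a listening service instead.
srv = socket.create_server(("127.0.0.1", 0))
threading.Thread(target=_echo_server, args=(srv,), daemon=True).start()
lat = rtt_samples("127.0.0.1", srv.getsockname()[1])
print(f"median RTT {statistics.median(lat) * 1e6:.0f} µs")
```

Loopback numbers will be microseconds; what matters is running the same probe between the hosts that will carry root server, relay, and database traffic, where virtualization and network hops show up directly in the median.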

I have done exactly what you are looking into.
I went from 2 physical DSA servers to 2 virtual DSA servers supporting 50k+ endpoints.
In my case, the physical servers had SSDs, and our virtual environment was given SSDs as well.

Work with your virtualization folks to make sure you don’t allocate too many resources (CPU cores in particular) to your new server VMs. Unlike a physical environment, where more is better, a virtual environment needs to be “right-sized” to prevent unnecessary scheduling contention on the host.


This is true: the lower the vCPU count of the VM, the more likely the hypervisor can schedule all of its vCPUs onto physical cores of the host at the same time. Similarly, vCPU counts should always be 1 or an even number.

It is inefficient, but you can sometimes give a particular VM higher priority for resources, or even pin resources so that they are not shared; the trade-off is that pinned CPUs are consumed even when they are not needed.

The fact that the vCPUs are scheduled rather than “real” can lead to increased latency of network, storage, and compute. You can work around this in some cases, as just mentioned, but you are still limited by the maximum speed and minimum latency of your resources. Those values are also the best case; reality will be worse, depending on vCPU scheduling in the hypervisor and how over-provisioned the host is.

Some workloads are just not as sensitive to these things, and the same can be true of BigFix if the raw performance of everything is high enough and your use of BigFix is small enough by comparison. If your physical server is low-spec enough, you could even see increased performance by going virtual, provided the virtual resources are good enough, but that is not always the case.

If you are considering moving BigFix to AWS, then this is relevant: Recommendations for deploying BigFix on AWS

How many simultaneous Windows Console operators? They tend to have a significant impact on the system; the WebUI tends to be less impactful than the Console.

Hi

Thanks for all your inputs, really appreciate it.

I have talked with our VMware administrator, who told me that SAN was the only option for storage. Our storage administrator then told me that he couldn’t guarantee 5,000 IOPS on our SAN, and that latency would be more like 5 ms.

With this information in mind and taking all of your input into consideration, I think it would be most reasonable to go for a physical server.


If you don’t mind doing a server migration, or multiple server migrations, then technically there is no harm in trying it. My main caution is that the virtual infrastructure might work okay at first but, because of its limitations, fail to handle the increased performance needs that come with growth.

One option, which is in effect more expensive than a single physical server, is to create a high-performance virtual cluster just for the BigFix core infrastructure. The advantage of this kind of solution is that you can move things between clusters or hosts more easily as needed, but you would still pay a virtualization penalty. And if you do certain things to maximize performance, the VMs won’t be able to move across hosts automatically as easily, so you lose some of the higher availability of a cluster.

Is there anything wrong with your current physical server’s performance? If not, you might get more out of it by adding NVMe storage or 10 Gb NICs rather than replacing it entirely. In previous jobs, when we would replace our physical root server, the “retired” server would become a top-level relay or similar. If configured well, especially when using automatic relay selection, relays are effectively disposable. There isn’t anything bad about using out-of-warranty devices as relays, as long as you have more than one covering the same area.

If you just want to migrate OS versions without downtime, you could do a server migration from your physical host to a virtual one, then another server migration from the virtual host back to the physical one once you have reinstalled the OS.