System Management Tools
  • 26 Jun 2025

Description

Caution
These tools are intended to be used only as directed by a Netreo support engineer and are available only to users with the SuperAdmin access level. They are not intended for customer use in troubleshooting their own systems and are not supported for that use.

The System Management page contains several tools that can be used to aid Netreo support in diagnosing issues with your Netreo deployment.

Select Administration >> System >> System Management from the main menu to navigate to the System Management page.

These tools can be run from the UI of any active Netreo appliance (with one exception), including:

  • Any primary instance (including high-availability and Overview)
  • The replica instance of a high-availability deployment
  • Any service engine instance
  • Any client instance of a Netreo Overview deployment

(Exception) These tools cannot be run from the arbitrator instance of a high-availability deployment, as it does not have a UI. However, an arbitrator can be selected as the target of the tools when they are run from another instance.

If run from a primary instance, any other connected Netreo instance can be selected as the tool's target. However, if run from a non-primary instance, only that instance (that is, itself) can be selected as the target.

MySQL Query Tool

With this tool, users can run read-only (SELECT) queries against the MySQL databases of Netreo VMs.

The purpose of this tool is to allow Netreo support personnel to more easily obtain database information from a customer system to aid in troubleshooting. Netreo support personnel will provide the customer with specific queries that can be run from this tool to collect the necessary data. That data can then be exported via the tool and sent to Netreo for evaluation. This enables support personnel to examine the data without requiring direct access to the customer system.

Procedure

To run a Netreo-provided query through this tool, follow the procedure below.

  1. Log in as a user with the SuperAdmin access level to either your primary Netreo appliance or the Netreo instance against which you plan to run the query.
  2. Select Administration >> System >> System Management from the main menu to open the System Management page.
  3. If not already selected, select the "MySQL Query" tool by using the buttons at the top of the page.
  4. On the Application System panel, in the NETREO DEVICE field, use the pull-down menu to select the Netreo instance against which to run the query.
  5. On the Query panel, paste the query provided to you by Netreo support. (Only read-only commands are supported.)
  6. Click the Submit button.
  7. The results of the query are displayed in the Result panel.
  8. Click the CSV button to export the data to a CSV file.
  9. Send the file to Netreo using the method requested by support.

The tool has a hard limit of 1000 results per query. However, a query can include its own limit to return fewer results than that.
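Because of the 1000-result hard limit, an export that contains exactly 1000 rows may be incomplete. The following is a minimal Python sketch for checking an exported file before sending it to support. It assumes the CSV export has a single header row, which this article does not confirm, and the function names are hypothetical.

```python
import csv
import io

RESULT_LIMIT = 1000  # hard limit stated for the MySQL Query tool


def count_data_rows(csv_text: str, has_header: bool = True) -> int:
    """Count non-empty data rows in exported CSV text.

    Assumption: the export starts with one header row (has_header=True).
    """
    rows = [row for row in csv.reader(io.StringIO(csv_text)) if row]
    if has_header and rows:
        rows = rows[1:]
    return len(rows)


def may_be_truncated(csv_text: str, has_header: bool = True) -> bool:
    """Return True when the row count reaches the limit, so rows may be missing."""
    return count_data_rows(csv_text, has_header) >= RESULT_LIMIT


sample = "id,name\n1,core\n2,engine\n"
print(count_data_rows(sample))   # 2
print(may_be_truncated(sample))  # False
```

If the check returns True, mention it to the support engineer so that they can decide whether a narrower query is needed.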

Processes Tool

With this tool, users can manually restart selected Netreo internal processes or completely restart a Netreo VM from within the Netreo UI.

The purpose of this tool is to allow Netreo support personnel to easily restart Netreo internal processes (a common troubleshooting step) without resorting to the command line. Using this tool allows Netreo support personnel to direct customers to restart specific processes without the need for direct access to a customer system. It can also be used to restart a Netreo VM from within the Netreo UI (expect downtime for the affected system while the VM reboots).

When the Netreo processes affected by this tool are listed, they are organized into the following categories:

  • Availability Engine - Processes responsible for service checks and host availability checking.
  • Database - Processes responsible for the database that holds Netreo system configurations. (This does not include managed device historical performance data. That is stored separately.)
  • Incident Management - Processes responsible for incident creation, alarm correlation, and the sending of alert notifications.
  • Logging and Traps - Processes responsible for processing SNMP traps and syslogs sent to Netreo.
  • NetFlow - Processes responsible for processing traffic flow packets sent to Netreo.
  • Netreo Monitor - Processes responsible for starting, stopping, and monitoring Netreo's internal functions. Includes all of the services and processes listed in these categories.
  • Polling Engine - Processes responsible for all the jobs that request, receive, process, and store all historical performance data for managed devices.

This tool supports two methods of process restart:

  • Graceful Restart (uses the restart command) - The square icon. Attempts to restart the process normally.
  • Force Restart (uses the kill -9 command) - The X icon. Forces the process to stop without attempting to restart it. Netreo then attempts to restart the process using its own internal logic.

Procedure

Follow the procedure below to use the tool to restart a group of Netreo services (or a Netreo VM).

  1. Log in as a user with the SuperAdmin access level to either your primary Netreo appliance or the Netreo instance against which you plan to run the tool.
  2. Select Administration >> System >> System Management from the main menu to open the System Management page.
  3. If not already selected, select the "Processes" tool by using the buttons at the top of the page.
  4. On the Application System panel, in the NETREO DEVICE field, use the pull-down menu to select the Netreo instance against which to run the tool.
  5. Click the List Processes button.
  6. A list of Netreo internal process categories is displayed, as well as their current status. (Each category includes multiple related processes.)
    • The list results from a single query and is not a real time display of running processes.
    • Different categories are displayed depending on your Netreo deployment configuration (not all categories are available on all appliances).
    • Netreo processes currently not running are displayed as "missing" in the category info area.
  7. As directed by a Netreo support engineer, restart a process category using the buttons to the right.
    1. In the pop-up dialog that appears, click the Yes button.
  8. Click the List Processes button again to confirm that the selected category has actually stopped all of its processes.

It is important to note that Netreo frequently restarts its own processes in the normal execution of events, so stopped processes displayed in the list are not necessarily of concern. However, if a process continually shows as stopped, we recommend you contact Netreo support for additional help.

HA Database Processes
Due to the way databases work in a Netreo high-availability (HA) deployment, the Database category of processes is never available for appliances with an active HA configuration (Administration >> System >> High Availability). Delete your HA configuration to access the Database category of processes for those appliances.

Netreo Workers

Advanced Netreo Users Only
The Netreo Workers tool is intended for use by advanced Netreo Administrators. Netreo attempts to check memory usage to prevent system crashes caused by unreasonable values. Still, misconfiguring Netreo using this tool can lead to extreme performance issues, resulting in missing or lost data.

This tool allows you to manually override a specific appliance's maximum number of simultaneously spawned Netreo worker instances.

A worker is a single process tasked with completing a specific job within the Netreo monitoring workflow. Many worker instances for different jobs are continuously spawned and terminated as Netreo monitors your environment.

If your Netreo deployment includes a highly demanding environment (for example, a large network environment monitoring QoS and traffic flows), this tool provides you with the ability to customize Netreo worker spawning to maximize the usage of your available resources.

The adjustable worker types are:

  • Pollmaster Workers - These workers collect metrics from a managed device. They know which device to query and what metrics to collect.
  • Pollmaster Result Workers - These workers process the metrics data retrieved by the Pollmaster Workers and send the results to storage.
  • NetFlow Workers - These workers process the data from incoming traffic flow packets and send the results to storage.
  • OAM Workers - These workers perform the service checks assigned to managed devices. Each worker executes one service check and processes the results.

Normally, the maximum number of simultaneous workers for each type is calculated based on the resources available to the appliance at the time of deployment (except for NetFlow Workers, which use a static value). See the Fields section below for how each worker type's default value is calculated.

Fields

  • Application System Panel
    • NETREO DEVICE - Selects the appliance on which to make changes. You can select the core appliance of a standard on-premises deployment, the primary appliance in an HA deployment, a service engine appliance, or a service engine group. Selecting a service engine group applies the settings equally to all service engine appliances within the group. Replica and arbitrator appliances for HA deployments cannot be selected, as their settings automatically mirror the primary appliance.
  • Netreo Workers Configuration Panel
    • POLLMASTER WORKERS - Limits the maximum number of simultaneous Pollmaster Workers. By default, this value is the number of CPU cores available to the appliance, up to a maximum of 40.
    • POLLMASTER RESULT WORKERS - Limits the maximum number of simultaneous Pollmaster Result Workers. The default value is based on the type of appliance:
      • Primary appliance default = number of CPU cores * 1.5, up to a maximum of 20
      • Service engine appliance default = 4
    • NETFLOW WORKERS - Limits the maximum number of simultaneous NetFlow Workers. By default, this value is 3.
    • OAM WORKERS - Limits the maximum number of simultaneous OAM Workers. By default, this value is the number of CPU cores * 5, up to a maximum of 125.
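Because the UI does not display the calculated defaults, it can help to work them out by hand. The following Python sketch simply restates the default rules listed above. It is not Netreo's own code: the function name is hypothetical, and rounding the "CPU cores * 1.5" result down is an assumption, since the article does not say how fractional values are handled.

```python
def default_workers(cpu_cores: int, is_service_engine: bool = False) -> dict:
    """Estimate default worker limits from the rules in the Fields section.

    Assumption: 'CPU cores * 1.5' is rounded down when it is not a whole number.
    """
    if is_service_engine:
        pollmaster_result = 4  # fixed default for service engines
    else:
        # Integer form of cores * 1.5 (rounded down), capped at 20.
        pollmaster_result = min((cpu_cores * 3) // 2, 20)
    return {
        "pollmaster": min(cpu_cores, 40),   # one per core, capped at 40
        "pollmaster_result": pollmaster_result,
        "netflow": 3,                       # static value
        "oam": min(cpu_cores * 5, 125),     # five per core, capped at 125
    }


print(default_workers(16))
# {'pollmaster': 16, 'pollmaster_result': 20, 'netflow': 3, 'oam': 80}
print(default_workers(8, is_service_engine=True))
# {'pollmaster': 8, 'pollmaster_result': 4, 'netflow': 3, 'oam': 40}
```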

By default, the worker configuration fields do not show the current value calculated for that field, as that would require the field to live update on any resource changes. Instead, the fields display "Job count limit" until manually overridden by the user.

To set a custom value: Check the OVERWRITE checkbox for the desired field, enter the new value in the text field, and click Save. Once a value has been overwritten, the new value displays in the text field.

To remove a custom value and return to the default: Delete the value in the text field, uncheck the OVERWRITE checkbox, and click Save. Netreo will recalculate the correct default value based on currently available resources for that appliance.

After changing the worker configuration, a primary appliance will take at least 5 minutes to adjust to the new values. Service engines have their worker values updated when synchronizing their data with the primary appliance, so allow at least 20 minutes for the new values to take effect.

Recommendations

Follow the recommendations below to maximize Netreo's performance based on your deployment's available resources.

Note: When checking the various performance statistics identified in the recommendations during tuning, also check the Swap memory statistic on the Performance tab of the Device Dashboard of the Netreo managed device you want to tune. If the value in the Swap graph is greater than 10% before tuning, then increasing the workers will not help (in fact, it would be detrimental), as you are likely already experiencing performance issues due to insufficient memory resources.

Caution
Setting worker values that are too high can cause gaps in historical data, as an overall system load that is too high can cause delays in processing. Conversely, setting values that are too low can also result in gaps in historical data, as there might not be enough workers to process the work volume according to Netreo's schedule.

Pollmaster Workers

When tuning the Pollmaster Workers value, first check the Performance tab of the Device Dashboard for the Netreo managed device that you want to tune. (Remember, this can be the Netreo core appliance, a service engine, or a service engine group.)

On the Performance tab, look for the Polling Queue statistic. This value represents the percentage of the total monitored device count for which pollmaster jobs are waiting to be processed. Ideally, the value of this statistic would always be zero. Occasional spikes in the graph are considered normal and acceptable. However, if there is a persistent nonzero value in this graph, try increasing the number of pollmaster workers until the Polling Queue consistently remains at zero. Increase the number of workers in increments of 10% of the default value. Because the default value is not displayed in the UI, determine the default value for your appliance by using the information provided in the Fields section discussed earlier.

If Swap is at zero and increasing the number of workers fails to affect the Polling Queue value, it is likely that you have devices that are responding slowly to polling, and they are the reason for the increased polling queue. Remove any worker adjustments and address any slow-responding devices first. Then, begin increasing the workers again to see if it reduces the Polling Queue value. Be careful as you increase the number of pollmaster workers, so that the Swap value does not increase as well.

As an additional note, when tuning the worker values, we recommend never setting a value lower than what the default calculation would produce. For example, if there are 16 CPU cores available for the appliance, don't set the number of pollmaster workers lower than 16. See the Fields section discussed earlier for default values.
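The Pollmaster Workers guidance in this section can be sketched as a small decision helper in Python. This is only an illustration: the function names are hypothetical, "persistent" is modeled as every recent Polling Queue sample being nonzero, and the 10% step is rounded up to a whole worker. Neither of those two choices is specified by Netreo.

```python
def step_size(default: int) -> int:
    """10% of the default value, rounded up to at least one worker (assumption)."""
    return max(1, (default + 9) // 10)


def next_pollmaster_value(default: int, current: int, swap_percent: float,
                          polling_queue_samples: list) -> int:
    """Suggest the next Pollmaster Workers value, or keep the current one."""
    current = max(current, default)  # never go below the calculated default
    if swap_percent > 10:
        return current  # memory is already short; more workers would hurt
    persistent = bool(polling_queue_samples) and all(
        sample > 0 for sample in polling_queue_samples
    )
    if persistent:
        return current + step_size(default)
    return current  # occasional spikes are normal; leave the value alone


print(next_pollmaster_value(16, 16, swap_percent=0, polling_queue_samples=[2, 3, 1]))   # 18
print(next_pollmaster_value(16, 16, swap_percent=12, polling_queue_samples=[2, 3, 1]))  # 16
print(next_pollmaster_value(16, 16, swap_percent=0, polling_queue_samples=[0, 5, 0]))   # 16
```

The helper only suggests a number. If the Polling Queue does not respond to an increase while Swap is at zero, the advice about slow-responding devices still applies.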

Pollmaster Result Workers

The only time you would want to change the number of pollmaster result workers is when the job completion rate for the pollmaster_result statistic is less than 100% for more than 10% of the time (2-3 hours out of a given 24-hour period). Ideally, the value of this statistic would always be 100%.
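The 10% rule above can be checked with a few lines of Python if you have the job completion values as a list. This sketch assumes evenly spaced samples covering a 24-hour period. The function name and the sample data are hypothetical.

```python
def needs_more_result_workers(job_completion_samples: list) -> bool:
    """True when job completion is below 100% for more than 10% of the samples.

    Assumption: samples are evenly spaced across the period being reviewed.
    """
    if not job_completion_samples:
        return False
    below = sum(1 for value in job_completion_samples if value < 100)
    # Integer comparison equivalent to: below / total > 0.10
    return below * 10 > len(job_completion_samples)


# Hourly samples for one day: 3 of 24 hours below 100% is 12.5% of the time.
day = [100] * 21 + [97, 95, 99]
print(needs_more_result_workers(day))  # True

# 2 of 24 hours below 100% is about 8.3% of the time.
print(needs_more_result_workers([100] * 22 + [97, 95]))  # False
```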

Check the pollmaster_result statistic by opening the Performance tab of the Device Dashboard for the Netreo managed device that you want to tune and locating the pollmaster_result statistic within the Netreo Queue Performance statistical group. The percentage of completed jobs is shown in the JOB COMPLETION column. Click the value in the column to open a 24-hour graph for that statistic.

Try increasing the number of pollmaster result workers by 10% of the default value. (Because the default value is not displayed in the UI, determine the default value for your appliance by using the information provided in the Fields section discussed earlier.)

If adjusting the number of pollmaster result workers fails to affect the pollmaster_result job completion value within roughly 2 hours, it might indicate a database/storage speed issue. Do not continue increasing the workers beyond the initial 10%. Checking the I/O Performance statistics for consistently high values might reveal a storage performance bottleneck.

NetFlow Workers

Netreo tracks several traffic flow processing statistics for each appliance that is receiving flow data. Review these statistics by locating them on the Performance tab of the Device Dashboard for the Netreo managed device that you would like to tune. Try increasing the number of NetFlow workers by 1. Wait at least 1 hour, and then check to see if the traffic flow statistic values increase.

If the values do not increase, then Netreo is already processing all received traffic flows, and no adjustment is necessary. Remove the adjustment to reset the default value.

If the values do increase, then Netreo was not processing all received traffic flows. Continue increasing the number of NetFlow workers by 1 and rechecking the statistics until they stabilize and no further gains are made. At that point, Netreo is processing all received traffic flows.
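The NetFlow tuning procedure above is a simple "increase by one until gains stop" loop, sketched below in Python. The `measure` callable is a stand-in for the manual step of saving a worker value, waiting at least 1 hour, and reading the traffic flow statistic. It, the function name, and the `max_workers` safety cap are all hypothetical and not part of Netreo.

```python
from typing import Callable

NETFLOW_DEFAULT = 3  # static default from the Fields section


def tune_netflow_workers(measure: Callable[[int], float],
                         start: int = NETFLOW_DEFAULT,
                         max_workers: int = 10) -> int:
    """Return the worker count after which adding one more gives no gain.

    measure(n) stands for: set n workers, wait at least 1 hour, read the statistic.
    max_workers is an arbitrary safety cap chosen for this sketch.
    """
    best = start
    best_value = measure(start)
    candidate = start + 1
    while candidate <= max_workers:
        value = measure(candidate)
        if value <= best_value:
            break  # no further gain: all received flows are being processed
        best, best_value = candidate, value
        candidate += 1
    return best


# Simulated appliance whose flow throughput stops growing at 5 workers.
print(tune_netflow_workers(lambda workers: min(workers, 5) * 100))  # 5

# Simulated appliance that already processes everything at the default.
print(tune_netflow_workers(lambda workers: 300))  # 3
```

When the result equals the default, remove the adjustment as described above rather than saving the default as a custom value.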

OAM Workers

When tuning the OAM Workers value, first check the Performance tab of the Device Dashboard for the Netreo managed device that you want to tune. If using service engines, you should only need to tune this value on them (or the service engine group they are assigned to). If not using service engines, you would tune this value on the core appliance.

On the Performance tab, look for the OAM Latency statistic. Ideally, this value would be 0, with minor bursts for no more than a few minutes. However, if the graph shows persistent latency higher than 5 seconds over a period of more than 30 minutes, you can try increasing the OAM Workers value until the OAM Latency is reduced.
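The latency condition above (more than 5 seconds, sustained for more than 30 minutes) can be expressed as a check over a series of latency readings. This Python sketch assumes evenly spaced samples. The function name and the sampling interval are hypothetical.

```python
def has_persistent_latency(latency_seconds: list,
                           interval_minutes: int = 1,
                           threshold_seconds: float = 5.0,
                           min_minutes: int = 30) -> bool:
    """True when latency stays above the threshold for longer than min_minutes.

    Assumption: readings are evenly spaced, interval_minutes apart.
    """
    run = 0  # consecutive readings above the threshold
    for latency in latency_seconds:
        run = run + 1 if latency > threshold_seconds else 0
        if run * interval_minutes > min_minutes:
            return True
    return False


print(has_persistent_latency([6.0] * 31))                    # True
print(has_persistent_latency([6.0] * 20 + [0] + [6.0] * 20))  # False
```

Short bursts reset the counter, which matches the guidance that minor bursts lasting a few minutes are acceptable.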

Try increasing the number of OAM workers by 10% of the default value. (Because the default value is not displayed in the UI, determine the default value for your appliance by using the information provided in the Fields section discussed earlier.)

If increasing the OAM workers is effective in reducing latency, but the latency value continues to be greater than desired, continue increasing the number of workers by 10% of the original value each attempt until latency is reduced to the desired level.

