Autopilot (AIOps Module)
  • 29 Jan 2025

Description

The Autopilot module for Netreo's AIOps feature automatically manages administrative system configuration tasks within Netreo. Many operational and administrative issues that arise from suboptimal or broken configuration settings can be automatically fixed (or avoided altogether) through the use of this module. (All AIOps modules are accessible only to users with the SuperAdmin access level.)

This module monitors and manages the following administrative configuration areas:

  • Solution setup: Checks for system-level problems such as incorrect DNS, NTP, or mail settings. (On-premise deployments only, not available in SaaS deployments.)
  • Environmental integration: Checks for bad device credentials within the monitored environment.
  • Best practices: Checks for issues in device organization and configuration consistency.
  • Data collection: Checks that the correct data is being collected from each type of device.
  • Threshold check baselining: Checks for static threshold check values that appear to be configured poorly (potentially resulting in excessive alerts).

Details

The Autopilot module scans for issues at approximately midnight every night. Scan times will vary depending on the number of devices being managed by Netreo.

Autopilot uses a series of models to scan for each particular type of administrative configuration issue. Each model can be switched on or off, as desired.

If the Autopilot scan finds an issue, the action taken depends on the behavior configured for the model that found the problem. All actions are recorded in the Autopilot Activity Log.

Autopilot Models

Autopilot models are the components that detect and fix Netreo configuration issues. Each model used by Autopilot is described in this section, organized by model category in the order the categories appear in the module's configuration settings.

(Models with an asterisk ( * ) are available only for Netreo on-premise deployments.)

Baselines

  • Threshold Baseline Check
    This model looks for managed devices with static threshold check values that appear to be configured poorly.
    The automatic fix tunes the respective threshold values to better represent the actual consistent values for the statistic each check is monitoring.

Best Practices

  • Auto-configuration Rule Mismatch
    This model looks for managed devices whose Netreo monitoring configuration doesn't conform to the autoconfiguration rulesets.
    The automatic fix runs the appropriate autoconfiguration rules against the respective devices.
  • Device Templates Status
    This model looks for managed devices with device template usage switched off.
    The automatic fix switches on the device template usage setting for the respective devices.
  • Host Alert Contacts Present
    This model looks for missing host alert contacts on each managed device.
    The automatic fix applies applicable device templates (which should be configured with appropriate host alert contacts) to the respective devices.
  • Missing Device Organization Information
    This model looks for managed devices that are not assigned to category, site, or strategic groups.
    The automatic fix runs the appropriate autoconfiguration rules against the respective devices.

Data Collectors

  • All Missing Basic Polling Data*
    This model checks that all performance data specified in each device type is being collected from all managed devices of that type.
    If issues are found, the findings card for each failed device shows the number of metrics missing for that device. (To determine which statistics are missing for a given device, compare the metrics listed on the Performance tab of its Device Dashboard with the pollers listed in its device type.)
  • Device Type Check*
    This model checks that the Netreo device type assigned to a managed device matches the device type reported by that device's internal description.
    The automatic fix changes the assigned device type for mismatched devices.
  • Interface Speed Check*
    This model checks for network interfaces on managed devices that are missing bandwidth speed values. If five or more interfaces on a device are missing in or out bandwidth speed data, an exception is output to the Autopilot Activity Log.
    The automatic fix attempts to determine and set the correct speeds on each failed device by using a combination of SNMP interrogation of the device and its own database information.
  • Missing Basic Polling Data*
    This model checks that appropriate performance data is being collected from all managed server-type devices.
    If a device is missing any expected data for the previous 24-hour period, the model tests all credentials stored in Netreo against the device and lists any that work in its findings. The user may then manually update their device templates, or the respective device's monitoring configuration, with those credentials.
  • NetFlow Validity Check
    This model checks that appropriate traffic flow data is being collected from all managed devices set as traffic flow exporters.
    If no data has been collected from a traffic exporting device for the previous 24-hour period, the model schedules the device for an immediate discovery poll to attempt to remedy the problem.
  • Selected Polling Data Missing*
    This model checks the metrics collected for each managed device over the previous 2-hour period. If any metric contains all NaNs as its data for that period, the device is identified in the findings.
    The automatic fix schedules the managed device for a discovery poll to correct any misconfigured poller queries. Frequently, this is enough to get the device reporting metric data properly again. However, if a device is not responding to the poller for a particular OID or API endpoint, the issue must then be addressed manually.
  • Syslog or SNMP Trap Validity Check
    This model checks that appropriate log data is being collected from all managed devices set as log exporters.
    If no data has been collected from a log exporting device in the previous 24-hour period, the model schedules the device for an immediate discovery poll to attempt to remedy the problem.
  • Unlocked Template Thresholds Detected
    This model looks for threshold checks that were assigned to a managed device by a device template and that have any of their configuration settings unlocked (i.e., editable within the device configuration, overriding the settings of the device template that assigned them).
    All unlocked thresholds are reported in the model's findings.
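
Two of the rules above (all-NaN metrics in Selected Polling Data Missing, and the five-interface limit in Interface Speed Check) are simple enough to sketch in code. The following is an illustrative sketch only, not Netreo's implementation: the function names, the dictionary layout, and the `speed_in`/`speed_out` keys are assumptions made for the example, as is the choice to treat an empty series as "not flagged."

```python
import math
from typing import Dict, List, Optional, Sequence

# Limit taken from the article: five or more interfaces missing speed data.
MISSING_SPEED_LIMIT = 5


def all_nan(samples: Sequence[float]) -> bool:
    """Return True when a non-empty series contains only NaN values.

    Assumption: an empty series is not flagged.
    """
    return len(samples) > 0 and all(math.isnan(s) for s in samples)


def metrics_without_data(metrics: Dict[str, Sequence[float]]) -> List[str]:
    """Names of metrics whose samples for the scan window are all NaN."""
    return sorted(name for name, samples in metrics.items() if all_nan(samples))


def too_many_missing_speeds(interfaces: List[Dict[str, Optional[int]]]) -> bool:
    """True when five or more interfaces lack an in or out speed value."""
    missing = sum(
        1
        for itf in interfaces
        if itf.get("speed_in") is None or itf.get("speed_out") is None
    )
    return missing >= MISSING_SPEED_LIMIT


# Hypothetical 2-hour window of samples for one device.
window = {"cpu": [float("nan")] * 4, "mem": [41.0, float("nan"), 43.5]}
print(metrics_without_data(window))  # ['cpu']
```

A metric with even one real sample in the window is not flagged, which matches the "all NaNs" wording of the model description.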

Environmental Integration

  • Device Credential Check
    This model looks for managed devices experiencing a Config Manager authentication failure.
    The model searches all device templates for credentials that work on each failed device. If working credentials are found, the name of the device template containing them is output to the Autopilot Activity Log, along with the name of the failing device, so that an administrator can update the device configuration to include the working credentials. If no working credentials are found, an exception is output to the Autopilot Activity Log with the device name and a message that no working credentials could be found.
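
The search described for the Device Credential Check can be pictured as a loop over device templates. This is a minimal sketch under stated assumptions: `DeviceTemplate`, `find_working_template`, `activity_log_entry`, and the `try_login` callback are hypothetical names invented for illustration, and the log message format is not Netreo's.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional


@dataclass(frozen=True)
class DeviceTemplate:
    name: str
    credential: str  # stand-in for whatever credential data a template holds


def find_working_template(
    device: str,
    templates: Iterable[DeviceTemplate],
    try_login: Callable[[str, str], bool],
) -> Optional[str]:
    """Return the name of the first template whose credential works, else None."""
    for template in templates:
        if try_login(device, template.credential):
            return template.name
    return None


def activity_log_entry(device: str, template_name: Optional[str]) -> str:
    """Build a log line; the wording here is illustrative only."""
    if template_name is None:
        return f"EXCEPTION {device}: no working credentials could be found"
    return f"{device}: working credentials found in device template '{template_name}'"
```

Note that the model only reports the template name; applying the credentials to the device configuration remains a manual step for the administrator.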

Solution Setup

  • DNS Server Incorrect
    This model checks to see if the DNS server configured in Netreo is valid.
    The automatic fix attempts to find a valid DNS server and configure Netreo to use it. If no valid servers can be found, an exception is output to the Autopilot Activity Log.
  • NTP Server Incorrect
    This model checks to see if the NTP server configured in Netreo is valid.
    The automatic fix attempts to find a valid NTP server and configure Netreo to use it. If no valid servers can be found, an exception is output to the Autopilot Activity Log.
  • SMTP Mail Delivery Problem
    This model looks for bounce errors in the Netreo Mail Log (Administration > Alerts > Mail) for the previous 24-hour period.
    If bounce errors are found, the model reports them in the Autopilot Activity Log.

The Autopilot Dashboard

To open the Autopilot dashboard, go to the main menu and select Administration > AIOps: Autopilot.

Summary Tab

Autopilot Summary

Summary

The Summary section visualizes the results of the most recent Autopilot scan with three gauge-style indicators labeled Observe, Analyze, and Act.

  • Observe
    The Observe gauge shows the number of configuration items scanned (blue bar) and the number of items that require fixing, called findings, identified (green bar).
  • Analyze
    The Analyze gauge shows how many findings Autopilot did not automatically fix (blue bar) and separates them into items that can be automatically fixed but weren't (green bar) and items that require user intervention (red bar).
  • Act
    The Act gauge shows the total number of findings fixed by Autopilot (blue bar), separated into the number automatically fixed without user intervention (green bar) and the number automatically fixed after manual user authorization (red bar).

Findings by Model Category

The Findings by Model Category section shows the number of active (unresolved) findings versus the number of findings that have been remedied (fixed), broken down by Autopilot model category.

Findings Tab

Autopilot can automatically fix a wide variety of configuration issues in Netreo, but some findings require user intervention. When Autopilot encounters something that it either can't fix or has been instructed not to fix, it flags those findings for review. These are called active findings, and they require a user to either approve and apply Autopilot's recommended remedy or fix the issue manually outside of Autopilot.

To view the current list of active findings in Autopilot, select the Findings tab and make sure that the Finding Sources button is selected.

Each active finding is displayed in its own card, which shows the name of the item affected (typically a managed device), its current host availability status, whether Autopilot can fix it or not, the nature of the issue, and the name of the model and category that detected it.

If a finding can be automatically fixed but Autopilot has been configured not to, the card will say "Can Be Addressed." If Autopilot cannot fix the finding, the card will say "Needs Review."
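
The labeling rule can be summarized as a small decision function. This is a sketch for illustration, not Netreo code: `card_label` and its parameters are hypothetical, and the In Progress label that appears among the filter options is not modeled because the article does not describe when it applies.

```python
from typing import Optional


def card_label(fixable: bool, auto_fix_enabled: bool) -> Optional[str]:
    """Return the label shown on a finding's card, or None if no card is shown.

    fixable: Autopilot has a remedy for this finding.
    auto_fix_enabled: the model is configured to fix findings automatically.
    """
    if not fixable:
        return "Needs Review"
    if auto_fix_enabled:
        # Fixed automatically, so it does not appear as an active finding.
        return None
    return "Can Be Addressed"
```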

To view a finding, click its card to extend a side panel that shows more detail about the finding and the device involved.

Multiple findings can be selected and deselected simply by clicking on their cards. If multiple findings are selected, the side panel will show a summary of the selected findings instead of a detailed view.

The findings can be filtered in several ways to focus on certain types of findings.

  • Model Category
    To filter the findings by model category, use the pull-down menu. By default, findings from all categories are displayed. Click a category in the pull-down menu to select it and clear the other categories. This adds a filter that shows only findings from the selected category. Then, if desired, click additional categories in the list to add them to the filtered selection.
  • Issue Label
    To filter the findings by issue label, click one of the label displays below the Model Category pull-down menu. By default, findings with all labels are displayed. Click a label to select it and deselect the other labels. This adds a filter that shows only findings with the selected label. Then, if desired, click additional labels to add them to the filtered selection. The labels available are In Progress, Can Be Addressed, and Needs Review.
  • Search Term
    To filter the findings by a specific term, enter a term in the filter box at the top right of the results display. A real-time filter is applied to the findings and updated as you type. Only a single search term (with no spaces) can be entered.

The findings are filtered in real time, and any combination of filters may be used together. The currently applied filters are shown above the findings. To remove all filtering options and display the complete list of findings again, click the clear filter label that is shown when filtering is applied.
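
How the three filters combine can be sketched as follows. This is an illustration of the described behavior, not Netreo's code: the `Finding` fields, the `filter_findings` function, the case-insensitive match, and the choice to search the item name and issue text are all assumptions. An unset filter means "show everything," mirroring the default state of the page.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Set


@dataclass(frozen=True)
class Finding:
    item: str      # affected item, typically a managed device
    category: str  # model category that detected it
    label: str     # "In Progress", "Can Be Addressed", or "Needs Review"
    issue: str     # short description of the issue


def filter_findings(
    findings: Iterable[Finding],
    categories: Optional[Set[str]] = None,
    labels: Optional[Set[str]] = None,
    term: Optional[str] = None,
) -> List[Finding]:
    """Apply every active filter; a finding must pass all of them."""
    if term is not None and " " in term:
        raise ValueError("search term must be a single term with no spaces")
    needle = term.lower() if term else None
    result = []
    for f in findings:
        if categories and f.category not in categories:
            continue
        if labels and f.label not in labels:
            continue
        if needle and needle not in f"{f.item} {f.issue}".lower():
            continue
        result.append(f)
    return result
```

Selecting several categories or labels widens the match within that filter, while adding a second kind of filter narrows the overall result.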

Fixing Unremedied Findings

Only active findings labeled as "Can Be Addressed" can be fixed from within Autopilot. Findings labeled as "Needs Review" cannot be fixed from within Autopilot and must be addressed appropriately by the Netreo administrator.

To fix active findings:

  1. Click one or more active findings labeled "Can Be Addressed" on the Findings tab to select them.
  2. In the side panel, click the Fix button.
  3. Autopilot will display the remedy it intends to apply.
    • Selecting the Fix future occurrences automatically option will update Autopilot's model configuration to automatically fix these types of findings in the future.
  4. Select Submit.

Findings labeled "Needs Review" that were selected with fixable findings will be ignored during the fixing process.

If you choose not to fix a finding labeled "Can Be Addressed," it remains an active finding and continues to display on the Findings tab. However, after a set time passes (configured on the Settings tab), it is fixed automatically even if the model's fixing behavior is set to Manual.

Settings Tab

To configure Autopilot, click the Settings tab.

The Preferences section contains the setting for how long Autopilot will allow a fixable finding to remain active before automatically applying the recommended remedy. Findings will always be automatically fixed after they have been active for this period of time, regardless of what the fixing behavior of the associated model is set to (see below).

The Model Categories section contains the models that make up Autopilot, grouped into categories. Settings can be configured for a whole category, or set individually on each model.

  • To change a setting for an entire category simultaneously, select the desired setting in the row of the category you want to change.
  • To change a setting for an individual model, click the + next to a category name to expand its model list and select the desired setting in the row of the model you want to change.

Changes to settings must be saved to take effect. Click Save Changes to do this.

To set whether a model category or individual model will participate in the Autopilot scan, click the switch in the ENABLED column to select the desired state: ON to enable, OFF to disable.

To exclude managed devices from the Autopilot scan by functional group, create one or more functional groups to exclude and add the desired devices to them. On the Autopilot Settings tab, click the button in the EXCLUSIONS column of the category or model from which you would like to exclude the functional group devices, select the desired functional group(s) in the pop-up dialog, and then click Save Changes.

To specify the behavior of Autopilot when a model category or individual model reports its findings, select the desired behavior from its respective pull-down selector in the FIXING column. The following behaviors are available:

  • Automatic - (The default behavior for all categories and models.) When the model discovers a finding, it runs the prescribed remedy automatically to try to fix the problem. No user action is required, and the issue is not displayed on the Findings tab, although it is still viewable in the Autopilot Activity Log.
  • Manual - When the model discovers a finding, it does not run the prescribed remedy automatically. Instead, it flags the finding for review and requires the user to apply the remedy manually on the Findings tab using the Fix button. Selecting the Fix future occurrences automatically option when applying the fix resets the behavior configuration for the model to Automatic.
  • Disabled - When the model discovers a finding, it does not run the prescribed remedy automatically. It also does not provide the option to apply the remedy manually from within the Findings tab. However, the findings are still flagged for review. If you do not want to see findings from this model at all, disable the model in the ENABLED column.

Setting the behavior of a model category automatically sets all models in that category to the same behavior. Conversely, setting the behavior of a model differently from its model category will cause the category behavior to show as Plugin-specific.
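
The relationship between category and model behaviors can be sketched in a few lines. The function names and the dictionary of model behaviors are hypothetical and chosen for illustration; only the behavior names and the Plugin-specific display value come from the article.

```python
from typing import Dict


def set_category_behavior(model_behaviors: Dict[str, str], behavior: str) -> Dict[str, str]:
    """Setting a category applies the same behavior to every model in it."""
    return {model: behavior for model in model_behaviors}


def category_display(model_behaviors: Dict[str, str]) -> str:
    """Behavior shown on the category row for the given model settings."""
    if not model_behaviors:
        raise ValueError("a category must contain at least one model")
    distinct = set(model_behaviors.values())
    if len(distinct) == 1:
        return distinct.pop()
    return "Plugin-specific"


# Hypothetical settings for the Solution Setup category.
solution_setup = {
    "DNS Server Incorrect": "Automatic",
    "NTP Server Incorrect": "Manual",
    "SMTP Mail Delivery Problem": "Automatic",
}
print(category_display(solution_setup))  # Plugin-specific
print(category_display(set_category_behavior(solution_setup, "Manual")))  # Manual
```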

Fixing Behavior Override
Autopilot will always auto-remedy fixable findings that have been in the Findings tab for longer than the auto-remedy time period (specified on the Settings tab), even if the fixing behavior for a model is set to Manual or Disabled.
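
The override rule can be expressed as a single predicate. This is a sketch under stated assumptions, not Netreo's implementation: `should_auto_remedy`, its parameters, and the seven-day period used in the example are hypothetical, and the actual period is whatever is configured in the Preferences section.

```python
from datetime import datetime, timedelta


def should_auto_remedy(
    fixable: bool,
    behavior: str,
    first_seen: datetime,
    now: datetime,
    auto_remedy_period: timedelta,
) -> bool:
    """Decide whether Autopilot applies the remedy without user action.

    behavior is "Automatic", "Manual", or "Disabled".
    """
    if not fixable:
        return False  # findings without a remedy always need a person
    if behavior == "Automatic":
        return True
    # Manual and Disabled are overridden once the finding has been
    # active for longer than the configured period.
    return now - first_seen > auto_remedy_period


# Example with an assumed seven-day period.
seen = datetime(2025, 1, 1)
print(should_auto_remedy(True, "Manual", seen, seen + timedelta(days=8), timedelta(days=7)))  # True
```

If a model's findings should never be remedied without review, keep the period in mind when choosing Manual or Disabled, because neither setting blocks the override.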

Viewing the Autopilot Activity Log

Autopilot maintains a complete log of its actions. To view this log, select the Findings tab and then click the Activity Log button.
