- 20 Sep 2023
Add a Windows WMI Service Check to a Device
Add a Windows WMI Service Check
Only available for Windows-based devices.
To add a service check to a single Windows-based managed device, follow the steps below.
- Log in to Netreo as a user with the Admin access level or higher.
- Locate the device to which you would like to add a service check and select it to open its device dashboard.
- Specific devices can be located in Netreo either by drilling into a Tactical Overview dashboard widget or by searching for the device by name using the search feature at the top of the main menu.
- Select the gear icon in the top right of the dashboard to open the dashboard administrative view.
- Select the Service tab to view the service check management area.
- From the Actions pull-down menu, select WMI Service Check Wizard.
- Netreo will query the Windows device using WMI. You may proceed once a notification confirms that the server name was retrieved successfully.
- In the ACTION GROUP field, select the action group(s) to receive alert notifications before escalation (multiple selection is allowed).
- Action groups may also run commands on the affected device. See the article Action Group for more information about action groups and their uses.
- In the ESCALATION GROUP field, select the action group(s) to receive alert notifications after escalation (multiple selection is allowed).
- Action groups may also run commands on the affected device. See the article Action Group for more information about action groups and their uses.
- A list of currently running services is displayed for the device. (Service checks cannot be added for services that are not active when the wizard is run.)
- Any service that already has a service check monitoring it is highlighted in green and cannot be selected.
- All services that are auto-started are automatically selected.
- Select the services that you wish to monitor (multiple selection is allowed).
- Select Add WMI Service Checks.
It may take a few minutes for your new service check to become active. Please be patient.
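The rules the wizard applies to its service list can be summarized in a short sketch. This is only an illustrative model of the behavior described in the steps above, not Netreo code: the `Service` class, the `wizard_rows` function, and the sample service names are all hypothetical, and the sketch assumes that an auto-started service which already has a check is not pre-selected, since it cannot be selected at all.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Service:
    """Hypothetical record for one Windows service as the wizard might see it."""
    name: str
    running: bool
    auto_start: bool
    already_monitored: bool


def wizard_rows(services):
    """Return (listed, selectable, preselected) service names.

    - Only running services are listed.
    - Services that already have a service check cannot be selected.
    - Auto-started services are pre-selected (assumed: only if selectable).
    """
    listed = [s for s in services if s.running]
    selectable = [s for s in listed if not s.already_monitored]
    preselected = [s for s in selectable if s.auto_start]
    return (
        [s.name for s in listed],
        [s.name for s in selectable],
        [s.name for s in preselected],
    )


# Hypothetical sample data for illustration only.
sample = [
    Service("Spooler", running=True, auto_start=True, already_monitored=False),
    Service("W32Time", running=True, auto_start=True, already_monitored=True),
    Service("BITS", running=True, auto_start=False, already_monitored=False),
    Service("Fax", running=False, auto_start=True, already_monitored=False),
]

listed, selectable, preselected = wizard_rows(sample)
print("Listed:     ", listed)
print("Selectable: ", selectable)
print("Preselected:", preselected)
```

In this sample, the stopped service never appears in the list, which is why a service must be running when the wizard is run if you want to add a check for it.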
Modify a Service Check
Once a service check has been added to a device, you may:
- edit the check
- delete the check
by selecting the appropriate icon in its ACTIONS column.
Best Practices
Device Templates
It is highly recommended that service checks be added to devices and managed through device templates, and not directly on devices. Even in unique device-specific circumstances, service checks for that device can still be managed using a device template that includes the desired service checks and is assigned directly to the device.
The reason for this is that any service check added directly to a device can be overridden by a device template applied to that device, if the template includes a service check with an identical DESCRIPTION field. Device template settings always override any settings made directly on a device.
The only circumstance under which a service check should ever be added to a device directly is when that device has had its device template functionality turned off completely.
Service Check Names
Two or more service checks on a single device cannot have the same DESCRIPTION field value. This value acts as the service check name in dashboards and alert notifications, so be sure to provide unique names for your service checks when creating them.
Best practice here is to enter a descriptive name that indicates what the check does and what it is checking. As an extremely basic example, suppose you have two TCP port checks being added to the same device, one checking port 80 and the other checking port 110. Best practice would be to name the first check "TCP port 80 check" and the other check "TCP port 110 check." This way each check will be clearly identifiable everywhere, from dashboards to alert notifications.
Unique service check names are particularly important if you intend to override service check settings using device templates. A service check in a device template will only override another service check if the DESCRIPTION field matches exactly. So, be aware of this when configuring service checks in your device templates.
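The exact-match rule can be illustrated with a small sketch. This is a simplified model under stated assumptions, not Netreo's implementation: checks are represented as a dictionary keyed by DESCRIPTION, the `apply_template` function and the `check_interval` setting name are hypothetical, and "matches exactly" is assumed to mean a case-sensitive string comparison.

```python
def apply_template(device_checks, template_checks):
    """Merge template checks into device checks, keyed by DESCRIPTION.

    A template check replaces a device check only when the DESCRIPTION
    strings are identical; otherwise it is added as a separate check.
    """
    merged = dict(device_checks)    # copy so the inputs are not modified
    merged.update(template_checks)  # identical keys are overridden
    return merged


# Checks added directly on the device (hypothetical settings).
device = {
    "TCP port 80 check": {"check_interval": 5},
    "TCP port 110 check": {"check_interval": 5},
}

# Checks defined in a device template. The second name differs in case,
# so under the assumption above it does not match the device's check.
template = {
    "TCP port 80 check": {"check_interval": 3},
    "tcp port 110 check": {"check_interval": 3},
}

result = apply_template(device, template)
for description, settings in sorted(result.items()):
    print(description, settings)
```

Here the port 80 check takes the template's settings, while the port 110 check keeps its device-level settings and a second, differently named check appears alongside it. This is the kind of surprise that consistent naming avoids.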
Custom Alert Timing
The following only applies to active service checks, as passive service checks have their schedule controlled by whatever is updating them. (See Service Check for more information on active and passive service checks.)
Setting ALERT AFTER
Using the default 5 Minutes selection, Netreo will execute the service check query every 3 minutes until a failure is detected. Once a failure is detected, it will execute the query two more times at 1-minute intervals, leading to a worst-case alert notification response time of five minutes (3 minutes until the first failed query, plus two 1-minute retries). Although you certainly may use the Custom selection for this field, it's highly recommended that you do not do so without a very specific reason. The choices available for the ALERT AFTER field should be adequate for most situations.
Setting CHECK INTERVAL
This field defines how often (in minutes) this service check will be executed under normal circumstances. After every successful query, Netreo will wait this interval before it executes the query again. There is a significant performance consideration for this field: if you're executing 10,000 service checks at 1-minute intervals, Netreo will have to execute roughly 167 checks per second, adding significant network traffic and system load. Use common sense and try to select a reasonable interval. Netreo will try to spread the queries out so that they don't all run at the same time, but you can still overwhelm your network by configuring too many service checks.
The lowest that you'll generally ever want to set this field to is 3 Minutes (especially if the system is very heavily utilized). You may go lower, but the more frequently you execute the query, the heavier the load on the system and the more network overhead required.
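The load figure quoted in this section is simple arithmetic, sketched below for anyone who wants to try other values. The function name `checks_per_second` is hypothetical, and the calculation assumes the queries are spread perfectly evenly across the interval, which is an idealization.

```python
def checks_per_second(total_checks, interval_minutes):
    """Average number of queries per second needed to run every check
    once per interval, assuming an even spread over the interval."""
    return total_checks / (interval_minutes * 60)


# 10,000 checks at a 1-minute interval, as in the example above.
print(round(checks_per_second(10_000, 1)))  # 167

# The same checks at the suggested 3-minute minimum.
print(round(checks_per_second(10_000, 3), 1))  # 55.6
```

Moving from a 1-minute to a 3-minute interval cuts the average query rate to a third, which is the reasoning behind the 3 Minutes recommendation.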
Setting ON FAILURE, RETRY EVERY
This field defines the amount of time (in minutes) Netreo will wait to retry the query after an initial failure (during the soft state). This period should generally be considerably shorter than the CHECK INTERVAL period. Netreo will continue to retry the query at this interval even after an alert has been sent. If any retry query returns a success code, the check will stop retrying, clear any current alarm, and return to the normal CHECK INTERVAL schedule.
Setting TOTAL FAILURES BEFORE ALERT
This field defines the number of failed queries required to trigger an alert. When the total number of failed queries (initial failure plus retries) reaches this number, the service check enters a hard state. At this point, an alert notification is sent. The service check will continue to query according to the ON FAILURE, RETRY EVERY timer value. It is recommended that you do not set this option to 1, as that will generate a significantly higher number of false alarms.
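How the three timing fields interact can be seen in a small simulation. This is a sketch of the behavior described in the sections above, not Netreo's scheduler: the `simulate` function and its state labels are hypothetical, query execution time is ignored, and the values used in the example (3, 1, 3) are inferred from the description of the default 5 Minutes selection.

```python
def simulate(outcomes, check_interval, retry_every, failures_before_alert):
    """Walk a sequence of query outcomes (True = success) and return a
    list of (minute, state) tuples, one per query.

    States: "ok"        - success during normal operation
            "soft"      - failure, alert threshold not yet reached
            "alert"     - failure that reaches the threshold (hard state)
            "hard"      - further failure after the alert was sent
            "recovered" - first success after one or more failures
    """
    timeline = []
    minute = 0
    failures = 0
    for ok in outcomes:
        if ok:
            state = "recovered" if failures else "ok"
            failures = 0
            wait = check_interval  # back to the normal schedule
        else:
            failures += 1
            if failures < failures_before_alert:
                state = "soft"
            elif failures == failures_before_alert:
                state = "alert"
            else:
                state = "hard"
            wait = retry_every     # retries continue even after the alert
        timeline.append((minute, state))
        minute += wait
    return timeline


# One success, four failures, then the service comes back.
outcomes = [True, False, False, False, False, True, True]
for minute, state in simulate(outcomes, check_interval=3, retry_every=1,
                              failures_before_alert=3):
    print(f"minute {minute:2d}: {state}")
```

With these inputs the first failure is seen at minute 3, the alert fires at minute 5, retries continue at 1-minute intervals, and the schedule returns to 3-minute spacing after the recovery at minute 7.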
Common Values
If you need to be alerted to an outage immediately, you'll probably want to go with the following custom settings.
- CHECK INTERVAL = 3
- ON FAILURE, RETRY EVERY = 1
- TOTAL FAILURES BEFORE ALERT = 1
However, such a configuration means no soft state. This means that Netreo won't do any verification to ensure that a problem is real before it sends an alert notification. Users have done this in the past and then complained that Netreo was spamming them with alert notifications. So be careful.
Another common configuration is as follows.
- CHECK INTERVAL = 2
- ON FAILURE, RETRY EVERY = 1
- TOTAL FAILURES BEFORE ALERT = 2
If you do the math for such a configuration, the maximum possible time between a service outage and an alert is 3 minutes: up to 2 minutes until the first failed query, plus one retry 1 minute later. It works well, but remember the potential load problems of setting the CHECK INTERVAL to three minutes or below.
It is always recommended to avoid a TOTAL FAILURES BEFORE ALERT setting of 1, because any little hiccup on the network (like a lost ping packet) will immediately send an alert notification, which is probably not what you want if you're looking to minimize false alarms.
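The response times quoted for these configurations follow one pattern, which the sketch below makes explicit. The function name `worst_case_alert_minutes` is hypothetical, and the formula assumes the outage begins just after a successful query and that the queries themselves take no time. The first entry uses values inferred from the description of the default 5 Minutes selection.

```python
def worst_case_alert_minutes(check_interval, retry_every, failures_before_alert):
    """Longest time (in minutes) between an outage starting and the alert.

    Worst case: the outage starts right after a successful query, so the
    first failure is seen one full CHECK INTERVAL later. Each additional
    required failure adds one ON FAILURE, RETRY EVERY period.
    """
    return check_interval + (failures_before_alert - 1) * retry_every


configs = {
    "default 5 Minutes (inferred)": (3, 1, 3),
    "immediate alert, no soft state": (3, 1, 1),
    "2-minute interval, one retry": (2, 1, 2),
}

for label, (interval, retry, failures) in configs.items():
    minutes = worst_case_alert_minutes(interval, retry, failures)
    print(f"{label}: {minutes} minutes")
```

Note that the last two configurations share the same 3-minute worst case, but only the second one verifies the failure with a retry before alerting.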