It's a Trap!
Actually, no, it's not a trap. ServiceNav implements Trap event handling in version 3.19!
This highly requested feature lets you collect events generated by your equipment, assign a status based on their content, and then integrate the results into service weathers, dashboards and reports.
How does it work?
A challenge for ServiceNav
Trap management had been under consideration at Coservit for a few months. We knew that most devices can generate alerts, and that some of them can only be supervised this way. After a small configuration change on the ServiceNav Box and an upgrade to version 3.19, ServiceNav can now handle traps.
Before we start, what is a Trap?
Put simply, the SNMP protocol works in two ways:
- Active: the supervision box sends a "GET" request to collect information, and the equipment answers. This is how supervision normally operates.
- Passive: the equipment does not wait to be queried. As soon as an alert occurs, it sends an SNMP packet (a trap) containing information about the event.
A problematic operation
1 - In most monitoring tools, the implementation of the passive mode is binary:
- No alert: status is OK.
- Arrival of an alert: the status becomes critical.
Whatever the severity or content of the event, the status becomes critical.
We need to be more precise in managing alerts.
2 - Once a status has been changed to critical, it does not automatically change back to OK.
We want to avoid a permanent red checkpoint, which would be counterproductive.
It is also important that users do not have to take any manual action to reset the control state.
A solution for every problem
It is essential that the information collected is relevant and useful, both for technical operations and for management.
To avoid raising a critical alert for every event, the plugin can filter trap content using customizable patterns.
If a particular string is found in the trap, the status adjusts accordingly.
When a trap is received, its content is compared with the patterns entered as parameters. Each user can define what counts as a critical alert.
Filtering can be done on an OID, words or a phrase.
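As a rough illustration of this kind of filtering, here is a minimal Python sketch. It is not ServiceNav's implementation: the pattern table, the OID, the function name `classify_trap` and the choice to return OK when nothing matches are all assumptions made for the example.

```python
import re

# Hypothetical pattern table: (regular expression, resulting status).
# The first pattern found in the trap wins, so order matters.
PATTERNS = [
    (re.compile(r"1\.3\.6\.1\.4\.1\.9999\.1\.2"), "CRITICAL"),  # made-up OID
    (re.compile(r"fan failure|power supply", re.IGNORECASE), "CRITICAL"),
    (re.compile(r"temperature high", re.IGNORECASE), "WARNING"),
]


def classify_trap(trap_text, patterns=PATTERNS, default="OK"):
    """Return the status of the first pattern found in the trap text.

    Assumption: a trap that matches no pattern leaves the status at `default`.
    """
    for regex, status in patterns:
        if regex.search(trap_text):
            return status
    return default
```

With this table, a trap containing "Temperature high in rack 3" would be classified as WARNING, while one mentioning a fan failure would be CRITICAL.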
The problem of alert relevance is now solved. But how does the checkpoint turn back to green once the alert has been handled? And how do we know that an alert is over?
Some equipment sends traps when there is a problem and then a "reverse" trap indicating that the equipment has returned to its nominal state.
Thanks to the OK pattern, you can define an OID or a string whose presence in the trap indicates an OK status. No manual action is required.
For other equipment that only notifies in case of a problem, we have implemented a timeout system.
When a trap is received, the checkpoint goes to Warning or Critical status. If no other trap is received before the timeout defined as a parameter, the checkpoint automatically returns to OK status.
The Trap checkpoints are therefore completely autonomous.
For example, we can decide that an alert should stay in the supervision for 30 minutes at most, and that its follow-up is handled by creating a ticket.
After this time, the checkpoint will be set to OK to allow for the possible reception of another alert.
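The reset behaviour described above (OK pattern or timeout) can be sketched as a small state holder. This is only an illustration under stated assumptions: the class name `TrapCheckpoint`, its methods and the use of plain seconds for time are hypothetical and do not reflect ServiceNav's internals.

```python
class TrapCheckpoint:
    """Hypothetical checkpoint that resets itself without manual action."""

    def __init__(self, ok_pattern=None, timeout_seconds=30 * 60):
        self.ok_pattern = ok_pattern            # string that signals a return to normal
        self.timeout_seconds = timeout_seconds  # 30 minutes, as in the example above
        self.status = "OK"
        self.last_alert_at = None               # time of the last alerting trap

    def on_trap(self, trap_text, status, now):
        """Record a trap whose status was already derived from the patterns."""
        if self.ok_pattern and self.ok_pattern in trap_text:
            # A "reverse" trap: the equipment is back to its nominal state.
            self.status = "OK"
            self.last_alert_at = None
        elif status in ("WARNING", "CRITICAL"):
            # Each new alert restarts the timeout.
            self.status = status
            self.last_alert_at = now

    def current_status(self, now):
        """Return the status, falling back to OK once the timeout has elapsed."""
        if (self.last_alert_at is not None
                and now - self.last_alert_at >= self.timeout_seconds):
            self.status = "OK"
            self.last_alert_at = None
        return self.status
```

Passing `now` explicitly instead of reading the system clock keeps the sketch easy to reason about: a CRITICAL trap recorded at time 0 still reads CRITICAL at 60 seconds and OK at 1800 seconds.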
If an email notification is associated with the checkpoint, the user is notified that a particular event occurred even if the checkpoint has since returned to OK, and can respond accordingly.
Smart management and customizable text output
A trap is made up of several variables (for example reception time, OID and text).
Users choose the text output displayed in the ServiceNav interface.
If the trap OID is in variable 2 and the text is in variable 3, the text output could be:
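To illustrate the idea of building the output from numbered trap variables, here is a minimal Python sketch. The `$N` placeholder syntax, the function name `render_output` and the sample values are assumptions made for this example; they are not ServiceNav's actual template format.

```python
import re


def render_output(template, variables):
    """Replace each $N placeholder with the N-th trap variable (1-based).

    Placeholders that point past the end of the variable list are left as-is.
    """
    def substitute(match):
        index = int(match.group(1))
        if 1 <= index <= len(variables):
            return str(variables[index - 1])
        return match.group(0)

    return re.sub(r"\$(\d+)", substitute, template)


# Hypothetical trap: reception time, OID, text.
variables = ["12:04:31", "1.3.6.1.4.1.9999.1.2", "Fan failure"]
print(render_output("Trap $2: $3", variables))
# prints "Trap 1.3.6.1.4.1.9999.1.2: Fan failure"
```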
An example to illustrate this
With this configuration:
If the trap received is:
The checkpoint will be:
If, a few minutes later, the equipment sends a trap indicating that the server is healthy again:
The checkpoint will be set to OK with the pattern:
As you can see, this feature opens supervision to new equipment and reflects our commitment to continuously improving ServiceNav.
A webinar will be held on December 5th and 11th for a more precise presentation of this feature. Register at !