Network outages are a classic case in which administrators are flooded with alerts. To avoid this, parent-child relationships must be defined between the monitored devices.
We will explain how to do this in the ServiceNav monitoring tool.
Use case
Your company has several sites connected by an MPLS network, and you want to monitor the equipment at the remote sites.
The supervision box is located in the DC at your head office and therefore polls the equipment at your remote sites at regular intervals.
The architecture is as follows:
Now let's imagine that the server of the remote site is turned off.
The supervision box queries the server directly on its IP address. If the server does not answer, its status becomes DOWN (Critical), and an alert is sent if notifications are enabled for the DOWN (Critical) state.
Now, let's imagine that the router is unreachable (operator problem on the remote site).
The supervision box still tries to reach every device on the site. Because none of them is reachable from its point of view, it returns a DOWN (Critical) status for all 4 devices on the site (server, printer, switch and router).
However, this does not reflect reality, since only the router is actually unreachable.
In order for the ServiceNav Box to distinguish between the DOWN and UNREACHABLE states of the monitored devices, it is necessary to tell the ServiceNav Box how the devices are connected to each other from the ServiceNav Box's perspective.
To do this, trace the path that a data packet should take to get from the ServiceNav Box to each monitored device. Each switch, router, and server that the packet must pass through is considered a step and requires you to define a parent/child relationship in the monitoring.
The following relationships should therefore be defined so that not all the equipment on the remote site is reported as DOWN.
Parent | Child |
Router | Switch |
Switch | Server |
Switch | Printer |
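The relationships in the table can also be written as a simple child-to-parent mapping. The sketch below is purely illustrative: it is not ServiceNav code or configuration, and the names `PARENT` and `path_from_box` are hypothetical. It walks the mapping to reconstruct the path a packet takes from the box to a given device, which is a quick way to check that no hop has been forgotten.

```python
# Child -> parent mapping, copied from the table above.
# The router has no parent: it is the first hop as seen from the box.
PARENT = {
    "switch": "router",
    "server": "switch",
    "printer": "switch",
}


def path_from_box(device: str) -> list:
    """Return the hops between the box and `device`, nearest hop first."""
    path = [device]
    # Climb from child to parent until a device without a parent is reached.
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return list(reversed(path))


print(path_from_box("server"))   # ['router', 'switch', 'server']
print(path_from_box("printer"))  # ['router', 'switch', 'printer']
```

Every device that appears in a path before the target must be declared as a parent somewhere in the chain.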
When the supervision box queries the equipment, it first checks the relationships and finds that the server and the printer depend on the switch, and that the switch depends on the router, which is no longer reachable.
In the monitoring, only the router appears with a DOWN status. The other devices have an UNREACHABLE status because the ServiceNav Box cannot know whether they are working or not, so it considers them Unknown rather than Critical.
When the router returns to UP status, all devices will be accessible again.
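The decision described above can be sketched in a few lines. This is a minimal illustration of the logic as explained in this article, not ServiceNav's actual implementation; the names `PARENT`, `classify` and `responds` are hypothetical. A device that does not answer is DOWN only if its parent is UP; otherwise it is UNREACHABLE.

```python
# Child -> parent mapping for the remote site (router has no parent).
PARENT = {"switch": "router", "server": "switch", "printer": "switch"}


def classify(device: str, responds: dict) -> str:
    """Return UP, DOWN or UNREACHABLE for `device`.

    `responds` maps each device name to True if it answered the check.
    """
    if responds[device]:
        return "UP"
    parent = PARENT.get(device)
    # If the parent itself is not UP, the box cannot tell whether the
    # device is working, so the device is UNREACHABLE rather than DOWN.
    if parent is not None and classify(parent, responds) != "UP":
        return "UNREACHABLE"
    return "DOWN"


devices = ["router", "switch", "server", "printer"]

# Scenario 1: only the server is turned off.
server_off = {"router": True, "switch": True, "server": False, "printer": True}
print({d: classify(d, server_off) for d in devices})

# Scenario 2: the router is unreachable, so nothing on the site answers.
router_lost = {d: False for d in devices}
print({d: classify(d, router_lost) for d in devices})
```

In the first scenario only the server is reported DOWN; in the second, only the router is DOWN and the three other devices are UNREACHABLE, so a single alert points at the real cause.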
Configuring parent-child relationships in ServiceNav
Technical operation menu or Equipment configuration menu
Select the equipment to be configured and click on the button
Then, in the relationship tab, set up your relationships.
Example:
For the switch:
For the server and the printer:
For the parent/child relationships to be taken into account, it is MANDATORY to supervise all the equipment with the same ServiceNav supervision box.