Network outages are a classic case where administrators can be overwhelmed with alerts. To remedy this problem, it is necessary to set up parent-child relationships.
Here we explain how to do this in the ServiceNav monitoring tool.
Consider a company with several sites connected by an MPLS network, where you wish to monitor all the hosts at your remote sites.
The ServiceNav Box is located in the head office data center and regularly polls the devices at your remote sites.
The architecture is as follows:
Now let's imagine that the server at the remote site is shut down.
The ServiceNav Box will then query the server directly on its IP address. If it does not answer, its status will be DOWN (Critical). An alert will be sent if notifications are activated for the DOWN (Critical) state.
Now, let's imagine that it is the router that is inaccessible (operational issue at the remote site).
The ServiceNav Box will therefore try to contact all the devices at the remote site. As none of them is accessible, it will return a DOWN (Critical) status for all 4 devices at the site (server, printer, switch and router).
But that does not reflect reality, as only the router is in fact down.
In order for the ServiceNav Box to be able to distinguish between DOWN (Critical) and UNREACHABLE (Unknown) states of the monitored hosts, it is necessary to tell the ServiceNav Box how the devices connect to each other from the ServiceNav Box's point of view.
To do this, trace the path a data packet should take from the ServiceNav Box to each monitored host. Each switch, router, and server that the packet must pass through is considered a step and requires you to define a parent/child relationship in the monitoring tool.
The following relationships should therefore be established to avoid having all the hosts at the remote site report as DOWN.
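As an illustration, the relationships for the example site can be written down as a simple child-to-parent mapping. This is a minimal Python sketch, not ServiceNav configuration: the host names and the `path_from_box` helper are hypothetical, and the router is assumed to have no monitored parent of its own.

```python
# Hypothetical host names for the example remote site.
# Each host maps to its parent, i.e. the next device on the path
# back toward the ServiceNav Box. None means "no monitored parent".
PARENTS = {
    "router": None,
    "switch": "router",
    "server": "switch",
    "printer": "switch",
}


def path_from_box(host, parents=PARENTS):
    """Return the devices a packet crosses from the ServiceNav Box to host."""
    chain = []
    current = host
    while current is not None:
        if current in chain:
            # A host cannot be its own ancestor.
            raise ValueError(f"cycle in parent relationships at {current!r}")
        chain.append(current)
        current = parents[current]
    return list(reversed(chain))


print(path_from_box("server"))  # ['router', 'switch', 'server']
```

Each step in the returned path corresponds to one parent/child relationship to declare in the monitoring tool.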
When the ServiceNav Box polls a host, it checks the relationships beforehand and determines that the host depends on the switch, and that the switch depends on the router, which is no longer accessible.
In the monitoring platform, only the router will appear with the DOWN (Critical) status. The other hosts will have an UNREACHABLE (Unknown) status, because the ServiceNav Box cannot confirm whether these devices are working or not. ServiceNav therefore considers them Unknown rather than Critical.
When the router returns to UP status then all hosts will be OK again.
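The distinction between DOWN and UNREACHABLE described above can be sketched in a few lines of Python. This is a simplified model of the behavior, not ServiceNav's actual implementation: the `classify` function, the host names, and the single-parent assumption are all hypothetical.

```python
# Hypothetical child-to-parent mapping for the example remote site.
PARENTS = {
    "router": None,
    "switch": "router",
    "server": "switch",
    "printer": "switch",
}


def classify(host, parents, responds):
    """Return 'UP', 'DOWN' or 'UNREACHABLE' for a host.

    responds maps each host name to True if it answered the last check.
    """
    if responds[host]:
        return "UP"
    parent = parents[host]
    # A silent host is DOWN only if the path to it is known to be working.
    if parent is None or classify(parent, parents, responds) == "UP":
        return "DOWN"
    # Otherwise the failure lies upstream and the host's state is unknown.
    return "UNREACHABLE"


# Router failure: nothing at the remote site answers.
responds = {"router": False, "switch": False, "server": False, "printer": False}
print({h: classify(h, PARENTS, responds) for h in PARENTS})
# {'router': 'DOWN', 'switch': 'UNREACHABLE', 'server': 'UNREACHABLE', 'printer': 'UNREACHABLE'}

# Server shut down: only the server stops answering.
responds = {"router": True, "switch": True, "server": False, "printer": True}
print(classify("server", PARENTS, responds))  # DOWN
```

In the first scenario only the router is reported as DOWN, so a single alert points at the real fault instead of four.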
Configuring Parent-Child Relationships in ServiceNav
From the main Monitoring screen or the Hosts configuration menu, select the host to be configured and click the "Configure" button.
Then, in the Relations tab, set up your relationships.
For the switch, set the router as the parent:
For the printer and the server, set the switch as the parent:
Note: For parent-child relationships to be applied correctly, it is MANDATORY that all hosts are monitored by the same ServiceNav Box.