As we saw in the previous articles, handling false positives in monitoring is essential. Left untreated, they drastically increase the cost of operating the information system (IS) through useless and costly interventions.
ServiceNav, as a monitoring solution, offers several ways to avoid false positives, in particular through correct configuration of the monitoring: thresholds and additional controls.
But where to start?
A recurring question when configuring the monitoring and dealing with false positives is: where to start?
In reality, if no work has been done yet, there are two types of false positives:
- Those that are permanently in Alert or Critical status and never change state.
  - This is often a threshold problem. For example, the number of users connected to an MS SQL database is always between 100 and 250, while the critical threshold is set at 30. In this case, the thresholds need to be adjusted.
- Those that flap ("bagotent" in French), i.e. their status changes from OK to not OK (Alert, Critical or Unknown) several times a day.
  - This may be a threshold problem, but it is more likely that additional controls have not been put in place.
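The two categories above can be told apart from an element's status history. The following is a minimal sketch, not a ServiceNav feature: the `classify` function, the list-of-strings history format and the cut-off of three OK to not-OK transitions are all assumptions chosen for illustration.

```python
from typing import List

OK = "OK"


def classify(history: List[str], flap_threshold: int = 3) -> str:
    """Classify one element from its chronological list of sampled statuses.

    `history` and `flap_threshold` are hypothetical: adapt them to whatever
    your monitoring export really provides.
    """
    if not history:
        raise ValueError("history must contain at least one status")
    if all(status == OK for status in history):
        return "healthy"
    # Permanently in the same non-OK status: usually a threshold problem.
    if len(set(history)) == 1:
        return "stuck"
    # Count how many times the element went from OK to a non-OK status.
    transitions = sum(
        1 for prev, cur in zip(history, history[1:])
        if prev == OK and cur != OK
    )
    if transitions >= flap_threshold:
        return "flapping"
    return "occasional"


print(classify(["Critical"] * 6))
print(classify(["OK", "Critical", "OK", "Alert", "OK", "Unknown"]))
```

Elements classified as "stuck" are candidates for a threshold review, while "flapping" elements are candidates for additional controls.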
How to identify the elements to be processed?
For the first case, the elements that are permanently in a non-OK status, the technical operation view of ServiceNav and its filters are the best place to work.
I look for all the elements that are not OK and not acknowledged, and I examine each of them: I check the metric tab, I look at the history, and so on.
- Real problem? I take it into account: I open a ticket through the ticketing integration (the item will be acknowledged) and the RUN team will take care of it.
- False positive? I change the thresholds.
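The threshold review can be sketched in a few lines. This is only an illustration under stated assumptions: the check is assumed to go Critical when the value is above the threshold, the function names are hypothetical, and the 20% safety margin is an arbitrary heuristic, not a ServiceNav recommendation.

```python
from typing import Sequence


def always_breached(samples: Sequence[float], critical_threshold: float) -> bool:
    """Return True if every observed value exceeds the critical threshold.

    Assumes the check goes Critical when the value is ABOVE the threshold.
    """
    return min(samples) > critical_threshold


def suggest_critical_threshold(samples: Sequence[float], margin: float = 0.2) -> int:
    """Hypothetical heuristic: highest observed value plus a safety margin."""
    return round(max(samples) * (1 + margin))


# Illustrative samples inside the 100-250 range of the MS SQL example.
connected_users = [100, 140, 180, 220, 250]

if always_breached(connected_users, critical_threshold=30):
    print("Permanent false positive, suggested critical threshold:",
          suggest_critical_threshold(connected_users))
```

With a critical threshold of 30 and values that never drop below 100, the check can never return to OK, which is exactly the "permanently critical" situation described earlier.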
For the second case, the elements that are flapping, ServiceNav also provides tools, including the "Summary of Operating Information" report.
This is an Excel file that lists, over a period of 7, 30 or 90 days of your choice, the number of transitions into each status and the time spent in each status for every element.
Here is an example over 7 days (sorted by the column "Number of critical passages" in descending order):
We can quickly see that by processing the first 10 lines (i.e. 10 devices or services out of the 1315 in the file), we would reduce the number of alerts by more than 50% (386 out of the 736 in the file).
The ROI is therefore immediate!
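The same reasoning can be reproduced on any export of the report. The sketch below assumes the "Number of critical passages" column has already been loaded into a plain Python list; the `top_n_share` helper is hypothetical, and the per-row values are invented so that only the totals (1315 rows, 736 critical passages, 386 in the top 10) match the figures quoted in the article.

```python
from typing import Sequence


def top_n_share(critical_counts: Sequence[int], n: int = 10) -> float:
    """Share of all critical passages caused by the n noisiest elements."""
    ordered = sorted(critical_counts, reverse=True)
    total = sum(ordered)
    if total == 0:
        return 0.0
    return sum(ordered[:n]) / total


# Invented distribution: 10 noisy rows (386 passages), 350 rows with one
# passage each, and 955 quiet rows, for 1315 rows and 736 passages in total.
counts = [40] * 9 + [26] + [1] * 350 + [0] * 955

share = top_n_share(counts, n=10)
print(f"Top 10 elements cause {share:.1%} of critical passages")
```

Running the same calculation with different values of `n` shows how quickly the benefit flattens out, which helps decide how many elements are worth treating first.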
Dealing with false positives in monitoring is a must. ServiceNav offers tools and methods to optimise this work, with consultants trained in its tools to help you if necessary.
For ServiceNav, handling false positives in monitoring remains a priority. The entire BigData stack deployed since version 4.0 will make it possible to bring even more intelligence to the product: in the coming months, it will offer automatic threshold adjustments or suggestions for additional controls.