Operationally, service providers are challenged to deploy at lightning speed and to cope seamlessly with dynamic environments, where devices come and go and reconfiguration is a recurring event governed by strict processes such as ITIL change management. When controlled infrastructure changes are carried out, reconfiguring the monitoring should be an intrinsic part of each change: removing monitoring, adding new monitoring, reconfiguring existing monitoring, and so on.
For monitoring in the IoT world to be feasible, “continuous discovery” needs to be incorporated into the monitoring environment. This means that changes affecting the monitored environment are identified automatically, so that the monitoring can be reconfigured to match.
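As a rough illustration of what continuous discovery involves, the sketch below compares the devices found on the network with the devices currently monitored and works out what to add and what to remove. The function name, the device names and the idea of identifying devices by a simple string are all assumptions made for the example, not a description of any particular product.

```python
def reconcile(discovered: set[str], monitored: set[str]) -> dict[str, set[str]]:
    """Compare discovered devices with monitored ones and plan the changes."""
    return {
        "add": discovered - monitored,     # new devices that need monitoring
        "remove": monitored - discovered,  # devices that have gone away
        "keep": discovered & monitored,    # devices already covered
    }


# Hypothetical device identifiers, for illustration only.
discovered_now = {"sensor-1", "sensor-2", "gateway-9"}
currently_monitored = {"sensor-2", "sensor-7"}

plan = reconcile(discovered_now, currently_monitored)
for action in ("add", "remove", "keep"):
    print(action, sorted(plan[action]))
```

Run on a schedule, or triggered by a change record, a comparison like this keeps the monitoring configuration in step with the environment without anyone maintaining the device list by hand.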
Service providers are tied up in KPIs, SLOs and SLAs by their clients who are, quite rightly, demanding the very best levels of service for their money. But with so many elements contributing to a “service”, how can the service provider really keep track?
The answer is big data. Or, at least, part of the answer is big data.
Missed SLAs incur service credit penalties, damaged reputation or, worst of all, loss of customers. Being able to spot a problem before it occurs is the key principle of preventative monitoring and maintenance, and taking the same approach to SLA measurement brings benefits to the service provider. Real-time computation of availability rates gives service providers the best chance of maintaining agreed levels of service, and gives customers less reason to be disgruntled.
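To make the arithmetic concrete, here is a minimal sketch of a running availability calculation. It assumes availability is measured as uptime over elapsed time and that the SLA is a simple percentage target over a fixed period; real contracts often add exclusions such as planned maintenance, which this example ignores. The function names and the figures are illustrative.

```python
def availability(elapsed_minutes: float, downtime_minutes: float) -> float:
    """Availability so far in the period, as a fraction between 0 and 1."""
    return 1.0 - downtime_minutes / elapsed_minutes


def remaining_downtime_budget(period_minutes: float,
                              downtime_minutes: float,
                              sla_target: float) -> float:
    """Minutes of downtime still allowed before the SLA target is missed."""
    allowed = period_minutes * (1.0 - sla_target)
    return allowed - downtime_minutes


# Illustrative figures: a 30-day period, a 99.9% target,
# and 30 minutes of downtime recorded after the first 10 days.
period = 30 * 24 * 60
elapsed = 10 * 24 * 60
downtime = 30.0

print(f"Availability so far: {availability(elapsed, downtime):.4%}")
print(f"Downtime budget left: "
      f"{remaining_downtime_budget(period, downtime, 0.999):.1f} minutes")
```

The second figure is the one that matters day to day: it says how much room is left before the target is missed, which a month-end report can only state after the fact.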
Tracking SLA performance in real time, rather than waiting for a report at month end, is key. It enables service providers to be strategic about how SLAs can be met and responsive to changing priorities and commercial impact. For example, they can focus on VIP clients, where the impact of getting it wrong is worst, or on services where there is a risk of missing an SLA goal.
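One simple way to turn that into a working list is to score each client by how much of its downtime allowance has been used, weighted by commercial impact. The sketch below does exactly that; the scoring formula, the field names and the weights are assumptions chosen for illustration, and a real provider would tune them to its own contracts.

```python
from dataclasses import dataclass


@dataclass
class ClientSla:
    name: str
    budget_used: float    # fraction of allowed downtime already consumed
    impact_weight: float  # hypothetical commercial weight, higher for VIPs


def prioritise(clients: list[ClientSla]) -> list[ClientSla]:
    """Order clients so the highest weighted SLA risk comes first."""
    return sorted(clients,
                  key=lambda c: c.budget_used * c.impact_weight,
                  reverse=True)


# Illustrative data only.
clients = [
    ClientSla("Standard A", budget_used=0.9, impact_weight=1.0),
    ClientSla("VIP B", budget_used=0.5, impact_weight=3.0),
    ClientSla("Standard C", budget_used=0.2, impact_weight=1.0),
]

ranked = prioritise(clients)
for client in ranked:
    print(client.name, round(client.budget_used * client.impact_weight, 2))
```

With these figures the VIP client comes first even though it has used less of its allowance, because the cost of missing its SLA is weighted more heavily.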
The great thing about gathering all these metrics is their potential for supporting all kinds of decision making. Analysis with the right tools can highlight patterns that humans just wouldn’t see, but those patterns can have a tangible impact on human workload. For example, a pattern of false notifications identified within historically gathered information could result in resetting notification thresholds to reduce or eliminate false notifications. This can be achieved automatically through “machine learning” of typical patterns of activity identified from the huge amounts of historical data collected, or such patterns can be flagged up to operators to permit them to make the modifications manually.
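As a very small illustration of the threshold example, the sketch below derives a notification threshold from historical readings instead of using a fixed value. It uses a plain statistical rule (mean plus a multiple of the standard deviation) as a stand-in for the pattern learning described above; it is not machine learning, and the readings, the fixed threshold of 75 and the multiplier are invented for the example.

```python
import statistics


def suggest_threshold(history: list[float], k: float = 2.0) -> float:
    """Suggest a threshold k standard deviations above the historical mean."""
    return statistics.fmean(history) + k * statistics.pstdev(history)


def count_alerts(history: list[float], threshold: float) -> int:
    """Count how many readings would have triggered a notification."""
    return sum(1 for value in history if value > threshold)


# Invented CPU readings (%): normal activity plus one genuine spike.
history = [70, 72, 75, 71, 74, 73, 76, 72, 95, 74]

fixed_threshold = 75.0
suggested = suggest_threshold(history)

print("Alerts with fixed threshold:", count_alerts(history, fixed_threshold))
print(f"Suggested threshold: {suggested:.1f}")
print("Alerts with suggested threshold:", count_alerts(history, suggested))
```

The suggested value could be applied automatically or simply shown to an operator as a recommendation, which matches the two options described above.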