I have seen this "should be separate" philosophy play out in large organizations for decades. SunCor, Dell, JP Morgan Chase, and many other large companies have wound up with dozens of non-integrated and semi-integrated tools that do not provide cross-domain correlation and that overlap heavily in features. It's not a pretty picture.
A best-of-breed monitoring solution can correlate End User Experience degradation to storage, core/edge networking, application, virtualization, and RAN. This capability is available today in the top-tier toolsets. The monitoring products that get the job done use RAM-resident collectors and RAM-resident analytics engines, and are both distributed and hierarchical so that they can scale to millions of managed nodes. New "Element Manager" or "Domain Manager" modules have recently been added to allow the inclusion of multi-dimensional root cause correlation in the 2G/3G/LTE/5G RAN and EPC/5GEPC.
The biggest mistake most selection committees make is trying to deploy into production deep packet and deep application analytics that are more appropriate for development/QA/testing. Products like UILA provide insight into application traffic and application dependency mapping. That adds cross-domain correlation that keeps production humming, and it gives powerful visibility into protocol transactions and application flows and how they are affected by server software/hardware, network, storage, and even the code of the application itself.
By feeding contextual UILA alerts into the MOM (Manager of Managers) and adding that insight to the main monitoring solution, CSPs and very large enterprises can get the downstream alert suppression they need, suppress "symptoms", and reduce MTTD (Mean Time To Diagnose) by 75%. This is a proven fact.
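To make the idea of downstream alert suppression concrete, here is a minimal sketch. It is not how any particular product does it. It assumes a hypothetical dependency map (`depends_on`) in which each node lists the things it relies on, and it treats an alert as a "symptom" whenever something upstream of that node is also alerting.

```python
def upstream_of(node, depends_on):
    """Return every node that `node` depends on, directly or indirectly."""
    seen = set()
    stack = list(depends_on.get(node, ()))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(depends_on.get(current, ()))
    return seen


def split_root_causes(alerting, depends_on):
    """Split alerting nodes into (probable root causes, suppressible symptoms).

    A node counts as a symptom if any node upstream of it is also alerting.
    """
    alerting = set(alerting)
    roots = {n for n in alerting if not (upstream_of(n, depends_on) & alerting)}
    return roots, alerting - roots


# Hypothetical topology: the app runs on a VM, on a host, which uses a switch and a SAN.
depends_on = {
    "app": ["vm"],
    "vm": ["host"],
    "host": ["switch", "san"],
}

roots, symptoms = split_root_causes({"app", "vm", "san"}, depends_on)
print("root causes:", sorted(roots))
print("suppressed symptoms:", sorted(symptoms))
```

In this toy topology the SAN alert is kept as the probable root cause, and the app and VM alerts are flagged as symptoms. A real MOM would also weigh timing, severity, and alert context, which this sketch ignores.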
In 5G, simply detecting fault and performance in NFV and VNF and the virtualization and orchestration layer is not enough. The actual RAN bearer setup/selection/service discovery/service advertisement/roaming handoff/subscriber auth and other actions need to be tracked and monitored at scale for CSPs to be able to understand how to provide and ensure service quality.
It's a difficult decision whether to bring deployment and provisioning orchestration and monitoring into the mainline NOC/SOC (Network/Security Operations Center). Probably the best thing in that case is some type of Swivel Seat, where the main monitoring tools are provided to the provisioning team for use in problem determination related to provisioning and orchestrating 5G service delivery. This is going to be the most difficult cross-domain correlation challenge, in my opinion.
------------------------------
David Redwine
Ai4Cloud
------------------------------
Original Message:
Sent: Aug 19, 2020 01:31
From: Aftab Alam
Subject: IT and Network monitoring solutions
It depends upon the number of nodes to be monitored and the different kinds of alarms expected from all surrounding IT systems. My recommendation is that we should have monitoring based on different categories:
- Core network monitoring should be separate
- IN & its solution components monitoring should be separate
- surrounding IT Systems monitoring should be separate
- VAS solutions Systems monitoring should be separate
Apart from monitoring, we need to focus on performance counters as well, such as:
- Input throughput, latency, resources (CPU, RAM, IO, HDD etc.)
- Success/failure rejection
- Dashboard
------------------------------
Aftab Alam
Ericsson Inc.
Original Message:
Sent: Jul 29, 2020 04:31
From: Vance Shipley
Subject: IT and Network monitoring solutions
Yes, you may have one (umbrella) Network Manager (NM) for the enterprise. Traditionally we have had a distinct hierarchy with a NM at the top, optional Domain Managers (DM) and Element Managers (EM). We often have a single IT DM and a number of vendor provided Element Management Systems (EMS) with northbound interfaces (NBI) to the top level Network Management System (NMS). The most recent specifications from 3GPP SA5 define a non-hierarchical structure of Management Service (MnS) producers and consumers.
The key is to converge on common information models. For Fault Management (FM) you want models derived from ITU X.733 such as used in TMF642 Alarm Management API. Vendors vary widely in how they represent alarm events but what you want to do is guide them all into the structure below depicting the increasing specificity of the information describing an alarm. The alarmType MUST be one of the enumerated values and probableCause SHOULD be one of the large number of enumerated values provided by specifications from each domain.
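As a rough illustration of guiding vendor alarms into a common model, here is a minimal sketch. The vendor field names (`category`, `cause`, `detail`, `severity`) and the `VENDOR_TYPE_MAP` table are hypothetical, and this is not the TMF642 schema itself. It only shows the idea of rejecting any alarm whose type does not map to one of the five X.733 alarm types, while passing probableCause and the more specific detail through.

```python
from dataclasses import dataclass
from typing import Optional

# The five alarm types defined by ITU-T X.733.
X733_ALARM_TYPES = frozenset({
    "communicationsAlarm",
    "qualityOfServiceAlarm",
    "processingErrorAlarm",
    "equipmentAlarm",
    "environmentalAlarm",
})

# Hypothetical vendor categories mapped onto X.733 alarm types.
VENDOR_TYPE_MAP = {
    "LINK": "communicationsAlarm",
    "KPI": "qualityOfServiceAlarm",
    "SW": "processingErrorAlarm",
    "HW": "equipmentAlarm",
    "ENV": "environmentalAlarm",
}


@dataclass(frozen=True)
class NormalizedAlarm:
    alarm_type: str                  # MUST be an enumerated value
    probable_cause: str              # SHOULD be an enumerated value per domain
    specific_problem: Optional[str]  # free-form, most specific detail
    perceived_severity: str


def normalize(raw: dict) -> NormalizedAlarm:
    """Map one vendor alarm (hypothetical field names) into the common model."""
    alarm_type = VENDOR_TYPE_MAP.get(raw.get("category", ""))
    if alarm_type not in X733_ALARM_TYPES:
        raise ValueError(f"unmapped vendor category: {raw.get('category')!r}")
    return NormalizedAlarm(
        alarm_type=alarm_type,
        # "unknown" is a placeholder here, not a value taken from any specification.
        probable_cause=raw.get("cause", "unknown"),
        specific_problem=raw.get("detail"),
        perceived_severity=raw.get("severity", "indeterminate").lower(),
    )


alarm = normalize({
    "category": "LINK",
    "cause": "lossOfSignal",
    "detail": "port 3 down",
    "severity": "MAJOR",
})
print(alarm)
```

Failing loudly on an unmapped category is a deliberate choice in this sketch: it surfaces vendors that have not yet been mapped instead of letting untyped alarms leak into the umbrella system.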
------------------------------
Vance Shipley
SigScale
Original Message:
Sent: Jul 28, 2020 00:31
From: Eurica Tan
Subject: IT and Network monitoring solutions
Is it possible to have just one system that can monitor both the IT side (infra, applications, network) and the Networks side?
Or can we have multiple IT and Network monitoring systems work together?
What are the considerations that need to be factored in for these solutions?
Thank you in advance.
#AIandData
#BusinessAssurance
------------------------------
Eurica Tan
------------------------------