Trend and Challenges of Digital Network O&M
4G changes lives, while 5G changes societies. 5G networks must support ultra-high bandwidth, ultra-low latency, and massive connectivity in order to serve vertical-industry applications such as autonomous driving, industrial control, smart grids, high-bandwidth video, and AR/VR.
Diversified services, flexible deployment requirements, and complicated network forms pose great challenges to 5G network O&M, which cannot be addressed by traditional manual and semi-automatic O&M modes.
AI technologies have natural advantages in compute-intensive data analysis, cross-domain feature mining, and dynamic policy generation. Introducing AI can further improve the benefits of network deployment and O&M, raise resource utilization, and reduce operating costs.
Inevitable Intelligent O&M for 5G Network Slicing
Network slicing is an important feature of 5G networks. Through flexible allocation of network resources and flexible combination of capabilities, multiple logical subnets with different network characteristics can be virtualized on top of one physical network to meet customization requirements in different scenarios. Network slicing O&M provides full lifecycle management for slice instances, including design, provisioning, SLA assurance, and termination. Network slicing brings great flexibility but also increases O&M complexity, so enhancing automated slice management with AI is inevitable.
Key Technologies of AI-Driven Intelligent Slice O&M
With AI introduced, the slice management system automatically executes management policies based on the decisions output by the AI training platform. This equips the network with intelligent perception, modeling, provisioning, analysis, judgment, and prediction capabilities, and helps balance slicing flexibility against management complexity.
1. Intelligent Slice Provisioning
- Service customization: The system uses data collection and machine learning to deeply mine service features and provide customized and securely isolated private slice networks.
- Network planning: The system comprehensively analyzes the available resources of the entire network and uses AI to continuously train and optimize the planning algorithm, so that service demands are rapidly converted into network demands. This helps resolve the conflict between differentiated SLAs and network construction costs.
- Model design: Based on the analysis result of the AI training platform, the system intelligently orchestrates and schedules virtualized resources, and automatically outputs templates of slice lifecycle, policy rules, and slice optimization deployment.
- Automatic deployment: Combined with the automatic integration deployment tool and slicing model, the system automatically instantiates resources at all layers, matches test scenarios and cases intelligently, and automatically performs slicing tests. The deployment period is shortened from several weeks to several days.
- E2E service activation: Following the configuration template definition, the system automatically distributes configuration parameters to each subnet, calculates the derived parameters to form a batch script, and automatically activates services through the configuration channel.
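The E2E service activation step can be sketched roughly as follows. This is only an illustration under stated assumptions, not the slice management system's actual implementation: the template fields, the subnet names (`ran`, `transport`, `core`), the latency split weights, and the `set ...` command syntax are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List

# Hypothetical share of the end-to-end latency budget given to each subnet.
LATENCY_WEIGHTS = {"ran": 0.5, "transport": 0.3, "core": 0.2}


@dataclass
class SliceTemplate:
    slice_id: str
    e2e_latency_ms: float   # end-to-end latency target
    bandwidth_mbps: float   # guaranteed bandwidth, applied to every subnet


def split_parameters(template: SliceTemplate) -> Dict[str, Dict[str, float]]:
    """Split end-to-end targets into per-subnet configuration parameters."""
    return {
        subnet: {
            "latency_ms": round(template.e2e_latency_ms * weight, 3),
            "bandwidth_mbps": template.bandwidth_mbps,
        }
        for subnet, weight in LATENCY_WEIGHTS.items()
    }


def build_batch_script(template: SliceTemplate) -> List[str]:
    """Render the per-subnet parameters as a list of configuration commands."""
    lines = []
    for subnet, params in split_parameters(template).items():
        for name, value in params.items():
            lines.append(f"set {subnet} slice={template.slice_id} {name}={value}")
    return lines


# Illustrative template values, not taken from any real deployment.
template = SliceTemplate(slice_id="urllc-01", e2e_latency_ms=10.0, bandwidth_mbps=50.0)
script = build_batch_script(template)
```

In a real system the resulting script would be pushed through the configuration channel of each subnet; here it is simply returned as a list of strings so it can be inspected.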
2. Intelligent Slicing SLA Assurance
Network slicing assurance means guaranteeing the SLA that users require. The intelligent QoS service capability analyzes service requirements, network capabilities, and user characteristics, makes multi-criteria decisions, and introduces QoS supervision feedback to achieve closed-loop SLA assurance.
- QoS capability assurance: The system collects massive service data (such as service types and time requirements), network data (such as the number of connections, load, flow rate, and delay), and user data (such as subscriber level, communication habits, time, and location). Through intelligent analysis and determination, the system evaluates the current service experience in real time and forms one or more optimal sets of QoS parameters to support the best decision and control.
- QoS differentiated service: The system makes intelligent determination based on time, location, access service, user communication habits, user subscription requirements, and real-time network load pressure to form the best matching QoS control parameters and provide real-time differentiated services for users.
- QoS prediction and early warning: Based on massive data collection, modeling, and analysis, the system predicts QoS and gives early warning of QoS capability in extreme cases. These warnings serve as a reference for O&M assurance actions such as terminating services in advance or changing how a service operates. For example, using neural network and linear regression algorithms, the system can predict the growth rate over the same period, analyze peak and average traffic, and predict network congestion in order to implement dynamic scheduling or traffic acceleration.
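As a rough illustration of the prediction and early-warning idea, the sketch below fits a straight line to past peak traffic and reports the first future period in which the predicted peak would exceed capacity. It covers only the linear regression part and omits the neural network; the function names, the sample traffic figures, and the capacity value are assumptions made for the example.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def predict_congestion(peak_traffic_gbps, capacity_gbps, horizon):
    """Return the first future period whose predicted peak exceeds capacity, or None."""
    periods = list(range(len(peak_traffic_gbps)))
    slope, intercept = fit_line(periods, peak_traffic_gbps)
    for step in range(1, horizon + 1):
        period = periods[-1] + step
        if slope * period + intercept > capacity_gbps:
            return period
    return None


# Hypothetical daily peak traffic of one slice, in Gbit/s (periods 0 to 4).
history = [4.0, 4.5, 5.0, 5.5, 6.0]
warning_period = predict_congestion(history, capacity_gbps=7.2, horizon=7)
```

With this made-up history the fitted growth is 0.5 Gbit/s per period, so the early warning points at period 7, leaving time for dynamic scheduling before congestion occurs.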
3. Intelligent Closed-loop Slice O&M
To efficiently manage network slices and reduce O&M complexity and costs, the slice management system must have intelligent closed-loop assurance capabilities such as network self-perception and self-adjustment.
At present, network policies are still configured statically and manually, which ignores the actual network condition. Once AI is introduced, the system can perform intelligent analysis and determination based on time, location, mobility characteristics, traffic, congestion level, and load status in the network. It then implements intelligent scheduling according to the dynamic slice management policies output by the AI training platform.
In addition, intelligent real-time and historical analysis provides reference data such as health scores, anomaly detection and prediction, and root cause analysis. Based on such data, the system performs capacity optimization, configuration optimization, resource scaling, and fault localization, achieving closed-loop slice optimization.
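One iteration of such a closed loop might look like the sketch below. It is a simplified illustration only: the z-score anomaly test, the health score formula, the thresholds, and the action names are all assumptions chosen for the example, not values defined by the slice management system.

```python
from statistics import mean, pstdev


def is_anomalous(history, current, threshold=3.0):
    """Flag a sample that is more than `threshold` standard deviations from the history mean."""
    mu = mean(history)
    sigma = pstdev(history)
    if sigma == 0:
        return current != mu
    return abs(current - mu) / sigma > threshold


def health_score(load_ratio, latency_ms, latency_target_ms):
    """Hypothetical 0-100 score that penalises load above 70% and latency above target."""
    load_penalty = min(max(0.0, load_ratio - 0.7) / 0.3, 1.0) * 50.0
    latency_penalty = min(max(0.0, latency_ms / latency_target_ms - 1.0), 1.0) * 50.0
    return 100.0 - load_penalty - latency_penalty


def decide_action(score):
    """Map the health score to a closed-loop action (illustrative thresholds)."""
    if score < 60:
        return "scale_out"
    if score < 85:
        return "optimize_configuration"
    return "no_action"


def closed_loop_step(latency_history_ms, latency_ms, load_ratio, latency_target_ms):
    """Perceive (anomaly check), analyse (health score), and decide (action)."""
    score = health_score(load_ratio, latency_ms, latency_target_ms)
    return {
        "anomaly": is_anomalous(latency_history_ms, latency_ms),
        "health_score": score,
        "action": decide_action(score),
    }


# Made-up measurements for a single slice.
result = closed_loop_step([8.0, 8.2, 7.9, 8.1, 8.0], latency_ms=15.0,
                          load_ratio=1.0, latency_target_ms=10.0)
```

The point of the structure is that the action is derived from measured network state on every iteration, instead of from a statically configured policy.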
Intelligent Closed-Loop O&M of 5G Slicing
4. Intelligent Slice Troubleshooting
1) Intelligent troubleshooting
The system analyzes the time, location, event description, and other multidimensional features of a slice alarm, and identifies relationships among alarm clues based on historical frequency information, cross-NE information, inter-network and intra-network information, and intra-service association information. Using current alarms, statistics, logs, and other information together with the trained rules, the system deduces the matching alarm root cause.
Intelligent troubleshooting consists of a training process, a derivation process, and closed-loop optimization.
- Training process:
  - Extracting data
  - Cleaning data and removing invalid data
  - Normalizing the format and partitioning the data to form a transaction data set for association mining
  - Running the algorithm: Based on resource relationships, alarm codes, and time windows, the system uses the AI algorithm to make a comprehensive judgment and establish knowledge of primary-secondary alarm relationships.
  - Analyzing the result: The system creates corresponding RCA rules from the obtained knowledge and stores them in the rule library.
- Derivation process: The system monitors alarms in real time, periodically samples resources and configuration data, and uses learned rules to make comprehensive judgement on alarm data, resource data, service bearer relationships, and time sequence in the existing network, to find out the root cause and fix the fault automatically or by prompting the O&M personnel.
- Closed-loop optimization: The system updates, modifies, and improves the rule library in accordance with actual rule application or expert judgment.
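The training and derivation processes can be sketched as follows. The article does not name the association mining algorithm, so this example substitutes a simple ordered co-occurrence count with support and confidence thresholds. It uses only alarm codes and time windows and leaves out resource relationships; the alarm codes, thresholds, and function names are hypothetical.

```python
from collections import Counter, defaultdict


def build_transactions(alarms, window_s):
    """Partition (timestamp, alarm_code) records into fixed time windows."""
    windows = defaultdict(list)
    for timestamp, code in sorted(alarms):
        windows[timestamp // window_s].append(code)
    return list(windows.values())


def mine_rules(transactions, min_support=2, min_confidence=0.8):
    """Learn {secondary_code: {primary_codes}} from ordered co-occurrence counts."""
    pair_counts = Counter()   # (earlier_code, later_code) -> number of windows
    code_counts = Counter()   # code -> number of windows containing it
    for codes in transactions:
        ordered = list(dict.fromkeys(codes))  # de-duplicate, keep time order
        code_counts.update(ordered)
        for i, earlier in enumerate(ordered):
            for later in ordered[i + 1:]:
                pair_counts[(earlier, later)] += 1
    rules = defaultdict(set)
    for (primary, secondary), support in pair_counts.items():
        confidence = support / code_counts[secondary]
        if support >= min_support and confidence >= min_confidence:
            rules[secondary].add(primary)
    return dict(rules)


def find_root_causes(active_codes, rules):
    """Keep only active alarms that no other active alarm explains."""
    active = set(active_codes)
    return {code for code in active if not (rules.get(code, set()) & active)}


# Made-up training alarms: (timestamp in seconds, alarm code).
training_alarms = [
    (0, "LINK_DOWN"), (5, "CELL_OUT"), (8, "KPI_DROP"),
    (60, "LINK_DOWN"), (70, "CELL_OUT"),
    (130, "LINK_DOWN"), (135, "CELL_OUT"), (140, "KPI_DROP"),
    (200, "KPI_DROP"),
]
rule_library = mine_rules(build_transactions(training_alarms, window_s=60))
root_causes = find_root_causes(["LINK_DOWN", "CELL_OUT"], rule_library)
```

Closed-loop optimization would then correspond to editing `rule_library` when a rule proves wrong in practice or when an expert overrides it.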
2) Effect evaluation
The effect of intelligent troubleshooting is measured directly by the number of effective alarm root cause rules and the alarm compression ratio, or indirectly by the reduction rate of the number of work orders. AI-driven intelligent alarm locating can generally achieve a reduction of more than 60%.
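The two ratio metrics can be computed as in the short sketch below. The article does not give formulas, so the definitions used here (the share of raw alarms suppressed, and the relative drop in work orders) are assumptions, and the sample counts are invented purely to show the arithmetic.

```python
def alarm_compression_ratio(raw_alarms, root_cause_alarms):
    """Share of raw alarms suppressed after root-cause correlation (assumed definition)."""
    if raw_alarms == 0:
        return 0.0
    return 1.0 - root_cause_alarms / raw_alarms


def work_order_reduction_rate(orders_before, orders_after):
    """Relative drop in work orders after intelligent troubleshooting is enabled."""
    if orders_before == 0:
        return 0.0
    return (orders_before - orders_after) / orders_before


# Invented counts, used only to demonstrate the calculation.
compression = alarm_compression_ratio(raw_alarms=1000, root_cause_alarms=350)
reduction = work_order_reduction_rate(orders_before=200, orders_after=80)
```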
5G smart slicing networks will experience three phases: intra-domain exploration, cross-domain integration, and high autonomy.
First, each sub-domain of the 5G network will be integrated with AI to provide preliminary intelligence in network resource allocation and other areas, based on big data and machine learning.
Second, as the technologies develop, AI will be able to learn from 5G network big data across domains, and integrated intelligence will emerge in some sub-domains, achieving intermediate intelligence.
Finally, with the rapid development of 5G and AI technologies, network-wide coordination and high autonomy will be realized. This will greatly improve the efficiency of network lifecycle management and achieve advanced intelligence, in which humans control networks by expressing their intent.
It is foreseeable that the combination of AI and 5G slicing network will produce a dazzling spark and promote the rapid development and evolution of networks.
YAN LIANG, MANO Product System Architect, ZTE Corporation