Telco networks and services are undergoing a massive transformation.
We are entering a new era of intelligent, high-bandwidth, and always-on connections. Telcos are uniquely placed to evolve and become the main value engines for innovative digital capabilities across industries and verticals. Already, they have invested heavily in upgrading and virtualizing their networks by deploying advanced network technologies for 5G and FTTx (fiber to the x, that is, any destination) infrastructure to deliver the bandwidth-hungry digital features desired by customers.
Meanwhile, modern technologies, network virtualization, and cloudification have exponentially increased the complexity of network and service operations. These complex new network architectures and dynamic service demands are creating new business models that require a complete rethink of the way telcos approach service assurance. This reinvention is necessary to deliver the desired outcomes of quality and experience while adhering to the stringent demands of reliability, scalability, security, and resiliency for networks and services.
Technologies such as artificial intelligence (AI) will play a key role in guaranteeing next-generation service assurance. According to the TCS AI for Business Study, most companies are optimistic about the potential impact of AI on their business. As many as 72% of them are currently reworking or planning to rework their business strategy or operations to leverage the transformative power of AI.
There are multiple drivers and incentives for network transformation and service assurance with AI.
A few of the most pressing ones are:
Traditional network assurance broadly covers basic and enhanced fault, performance, and service management functions.
These functions rely heavily on skilled, high-cost human operators performing manual troubleshooting and network correlation tasks and/or developing specialized engineering and business rules. The operators enable service assurance applications to perform limited automation for assisting manual interventions.
AI-led next-gen service assurance will remove or significantly reduce the dependency on human intervention and introduce advanced, data-driven, highly automated decision support for service assurance. The new system would leverage comprehensive data ingestion from network, infrastructure, and applications while applying machine learning and AI algorithms to achieve service observability that delivers the highest levels of automation, service insights, and business intent-driven operations. It will also enable telcos to develop AIOps capabilities for the modern digital ecosystem, including cloud-native and software-enabled networks. Such networks can deliver broadband, 5G, fiber, and telco cloud services while managing the complexity of assuring existing incumbent networks that supply adjacent or underlay services to the new digital infrastructure. Another business outcome that telcos can hope to realize with next-gen service assurance is reduced operating costs. To illustrate, such transformations have led some companies to achieve savings of 30% or more in total cost of operation (TCO) while enhancing overall customer experience. This, in turn, protects service revenue streams.
To develop AI-enabled next-gen service assurance and realize the expected business outcomes, companies need to:
A future-forward service assurance strategy is crucial to gaining a competitive edge with customers.
It also empowers operations teams to continually adapt to evolving network technologies and keep up with the changing needs of business. But what should telcos focus on when developing their service assurance evolution roadmap? To ensure they become future-ready organizations with the capabilities to offer great customer experience, network reliability, and quality service levels, they should:
Introducing AI technologies into the complex service assurance environment of telcos will require concerted efforts.
AIOps, large language models (LLMs), and AI/ML can be used to bridge organizational silos and increase collaboration across IT, engineering, operations, and business groups. Data science and prompt engineering experts can work with diverse stakeholders to identify relevant business and operational AI use cases, determine the corresponding data available or that must be retrieved from various sources, and then create the corresponding data stores with easily accessible data models. The objective is to select the appropriate algorithms and design effective AI/ML models to derive required business insights that support business and operations decisions. For example, GenAI and LLMs can derive operational insights from unstructured text fields normally captured in historical network and service tickets. Telcos can then augment the data with knowledge captured in network engineering manuals and documented network operations and maintenance procedures. These insights can be used to recommend the appropriate course of action to resolve network problems or trigger automated algorithms to correct service-affecting issues.
The creation and continuous enhancement of AIOps and LLMs require disciplined fine-tuning of algorithms for accuracy and performance. This fine-tuning can be carried out using systematic business requirements analysis, development, testing, and deployment in a DevOps continuous integration, continuous delivery, and continuous testing approach to achieve maximum speed and agility. The interactions among data, algorithms, models, and workflows in the development of AI use cases must be considered carefully in the AIOps and GenAI release management processes to minimize the risks of model corruption that can result in erroneous, diverging, or hallucinatory insights. This disciplined release management will require robust AI model governance and underlying data governance to maximize the business value of AI/ML/GenAI adoption.
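One way to picture the release management discipline described above is a simple gate that a candidate model must pass before it replaces the model in production. The following is a minimal sketch under stated assumptions, not a prescribed governance process: the function names `accuracy` and `release_gate`, the thresholds `min_accuracy` and `max_regression`, and the idea of representing a model as a plain Python callable are all hypothetical choices made for illustration.

```python
def accuracy(predict, labelled_examples):
    """Fraction of held-out examples the model labels correctly."""
    correct = sum(
        1 for features, label in labelled_examples if predict(features) == label
    )
    return correct / len(labelled_examples)


def release_gate(candidate, baseline, holdout, min_accuracy=0.9, max_regression=0.02):
    """Approve a candidate model only if it clears an absolute accuracy floor
    and does not regress against the production baseline by more than
    max_regression. Both thresholds are illustrative assumptions.
    """
    candidate_score = accuracy(candidate, holdout)
    baseline_score = accuracy(baseline, holdout)
    approved = (
        candidate_score >= min_accuracy
        and candidate_score >= baseline_score - max_regression
    )
    return {
        "candidate_accuracy": candidate_score,
        "baseline_accuracy": baseline_score,
        "approved": approved,
    }


# Toy held-out set: the input is a KPI reading, the label says whether it is a fault.
holdout = [(reading, reading >= 5) for reading in range(10)]


def production_model(reading):
    return reading >= 5


def drifted_model(reading):
    # Simulates a corrupted model that misses faults between 5 and 7.
    return reading >= 8


print(release_gate(drifted_model, production_model, holdout))
```

In a CI pipeline, a check of this kind would run on every model or data change, so that a corrupted or drifting model is blocked before it can produce erroneous insights in operations.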
The rollout of next-gen service assurance across the network and IT ecosystem of a telco must be gradual and evolve progressively.
This ensures minimal disruption to business and improves the acceptability of new ways of working across an organization. Organizations need to focus on three key aspects of handling network and service data to transform service assurance and gradually break down operational and tool silos. These are:
Get and store the data: The aim is to reduce dependencies on fragmented network and IT monitoring tools by migrating data collection into a single data ingestion and distribution platform. In addition to ingesting network and service data streams, this platform makes it possible to use contextual, environmental, and other information for enhanced network and service state awareness. It also allows for algorithmic interventions near the source of data to limit cloud storage consumption and data transfer costs, such as suppressing white noise and informational alerts at the source. The platform is designed to retain technical domain-level autonomy for data while allowing governed data sharing and usage for end-to-end correlation and business insights use cases. A case in point: A leading European telco deploys a master data lake to store all relevant RAN, core, and transport network, service, and environmental data (both structured and unstructured), which is then used to enable deep observability and AI-driven analysis of network health and service quality in real time.
Simplify access and serve the data: Focus on creating a unified visualization, dashboarding, and data distribution application programming interface (API) layer that makes available all data feeds from the cloud and IT infrastructure, incumbent modern OSS and IT tools, and any relevant legacy networks and tools. The consumers of the data may be business stakeholders, human operators, or machines and algorithms providing operations and business support functions, as well as telco partner systems. Another key activity is the creation of data access policies and a data governance structure to mitigate data security implications and avoid data privacy violations while supporting data residency requirements.
Use AIOps and AI algorithms to interpret the data: Combine service assurance expert-created business logic with AI-discovered rules and develop the corresponding automatic trouble resolution workflows. This will require sunsetting legacy, manually driven IT and OSS applications. A greenfield approach is preferred for implementing next-gen AI-based service assurance, where specific high-value and modern services are installed and monitored first to provide quick wins and enable advanced modern services quickly. A secondary and complementary approach leverages existing legacy tools by assessing their business value and determining if their respective product roadmaps can incorporate GenAI and AIOps as a possible upgrade path.
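Two of the ideas in the three aspects above lend themselves to a small illustration: suppressing informational alerts near the source, and combining expert-created logic with a rule derived from data. The sketch below is illustrative only. The `Alert` record, the severity labels, and the function names are hypothetical, and the mean-plus-k-standard-deviations threshold is a deliberately simple statistical stand-in for an AI-discovered rule, not a machine learning model.

```python
from dataclasses import dataclass
from statistics import mean, stdev

# Hypothetical severity labels; real network elements use vendor-specific schemes.
SUPPRESSED_SEVERITIES = {"info", "cleared"}


@dataclass(frozen=True)
class Alert:
    source: str     # network element that raised the alert
    severity: str   # e.g. "info", "minor", "major", "critical"
    metric: str     # name of the measured KPI
    value: float    # observed KPI value


def suppress_at_source(alerts):
    """Drop informational alerts and exact repeats before forwarding,
    so that less data is transferred to and stored in the cloud."""
    seen = set()
    kept = []
    for alert in alerts:
        if alert.severity in SUPPRESSED_SEVERITIES or alert in seen:
            continue
        seen.add(alert)
        kept.append(alert)
    return kept


def expert_rule(alert):
    """Expert-created business logic: critical alerts always need action."""
    return alert.severity == "critical"


def learned_threshold(history, k=3.0):
    """Stand-in for a data-derived rule: mean plus k standard deviations
    of recent KPI readings (assumes at least two readings)."""
    return mean(history) + k * stdev(history)


def needs_action(alert, history):
    """Trigger a resolution workflow if either kind of rule fires."""
    return expert_rule(alert) or alert.value > learned_threshold(history)


raw_alerts = [
    Alert("ran-1", "info", "latency_ms", 12.0),
    Alert("ran-1", "major", "latency_ms", 95.0),
    Alert("ran-1", "major", "latency_ms", 95.0),   # exact repeat
    Alert("core-2", "critical", "cpu_pct", 40.0),
    Alert("ran-3", "minor", "latency_ms", 11.0),
]
latency_history = [10.0, 11.0, 12.0, 11.0, 10.0, 12.0]

forwarded = suppress_at_source(raw_alerts)
actionable = [a for a in forwarded if needs_action(a, latency_history)]
print(len(raw_alerts), len(forwarded), len(actionable))
```

Keeping the expert rule and the data-derived rule as separate functions reflects the gradual approach described here: existing operational knowledge continues to apply while learned rules are introduced and validated alongside it.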
The expected outcome of this transformation is to deliver a common and flexible service assurance stack that breaks down operational and data silos and lowers overall TCO across networks and IT.
AI-enabled next-gen service assurance will help telcos address major shifts in their business ecosystem and operational environments.
It will also help them better manage their future networks and services. In summary, AI for telco service assurance will deliver the following outcomes:
The OSS transformations discussed in this paper will allow telcos to become truly data-driven organizations with a secure free flow of information to all stakeholders, whether human or machine, while bringing a strong culture of end-to-end service responsibility and customer experience focus.