With new ML subfields and applications emerging at a rapid pace, businesses are more eager than ever to reap the benefits of ML and gain competitive advantage. Achieving this objective should be straightforward, provided organizations can scale their ML initiatives beyond the pilot stage.
The McKinsey Technology Trends Outlook for 2022 lists ‘Industrializing ML’ as one of the top 14 technology trends. What’s surprising, however, is that 72% of organizations, including manufacturing businesses, have been unable to scale their AI/ML initiatives.
Often, business units in large or medium-sized organizations deploy their own teams of analysts to gather quick insights. Such a setup reduces their dependency on the central IT organization for day-to-day needs. While business analytics teams have traditionally used tools like MS-Excel, MS-Access, and Qlik Sense to rapidly draw insights from the latest available data, they are increasingly tapping into machine learning to optimize outcomes. The problem is that such teams continue to work largely in silos, with little collaboration with, or oversight from, the central IT or data organization.
This siloed setup results not only in a proliferation of analytics tools but also in redundant efforts across the organization. What’s more, most business analytics teams lack the resources to operationalize their models into production-grade systems, so the central IT or data group has to step in to deploy their machine learning models into production. The problem with this approach? The models developed by business analytics teams are typically built on small and often outdated datasets, making them less likely to scale well to the newer, broader, and larger datasets available in production systems. And when these models are re-engineered to integrate with production data, they often do not perform at the level they did in their original development environment.
That’s because the differences between a development environment and a production environment do not stem merely from dataset size or data recency. Additional factors such as production performance requirements, security, access control, and stable library versions must also be considered. Successfully moving an application from development to production also demands dedicated roles, with the exact mix depending on factors such as ML team size, the number of ML applications, and release frequency. Many ML organizations task their data scientists with the additional responsibility of production release management, which takes precious time away from developing more value-creating solutions. Other organizations hand this responsibility to their data engineers, but most data engineers lack a complete understanding of the ML application lifecycle and DevOps.
Successful operationalization of ML applications requires not only a well-defined ML operationalization process and dedicated roles but also an enabling toolset. Relying on existing DevOps toolsets will not suffice, as they are not equipped to address ML-specific requirements such as data-dependent algorithms, data lineage tracking, model versioning, model performance monitoring, drift detection, and model explainability.
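To make one of these requirements concrete, consider drift detection: a production system needs to notice when incoming data no longer resembles the data the model was trained on. The sketch below is a minimal, illustrative example using a two-sample Kolmogorov–Smirnov test; the dataframe names and threshold are hypothetical, and real MLOps tools wrap this kind of check in scheduled monitoring and alerting.

```python
# Minimal sketch: flag input-feature drift between reference (training) data
# and recent production data with a two-sample Kolmogorov-Smirnov test.
# Names (reference_df, live_df, DRIFT_P_VALUE) are illustrative only.
import pandas as pd
from scipy.stats import ks_2samp

DRIFT_P_VALUE = 0.01  # significance threshold; tune per use case


def detect_drift(reference_df: pd.DataFrame, live_df: pd.DataFrame) -> dict:
    """Return {feature: p_value} for numeric features whose distribution differs."""
    drifted = {}
    for col in reference_df.select_dtypes("number").columns:
        if col not in live_df:
            continue
        result = ks_2samp(reference_df[col].dropna(), live_df[col].dropna())
        if result.pvalue < DRIFT_P_VALUE:  # distributions differ significantly
            drifted[col] = result.pvalue
    return drifted


# Example usage (hypothetical downstream hook):
# alerts = detect_drift(training_features, last_24h_features)
# if alerts:
#     notify_ml_team(alerts)
```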
The good news is that several open-source and commercial MLOps (machine learning operations) tools are emerging that can help businesses assemble a suitable MLOps stack, provided the organization’s unique requirements are clearly articulated. With such a stack in place, organizations can simplify data management, model development, and deployment, and scale more easily across the full ML workflow.
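As a flavor of what these tools automate, the snippet below sketches experiment tracking and model versioning with MLflow, one example of an open-source MLOps tool. It assumes an MLflow tracking server with a model registry is configured; the dataset, hyperparameters, and registered model name are purely illustrative.

```python
# Minimal sketch: log a training run and register a model version with MLflow.
import mlflow
import mlflow.sklearn
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

with mlflow.start_run():
    model = RandomForestClassifier(n_estimators=100)
    model.fit(X_train, y_train)
    acc = accuracy_score(y_test, model.predict(X_test))

    mlflow.log_param("n_estimators", 100)   # reproducibility: hyperparameters
    mlflow.log_metric("accuracy", acc)      # baseline for performance monitoring
    mlflow.sklearn.log_model(               # model versioning via the registry
        model,
        artifact_path="model",
        registered_model_name="demo-classifier",  # hypothetical model name
    )
```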
Successfully industrializing ML, however, requires the right mix of ingredients: a suitable organizational structure, a clear release process, and well-defined roles. Selecting the right MLOps technology stack and securing stakeholder buy-in for a suitable ML governance model is a good place to start. Once the proper tools and environment are in place, ML adoption can scale across the organization without disrupting existing operations.