Performance monitoring and optimization of machine learning models in production environments
Synopsis
Significant progress has been made toward deploying machine learning (ML) models in real-world production environments. Such models can analyze large datasets, generate predictions, or classify and categorize data in seconds or minutes, and they automate high-stakes tasks such as transcribing digital recordings of court proceedings, approving loans for thousands of customers every minute, and identifying failures in manufacturing machinery from sensor data. Just as highly optimized systems for hyperparameter tuning exist today, deployed models may soon be analyzed and tuned by automated systems for performance monitoring and optimization. Given the investment, promise, and pervasiveness of ML, there is a compelling need for techniques and systems that operationalize performance monitoring, diagnostic analysis, and performance optimization of such models efficiently and accurately (Urs & Zaharia, 2019; Sharma, 2020; Mahmoud, 2021).
Although many mature ML frameworks exist today for building data processing pipelines and for training and deploying models into production, automated software systems for model performance monitoring and optimization remain scarce. Today's ML systems often lack adequate built-in capabilities for model tuning, performance monitoring, and diagnostic analysis. As a result, ML engineers spend considerable time and effort building and deploying custom solutions for these model lifecycle tasks. Such custom systems are often hard to implement, fragile, difficult to maintain, and not necessarily scalable to the ML workload of a given problem. Many also lack adequate security and privacy protections against unauthorized access, or robust end-to-end performance monitoring and analytic capabilities. These fundamental limitations account for many of the business costs and risks of deployed models. A survey of companies using ML technology found that development and operational challenges arose frequently, and that the market need is large enough for companies to pay for solutions. Performance monitoring of deployed models remains a broad and active area of research (Urs & Zaharia, 2019; Sharma, 2020; Mahmoud, 2021).
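To make the monitoring gap concrete, the following is a minimal sketch of one building block such a system might provide: a Population Stability Index (PSI) check, a statistic commonly used to detect drift between a model's training-time feature or score distribution and what it sees in production. This is an illustrative example only, not a reference to any specific system in the literature; the function name and the drift threshold convention (PSI above roughly 0.2 indicating significant drift) are assumptions for the sketch.

```python
import math

def psi(expected, actual, bins=10):
    """Population Stability Index between a baseline sample ('expected',
    e.g. training-time scores) and a production sample ('actual').
    Larger values indicate a bigger distribution shift; a common rule of
    thumb treats PSI above ~0.2 as significant drift."""
    lo = min(min(expected), min(actual))
    hi = max(max(expected), max(actual))
    width = (hi - lo) / bins or 1.0  # guard against constant samples

    def frac(sample, i):
        # Fraction of the sample falling into bin i.
        n = sum(1 for x in sample if lo + i * width <= x < lo + (i + 1) * width)
        if i == bins - 1:
            n += sum(1 for x in sample if x == hi)  # close the last bin
        return max(n / len(sample), 1e-6)  # avoid log(0) for empty bins

    return sum(
        (frac(actual, i) - frac(expected, i))
        * math.log(frac(actual, i) / frac(expected, i))
        for i in range(bins)
    )
```

In a production monitor, a check like this would run on a schedule against recent prediction logs, with an alert raised when the statistic crosses its threshold; custom-built versions of exactly this kind of plumbing are what ML engineers currently assemble by hand.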