The original article by Sanne Bouwman is in Dutch and can be found here.
Everywhere around us, we see a growing need to apply AI and Machine Learning to increase efficiency and speed up decision making. However, it often takes a long time for promising AI solutions, in which a lot of work has been invested, to reach production. Or worse, they never reach production at all! At the same time, a model that did make it into production may turn out to make predictions with degrading accuracy. Not only does this have a negative impact on the business, it also erodes trust in AI and damages the credibility of the employees involved in developing the solution. What a shame! Unfortunately, we must conclude that pushing ML models into production and maintaining them is not an easy task.
Common causes of the aforementioned issues are the increasing complexity of modelling, the many available frameworks, computational challenges, team composition, and so on. This is exactly the problem that MLOps addresses. MLOps offers the structure needed to successfully implement your data mission and vision, in which AI plays a crucial part.
MLOps vs. DevOps
MLOps is the standardization and streamlining of Machine Learning (ML) lifecycle management, where we make use of, amongst other things, traditional DevOps processes within the context of ML.
However, there are some fundamental differences between the lifecycle management of a Machine Learning model and that of traditional software. Dependency is an important aspect: not only is the data continuously subject to change, but the business requirements also shift all the time. This influences the way in which we test and monitor results. The fact that we run through the ML lifecycle with people from different disciplines poses a further challenge. Data engineers, domain/business experts, data scientists, IT teams: no two of them use exactly the same tools or have the same skill sets. For example, most data scientists are not trained software engineers. Their expertise lies in building and evaluating a model, not in building an application, which is why MLOps tends to be more experimental by nature. Data scientists try out different features, parameters, and models. With each iteration, they have to manage the code base while keeping the results reproducible.
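To make the reproducibility requirement concrete, here is a minimal sketch of what recording an experiment iteration could look like. It is not taken from the original article or from any specific MLOps tool: the function names `fingerprint` and `run_experiment`, the toy "training" step, and the sample data are all assumptions made for illustration. The idea is simply that each run stores a hash of its input data, its parameters, and its random seed, so the same inputs can be shown to give the same result.

```python
import hashlib
import json
import random


def fingerprint(obj) -> str:
    """Return a stable SHA-256 hex digest of a JSON-serialisable object."""
    payload = json.dumps(obj, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def run_experiment(rows, params, seed):
    """Hypothetical stand-in for a training run.

    It samples rows with a fixed seed and averages them, then returns a
    record of everything needed to reproduce the result.
    """
    rng = random.Random(seed)  # local RNG, so global random state is untouched
    sample = rng.sample(rows, k=params["sample_size"])
    metric = sum(sample) / len(sample)
    return {
        "data_hash": fingerprint(rows),  # identifies the exact input data
        "params": params,
        "seed": seed,
        "metric": metric,
    }


# Illustrative data and parameters (assumptions, not from the article).
rows = [0.2, 0.4, 0.1, 0.9, 0.7, 0.3]
params = {"sample_size": 3}

first = run_experiment(rows, params, seed=42)
second = run_experiment(rows, params, seed=42)

# Same data, parameters, and seed give an identical run record.
print(first == second)
```

In practice, a record like this would typically be stored alongside the code version, so that any reported result can be traced back to the data and settings that produced it.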
Why is MLOps a must-have?
To answer this question, it is important to review the difficulties we experience in the absence of a solid MLOps implementation. You will soon realize that, especially in these unpredictable and constantly changing times, problems arise in model quality and continuity. If left unchecked, your model risks producing biased predictions, which in turn has a negative business impact. Furthermore, deploying Machine Learning models to production may conflict with GDPR or other compliance agreements that are intended to protect your customers from unlawful use of their data. A good MLOps implementation contains the means and functionality to ensure that these aspects are addressed and correctly logged, and are thus explainable whenever an audit takes place.
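As an illustration of what monitoring model quality and logging the outcome might involve, the sketch below compares the accuracy of recent predictions against a baseline and produces a timestamped record that could be kept for audits. The function name `check_model_health`, the `tolerance` threshold, and the example labels are hypothetical choices for this sketch, not part of the original article or of any particular monitoring product.

```python
from datetime import datetime, timezone


def check_model_health(y_true, y_pred, baseline_accuracy, tolerance=0.05):
    """Compare recent accuracy with a baseline and return an audit record.

    The model is flagged as degraded when its accuracy falls more than
    `tolerance` below `baseline_accuracy` (both thresholds are assumptions).
    """
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    correct = sum(t == p for t, p in zip(y_true, y_pred))
    accuracy = correct / len(y_true)

    return {
        "checked_at": datetime.now(timezone.utc).isoformat(),
        "n_samples": len(y_true),
        "accuracy": accuracy,
        "baseline_accuracy": baseline_accuracy,
        "degraded": accuracy < baseline_accuracy - tolerance,
    }


# Illustrative labels: actual outcomes versus what the model predicted.
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 0, 1, 0, 0, 0, 1]

record = check_model_health(y_true, y_pred, baseline_accuracy=0.80)
print(record["accuracy"], record["degraded"])
```

A real implementation would also need to cover aspects such as bias checks and data lineage. This sketch only shows the basic pattern of measuring, comparing against an agreed baseline, and keeping a record of the check.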
What does a solid MLOps implementation look like?
The diagram below shows an overview of the minimum requirements of an MLOps implementation. In practice, some additional dimensions will come into play. We will elaborate on these in our upcoming blog posts.
When your organization shifts towards adopting MLOps, we advise you to start with the bare minimum of components. They immediately impact the MLOps strategy and are tool and framework agnostic. Depending on the organization and the maturity of its processes, not all aspects of MLOps, such as CI/CD, need to be in place in a fully automated way. The architecture of MLOps is modular and allows separate modules to be deployed in phases, regardless of the current maturity and tech landscape of the organization. Carrying out a gap analysis helps to sketch a roadmap from the current situation towards the desired MLOps architecture and is a crucial first step in any MLOps implementation. To this end, Sogeti has developed an MLOps Maturity Assessment, based on best practices and MLOps principles, that provides insight into the maturity of the organization and helps map out the gap analysis.
Each of the modules in the MLOps architecture has a defined function in the development cycle. We are excited to elaborate on these in the forthcoming blog posts where we will take a closer look at the modules and discuss them within the context of MLOps.
About Tijana Nikolic
Tijana is an AI Specialist in the Sogeti Netherlands AI CoE team with a diverse background in biology, marketing, and IT. Her vision is to bring innovative solutions to the market with a strong emphasis on privacy, quality, ethics, and sustainability. To this end, she has worked on the development of several solutions, the most notable being ADA for synthetic data, which won the Innovation of the Year award in 2020, the Quality AI Framework for ethical AI development, and the Carbon Estimator for cloud CO2 footprint estimation, all of which have had a strong business impact. She holds the Young Sogetist of the Year title for 2020 and is part of the YS EdUnite committee, which focuses on educating and uniting colleagues in Sogeti. Finally, she is an advocate for open-source contributions and knowledge sharing outside of Sogeti, and leads the ethical AI group in the Serbian AI Society.