From Big Data to Good Data
Jun 14, 2021
What does it take to successfully develop AI in MLOps? Nothing but good-quality data.
· 1. Create tight data feedback loops
· 2. Establish quality gates for each stage
· 3. Know what you don’t know
· Conclusion
A recent talk by Andrew Ng on data-centric AI spurred a lively discussion among AI practitioners on the importance of data quality. The term MLOps was coined by Google to describe a set of practices for bringing ML experiments to production. MLOps differs from DevOps in three ways.
- Code + Model + Data: In traditional software development, what you ship is the code, so the DevOps pipeline focuses on ensuring that the code is well tested by covering as many test cases as possible. In MLOps, the code is often merely a medium for shipping a model, and the model in turn depends heavily on the training data. For this reason, ML practitioners spend most of their time refining their models and preparing the training data.
- Highly experimental: Developing an ML model is a trial-and-error process that usually starts in Jupyter notebooks and keeps evolving throughout the ML lifecycle until the product is shipped. Data scientists start from a known baseline model with a new set of data, but then they might try…