From Big Data to Good Data

6 min read · Jun 14, 2021

What does it take to successfully develop AI in MLOps? Nothing but good quality data.

· 1. Create tight data feedback loops
· 2. Establish quality gates for each stage
· 3. Know what you don’t know
· Conclusion

A recent talk by Andrew Ng on data-centric AI spurred a lively discussion among AI practitioners about the importance of data quality. The term MLOps was coined by Google to describe a set of practices for bringing ML experiments to production. MLOps differs from DevOps in three ways.

Code + Model + Data: In traditional software development, what you ship is the code. The DevOps pipeline therefore focuses on ensuring that the code is well tested by covering as many test cases as possible. In MLOps, the code is often merely a medium for shipping a model, and the model in turn depends heavily on the training data. For this reason, ML practitioners spend most of their time refining their models and preparing the training data.
Highly experimental: Developing an ML model is a trial-and-error process that usually starts in Jupyter notebooks and keeps evolving throughout the ML lifecycle until the product is shipped. Data scientists start from a known baseline model with a new set of data, but then they might try…

Written by Changsin Lee
