From Big Data to Good Data

Changsin Lee
6 min read · Jun 14, 2021

What does it take to successfully develop AI in MLOps? Nothing but good quality data.

· 1. Create tight data feedback loops
· 2. Establish quality gates for each stage
· 3. Know what you don’t know
· Conclusion

Photo by Mike Benna on Unsplash

A recent talk by Andrew Ng on data-centric AI spurred a lively discussion among AI practitioners about the importance of data quality. The term MLOps was coined by Google to describe a set of practices for bringing ML experiments to production. MLOps differs from DevOps in three ways.

  1. Code + Model + Data: In traditional software development, what you ship is the code, so the DevOps pipeline focuses on making sure the code is well tested by covering as many test cases as possible. In MLOps, the code is often merely a medium for shipping a model, and the model in turn depends heavily on its training data. For this reason, ML practitioners spend most of their time refining their models and preparing the training data.
  2. Highly experimental: Developing an ML model is a process of trial and error. It usually starts in Jupyter notebooks and keeps evolving throughout the ML lifecycle until the product ships. Data scientists start from a known baseline model with a new set of data, but then they might try…

Written by Changsin Lee

AI/ML Enthusiast | Software Engineer | ex-Microsoftie | ex-Amazonian
