How many images do you need for object detection?

Changsin Lee
8 min read · Dec 29, 2021

What is the minimum amount of data? How do you deal with unbalanced data issues?

· 1. YOLOv5 Model
· 2. The Korean Sidewalk Dataset
· 3. Minimum Dataset Size
· 4. Countering the Class Imbalance
· 5. How to Update the Model
· Conclusion
· Reference

From the Korean Sidewalk dataset

In this article, I want to answer three questions about the training dataset of an object detection model:

  1. What is the minimum dataset size for the maximum accuracy gain?
  2. How do you handle the class imbalance issue?
  3. What is the best way to update an already trained model with new data?

The importance of the first question cannot be overstated. Data preparation (collecting, cleaning, and annotating data) is commonly estimated to account for more than 80% of the work in AI development. So, ideally, you want to invest resources where they yield the maximum return.

Class imbalance is common in real AI projects. A class with plenty of data can usually be trained to good accuracy, while it is difficult to achieve good accuracy for infrequent objects. Under-sampling and over-sampling are common techniques used to solve the class…
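
To make the two sampling ideas concrete, here is a minimal sketch in plain Python. It assumes each sample carries exactly one class label, which real detection images with several objects would not satisfy, and the file names, labels, and helper functions (`group_by_class`, `over_sample`, `under_sample`) are hypothetical, made up only for illustration.

```python
import random
from collections import Counter, defaultdict


def group_by_class(samples):
    """Group (item, label) pairs into a dict mapping label -> list of items."""
    groups = defaultdict(list)
    for item, label in samples:
        groups[label].append(item)
    return groups


def over_sample(samples, seed=0):
    """Randomly duplicate items of smaller classes until every class
    has as many samples as the largest class."""
    rng = random.Random(seed)
    groups = group_by_class(samples)
    target = max(len(items) for items in groups.values())
    balanced = []
    for label, items in groups.items():
        # Draw the missing samples with replacement from the same class.
        extra = [rng.choice(items) for _ in range(target - len(items))]
        balanced.extend((item, label) for item in items + extra)
    return balanced


def under_sample(samples, seed=0):
    """Randomly drop items of larger classes until every class
    has as many samples as the smallest class."""
    rng = random.Random(seed)
    groups = group_by_class(samples)
    target = min(len(items) for items in groups.values())
    balanced = []
    for label, items in groups.items():
        # Draw without replacement so no item is repeated.
        balanced.extend((item, label) for item in rng.sample(items, target))
    return balanced


# Hypothetical toy data: 8 samples of a frequent class, 2 of a rare one.
samples = [(f"car_{i}.jpg", "car") for i in range(8)]
samples += [(f"bollard_{i}.jpg", "bollard") for i in range(2)]

print(Counter(label for _, label in over_sample(samples)))
print(Counter(label for _, label in under_sample(samples)))
```

Over-sampling keeps every original sample but repeats rare ones, which can encourage overfitting to those few images. Under-sampling avoids repeats but throws away data from the frequent class.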
