How many images do you need for object detection?
What is the minimum amount of data? How do you deal with unbalanced data issues?
· 1. YOLOv5 Model
· 2. The Korean Sidewalk Dataset
· 3. Minimum Dataset Size
· 4. Countering the Class Imbalance
· 5. How to Update the Model
· Conclusion
· Reference
In this article, I want to answer three questions about the training dataset of an object detection model:
- What is the minimum dataset size for the maximum accuracy gain?
- How do you handle the class imbalance issue?
- What is the best way to update an already trained model with new data?
The importance of the first question cannot be overstated. Pre-processing data (collecting, cleaning, and annotating it) accounts for more than 80% of the work in AI development. So, ideally, you want to invest resources where they bring the maximum return.
Class imbalance is a common problem in any real AI project. A class with a lot of data can be trained to good accuracy, while it is difficult to achieve good accuracy for infrequent objects. Under-sampling and over-sampling are common techniques used to solve the class…