Long-Tailed Object Classification with VGG16

Visual Intelligence & Deep Learning · Dec 10, 2025 · Published project

Transfer learning for imbalanced visual recognition

This project studies visual classification under a realistic long-tailed distribution. It compares a handcrafted-feature baseline with VGG16 transfer learning, then evaluates how augmentation, tuning, and fine-tuning affect class-level performance.


Python · TensorFlow · Keras · VGG16 · HOG · SVM · Open Images


Challenge

Real-world visual datasets often have uneven class distributions and varied backgrounds.
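A classical baseline is a useful reference point, but handcrafted features may struggle with complex visual variation.
Evaluation must report class-level metrics, not only overall accuracy, because overall accuracy can hide weak performance on under-represented classes.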
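
To make the last point concrete, here is a minimal sketch of how overall accuracy can look healthy while class-level recall collapses. The label counts are hypothetical and chosen only for illustration; they are not the project's class distribution, and the "always predict the majority class" model is a deliberately degenerate stand-in.

```python
from collections import Counter

# Hypothetical labels, chosen only to illustrate the effect; not project data.
y_true = ["Person"] * 90 + ["Car"] * 6 + ["Dog"] * 4
# A degenerate classifier that always predicts the most frequent class.
y_pred = ["Person"] * len(y_true)


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def per_class_recall(y_true, y_pred):
    """Recall for each class: correct predictions / true samples of that class."""
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: hits[label] / totals[label] for label in totals}


recalls = per_class_recall(y_true, y_pred)
# Balanced accuracy is the unweighted mean of the per-class recalls.
balanced_accuracy = sum(recalls.values()) / len(recalls)

print(f"accuracy: {accuracy(y_true, y_pred):.2f}")
print(f"recall per class: {recalls}")
print(f"balanced accuracy: {balanced_accuracy:.2f}")
```

In this toy case the accuracy is 0.90 while two of the three classes are never recognized, which is why the project reports class-level metrics alongside accuracy.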

System architecture

Open Images data: Car · Dog · Person

Baseline: HOG features + SVM

Transfer learning: VGG16 with custom head

Evaluation: Accuracy and class metrics

Data and inputs

Open Images data for three classes: Car, Dog, and Person.
2,402 training images and 598 validation images (roughly an 80/20 split of 3,000 images).
Imbalanced, in-the-wild visual samples with diverse backgrounds and viewpoints.

Technical approach

Build a HOG+SVM baseline to establish traditional visual-feature performance.
Use VGG16 as a pretrained feature extractor with a custom dense classification head.
Apply augmentation, dropout, and learning-rate tuning to improve generalization.
Run a fine-tuning experiment by unfreezing deeper VGG16 layers.
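
The transfer-learning and fine-tuning steps above can be sketched in Keras roughly as follows. This is a sketch under stated assumptions, not the project's actual code: the input size, the specific augmentation layers, the 256-unit dense layer, the dropout rate, both learning rates, and the choice to unfreeze only the `block5` layers are all illustrative, and the helper names `build_transfer_model` and `unfreeze_last_conv_block` are hypothetical.

```python
import tensorflow as tf
from tensorflow.keras import layers, models
from tensorflow.keras.applications import VGG16

NUM_CLASSES = 3  # Car, Dog, Person


def build_transfer_model(weights="imagenet", input_shape=(224, 224, 3),
                         dropout_rate=0.5, learning_rate=1e-4):
    """Frozen VGG16 base plus a small dense head (all hyperparameters assumed)."""
    base = VGG16(weights=weights, include_top=False, input_shape=input_shape)
    base.trainable = False  # use VGG16 purely as a feature extractor

    # Inputs are assumed to be already passed through
    # tf.keras.applications.vgg16.preprocess_input.
    model = models.Sequential([
        tf.keras.Input(shape=input_shape),
        layers.RandomFlip("horizontal"),   # augmentation, active only in training
        layers.RandomRotation(0.1),
        base,
        layers.GlobalAveragePooling2D(),
        layers.Dense(256, activation="relu"),
        layers.Dropout(dropout_rate),
        layers.Dense(NUM_CLASSES, activation="softmax"),
    ])
    model.compile(
        optimizer=tf.keras.optimizers.Adam(learning_rate=learning_rate),
        loss="sparse_categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model, base


def unfreeze_last_conv_block(model, base, learning_rate=1e-5):
    """Fine-tuning variant: train only the deepest VGG16 block (an assumption)."""
    base.trainable = True
    for layer in base.layers:
        layer.trainable = layer.name.startswith("block5")
    # Recompile so the new trainable flags take effect, with a lower learning rate.
    model.compile(
        optimizer=tf.keras.optimizers.Adam(learning_rate=learning_rate),
        loss="sparse_categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model
```

A typical use would be to train the frozen model first, then call `unfreeze_last_conv_block(model, base)` and continue training for a few epochs. The lower learning rate in the second stage is a common precaution against overwriting the pretrained convolutional weights.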

Evaluation and results

Key indicators

2,402 training images / 598 validation images
HOG+SVM accuracy: 67.00%
Tuned VGG16 accuracy: 92.00%

HOG+SVM reached 67.00% accuracy and struggled with visual variation.
The tuned VGG16 workflow, with the pretrained base kept frozen, reached 92.00% accuracy with more even performance across classes.
Fine-tuned VGG16 reached 89.30%, strong but 2.70 percentage points below the tuned frozen-transfer setup.
Class-level precision, recall, and F1 stayed balanced across Car, Dog, and Person.
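
For reference, class-level precision, recall, and F1 can be derived from a confusion matrix as in the sketch below. The matrix is a toy example invented for illustration and does not reproduce the project's results; the function name `class_metrics` is hypothetical.

```python
CLASSES = ["Car", "Dog", "Person"]

# Toy confusion matrix (NOT project results): rows are true classes,
# columns are predicted classes, both in the order given by CLASSES.
confusion = [
    [8, 1, 1],  # true Car
    [2, 6, 2],  # true Dog
    [0, 1, 9],  # true Person
]


def class_metrics(confusion, classes):
    """Return precision, recall, and F1 for each class."""
    metrics = {}
    for i, name in enumerate(classes):
        true_positives = confusion[i][i]
        predicted_as_class = sum(row[i] for row in confusion)  # column sum
        actually_class = sum(confusion[i])                     # row sum
        precision = true_positives / predicted_as_class if predicted_as_class else 0.0
        recall = true_positives / actually_class if actually_class else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        metrics[name] = {"precision": precision, "recall": recall, "f1": f1}
    return metrics


for name, m in class_metrics(confusion, CLASSES).items():
    print(f"{name}: precision={m['precision']:.2f} "
          f"recall={m['recall']:.2f} f1={m['f1']:.2f}")
```

Comparing these three numbers per class, rather than a single accuracy figure, shows whether any one class is being traded off against the others.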

Implementation and code

Implementation focus

The implementation runs as a single workflow, from data preparation through modeling and evaluation to interpretation, so that each technical decision can be followed from start to finish.

Source code

The code is available for exploring the implementation details and extending the experiment when needed.


Scope and responsible use

The project is a focused modeling and evaluation study. Broader use should be supported by validation on additional data, robustness checks, monitoring, and domain-specific evaluation.

Future development

Add more classes and stronger long-tail imbalance.
Compare VGG16 with newer lightweight architectures.
Expand interpretability with saliency maps and failure-case review.

Technical contribution

The project demonstrates how to compare traditional and deep-learning approaches under realistic visual-data imbalance while using class-level evaluation to avoid misleading conclusions.