Visual Intelligence & Deep Learning · Dec 10, 2025 · Published project
Long-Tailed Object Classification with VGG16

Transfer learning for imbalanced visual recognition

This project studies visual classification under a realistic long-tailed class distribution. It compares a handcrafted-feature baseline (HOG+SVM) with VGG16 transfer learning, then evaluates how augmentation, dropout and learning-rate tuning, and fine-tuning affect class-level performance.

Python · TensorFlow · Keras · VGG16 · HOG · SVM · Open Images

Challenge

  • Real-world visual datasets often have uneven class distributions and varied backgrounds.
  • A classical baseline is useful, but may struggle with complex visual variation.
  • The evaluation needs to consider class balance, not only overall accuracy.
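To make the last point concrete, here is a minimal sketch of how overall accuracy can hide a failure on minority classes. The labels are made up for illustration and are not project data; the helper names `accuracy` and `balanced_accuracy` are hypothetical.

```python
from collections import defaultdict

def accuracy(y_true, y_pred):
    """Fraction of samples predicted correctly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall, so every class counts equally."""
    total = defaultdict(int)
    correct = defaultdict(int)
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        correct[t] += int(t == p)
    return sum(correct[c] / total[c] for c in total) / len(total)

# Toy labels (illustrative only): 8 Person, 1 Car, 1 Dog.
y_true = ["Person"] * 8 + ["Car", "Dog"]
# A degenerate model that always predicts the majority class.
y_pred = ["Person"] * 10

print(accuracy(y_true, y_pred))           # 0.8
print(balanced_accuracy(y_true, y_pred))  # about 0.333
```

The always-Person model looks acceptable on accuracy yet never recognizes a Car or a Dog, which is why class-level metrics are reported alongside accuracy.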

System architecture

Open Images data: Car · Dog · Person
Baseline: HOG features + SVM
Transfer learning: VGG16 with custom head
Evaluation: Accuracy and class metrics
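The baseline stage could be sketched roughly as follows, assuming scikit-image and scikit-learn are available. The HOG parameters, the linear kernel (`LinearSVC`), the 64x64 grayscale input size, and the helper names `hog_features` and `train_baseline` are all assumptions for illustration; the project does not state which settings it used.

```python
import numpy as np
from skimage.feature import hog
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC

def hog_features(gray_images):
    """Turn a batch of 2-D grayscale images into HOG descriptors."""
    return np.array([
        hog(img, orientations=9, pixels_per_cell=(8, 8),
            cells_per_block=(2, 2), block_norm="L2-Hys")
        for img in gray_images
    ])

def train_baseline(gray_images, labels):
    """Fit a scaled linear SVM on HOG descriptors (assumed setup)."""
    clf = make_pipeline(StandardScaler(), LinearSVC(C=1.0))
    clf.fit(hog_features(gray_images), labels)
    return clf

# Random placeholder images stand in for real data (illustrative only).
rng = np.random.default_rng(0)
images = rng.random((12, 64, 64))
labels = np.array([0, 1, 2] * 4)  # 0=Car, 1=Dog, 2=Person

clf = train_baseline(images, labels)
preds = clf.predict(hog_features(images))
```

HOG summarizes local edge orientations, so it carries no learned notion of object parts, which is one plausible reason such a baseline can struggle with cluttered backgrounds and viewpoint changes.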

Data and inputs

  • Open Images data for three classes: Car, Dog, and Person.
  • 2,402 training images and 598 validation images.
  • Imbalanced, in-the-wild visual samples with diverse backgrounds and viewpoints.
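Before modeling, it can help to quantify the imbalance. The sketch below is one simple way to do that; the label list is a toy example (the per-class counts of the project are not given here), and `class_distribution` is a hypothetical helper name.

```python
from collections import Counter

def class_distribution(labels):
    """Return per-class counts, per-class shares, and the max/min count ratio."""
    counts = Counter(labels)
    n = len(labels)
    shares = {c: k / n for c, k in counts.items()}
    imbalance_ratio = max(counts.values()) / min(counts.values())
    return counts, shares, imbalance_ratio

# Toy labels (illustrative only, not the project's class counts).
toy_labels = ["Person"] * 6 + ["Car"] * 3 + ["Dog"] * 1
counts, shares, ratio = class_distribution(toy_labels)

print(counts["Person"], ratio)  # 6 6.0
```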

Technical approach

  • Build a HOG+SVM baseline to establish traditional visual-feature performance.
  • Use VGG16 as a pretrained feature extractor with a custom dense classification head.
  • Apply augmentation, dropout, and learning-rate tuning to improve generalization.
  • Run a fine-tuning experiment by unfreezing deeper VGG16 layers.
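The transfer-learning and fine-tuning steps above can be sketched in Keras as shown below. This is an assumed setup, not the project's exact configuration: the head size (256 units), dropout rate (0.5), augmentation layers, learning rates, and the choice to unfreeze only `block5` are illustrative, and `build_transfer_model` and `unfreeze_last_block` are hypothetical helper names.

```python
import tensorflow as tf

NUM_CLASSES = 3  # Car, Dog, Person

def build_transfer_model(weights="imagenet", learning_rate=1e-3):
    """Frozen VGG16 base plus a small dense head (assumed hyperparameters).

    Inputs are assumed to be 224x224 RGB batches that were already passed
    through tf.keras.applications.vgg16.preprocess_input in the data pipeline.
    """
    base = tf.keras.applications.VGG16(
        include_top=False, weights=weights, input_shape=(224, 224, 3))
    base.trainable = False  # feature-extractor mode: convolutional weights stay fixed

    inputs = tf.keras.Input(shape=(224, 224, 3))
    # Light augmentation; these layers are only active during training.
    x = tf.keras.layers.RandomFlip("horizontal")(inputs)
    x = tf.keras.layers.RandomRotation(0.05)(x)
    x = base(x, training=False)
    x = tf.keras.layers.GlobalAveragePooling2D()(x)
    x = tf.keras.layers.Dense(256, activation="relu")(x)
    x = tf.keras.layers.Dropout(0.5)(x)
    outputs = tf.keras.layers.Dense(NUM_CLASSES, activation="softmax")(x)

    model = tf.keras.Model(inputs, outputs)
    model.compile(
        optimizer=tf.keras.optimizers.Adam(learning_rate=learning_rate),
        loss="sparse_categorical_crossentropy",
        metrics=["accuracy"])
    return model, base

def unfreeze_last_block(model, base, learning_rate=1e-5):
    """Fine-tuning step: make only the block5 layers of VGG16 trainable."""
    base.trainable = True
    for layer in base.layers:
        layer.trainable = layer.name.startswith("block5")
    # Recompile so the trainability change takes effect, with a small
    # learning rate to avoid destroying the pretrained features.
    model.compile(
        optimizer=tf.keras.optimizers.Adam(learning_rate=learning_rate),
        loss="sparse_categorical_crossentropy",
        metrics=["accuracy"])
    return model
```

A typical usage would train the model returned by `build_transfer_model()` first, then call `unfreeze_last_block(model, base)` and continue training for a few epochs. With only about 2,400 training images, updating convolutional weights can overfit, so a fine-tuned model is not guaranteed to beat a well-tuned frozen base.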

Evaluation and results

Key indicators

  • 2,402 training images / 598 validation images
  • HOG+SVM accuracy: 67.00%
  • Tuned VGG16 accuracy: 92.00%

  • HOG+SVM reached 67.00% accuracy and struggled with visual variation.
  • The tuned VGG16 workflow (frozen base with a custom head) reached 92.00% accuracy with more balanced per-class performance.
  • Fine-tuned VGG16 reached 89.30% accuracy: strong, but below the 92.00% of the tuned frozen-base setup.
  • Class-level precision, recall, and F1 stayed balanced across Car, Dog, and Person.
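For reference, the class-level metrics mentioned above can be computed as in this minimal sketch. The labels are toy values rather than project results, and `per_class_metrics` is a hypothetical helper; in practice a library report such as scikit-learn's `classification_report` provides the same numbers.

```python
def per_class_metrics(y_true, y_pred, classes):
    """Compute precision, recall, and F1 for each class (one-vs-rest)."""
    report = {}
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        report[c] = {"precision": precision, "recall": recall, "f1": f1}
    return report

# Toy labels (illustrative only): one Car is misclassified as Dog.
y_true = ["Car", "Car", "Dog", "Person"]
y_pred = ["Car", "Dog", "Dog", "Person"]

report = per_class_metrics(y_true, y_pred, ["Car", "Dog", "Person"])
print(report["Car"])  # precision 1.0, recall 0.5
print(report["Dog"])  # precision 0.5, recall 1.0
```

A single confusion between two classes lowers recall for one and precision for the other, which is the kind of asymmetry that overall accuracy cannot show.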

Implementation and code

Implementation focus

The implementation connects data preparation, modeling, evaluation, and interpretation in a structured workflow that makes the technical decisions clear.

Source code

The code is available for exploring the implementation details and extending the experiment when needed.

Scope and responsible use

The project is a focused modeling and evaluation study. Broader use should be supported by validation on additional data, robustness checks, monitoring, and domain-specific evaluation.

Future development

  • Add more classes and stronger long-tail imbalance.
  • Compare VGG16 with newer lightweight architectures.
  • Expand interpretability with saliency maps and failure-case review.

Technical contribution

The project demonstrates how to compare traditional and deep-learning approaches under realistic visual-data imbalance while using class-level evaluation to avoid misleading conclusions.