Project Analysis
CIFAR-10 Image Classification with CNN & Transfer Learning
About Project
This project builds a robust image classification system on the CIFAR-10 dataset using a custom CNN and an ImageNet-pretrained ResNet50, and deploys the best model in a Django web application with real-time webcam prediction.
Problem Statement
🚨 Real-World Challenges in Image Classification
- Small input size: CIFAR-10 images are only 32×32, limiting detail and context
- Overfitting: Models often memorize training data rather than generalize
- Poor robustness: Models struggle with varied lighting, backgrounds, blur, or webcam noise
- Deployment hurdles: Real-time webcam predictions suffer from latency and performance issues, especially on low-resource hardware
Objective
🎯 Project Objective
To build a robust and deployable image classification system capable of performing well in real-world scenarios.
🛠️ Key Components
- Convolutional Neural Networks (CNNs): Custom or pre-trained models for baseline performance
- Data Augmentation: Improve generalization with transformations (e.g., flip, zoom, shift, blur)
- Transfer Learning: Utilize ResNet50 with fine-tuning for efficient feature extraction
- Real-Time Prediction: Capture and classify images from a webcam stream
- Deployment: Integrated into a web application using Django (optionally with FastAPI for backend inference)
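As a concrete starting point for the components above, here is a minimal sketch of loading and normalizing CIFAR-10 with tf.keras.datasets; the project's actual preprocessing may differ.

```python
import tensorflow as tf

# CIFAR-10: 50,000 training and 10,000 test images, 32x32 RGB, 10 classes
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.cifar10.load_data()

# Scale pixel values to [0, 1] for stable training
x_train = x_train.astype("float32") / 255.0
x_test = x_test.astype("float32") / 255.0

class_names = ["airplane", "automobile", "bird", "cat", "deer",
               "dog", "frog", "horse", "ship", "truck"]
```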
Proposed Solution
Model Training & Experiments
- Trained a custom CNN on the CIFAR-10 dataset
- Experimented with:
- Optimizers: SGD, RMSProp, Adam
- Batch Sizes: 16, 32, 64
- Dropout Rates: 0.1, 0.2, 0.3, 0.5
- Model Variants: Deeper vs Wider architectures
- Applied data augmentation (rotation, zoom, contrast, horizontal flipping)
- Implemented Transfer Learning with ResNet50:
  - Adjusted for CIFAR-10's 32×32 input size
  - Froze initial layers to retain pre-trained weights
Deployment
- Deployed the best model using a Django-based web interface
- Tested real-time webcam input to perform live image classification
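A rough sketch of the real-time prediction step: grab a frame with OpenCV, shrink it to the 32×32 input the model expects, and classify it. The saved-model path is hypothetical, and the Django request/response wiring around this is omitted.

```python
import cv2
import numpy as np
from tensorflow import keras

CLASS_NAMES = ["airplane", "automobile", "bird", "cat", "deer",
               "dog", "frog", "horse", "ship", "truck"]

model = keras.models.load_model("best_cnn.keras")  # hypothetical saved-model path

cap = cv2.VideoCapture(0)      # open the default webcam
ok, frame_bgr = cap.read()     # grab a single frame
cap.release()

if ok:
    # OpenCV returns BGR; convert to RGB and shrink to CIFAR-10's 32x32 size
    frame_rgb = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2RGB)
    small = cv2.resize(frame_rgb, (32, 32)).astype("float32") / 255.0
    probs = model.predict(small[np.newaxis, ...])[0]
    print("Prediction:", CLASS_NAMES[int(np.argmax(probs))])
```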
Technologies Used
- Python, TensorFlow/Keras: CNN model development and training
- Pandas, NumPy, Matplotlib: Data preprocessing and visualization
- ResNet50 (Transfer Learning): Used as an ImageNet-pretrained backbone
- Django: Web interface and deployment
- OpenCV: Real-time webcam image capture and handling
- Google Colab (T4 GPU): Model training and experimentation environment
Challenges Faced
- Slow Training: Each training run took over 589 seconds, especially with deeper CNN architectures
- Overfitting: Baseline CNN showed signs of overfitting despite using regularization and dropout
- Transfer Learning Issues: ResNet50 underperformed due to input size mismatch (32×32 vs. 224×224)
- Heavy Deployment: Django web app crashed under memory load during real-time webcam predictions
- Real-time Noise: Webcam inputs had noise, blur, and resolution issues, leading to poor classification results
- Compute Requirements: Fine-tuning ResNet50 required image resizing and high compute resources (see the resizing sketch below)
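To make the last point concrete: matching ResNet50's native 224×224 resolution multiplies the pixel count of every 32×32 CIFAR-10 image by roughly 49×. A hedged sketch of what that resizing looks like (the batch size is illustrative):

```python
import tensorflow as tf

(x_train, _), _ = tf.keras.datasets.cifar10.load_data()
batch = tf.convert_to_tensor(x_train[:64], dtype=tf.float32)   # (64, 32, 32, 3)

resized = tf.image.resize(batch, (224, 224))                   # (64, 224, 224, 3)

# 224*224 / (32*32) = 49x more pixels per image, so upscaling the full
# 50,000-image training set in memory is usually infeasible; resizing inside
# a tf.data pipeline or a Resizing layer is the common workaround.
print(batch.shape, "->", resized.shape)
```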
Methodology
🧪 CNN Model Testing (24 Combinations)
- Batch Sizes: 16, 32, 64
- Optimizers: SGD, RMSProp, Adam
- Dropout Rates: 0.1, 0.2, 0.3, 0.5
- Model Variants:
- Deeper: 3 convolutional layers
- Wider: 2 layers × 64 filters
- Best Configuration: RMSProp + Deeper model + Dropout 0.1
- Best Validation Accuracy: 56.86%
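The 24 runs are not listed individually; one reading that yields 24 is optimizers × dropout rates × model variants with the batch size held fixed, and the sketch below scripts that sweep under that assumption. build_model is a hypothetical helper (for example, the deeper-CNN builder sketched earlier plus a two-layer, 64-filter "wider" variant), and x_train / y_train are the arrays from the loading sketch.

```python
import itertools

optimizers    = ["sgd", "rmsprop", "adam"]
dropout_rates = [0.1, 0.2, 0.3, 0.5]
variants      = ["deeper", "wider"]   # 3 x 4 x 2 = 24 runs (batch size fixed here)

results = {}
for opt, rate, variant in itertools.product(optimizers, dropout_rates, variants):
    # build_model is a hypothetical factory returning a compiled Keras model
    model = build_model(variant=variant, dropout_rate=rate, optimizer=opt)
    history = model.fit(x_train, y_train, batch_size=32, epochs=15,
                        validation_split=0.1, verbose=0)
    results[(opt, rate, variant)] = max(history.history["val_accuracy"])

best_config = max(results, key=results.get)
print("Best:", best_config, "val_acc = %.4f" % results[best_config])
```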
Data Augmentation
- Applied: Horizontal flip, random rotation, zoom, contrast enhancement
- Outcome: Marginal overfitting reduction, but short-term accuracy dropped
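A hedged sketch of that augmentation pipeline with Keras preprocessing layers; the rotation, zoom, and contrast factors are illustrative, not the project's actual values.

```python
from tensorflow import keras
from tensorflow.keras import layers

# Augmentation layers are active only when training=True; factors are illustrative
data_augmentation = keras.Sequential([
    layers.RandomFlip("horizontal"),
    layers.RandomRotation(0.1),    # up to roughly +/-36 degrees
    layers.RandomZoom(0.1),
    layers.RandomContrast(0.2),
])

# Typically placed right after the Input layer of the CNN, so it runs on the GPU
# and is switched off automatically at inference time
augmented_batch = data_augmentation(x_train[:8], training=True)  # x_train from the loading sketch
```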
Transfer Learning (ResNet50)
- Used include_top=False and added custom dense layers with GlobalAveragePooling
- Froze pretrained layers to preserve learned features
- Validation Accuracy: 23.48% (poor due to CIFAR's small input size)
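A minimal sketch consistent with that setup: ImageNet-pretrained ResNet50 with include_top=False on the native 32×32 inputs, a frozen base, GlobalAveragePooling, and a small dense head (the head size and optimizer are assumptions).

```python
from tensorflow import keras
from tensorflow.keras import layers

base = keras.applications.ResNet50(
    include_top=False,            # drop the ImageNet classification head
    weights="imagenet",
    input_shape=(32, 32, 3),      # CIFAR-10's native resolution (ResNet50's minimum)
)
base.trainable = False            # freeze pretrained layers to preserve learned features

inputs = keras.Input(shape=(32, 32, 3))
x = keras.applications.resnet50.preprocess_input(inputs)  # expects raw 0-255 pixel values
x = base(x, training=False)
x = layers.GlobalAveragePooling2D()(x)
x = layers.Dense(128, activation="relu")(x)               # head size is an assumption
outputs = layers.Dense(10, activation="softmax")(x)

model = keras.Model(inputs, outputs)
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```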
Validation Insights
- Performance order: Baseline CNN > Augmentation > Transfer Learning
- Deeper CNNs outperformed wider ones
- RMSProp was the most stable optimizer across runs
- Clear overfitting observed via divergence in training and validation curves
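The overfitting call above is read off the training-vs-validation curves; a minimal Matplotlib sketch of that plot, assuming history is the object returned by model.fit in one of the runs:

```python
import matplotlib.pyplot as plt

# 'history' is the keras.callbacks.History returned by model.fit(...)
plt.plot(history.history["accuracy"], label="train accuracy")
plt.plot(history.history["val_accuracy"], label="validation accuracy")
plt.xlabel("Epoch")
plt.ylabel("Accuracy")
plt.title("Training vs. validation accuracy (divergence indicates overfitting)")
plt.legend()
plt.show()
```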
Result / Outcome
Configuration Comparison
| Configuration | Accuracy (Train / Validation) | Remarks |
|---|---|---|
| Best CNN (RMSProp, 3-layer) | 78.7% / 73.5% | Best overall performance |
| CNN + Augmentation | 65.0% / 54.8% | Slight overfitting, improved robustness |
| Transfer Learning (ResNet50) | 16.6% / 23.4% | Poor performance due to input size mismatch |
Model Insights
- Confusion observed between visually similar classes (e.g., cats vs dogs, trucks vs cars)
- Frogs and airplanes were classified with the least error
- Webcam input: Lower accuracy due to blur, lighting, and low resolution
- Deployment: Initial version built on Django; planned upgrade to Django + FastAPI
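The class-level confusions above are typically read from a confusion matrix on the test set; a hedged sketch using tf.math.confusion_matrix, where x_test, y_test, and model refer to the earlier sketches:

```python
import numpy as np
import tensorflow as tf

# Predict class indices for the held-out test set
probs = model.predict(x_test, verbose=0)
y_pred = np.argmax(probs, axis=1)

# Rows = true class, columns = predicted class; large off-diagonal entries
# (e.g. cat/dog, automobile/truck) mark the visually similar pairs noted above
cm = tf.math.confusion_matrix(y_test.flatten(), y_pred, num_classes=10)
print(cm.numpy())
```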