I’m an AI developer and full-stack engineer with expertise in machine learning, deep learning, and data-driven web applications. I specialize in building intelligent systems using Django and FastAPI, integrating real-world models like spam detection, stock prediction, recommender systems, and image classification.
I’ve deployed over 10 ML-powered web apps and designed technical strategies using stock indicators and backtesting methods. My approach blends clean code, smart algorithms, and scalable deployment—turning complex ideas into functional, user-friendly solutions.
EPITA AND EM NORMANDIE | 2024-2026
MSc AI for Marketing Strategy
Manipal University | 2023-2024
Currently pursuing
ICT Kerala | 2020-2021
Certification in data science and analytics
St Thomas | 2015-2018
BSc Chemistry
This project aims to develop an intelligent spam detection system using deep learning techniques. The objective is to accurately classify emails as either "spam" or "not spam" by building and deploying a robust model. The project involves several stages, starting with dataset acquisition, followed by data preprocessing, model training, and evaluation. The best-performing model and preprocessing pipeline are then saved and integrated into an interactive web application using Django and Streamlit.
Email and SMS platforms are plagued with spam, threatening user security and productivity. Traditional rule-based or statistical filters often fail to adapt to new spam tactics, especially when faced with obfuscated language or minimal messages.
<div class="mb-16 fw-bold">🏆 Best Performing Model</div> <ul class="text-secondary-light" style="list-style-type: disc; padding-left: 20px;"> <li><strong>Model:</strong> Convolutional Neural Network (CNN)</li> <li><strong>Test Accuracy:</strong> 97.83%</li> </ul> <div class="mb-16 fw-bold">📈 Evaluation Metrics (Spam Class)</div> <ul class="text-secondary-light" style="list-style-type: disc; padding-left: 20px;"> <li><strong>Precision:</strong> 94%</li> <li><strong>Recall:</strong> 98%</li> <li><strong>F1-Score:</strong> 96%</li> </ul> <div class="mb-16 fw-bold">🧮 Confusion Matrix</div> <ul class="text-secondary-light" style="list-style-type: disc; padding-left: 20px;"> <li><strong>True Positives:</strong> 560</li> <li><strong>False Positives:</strong> 33</li> <li><strong>False Negatives:</strong> 9</li> <li><strong>True Negatives:</strong> 1330</li> </ul>
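The reported metrics can be re-derived from the confusion matrix counts above. The following is a minimal sketch for checking the arithmetic, not the project's evaluation code; the function name `classification_metrics` is an assumption made for this example.

```python
def classification_metrics(tp, fp, fn, tn):
    """Derive standard binary-classification metrics from confusion matrix counts."""
    precision = tp / (tp + fp)  # share of predicted spam that really is spam
    recall = tp / (tp + fn)     # share of actual spam that was caught
    f1 = 2 * precision * recall / (precision + recall)
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return {"precision": precision, "recall": recall, "f1": f1, "accuracy": accuracy}


# Counts taken from the confusion matrix listed above (spam is the positive class).
metrics = classification_metrics(tp=560, fp=33, fn=9, tn=1330)
for name, value in metrics.items():
    print(f"{name}: {value:.2%}")
```

Rounded to whole percentages, the values agree with the listed precision (94%), recall (98%), and F1-score (96%), and the accuracy matches the reported 97.83%.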
We analyzed player data from the game FIFA 19. The game features a large number of players from different countries, playing in different competitions. Their in-game abilities are meant to reflect their real-world skills. The game's creators capture this through attributes such as sprint speed, shot power, finishing, and heading, all expressed numerically. In this project, we try to classify players into playing positions based on these attributes and to predict their in-game market value. Since the game does not always show a player's market value, only their attributes, our model can help determine how much to offer when negotiating a transfer in the game. In the real world, a player's abilities (e.g., finishing) have no numerical rating. Because the game does express them numerically, we can use it to find out which attributes matter most for particular positions.
Football clubs face the challenge of selecting the right players from a global pool with varying skill levels, prices, and growth potential. Relying solely on scouts and intuition can lead to overpriced or underperforming signings. Clubs like Manchester United need a data-driven approach to:
caret package for model training with 10-fold cross-validation
<div class="mb-16 fw-bold">Best Performing Model</div> <ul class="text-secondary-light" style="list-style-type: disc; padding-left: 20px;"> <li><strong>XGBoost (Model 4)</strong> achieved the lowest RMSE: <strong>0.7217</strong></li> </ul> <div class="mb-16 fw-bold">Top 5 Suggested Player Signings</div> <ul class="text-secondary-light" style="list-style-type: disc; padding-left: 20px;"> <li><strong>T. Werner</strong> – RW</li> <li><strong>F. Thauvin</strong> – RM</li> <li><strong>Jorginho</strong> – CM</li> <li><strong>D. Alaba</strong> – LB</li> <li><strong>N. Süle</strong> – CB</li> </ul> <div class="mb-16 fw-bold">Transfer Budget</div> <ul class="text-secondary-light" style="list-style-type: disc; padding-left: 20px;"> <li><strong>Total Budget:</strong> €186M</li> </ul> <div class="mb-16 fw-bold">Business Impact</div> <ul class="text-secondary-light" style="list-style-type: disc; padding-left: 20px;"> <li>Enabled data-driven recruitment decisions</li> <li>Helped avoid high-fee, high-risk player signings</li> </ul>
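The 10-fold cross-validation and RMSE scoring mentioned above can be sketched in plain Python. This is an illustration of the evaluation idea only, not the project's caret/XGBoost code: the helpers `rmse`, `kfold_indices`, and `cross_validated_rmse` are hypothetical names, and a predict-the-training-mean baseline stands in for a fitted model.

```python
import math
import random


def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def kfold_indices(n_samples, k=10, seed=0):
    """Shuffle sample indices and split them into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cross_validated_rmse(y, k=10):
    """Average held-out RMSE of a mean-value baseline over k folds."""
    scores = []
    for fold in kfold_indices(len(y), k):
        held_out = set(fold)
        train = [y[i] for i in range(len(y)) if i not in held_out]
        baseline = sum(train) / len(train)  # stand-in for a fitted model (assumption)
        scores.append(rmse([y[i] for i in fold], [baseline] * len(fold)))
    return sum(scores) / len(scores)
```

Each sample is held out exactly once, so the averaged score estimates error on unseen players rather than on the training data.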
The dataset can be used for:
The HR salary dataset from Kaggle is a valuable resource for analyzing employee compensation trends. It contains information on employee demographics, job-related details, and compensation figures. This data can be used to identify factors that influence salary, such as experience, education, and job title, and assess organisational pay equity.
<div class="mb-16 fw-bold">Best Model</div> <ul class="text-secondary-light" style="list-style-type: disc; padding-left: 20px;"> <li><strong>Tuned Gradient Boosting</strong></li> </ul> <div class="mb-16 fw-bold">Performance</div> <ul class="text-secondary-light" style="list-style-type: disc; padding-left: 20px;"> <li><strong>R² Score:</strong> 0.76 (acceptable range for business use)</li> </ul> <div class="mb-16 fw-bold">Business Impact</div> <ul class="text-secondary-light" style="list-style-type: disc; padding-left: 20px;"> <li>TCS HR can now predict salaries with improved accuracy and transparency</li> </ul>
Stock market forecasting often suffers from inconsistency due to noisy data, emotional trading, and the isolated use of indicators or sentiment. Traders lack a unified system that combines technical signals, sentiment analysis, and AI models to generate high-confidence trade decisions. Traditional methods based on single indicators (like MACD or RSI alone) often fail in volatile or uncertain markets.
Traditional stock prediction methods fail to combine all key market signals — technical patterns, news sentiment, financial fundamentals, and global macroeconomic factors. This fragmented view results in missed opportunities, false signals, and poor portfolio decisions.
Retail investors and analysts need a single, AI-powered platform that:
To develop a comprehensive trading and portfolio decision support system that integrates:
<ul class="text-secondary-light" style="list-style-type: disc; padding-left: 20px;"> <li>Achieved <strong>over 78% accuracy</strong> in directional prediction when sentiment and technical signals aligned</li> <li>Provided <strong>5–20% improved ROI</strong> compared to basic indicator or LSTM-only models</li> <li>Dashboard allowed users to simulate strategies, visualize trades, and rank sectors effectively</li> </ul> <div class="mb-16 fw-bold">Key Capabilities Enabled</div> <ul class="text-secondary-light" style="list-style-type: disc; padding-left: 20px;"> <li>Predict market direction using AI (LSTM + FinBERT)</li> <li>Confirm trades with technical indicators (MACD, RSI, EMA, support/resistance)</li> <li>Filter stocks based on fundamentals (e.g., P/E, EPS, ROE)</li> <li>Adapt strategies based on sector and global index trends</li> </ul>
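The technical indicators named above (EMA, MACD, RSI) follow well-known formulas. The sketch below shows one plain-Python way to compute them; it is not the project's implementation. The function names are assumptions, the 12/26 and 14-period defaults are the conventional ones rather than values stated here, and the RSI uses simple averages of gains and losses instead of Wilder's smoothing.

```python
def ema(prices, span):
    """Exponential moving average with smoothing factor 2 / (span + 1)."""
    alpha = 2 / (span + 1)
    values = [prices[0]]  # seed the average with the first price
    for price in prices[1:]:
        values.append(alpha * price + (1 - alpha) * values[-1])
    return values


def macd_line(prices, fast=12, slow=26):
    """MACD line: fast EMA minus slow EMA at each time step."""
    return [f - s for f, s in zip(ema(prices, fast), ema(prices, slow))]


def rsi(prices, period=14):
    """Relative Strength Index over the last `period` price changes (simple averages)."""
    if len(prices) < period + 1:
        raise ValueError("need at least period + 1 prices")
    window = prices[-(period + 1):]
    changes = [b - a for a, b in zip(window, window[1:])]
    gains = sum(c for c in changes if c > 0)
    losses = sum(-c for c in changes if c < 0)
    if losses == 0:
        return 100.0  # no down moves in the window
    rs = gains / losses
    return 100 - 100 / (1 + rs)
```

A trade-confirmation rule could then, for example, require a positive MACD line and an RSI below an overbought threshold before acting on a model prediction; any such thresholds are a design choice and are not specified in this write-up.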
Waste classification and plant disease identification are critical for sustainability and agriculture. Manual sorting is time-consuming, error-prone, and inconsistent. Automating these tasks requires accurate object detection and segmentation, even in noisy, real-world environments with class imbalance and complex backgrounds.
To develop and deploy a robust deep learning system using YOLOv8 for:
<div class="mb-16 fw-bold">Results Summary</div> <table class="table text-secondary-light" style="width: 100%; border-collapse: collapse;"> <thead> <tr> <th style="text-align: left; padding: 8px; border-bottom: 1px solid #ccc;">Task</th> <th style="text-align: left; padding: 8px; border-bottom: 1px solid #ccc;">mAP@50</th> <th style="text-align: left; padding: 8px; border-bottom: 1px solid #ccc;">mAP@50–95</th> <th style="text-align: left; padding: 8px; border-bottom: 1px solid #ccc;">Precision</th> <th style="text-align: left; padding: 8px; border-bottom: 1px solid #ccc;">Recall</th> </tr> </thead> <tbody> <tr> <td style="padding: 8px; border-bottom: 1px solid #eee;">Trash Detection</td> <td style="padding: 8px;">0.989</td> <td style="padding: 8px;">0.884</td> <td style="padding: 8px;">0.973</td> <td style="padding: 8px;">0.963</td> </tr> <tr> <td style="padding: 8px; border-bottom: 1px solid #eee;">Plant Disease Segmentation</td> <td style="padding: 8px;">0.504</td> <td style="padding: 8px;">0.176</td> <td style="padding: 8px;">0.593</td> <td style="padding: 8px;">0.368</td> </tr> </tbody> </table> <ul class="text-secondary-light" style="list-style-type: disc; padding-left: 20px; margin-top: 16px;"> <li><strong>Mosaic augmentation</strong> significantly improved detection accuracy</li> <li>Trash detection model achieved <strong>real-time performance</strong> (2.2 ms/image)</li> <li>Segmentation model underperformed due to limited training scope</li> </ul>
Generating realistic handwritten digit images requires a model capable of learning complex data distributions from limited training samples. Traditional image generation methods often lack diversity and tend to overfit the dataset.
The key challenge is to train a GAN (Generative Adversarial Network) that can:
To implement and train a Generative Adversarial Network (GAN) on the MNIST dataset that learns to generate synthetic 28×28 grayscale handwritten digit images.
Dense and Conv2DTranspose layers for feature expansion and image shaping.
Conv2D layers with
Dropout for regularization and overfitting control.
train_step() method for adversarial updates of both generator and discriminator.
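The adversarial update relies on two opposing loss terms. The sketch below shows the standard binary cross-entropy formulation in plain Python, assuming the discriminator outputs a probability that an image is real. The write-up does not state which loss the project used, so treat the function names and the choice of loss as assumptions.

```python
import math


def binary_cross_entropy(targets, predictions, eps=1e-7):
    """Mean binary cross-entropy; predictions are clipped to avoid log(0)."""
    total = 0.0
    for t, p in zip(targets, predictions):
        p = min(max(p, eps), 1 - eps)
        total += -(t * math.log(p) + (1 - t) * math.log(1 - p))
    return total / len(targets)


def discriminator_loss(real_scores, fake_scores):
    """The discriminator should score real images as 1 and generated images as 0."""
    real_loss = binary_cross_entropy([1.0] * len(real_scores), real_scores)
    fake_loss = binary_cross_entropy([0.0] * len(fake_scores), fake_scores)
    return real_loss + fake_loss


def generator_loss(fake_scores):
    """The generator is rewarded when the discriminator scores its images as real."""
    return binary_cross_entropy([1.0] * len(fake_scores), fake_scores)
```

In one training step, the discriminator's weights are updated to lower `discriminator_loss` and the generator's weights are then updated to lower `generator_loss`, each using its own optimizer.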
Generating high-quality, realistic, and stylistically accurate images using text prompts is a complex task. Models like Stable Diffusion can produce widely varied outputs depending on multiple generation parameters.
Understanding how each parameter impacts the output is critical for controlling quality, style, and realism in AI art and computer vision applications.
To explore and evaluate how prompt design, negative prompts, diffusion schedulers, CFG scale, and inference steps impact the output quality of a Stable Diffusion model.
Identify optimal parameter combinations that strike a balance between:
This project was executed in 6 experimental parts, each focusing on evaluating a specific parameter of the Stable Diffusion model.
| Experiment | Best Configuration | Insight/Output |
|---|---|---|
| Negative Prompting | Added: “daylight, sunny…” | Shifted sunset to night effectively |
| Scheduler Comparison | DPMSolver | Clearer and more realistic than EulerA |
| CFG Scale | 7.5 – 12.0 | Best balance between accuracy and style |
| Inference Steps | 40 | Fast + high-quality rendering |
| Prompt Optimization | CFG=12.0, Steps=40 | Best result for futuristic architecture scene |
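A parameter sweep like the one summarized in the table can be organized as a grid of configurations. The sketch below only enumerates settings and does not call any image-generation library. The scheduler names and the 7.5, 12.0, and 40 values come from the table, while the 20-step setting and the dictionary keys are illustrative assumptions.

```python
from itertools import product

SCHEDULERS = ["EulerA", "DPMSolver"]  # the two schedulers compared in the table
CFG_SCALES = [7.5, 12.0]              # endpoints of the best-performing range
INFERENCE_STEPS = [20, 40]            # 40 is from the table; 20 is a hypothetical comparison point


def build_experiment_grid():
    """Return one configuration dict per combination of generation parameters."""
    return [
        {"scheduler": scheduler, "cfg_scale": cfg, "steps": steps}
        for scheduler, cfg, steps in product(SCHEDULERS, CFG_SCALES, INFERENCE_STEPS)
    ]


grid = build_experiment_grid()
for config in grid:
    print(config)
```

Each configuration would be run with the same prompt and random seed, so that differences in the output can be attributed to the parameter being varied.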
<div class="mb-16 fw-bold">📷 Visual Output</div> <ul class="text-secondary-light" style="list-style-type: disc; padding-left: 20px;"> <li><strong>Epoch 1:</strong> Random noise</li> <li><strong>Epoch 10:</strong> Shapes of digits begin to appear</li> <li><strong>Epoch 30:</strong> Recognizable, sharp digits generated</li> </ul> <div class="mb-16 fw-bold">📉 Loss Behavior</div> <table class="text-secondary-light" border="1" cellpadding="8" cellspacing="0" style="border-collapse: collapse;"> <thead> <tr> <th>Epoch</th> <th>d_loss (↓)</th> <th>g_loss (↑)</th> </tr> </thead> <tbody> <tr> <td>1</td> <td>0.45</td> <td>0.44</td> </tr> <tr> <td>10</td> <td>0.65</td> <td>0.91</td> </tr> <tr> <td>30</td> <td>0.64</td> <td>0.95</td> </tr> </tbody> </table> <ul class="text-secondary-light" style="list-style-type: disc; padding-left: 20px; margin-top: 10px;"> <li>Discriminator loss remained stable</li> <li>Generator loss increased as expected (indicates adversarial progress)</li> </ul> <div class="mb-16 fw-bold">🔁 Hyperparameter Tuning</div> <ul class="text-secondary-light" style="list-style-type: disc; padding-left: 20px;"> <li>Learning rates tested: <code>0.0002</code>, <code>0.0001</code></li> <li>Adam β₁ values: <code>0.5</code>, <code>0.4</code></li> <li>Latent dimensions: <code>100</code>, <code>128</code></li> </ul> <p class="text-secondary-light"><strong>Best Configuration:</strong></p> <ul class="text-secondary-light" style="list-style-type: disc; padding-left: 20px;"> <li><code>lr = 0.0002</code></li> <li><code>β₁ = 0.4</code></li> <li><code>latent_dim = 100</code></li> <li><strong>Generator loss:</strong> 0.80</li> <li><strong>Discriminator loss:</strong> 0.63</li> </ul>
To build a robust and deployable image classification system capable of performing well in real-world scenarios.
ResNet50 with fine-tuning for efficient feature extraction
Django (optionally with FastAPI for backend inference)
CIFAR-10 dataset
ResNet50:
589 seconds, especially with deeper CNN architectures
ResNet50 underperformed due to input size mismatch (32×32 vs. 224×224)
ResNet50 required image resizing and high compute resources
include_top=False, added custom dense layers with GlobalAveragePooling
<div class="mb-16 fw-bold">📊 Configuration Comparison</div> <table style="width: 100%; border-collapse: collapse; margin-bottom: 20px;" border="1"> <thead style="background-color: #f0f0f0;"> <tr> <th style="padding: 10px;">Configuration</th> <th style="padding: 10px;">Accuracy (Train / Validation)</th> <th style="padding: 10px;">Remarks</th> </tr> </thead> <tbody> <tr> <td style="padding: 10px;">Best CNN (RMSProp, 3-layer)</td> <td style="padding: 10px;">78.7% / 73.5%</td> <td style="padding: 10px;">Best overall performance</td> </tr> <tr> <td style="padding: 10px;">CNN + Augmentation</td> <td style="padding: 10px;">65.0% / 54.8%</td> <td style="padding: 10px;">Slight overfitting, improved robustness</td> </tr> <tr> <td style="padding: 10px;">Transfer Learning (ResNet50)</td> <td style="padding: 10px;">16.6% / 23.4%</td> <td style="padding: 10px;">Poor performance due to input size mismatch</td> </tr> </tbody> </table> <div class="mb-16 fw-bold"> Model Insights</div> <ul class="text-secondary-light" style="list-style-type: disc; padding-left: 20px;"> <li><strong>Confusion observed</strong> between visually similar classes (e.g., cats vs dogs, trucks vs cars)</li> <li><strong>Frogs and airplanes</strong> were classified with the least error</li> <li><strong>Webcam input:</strong> Lower accuracy due to blur, lighting, and low resolution</li> <li><strong>Deployment:</strong> Initial version built on Django; planned upgrade to Django + FastAPI</li> </ul>
E-commerce platforms like Amazon face the challenge of helping users navigate massive product catalogs. Without personalization, users experience choice overload and irrelevant suggestions. Many products lack complete metadata, and cold-start users/items limit collaborative approaches.
To design and evaluate a hybrid recommender system that suggests beauty products on Amazon using multiple recommendation techniques:
To understand the individual performance, interpretability, scalability, and limitations of each method, and propose an ideal hybrid system suitable for large-scale deployment on platforms like Amazon.
ItemId as the keybeauty_amazon_items.csv – Product metadatabeauty_amazon_reviews.csv – Customer reviews and ratings<div class="mb-16 fw-bold">📈 Model Performance Summary</div> <table style="width: 100%; border-collapse: collapse; margin-bottom: 20px;"> <thead> <tr style="background-color: #f2f2f2;"> <th style="padding: 8px; border: 1px solid #ddd;">Model</th> <th style="padding: 8px; border: 1px solid #ddd;">RMSE</th> <th style="padding: 8px; border: 1px solid #ddd;">MAE</th> <th style="padding: 8px; border: 1px solid #ddd;">Precision@5</th> <th style="padding: 8px; border: 1px solid #ddd;">Notes</th> </tr> </thead> <tbody> <tr> <td style="padding: 8px; border: 1px solid #ddd;">Popularity-Based</td> <td style="padding: 8px; border: 1px solid #ddd;">—</td> <td style="padding: 8px; border: 1px solid #ddd;">—</td> <td style="padding: 8px; border: 1px solid #ddd;">—</td> <td style="padding: 8px; border: 1px solid #ddd;">Works for all users but not personalized</td> </tr> <tr> <td style="padding: 8px; border: 1px solid #ddd;">Content-Based Filtering</td> <td style="padding: 8px; border: 1px solid #ddd;">—</td> <td style="padding: 8px; border: 1px solid #ddd;">—</td> <td style="padding: 8px; border: 1px solid #ddd;">0.0028–0.0032</td> <td style="padding: 8px; border: 1px solid #ddd;">Struggles with relevance, no user signals</td> </tr> <tr> <td style="padding: 8px; border: 1px solid #ddd;">User-User Collaborative</td> <td style="padding: 8px; border: 1px solid #ddd;">0.569</td> <td style="padding: 8px; border: 1px solid #ddd;">0.348</td> <td style="padding: 8px; border: 1px solid #ddd;">—</td> <td style="padding: 8px; border: 1px solid #ddd;">Accurate for dense clusters, cold-start issue</td> </tr> <tr> <td style="padding: 8px; border: 1px solid #ddd;">NMF (Model-Based)</td> <td style="padding: 8px; border: 1px solid #ddd;">2.51</td> <td style="padding: 8px; border: 1px solid #ddd;">1.44</td> <td style="padding: 8px; border: 1px solid #ddd;">0.0</td> <td style="padding: 
8px; border: 1px solid #ddd;">Learn
Users often need a centralized virtual assistant that can handle small tasks like reminders, greetings, simple queries (date/time/weather), or custom commands. Existing solutions (like Alexa or Siri) are either voice-only or not customizable for personal needs, and most bots cannot be extended easily by developers.
get_time()
set_reminder()
tell_joke()
<div class="mb-16 fw-bold"> Features</div> <ul class="text-secondary-light" style="list-style-type: disc; padding-left: 20px;"> <li><strong>Personalized Greetings</strong> – Greets the user by name (configurable)</li> <li><strong>Time &amp; Date Reporting</strong> – Tells the current time and date on request</li> <li><strong>Simple Reminder System</strong> – Sets reminders that are either locally stored or printed</li> <li><strong>Custom Small Talk</strong> – Responds to casual inputs like <em>“Who made you?”</em> or <em>“How are you?”</em></li> <li><strong>Self-Learning Ability</strong> – Can learn new responses via an <strong>admin interface</strong> or <strong>hardcoded functions</strong></li> <li><strong>Local Memory Support</strong> – Optionally integrates with <strong>SQLite</strong> for storing logs or session data</li> </ul>
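The features above suggest a simple command dispatcher built around the named functions. The following is a minimal sketch of how `get_time()`, `set_reminder()`, and `tell_joke()` might be wired together; the function bodies, the `handle` router, the keyword matching, and the sample responses are all assumptions made for illustration.

```python
import random
from datetime import datetime

REMINDERS = []  # in-memory store; the project mentions optional SQLite storage instead
JOKES = ["Why do programmers prefer dark mode? Because light attracts bugs."]
SMALL_TALK = {
    "who made you": "I was built as a personal assistant project.",
    "how are you": "Running smoothly, thanks for asking.",
}
FALLBACK = "Sorry, I don't know that yet."


def get_time(now=None):
    """Report the current time and date (a fixed `now` can be passed for testing)."""
    now = now or datetime.now()
    return now.strftime("It is %H:%M on %d %B %Y.")


def set_reminder(text):
    """Store a reminder and confirm it to the user."""
    REMINDERS.append(text)
    return f"Reminder saved: {text}"


def tell_joke():
    """Return a random joke from the local list."""
    return random.choice(JOKES)


def handle(message):
    """Route a user message to the matching function using simple keyword rules."""
    text = message.strip()
    lowered = text.lower()
    if lowered.startswith("remind me to "):
        return set_reminder(text[len("remind me to "):])
    if "time" in lowered or "date" in lowered:
        return get_time()
    if "joke" in lowered:
        return tell_joke()
    return SMALL_TALK.get(lowered.rstrip("?!. "), FALLBACK)
```

Keeping the routing in one function means a new capability can be added by writing one function and one rule, which fits the stated goal of being easy for developers to extend.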