Projects
A collection of applied data science projects covering predictive modeling, segmentation, forecasting, scenario analysis, and NLP, built to support real-world decision-making.
- I. Predictive Modeling & Segmentation
- II. Forecasting & Scenario Analysis
- III. NLP & AI Applications
Each project demonstrates applied machine learning and data science techniques across structured and unstructured data problems.
I. Predictive Modeling & Segmentation
Projects focused on classification, regression, and structured decision modeling.
Credit Risk Prediction Model
- Developed a supervised learning pipeline for credit risk classification using Logistic Regression, Random Forest, and XGBoost
- Engineered predictive features and applied SHAP for model interpretability and risk factor analysis
- Deployed model via Azure ML with an interactive Power BI dashboard for risk monitoring
Supply Chain Risk Prediction & Inventory Optimization
- Developed a classification pipeline to predict supplier and logistics risk using Logistic Regression and XGBoost
- Engineered features from operational and supplier performance data to improve risk detection
- Deployed an Azure ML endpoint integrated with Power BI for real-time monitoring and decision support
Customer Value & Lifecycle Modeling
- Applied K-Means clustering and PCA for customer segmentation and behavioral pattern discovery
- Built a regression model to estimate Customer Lifetime Value (CLV) using XGBoost
- Designed an A/B testing simulation framework to evaluate retention strategies and customer engagement
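
As an illustration of the A/B testing simulation idea, here is a minimal sketch using only the Python standard library. It is not the project's actual framework: the function names `simulate_group` and `two_proportion_z_test`, the retention rates (30% vs. 35%), and the group size are all assumptions chosen for the example. It simulates retained users in a control and a treatment group, then applies a two-sided two-proportion z-test.

```python
import math
import random
from statistics import NormalDist


def simulate_group(n_users, retention_rate, rng):
    """Count retained users; each user is retained with probability retention_rate."""
    return sum(1 for _ in range(n_users) if rng.random() < retention_rate)


def two_proportion_z_test(retained_a, n_a, retained_b, n_b):
    """Two-sided z-test for a difference between two retention proportions.

    Assumes the pooled proportion is strictly between 0 and 1.
    """
    p_a = retained_a / n_a
    p_b = retained_b / n_b
    # Pooled proportion under the null hypothesis of equal retention.
    pooled = (retained_a + retained_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical experiment: 5,000 users per arm, assumed true retention rates.
rng = random.Random(42)
n = 5000
control = simulate_group(n, 0.30, rng)
treatment = simulate_group(n, 0.35, rng)
z, p_value = two_proportion_z_test(control, n, treatment, n)
print(f"control={control / n:.3f} treatment={treatment / n:.3f} z={z:.2f} p={p_value:.4f}")
```

Repeating the simulation across many seeds and effect sizes would give an estimate of how often a given retention strategy is detected at a chosen sample size.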
House Price Prediction
- Developed regression models using linear and tree-based approaches for housing price estimation
- Engineered location and property-based features to improve predictive performance
- Evaluated multiple models to determine the optimal approach for real estate valuation
II. Forecasting & Scenario Analysis
Projects focused on time-series forecasting, uncertainty modeling, and decision support systems.
Financial Forecasting & Scenario API
- Developed time-series forecasting models using Prophet and ARIMA to predict financial trends
- Implemented Monte Carlo simulation to quantify uncertainty and evaluate multiple future scenarios
- Built and deployed a Flask API for real-time forecasting and scenario-based decision support
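
The Monte Carlo step can be sketched as follows. This is a minimal, standard-library illustration rather than the project's code: the functions `simulate_scenarios` and `summarize`, the normally distributed per-period growth, and all parameter values are assumptions made for the example. It simulates many random growth paths and summarizes the final values as pessimistic, median, and optimistic scenarios.

```python
import random
import statistics


def simulate_scenarios(start_value, mean_growth, growth_std, periods, n_paths, seed=0):
    """Simulate the final value of n_paths random growth paths."""
    rng = random.Random(seed)
    finals = []
    for _ in range(n_paths):
        value = start_value
        for _ in range(periods):
            # Draw one period's growth rate from a normal distribution (assumed).
            value *= 1.0 + rng.gauss(mean_growth, growth_std)
        finals.append(value)
    return finals


def summarize(finals):
    """Return the 5th percentile, median, and 95th percentile of the outcomes."""
    # n=20 yields 19 cut points at 5%, 10%, ..., 95%.
    cuts = statistics.quantiles(finals, n=20)
    return {"p5": cuts[0], "p50": statistics.median(finals), "p95": cuts[-1]}


# Hypothetical inputs: start at 100, 1% mean monthly growth, 5% volatility, 12 months.
finals = simulate_scenarios(100.0, 0.01, 0.05, periods=12, n_paths=5000)
summary = summarize(finals)
print({name: round(value, 1) for name, value in summary.items()})
```

In a fuller pipeline, the growth distribution would typically come from the fitted forecasting model's residuals or uncertainty intervals rather than from fixed parameters.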
Healthcare Workforce Optimization
- Built regression and Random Forest models to forecast healthcare staffing demand
- Designed scenario analysis to evaluate resource allocation under varying patient demand conditions
- Developed a data-driven framework to support workforce planning and operational decisions
Epidemiology: Toronto Outbreak Forecasting
- Developed time-series forecasting models using Prophet to analyze infection trends
- Identified outbreak patterns to support public health insights and decision-making
- Built Power BI dashboards for visualization of epidemiological trends
III. NLP & AI Applications
Projects involving natural language processing, transformer models, and interactive AI systems.
Memory Support Chatbot (GPT-2)
- Fine-tuned GPT-2 for context-aware conversational support
- Designed a text generation pipeline for coherent, domain-relevant responses
- Applied NLP preprocessing to improve model performance and response quality
Question Answering System (RoBERTa / SQuAD)
- Built a transformer-based QA system to extract answers directly from text
- Developed a Gradio interface for real-time question answering
- Implemented an end-to-end NLP inference pipeline
HuggingFace Fine-Tuning QA System
- Fine-tuned a transformer model on the SQuAD dataset for question answering tasks
- Built an interactive QA interface using Gradio for real-time user interaction
- Demonstrated an end-to-end workflow from model training to inference
Food Preferences Streamlit App
- Built an interactive Streamlit application for behavioral data exploration
- Designed real-time visualization components for user interaction
- Applied exploratory data analysis and UX-focused analytics design
Obesity Data Analysis
- Conducted exploratory data analysis on health and lifestyle datasets
- Identified statistical correlations and behavioral patterns
- Created visualizations to communicate insights and support data-driven storytelling