A collection of applied data science projects focused on predictive modeling, segmentation, forecasting, and scenario analysis to support real-world decision-making.


 

The projects cover both structured data (tabular and time-series) and unstructured data (text).

I. Predictive Modeling & Segmentation

Projects focused on classification, regression, and structured decision modeling.

Credit Risk Prediction Model

  • Developed a supervised learning pipeline for credit risk classification using Logistic Regression, Random Forest, and XGBoost
  • Engineered predictive features and applied SHAP for model interpretability and risk factor analysis
  • Deployed the model via Azure ML with an interactive Power BI dashboard for risk monitoring

    View on GitHub

Supply Chain Risk Prediction & Inventory Optimization

  • Developed a classification pipeline to predict supplier and logistics risk using Logistic Regression and XGBoost
  • Engineered features from operational and supplier performance data to improve risk detection
  • Deployed an Azure ML endpoint integrated with Power BI for real-time monitoring and decision support

    View on GitHub

Customer Value & Lifecycle Modeling

  • Applied K-Means clustering and PCA for customer segmentation and behavioral pattern discovery
  • Built a regression model to estimate Customer Lifetime Value (CLV) using XGBoost
  • Designed an A/B testing simulation framework to evaluate retention strategies and customer engagement

    View on GitHub
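
As a rough illustration of the A/B testing simulation idea above, the sketch below simulates retention outcomes for a control and a treatment group and compares them with a two-proportion z-test. It is not taken from the project repository: the retention rates, group sizes, and function names are hypothetical, and the z-test is only one of several reasonable ways to compare the groups.

```python
import math
import random


def simulate_retention(n_users, retention_rate, rng):
    """Count retained users among n_users independent Bernoulli draws."""
    return sum(1 for _ in range(n_users) if rng.random() < retention_rate)


def two_proportion_z_test(success_a, n_a, success_b, n_b):
    """Two-sided z-test for the difference between two proportions."""
    p_a = success_a / n_a
    p_b = success_b / n_b
    pooled = (success_a + success_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        # Both groups are all-success or all-failure: no measurable difference.
        return 0.0, 1.0
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal CDF, written with erf.
    p_value = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
    return z, p_value


def run_ab_simulation(n_per_group=5000, control_rate=0.20,
                      treatment_rate=0.23, seed=42):
    """Simulate one experiment; all default values are assumed, not measured."""
    rng = random.Random(seed)
    retained_control = simulate_retention(n_per_group, control_rate, rng)
    retained_treatment = simulate_retention(n_per_group, treatment_rate, rng)
    z, p_value = two_proportion_z_test(
        retained_control, n_per_group, retained_treatment, n_per_group
    )
    return {
        "control_retained": retained_control,
        "treatment_retained": retained_treatment,
        "z": z,
        "p_value": p_value,
    }


if __name__ == "__main__":
    print(run_ab_simulation())
```

Running the simulation many times with different seeds gives an estimate of how often a given uplift would be detected at a chosen sample size.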

House Price Prediction

  • Developed regression models using linear and tree-based approaches for housing price estimation
  • Engineered location and property-based features to improve predictive performance
  • Evaluated multiple models to determine the best-performing approach for real estate valuation

    View on GitHub


 

II. Forecasting & Scenario Analysis

Projects focused on time-series forecasting, uncertainty modeling, and decision support systems.

Financial Forecasting & Scenario API

  • Developed time-series forecasting models using Prophet and ARIMA to predict financial trends
  • Implemented Monte Carlo simulation to quantify uncertainty and evaluate multiple future scenarios
  • Built and deployed a Flask API for real-time forecasting and scenario-based decision support

    View on GitHub
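
To make the Monte Carlo step above more concrete, here is a minimal sketch that simulates many possible future paths of a financial metric and summarizes the spread of outcomes. It assumes normally distributed per-period growth rates, which is a simplification chosen for illustration; the project itself may draw scenarios differently (for example, from Prophet or ARIMA forecast errors). All parameter values and function names are hypothetical.

```python
import random
import statistics


def simulate_paths(start_value, mean_growth, growth_std,
                   n_periods, n_paths, seed=0):
    """Return the final value of each simulated path.

    Each period multiplies the current value by (1 + growth), where growth
    is drawn from a normal distribution (an assumed, simplified model).
    """
    rng = random.Random(seed)
    final_values = []
    for _ in range(n_paths):
        value = start_value
        for _ in range(n_periods):
            value *= 1 + rng.gauss(mean_growth, growth_std)
        final_values.append(value)
    return final_values


def summarize(final_values):
    """Summarize simulated outcomes as pessimistic, central, and optimistic."""
    # n=20 yields 19 cut points at 5%, 10%, ..., 95%.
    cuts = statistics.quantiles(final_values, n=20)
    return {
        "p5": cuts[0],       # pessimistic scenario
        "median": cuts[9],   # central scenario
        "p95": cuts[18],     # optimistic scenario
        "mean": statistics.fmean(final_values),
    }


if __name__ == "__main__":
    paths = simulate_paths(start_value=1000.0, mean_growth=0.02,
                           growth_std=0.05, n_periods=12, n_paths=2000)
    print(summarize(paths))
```

The gap between the 5th and 95th percentiles is one simple way to express forecast uncertainty to a decision-maker.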

Healthcare Workforce Optimization

  • Built regression and Random Forest models to forecast healthcare staffing demand
  • Designed scenario analysis to evaluate resource allocation under varying patient demand conditions
  • Developed a data-driven framework to support workforce planning and operational decisions

    View on GitHub
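
The scenario analysis described above can be sketched, under strong simplifying assumptions, as applying demand multipliers to a baseline forecast and converting each scenario into a staffing requirement. The fixed patients-per-staff ratio, the multipliers, and every name below are hypothetical; in the project, the baseline demand would come from the fitted forecasting models rather than a constant.

```python
import math


def required_staff(patient_demand, patients_per_staff):
    """Staff needed to cover the demand, rounded up to whole people."""
    return math.ceil(patient_demand / patients_per_staff)


def evaluate_scenarios(baseline_demand, scenarios,
                       patients_per_staff, available_staff):
    """Compute staffing needs and shortfall for each demand scenario.

    scenarios maps a scenario name to a multiplier on baseline demand.
    """
    results = {}
    for name, multiplier in scenarios.items():
        demand = baseline_demand * multiplier
        needed = required_staff(demand, patients_per_staff)
        results[name] = {
            "demand": demand,
            "staff_needed": needed,
            "shortfall": max(0, needed - available_staff),
        }
    return results


if __name__ == "__main__":
    # Illustrative numbers only, not taken from any dataset.
    scenarios = {"low": 0.8, "base": 1.0, "surge": 1.3}
    for name, row in evaluate_scenarios(120, scenarios, 7, 20).items():
        print(name, row)
```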

Epidemiology: Toronto Outbreak Forecasting

  • Developed time-series forecasting models using Prophet to analyze infection trends
  • Identified outbreak patterns to support public health insights and decision-making
  • Built Power BI dashboards to visualize epidemiological trends

    View on GitHub


 

III. NLP & AI Applications

Projects involving natural language processing, transformer models, and interactive AI systems, along with interactive data apps and exploratory analysis.

Memory Support Chatbot (GPT-2)

  • Fine-tuned GPT-2 for context-aware conversational support
  • Designed a text generation pipeline for coherent, domain-relevant responses
  • Applied NLP preprocessing to improve model performance and response quality

    View on GitHub

Question Answering System (RoBERTa / SQuAD)

  • Built a transformer-based QA system to extract answers directly from text
  • Developed a Gradio interface for real-time question answering
  • Implemented an end-to-end NLP inference pipeline

    View on GitHub

Hugging Face Fine-Tuning QA System

  • Fine-tuned a transformer model on the SQuAD dataset for question answering tasks
  • Built an interactive QA interface using Gradio for real-time user interaction
  • Demonstrated an end-to-end workflow from model training to inference

    View on GitHub

Food Preferences Streamlit App

  • Built an interactive Streamlit application for behavioral data exploration
  • Designed real-time visualization components for user interaction
  • Applied exploratory data analysis and UX-focused analytics design

    View on GitHub

Obesity Data Analysis

  • Conducted exploratory data analysis on health and lifestyle datasets
  • Identified statistical correlations and behavioral patterns
  • Created visualizations to communicate insights and support data-driven storytelling

    View on GitHub


✨ More projects available on my GitHub