Data Science Portfolio

Portfolio - Machine Learning

Machine learning, causal inference, and reinforcement learning projects

From classical statistical models to reinforcement learning, projects that create business value.

Projects

Causal Inference Platform for A/B Testing

Problem: What is the real impact of our email campaigns on conversions?

A complete A/B testing platform that combines Bayesian analysis and causal ML to go beyond simple correlation and measure the causal impact of marketing interventions.

Approach:

  • Bayesian Analysis with PyMC for interpretable results (probability of being best)
  • Causal ML with X-Learner to estimate heterogeneous effects (CATE)
  • FastAPI for real-time recommendations
  • Streamlit Dashboard for results exploration
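The "probability of being best" summary from the Bayesian step can be illustrated without a full PyMC model. The following is a minimal sketch, assuming a conjugate Beta-Binomial model with a uniform Beta(1, 1) prior and using only the Python standard library; the function name `prob_best` and the conversion counts are hypothetical, not taken from the project.

```python
import random


def prob_best(successes, trials, draws=20_000, seed=0):
    """Monte Carlo estimate of P(variant has the highest conversion rate)."""
    rng = random.Random(seed)
    wins = [0] * len(successes)
    for _ in range(draws):
        # One posterior draw per variant: Beta(1 + conversions, 1 + non-conversions).
        samples = [
            rng.betavariate(1 + s, 1 + n - s)
            for s, n in zip(successes, trials)
        ]
        wins[samples.index(max(samples))] += 1
    return [w / draws for w in wins]


# Hypothetical counts for three email variants (illustration only).
conversions = [120, 150, 135]
visitors = [1000, 1000, 1000]
probs = prob_best(conversions, visitors)
print([round(p, 3) for p in probs])
```

Each entry of `probs` is the share of posterior draws in which that variant had the highest sampled conversion rate, which is usually easier to communicate to stakeholders than a p-value.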

Results: On 64,000 e-commerce customers, identified +16% lift for “Mens” emails vs +8% for “Womens”.
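Differences between segments like these are what a CATE estimator is meant to surface. Below is a minimal sketch of the X-Learner recipe on synthetic data, assuming a single numeric feature and plain least-squares lines as base learners. The project itself lists CausalML; the helper names `fit_line` and `x_learner` and the simulated data are assumptions made for illustration.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x; returns a predict function."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    a = my - b * mx
    return lambda x: a + b * x


def x_learner(x, treated, y):
    """Return a function estimating the conditional treatment effect tau(x)."""
    x1 = [xi for xi, t in zip(x, treated) if t]
    y1 = [yi for yi, t in zip(y, treated) if t]
    x0 = [xi for xi, t in zip(x, treated) if not t]
    y0 = [yi for yi, t in zip(y, treated) if not t]

    # Stage 1: one outcome model per arm.
    mu1 = fit_line(x1, y1)
    mu0 = fit_line(x0, y0)

    # Stage 2: impute individual effects, then fit one effect model per arm.
    d1 = [yi - mu0(xi) for xi, yi in zip(x1, y1)]
    d0 = [mu1(xi) - yi for xi, yi in zip(x0, y0)]
    tau1 = fit_line(x1, d1)
    tau0 = fit_line(x0, d0)

    # Stage 3: blend with the propensity score, constant in a randomized test.
    g = len(x1) / len(x)
    return lambda xi: g * tau0(xi) + (1 - g) * tau1(xi)


# Synthetic randomized experiment with true effect tau(x) = 1 + 2 * x.
rng = random.Random(42)
x = [rng.uniform(0, 1) for _ in range(4000)]
treated = [rng.random() < 0.5 for _ in x]
y = [
    1.0 + 0.5 * xi + (1.0 + 2.0 * xi if t else 0.0) + rng.gauss(0, 0.1)
    for xi, t in zip(x, treated)
]
cate = x_learner(x, treated, y)
print(round(cate(0.0), 2), round(cate(1.0), 2))
```

Because the simulated effect is linear in `x`, the linear base learners can recover it closely; with real campaign data, more flexible learners would normally replace `fit_line`.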

Technologies: Python, PyMC, CausalML, SHAP, FastAPI, Streamlit

See details → | Streamlit Dashboard

Dynamic Pricing Engine

Problem: How to optimize prices in real time based on context and demand?

A project exploring the surge pricing strategies used by Uber and Lyft, with a methodological progression from econometrics to reinforcement learning.

Progressive approach:

  1. EDA & Storytelling: Dataset exploration and business narrative
  2. Bayesian Price Elasticity: Econometric model with PyMC
  3. Contextual Bandits: Thompson Sampling for exploration/exploitation
  4. Q-Learning: Complete reinforcement learning agent
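The exploration/exploitation trade-off in step 3 can be sketched with a simplified, non-contextual Thompson Sampling loop. This is an illustration under stated assumptions, not the project's implementation: each arm is a price multiplier, a rider either accepts or rejects the quoted price, and acceptance rates get a Beta posterior. The multipliers, acceptance probabilities, and the function name `thompson_pricing` are hypothetical.

```python
import random


def thompson_pricing(multipliers, accept_prob, rounds=5000, seed=0):
    """Simulate Thompson Sampling over price multipliers; return pulls per arm."""
    rng = random.Random(seed)
    accepts = [0] * len(multipliers)
    rejects = [0] * len(multipliers)
    for _ in range(rounds):
        # Sample an acceptance rate per arm from its Beta posterior and
        # score each arm by sampled expected revenue (price * acceptance).
        sampled = [
            m * rng.betavariate(1 + a, 1 + r)
            for m, a, r in zip(multipliers, accepts, rejects)
        ]
        arm = sampled.index(max(sampled))
        # Simulated rider decision (hypothetical true acceptance rates).
        if rng.random() < accept_prob[arm]:
            accepts[arm] += 1
        else:
            rejects[arm] += 1
    return [a + r for a, r in zip(accepts, rejects)]


# Expected revenue per arm: 1.0 * 0.9 = 0.90, 1.5 * 0.7 = 1.05, 2.0 * 0.4 = 0.80.
pulls = thompson_pricing([1.0, 1.5, 2.0], [0.9, 0.7, 0.4])
print(pulls)
```

Over time the loop concentrates its pulls on the 1.5x multiplier, the arm with the highest expected revenue in this toy setup, while still sampling the others occasionally early on. A contextual version would condition the posterior on features such as time of day or weather.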

Dataset: ~700K Uber & Lyft rides (Boston, Nov 2018)
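The final step of the progression, Q-learning, rests on a single update rule: move Q(s, a) toward the observed reward plus the discounted value of the best next action. The sketch below shows that rule on a toy two-state environment. It does not use the ride dataset; the states, actions, reward table, and demand dynamics are all assumptions chosen to keep the example small.

```python
import random

STATES = ["low_demand", "high_demand"]
ACTIONS = ["base_price", "surge_price"]
# Hypothetical expected revenue per ride for each (state, action) pair.
REWARD = {
    ("low_demand", "base_price"): 1.0,
    ("low_demand", "surge_price"): 0.6,
    ("high_demand", "base_price"): 1.0,
    ("high_demand", "surge_price"): 1.6,
}


def train_q_table(steps=20_000, alpha=0.05, gamma=0.5, epsilon=0.1, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in STATES for a in ACTIONS}
    state = rng.choice(STATES)
    for _ in range(steps):
        # Explore with probability epsilon, otherwise act greedily.
        if rng.random() < epsilon:
            action = rng.choice(ACTIONS)
        else:
            action = max(ACTIONS, key=lambda a: q[(state, a)])
        reward = REWARD[(state, action)]
        # Toy dynamics: next demand level is random and independent of price.
        next_state = rng.choice(STATES)
        # Q-learning update toward r + gamma * max_a' Q(s', a').
        best_next = max(q[(next_state, a)] for a in ACTIONS)
        q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
        state = next_state
    return q


def greedy_policy(q):
    """Best action per state according to the learned Q-table."""
    return {s: max(ACTIONS, key=lambda a: q[(s, a)]) for s in STATES}


print(greedy_policy(train_q_table()))
```

In this toy environment the learned policy keeps the base price when demand is low and surges when demand is high, which matches the reward table by construction.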

Technologies: Python, PyMC, ArviZ, scikit-learn, Streamlit

See details →

Penguin Explorer - Full-stack ML Demo

Problem: How to deploy an ML model end-to-end with modern architecture?

A demonstration project covering a complete ML architecture, from training through the API and frontend to deployment, with R and Python integration.

Architecture:

DuckDB → scikit-learn → Vetiver → FastAPI → Shiny (Python & R)

Highlights:

  • Model Registry with Vetiver and pins
  • REST API auto-generated with OpenAPI documentation
  • Dual frontends: Shiny Python and Shiny R
  • CI/CD with GitHub Actions
  • Complete containerization with Docker Compose

Technologies: Python, R, scikit-learn, Vetiver, FastAPI, Shiny, Docker, GitHub Actions

See details → | GitHub

House Prices Prediction - Harvard Certificate

Problem: Predict house sale prices in Iowa from their characteristics.

Final project for the Harvard Data Science certificate, exploring and comparing several regression approaches.

Models tested:

  • Linear regression (baseline)
  • Random Forest
  • XGBoost
  • GAM (Generalized Additive Models)
  • Neural Networks
  • Ensemble (combination of best models)
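Comparing models like these comes down to scoring each on the same hold-out set and, for the ensemble, combining their predictions. The sketch below uses RMSE and a simple average of two models. The project itself is written in R with caret; Python is used here only for illustration, and the metric choice, function names, and prices are assumptions, not project data.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error between observed and predicted values."""
    return math.sqrt(
        sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    )


def average_ensemble(*predictions):
    """Element-wise mean of several models' predictions."""
    return [sum(vals) / len(vals) for vals in zip(*predictions)]


# Hypothetical hold-out sale prices and predictions from two models.
actual = [200_000, 150_000, 320_000, 180_000]
model_a = [210_000, 140_000, 300_000, 190_000]
model_b = [195_000, 158_000, 335_000, 172_000]

blend = average_ensemble(model_a, model_b)
scores = {
    "model_a": rmse(actual, model_a),
    "model_b": rmse(actual, model_b),
    "ensemble": rmse(actual, blend),
}
for name, score in sorted(scores.items(), key=lambda kv: kv[1]):
    print(f"{name}: {score:,.0f}")
```

In this made-up data the two models err in opposite directions, so averaging cancels part of the error and the blend scores best. That is not guaranteed in general, which is why the ensemble is evaluated alongside the individual models.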

Results: A systematic performance comparison across models and identification of the most important features.

Technologies: R, RMarkdown, caret, xgboost, randomForest

See details →

