40. End-to-End Project: Tabular Prediction (Baseline → Production-ish)
    41. Build a clean training pipeline with scikit-learn Pipelines
    42. Create a strong baseline and compare to boosting models
    43. Package the model + preprocessing for consistent inference
44. Natural Language Processing (Applied)
    45. Text preprocessing and vectorization (TF-IDF, n-grams)
    46. Fine-tune a transformer for text classification
    47. Evaluate NLP systems (F1, calibration, error analysis)
48. Computer Vision (Applied)
    49. Image data pipelines and augmentation
    50. Transfer learning with pretrained CNNs
    51. Evaluate vision models (accuracy, confusion, robustness checks)
52. Time Series and Forecasting (Applied)
    53. Train/validation splits for temporal data (leakage-safe)
    54. Classical forecasting baselines (ARIMA/ETS)
    55. Feature-based ML for forecasting (lags, rolling stats)
56. Interpretability and Debugging Models
    57. Permutation importance and partial dependence (PDP/ICE)
    58. SHAP basics for tabular models
    59. Systematic error analysis (slices, counterfactual tests)
60. Fairness, Privacy, and Responsible ML
    61. Fairness metrics and trade-offs (group parity basics)
    62. Privacy basics (PII handling, differential privacy intro)
    63. Model cards, documentation, and risk assessment
64. Capstone: Portfolio, Write-ups, and Interviews
    65. Write project reports with problem framing, metrics, and lessons learned
    66. Create a portfolio (GitHub repos, demos, reproducible notebooks)
    67. Interview prep: ML system design + modeling questions