18
Problem framing and success metrics
3 subtopics
19
Choose task type (classification, regression, ranking, forecasting)
20
Define baseline and target metric (accuracy, F1, AUROC, RMSE, etc.)
21
Set constraints (latency, memory, interpretability, cost)
22
Data splitting and evaluation hygiene
3 subtopics
23
Train/validation/test splits and leakage patterns
24
Cross-validation (when to use it and pitfalls)
25
Calibration and thresholding for decision-making
26
Generalization, bias–variance, and regularization
3 subtopics
27
Underfitting vs overfitting (diagnosis with learning curves)
28
Regularization methods (L1/L2, early stopping, dropout intuition)
29
Hyperparameter tuning basics (search spaces and budgets)
30
Feature engineering and preprocessing
4 subtopics
31
Scaling/normalization and handling missing values
32
Encoding categorical variables (one-hot, target encoding, embeddings overview)
33
Text and vector representations overview (from bag-of-words to embeddings)
34
Feature selection and dimensionality reduction basics
35
Metrics and error analysis
3 subtopics
36
Confusion matrix, precision/recall, ROC/PR curves
37
Residual analysis for regression and signs of heteroscedasticity
38
Slice-based evaluation (subgroups, rare cases, long tail)
39
Reproducibility and experiment tracking
2 subtopics
40
Random seeds, deterministic ops, and versioning data/code
41
Run logging (configs, metrics) and comparing experiments