176
ML System Design & Data Pipelines
5 subtopics
177
Data ingestion and validation (schemas, checks)
178
Feature stores & offline/online consistency (concepts)
179
Batch vs streaming pipelines; latency and freshness tradeoffs
180
Training pipelines: orchestration and retries (concepts)
181
System design: SLAs/SLOs, fallbacks, and graceful degradation
182
Deployment, Inference & Monitoring
6 subtopics
183
Packaging models for serving (serialization, preprocessing)
184
Serving patterns: online, batch, edge; choosing the right one
185
Monitoring: data drift, concept drift, performance degradation
186
A/B testing and experimentation in production
187
Incident response for ML systems (rollbacks, guardrails)
188
Human-in-the-loop review and feedback loops
189
MLOps & Experiment Tracking
5 subtopics
190
Experiment tracking tools: concepts (metrics, artifacts, lineage)
191
Model registry & lifecycle (staging, prod, rollback)
↗ Version control for data/code (Git + data versioning concepts) (see Chapter 1)
192
CI/CD for ML (tests for data, features, models)
193
Reproducible training with containers (Docker basics for ML)
194
Performance, Scaling & Hardware Basics
4 subtopics
195
Compute basics: CPU vs GPU vs TPU; memory bandwidth
196
Profiling training/inference and finding bottlenecks
197
Distributed training basics (data/model parallelism concepts)
198
Inference optimization: batching, quantization, caching
199
Privacy & Security in ML
4 subtopics
200
Privacy basics: PII, de-identification limits, governance
201
Differential privacy (high-level) and tradeoffs
202
Adversarial examples and robustness overview
203
Secure ML pipelines: secrets, access control, supply chain risks
204
Fairness, Accountability & Transparency
4 subtopics
205
Bias sources: data, labels, measurement, objectives
206
Fairness metrics and tradeoffs (group vs individual)
207
Explainability: global vs local, pitfalls, stakeholder needs
208
Documentation: datasheets, model cards, and audit trails
209
ML Productization & Communication
4 subtopics
210
Communicating results: plots, baselines, and uncertainty
211
Writing a technical ML report (structure + reproducibility checklist)
212
Choosing deployment constraints: latency, cost, privacy, UX
213
Stakeholder alignment: success metrics, guardrails, and iteration plan