Study Path Agent
Agentic AI
149 topics across 7 chapters
Chapter 1
Foundations & mental models
1
LLM basics for agents (what matters in practice)
3 subtopics
2
Context windows, tokens, and truncation failure modes
3
Sampling basics (temperature/top-p) and determinism expectations
4
Structured vs free-form outputs (why agents need structure)
5
Decision-making under uncertainty (agent mindset)
3 subtopics
6
Heuristics, policies, and when to avoid “over-reasoning”
7
MDP/POMDP intuition for partial observability
8
Bandits & exploration vs exploitation (practical lens)
9
Human-in-the-loop UX for agents
3 subtopics
10
Approval flows (confirmations, two-person rules, break-glass)
11
Explanations & transparency (what the user needs to trust actions)
12
Feedback capture loops (labels, corrections, preferences)
13
Prompting & instruction design for controllable agents
4 subtopics
14
System prompts, constraints, and instruction hierarchy
15
Structured outputs: JSON schemas, validation, and repair prompts
16
Few-shot exemplars for tool usage and planning style
17
Prompt anti-patterns (leakage, ambiguity, brittle formatting)
Chapter 2
Agent loop & planning
18
The agent loop: sense → think → act → reflect
3 subtopics
19
Perception/input normalization (parsing, cleaning, grounding)
20
Actuation & side effects (designing safe actions)
21
Termination criteria and “done” detection
22
Planning methods
4 subtopics
23
ReAct-style interleaving of reasoning and acting
24
Tree/graph search planning (beam search, ToT intuition)
25
Constraint-based planning (tools, budgets, policies)
26
Plan execution, monitoring, and re-planning triggers
27
Memory & state management
4 subtopics
28
Short-term state (scratchpad/state object) vs chat history
29
Long-term memory stores (KV stores, vector DBs)
30
Retrieval / RAG for agents
3 subtopics
31
Indexing & chunking for retrieval quality
32
Query rewriting, multi-query, and decomposition for RAG
33
Grounding and source attribution (citations in agent outputs)
34
Summarization & compression strategies for long-running agents
35
Task decomposition & execution graphs
3 subtopics
36
Goal/requirements elicitation (clarifying questions)
37
Designing a subtask graph (dependencies, parallelism)
38
Mapping subtasks to tools/agents (capability-aware scheduling)
↗
Prompting & instruction design for controllable agents
(see Chapter 1)
Chapter 3
Tool use & environment interaction
39
Tool calling (functions), schemas, and routing
4 subtopics
40
Function schema design (inputs/outputs, enums, constraints)
41
Tool selection & routing (rules, classifiers, LLM router)
42
Validation, retries, and repair strategies for tool calls
43
Side-effect safety (idempotency keys, dry-runs, confirmations)
↗
Retrieval / RAG for agents
(see Chapter 2)
44
Web/GUI automation as actions
3 subtopics
45
Browser automation basics (navigation, forms, downloads)
46
Robust selectors & resilience (timeouts, retries, flaky UIs)
47
Constraints & ethics (anti-bot, terms, user consent)
48
Sandboxes, simulators & safe execution environments
2 subtopics
49
Deterministic sandboxing (resource limits, network egress control)
50
Test harnesses & simulators for agents
3 subtopics
51
Mock tools and deterministic fixtures
52
Synthetic users and scripted scenarios
53
Adversarial testing (jailbreaks, injections, worst-case tasks)
54
Data/DB actions (transactions & permissions)
3 subtopics
55
Idempotency, transactions, and exactly-once illusions
56
Schema safety (migrations, forward/backward compatibility)
57
Permissioned queries (row-level security, least privilege)
Chapter 4
Architectures & design patterns
58
Single-agent vs multi-agent systems
3 subtopics
59
When multi-agent helps vs hurts (latency, accuracy, cost)
60
Communication protocols (messages, shared state, contracts)
61
Coordination failures (loops, deadlocks, collusion, drift)
62
Orchestrators & workflow engines
3 subtopics
63
Modeling workflows as DAGs (steps, retries, compensations)
64
Scheduling & queues (workers, priorities, rate limiting)
65
State persistence (checkpoints, resumability)
66
Reflection, critique & verification patterns
3 subtopics
67
Self-critique loops (review then revise)
68
Debate/consensus patterns (multiple proposals, arbitration)
69
Tool-assisted verification (unit checks, validators, theorem-ish)
70
Delegation, roles & handoffs
3 subtopics
71
Role prompting & responsibilities (contracts, boundaries)
72
Task handoff interfaces (artifacts, briefs, acceptance criteria)
73
Budgeting & quotas across sub-agents/tools
74
Event-driven agents (streams, triggers)
3 subtopics
75
Triggers & webhooks (reactive behaviors)
76
Streaming inputs (partial results, incremental decisions)
77
Backpressure and overload handling
Chapter 5
Reliability, safety & alignment
78
Guardrails & policy enforcement
3 subtopics
79
Output filtering & constrained generation (blocklists, allowlists)
80
Policy-as-code (rules engines, centralized governance)
81
Tool access control (capability tokens, scopes, approvals)
82
Error handling, retries & recovery
3 subtopics
83
Retry strategies (exponential backoff, jitter, circuit breakers)
84
Fallback modes (degraded responses, “safe answer only”)
85
Partial failure recovery (compensations, sagas)
86
Security for agents (prompt injection, secrets, tools)
3 subtopics
87
Prompt injection defenses (content isolation, instruction boundaries)
88
Secrets handling & exfiltration prevention (vaulting, redaction)
89
Tool supply-chain risks (untrusted plugins, dependency scanning)
90
Privacy & data governance
3 subtopics
91
PII detection/redaction and minimization
92
Data retention, deletion, and user export requests
93
Consent and user controls for agent actions
94
Human oversight & approvals (operational safety)
3 subtopics
↗
Approval flows (confirmations, two-person rules, break-glass)
(see Chapter 1)
95
Escalation policies (when to page a human)
96
Audit trails for approvals (who/what/when/why)
97
Rate limits, latency & cost control
3 subtopics
98
Token/call budgeting per task (hard and soft limits)
99
Caching and memoization (tool results, retrieval, reasoning artifacts)
100
Batch vs real-time execution (latency-cost tradeoffs)
Chapter 6
Evaluation & observability
101
Agent evaluation methods (offline + online)
3 subtopics
102
Offline evaluation on task suites (replay, deterministic tests)
103
Online evaluation (A/B tests, shadow deployments, canaries)
104
Human evaluation & rubrics (calibration, inter-rater reliability)
↗
Test harnesses & simulators for agents
(see Chapter 3)
105
Metrics: success, safety, latency, cost
4 subtopics
106
Defining “task success” (acceptance criteria, partial credit)
107
Latency SLOs and tail behavior (p95/p99)
108
Cost per task and budget burn-down monitoring
109
Safety metrics (policy violations, risky-tool attempts)
110
Tracing, logging & observability for agent runs
3 subtopics
111
Span tracing across steps and tool calls
112
Prompt/config/version logging for reproducibility
113
Redaction and access control for logs
114
Dataset curation & gold task suites
3 subtopics
115
Collecting real tasks (consent, sampling, representativeness)
116
Labeling guidelines and rubrics for agent outcomes
117
Splits and leakage prevention (train/val/test hygiene)
Chapter 7
Building & deployment (production systems)
118
System design for agentic applications
3 subtopics
119
Component architecture (model, tools, memory, orchestrator)
120
State management and resumability (checkpoints, idempotency keys)
121
Failure domains and blast radius (isolation, kill switches)
122
Production tool integrations
3 subtopics
123
Auth & credential management for tools (OAuth, service accounts)
124
Per-tool rate limits and adaptive throttling
125
Tool SLAs, timeouts, and fallbacks
126
Scalability & concurrency
3 subtopics
127
Parallel tool calls and dependency management
128
Queueing and worker pools (throughput engineering)
129
Concurrency control (locks, optimistic concurrency, dedupe)
130
Monitoring & incident response
3 subtopics
131
Alerting on agent metrics and anomaly detection
132
Runbooks and rollbacks (disable tools, degrade safely)
133
Postmortems and continuous improvement loops
134
CI/CD for prompts, tools, and agent configs
3 subtopics
135
Prompt diffing and regression tests in CI
136
Feature flags for agent behaviors and tools
137
Model/version pinning and compatibility testing
138
Compliance & auditing
3 subtopics
139
Compliance-ready logging (retention, immutability where needed)
140
Access reviews and least-privilege operations
141
Documentation (risk assessments, model cards, SOPs)