This repo’s examples are grouped by intent so you can quickly find the right entrypoint and avoid path confusion.
Canonical guide for the research example: docs/getting-started/research_example.md.
New to the repo? Run these three in order:
1. `python scripts/quickstart_toy.py`
2. `uv run inspect eval examples/tasks/prompt_task.py -T prompt="Write a concise overview of LangGraph"`
3. `uv run python examples/runners/iterative_runner.py --time-limit 120 --max-steps 20 "List repo files and summarize"`

More detail and setup tips: docs/getting-started/inspect_agents_quickstart.md.
- Tasks (inspect eval)
  - `examples/tasks/prompt_task.py`: single-sample ad-hoc prompt task.
  - `examples/tasks/iterative_task.py`: iterative agent (no submit) task.
  - `examples/tasks/research_task.py`: research composition as an Inspect task (optionally via YAML config).
- Runners (python)
  - `examples/runners/supervisor_runner.py`: minimal supervisor runner.
  - `examples/runners/research_runner.py`: research composition runner.
  - `examples/runners/iterative_runner.py`: iterative agent runner (no submit).
  - `examples/runners/profiled_runner.py`: Tx.Hx.Nx profile selector for iterative runs.
- Debug
  - `examples/debug/show_limits.py`: prints the effective tool-output truncation cap and its source.
- Demos
  - `examples/demos/simple_arch_demo.py`: simple architecture demo (supervisor/iterative modes).
  - `examples/demos/subagent_approvals_demo.py`: handoff exclusivity + approvals demo (offline).
  - `examples/demos/exploration_demo.py`: exploration planner demo that prints and writes plan.json.
- Configs
  - `examples/configs/research/supervisor.yaml`: YAML composition for the research supervisor.

- `uv run inspect eval examples/tasks/prompt_task.py -T prompt="Write a concise overview of LangGraph"`
- `uv run inspect eval examples/tasks/iterative_task.py -T prompt="List repo files and propose a small refactor plan"`
- `uv run inspect eval examples/tasks/research_task.py -T config=examples/configs/research/supervisor.yaml -T prompt="Research topic..."`
- `uv run python examples/runners/supervisor_runner.py "What is Inspect‑AI?"`
- `uv run python examples/runners/research_runner.py --enable-web-search "Research LangGraph vs Inspect"`
- `uv run python examples/runners/iterative_runner.py --time-limit 120 --max-steps 20 "List repo files and summarize"`
- `uv run python examples/runners/profiled_runner.py --profile T1.H1.N1 "Curate arXiv papers by Quantinuum (2025)"`
- `uv run python examples/debug/show_limits.py` → prints one line like `Tool-output cap: 16384 bytes (default)`. The source in parentheses is one of: config (active GenerateConfig), env (`INSPECT_MAX_TOOL_OUTPUT`), default (16384 bytes).
- `uv run python examples/demos/simple_arch_demo.py --mode supervisor "Research topic..."`
- `uv run python examples/demos/subagent_approvals_demo.py --preset dev`
- `uv run python examples/demos/exploration_demo.py --breadth 2 --depth 2 --max-queries 6 "Explore Inspect‑AI agent patterns"`

```python
from examples.inspect.exploration.planner import plan, ExplorationConfig as C

items = plan("Explore Inspect‑AI agent patterns", C(breadth=3, depth=2, seed=0, max_queries=8))
for it in items:
    print(it.depth, it.query, it.tags)
```
```bash
uv run python examples/demos/exploration_demo.py --breadth 2 --depth 2 --max-queries 6 \
  "Explore Inspect‑AI agent patterns"
# Prints the plan to stdout and writes plan.json in the CWD
cat plan.json | jq .
```
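The `show_limits.py` entry above reports both the effective cap and where it came from. A minimal sketch of that resolution, assuming the precedence follows the order in which the sources are listed (config, then env, then default). `resolve_tool_output_cap` is a hypothetical helper written for this illustration, not the repo's function.

```python
import os
from typing import Mapping, Optional, Tuple

DEFAULT_CAP_BYTES = 16384  # default value stated in this README


def resolve_tool_output_cap(
    config_value: Optional[int] = None,
    env: Optional[Mapping[str, str]] = None,
) -> Tuple[int, str]:
    """Return (cap_bytes, source); source is 'config', 'env', or 'default'.

    Assumed precedence: config value, then INSPECT_MAX_TOOL_OUTPUT, then default.
    """
    env = os.environ if env is None else env
    if config_value is not None:
        return config_value, "config"
    raw = env.get("INSPECT_MAX_TOOL_OUTPUT")
    if raw is not None and raw.strip().isdigit():
        return int(raw), "env"
    return DEFAULT_CAP_BYTES, "default"


cap, source = resolve_tool_output_cap(env={})
print(f"Tool-output cap: {cap} bytes ({source})")
```

Passing an explicit `env` mapping keeps the sketch testable without touching the real process environment.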
Notes
- `examples/tasks/research_task.py` and `examples/runners/research_runner.py` expose a `planner_tool` to the supervisor so you can plan before searching.

This section documents the example commands with argument tables. Defaults are the values used when you omit the flag. See docs/reference/environment.md for provider/model env and tool toggles.

Tasks (inspect eval)

Usage examples
```bash
# Ad‑hoc prompt task
uv run inspect eval examples/tasks/prompt_task.py -T prompt="Write a concise overview of LangGraph" -T attempts=1

# Iterative agent as a task
uv run inspect eval examples/tasks/iterative_task.py \
  -T prompt="List files and propose a small refactor plan" \
  -T time_limit=300 -T max_steps=30 -T enable_exec=true -T enable_web_search=true

# Research composition (inline or YAML config)
uv run inspect eval examples/tasks/research_task.py \
  -T prompt="Curate arXiv papers Quantinuum published in 2025" \
  -T attempts=1 -T enable_web_search=true -T write_plan=true -T plan_out=plan.json
uv run inspect eval examples/tasks/research_task.py \
  -T config=examples/configs/research/supervisor.yaml -T prompt="Research topic..."
```
prompt_task.py (examples/tasks/prompt_task.py)
| Argument | Type | Default | Description | Related env |
|---|---|---|---|---|
| `prompt` | str | "Find the latest publication from Quantinuum in arxiv" | Single input sample text. | Model/provider/env per docs/reference/environment.md |
| `attempts` | int | 1 | Number of agent attempts (submit semantics). | - |
iterative_task.py (examples/tasks/iterative_task.py)
| Argument | Type | Default | Description | Related env |
|---|---|---|---|---|
| `prompt` | str | "List repository files and summarize key modules." | Task prompt text. | - |
| `time_limit` | int (sec) | 600 | Real-time budget for the loop. | - |
| `max_steps` | int | 40 | Maximum reasoning/tool steps. | - |
| `enable_exec` | bool | false | Enables `bash()` and `python()` tools. | Sets `INSPECT_ENABLE_EXEC=1` |
| `enable_web_search` | bool | false | Enables `web_search()` tool. | Sets `INSPECT_ENABLE_WEB_SEARCH=1` + provider keys |
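
The "Related env" column above says the boolean task arguments set environment toggles. A minimal sketch of that mapping follows. `toggles_for` is a hypothetical helper that covers only the two variables named in the table, and it does not reflect how the task actually applies them.

```python
from typing import Dict


def toggles_for(enable_exec: bool = False, enable_web_search: bool = False) -> Dict[str, str]:
    """Return the env toggles implied by the boolean task arguments."""
    env: Dict[str, str] = {}
    if enable_exec:
        env["INSPECT_ENABLE_EXEC"] = "1"  # bash() and python() tools
    if enable_web_search:
        env["INSPECT_ENABLE_WEB_SEARCH"] = "1"  # web_search(); provider keys are still required
    return env


print(toggles_for(enable_exec=True))
```

With both arguments left at their `false` defaults the mapping is empty, which matches the table: no tool is enabled unless you opt in.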
research_task.py (examples/tasks/research_task.py)
| Argument | Type | Default | Description | Related env |
|---|---|---|---|---|
| `prompt` | str | "Write a short overview of Inspect-AI" | Task prompt text. | - |
| `attempts` | int | 1 | Agent attempts (submit semantics). | - |
| `config` | str/path | - | Optional YAML composition (overrides inline defaults). | - |
| `enable_web_search` | bool | false | Enables `web_search()` in the composition. | Sets `INSPECT_ENABLE_WEB_SEARCH=1` + provider keys |
| `write_plan` | bool | false | Pre-plan and write plan.json using the examples planner. | - |
| `plan_out` | str/path | plan.json | Output path for the plan JSON. | - |
Planner tool
- The supervisor is given a `planner_tool` that returns a deterministic JSON exploration plan: `{ breadth, depth, queries: [{ query, depth, tags }] }`.
- To write the plan to disk, pass `-T write_plan=true -T plan_out=plan.json` (see Quick Start above). This runs the same planning logic offline before the agent executes.
- The research runner (`examples/runners/research_runner.py`) also loads `planner_tool` into the supervisor; no extra flags are required. Use the demo (`examples/demos/exploration_demo.py`) or the task's `write_plan` flag if you want a saved plan.json during a runner-based flow.

Tip: Standard tools (think, web_search, etc.) are available via env toggles; see docs/reference/environment.md and docs/tools/*.
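
The plan shape described above (`breadth`, `depth`, and a `queries` list of `query`/`depth`/`tags` items) is easy to post-process. A minimal sketch that groups queries by depth follows. The sample JSON is hand-written for illustration and is not real planner output, and `queries_by_depth` is a hypothetical helper.

```python
import json
from collections import defaultdict
from typing import Dict, List

# Hand-written sample in the documented shape; not real planner output.
SAMPLE_PLAN = """
{
  "breadth": 2,
  "depth": 2,
  "queries": [
    {"query": "agent patterns overview", "depth": 1, "tags": ["overview"]},
    {"query": "supervisor handoffs", "depth": 2, "tags": ["supervisor"]},
    {"query": "iterative loops", "depth": 2, "tags": ["iterative"]}
  ]
}
"""


def queries_by_depth(plan: dict) -> Dict[int, List[str]]:
    """Group query strings by their depth level, preserving plan order."""
    grouped: Dict[int, List[str]] = defaultdict(list)
    for item in plan["queries"]:
        grouped[item["depth"]].append(item["query"])
    return dict(grouped)


plan = json.loads(SAMPLE_PLAN)
for depth, queries in sorted(queries_by_depth(plan).items()):
    print(depth, queries)
```

To run this against a saved plan, replace `SAMPLE_PLAN` with the contents of the file written via `plan_out`.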
See also
Runners (python)

Supervisor runner (examples/runners/supervisor_runner.py)
uv run python examples/runners/supervisor_runner.py "Write a short overview of LangGraph"
| Argument | Type | Default | Description | Related env |
|---|---|---|---|---|
| `prompt` | str (positional, optional) | `$PROMPT` or example text | User prompt; falls back to `$PROMPT` then an in-file default. | `PROMPT` |
| `--provider` | str | `DEEPAGENTS_MODEL_PROVIDER` or lm-studio | Model provider routing. | `DEEPAGENTS_MODEL_PROVIDER` |
| `--model` | str | `INSPECT_EVAL_MODEL` or unset | Explicit model id; may include provider prefix. | `INSPECT_EVAL_MODEL` |
| `--enable-think` | flag | false | Enable `think()` tool. | `INSPECT_ENABLE_THINK=1` |
| `--enable-web-search` | flag | false | Enable `web_search()` tool. | `INSPECT_ENABLE_WEB_SEARCH=1` + search keys |
| `--enable-exec` | flag | false | Enable `bash()` and `python()` tools. | `INSPECT_ENABLE_EXEC=1` |
| `--enable-web-browser` | flag | false | Enable browser tools. | `INSPECT_ENABLE_WEB_BROWSER=1` |
| `--enable-text-editor-tool` | flag | false | Expose editor tool directly. | `INSPECT_ENABLE_TEXT_EDITOR_TOOL=1` |

Tip: If you see "No sandbox environment has been provided …" after enabling exec, add a sandbox (`--sandbox local` for Inspect CLI) or use the profiled runner. See: docs/how-to/inspect_sandbox.md.
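
The `prompt` row above describes a three-step fallback: positional argument, then `$PROMPT`, then an in-file default. A minimal sketch of that chain with `argparse` follows. `resolve_prompt` and the placeholder default string are assumptions for illustration, since the runner's actual default text is not shown here.

```python
import argparse
import os
from typing import List, Mapping, Optional

# Placeholder for this sketch; the runner's real in-file default is not shown in this README.
IN_FILE_DEFAULT = "Write a short overview of LangGraph"


def resolve_prompt(argv: List[str], env: Optional[Mapping[str, str]] = None) -> str:
    """Positional prompt wins, then $PROMPT, then the in-file default."""
    env = os.environ if env is None else env
    parser = argparse.ArgumentParser()
    parser.add_argument("prompt", nargs="?", default=None)  # optional positional
    args = parser.parse_args(argv)
    return args.prompt or env.get("PROMPT") or IN_FILE_DEFAULT


print(resolve_prompt([], env={"PROMPT": "What is Inspect-AI?"}))
```

Using `or` means an empty positional or empty `$PROMPT` also falls through to the next source, which is usually the friendlier behavior for a CLI.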
Iterative runner (examples/runners/iterative_runner.py)
uv run python examples/runners/iterative_runner.py --time-limit 300 --max-steps 20 "List repo files and summarize"
| Argument | Type | Default | Description | Related env |
|---|---|---|---|---|
| `prompt` | str (positional, optional) | `$PROMPT` or example text | User prompt; falls back to `$PROMPT`. | `PROMPT` |
| `--time-limit` | int (sec) | 600 | Real-time budget for the loop. | - |
| `--max-steps` | int | 40 | Maximum reasoning/tool steps. | - |
| `--enable-exec` | flag | false | Enable `bash()` and `python()` tools. | `INSPECT_ENABLE_EXEC=1` |
| `--provider` | str | `DEEPAGENTS_MODEL_PROVIDER` or ollama | Model provider routing. | `DEEPAGENTS_MODEL_PROVIDER` |
| `--model` | str | `INSPECT_EVAL_MODEL` or unset | Explicit model id; may include provider prefix. | `INSPECT_EVAL_MODEL` |
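
The `--time-limit` and `--max-steps` rows above describe two independent budgets for the iterative loop. A minimal sketch of a loop bounded by both follows. `run_loop` and its stop-reason strings are hypothetical, and the runner's actual checks may be ordered differently.

```python
import time
from typing import Callable


def run_loop(step: Callable[[int], bool], time_limit: float = 600, max_steps: int = 40) -> str:
    """Call step(i) until it returns True or a budget is exhausted.

    Returns the stop reason: 'done', 'max_steps', or 'time_limit'.
    """
    deadline = time.monotonic() + time_limit  # monotonic clock is immune to wall-clock jumps
    for i in range(max_steps):
        if time.monotonic() >= deadline:
            return "time_limit"
        if step(i):
            return "done"
    return "max_steps"


# A step function that reports completion on its third call.
print(run_loop(lambda i: i == 2, time_limit=5, max_steps=10))
```

Whichever budget runs out first ends the loop, so a small `--max-steps` can stop a run long before the time limit is reached.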
Research runner (examples/runners/research_runner.py)
uv run python examples/runners/research_runner.py --enable-web-search "What is Inspect‑AI?"
| Argument | Type | Default | Description | Related env |
|---|---|---|---|---|
| `prompt` | str (positional, optional) | `$PROMPT` or example text | User prompt; falls back to `$PROMPT`. | `PROMPT` |
| `--provider` | str | `DEEPAGENTS_MODEL_PROVIDER` or ollama | Model provider routing. | `DEEPAGENTS_MODEL_PROVIDER` |
| `--model` | str | `INSPECT_EVAL_MODEL` or unset | Explicit model id; may include provider prefix. | `INSPECT_EVAL_MODEL` |
| `--enable-web-search` | flag | false | Enable `web_search()` tool. | `INSPECT_ENABLE_WEB_SEARCH=1` + search keys |
| `--approval` | enum | - | Apply approvals preset (dev, ci, or prod). | - |
| `--config` | str/path | - | Load composition from YAML. | - |
Planner tool
The research runner exposes `planner_tool` to the supervisor, enabling "plan before search" behavior out of the box. To capture a plan file, either run the planner demo (`examples/demos/exploration_demo.py`) separately or use the Inspect task variant with `-T write_plan=true`.

Profiled runner (examples/runners/profiled_runner.py)
uv run python examples/runners/profiled_runner.py --profile T1.H1.N1 "Curate arXiv papers by Quantinuum (2025)"
| Argument | Type | Default | Description | Related env |
|---|---|---|---|---|
| `prompt` | str (positional, optional) | `$PROMPT` or example text | User prompt; falls back to `$PROMPT`. | `PROMPT` |
| `--profile` | Tx.Hx.Nx | T1.H1.N1 | Profile selector: T=Tooling (T0 unrestricted exec, T1 restricted web, T2 no exec); H=Host (H0 local, H1 docker, H2 k8s, H3 proxmox); N=Network (N0 full, N1 allow-listed, N2 no external). | Sets tool/env toggles; maps H→Task.sandbox |
| `--tooling` | T0, T1, or T2 | - | Override T component only. | Sets `INSPECT_ENABLE_*` accordingly |
| `--host` | H0, H1, H2, or H3 | - | Override H component only. | Maps to Inspect sandbox (local, docker, k8s, proxmox) |
| `--net` | N0, N1, or N2 | - | Override N component only. | For cluster/network policy integration |
| `--approval` | enum | dev | Approvals preset (ci, dev, or prod). | - |
| `--time-limit` | int (sec) | 120 | Real-time budget for the loop. | - |
| `--max-steps` | int | 20 | Maximum reasoning/tool steps. | - |
| `--enable-browser` | flag | false | Also enable browser tools (T0 only recommended). | `INSPECT_ENABLE_WEB_BROWSER=1` |
| `--enable-web-search` | flag | false | Enable `web_search()` tool. | `INSPECT_ENABLE_WEB_SEARCH=1` + search keys |
| `--log-dir` | path | `INSPECT_LOG_DIR` or ./logs | Where to write logs and traces. | `INSPECT_LOG_DIR`, `INSPECT_TRACE_FILE` |
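
The `--profile` selector and its per-component overrides can be read as a small parsing problem. A minimal sketch follows, using the component ranges and the host-to-sandbox names from the table above. `resolve_profile` and the returned dictionary layout are hypothetical and not the runner's actual code.

```python
import re
from typing import Dict, Optional

# Host-to-sandbox names as listed in the table above.
SANDBOX_BY_HOST: Dict[str, str] = {"H0": "local", "H1": "docker", "H2": "k8s", "H3": "proxmox"}
PROFILE_RE = re.compile(r"^(T[0-2])\.(H[0-3])\.(N[0-2])$")


def _split(profile: str) -> tuple:
    """Validate a Tx.Hx.Nx string and return its three components."""
    match = PROFILE_RE.match(profile)
    if match is None:
        raise ValueError(f"invalid profile: {profile!r}")
    return match.groups()


def resolve_profile(
    profile: str = "T1.H1.N1",
    tooling: Optional[str] = None,
    host: Optional[str] = None,
    net: Optional[str] = None,
) -> Dict[str, str]:
    """Split the selector, apply per-component overrides, and map H to a sandbox."""
    t, h, n = _split(profile)
    # Re-validate after merging so a bad override is rejected too.
    t, h, n = _split(f"{tooling or t}.{host or h}.{net or n}")
    return {"tooling": t, "host": h, "net": n, "sandbox": SANDBOX_BY_HOST[h]}


print(resolve_profile("T1.H1.N1", host="H0"))
```

Here `--host H0` overrides only the H component, so the tooling and network parts of `T1.H1.N1` are kept while the sandbox changes from docker to local.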
simple_arch_demo.py (examples/demos/simple_arch_demo.py)
uv run python -m examples.demos.simple_arch_demo --mode supervisor "Research topic ..."
| Argument | Type | Default | Description |
|---|---|---|---|
| `task` | str (positional) | - | The demo task string. |
| `--mode` | enum | supervisor | Choose agent style (supervisor or iterative). |
| `--dev-approvals` | flag | false | Enable dev approvals preset (handoff exclusivity, kill-switch). |
subagent_approvals_demo.py (examples/demos/subagent_approvals_demo.py)
uv run python examples/demos/subagent_approvals_demo.py --preset dev
| Argument | Type | Default | Description | Related env |
|---|---|---|---|---|
| `--preset` | enum | dev | Approvals preset (ci, dev, or prod). | Default can be set with `DEMO_APPROVAL_PRESET` |
- The example scripts set `REPO_ROOT = Path(__file__).resolve().parents[2]` so local `src/` code is imported from this repo.
- They load `.env` from the repo root and their own folder (if present). You can also pass `--env-file` to runners that support it or set `INSPECT_ENV_FILE`.
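
To see why `parents[2]` lands on the repo root, here is a minimal sketch of the path arithmetic. The checkout location `/work/repo` is hypothetical, and `PurePosixPath` is used so the example runs identically on any OS. A real script would start from `Path(__file__).resolve()` and, presumably, put `src/` on `sys.path`.

```python
from pathlib import PurePosixPath

# Hypothetical checkout location, used only to show the arithmetic.
script = PurePosixPath("/work/repo/examples/runners/iterative_runner.py")

# parents[0] is the script's folder, parents[1] is examples/, parents[2] is the repo root.
repo_root = script.parents[2]
src_dir = repo_root / "src"

print(repo_root)  # /work/repo
print(src_dir)    # /work/repo/src
```

The same index works for any script two folders below the root, such as those under `examples/tasks/` or `examples/demos/`.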