Minimal domain contract, claim rules, and the reusable multi-domain shell.
A domain is not complete until it can declare what mode produced every answer, where the evidence came from, and which questions are the fixed validation surface. Those three things together make a domain result interpretable rather than merely plausible.
To support the offline grounded-agent pattern, a domain must define four packs.
Every run or answer surface should record, where applicable:
| Field | Purpose |
|---|---|
domain |
Which domain produced this result |
mode |
raw / grounded / artifact / implementation_agent |
question_set or request_set |
Which fixed surface this run covers |
model |
Which model was used for any model-facing steps |
runtime |
Elapsed time for reproducibility comparison |
evidence_source |
What substrate or artifact the answer drew from |
source_type |
database / file / artifact / live-tool |
source_references |
Specific files, tables, or artifact IDs |
tool_path |
Ordered list of tools or queries that ran |
validation_status |
Whether the answer was checked against the validation surface |
snapshot_boundary |
Data freshness boundary if applicable |
This schema is the minimum that keeps results interpretable later. Partial records are acceptable — a grounded run may not have a tool_path if the grounding was a static seed. What matters is that every field that applies is present.
Use these rules consistently across all Phoenix domain reporting.
| Mode | What it justifies | What it does not justify |
|---|---|---|
raw |
Baseline model behavior observation | Claims about domain usefulness |
grounded |
Claims about constrained usefulness | Claims about independent model reasoning |
artifact |
Claims about validated answer availability | Claims about live execution or tool capability |
implementation_agent |
Claims about controlled workflow capability | Claims about raw-model strength outside the workflow |
Do not collapse these into one vague category like "local LLM result." That compression makes every claim uninterpretable. The claim rules are the accountability layer for the whole pattern.
The current implementation exposes a shared multi-domain entrypoint at domains/offline_agent_cli.py. It routes to domain-specific handlers while sharing the provenance schema and mode dispatch logic.
touragent — tennis domain analytics, full reference implementationiso13485 — ISO standards lookupglobaltemperature — historical climate recordsThe CLI normalizes the legacy agent flag to artifact internally. The audience-facing label and the internal implementation label are deliberately separated — artifact is the accurate technical description; agent was the original CLI keyword retained for backward compatibility.
The broader Phoenix architecture should be:
That is the clearest reusable outcome of the TourAgent work. The near-term goal is not to turn every domain into a full offline agent at once — it is to identify which domains are good candidates, apply the same explanation and validation discipline, and reuse the pattern where the substrate justifies it.
Stronger models extend the range of what grounding can accomplish. They do not eliminate the need for a grounded substrate underneath.