Offline Grounded Domain Agent

A domain is not complete until it can declare which mode produced each answer, where the evidence came from, and which questions form the fixed validation surface. Those three things together make a domain result interpretable rather than merely plausible.

Minimal Domain Contract

To support the offline grounded-agent pattern, a domain must define four packs.

Domain Pack

Domain purpose — what question surface this domain answers
Deterministic substrate — what local data and logic it draws from
Fixed validation surface — the minimum question or request set
Time or snapshot boundary if the data has a freshness constraint

Grounding Pack

Verified seed or evidence builder — how the domain produces local context
Prompt template — how that context is injected before the model sees the question
Output schema — what fields the grounded answer must include
Logging path — where grounded run records go

Artifact Pack

Saved validated answer artifact — the frozen or overlaid answer surface
Artifact provenance fields — mode, snapshot, tool path, validation status
Rerun or update policy — when the artifact should be regenerated

Implementation-Agent Pack

Deterministic runner — the workflow script or module
Validation checks — what the runner verifies before accepting an answer
Error handling — how the runner signals and recovers from tool failure
Artifact preservation — how results are saved and identified
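
The four packs can be pictured as a checklist that a domain fills in. The following is a minimal sketch, not the project's actual code: the `DomainContract` class, the `REQUIRED_ITEMS` keys, and the `missing_items` helper are hypothetical names chosen to mirror the lists above. The snapshot boundary is left out of the required items because it only applies when the data has a freshness constraint.

```python
from dataclasses import dataclass, field

# Hypothetical item names, one per required entry in the four packs above.
REQUIRED_ITEMS = {
    "domain_pack": [
        "purpose", "deterministic_substrate", "fixed_validation_surface",
    ],
    "grounding_pack": [
        "evidence_builder", "prompt_template", "output_schema", "logging_path",
    ],
    "artifact_pack": [
        "validated_answer_artifact", "provenance_fields", "rerun_policy",
    ],
    "implementation_agent_pack": [
        "deterministic_runner", "validation_checks",
        "error_handling", "artifact_preservation",
    ],
}


@dataclass
class DomainContract:
    name: str
    # Maps pack name -> {item name -> description or path}.
    packs: dict = field(default_factory=dict)

    def missing_items(self):
        """Return 'pack.item' strings for every required item not yet declared."""
        missing = []
        for pack, items in REQUIRED_ITEMS.items():
            declared = self.packs.get(pack, {})
            for item in items:
                if not declared.get(item):
                    missing.append(f"{pack}.{item}")
        return missing
```

A domain would count as contract-complete under this sketch when `missing_items()` returns an empty list.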

Required Recorded Fields

Every run or answer surface should record, where applicable:

Field	Purpose
`domain`	Which domain produced this result
`mode`	raw / grounded / artifact / implementation_agent
`question_set` or `request_set`	Which fixed surface this run covers
`model`	Which model was used for any model-facing steps
`runtime`	Elapsed time in seconds, recorded so runs can be compared
`evidence_source`	What substrate or artifact the answer drew from
`source_type`	database / file / artifact / live-tool
`source_references`	Specific files, tables, or artifact IDs
`tool_path`	Ordered list of tools or queries that ran
`validation_status`	Whether the answer was checked against the validation surface
`snapshot_boundary`	Data freshness boundary if applicable

This schema is the minimum that keeps results interpretable later. Partial records are acceptable: a grounded run may have no `tool_path` if the grounding was a static seed. What matters is that every applicable field is present.
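
One way to hold such a record in code is a dataclass whose optional fields default to `None`. This is an illustrative sketch only: the `RunRecord` class and its `to_dict` method are hypothetical, and treating `domain` and `mode` as always required is an assumption, not something the schema above mandates. For brevity it uses `question_set` only; a `request_set` field could be added the same way.

```python
from dataclasses import dataclass, asdict
from typing import List, Optional

# Mode values taken from the `mode` row of the table above.
VALID_MODES = {"raw", "grounded", "artifact", "implementation_agent"}


@dataclass
class RunRecord:
    # Assumption: these two fields are treated as always applicable.
    domain: str
    mode: str
    # Everything else is optional, to allow partial records.
    question_set: Optional[str] = None
    model: Optional[str] = None
    runtime: Optional[float] = None
    evidence_source: Optional[str] = None
    source_type: Optional[str] = None
    source_references: Optional[List[str]] = None
    tool_path: Optional[List[str]] = None
    validation_status: Optional[str] = None
    snapshot_boundary: Optional[str] = None

    def __post_init__(self):
        if self.mode not in VALID_MODES:
            raise ValueError(f"unknown mode: {self.mode!r}")

    def to_dict(self):
        """Serialize only the fields that apply, i.e. those that are not None."""
        return {k: v for k, v in asdict(self).items() if v is not None}
```

For example, a grounded run backed by a static seed simply omits `tool_path`:

```python
record = RunRecord(domain="touragent", mode="grounded",
                   evidence_source="static seed")
print(record.to_dict())
# {'domain': 'touragent', 'mode': 'grounded', 'evidence_source': 'static seed'}
```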

Claim Rules

Use these rules consistently across all Phoenix domain reporting.

Mode	What it justifies	What it does not justify
`raw`	Baseline model behavior observation	Claims about domain usefulness
`grounded`	Claims about constrained usefulness	Claims about independent model reasoning
`artifact`	Claims about validated answer availability	Claims about live execution or tool capability
`implementation_agent`	Claims about controlled workflow capability	Claims about raw-model strength outside the workflow

Do not collapse these into one vague category like "local LLM result." That compression makes every claim uninterpretable. The claim rules are the accountability layer for the whole pattern.
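
The claim rules can also be kept as data, so that reporting code looks up what a mode supports instead of restating it by hand. The sketch below is a hypothetical illustration: `CLAIM_RULES` and `claim_scope` are invented names, and the strings are copied from the table above.

```python
# mode -> (what it justifies, what it does not justify), copied from the table.
CLAIM_RULES = {
    "raw": (
        "baseline model behavior observation",
        "claims about domain usefulness",
    ),
    "grounded": (
        "claims about constrained usefulness",
        "claims about independent model reasoning",
    ),
    "artifact": (
        "claims about validated answer availability",
        "claims about live execution or tool capability",
    ),
    "implementation_agent": (
        "claims about controlled workflow capability",
        "claims about raw-model strength outside the workflow",
    ),
}


def claim_scope(mode):
    """Return the claim scope for a mode, refusing any unlisted category."""
    if mode not in CLAIM_RULES:
        raise ValueError(
            f"unknown mode {mode!r}; expected one of {sorted(CLAIM_RULES)}"
        )
    justifies, does_not_justify = CLAIM_RULES[mode]
    return {
        "mode": mode,
        "justifies": justifies,
        "does_not_justify": does_not_justify,
    }
```

Rejecting unknown modes outright is a design choice in this sketch: a catch-all label such as "local LLM result" raises an error rather than silently passing through.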

Reusable CLI — Multi-Domain Shell

The current implementation exposes a shared multi-domain entrypoint at `domains/offline_agent_cli.py`. It routes to domain-specific handlers while sharing the provenance schema and mode dispatch logic.

Currently supported domains

touragent — tennis domain analytics, full reference implementation
iso13485 — ISO standards lookup
globaltemperature — historical climate records
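
Routing to domain-specific handlers could be organized as a simple registry keyed by domain name. The sketch below does not reproduce the internals of `domains/offline_agent_cli.py`, which are not shown here. The `Handler` signature, `register`, `make_stub`, and `route` are all hypothetical, and the handlers are placeholders that return a labeled string instead of querying a real substrate.

```python
from typing import Callable, Dict

# Hypothetical handler signature: (mode, question) -> answer text.
Handler = Callable[[str, str], str]

HANDLERS: Dict[str, Handler] = {}


def register(domain: str, handler: Handler) -> None:
    """Add a domain handler to the shared registry."""
    HANDLERS[domain] = handler


def make_stub(domain: str) -> Handler:
    """Build a placeholder handler; a real one would query the local substrate."""
    def handler(mode: str, question: str) -> str:
        return f"[{domain}/{mode}] placeholder answer for: {question}"
    return handler


# The three domain names listed above.
for name in ("touragent", "iso13485", "globaltemperature"):
    register(name, make_stub(name))


def route(domain: str, mode: str, question: str) -> str:
    """Dispatch to the registered handler, rejecting unknown domains."""
    if domain not in HANDLERS:
        raise ValueError(
            f"unsupported domain {domain!r}; expected one of {sorted(HANDLERS)}"
        )
    return HANDLERS[domain](mode, question)
```

With this shape, adding a domain means registering one more handler; the dispatch and error path stay shared.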

Shared output schema

Mode: {mode}
Question: {question}
{answer}
Runtime: {runtime_seconds:.2f}s
Provenance: {provenance}
Validation: {validation_status} (when applicable)
Source Type: {source_type} (when applicable)
Tool Path: {tool_path} (when applicable)
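
The template above can be rendered with a small formatter that drops the "when applicable" lines when no value is supplied. This is a sketch under assumptions: `format_result` is a hypothetical function, and joining `tool_path` entries with `" -> "` is an illustrative choice, since the template does not say how the list is displayed.

```python
def format_result(mode, question, answer, runtime_seconds, provenance,
                  validation_status=None, source_type=None, tool_path=None):
    """Render one answer using the shared output schema."""
    lines = [
        f"Mode: {mode}",
        f"Question: {question}",
        answer,
        f"Runtime: {runtime_seconds:.2f}s",
        f"Provenance: {provenance}",
    ]
    # Optional lines appear only when the field applies to this run.
    if validation_status is not None:
        lines.append(f"Validation: {validation_status}")
    if source_type is not None:
        lines.append(f"Source Type: {source_type}")
    if tool_path:
        # Assumption: the ordered tool list is shown as a single arrow-joined line.
        lines.append("Tool Path: " + " -> ".join(tool_path))
    return "\n".join(lines)
```

A run with no validation status, source type, or tool path produces only the first five lines.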

Mode normalization

The CLI normalizes the legacy `agent` flag to `artifact` internally. The audience-facing label and the internal implementation label are deliberately separate: `artifact` is the accurate technical description, while `agent` is the original CLI keyword, retained for backward compatibility.
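
The normalization step can be as small as an alias table applied before dispatch. The sketch below is hypothetical and is not taken from the CLI source: `normalize_mode` and `LEGACY_ALIASES` are invented names, the trimming and lowercasing of input is an added assumption, and the set of accepted modes is taken from the recorded-fields table, which may not match what the CLI actually accepts.

```python
# Legacy CLI keyword -> internal mode label.
LEGACY_ALIASES = {"agent": "artifact"}

# Assumption: accepted modes match the `mode` values in the recorded-fields table.
VALID_MODES = {"raw", "grounded", "artifact", "implementation_agent"}


def normalize_mode(cli_mode):
    """Map a user-supplied mode keyword to its internal label."""
    mode = cli_mode.strip().lower()
    mode = LEGACY_ALIASES.get(mode, mode)
    if mode not in VALID_MODES:
        raise ValueError(f"unknown mode: {cli_mode!r}")
    return mode
```

Under this sketch, `normalize_mode("agent")` returns `"artifact"`, so provenance records carry the accurate label even when the legacy keyword was typed.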

Working Conclusion

The broader Phoenix architecture should be:

offline grounded domain agents built on deterministic local substrates, with grounded answering as the default, implementation-agent execution as the escalation path, and explicit provenance on every answer surface.

That is the clearest reusable outcome of the TourAgent work. The near-term goal is not to turn every domain into a full offline agent at once — it is to identify which domains are good candidates, apply the same explanation and validation discipline, and reuse the pattern where the substrate justifies it.

Stronger models extend the range of what grounding can accomplish. They do not eliminate the need for a grounded substrate underneath.