Built on ideas and patterns from tau-bench (Sierra Research, MIT License)
An expert system framework — deterministic substrates, explicit validation, and disciplined agent use.
Project Phoenix publishes principles, standards, and white papers for building grounded domain systems that are inspectable, testable, and useful in practice.
It is not a generic chatbot layer. It is a framework for disciplined domain interfaces, deterministic validation, and carefully bounded agent use.
For grounded domain tasks, harness configuration is the binding constraint — not model identity.
A system with grounded substrates, explicit routing, and a validated output contract will produce consistent results across model families. A system without those things will produce inconsistent results regardless of how capable the model is. This claim has a scope: it holds for well-defined domain tasks where deterministic substrates exist. It does not claim model identity is irrelevant for open-ended reasoning.
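To make the claim above concrete, here is a minimal sketch of a harness in which the model only proposes a structured request, an output contract validates it, and a deterministic substrate computes the answer. Every name in it (`SHIPPING_RATES`, `compute_quote`, `validate_request`, `run_harness`, and the three stand-in models) is hypothetical and chosen for illustration. None of it is taken from the Phoenix papers or codebase.

```python
import json
from dataclasses import dataclass
from typing import Callable

# Hypothetical deterministic substrate: a fixed rate table owned by domain code.
SHIPPING_RATES = {"standard": 5.00, "express": 12.50}


@dataclass(frozen=True)
class Quote:
    method: str
    cost: float


def compute_quote(method: str, items: int) -> Quote:
    """Deterministic domain logic: the model never computes the price."""
    if method not in SHIPPING_RATES:
        raise ValueError(f"unknown shipping method: {method!r}")
    if items < 1:
        raise ValueError("items must be >= 1")
    return Quote(method=method, cost=round(SHIPPING_RATES[method] * items, 2))


def validate_request(raw: str) -> dict:
    """Output contract: the model must emit JSON with exactly these fields."""
    data = json.loads(raw)  # JSONDecodeError is a subclass of ValueError
    if not isinstance(data, dict) or set(data) != {"method", "items"}:
        raise ValueError("contract violation: expected keys 'method' and 'items'")
    items_ok = isinstance(data["items"], int) and not isinstance(data["items"], bool)
    if not isinstance(data["method"], str) or not items_ok:
        raise ValueError("contract violation: wrong field types")
    return data


def run_harness(model: Callable[[str], str], prompt: str) -> Quote:
    """Route model output through the contract, then the substrate."""
    request = validate_request(model(prompt))
    return compute_quote(request["method"], request["items"])


# Stand-in "models" (assumptions for illustration, not real model calls).
def frontier_model(prompt: str) -> str:
    return '{"method": "express", "items": 3}'


def local_model(prompt: str) -> str:
    return '{"items": 3, "method": "express"}'


def sloppy_model(prompt: str) -> str:
    return "Sure! Express shipping for 3 items costs about $40."


if __name__ == "__main__":
    prompt = "Quote express shipping for 3 items."
    print(run_harness(frontier_model, prompt))
    print(run_harness(local_model, prompt))
    try:
        run_harness(sloppy_model, prompt)
    except ValueError as err:
        print("rejected:", err)
```

In this sketch the two well-formed stand-ins yield the same `Quote` even though their raw text differs, because the price comes from the substrate and not from the model. The free-text reply is rejected at the contract boundary instead of being passed through as an answer.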
Local models are where this principle is proved rather than assumed. Frontier models can compensate for a weak harness — so their results do not distinguish harness quality from model capability. Local models cannot compensate. When a constrained local model converges with a stronger one at the semantic usefulness level — which Papers 1.11 and 1.13 demonstrate — the harness is doing the work. That is the finding. Local model empirics are the stress test that makes it falsifiable.
Push real execution and validation into explicit domain logic instead of relying only on prompt behavior.
Make claims, artifacts, and trust boundaries inspectable rather than implicit.
Provide patterns other people can adapt without needing to copy the full consulting process.
Current umbrella paper for Project Phoenix, with Papers 1.16 and 1.17 now featured as the active measurement-integrity and operator-shell line.
Open Current Paper
Seventeen primary papers across grounded systems, orchestration, local inference, operating discipline, measurement integrity, and operator infrastructure.
Open Research Papers
OpenClaw as the monitored operator shell. Phoenix as the deterministic authority underneath. Four backends, incident mode, per-backend health, and trend visibility.
Open OpenClaw Demo