Find your entry point — exploring, building, or scaling.
Project Phoenix is designed for a specific situation. Before going further, check whether your problem fits the regime where Phoenix has been proved to work.
Phoenix works when:
Choose based on where you are right now — not where you want to end up.
For: anyone new to Phoenix, skeptical, or evaluating
Start with the TourAgent live demo. Ten tennis questions with repeatable, grounded answers. The demo shows the deterministic substrate approach in action — what changes when the harness does the work instead of the model.
After the demo, the research papers give you the evidence behind the pattern across multiple domains.
Open TourAgent Demo →
For: practitioners with a domain problem in hand
A Phoenix V1 has four steps: define your task class, identify your substrate, write the output contract, and build the verification step. The guide below walks through each one.
Read the V1 guide →
For: larger teams, high-stakes domains, or multi-domain systems
Complex implementations — multiple domains, high correctness requirements, existing infrastructure to integrate — benefit from direct engagement. The private consulting layer covers domain analysis, harness design, and adaptation heuristics that are not in the open-core materials.
Start a conversation →
A domain is a bounded problem space where the system must reliably answer a defined class of questions using verifiable, grounded inputs. The domain is not the subject matter; it is the combination of task class, substrate, and correctness standard.
TourAgent's domain is tennis tournament data: who won, what the score was, who played whom. The task class is narrow. The substrate is real match records. Correctness is binary. That combination is what makes the harness tractable.
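As a rough illustration, the three-part definition of a domain can be written down as a small record. This is a sketch only: the `Domain` class, its field names, and the `is_bounded` check are hypothetical and are not part of any Phoenix material. The field values for TourAgent are taken from the description above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Domain:
    """Hypothetical record: a domain is task class + substrate + correctness standard."""
    task_class: str
    substrate: str
    correctness_standard: str

    def is_bounded(self) -> bool:
        # Treat a domain as defined only when all three parts are written down.
        parts = (self.task_class, self.substrate, self.correctness_standard)
        return all(part.strip() for part in parts)


# Values paraphrased from the TourAgent description; the structure is an assumption.
tour_agent = Domain(
    task_class="who won, what the score was, who played whom",
    substrate="real match records",
    correctness_standard="binary",
)
```

Naming only a subject matter, with no substrate or correctness standard, leaves the record incomplete, which mirrors the point that structure matters more than subject.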
Your domain might be billing state transitions, route optimization, yield forecasting, or document classification — any bounded problem where answers can be grounded and verified. The subject matter is secondary. The structure is what matters.
A minimum viable Phoenix implementation has four parts. Each one is a gate: if you cannot complete a step, stop and resolve it before building further. A system that skips a step is not a Phoenix harness — it is a model with scaffolding, which is a different thing.
Write down the exact questions your system must answer. Be precise. "Answer customer support queries" is not a task class — it is a category. "Return the current status of an order given an order ID" is a task class. The narrower and more explicit the task class, the more tractable the harness. If you cannot write it down in one sentence, your task class is too broad.
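One way to make a task class explicit is to give it a type and a parser that rejects anything outside it. The sketch below uses the order-status example from the paragraph above. The names `OrderStatusQuery` and `parse_order_status_query`, and the `ORD-` plus six digits ID format, are assumptions made for illustration.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Hypothetical order ID format, assumed for this example only.
ORDER_ID_PATTERN = re.compile(r"ORD-\d{6}")


@dataclass(frozen=True)
class OrderStatusQuery:
    """The one question this task class covers: the status of a given order."""
    order_id: str


def parse_order_status_query(text: str) -> Optional[OrderStatusQuery]:
    """Return a typed query if the text fits the task class, otherwise None."""
    match = ORDER_ID_PATTERN.search(text)
    if match is None or "status" not in text.lower():
        return None  # Out of scope: the harness declines rather than guesses.
    return OrderStatusQuery(order_id=match.group(0))
```

A request such as "Can I get a refund?" returns `None` here. Under this sketch, anything the parser cannot place in the task class never reaches the rest of the system.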
Find the ground truth source. This is the engine. It must be deterministic: a database, a solver, a real dataset, an authoritative API. Not model inference. If the answer to a query depends on what the model thinks rather than what a substrate contains, you do not have a Phoenix domain yet. The substrate is what makes the harness trustworthy — without it, correctness cannot be verified.
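A minimal sketch of a deterministic substrate, assuming the ground truth lives in a database. It uses Python's standard `sqlite3` module with an in-memory table. The `orders` schema, the sample rows, and the function names are invented for the example.

```python
import sqlite3
from typing import Optional


def build_substrate() -> sqlite3.Connection:
    """Create an in-memory stand-in for the ground truth source (illustrative data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders (order_id TEXT PRIMARY KEY, status TEXT NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO orders (order_id, status) VALUES (?, ?)",
        [("ORD-000001", "shipped"), ("ORD-000002", "processing")],
    )
    conn.commit()
    return conn


def lookup_status(conn: sqlite3.Connection, order_id: str) -> Optional[str]:
    """Same input, same stored row, same answer: no model inference involved."""
    row = conn.execute(
        "SELECT status FROM orders WHERE order_id = ?", (order_id,)
    ).fetchone()
    return row[0] if row is not None else None
```

The answer comes from what the table contains. An unknown ID yields `None` rather than a plausible-sounding guess.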
Before building anything, write down what a correct answer looks like. What fields does it include? What format? What does the system guarantee to return, and what does it guarantee not to return? The output contract is not documentation — it is the specification the verification step checks against. If the output contract is not explicit before you build, you cannot verify correctness after.
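An output contract can be written as data plus a checker before any other code exists. The following is a sketch under assumed requirements. The field names, the allowed status values, and `contract_violations` are hypothetical choices for an order-status domain.

```python
# Hypothetical contract for an order-status answer.
REQUIRED_FIELDS = frozenset({"order_id", "status", "source"})
ALLOWED_STATUSES = frozenset({"processing", "shipped", "delivered", "cancelled"})


def contract_violations(answer: dict) -> list:
    """Return every way the answer departs from the contract (empty list = conforms)."""
    problems = []
    missing = sorted(REQUIRED_FIELDS - set(answer))
    extra = sorted(set(answer) - REQUIRED_FIELDS)
    if missing:
        problems.append(f"missing fields: {missing}")
    if extra:
        # The contract also states what the system guarantees NOT to return.
        problems.append(f"unexpected fields: {extra}")
    if "status" in answer:
        status = answer["status"]
        if not (isinstance(status, str) and status in ALLOWED_STATUSES):
            problems.append(f"status {status!r} is not an allowed value")
    return problems
```

Because the contract is explicit, a verification step has something concrete to check against. An answer with a free-text status or an extra `confidence` field fails with named reasons.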
Every output must be verified before it leaves the system. The verification step can be a schema check, a read-back against the substrate, a golden file comparison, or a human-in-the-loop review for high-stakes outputs. The method depends on the domain. What is not optional is the step itself. A harness without a verification step produces outputs that may be correct — you just cannot tell which ones.
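As one possible shape for the verification step, the sketch below combines two of the methods named above: a schema check and a read-back against the substrate. The substrate is modeled as a plain mapping to keep the example self-contained. `VerificationError` and `verify_and_release` are hypothetical names.

```python
from typing import Mapping


class VerificationError(Exception):
    """Raised when an answer fails verification and must not leave the system."""


def verify_and_release(answer: dict, substrate: Mapping[str, str]) -> dict:
    """Release the answer only if it passes the schema check and the read-back."""
    # Schema check: exactly the expected fields, nothing more, nothing less.
    if set(answer) != {"order_id", "status"}:
        raise VerificationError("answer does not match the expected fields")

    # Read-back: compare the claimed status with what the substrate records.
    recorded = substrate.get(answer["order_id"])
    if recorded is None:
        raise VerificationError("order not found in the substrate")
    if recorded != answer["status"]:
        raise VerificationError("status disagrees with the substrate")

    return answer
```

In this sketch a failed check raises instead of returning a flag, so an unverified answer cannot be passed along by accident. A high-stakes domain might route the failure to human review instead.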
Project Phoenix is open-core. The distinction is deliberate: the principles, patterns, and research are public because they are useful to practitioners who want to build their own systems. The consulting layer is private because it covers the work that does not generalize cleanly across domains.
A useful heuristic: if your domain fits cleanly into the four V1 steps and your task class is well-defined, the open-core materials are sufficient to get started. If your domain is ambiguous, multi-layered, or has high-stakes correctness requirements, direct engagement is the faster path.
Start a conversation →
The self-service Phoenix implementation guide is in progress. It will cover how to define your domain, what a minimum viable harness requires, and what a V1 implementation looks like end to end. Join the list to receive it first.