Open-core standards for grounded domain systems.
Project Phoenix is best understood as an open-core framework for grounded domain systems rather than as a single interface, a single benchmark story, or a single agent architecture.
Its public core consists of principles, validation standards, grounded architecture patterns, and white papers that explain how deterministic substrates, supervision, provenance, and bounded agent use can be combined into useful systems.
The portfolio now spans 17 primary papers across two tracks: Phoenix Operating Discipline (1.1–1.7, 1.11–1.17) and RVH/ML Evaluation (1.8–1.10). All papers have full drafts, and empirical results are incorporated in 1.9, 1.10, and the current measurement-integrity and operator-shell line.
Engineering rules for grounded, inspectable systems.
Validation, variation, and operating-discipline patterns.
Generic grounded-domain and bounded-agent patterns.
Public argument, historical support, and current interpretation across 17 primary papers plus companion format sources.
The full inventory remains in canonical portfolio order. The papers below are featured because they define the current measurement-integrity and OpenClaw x Project Phoenix line rather than because the full portfolio has been reordered around them.
"The Model Did Not Fail The Protocol, The Terminal Did" (Paper 1.16) shows that thinking-mode protocol scores can be invalidated by capture-path artifacts and that corrected clean capture materially changes the local-model ranking.
"The Operator Shell Pattern" (Paper 1.17) argues that OpenClaw closes a real outer-layer gap around Project Phoenix by adding access, compression, incident discipline, and operator visibility without moving the authority boundary.
The portfolio divides into two tracks, with the operating discipline track further grouped into primary and detail papers.
| # | Title | Track |
|---|---|---|
| Primary Papers — Operating Discipline | ||
| 1.1 | Project Phoenix — Open-Core Standards | Framework |
| 1.2 | Offline Grounded Domain Agent | Grounding |
| 1.3 | Ski Chalet Harness Boundary | Grounding |
| 1.4 | Fab Simulation & RVH | Grounding |
| 1.5 | LocalLLMTSP — Solver-Backed Orchestration | Orchestration |
| 1.6 | Where Orchestration Beats Raw Model Power | Orchestration |
| 1.7 | Agentic Coding Failure Patterns | Operations |
| RVH / ML Evaluation | ||
| 1.8 | Rough Volatility — Cross-Domain Benchmark Principle | RVH |
| 1.9 | Rough Volatility — ML Evaluation Domain | RVH |
| 1.10 | Grounded Agent Failure Is Structurally Determined | Boundary |
| Details — Local Model & Boundary | ||
| 1.11 | Local Model Role Suitability | Local Model |
| 1.12 | ShowcaseAgent Routing And Compression | Local Model |
| 1.13 | TourAgent Local Model Screen | Local Model |
| 1.14 | True Ski Chalet Boundary Result | Boundary |
| 1.15 | When The Organized Stack Loses | Boundary |
| 1.16 | The Model Did Not Fail The Protocol, The Terminal Did | Measurement |
| 1.17 | The Operator Shell Pattern | Operator Layer |
The offline-grounded-agent work (Paper 1.2) is one of the strongest current Project Phoenix results and the clearest current answer to the local-usefulness question. It is nonetheless a special case within a broader framework, albeit a particularly important recent one.
The larger claim is that useful systems require deterministic grounding, explicit validation, clear trust boundaries, and disciplined operating practice.
Project Phoenix argues that reliability is a systems problem, not just an intelligence problem. The emphasis is therefore on grounded domains, deterministic substrates, validation, and operational discipline rather than on prompt optimism or maximal autonomy.
Without this framing, the current strong grounded-agent result can overshadow the broader framework, and older broad framework papers can overstate stale implementation details. The useful middle position: keep the broad Project Phoenix frame; keep the grounded-agent result visible; do not collapse one into the other.
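As a rough illustration of how deterministic grounding, explicit validation, a trust boundary, and provenance might fit together, the following minimal Python sketch may help. It is not taken from any Project Phoenix paper: the names `STOCK_TABLE`, `Proposal`, `Decision`, and `validate` are hypothetical, and the "substrate" here is just a fixed lookup table standing in for whatever solver, simulator, or database a real system would use.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Hypothetical deterministic substrate: a fixed table of verified values.
# This is an assumption for illustration, not a Project Phoenix component.
STOCK_TABLE = {"widget": 12, "gadget": 0}


@dataclass
class Proposal:
    """A claim produced by a bounded agent (for example, a local model)."""
    item: str
    claimed_stock: int


@dataclass
class Decision:
    """Outcome of validation, with a provenance trail for inspection."""
    accepted: bool
    value: Optional[int]
    provenance: List[str] = field(default_factory=list)


def validate(proposal: Proposal) -> Decision:
    """Accept the agent's claim only if the substrate confirms it.

    The substrate, not the agent, stays the authority: on a mismatch the
    claim is rejected and the grounded value is returned instead.
    """
    trail = [f"agent claimed {proposal.item}={proposal.claimed_stock}"]
    grounded = STOCK_TABLE.get(proposal.item)
    if grounded is None:
        trail.append("substrate: unknown item, claim rejected")
        return Decision(False, None, trail)
    if grounded != proposal.claimed_stock:
        trail.append(f"substrate: mismatch, grounded value is {grounded}")
        return Decision(False, grounded, trail)
    trail.append("substrate: claim confirmed")
    return Decision(True, grounded, trail)
```

Calling `validate(Proposal("widget", 99))` in this sketch rejects the claim, returns the grounded value from the table, and records both steps in `provenance`. The design choice it is meant to show is that the agent can propose but never overrides the deterministic layer.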