Vision
A single agentic interface that spans all seven WQU Data Science projects, enabling:
- Cross-project analysis ("Apply GARCH to air quality data")
- Technique comparison ("Which model works best for classification?")
- Educational integration (Textbook reference on demand)
- Unified data exploration across all domains
System Architecture
┌─────────────────────────────────────────────────────────────────┐
│ WQ Unified Cockpit │
│ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ ┌─────────┐│
│ │ Blueprint │ │ Execution │ │ Inspector │ │Artifacts││
│ │ Panel │ │ Trace │ │ Panel │ │Workspace││
│ └─────────────┘ └─────────────┘ └─────────────┘ └─────────┘│
└─────────────────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────────┐
│ PlanBuilder │
│ Query Pattern Matching (NLI) → Execution Plan Generation │
│ _plan_proj2_* _plan_proj3_* _plan_cross_project_* │
└─────────────────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────────┐
│ AgenticEngine │
│ State Machine: IDLE → PLANNING → AWAITING_APPROVAL → RUNNING │
│ Parameter Resolution: $step_N_result.field │
└─────────────────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────────┐
│ ToolRegistry (41 tools) │
│ PROJECT TOOLS │ CROSS-PROJECT │ TEXTBOOK │ UTILITY │
└─────────────────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────────┐
│ ExecutionContext (3-Tier Cache) │
│ step_results │ data_cache │ query_cache │
└─────────────────────────────────────────────────────────────────┘
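The AgenticEngine box refers to parameter references of the form `$step_N_result.field`. A minimal sketch of how such references might be resolved against earlier step results follows. It assumes results are kept in a dictionary keyed by step number; the function name `resolve_params` and that storage layout are hypothetical and not taken from the project code.

```python
import re
from typing import Any

# Matches a whole-string reference such as "$step_2_result.rmse" or "$step_1_result".
_REF = re.compile(r"^\$step_(\d+)_result((?:\.\w+)*)$")


def resolve_params(params: dict[str, Any], step_results: dict[int, Any]) -> dict[str, Any]:
    """Replace "$step_N_result.field" strings with values from earlier steps.

    Hypothetical helper: the storage layout (dict keyed by step number) is assumed.
    """
    resolved: dict[str, Any] = {}
    for name, value in params.items():
        match = _REF.match(value) if isinstance(value, str) else None
        if match is None:
            resolved[name] = value  # literal value, passed through unchanged
            continue
        step = int(match.group(1))
        if step not in step_results:
            raise KeyError(f"step {step} has no stored result")
        target = step_results[step]
        # Walk the dotted path, supporting dict keys and object attributes.
        for field in filter(None, match.group(2).split(".")):
            target = target[field] if isinstance(target, dict) else getattr(target, field)
        resolved[name] = target
    return resolved
```

With `step_results = {1: {"stats": {"mean": 2.0}}}`, the parameter `"$step_1_result.stats.mean"` would resolve to `2.0`, while non-reference values pass through untouched.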
Per-Project Cockpit Structure
| File | Purpose |
| --- | --- |
| agentic_engine.py | Plan execution orchestrator with state machine |
| plan_builder.py | NLI parser converting queries to ExecutionPlans |
| data_client.py | Project-specific data access with 3-tier caching |
| tool_registry.py | Tool catalog (data, model, visualization, utility) |
| execution_context.py | Session state and cache management |
| ui_components/ | tkinter UI panels (blueprint, trace, inspector) |
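To illustrate what a tool catalog like the one in tool_registry.py could look like, here is a minimal sketch. The four categories come from the table above; the `Tool` dataclass, the `register`/`call`/`list_tools` methods, and the `column_mean` example tool are all hypothetical names chosen for illustration, not the project's actual API.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional

# Categories as listed for tool_registry.py in the table.
CATEGORIES = {"data", "model", "visualization", "utility"}


@dataclass(frozen=True)
class Tool:
    name: str
    category: str
    func: Callable[..., Any]
    description: str = ""


class ToolRegistry:
    """Hypothetical catalog mapping tool names to callables."""

    def __init__(self) -> None:
        self._tools: dict[str, Tool] = {}

    def register(self, name: str, category: str, description: str = ""):
        """Decorator that adds a function to the catalog under `name`."""
        if category not in CATEGORIES:
            raise ValueError(f"unknown category: {category!r}")

        def decorator(func: Callable[..., Any]) -> Callable[..., Any]:
            if name in self._tools:
                raise ValueError(f"tool already registered: {name!r}")
            self._tools[name] = Tool(name, category, func, description)
            return func

        return decorator

    def call(self, name: str, **params: Any) -> Any:
        """Look up a tool by name and invoke it with keyword parameters."""
        return self._tools[name].func(**params)

    def list_tools(self, category: Optional[str] = None) -> list[str]:
        """Return sorted tool names, optionally filtered by category."""
        return sorted(
            t.name for t in self._tools.values()
            if category is None or t.category == category
        )


registry = ToolRegistry()


@registry.register("column_mean", "data", "Mean of a list of numbers")
def column_mean(values: list[float]) -> float:
    return sum(values) / len(values)
```

An engine step could then run `registry.call("column_mean", values=[1, 2, 3])` without knowing which module implements the tool.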
State Machine
IDLE
  │
  ▼
PLANNING ───────────────────────────────┐
  │                                     │
  ▼                                     │
AWAITING_APPROVAL ──[cancel]────────────┤
  │                                     │
[approve]                               │
  │                                     │
  ▼                                     │
RUNNING ──[pause]──► PAUSED ──[resume]──┤
  │                     │               │
[complete]          [cancel]            │
  │                     │               │
  ▼                     ▼               │
COMPLETED              IDLE ◄───────────┘
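The diagram can be expressed as a transition table. The sketch below is one possible encoding, not the engine's actual implementation. State names follow the diagram; the event names `submit` and `plan_ready` are assumptions (those edges are unlabeled), the PLANNING-to-IDLE edge is given the assumed label `cancel`, and `resume` is assumed to return to RUNNING.

```python
from enum import Enum, auto


class EngineState(Enum):
    IDLE = auto()
    PLANNING = auto()
    AWAITING_APPROVAL = auto()
    RUNNING = auto()
    PAUSED = auto()
    COMPLETED = auto()


# (current state, event) -> next state.
TRANSITIONS = {
    (EngineState.IDLE, "submit"): EngineState.PLANNING,                 # event name assumed
    (EngineState.PLANNING, "plan_ready"): EngineState.AWAITING_APPROVAL,  # event name assumed
    (EngineState.PLANNING, "cancel"): EngineState.IDLE,                 # edge unlabeled in the diagram
    (EngineState.AWAITING_APPROVAL, "approve"): EngineState.RUNNING,
    (EngineState.AWAITING_APPROVAL, "cancel"): EngineState.IDLE,
    (EngineState.RUNNING, "pause"): EngineState.PAUSED,
    (EngineState.RUNNING, "complete"): EngineState.COMPLETED,
    (EngineState.PAUSED, "resume"): EngineState.RUNNING,                # target assumed
    (EngineState.PAUSED, "cancel"): EngineState.IDLE,
}


def next_state(state: EngineState, event: str) -> EngineState:
    """Return the state reached from `state` on `event`, or raise ValueError."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"illegal transition: {event!r} in state {state.name}") from None
```

Keeping the transitions in a single table makes illegal moves, such as approving a plan while IDLE, fail loudly instead of silently changing state.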
Three-Tier Caching
1. Memory Cache
In-memory dictionary for current session data. Fastest access, cleared on restart.
2. Disk Cache
Parquet files in data/cache/. Persists across sessions.
3. URL Fetch
Fallback to sample data URLs when local files are unavailable.
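The lookup order above (memory, then disk, then URL) can be sketched as follows. This is an illustration under stated assumptions rather than the project's data client: the class name `TieredLoader` and its injected `read_disk`, `write_disk`, and `fetch_url` callables are hypothetical, and writing fetched data back to the disk cache is an assumed behavior that the text does not confirm.

```python
from pathlib import Path
from typing import Any, Callable


class TieredLoader:
    """Hypothetical loader: memory cache -> disk cache -> URL fetch."""

    def __init__(
        self,
        cache_dir: Path,
        read_disk: Callable[[Path], Any],
        write_disk: Callable[[Any, Path], None],
        fetch_url: Callable[[str], Any],
    ) -> None:
        self._cache_dir = cache_dir
        self._read_disk = read_disk
        self._write_disk = write_disk
        self._fetch_url = fetch_url
        self._memory: dict[str, Any] = {}  # tier 1: cleared on restart

    def load(self, key: str, url: str) -> Any:
        # Tier 1: in-memory dictionary.
        if key in self._memory:
            return self._memory[key]
        # Tier 2: file in the disk cache directory.
        path = self._cache_dir / f"{key}.parquet"
        if path.exists():
            data = self._read_disk(path)
        else:
            # Tier 3: fall back to the sample data URL.
            data = self._fetch_url(url)
            # Assumption: persist the fetched data so later sessions hit tier 2.
            self._cache_dir.mkdir(parents=True, exist_ok=True)
            self._write_disk(data, path)
        self._memory[key] = data
        return data
```

For DataFrames, the callables could be `pandas.read_parquet`, `lambda df, p: df.to_parquet(p)`, and `pandas.read_csv` (which accepts a URL), with `cache_dir=Path("data/cache")`. Injecting them keeps the tier logic testable without network access.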
Pattern Matching Hierarchy
PlanBuilder tries patterns in order from most specific to least specific; the first match wins:
1. Exact lesson references: "lesson 3.3", "run lesson 2.1"
2. Cross-project patterns: "apply X from projA to projB"
3. Technique patterns: "cluster", "classify", "forecast"
4. Domain patterns: "real estate", "air quality", "earthquake"
5. General fallbacks: "help", "show capabilities"
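The ordering above can be sketched as a list of regular expressions checked in sequence, where the first match wins. The regexes, the `match_query` function, and the returned labels are illustrative assumptions built from the example phrases in the list, not the patterns PlanBuilder actually uses.

```python
import re

# Ordered from most specific to least specific; earlier entries take priority.
PATTERNS = [
    ("lesson", re.compile(r"\blesson\s+(\d+)\.(\d+)\b")),
    ("cross_project", re.compile(r"\bapply\s+(.+?)\s+(?:from\s+(\S+)\s+)?to\s+(.+)")),
    ("technique", re.compile(r"\b(cluster|classify|forecast)\w*")),
    ("domain", re.compile(r"\b(real estate|air quality|earthquake)\b")),
    ("fallback", re.compile(r"\b(help|show capabilities)\b")),
]


def match_query(query: str) -> tuple[str, tuple]:
    """Return (pattern kind, captured groups) for the first matching pattern."""
    text = query.lower()
    for kind, pattern in PATTERNS:
        match = pattern.search(text)
        if match:
            return kind, match.groups()
    return "unknown", ()
```

Because the list is ordered, a query such as "cluster the earthquake data" is classified as a technique query even though it also names a domain.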
File Structure
domains/WQ/
├── unified_agent/
│ ├── data_client.py # UnifiedDataClient
│ ├── tool_registry.py # 41 tools
│ ├── plan_builder.py # Query patterns
│ ├── agentic_engine.py # Orchestration
│ └── wq_cockpit.py # Unified GUI
├── Proj2-8/ # Project data sources
├── textbook_client.py # PDF parsing
├── textbook_tools.py # Reference tools
└── Textbook/ # 22 markdown chapters