Data | Optiver

Dataset Coverage

Rows: 480K
Stocks: 200
Trading Days: 481
Columns: 17

Source: ~/Python/Optiver/OptFeatureViz/train.csv — Kaggle Trading at the Close competition dataset. Each row represents a 10-second auction interval for one stock on one trading day.
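The coverage figures can be checked against the file directly. The following is a minimal sketch using only the standard library; the `coverage` helper is illustrative and not part of the project, and a tiny synthetic in-memory sample stands in for train.csv so the snippet runs on its own.

```python
import csv
import io

# Synthetic three-row sample standing in for train.csv (values are made up).
SAMPLE = """stock_id,date_id,seconds_in_bucket
0,0,0
1,0,0
0,1,10
"""

def coverage(handle):
    """Count rows, distinct stocks, distinct trading days, and columns."""
    reader = csv.DictReader(handle)
    stocks, days, rows = set(), set(), 0
    for row in reader:
        rows += 1
        stocks.add(int(row["stock_id"]))
        days.add(int(row["date_id"]))
    return {
        "rows": rows,
        "stocks": len(stocks),
        "trading_days": len(days),
        "columns": len(reader.fieldnames or []),
    }

summary = coverage(io.StringIO(SAMPLE))
print(summary)
```

To run it against the real file, pass `open(path, newline="")` instead of the `io.StringIO` sample.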

Schema Reference

Column	Type	Description
`stock_id`	int	Stock identifier (0–199)
`date_id`	int	Trading day identifier (0–480)
`seconds_in_bucket`	int	Seconds elapsed in the auction window
`imbalance_size`	float	Volume of imbalance at current snapshot
`imbalance_buy_sell_flag`	int	Buy (1), sell (-1), or neutral (0) imbalance direction
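`reference_price`	float	Price at which paired shares are maximized, the imbalance is minimized, and the distance from the bid-ask midpoint is minimized, in that order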
`matched_size`	float	Volume matched in auction at current price
`far_price`	float	Crossing price that maximizes matched shares using auction interest only
`near_price`	float	Crossing price that maximizes matched shares using auction and continuous-book orders
`bid_price`	float	Best bid in continuous order book
`bid_size`	float	Volume at best bid
`ask_price`	float	Best ask in continuous order book
`ask_size`	float	Volume at best ask
`wap`	float	Weighted average price of the continuous book, with each side's price weighted by the opposite side's size
`target`	float	60-second future move in `wap`, less the 60-second future move of the synthetic index, in basis points (prediction target)
`time_id`	int	Unique time bucket identifier
`row_id`	str	Unique row identifier (stock_id–time_id)
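As an illustration of how `wap` relates to the bid and ask columns, the sketch below follows the competition's definition, in which each side's price is weighted by the size on the opposite side. The function name `weighted_average_price` and the input values are illustrative only, and the zero-size guard is an assumption rather than documented behavior.

```python
import math

def weighted_average_price(bid_price, bid_size, ask_price, ask_size):
    """WAP = (bid_price * ask_size + ask_price * bid_size) / (bid_size + ask_size)."""
    total = bid_size + ask_size
    if total == 0:
        return math.nan  # assumption: no displayed size on either side
    return (bid_price * ask_size + ask_price * bid_size) / total

# Made-up quote: more size on the ask pulls WAP toward the bid.
example = weighted_average_price(1.000, 100.0, 1.002, 300.0)
print(example)
```

With three times as much size on the ask as on the bid, the result sits a quarter of the way from the bid to the ask.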

System Architecture

unified_agent/
├── data_client.py          # OptiverDataClient: data access layer
├── tool_registry.py        # ToolRegistry: 84 tools by category
├── execution_context.py    # ExecutionContext: shared pipeline state
├── plan_builder.py         # PlanBuilder: regex → execution plan
├── agentic_engine.py       # State machine orchestrator
├── feature_engineering.py  # V2 microstructure feature generation
├── model_pipeline.py       # LightGBM training with time-series CV
├── optuna_tuner.py         # Hyperparameter optimization (TPE)
├── walk_forward.py         # Walk-forward backtesting
├── feature_drift.py        # Feature drift detection (PSI, KS, JS)
├── target_drift.py         # Target/concept drift detection
├── data_quality.py         # Data quality monitoring
├── monitoring_alerts.py    # Alert routing
├── smart_tools.py          # Meta-tools (SmartAnalyze, SmartMonitor)
└── report_templates.py     # Structured report outputs
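To make the walk-forward idea concrete, here is a minimal sketch of splitting trading days (`date_id` values) into successive train/test windows. The page does not say whether walk_forward.py uses an expanding or a rolling window, so the expanding window below, the function name `walk_forward_splits`, and its parameters are all assumptions for illustration.

```python
def walk_forward_splits(n_days, initial_train, test_size):
    """Yield (train_days, test_days) ranges with an expanding training window."""
    start = initial_train
    while start + test_size <= n_days:
        # Train on every day before `start`, test on the next `test_size` days.
        yield range(0, start), range(start, start + test_size)
        start += test_size

# Small made-up configuration: 10 days, 4 initial training days, 2 test days.
splits = [(list(train), list(test)) for train, test in walk_forward_splits(10, 4, 2)]
for train, test in splits:
    print(train, test)
```

Every test window lies strictly after its training window, which is the property that keeps future data out of the fit.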

Execution Flow

CLI + REPL

cli/main.py provides an interactive REPL and one-shot queries. Natural-language input is mapped to tool execution plans.
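The split between one-shot and interactive use can be sketched as a small dispatcher. This is not the code in cli/main.py: the `run` function, the `optiver> ` prompt, and the `exit`/`quit` commands are hypothetical, and `handle_query` stands in for whatever turns a query into a result.

```python
def run(argv, handle_query, read_line=input):
    """Answer a one-shot query when arguments are given, otherwise start a REPL."""
    if argv:
        return [handle_query(" ".join(argv))]
    outputs = []
    while True:
        try:
            line = read_line("optiver> ").strip()  # hypothetical prompt
        except EOFError:
            break
        if line in {"exit", "quit"}:  # hypothetical exit commands
            break
        if line:  # skip blank input
            outputs.append(handle_query(line))
    return outputs

# One-shot use, with str.upper as a placeholder query handler.
print(run(["analyze stock 5"], str.upper))
```

Passing `read_line` as a parameter keeps the loop testable without a terminal.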

Planner

PlanBuilder compiles regex patterns into ordered execution steps. More specific patterns take priority.
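One way to picture this priority rule is an ordered list of compiled patterns where the first match wins. The sketch below is not PlanBuilder's actual code: the rules, tool names, and the `build_plan` function are hypothetical.

```python
import re

# Hypothetical rules, listed from most to least specific; the first match wins.
RULES = [
    (re.compile(r"analyze stock (?P<stock_id>\d+)", re.I), ["load_stock", "analyze_stock"]),
    (re.compile(r"analyze", re.I), ["load_all", "analyze_market"]),
]

def build_plan(query):
    """Return ordered steps for the first rule whose pattern matches the query."""
    for pattern, steps in RULES:
        match = pattern.search(query)
        if match:
            # Named groups become integer parameters shared by every step.
            params = {key: int(value) for key, value in match.groupdict().items()}
            return [{"tool": step, "params": params} for step in steps]
    return []  # no rule matched

print(build_plan("analyze stock 5"))
```

Because the stock-specific rule comes first, "analyze stock 5" never falls through to the generic "analyze" rule.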

Engine

AgenticEngine executes plan steps, updates ExecutionContext, and surfaces results.
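The execute-and-update cycle can be sketched as a loop over plan steps that share one context object. The architecture listing describes the engine as a state machine; the plain loop below is a simplification, and the `Context` class, tool functions, and data values are hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class Context:
    """Hypothetical stand-in for ExecutionContext: state shared between steps."""
    values: dict = field(default_factory=dict)
    log: list = field(default_factory=list)

def execute(plan, tools, ctx):
    """Run each step in order; each tool may read earlier results from ctx."""
    for step in plan:
        tool = tools[step["tool"]]
        result = tool(ctx, **step.get("params", {}))
        ctx.values[step["tool"]] = result  # store the result under the tool name
        ctx.log.append(step["tool"])       # record execution order
    return ctx

def load_stock(ctx, stock_id):
    return {"stock_id": stock_id, "wap": [1.0, 1.001, 0.999]}  # made-up values

def mean_wap(ctx, stock_id):
    series = ctx.values["load_stock"]["wap"]  # depends on the earlier step
    return sum(series) / len(series)

TOOLS = {"load_stock": load_stock, "mean_wap": mean_wap}
plan = [
    {"tool": "load_stock", "params": {"stock_id": 5}},
    {"tool": "mean_wap", "params": {"stock_id": 5}},
]
ctx = execute(plan, TOOLS, Context())
print(ctx.log, ctx.values["mean_wap"])
```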

Reports

report_templates.py generates analysis, model, and monitoring summaries with structured fields.
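A report with structured fields could look like the dataclass below. This is only a sketch: the `AnalysisReport` class, its field names, and the metric values are hypothetical and not taken from report_templates.py.

```python
from dataclasses import dataclass, asdict

@dataclass
class AnalysisReport:
    """Hypothetical structured report; field names are illustrative."""
    title: str
    stock_id: int
    metrics: dict

    def render(self):
        """Render the report as plain text, with metrics in alphabetical order."""
        lines = [self.title, f"stock_id: {self.stock_id}"]
        lines += [f"{name}: {value:.4f}" for name, value in sorted(self.metrics.items())]
        return "\n".join(lines)

# Made-up metric values for illustration.
report = AnalysisReport("Stock analysis", 5, {"mean_wap": 1.0, "spread": 0.0002})
text = report.render()
print(text)
```

Keeping the fields typed means the same object can be rendered as text or exported with `asdict(report)` for downstream tooling.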

Quick Start

# Run interactive REPL
cd domains/Optiver/cli && python main.py

# One-shot query
python main.py "analyze stock 5"

# Run domain test suite
cd domains/Optiver && python test_domain.py

# Full synthesis workflow
python main.py "run synthesis workflow standard"