6 Configurations · Data Baseline → Full Synthesis
Each variation adds a complete new capability layer on top of the previous one. V6 is the full synthesis capstone: one command runs the entire pipeline.
V1: Data loading, baseline features, analysis
V2: Numba microstructure features (V2 feature set)
V3: SHAP, Optuna, stacking, walk-forward
V4: Drift detection, data quality, alerting
V5: Parallelized CV, features, tuning
V6: Full pipeline, deployment gate, reports
Establishes the base system: CLI, tool registry, and data client (OptiverDataClient). V1 baseline features compute imbalance, price, and temporal signals.
Numba-accelerated V2 feature generation: triplet and pairwise imbalance, microstructure signals, global stock statistics, and temporal window features. Walk-forward safe.
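The source does not show how the imbalance features are defined, so the following is only a minimal sketch under stated assumptions. It uses a formulation common in public order-book notebooks: pairwise imbalance as (a - b) / (a + b) and triplet imbalance as (max - mid) / (mid - min). The helper names and the column names (bid_size, ask_size, matched_size) are hypothetical, and the sketch is plain Python for readability; an array-based version of the same arithmetic could be compiled with numba.njit.

```python
import math
from itertools import combinations


def pairwise_imbalance(a: float, b: float) -> float:
    """Assumed definition: (a - b) / (a + b); NaN when the denominator is zero."""
    total = a + b
    return (a - b) / total if total != 0 else math.nan


def triplet_imbalance(a: float, b: float, c: float) -> float:
    """Assumed definition: (max - mid) / (mid - min); NaN when mid equals min."""
    low, mid, high = sorted((a, b, c))
    return (high - mid) / (mid - low) if mid != low else math.nan


def imbalance_features(row: dict[str, float], columns: list[str]) -> dict[str, float]:
    """Build every pairwise and triplet imbalance over the given columns of one row."""
    feats: dict[str, float] = {}
    for x, y in combinations(columns, 2):
        feats[f"{x}_{y}_imb2"] = pairwise_imbalance(row[x], row[y])
    for x, y, z in combinations(columns, 3):
        feats[f"{x}_{y}_{z}_imb3"] = triplet_imbalance(row[x], row[y], row[z])
    return feats


# Hypothetical single order-book snapshot.
row = {"bid_size": 300.0, "ask_size": 100.0, "matched_size": 500.0}
features = imbalance_features(row, ["bid_size", "ask_size", "matched_size"])
```

Because each feature depends only on values within the same row, this kind of computation uses no future information, which is one way a feature can be walk-forward safe.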
calculate_triplet_imbalance and calculate_pairwise_imbalance
calculate_microstructure_features and calculate_global_stock_features
Adds SHAP explainability, Bayesian hyperparameter search, stacking ensembles, and walk-forward backtesting.
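Walk-forward backtesting evaluates each fold on data that comes strictly after its training window. The source does not say whether the training window expands or rolls, so this sketch assumes an expanding window; the function name and parameters are hypothetical.

```python
from typing import Iterator


def walk_forward_splits(
    n_samples: int, n_folds: int, min_train: int
) -> Iterator[tuple[range, range]]:
    """Yield (train, test) index ranges; training always precedes testing."""
    if n_folds < 1 or min_train < 1 or min_train >= n_samples:
        raise ValueError("need n_folds >= 1 and 1 <= min_train < n_samples")
    test_size = (n_samples - min_train) // n_folds
    if test_size < 1:
        raise ValueError("not enough samples for the requested number of folds")
    for k in range(n_folds):
        train_end = min_train + k * test_size
        # The last fold absorbs any remainder so every sample after min_train is tested once.
        test_end = train_end + test_size if k < n_folds - 1 else n_samples
        yield range(0, train_end), range(train_end, test_end)


splits = list(walk_forward_splits(n_samples=100, n_folds=4, min_train=40))
```

Indices are assumed to be in time order, so sorting the data by timestamp before splitting is the caller's responsibility.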
Full monitoring suite: Population Stability Index (PSI), Kolmogorov-Smirnov (KS) test, Jensen-Shannon (JS) divergence, concept drift, covariate shift, structural break detection, and regime classification.
| Signal | Method | Threshold |
|---|---|---|
| Feature drift | PSI | < 0.1 stable · > 0.25 significant |
| Distribution shift | KS test | alpha = 0.05 |
| Divergence | JS divergence | domain-specific |
| Covariate shift | Domain classifier | AUC > 0.6 indicates shift |
| Structural breaks | CUSUM / Chow | change-point detection |
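As an illustration of the first row of the table, here is a minimal PSI sketch using only the standard library. It assumes equal-width bins derived from the baseline sample and a small floor on empty bins; neither choice is confirmed by the source. The thresholds are the ones in the table, and the "moderate" label for the band between 0.1 and 0.25 is an assumed name for a range the table leaves unlabeled.

```python
import math


def psi(expected: list[float], actual: list[float], n_bins: int = 10, eps: float = 1e-6) -> float:
    """Population Stability Index of `actual` against the `expected` baseline."""
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / n_bins or 1.0  # fall back to 1.0 if the baseline is constant

    def bin_fractions(values: list[float]) -> list[float]:
        counts = [0] * n_bins
        for v in values:
            idx = int((v - lo) / width)
            # Values outside the baseline range are clamped into the edge bins.
            counts[min(max(idx, 0), n_bins - 1)] += 1
        # Floor empty bins at eps so the logarithm stays defined.
        return [max(c / len(values), eps) for c in counts]

    e = bin_fractions(expected)
    a = bin_fractions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


def classify_psi(value: float) -> str:
    """Map a PSI value onto the table's thresholds."""
    if value < 0.1:
        return "stable"
    if value > 0.25:
        return "significant"
    return "moderate"  # assumed label for the unlabeled middle band


baseline = [i / 100 for i in range(100)]
shifted = [x + 0.5 for x in baseline]  # synthetic drifted sample
status = classify_psi(psi(baseline, shifted))
```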
ThreadPoolExecutor-backed parallel versions of the most compute-intensive operations. max_workers should not exceed available CPU cores.
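One way to enforce the max_workers guidance is to clamp the requested value to os.cpu_count() before creating the pool. The sketch below is illustrative only: capped_workers, parallel_map, and fold_mae are hypothetical names, and the per-fold MAE is a stand-in for whatever compute-intensive operation is being parallelized.

```python
import os
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def capped_workers(requested: int) -> int:
    """Clamp the requested worker count to the range [1, available CPU cores]."""
    available = os.cpu_count() or 1
    return max(1, min(requested, available))


def parallel_map(func: Callable[[T], R], items: Iterable[T], max_workers: int = 4) -> list[R]:
    """Apply func to every item on a thread pool, preserving input order."""
    with ThreadPoolExecutor(max_workers=capped_workers(max_workers)) as pool:
        return list(pool.map(func, items))


def fold_mae(fold: tuple[list[float], list[float]]) -> float:
    """Mean absolute error for one (y_true, y_pred) fold."""
    y_true, y_pred = fold
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


folds = [([1.0, 2.0], [1.5, 2.5]), ([0.0, 4.0], [1.0, 3.0])]
maes = parallel_map(fold_mae, folds, max_workers=64)  # 64 is clamped to the core count
```

Threads give the most benefit when the work releases the GIL, as NumPy routines, Numba functions compiled with nogil, and I/O do; pure-Python loops like the toy fold_mae above will not speed up.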
Full pipeline orchestration with profile-driven depth, deployment readiness gate, and structured reports.
Five deployment criteria: data quality, feature drift, CV MAE, Sharpe ratio, and active alerts. All five must clear before deployment is approved.
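The all-five-must-clear rule can be sketched as a set of boolean checks combined with all(). The source names the criteria but not their thresholds, so every numeric value below is a hypothetical placeholder, as are the class and field names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GateInputs:
    data_quality_score: float  # fraction of data-quality checks passed
    max_feature_psi: float     # worst PSI across monitored features
    cv_mae: float              # cross-validated mean absolute error
    sharpe_ratio: float        # backtest Sharpe ratio
    active_alerts: int         # count of unresolved alerts


@dataclass(frozen=True)
class GateThresholds:
    # All defaults are illustrative placeholders, not values from the source.
    min_data_quality: float = 0.95
    max_feature_psi: float = 0.25
    max_cv_mae: float = 6.0
    min_sharpe: float = 1.0
    max_active_alerts: int = 0


def evaluate_gate(inputs: GateInputs, thresholds: GateThresholds = GateThresholds()) -> dict[str, bool]:
    """Return a pass/fail flag for each of the five criteria."""
    return {
        "data_quality": inputs.data_quality_score >= thresholds.min_data_quality,
        "feature_drift": inputs.max_feature_psi <= thresholds.max_feature_psi,
        "cv_mae": inputs.cv_mae <= thresholds.max_cv_mae,
        "sharpe_ratio": inputs.sharpe_ratio >= thresholds.min_sharpe,
        "active_alerts": inputs.active_alerts <= thresholds.max_active_alerts,
    }


def approved(checks: dict[str, bool]) -> bool:
    """Deployment is approved only if every criterion clears."""
    return all(checks.values())


healthy = evaluate_gate(GateInputs(0.99, 0.05, 5.8, 1.4, 0))
blocked = evaluate_gate(GateInputs(0.99, 0.05, 5.8, 1.4, 2))
```

Returning the per-criterion flags rather than a single boolean lets a report show which check blocked a deployment.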