Tools | Optiver

Registry Overview

Category	Count
Total Tools	84
Data Tools	5
Feature Tools	16
Model + ML Tools	19
Drift + Monitor	16

Each tool owns one operation. No tool crosses pipeline stages. All outputs are verifiable before the next stage begins.
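A minimal sketch of the contract stated above: one operation per tool, with the output verified before the next stage runs. The `Tool` dataclass, its `verify` hook, and `run_pipeline` are hypothetical names for illustration only; the registry's actual interfaces are not shown on this page.

```python
from dataclasses import dataclass
from typing import Any, Callable, List


@dataclass(frozen=True)
class Tool:
    """Hypothetical registry entry: one operation, tied to one pipeline stage."""
    name: str
    stage: str
    run: Callable[[Any], Any]
    verify: Callable[[Any], bool]


def run_pipeline(tools: List[Tool], payload: Any) -> Any:
    """Run tools in order; stop as soon as an output fails verification."""
    for tool in tools:
        payload = tool.run(payload)
        if not tool.verify(payload):
            raise RuntimeError(
                f"{tool.name} ({tool.stage}) produced output that failed verification"
            )
    return payload
```

Raising on the first failed check keeps a bad intermediate result from reaching later stages, which is the property the paragraph above describes.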

Data Tools

Tool	Purpose
`load_data`	Load the Kaggle Trading at the Close dataset from disk.
`get_stock_info`	Return metadata for a specific stock ID (0–199).
`get_date_range`	Return the date ID range present in the dataset.
`get_data_summary`	Statistical summary of rows, columns, and target distribution.
`filter_data`	Slice dataset by stock ID, date range, or feature subset.

Feature Tools — V1 Baseline

Imbalance Features

Order book imbalance signals from bid/ask size and price.

`calculate_imbalance_features`
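As an illustration of a size-based imbalance signal, the sketch below computes a signed ratio of bid size to ask size. The function name and the zero-size handling are assumptions; the features actually produced by `calculate_imbalance_features` are not listed on this page, and price-based variants are not covered here.

```python
def order_book_imbalance(bid_size: float, ask_size: float) -> float:
    """Signed size imbalance in [-1, 1]; positive means more resting bid size."""
    total = bid_size + ask_size
    if total == 0:
        # Assumption: treat an empty book as balanced rather than undefined.
        return 0.0
    return (bid_size - ask_size) / total
```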

Price Features

WAP, reference price, and bid-ask spread derivations.

`calculate_price_features`
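A short sketch of two of the derivations named above, assuming the WAP definition given in the competition's data description (each side's price weighted by the opposite side's size). The function names are illustrative and are not taken from `calculate_price_features`.

```python
def weighted_average_price(
    bid_price: float, ask_price: float, bid_size: float, ask_size: float
) -> float:
    """WAP: each side's price is weighted by the opposite side's size."""
    total_size = bid_size + ask_size
    if total_size == 0:
        raise ValueError("WAP is undefined when both sizes are zero")
    return (bid_price * ask_size + ask_price * bid_size) / total_size


def bid_ask_spread(bid_price: float, ask_price: float) -> float:
    """Absolute spread between the best ask and the best bid."""
    return ask_price - bid_price
```

With more size resting on the ask than on the bid, the WAP sits closer to the bid price, which is why it is often read as a pressure-adjusted mid.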

Temporal Features

Seconds-in-bucket and auction-proximity signals.

`calculate_temporal_features`
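One way such signals could be derived from `seconds_in_bucket` is sketched below. The 540-second horizon and the three output names are assumptions made for illustration; the columns `calculate_temporal_features` actually emits are not documented here.

```python
AUCTION_END_SECONDS = 540  # assumption: last seconds_in_bucket value in the data


def temporal_features(seconds_in_bucket: int) -> dict:
    """Derive simple auction-proximity signals from the bucket timestamp."""
    return {
        # Time remaining until the assumed final bucket.
        "seconds_to_close": AUCTION_END_SECONDS - seconds_in_bucket,
        # Fraction of the assumed window already elapsed, in [0, 1].
        "bucket_progress": seconds_in_bucket / AUCTION_END_SECONDS,
        # Whole minutes elapsed since the start of the window.
        "minute": seconds_in_bucket // 60,
    }
```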

Composite + Info

All-feature pipeline and registry lookup.

`calculate_all_features` · `get_feature_info`

Feature Tools — V2 Microstructure

Numba-accelerated V2 features for walk-forward-safe pipelines.

Triplet + Pairwise Imbalance

`calculate_triplet_imbalance` · `calculate_pairwise_imbalance`
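The sketch below shows one common formulation of pairwise and triplet imbalance, in plain Python for readability. Whether `calculate_pairwise_imbalance` and `calculate_triplet_imbalance` use these exact formulas is an assumption, and the Numba acceleration is omitted.

```python
import math


def pairwise_imbalance(a: float, b: float) -> float:
    """Normalized difference of two quantities; NaN when their sum is zero."""
    total = a + b
    return (a - b) / total if total != 0 else math.nan


def triplet_imbalance(a: float, b: float, c: float) -> float:
    """(max - mid) / (mid - min) over three quantities; NaN when mid == min."""
    hi, lo = max(a, b, c), min(a, b, c)
    mid = a + b + c - hi - lo  # the remaining middle value
    if mid == lo:
        return math.nan
    return (hi - mid) / (mid - lo)
```

The triplet ratio is large when the middle value sits close to the minimum and small when it sits close to the maximum.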

Microstructure + Global

`calculate_microstructure_features` · `calculate_global_stock_features`

Temporal Windows

`calculate_temporal_shift` · `calculate_temporal_return` · `calculate_temporal_diff` · `generate_all_v2_features`
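A sketch of the three window operations on a single ordered series, using only past values so that nothing leaks from the future. The function names and the `None` padding are illustrative assumptions. In practice each series would presumably be one stock within one date, so that lags never cross stock or day boundaries.

```python
from typing import List, Optional, Sequence


def temporal_shift(series: Sequence[float], window: int) -> List[Optional[float]]:
    """Value observed `window` steps earlier; None where no history exists."""
    if window < 1:
        raise ValueError("window must be at least 1")
    return [series[i - window] if i >= window else None for i in range(len(series))]


def temporal_diff(series: Sequence[float], window: int) -> List[Optional[float]]:
    """Current value minus the value `window` steps earlier."""
    lagged = temporal_shift(series, window)
    return [None if p is None else x - p for x, p in zip(series, lagged)]


def temporal_return(series: Sequence[float], window: int) -> List[Optional[float]]:
    """Relative change versus the value `window` steps earlier."""
    lagged = temporal_shift(series, window)
    return [
        None if p is None or p == 0 else x / p - 1.0
        for x, p in zip(series, lagged)
    ]
```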

Exports + Verification

`apply_zero_sum_adjustment` · `export_features` · `verify_export`
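The simplest form of a zero-sum adjustment subtracts the cross-sectional mean so the adjusted values sum to zero. This is only a sketch: the helper name is hypothetical, and `apply_zero_sum_adjustment` may use a weighted variant instead.

```python
from typing import List, Sequence


def zero_sum_adjust(values: Sequence[float]) -> List[float]:
    """Subtract the mean so the values for one time bucket sum to zero."""
    if not values:
        return []
    mean = sum(values) / len(values)
    return [v - mean for v in values]
```

Applied to all stocks within one time bucket, this removes any common market-wide offset.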

Analysis Tools

Tool	Purpose
`analyze_stock`	Distribution, trend, and target summary for a stock ID.
`analyze_target`	Target variable statistics and skew analysis.
`calculate_correlation`	Feature-to-target and feature-to-feature correlation matrices.
`compare_stocks`	Side-by-side stock behavior comparison.
`analyze_temporal_patterns`	Intraday and cross-day temporal signal patterns.

Model Tools

Tool	Purpose
`prepare_train_test`	Time-series-aware train/test split with purge gap.
`train_baseline`	Baseline LightGBM fit with default parameters.
`evaluate_model`	MAE, RMSE, and zero-sum-adjusted evaluation.
`get_feature_importance`	Gain-based feature importance from trained model.
`create_cv_splits`	Walk-forward cross-validation split generator.
`train_lightgbm_fold`	Single-fold LightGBM training with early stopping.
`train_lightgbm_cv`	Full walk-forward CV with OOF predictions.
`predict_ensemble`	Ensemble prediction from CV fold models.
`apply_zero_sum_prediction`	Market-neutral prediction normalization.
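To make "walk-forward" and "purge gap" concrete, the sketch below generates expanding-window splits over date IDs and drops a fixed number of dates between each training window and its validation block. The function name, the expanding-window choice, and the parameter names are assumptions, not the signature of `create_cv_splits`.

```python
from typing import Iterable, Iterator, List, Tuple


def walk_forward_splits(
    date_ids: Iterable[int], n_folds: int, purge_gap: int
) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_dates, valid_dates) pairs with an expanding train window."""
    dates = sorted(set(date_ids))
    fold_size = len(dates) // (n_folds + 1)
    if fold_size == 0:
        raise ValueError("not enough distinct dates for the requested folds")
    for k in range(1, n_folds + 1):
        valid_start = k * fold_size
        # The last fold absorbs any remainder dates.
        valid_end = valid_start + fold_size if k < n_folds else len(dates)
        # Drop `purge_gap` dates immediately before the validation block.
        train_end = valid_start - purge_gap
        if train_end <= 0:
            continue
        yield dates[:train_end], dates[valid_start:valid_end]
```

Every training date precedes its validation block, and the gap keeps lagged features that straddle the boundary out of both sides.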

Advanced ML — SHAP · Optuna · Ensembles

Explainability

`compute_shap_values` · `get_shap_importance` · `explain_single_prediction`

Tuning

`tune_hyperparameters` · `get_best_params` · `visualize_optimization`

Ensembles + Walk-Forward

`create_stacking_ensemble` · `train_stacking_meta` · `run_walk_forward` · `analyze_regime`

Drift + Quality + Alerts

Thresholds: PSI < 0.1 = stable · PSI > 0.25 = significant drift · KS test significance level α = 0.05
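For reference, a minimal Population Stability Index sketch with the two thresholds above applied. The equal-width binning, the `eps` floor for empty bins, and the label for the band between 0.1 and 0.25 are assumptions; `calculate_psi` may bin differently, and the KS test is not shown.

```python
import math
from typing import List, Sequence


def psi(expected: Sequence[float], actual: Sequence[float],
        n_bins: int = 10, eps: float = 1e-6) -> float:
    """PSI = sum over bins of (actual - expected) * ln(actual / expected)."""
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / n_bins or 1.0  # guard against a constant reference

    def proportions(sample: Sequence[float]) -> List[float]:
        counts = [0] * n_bins
        for x in sample:
            idx = int((x - lo) / width)
            # Values outside the reference range fall into the edge bins.
            counts[min(max(idx, 0), n_bins - 1)] += 1
        # Floor empty bins at eps so the logarithm stays finite.
        return [max(c / len(sample), eps) for c in counts]

    e, a = proportions(expected), proportions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


def classify_psi(value: float) -> str:
    """Apply the thresholds stated above."""
    if value < 0.1:
        return "stable"
    if value > 0.25:
        return "significant drift"
    return "intermediate"  # assumption: the source does not name this band
```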

Statistical Drift

`detect_ks_drift` · `calculate_psi` · `calculate_js_divergence` · `track_feature_drift` · `detect_multivariate_drift`

Concept + Covariate

`detect_target_drift` · `detect_concept_drift` · `detect_covariate_shift` · `analyze_prediction_drift`

Quality + Structural

`track_missing_values` · `track_outlier_frequency` · `detect_structural_breaks_cusum` · `detect_structural_breaks_chow` · `fit_regime_hmm`

Alerts + Dashboard

`generate_drift_alerts` · `get_monitoring_dashboard_data`

Parallel Execution + Synthesis

Parallel Tools

ThreadPoolExecutor-backed variants of feature generation, CV, tuning, walk-forward, and drift tracking.

`generate_all_v2_features_parallel` · `train_lightgbm_cv_parallel` · `tune_hyperparameters_parallel` · `run_walk_forward_parallel` · `track_feature_drift_parallel`
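The general fan-out pattern behind a `ThreadPoolExecutor`-backed variant can be sketched as follows. `run_parallel` is a hypothetical helper, and whether the registry's parallel tools are structured this way is an assumption.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def run_parallel(func: Callable[[T], R], items: Iterable[T],
                 max_workers: int = 4) -> List[R]:
    """Apply `func` to each item on a thread pool, preserving input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map returns results in input order; an exception raised
        # by any task is re-raised here when its result is consumed.
        return list(pool.map(func, items))
```

Threads pay off mainly when each task spends its time in code that releases the GIL, such as native library calls or I/O. Pure-Python work gains little from this pattern.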

Synthesis Capstone

End-to-end orchestration with deployment gate and structured reports.

`run_synthesis_workflow` · `check_deployment_readiness` · `create_analysis_report` · `create_model_report` · `create_monitoring_report`