# datacenter verification modeling

This package trains and evaluates the first public baseline model for the synthetic v0 datacenter training-run verification dataset.

The training unit is one row from:

```text
data/synthetic_v0/features/window_features_all.csv
```

The model does not read raw telemetry samples, events, or snapshots directly.

## outputs

The default model run is written to:

```text
data/model_runs/synthetic_v0_baseline/
```

It contains the fitted calibrated model, preprocessing pipeline, split manifest, feature metadata, predictions, metrics, calibration diagnostics, feature importance, and an evidence audit sample.

## leakage controls

Rows are split by `episode_id`, not randomly. This is required because the same latent episode appears in multiple adjacent windows. The default split is scenario-stratified at the episode level with seed `20260510`:

```text
train: 60%
validation/calibration: 20%
test: 20%
```

The supervised model excludes identifiers, direct labels, site id, episode id, raw manifest hashes, and synthetic-only audit columns such as `latent_workload_class` and `synthetic_evidence_profile`.

## model

The supervised baseline is:

```text
SimpleImputer + OneHotEncoder preprocessing
HistGradientBoostingClassifier
CalibratedClassifierCV on the validation split
```

The package also includes `rule_baseline.py`, a deterministic evidence-rule baseline that encodes broad study logic:

- capacity below threshold with strong coverage is negative evidence
- capacity alone does not prove training
- missing data is not zero activity
- integrity anomalies are warnings, not positive proof
- labels 3 and 4 require coherent multi-layer evidence

## commands

Train, evaluate, and generate all default artifacts:

```bash
python src/datacenter_verification_modeling/train_model.py \
  --features data/synthetic_v0/features/window_features_all.csv \
  --output data/model_runs/synthetic_v0_baseline \
  --seed 20260510
```

Evaluate an existing run:

```bash
python src/datacenter_verification_modeling/evaluate_model.py \
  --model-run data/model_runs/synthetic_v0_baseline \
  --features data/synthetic_v0/features/window_features_all.csv
```

Generate predictions from an existing run:

```bash
python src/datacenter_verification_modeling/predict.py \
  --model-run data/model_runs/synthetic_v0_baseline \
  --features data/synthetic_v0/features/window_features_all.csv \
  --output data/model_runs/synthetic_v0_baseline/predictions_all.csv
```

Prepare split and feature metadata without training:

```bash
python src/datacenter_verification_modeling/prepare_features.py \
  --features data/synthetic_v0/features/window_features_all.csv \
  --output data/model_runs/synthetic_v0_baseline \
  --seed 20260510
```

## governance outputs

Prediction files include:

- `p_label_0` through `p_label_4`
- `raw_p_label_0` through `raw_p_label_4`
- `p_large_training`
- `severity_score`
- `capacity_possible`
- `negative_certification_confidence`
- `integrity_warning`
- `critical_missing_layers`
- `top_evidence`

The `p_label_*` columns are post-processed with the capacity gate; the `raw_p_label_*` columns preserve the calibrated model probabilities before that gate.

## limitations

This is a synthetic v0 prototype. Metrics are useful for testing the pipeline, not for deployment claims. Real datacenter use would require real telemetry, controlled drills, operational calibration, and independent review.
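
## example: episode-level split

The episode-level, scenario-stratified split described under leakage controls can be illustrated with a small standard-library sketch. This is not the package's actual implementation: the function name `split_episodes`, the `scenario` field, and the toy rows are assumptions made for illustration. Only `episode_id`, the seed `20260510`, and the 60/20/20 proportions come from this README.

```python
import random
from collections import defaultdict


def split_episodes(episode_scenarios, seed=20260510, fractions=(0.6, 0.2, 0.2)):
    """Assign each episode to train/validation/test, stratified by scenario.

    episode_scenarios maps episode_id -> scenario label (hypothetical input shape).
    fractions[2] is implied: whatever is left after train and validation is test.
    """
    rng = random.Random(seed)

    # Group episode ids by scenario so every scenario is split separately.
    by_scenario = defaultdict(list)
    for episode_id, scenario in sorted(episode_scenarios.items()):
        by_scenario[scenario].append(episode_id)

    assignment = {}
    for scenario in sorted(by_scenario):
        episodes = by_scenario[scenario]
        rng.shuffle(episodes)
        n_train = round(fractions[0] * len(episodes))
        n_val = round(fractions[1] * len(episodes))
        for i, episode_id in enumerate(episodes):
            if i < n_train:
                assignment[episode_id] = "train"
            elif i < n_train + n_val:
                assignment[episode_id] = "validation"
            else:
                assignment[episode_id] = "test"
    return assignment


# Toy data: 20 episodes in two scenarios, three adjacent windows per episode.
rows = [
    {"episode_id": f"ep{e:02d}", "scenario": "A" if e < 10 else "B", "window": w}
    for e in range(20)
    for w in range(3)
]
episode_scenarios = {r["episode_id"]: r["scenario"] for r in rows}
assignment = split_episodes(episode_scenarios)

# Every window inherits the split of its episode, so no episode straddles splits.
for r in rows:
    r["split"] = assignment[r["episode_id"]]
```

Because the assignment is made per episode and then copied to each window, adjacent windows from the same latent episode cannot end up on both sides of the train/test boundary, which is the leakage a random row-level split would allow.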