# datacenter verification modeling
This package trains and evaluates the first public baseline model for the synthetic v0 datacenter training-run verification dataset.
The training unit is one row from:
```text
data/synthetic_v0/features/window_features_all.csv
```
The model does not read raw telemetry samples, events, or snapshots directly.
## outputs
The default model run is written to:
```text
data/model_runs/synthetic_v0_baseline/
```
It contains the fitted calibrated model, preprocessing pipeline, split manifest, feature metadata, predictions, metrics, calibration diagnostics, feature importance, and an evidence audit sample.
## leakage controls
Rows are split by `episode_id`, not randomly. This is required because the same latent episode appears in multiple adjacent windows, so a random row-level split would place windows from one episode in both training and test data. The default split is scenario-stratified at the episode level with seed `20260510`:
```text
train: 60%
validation/calibration: 20%
test: 20%
```
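A minimal sketch of an episode-level, scenario-stratified split, using only the standard library. It is not the package's implementation: the function names, the `episode_id -> scenario` mapping used as input, and the rounding of the 60/20/20 fractions are assumptions made for illustration.

```python
import random
from collections import defaultdict


def split_episodes(episode_scenarios, seed=20260510, fractions=(0.6, 0.2, 0.2)):
    """Assign each episode to a split, stratified by scenario.

    episode_scenarios: dict of episode_id -> scenario name (hypothetical shape).
    Returns a dict of episode_id -> "train" | "validation" | "test".
    """
    rng = random.Random(seed)

    # Group episodes by scenario so each scenario is split on its own.
    by_scenario = defaultdict(list)
    for episode_id, scenario in episode_scenarios.items():
        by_scenario[scenario].append(episode_id)

    assignment = {}
    for scenario in sorted(by_scenario):
        # Sort before shuffling so the result depends only on the seed.
        episodes = sorted(by_scenario[scenario])
        rng.shuffle(episodes)
        n_train = round(len(episodes) * fractions[0])
        n_val = round(len(episodes) * fractions[1])
        for i, episode_id in enumerate(episodes):
            if i < n_train:
                assignment[episode_id] = "train"
            elif i < n_train + n_val:
                assignment[episode_id] = "validation"
            else:
                assignment[episode_id] = "test"
    return assignment


def split_rows(rows, assignment):
    """Route every window row to the split of its episode."""
    parts = {"train": [], "validation": [], "test": []}
    for row in rows:
        parts[assignment[row["episode_id"]]].append(row)
    return parts
```

Because rows inherit the split of their episode, adjacent windows from one episode can never land on both sides of a split boundary. Scenarios with very few episodes may end up with an empty validation or test share under this rounding.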
The supervised model excludes identifiers, direct labels, site id, episode id, raw manifest hashes, and synthetic-only audit columns such as `latent_workload_class` and `synthetic_evidence_profile`.
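The exclusion step could look roughly like the sketch below. Only `episode_id`, `latent_workload_class`, and `synthetic_evidence_profile` are column names taken from this README; `window_id`, `site_id`, `label`, and the `_hash` suffix are hypothetical stand-ins for the other excluded groups, and the real schema may differ.

```python
# Column names confirmed by this README.
KNOWN_EXCLUDED = {"episode_id", "latent_workload_class", "synthetic_evidence_profile"}

# Hypothetical names for the remaining excluded groups (assumptions).
ASSUMED_EXCLUDED = {"window_id", "site_id", "label"}
ASSUMED_EXCLUDED_SUFFIXES = ("_hash",)


def select_feature_columns(columns):
    """Return the columns that may be used as model inputs, in original order."""
    excluded = KNOWN_EXCLUDED | ASSUMED_EXCLUDED
    return [
        c for c in columns
        if c not in excluded and not c.endswith(ASSUMED_EXCLUDED_SUFFIXES)
    ]
```

Keeping the exclusion list in one place makes it easy to assert, before training, that no excluded column has slipped into the feature matrix.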
## model
The supervised baseline is:
```text
SimpleImputer + OneHotEncoder preprocessing
HistGradientBoostingClassifier
CalibratedClassifierCV on the validation split
```
The package also includes `rule_baseline.py`, a deterministic evidence-rule baseline that encodes broad study logic:
- capacity below threshold with strong coverage is negative evidence
- capacity alone does not prove training
- missing data is not zero activity
- integrity anomalies are warnings, not positive proof
- labels 3 and 4 require coherent multi-layer evidence
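One way to read the rules above as code is sketched below. This is not the contents of `rule_baseline.py`: the `WindowEvidence` fields, the choice of label 0 for negative evidence, the cap at label 2 without multi-layer evidence, and the two-layer minimum are all assumptions chosen to make the logic concrete.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class WindowEvidence:
    # Hypothetical fields for illustration; not the package's actual schema.
    capacity_above_threshold: Optional[bool]  # None: capacity was not observed
    coverage_strong: bool                     # enough telemetry to trust an absence
    integrity_anomaly: bool
    coherent_layers: int                      # layers whose evidence agrees on training


@dataclass
class RuleResult:
    max_label: int           # highest label the evidence can support
    negative_evidence: bool
    warnings: List[str]


def apply_evidence_rules(ev: WindowEvidence, multi_layer_min: int = 2) -> RuleResult:
    warnings = []
    # Integrity anomalies are recorded as warnings and never raise the label.
    if ev.integrity_anomaly:
        warnings.append("integrity_anomaly")
    # Missing data is not zero activity: weak coverage is flagged, not scored as absence.
    if not ev.coverage_strong:
        warnings.append("weak_coverage")

    # Capacity below threshold only counts as negative evidence under strong coverage.
    negative = ev.capacity_above_threshold is False and ev.coverage_strong

    if negative:
        max_label = 0
    elif ev.coherent_layers >= multi_layer_min:
        max_label = 4  # labels 3 and 4 need coherent multi-layer evidence
    else:
        max_label = 2  # capacity alone, or a single layer, cannot support 3 or 4
    return RuleResult(max_label, negative, warnings)
```

Returning a cap rather than a label keeps the rules conservative: they can rule labels out, but they never assert a high label on their own.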
## commands
Train, evaluate, and generate all default artifacts:
```bash
python src/datacenter_verification_modeling/train_model.py \
--features data/synthetic_v0/features/window_features_all.csv \
--output data/model_runs/synthetic_v0_baseline \
--seed 20260510
```
Evaluate an existing run:
```bash
python src/datacenter_verification_modeling/evaluate_model.py \
--model-run data/model_runs/synthetic_v0_baseline \
--features data/synthetic_v0/features/window_features_all.csv
```
Generate predictions from an existing run:
```bash
python src/datacenter_verification_modeling/predict.py \
--model-run data/model_runs/synthetic_v0_baseline \
--features data/synthetic_v0/features/window_features_all.csv \
--output data/model_runs/synthetic_v0_baseline/predictions_all.csv
```
Prepare split and feature metadata without training:
```bash
python src/datacenter_verification_modeling/prepare_features.py \
--features data/synthetic_v0/features/window_features_all.csv \
--output data/model_runs/synthetic_v0_baseline \
--seed 20260510
```
## governance outputs
Prediction files include:
- `p_label_0` through `p_label_4`
- `raw_p_label_0` through `raw_p_label_4`
- `p_large_training`
- `severity_score`
- `capacity_possible`
- `negative_certification_confidence`
- `integrity_warning`
- `critical_missing_layers`
- `top_evidence`
The `p_label_*` columns are post-processed with the capacity gate; the `raw_p_label_*` columns preserve the calibrated model probabilities before that gate.
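The exact form of the capacity gate is not specified here. As a rough illustration, one plausible gate zeroes the probability of the highest labels when `capacity_possible` is false and renormalizes the rest. The function name, the choice of labels 3 and 4 as the gated set, and the uniform fallback are assumptions, not the package's documented behavior.

```python
def apply_capacity_gate(raw_probs, capacity_possible, gated_labels=(3, 4)):
    """Map raw_p_label_* values to gated p_label_* values for one row.

    raw_probs: calibrated probabilities for labels 0..4, in label order.
    gated_labels: labels assumed to be impossible without capacity (assumption).
    """
    if capacity_possible:
        # Gate is open: probabilities pass through unchanged.
        return list(raw_probs)

    # Gate is closed: remove mass from the gated labels.
    kept = [0.0 if i in gated_labels else p for i, p in enumerate(raw_probs)]
    total = sum(kept)
    if total == 0.0:
        # All mass sat on gated labels; fall back to uniform over the open labels.
        open_labels = [i for i in range(len(raw_probs)) if i not in gated_labels]
        return [1.0 / len(open_labels) if i in open_labels else 0.0
                for i in range(len(raw_probs))]
    # Renormalize so the gated row still sums to 1.
    return [p / total for p in kept]
```

Keeping both the raw and the gated columns, as the prediction files do, lets a reviewer see how much probability mass the gate removed for any given row.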
## limitations
This is a synthetic v0 prototype. Metrics are useful for testing the pipeline, not for deployment claims. Real datacenter use would require real telemetry, controlled drills, operational calibration, and independent review.