	# datacenter verification modeling

	This package trains and evaluates the first public baseline model for the synthetic v0 datacenter training-run verification dataset.

	The training unit is one row from:

	```text
	data/synthetic_v0/features/window_features_all.csv
	```

	The model does not read raw telemetry samples, events, or snapshots directly.

	## outputs

	The default model run is written to:

	```text
	data/model_runs/synthetic_v0_baseline/
	```

	It contains the fitted calibrated model, preprocessing pipeline, split manifest, feature metadata, predictions, metrics, calibration diagnostics, feature importance, and an evidence audit sample.

	## leakage controls

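	Rows are split by `episode_id`, not randomly. This is required because the same latent episode appears in multiple adjacent windows, so a random row-level split would place near-duplicate windows from one episode in both the training and test sets. The default split is scenario-stratified at the episode level with seed `20260510`: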

	```text
	train: 60%
	validation/calibration: 20%
	test: 20%
	```
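The episode-level split described above can be sketched with the standard library alone. This is a minimal illustration, not the package's own code: the function name `split_episodes`, the scenario names, and the toy episode ids are assumptions. Only the seed and the 60/20/20 proportions come from this README.

```python
import random
from collections import defaultdict


def split_episodes(episode_scenarios, seed=20260510, fractions=(0.6, 0.2, 0.2)):
    """Assign each episode to one split, stratified by scenario.

    episode_scenarios maps episode_id -> scenario name (hypothetical schema).
    """
    rng = random.Random(seed)

    # Group episode ids by scenario so each scenario is split separately.
    by_scenario = defaultdict(list)
    for episode_id, scenario in episode_scenarios.items():
        by_scenario[scenario].append(episode_id)

    assignment = {}
    for scenario in sorted(by_scenario):
        episodes = sorted(by_scenario[scenario])  # fixed order before shuffling
        rng.shuffle(episodes)
        n_train = round(fractions[0] * len(episodes))
        n_val = round(fractions[1] * len(episodes))
        for i, episode_id in enumerate(episodes):
            if i < n_train:
                assignment[episode_id] = "train"
            elif i < n_train + n_val:
                assignment[episode_id] = "validation"
            else:
                assignment[episode_id] = "test"
    return assignment


# Toy data: 20 episodes in two scenarios, 3 windows (rows) per episode.
episode_scenarios = {
    f"ep{i:03d}": ("idle" if i % 2 else "training") for i in range(20)
}
assignment = split_episodes(episode_scenarios)

rows = [{"episode_id": ep, "window": w} for ep in episode_scenarios for w in range(3)]
# Every row inherits the split of its episode, so windows never straddle splits.
row_split = [assignment[row["episode_id"]] for row in rows]
```

Because the assignment is made per episode and only then mapped onto rows, all windows of one episode land in the same split.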

	The supervised model excludes identifiers, direct labels, site id, episode id, raw manifest hashes, and synthetic-only audit columns such as `latent_workload_class` and `synthetic_evidence_profile`.
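A column filter of this kind might look like the sketch below. It is illustrative only: apart from `episode_id`, `latent_workload_class`, and `synthetic_evidence_profile`, every column name here (including `site_id`, `label`, the `_hash` suffix, and the feature names) is an assumption, as is the helper name `select_model_columns`.

```python
# Columns dropped by exact name. Only episode_id, latent_workload_class and
# synthetic_evidence_profile are named in the README; the rest are assumed.
EXCLUDED_EXACT = {
    "episode_id",
    "site_id",
    "label",
    "latent_workload_class",
    "synthetic_evidence_profile",
}
# Hypothetical naming convention for raw manifest hashes.
EXCLUDED_SUFFIXES = ("_hash",)


def select_model_columns(columns):
    """Return the columns that are allowed to reach the supervised model."""
    return [
        name
        for name in columns
        if name not in EXCLUDED_EXACT and not name.endswith(EXCLUDED_SUFFIXES)
    ]


# Toy header with hypothetical feature names.
header = [
    "episode_id",
    "site_id",
    "power_mean_kw",
    "network_egress_gbps",
    "manifest_hash",
    "latent_workload_class",
    "synthetic_evidence_profile",
    "label",
]
model_columns = select_model_columns(header)
```

Keeping the exclusion list in one place makes it easy to assert in tests that no identifier or audit column reaches the model.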

	## model

	The supervised baseline is:

	```text
	SimpleImputer + OneHotEncoder preprocessing
	HistGradientBoostingClassifier
	CalibratedClassifierCV on the validation split
	```

	The package also includes `rule_baseline.py`, a deterministic evidence-rule baseline that encodes broad study logic:

	- capacity below threshold with strong coverage is negative evidence
	- capacity alone does not prove training
	- missing data is not zero activity
	- integrity anomalies are warnings, not positive proof
	- labels 3 and 4 require coherent multi-layer evidence
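The rules listed above could be encoded roughly as follows. This is a hypothetical sketch and does not reproduce `rule_baseline.py`: the field names, thresholds, the choice to abstain with `None`, and the specific labels returned (0, 1, 3) are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class WindowEvidence:
    capacity_fraction: Optional[float]       # None means "not observed"
    coverage: float                          # telemetry coverage in [0, 1]
    layer_signals: Dict[str, Optional[float]]  # per-layer score, None if missing
    integrity_anomaly: bool = False


@dataclass
class RuleResult:
    label: Optional[int]   # None means the rules abstain
    reason: str
    warnings: List[str]


def rule_baseline(w, capacity_threshold=0.5, strong_coverage=0.8,
                  signal_threshold=0.7, min_layers=2):
    # Integrity anomalies are recorded as warnings and never raise the label.
    warnings = ["integrity_anomaly"] if w.integrity_anomaly else []

    # Missing data is not zero activity: abstain rather than assume idle.
    if w.capacity_fraction is None:
        return RuleResult(None, "capacity not observed", warnings)

    if w.capacity_fraction < capacity_threshold:
        # Low capacity counts as negative evidence only under strong coverage.
        if w.coverage >= strong_coverage:
            return RuleResult(0, "capacity below threshold with strong coverage", warnings)
        return RuleResult(None, "low capacity but weak coverage", warnings)

    # Capacity alone does not prove training; require agreeing layers.
    observed = {k: v for k, v in w.layer_signals.items() if v is not None}
    strong = [k for k, v in observed.items() if v >= signal_threshold]
    if len(strong) >= min_layers and len(strong) == len(observed):
        return RuleResult(3, "coherent multi-layer evidence", warnings)
    return RuleResult(1, "capacity possible, training not proven", warnings)


negative = rule_baseline(WindowEvidence(0.2, 0.95, {}, integrity_anomaly=True))
abstain = rule_baseline(WindowEvidence(0.2, 0.30, {}))
capacity_only = rule_baseline(WindowEvidence(0.9, 0.9, {"power": None, "network": None}))
coherent = rule_baseline(WindowEvidence(0.9, 0.9, {"power": 0.9, "network": 0.8}))
```

Returning a reason string alongside the label keeps each decision auditable, which matters more for a rule baseline than raw accuracy.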

	## commands

	Train, evaluate, and generate all default artifacts:

	```bash
	python src/datacenter_verification_modeling/train_model.py \
	--features data/synthetic_v0/features/window_features_all.csv \
	--output data/model_runs/synthetic_v0_baseline \
	--seed 20260510
	```

	Evaluate an existing run:

	```bash
	python src/datacenter_verification_modeling/evaluate_model.py \
	--model-run data/model_runs/synthetic_v0_baseline \
	--features data/synthetic_v0/features/window_features_all.csv
	```

	Generate predictions from an existing run:

	```bash
	python src/datacenter_verification_modeling/predict.py \
	--model-run data/model_runs/synthetic_v0_baseline \
	--features data/synthetic_v0/features/window_features_all.csv \
	--output data/model_runs/synthetic_v0_baseline/predictions_all.csv
	```

	Prepare split and feature metadata without training:

	```bash
	python src/datacenter_verification_modeling/prepare_features.py \
	--features data/synthetic_v0/features/window_features_all.csv \
	--output data/model_runs/synthetic_v0_baseline \
	--seed 20260510
	```

	## governance outputs

	Prediction files include:

	- `p_label_0` through `p_label_4`
	- `raw_p_label_0` through `raw_p_label_4`
	- `p_large_training`
	- `severity_score`
	- `capacity_possible`
	- `negative_certification_confidence`
	- `integrity_warning`
	- `critical_missing_layers`
	- `top_evidence`

	The `p_label_` columns are post-processed with the capacity gate; the `raw_p_label_` columns preserve the calibrated model probabilities before that gate.
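This README does not spell out how the capacity gate works. The sketch below shows one plausible form, purely as an assumption: when `capacity_possible` is false, probability mass on the highest labels is removed and the remaining probabilities are renormalized. The function name, the choice of labels 3 and 4 as the gated set, and the uniform fallback are all hypothetical.

```python
def apply_capacity_gate(raw_probs, capacity_possible, gated_labels=(3, 4)):
    """Return gated probabilities for one row (assumed behaviour, see lead-in).

    raw_probs is the list [raw_p_label_0, ..., raw_p_label_4].
    """
    if capacity_possible:
        # Gate is open: calibrated probabilities pass through unchanged.
        return list(raw_probs)

    # Gate is closed: zero out the gated labels.
    gated = [0.0 if i in gated_labels else p for i, p in enumerate(raw_probs)]
    total = sum(gated)
    if total == 0.0:
        # All mass sat on gated labels; spread it evenly over the others.
        open_labels = [i for i in range(len(raw_probs)) if i not in gated_labels]
        return [1.0 / len(open_labels) if i in open_labels else 0.0
                for i in range(len(raw_probs))]
    # Renormalize so the gated row still sums to 1.
    return [p / total for p in gated]


raw = [0.1, 0.1, 0.2, 0.4, 0.2]
open_gate = apply_capacity_gate(raw, capacity_possible=True)
closed_gate = apply_capacity_gate(raw, capacity_possible=False)
```

Keeping both the raw and gated columns, as the prediction files do, lets a reviewer see exactly how much the gate changed each row.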

	## limitations

	This is a synthetic v0 prototype. Metrics are useful for testing the pipeline, not for deployment claims. Real datacenter use would require real telemetry, controlled drills, operational calibration, and independent review.