# episodic-ingestion-compiler – Rust integration bundle
Rust-native packaging of the ingestion-model-spec-v1 encoder
(ModernBERT-large + 5 heads) for consumers using candle-transformers.
Zero Python runtime needed.
The full training checkpoint (`best.pt` + `calibrator.joblib`) lives at
`Avifenesh/episodic-ingestion-compiler-modernbert-large-span-5000`.
## Headline metrics (13,786-row held-out test split)
| Metric | Value |
|---|---|
| schema_validity | 1.000 |
| abstention_accuracy | 0.976 |
| claim_present F1 | 0.962 |
| false_emission_rate | 0.017 |
| span_token_IoU | 0.969 |
| predicate_accuracy_macro | 0.966 |
| subject_type_accuracy_macro | 0.997 |
## Contents
- `model.safetensors` – ModernBERT backbone + 5 heads, bf16, 790 MB
- `config.json` – architecture + training args + head schema
- `tokenizer.json` – compatible with the HF tokenizers crate
- `spec.json` – spec-v1 contract (predicates, subject_types, object_value shapes, literal fields)
- `calibrator.json` – isotonic regression as piecewise-linear points
- `tensor_index.json` – per-tensor shape/dtype (debug aid)
Backbone keys are remapped to `backbone.model.*` so
`candle_transformers::models::modernbert::ModernBert::load(vb.pp("backbone"), ...)`
resolves without any glue code.
## Loading in Rust
```toml
[dependencies]
candle-core = "0.9"
candle-nn = "0.9"
candle-transformers = "0.9"
tokenizers = { version = "0.22", default-features = false, features = ["onig"] }
safetensors = "0.4"
```
Working crate + example at https://github.com/avifenesh/episodic-ingestion-compiler/tree/main/rust/ingestion-model
```rust
use candle_core::Device;
use ingestion_model::{Bundle, IngestionEncoder, Role};

let bundle = Bundle::load("ingestion_model_v1")?;
let device = Device::cuda_if_available(0).unwrap_or(Device::Cpu);
let encoder = IngestionEncoder::load(&bundle, device)?;

let raw = encoder.predict(
    Role::Assistant,
    "I edited crates/foo/src/lib.rs and ran cargo test which passed.",
)?;
// raw = claim_prob, predicate, class_hint, subject_type,
//       confidence_raw, span_char (in original blurb), span_tok

let calibrated = bundle.calibrator.as_ref()
    .map(|c| c.apply(raw.confidence_raw))
    .unwrap_or(raw.confidence_raw);
```
On CPU (single-core gemm), warm-start latency is ~100 ms/request; the bf16
weights are automatically upcast to fp32, since candle's CPU matmul doesn't
support bf16. Batched and CUDA paths are exposed via `predict_batch` and
`--features cuda`.
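For a consumer crate, the CUDA path would be enabled through Cargo features. A hedged sketch, assuming the crate forwards a `cuda` feature to candle (the flag name is taken from the `--features cuda` note above; the exact dependency spec may differ):

```toml
[dependencies]
# Hypothetical wiring: pulls the ingestion-model package from the repo's
# workspace and turns on its cuda feature.
ingestion-model = { git = "https://github.com/avifenesh/episodic-ingestion-compiler", features = ["cuda"] }
```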
## Spec-v1 contract
Closed vocabularies live in `spec.json`:
- 21 predicates – Class S (declarative) + Class E (evidence) + reserved
  (`reverted_file`, `deleted_file`, `created_file`, `committed`, `deployed`, `incident_observed`)
- 13 subject_types – `objective`, `command`, `file`, `pr`, `incident`, `policy`,
  `person`, `repo`, `team`, `service`, `document`, `ticket`, `thread`
- 2 deferred predicates that the model MUST NEVER emit – `has_current_input`, `has_phase`
- object_value shapes per predicate family, for the wrapper
The model emits the information-bearing subset; the caller's
deterministic wrapper fills in the literal fields (`status="candidate"`,
`source_authority="model_draft"`, `extraction_method="ingestion_model_v1"`),
the per-predicate object_value dicts, subject_id extraction from the
span, and spec validation (silent-drop on failure).
Python reference implementation: `scripts/ingestion_wrapper.py`.
Rust crate ships only the encoder for now; the wrapper lives in the
caller's crate so it can evolve independently of the model.
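The wrapper contract above can be sketched in Rust. This is an illustrative sketch, not the shipped API: `RawPrediction`, `Claim`, `wrap`, and all field names are hypothetical; only the three literal values come from the contract description above.

```rust
// Illustrative model output subset (hypothetical type, not from the crate).
#[allow(dead_code)]
struct RawPrediction {
    claim_prob: f32,
    predicate: String,
    subject_type: String,
    span_char: (usize, usize),
}

// Illustrative wrapped claim with the deterministic literal fields.
#[allow(dead_code)]
struct Claim {
    predicate: String,
    subject_type: String,
    span_char: (usize, usize),
    status: &'static str,
    source_authority: &'static str,
    extraction_method: &'static str,
}

/// Deterministic caller-side wrapper: drops low-confidence predictions
/// silently and fills in the literal fields the model never emits.
fn wrap(raw: RawPrediction, threshold: f32) -> Option<Claim> {
    if raw.claim_prob < threshold {
        return None; // silent-drop, per the spec-v1 contract
    }
    Some(Claim {
        predicate: raw.predicate,
        subject_type: raw.subject_type,
        span_char: raw.span_char,
        status: "candidate",
        source_authority: "model_draft",
        extraction_method: "ingestion_model_v1",
    })
}

fn main() {
    let raw = RawPrediction {
        claim_prob: 0.9,
        predicate: "edited_file".into(), // illustrative predicate name
        subject_type: "file".into(),
        span_char: (9, 30),
    };
    let claim = wrap(raw, 0.5).expect("above threshold");
    println!("{} / {}", claim.status, claim.extraction_method);
}
```

The real wrapper also resolves per-predicate object_value shapes and subject_id extraction, which this sketch omits.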
## Calibration
`calibrator.json` is a piecewise-linear isotonic map. Raw-confidence
ECE 0.340 → post-calibration ECE 0.000 on the 2,879-row calibration split.
To apply it, find the knot interval containing the raw confidence c,
interpolate linearly with t = (c - x[i]) / (x[i+1] - x[i]), i.e.
y = y[i] + t * (y[i+1] - y[i]), and clip to the endpoint values when c
falls outside the knot range.
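The apply rule above is a few lines of Rust. A minimal sketch, assuming the knots are stored as parallel ascending `x`/`y` arrays (the exact `calibrator.json` field names may differ):

```rust
/// Piecewise-linear isotonic calibration map: `x` are raw-confidence
/// knots in ascending order, `y` the calibrated values at those knots.
struct Calibrator {
    x: Vec<f32>,
    y: Vec<f32>,
}

impl Calibrator {
    /// Linear interpolation between knots, clipped to the endpoint
    /// values when `c` falls outside the knot range.
    fn apply(&self, c: f32) -> f32 {
        if c <= self.x[0] {
            return self.y[0];
        }
        if c >= *self.x.last().unwrap() {
            return *self.y.last().unwrap();
        }
        // Find the segment [x[i], x[i+1]] containing c.
        let i = self
            .x
            .windows(2)
            .position(|w| c >= w[0] && c <= w[1])
            .unwrap();
        let t = (c - self.x[i]) / (self.x[i + 1] - self.x[i]);
        self.y[i] + t * (self.y[i + 1] - self.y[i])
    }
}

fn main() {
    // Toy knots for illustration, not values from calibrator.json.
    let cal = Calibrator {
        x: vec![0.0, 0.5, 1.0],
        y: vec![0.0, 0.25, 1.0],
    };
    println!("{} {} {}", cal.apply(-1.0), cal.apply(0.25), cal.apply(2.0));
}
```

Isotonic regression guarantees `y` is non-decreasing, so the map preserves the ranking of raw confidences while fixing their scale.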
## License / lineage
ModernBERT-large is Apache-2.0. The 5 adapter heads and calibrator are derivatives; training code is at https://github.com/avifenesh/episodic-ingestion-compiler.
Base model: answerdotai/ModernBERT-large