episodic-ingestion-compiler β€” Rust integration bundle

Rust-native packaging of the ingestion-model-spec-v1 encoder (ModernBERT-large + 5 heads) for consumers using candle-transformers. Zero Python runtime needed.

Full training checkpoint (best.pt + calibrator.joblib) lives at Avifenesh/episodic-ingestion-compiler-modernbert-large-span-5000.

Headline metrics (13,786-row held-out test split)

Metric                        Value
schema_validity               1.000
abstention_accuracy           0.976
claim_present F1              0.962
false_emission_rate           0.017
span_token_IoU                0.969
predicate_accuracy_macro      0.966
subject_type_accuracy_macro   0.997

Contents

model.safetensors     β€” ModernBERT backbone + 5 heads, bf16, 790 MB
config.json           β€” arch + trained args + head schema
tokenizer.json        β€” HF tokenizers-crate compatible
spec.json             β€” spec-v1 contract (predicates, subject_types,
                        object_value shapes, literal fields)
calibrator.json       β€” isotonic regression as piecewise-linear points
tensor_index.json     β€” per-tensor shape/dtype (debug aid)

Backbone keys are remapped to backbone.model.* so candle_transformers::models::modernbert::ModernBert::load(vb.pp("backbone"), ...) resolves without any glue code.

Loading in Rust

[dependencies]
candle-core = "0.9"
candle-nn = "0.9"
candle-transformers = "0.9"
tokenizers = { version = "0.22", default-features = false, features = ["onig"] }
safetensors = "0.4"

Working crate + example at https://github.com/avifenesh/episodic-ingestion-compiler/tree/main/rust/ingestion-model

use candle_core::Device;
use ingestion_model::{Bundle, IngestionEncoder, Role};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let bundle = Bundle::load("ingestion_model_v1")?;
    // Prefer CUDA when available, otherwise fall back to CPU.
    let device = Device::cuda_if_available(0).unwrap_or(Device::Cpu);
    let encoder = IngestionEncoder::load(&bundle, device)?;

    let raw = encoder.predict(
        Role::Assistant,
        "I edited crates/foo/src/lib.rs and ran cargo test which passed.",
    )?;
    // raw = claim_prob, predicate, class_hint, subject_type,
    //       confidence_raw, span_char (in original blurb), span_tok
    let calibrated = bundle
        .calibrator
        .as_ref()
        .map(|c| c.apply(raw.confidence_raw))
        .unwrap_or(raw.confidence_raw);
    println!("calibrated confidence: {calibrated}");
    Ok(())
}

On CPU (single-core gemm), warm-start is ~100 ms/req with bf16 weights automatically upcast to fp32 (candle's CPU matmul doesn't support bf16). Batched / CUDA paths exposed via predict_batch and --features cuda.

Spec-v1 contract

Closed vocabularies are in spec.json:

  • 21 predicates β€” Class S (declarative) + Class E (evidence) + reserved (reverted_file, deleted_file, created_file, committed, deployed, incident_observed)
  • 13 subject_types β€” objective, command, file, pr, incident, policy, person, repo, team, service, document, ticket, thread
  • 2 deferred predicates that the model MUST NEVER emit β€” has_current_input, has_phase
  • object_value shapes per predicate family, for the wrapper
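A minimal sketch of how a caller might enforce the closed vocabulary before accepting a prediction. The predicate names below are an illustrative subset; the real sets come from spec.json, and the two deferred predicates are the ones named above.

```rust
use std::collections::HashSet;

/// A predicate is emittable only if it is in the closed vocabulary
/// and is not one of the deferred predicates.
fn is_emittable(predicate: &str, allowed: &HashSet<&str>, deferred: &HashSet<&str>) -> bool {
    allowed.contains(predicate) && !deferred.contains(predicate)
}

fn main() {
    // Illustrative subsets; load the full lists from spec.json in practice.
    let allowed: HashSet<&str> = HashSet::from(["created_file", "committed", "deployed"]);
    let deferred: HashSet<&str> = HashSet::from(["has_current_input", "has_phase"]);

    assert!(is_emittable("committed", &allowed, &deferred));
    assert!(!is_emittable("has_phase", &allowed, &deferred)); // deferred: never emit
    assert!(!is_emittable("made_coffee", &allowed, &deferred)); // not in vocabulary
}
```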

The model emits only the information-bearing subset. The caller's deterministic wrapper fills in the literal fields (status="candidate", source_authority="model_draft", extraction_method="ingestion_model_v1") and the per-predicate object_value dicts, extracts subject_id from the span, and performs spec validation (invalid claims are silently dropped).
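A hedged sketch of the caller-side wrapper described above. The struct shape and the subject_id derivation are assumptions for illustration; only the three literal field values are taken from the text.

```rust
/// Illustrative claim record; the real wrapper also carries the
/// per-predicate object_value dict and runs spec validation.
#[derive(Debug)]
struct Claim {
    predicate: String,
    subject_id: String,
    status: &'static str,
    source_authority: &'static str,
    extraction_method: &'static str,
}

fn wrap(predicate: &str, span_text: &str) -> Claim {
    Claim {
        predicate: predicate.to_string(),
        // Placeholder subject_id: the raw span text. The real wrapper
        // derives an id per subject_type (file path, PR number, etc.).
        subject_id: span_text.to_string(),
        status: "candidate",
        source_authority: "model_draft",
        extraction_method: "ingestion_model_v1",
    }
}

fn main() {
    let claim = wrap("created_file", "crates/foo/src/lib.rs");
    println!("{claim:?}");
}
```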

Python reference implementation: scripts/ingestion_wrapper.py. Rust crate ships only the encoder for now; the wrapper lives in the caller's crate so it can evolve independently of the model.

Calibration

calibrator.json is a piecewise-linear isotonic map. Raw confidence ECE 0.340 β†’ post-calibration ECE 0.000 on the 2,879-row cal split. Apply it by linear interpolation between knots: for a raw confidence c in [x[i], x[i+1]], output y[i] + (c - x[i]) / (x[i+1] - x[i]) * (y[i+1] - y[i]); clip to the endpoint values outside the knot range.
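The interpolation above can be sketched as follows. The knot arrays are illustrative stand-ins; the real ones come from calibrator.json.

```rust
/// Piecewise-linear isotonic calibration: interpolate between knots
/// (x[i], y[i]), clipping to the endpoint values out of range.
fn apply_calibrator(x: &[f32], y: &[f32], c: f32) -> f32 {
    assert_eq!(x.len(), y.len());
    if c <= x[0] {
        return y[0];
    }
    if c >= x[x.len() - 1] {
        return y[y.len() - 1];
    }
    // Find the bracketing knot pair and interpolate linearly.
    let i = x.windows(2).position(|w| c >= w[0] && c <= w[1]).unwrap();
    let t = (c - x[i]) / (x[i + 1] - x[i]);
    y[i] + t * (y[i + 1] - y[i])
}

fn main() {
    // Toy knots, not the shipped calibrator.
    let x = [0.0, 0.5, 1.0];
    let y = [0.0, 0.8, 1.0];
    println!("{}", apply_calibrator(&x, &y, 0.25)); // prints 0.4
    println!("{}", apply_calibrator(&x, &y, 1.2)); // clipped to 1
}
```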

License / lineage

ModernBERT-large is Apache-2.0. The 5 adapter heads and calibrator are derivatives; training code is at https://github.com/avifenesh/episodic-ingestion-compiler.
