	# PISCO Compression Probe

	A lightweight classifier that predicts whether PISCO's compressed representation will yield a correct answer, which enables selective routing between compressed and full-context inference.

	Model: Mid-layer hidden state probe trained on [PISCO](https://arxiv.org/abs/2501.16075) decoder representations.

	Task: Binary classification. `0` = no overflow (the compressed answer is likely correct); `1` = information overflow (the compressed answer is likely wrong).

	## Revisions

	| Dataset | Revision | Train AUC | Test AUC |
	|----------|------------|-----------|----------|
	| Combined | `main` | 0.8258 | 0.7643 |
	| SQuAD | `squad_v2` | 0.7550 | 0.7059 |
	| HotpotQA | `hotpotqa` | 0.8234 | 0.7476 |
	| TriviaQA | `triviaqa` | 0.8977 | 0.8042 |

	<sub>Combined = SQuAD + HotpotQA + TriviaQA</sub>

	## Installation

	```bash
	pip install torch huggingface_hub
	```

	## Usage

	The model class is stored in the repo as `probe_clf.py` and is loaded from there at runtime, so no separate package needs to be installed.

	```python
	import importlib.util
	from huggingface_hub import hf_hub_download
	import torch

	# 1. Load the model class directly from the repo
	path = hf_hub_download("s-nlp/pisco-compression-probe", "probe_clf.py")
	spec = importlib.util.spec_from_file_location("probe_clf", path)
	mod = importlib.util.module_from_spec(spec)
	spec.loader.exec_module(mod)
	PISCOClassifier = mod.PISCOClassifier

	# 2. Pick a dataset revision (see the Revisions table)
	clf = PISCOClassifier.from_pretrained(
	    "s-nlp/pisco-compression-probe",
	    revision="hotpotqa",
	)

	# 3. Run the probe
	# X: last-token hidden state from PISCO decoder layer 16 (shape: N x 4096),
	# captured during the forward pass; the input is
	# [instruction + compressed context + query]
	probs = clf.predict_proba(X)  # np.ndarray, P(overflow)
	preds = clf.predict(X)        # binary, uses the stored threshold
	```
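The snippet above expects `X` to hold one vector per example: the hidden state of the last token at decoder layer 16. A minimal sketch of that selection step follows, assuming the layer's activations are already available as an array of shape `(N, T, D)` together with an attention mask, and that sequences are right-padded. The function name and argument names are hypothetical and not part of this repo, and how the activations are captured from PISCO is not shown here.

```python
import numpy as np


def last_token_states(layer_hidden, attention_mask):
    """Select the hidden state of the last non-padded token per sequence.

    layer_hidden:   array of shape (N, T, D), one layer's hidden states
    attention_mask: array of shape (N, T), 1 for real tokens, 0 for padding

    Assumption: right padding and at least one real token per sequence.
    """
    layer_hidden = np.asarray(layer_hidden)
    attention_mask = np.asarray(attention_mask)
    # Index of the last real token in each sequence, shape (N,)
    last_idx = attention_mask.sum(axis=1) - 1
    rows = np.arange(layer_hidden.shape[0])
    # Integer-array indexing picks one (D,) vector per row, giving (N, D)
    return layer_hidden[rows, last_idx]
```

With left-padded batches the last real token is simply position `T - 1`, so this helper would not be needed.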

	## Routing logic

	```
	# pred=0 → answer likely correct → use PISCO output
	# pred=1 → answer likely wrong → fall back to full context
	```
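The rule above can be sketched as a small routing helper. This is an illustration only: `route_answers` and the `full_context_answer` callback are hypothetical names, and the callback stands in for whatever full-context inference path is available.

```python
from typing import Callable, Iterable, List


def route_answers(
    preds: Iterable[int],
    queries: List[str],
    compressed_answers: List[str],
    full_context_answer: Callable[[str], str],
) -> List[str]:
    """Keep the PISCO answer when pred == 0, otherwise fall back.

    preds:               probe predictions (0 = no overflow, 1 = overflow)
    queries:             the original queries, aligned with preds
    compressed_answers:  answers generated from the compressed context
    full_context_answer: hypothetical callback that answers a query
                         using the full, uncompressed context
    """
    routed = []
    for pred, query, compressed in zip(preds, queries, compressed_answers):
        if int(pred) == 0:
            # No overflow predicted: the compressed answer is likely correct
            routed.append(compressed)
        else:
            # Overflow predicted: recompute with the full context
            routed.append(full_context_answer(query))
    return routed
```

The fallback is only invoked for examples flagged as overflow, so the cost of full-context inference is paid only where the probe predicts the compressed answer is likely wrong.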

	## Citation

	```bibtex
	@inproceedings{belikova-etal-2026-detecting,
	    title = "Detecting Overflow in Compressed Token Representations for Retrieval-Augmented Generation",
	    author = "Belikova, Julia and Rozhevskii, Danila and Svirin, Dennis and Polev, Konstantin and Panchenko, Alexander",
	    editor = "Baez Santamaria, Selene and Somayajula, Sai Ashish and Yamaguchi, Atsuki",
	    booktitle = "Proceedings of the 19th Conference of the {E}uropean Chapter of the {A}ssociation for {C}omputational {L}inguistics (Volume 4: Student Research Workshop)",
	    month = mar,
	    year = "2026",
	    address = "Rabat, Morocco",
	    publisher = "Association for Computational Linguistics",
	    url = "https://aclanthology.org/2026.eacl-srw.59/",
	    pages = "797--810",
	    ISBN = "979-8-89176-383-8"
	}
	```