# PISCO Compression Probe
A lightweight classifier that predicts whether PISCO's compressed representation will yield a correct answer, enabling selective routing between compressed and full-context inference.
**Model:** Mid-layer hidden state probe trained on [PISCO](https://arxiv.org/abs/2501.16075) decoder representations.
**Task:** Binary classification: `0` = no overflow (compressed answer is likely correct), `1` = information overflow (compressed answer is likely wrong).
## Revisions
| Dataset | Revision | Train AUC | Test AUC |
|-----------|-------------|-----------|----------|
| Combined | `main` | 0.8258 | 0.7643 |
| SQuAD | `squad_v2` | 0.7550 | 0.7059 |
| HotpotQA | `hotpotqa` | 0.8234 | 0.7476 |
| TriviaQA | `triviaqa` | 0.8977 | 0.8042 |
<sub>*Combined = SQuAD + HotpotQA + TriviaQA*</sub>
## Installation
```bash
pip install torch huggingface_hub
```
## Usage
The model class is stored in the repo, so no local installation is needed.
```python
import importlib.util
from huggingface_hub import hf_hub_download
import torch
# 1. Load the model class directly from the repo
path = hf_hub_download("s-nlp/pisco-compression-probe", "probe_clf.py")
spec = importlib.util.spec_from_file_location("probe_clf", path)
mod = importlib.util.module_from_spec(spec)
spec.loader.exec_module(mod)
PISCOClassifier = mod.PISCOClassifier
# 2. Pick dataset revision
clf = PISCOClassifier.from_pretrained(
    "wexumin/pisco-compression-probe",
    revision="hotpotqa",
)
# X: last-token hidden state from PISCO decoder layer 16 (shape: N x 4096)
# Captured during the forward pass; the input is [instruction + compressed context + query]
probs = clf.predict_proba(X) # np.ndarray, P(overflow)
preds = clf.predict(X) # binary, uses stored threshold
```
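The snippet above expects `X` to hold one last-token hidden state per example. As a rough sketch of that selection step only, the helper below picks the last non-padding position from a batch of per-layer hidden states. It assumes right-padded inputs and an attention mask of ones and zeros; the helper name `last_token_states` and the toy arrays are hypothetical, and how you obtain the layer-16 states from the PISCO decoder is not shown here.

```python
import numpy as np

def last_token_states(layer_states, attention_mask):
    """Select the hidden state of the last non-padding token per sequence.

    layer_states:   array of shape (N, T, H), hidden states from one layer
    attention_mask: array of shape (N, T), 1 = real token, 0 = padding
                    (assumption: sequences are right-padded)
    """
    layer_states = np.asarray(layer_states)
    attention_mask = np.asarray(attention_mask)
    last_idx = attention_mask.sum(axis=1) - 1       # index of last real token, shape (N,)
    rows = np.arange(layer_states.shape[0])
    return layer_states[rows, last_idx]             # shape (N, H)

# Toy stand-in for real decoder states: N=2 sequences, T=3 tokens, H=4 dims
states = np.arange(2 * 3 * 4, dtype=np.float32).reshape(2, 3, 4)
mask = np.array([[1, 1, 1],
                 [1, 1, 0]])                        # second sequence has one pad token
X = last_token_states(states, mask)
```

With real PISCO states, `H` would be 4096 and the resulting `X` could be passed to `clf.predict_proba(X)`. If your tokenizer pads on the left, the last position is simply index `-1` and the mask arithmetic is unnecessary.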
## Routing logic
```
# pred=0 → answer likely correct → use PISCO output
# pred=1 → answer likely wrong → fall back to full context
```
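The routing rule can be sketched in a few lines of plain Python. This is an illustrative sketch, not part of the released code: `apply_threshold`, `route_answers`, the `full_context_fallback` callable, and the `0.5` threshold are all hypothetical placeholders (the probe stores its own threshold, and `clf.predict` already applies it).

```python
def apply_threshold(probs, threshold):
    """Turn P(overflow) scores into 0/1 predictions.

    Assumption: a score at or above the threshold counts as overflow.
    """
    return [1 if p >= threshold else 0 for p in probs]

def route_answers(preds, compressed_answers, full_context_fallback):
    """Keep the compressed answer when pred == 0, otherwise call the fallback.

    full_context_fallback: hypothetical callable taking the example index and
    returning an answer produced with the full, uncompressed context.
    """
    routed = []
    for i, (pred, answer) in enumerate(zip(preds, compressed_answers)):
        if pred == 0:
            routed.append(answer)                    # PISCO output is likely correct
        else:
            routed.append(full_context_fallback(i))  # overflow: rerun with full context
    return routed

# Toy example with made-up scores and answers
probs = [0.12, 0.81, 0.40]
preds = apply_threshold(probs, threshold=0.5)        # placeholder threshold
compressed = ["Paris", "Berlin", "Rome"]
final = route_answers(preds, compressed, lambda i: f"full-context answer #{i}")
```

Only the examples flagged as overflow trigger the more expensive full-context call, which is the point of selective routing.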
## Citation
```bibtex
@inproceedings{belikova-etal-2026-detecting,
title = "Detecting Overflow in Compressed Token Representations for Retrieval-Augmented Generation",
author = "Belikova, Julia and Rozhevskii, Danila and Svirin, Dennis and Polev, Konstantin and Panchenko, Alexander",
editor = "Baez Santamaria, Selene and Somayajula, Sai Ashish and Yamaguchi, Atsuki",
booktitle = "Proceedings of the 19th Conference of the {E}uropean Chapter of the {A}ssociation for {C}omputational {L}inguistics (Volume 4: Student Research Workshop)",
month = mar,
year = "2026",
address = "Rabat, Morocco",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/2026.eacl-srw.59/",
pages = "797--810",
ISBN = "979-8-89176-383-8"
}
```