# Overflow Probe on PISCO representations

A binary MLP probe that detects **token overflow** in soft-compressed document representations produced by [PISCO](https://arxiv.org/abs/2501.16075). Token overflow occurs when a document's information content exceeds the capacity of the compressed token budget, which degrades downstream QA performance.

## How It Works

The probe takes a 4096-dim vector:

| Component | Description |
|-----------|-------------|
| `mid_q` | Last hidden representation from the middle layer (16) of a PISCO decoder model, given the standard prompt, the compressed context, and a question. |

Output: the probability that the compressed representation has **overflowed** (i.e., lost critical information).

## Installation

```bash
pip install torch huggingface_hub
```

## Usage

### 1. Get the class definition

Loading the model requires the `PISCOClassifier` class. Grab it from this repo:

```python
from huggingface_hub import hf_hub_download
import importlib.util

# Download the class definition and import it as a module
path = hf_hub_download("wexumin/overflow_probe_pisco_squad", "pisco_clf.py")
spec = importlib.util.spec_from_file_location("pisco_clf", path)
mod = importlib.util.module_from_spec(spec)
spec.loader.exec_module(mod)
PISCOClassifier = mod.PISCOClassifier
```

### 2. Load the model

```python
model = PISCOClassifier.from_pretrained("wexumin/overflow_probe_pisco_squad")
```

### 3. Run inference

```python
# mid_q: layer-16 hidden representations from the PISCO decoder, shape (n, 4096)
x = mid_q
probs = model.predict_proba(x)  # (n,): overflow probability
preds = model.predict(x)        # (n,): binary 0/1 (a custom threshold parameter can be provided)
```

## Training Data

- **SQuAD**: extractive QA over Wikipedia paragraphs

Each context in the dataset was reduced to the sentence that answers the question and then filled with noise context up to 128 tokens, as counted by the PISCO encoder tokenizer.

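The context construction described under Training Data can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the helper `build_context` and its parameters are hypothetical, tokens are counted by whitespace splitting rather than with the PISCO encoder tokenizer, and placing the answer sentence first is an arbitrary choice.

```python
import random


def build_context(answer_sentence, noise_sentences, max_tokens=128, seed=0):
    """Keep the answer sentence and append noise sentences within a token budget.

    Assumption: whitespace tokens stand in for PISCO encoder tokens.
    """
    rng = random.Random(seed)
    pool = list(noise_sentences)
    rng.shuffle(pool)  # draw noise sentences in a reproducible random order

    parts = [answer_sentence]
    used = len(answer_sentence.split())
    for sentence in pool:
        n_tokens = len(sentence.split())
        if used + n_tokens > max_tokens:
            continue  # skip any sentence that would exceed the budget
        parts.append(sentence)
        used += n_tokens
    return " ".join(parts)


# Hypothetical example inputs
context = build_context(
    "Paris is the capital of France.",
    ["The sky is blue today."] * 50,
)
print(len(context.split()))
```

Swapping the whitespace split for the actual encoder tokenizer would change only the two token-counting lines.
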
## Architecture

```
→ Linear(4096, 512) → LayerNorm → GELU → Dropout(0.3)
→ Linear(512, 128) → GELU → Dropout(0.2)
→ Linear(128, 1)
```

## Citation

```bibtex
@inproceedings{belikova-etal-2026-detecting,
    title = "Detecting Overflow in Compressed Token Representations for Retrieval-Augmented Generation",
    author = "Belikova, Julia and Rozhevskii, Danila and Svirin, Dennis and Polev, Konstantin and Panchenko, Alexander",
    editor = "Baez Santamaria, Selene and Somayajula, Sai Ashish and Yamaguchi, Atsuki",
    booktitle = "Proceedings of the 19th Conference of the {E}uropean Chapter of the {A}ssociation for {C}omputational {L}inguistics (Volume 4: Student Research Workshop)",
    month = mar,
    year = "2026",
    address = "Rabat, Morocco",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2026.eacl-srw.59/",
    pages = "797--810",
    ISBN = "979-8-89176-383-8"
}
```