---
language:
- en
- ru
tags:
- efficientrag
- multi-hop-qa
- token-classification
- deberta-v3
license: mit
base_model: microsoft/mdeberta-v3-base
---
# EfficientRAG Filter (mdeberta-v3-base)

The **Filter** component of [EfficientRAG](https://arxiv.org/abs/2408.04259): it constructs the next-hop retrieval query via token selection.
## What it does

Given the original question plus the useful tokens extracted so far, the Filter selects which tokens to keep in the next retrieval query. The process is extractive (no generation): every token in the output query is picked from the input.
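As a toy illustration of extractive selection (the tokens and keep/discard labels below are made up, not actual model output), the next-hop query is simply the kept tokens joined in order:

```python
# Toy illustration of extractive token selection.
# Hypothetical tokens and labels, not real model output.
tokens = ["Who", "directed", "the", "film", "starring", "the", "actor", "?"]
labels = [0, 1, 0, 1, 0, 0, 1, 0]  # 1 = keep, 0 = discard

# The next-hop query is the kept tokens, joined in their original order.
next_query = " ".join(t for t, l in zip(tokens, labels) if l == 1)
print(next_query)  # directed film actor
```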
## Architecture

- Base: `microsoft/mdeberta-v3-base` (86M params, multilingual)
- Standard `DebertaV2ForTokenClassification` head with 2 labels (keep/discard)
## Training

| Setting | Value |
|--|--|
| Data | 5,691 samples (HotpotQA EN + Dragon-derec RU) |
| Epochs | 2 |
| Batch size | 4 |
| Learning rate | 1e-5 |
| Max sequence length | 128 |
| Hardware | Apple M3 Pro (~17 minutes) |
## Usage
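A minimal inference sketch. The repository id below is an assumption based on this card's naming, and the label mapping (index 1 = keep, index 0 = discard) is assumed from the training setup described above; check the checkpoint's `config.json` for the actual `id2label`.

```python
import torch
from transformers import AutoTokenizer, AutoModelForTokenClassification

# Assumed repo id (follows the naming of the related labeler model below).
model_id = "Necent/efficientrag-filter-mdeberta-v3-base"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForTokenClassification.from_pretrained(model_id)
model.eval()

# Input: original question + extracted useful tokens, as in training.
text = "Who directed the film? useful tokens: film actor award"
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=128)

with torch.no_grad():
    logits = model(**inputs).logits  # shape: (1, seq_len, 2)
preds = logits.argmax(dim=-1)[0]

# Keep tokens predicted as label 1 (assumed "keep") and join them
# back into the next-hop query string.
kept = [
    tokenizer.convert_ids_to_tokens(tid.item())
    for tid, p in zip(inputs["input_ids"][0], preds)
    if p.item() == 1
]
next_query = tokenizer.convert_tokens_to_string(kept)
print(next_query)
```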
## Related

- Training data: [Necent/efficientrag-filter-training-data](https://huggingface.co/datasets/Necent/efficientrag-filter-training-data)
- Labeler model: [Necent/efficientrag-labeler-mdeberta-v3-base](https://huggingface.co/Necent/efficientrag-labeler-mdeberta-v3-base)
- Paper: [EfficientRAG (arXiv:2408.04259)](https://arxiv.org/abs/2408.04259)