Improve model card: add library name, update license, link paper, and enhance content
#2
by nielsr (HF Staff) - opened

README.md CHANGED
@@ -1,29 +1,35 @@
 ---
-license: apache-2.0
-language:
-- en
 base_model:
 - state-spaces/mamba2-130m
+language:
+- en
+license: mit
 pipeline_tag: question-answering
+library_name: transformers
 ---
 
-# Single-Pass
+# Single-Pass Document Scanning for Question Answering
+
+This repository contains the model checkpoint for **Single-Pass Scanner**, a method introduced in the paper [Single-Pass Document Scanning for Question Answering](https://huggingface.co/papers/2504.03101).
 
-
+The model architecture is built on [mamba](https://github.com/state-spaces/mamba) and trained from [mamba2-130m](https://huggingface.co/state-spaces/mamba2-130m).
 
-
+## Abstract
+Handling extremely large documents for question answering is challenging: chunk-based embedding methods often lose track of important global context, while full-context transformers can be prohibitively expensive for hundreds of thousands of tokens. We propose a single-pass document scanning approach that processes the entire text in linear time, preserving global coherence while deciding which sentences are most relevant to the query. On 41 QA benchmarks, our single-pass scanner consistently outperforms chunk-based embedding methods and competes with large language models at a fraction of the computational cost. By conditioning on the entire preceding context without chunk breaks, the method preserves global coherence, which is especially important for long documents. Overall, single-pass document scanning offers a simple solution for question answering over massive text.
 
+## Code and Project Page
+The official GitHub repository for the project, containing all code, datasets, and additional details, is available at [https://github.com/MambaRetriever/MambaRetriever](https://github.com/MambaRetriever/MambaRetriever).
 
-
+## Installation
 
 We highly recommend creating a new conda environment first:
-```
+```bash
 conda create -n mamba_retriever python=3.10.14
 conda activate mamba_retriever
 ```
 
-Then, run the following in your terminal:
-```
+Then, run the following in your terminal to install the necessary packages and the `mamba` library:
+```bash
 git clone https://github.com/state-spaces/mamba.git
 conda install cudatoolkit==11.8 -c nvidia
 pip install -r requirements.txt
@@ -33,18 +39,54 @@ cd mamba
 pip install .
 ```
 
-Next, download and install the following two files from
-
-
-causal_conv1d-1.4.0+cu118torch2.1cxx11abiFALSE-cp310-cp310-linux_x86_64.whl
-```
+Next, download and install the following two `.whl` files for the Mamba dependencies from their respective release pages:
+- `mamba_ssm-2.2.2+cu118torch2.1cxx11abiFALSE-cp310-cp310-linux_x86_64.whl` from [https://github.com/state-spaces/mamba/releases](https://github.com/state-spaces/mamba/releases)
+- `causal_conv1d-1.4.0+cu118torch2.1cxx11abiFALSE-cp310-cp310-linux_x86_64.whl` from [https://github.com/Dao-AILab/causal-conv1d/releases](https://github.com/Dao-AILab/causal-conv1d/releases)
 
-You can install them using
-```
+You can install them using `pip`:
+```bash
 pip install mamba_ssm-2.2.2+cu118torch2.1cxx11abiFALSE-cp310-cp310-linux_x86_64.whl
 pip install causal_conv1d-1.4.0+cu118torch2.1cxx11abiFALSE-cp310-cp310-linux_x86_64.whl
 ```
 
-##
+## Usage
+
+Our model checkpoints are available on [Hugging Face](https://huggingface.co/MambaRetriever). You can load the model using the `transformers` library with `trust_remote_code=True`.
+
+```python
+from transformers import AutoModelForSequenceClassification, AutoTokenizer
+import torch
+
+# Load the model and tokenizer
+model_name = "MambaRetriever/SPScanner-130m"
+tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
+model = AutoModelForSequenceClassification.from_pretrained(model_name, trust_remote_code=True)
+
+# Move the model to GPU if available
+if torch.cuda.is_available():
+    model = model.cuda()
+
+# Simplified example; see the GitHub repository for the full pipeline
+question = "What is the capital of France?"
+context = "Paris is the capital and most populous city of France."
+
+# This model scans long documents for query-relevant sentences; the
+# following is a minimal loading-and-forward-pass example.
+inputs = tokenizer(question, context, return_tensors="pt", truncation=True, max_length=512)
+
+# Move inputs to the same device as the model
+if torch.cuda.is_available():
+    inputs = {k: v.cuda() for k, v in inputs.items()}
+
+# Get model outputs (e.g., relevance scores)
+with torch.no_grad():
+    outputs = model(**inputs)
+
+# Further processing to extract an answer follows the official repository's pipeline
+print("Model output logits:", outputs.logits)
+```
+
+## Evaluation and Training
 
-All evaluation code and
+All evaluation code and scripts, as well as training instructions and synthetic data generation steps, are available in the [Single-Pass Scanner GitHub repository](https://github.com/MambaRetriever/MambaRetriever). Refer to the `run_evaluation.sh` and `run_training.sh` scripts, along with the `Synthetic Data Generation` section of the GitHub README, for comprehensive guidance.