	---
	base_model:
	- state-spaces/mamba2-1.3b
	language:
	- en
	license: mit
	pipeline_tag: question-answering
	library_name: transformers
	---

	# Single-Pass Document Scanning for Question Answering

	This repository contains the model checkpoint from the paper [Single-Pass Document Scanning for Question Answering](https://huggingface.co/papers/2504.03101).

	The Single-Pass Scanner addresses the challenge of handling extremely large documents for question answering by processing the entire text in linear time, preserving global coherence while identifying the most relevant sentences for a given query. Built upon the Mamba architecture, it offers a computationally efficient solution for QA over massive text.
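
As a rough illustration of the "identify the most relevant sentences" step, the sketch below assumes the scanner has already produced one relevance score per sentence. The helper name `select_top_sentences` and the scores are hypothetical and are not part of the released code. The sketch covers only the selection logic, not how the model computes the scores.

```python
from typing import List, Sequence


def select_top_sentences(
    sentences: Sequence[str], scores: Sequence[float], k: int
) -> List[str]:
    """Return the k highest-scoring sentences, kept in document order."""
    if len(sentences) != len(scores):
        raise ValueError("need exactly one score per sentence")
    if k < 0:
        raise ValueError("k must be non-negative")
    # Rank sentence indices by score, highest first. sorted() is stable,
    # so sentences with equal scores keep their document order.
    ranked = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
    # Restore document order so the selected context reads coherently.
    keep = sorted(ranked[:k])
    return [sentences[i] for i in keep]


# Made-up sentences and scores, for illustration only.
doc = [
    "Mamba is a state space model.",
    "The sky was grey.",
    "It scans text in linear time.",
    "Lunch was late.",
]
print(select_top_sentences(doc, [0.9, 0.1, 0.8, 0.2], k=2))
# ['Mamba is a state space model.', 'It scans text in linear time.']
```

Returning the selected sentences in their original order, rather than in score order, is a design choice of this sketch. It keeps the extracted context readable for whatever component consumes it next.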

	## Abstract
	Handling extremely large documents for question answering is challenging: chunk-based embedding methods often lose track of important global context, while full-context transformers can be prohibitively expensive for hundreds of thousands of tokens. We propose a single-pass document scanning approach that processes the entire text in linear time, preserving global coherence while deciding which sentences are most relevant to the query. On 41 QA benchmarks, our single-pass scanner consistently outperforms chunk-based embedding methods and competes with large language models at a fraction of the computational cost. By conditioning on the entire preceding context without chunk breaks, the method preserves global coherence, which is especially important for long documents. Overall, single-pass document scanning offers a simple solution for question answering over massive text.

	For the official code, setup instructions, and detailed evaluation, please refer to the [Single-Pass Scanner GitHub repository](https://github.com/MambaRetriever/MambaRetriever). The training and evaluation datasets are available at [Hugging Face Datasets](https://huggingface.co/datasets/MambaRetriever/MambaRetriever).

	The model architecture is built upon [Mamba](https://github.com/state-spaces/mamba), and training starts from the [mamba2-1.3b](https://huggingface.co/state-spaces/mamba2-1.3b) checkpoint.

	## Usage

	We highly recommend creating a new conda environment first:
	```
	conda create -n mamba_retriever python=3.10.14
	conda activate mamba_retriever
	```

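	Then, run the following in your terminal from the directory that contains `requirements.txt` (the `pip install -r` step reads it from the current working directory):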
	```
	git clone https://github.com/state-spaces/mamba.git
	conda install cudatoolkit==11.8 -c nvidia
	pip install -r requirements.txt
	pip3 install torch==2.1.1 torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
	pip install accelerate -U
	cd mamba
	pip install .
	```

	Next, download the following two prebuilt wheels, `mamba_ssm` from https://github.com/state-spaces/mamba/releases and `causal_conv1d` from https://github.com/Dao-AILab/causal-conv1d/releases:
	```
	mamba_ssm-2.2.2+cu118torch2.1cxx11abiFALSE-cp310-cp310-linux_x86_64.whl
	causal_conv1d-1.4.0+cu118torch2.1cxx11abiFALSE-cp310-cp310-linux_x86_64.whl
	```

	Once downloaded, install them with:
	```
	pip install mamba_ssm-2.2.2+cu118torch2.1cxx11abiFALSE-cp310-cp310-linux_x86_64.whl
	pip install causal_conv1d-1.4.0+cu118torch2.1cxx11abiFALSE-cp310-cp310-linux_x86_64.whl
	```
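
The wheel filenames above encode the environment they were built for, which is why the Python, PyTorch, and CUDA versions in the earlier steps are pinned. The following is a minimal sketch, using only the standard library, that unpacks those filenames and compares the Python tag with the running interpreter. `parse_wheel` and `interpreter_tag` are hypothetical helpers written for this example, and reading `cu118` as CUDA 11.8 is an assumption about the naming used by these release files.

```python
import re
import sys
from typing import Dict

WHEELS = [
    "mamba_ssm-2.2.2+cu118torch2.1cxx11abiFALSE-cp310-cp310-linux_x86_64.whl",
    "causal_conv1d-1.4.0+cu118torch2.1cxx11abiFALSE-cp310-cp310-linux_x86_64.whl",
]

# Build suffix after "+": cu<CUDA digits>torch<X.Y>cxx11abi<TRUE|FALSE>
_LOCAL_SUFFIX = re.compile(r"cu(\d+)torch(\d+\.\d+)cxx11abi(TRUE|FALSE)")


def parse_wheel(filename: str) -> Dict[str, str]:
    """Split a wheel filename into the fields that must match the environment."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel file: {filename!r}")
    parts = filename[: -len(".whl")].split("-")
    if len(parts) != 5:
        raise ValueError(f"unexpected wheel filename layout: {filename!r}")
    # Wheel layout: name-version-python_tag-abi_tag-platform_tag
    name, version, python_tag, abi_tag, platform_tag = parts
    public_version, _, local = version.partition("+")
    match = _LOCAL_SUFFIX.fullmatch(local)
    if match is None:
        raise ValueError(f"unrecognised build suffix: {local!r}")
    cuda_digits, torch_version, cxx11_abi = match.groups()
    return {
        "name": name,
        "version": public_version,
        # Assumption: "118" means CUDA 11.8 (last digit is the minor version).
        "cuda": f"{cuda_digits[:-1]}.{cuda_digits[-1]}",
        "torch": torch_version,
        "cxx11_abi": cxx11_abi,
        "python_tag": python_tag,
        "abi_tag": abi_tag,
        "platform": platform_tag,
    }


def interpreter_tag() -> str:
    """CPython tag of the running interpreter, e.g. 'cp310' for Python 3.10."""
    return f"cp{sys.version_info.major}{sys.version_info.minor}"


for wheel in WHEELS:
    info = parse_wheel(wheel)
    verdict = "matches" if info["python_tag"] == interpreter_tag() else "does NOT match"
    print(
        f"{info['name']} {info['version']}: CUDA {info['cuda']}, "
        f"torch {info['torch']}, {info['python_tag']} ({verdict} this interpreter)"
    )
```

Run inside the `mamba_retriever` environment created above, both wheels should report a matching `cp310` tag. A mismatch suggests the environment was created with a different Python version than the wheels expect.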

	## Evaluation

	All evaluation code and details are available in the [Single-Pass Scanner GitHub repository](https://github.com/MambaRetriever/MambaRetriever).