DistillAgent-PaperQA-3B is a compact agentic QA model distilled from tool-using trajectories for question answering over scientific papers (QASPER).
It is fine-tuned from Qwen/Qwen2.5-3B-Instruct using LoRA/rsLoRA with constrained Thought/Action/Observation/Final Answer trajectories.
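The constrained trajectory format above can be parsed back into structured steps. The sketch below assumes each field starts a line with a `Label:` prefix; the exact delimiters are an assumption, since the card only names the four fields:

```python
import re

# Hypothetical trajectory in the Thought/Action/Observation/Final Answer format.
trajectory = """Thought: I need the baseline from the Methods section.
Action: read_section(2)
Observation: The baseline is BERT-QA.
Final Answer: BERT-QA"""

# Split on the four field labels; line-initial "Label:" prefixes are assumed,
# not a documented spec.
pattern = r"^(Thought|Action|Observation|Final Answer):\s*(.*)$"
steps = re.findall(pattern, trajectory, flags=re.MULTILINE)
for label, content in steps:
    print(f"{label!r} -> {content!r}")
```

Each `(label, content)` pair can then be routed, e.g. `Action` lines to a tool dispatcher and `Final Answer` to the caller.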
| Model | EM | Mean F1 | Mean hops | Mean latency |
|---|---|---|---|---|
| DistillAgent-PaperQA-3B (SFT) | 14.5% | 0.2425 | 2.36 | 37.28s |
| Base Qwen2.5-3B-Instruct | 9.0% | 0.1650 | 3.00 | 20.04s |
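EM and token-level F1 in the table follow the usual span-QA conventions. A minimal sketch of both metrics (QASPER's official scorer also strips punctuation and articles during normalization, which this version omits):

```python
from collections import Counter

def normalize(s: str) -> list[str]:
    # Lowercase and whitespace-tokenize; a simplification of the full
    # SQuAD/QASPER-style normalization.
    return s.lower().split()

def exact_match(pred: str, gold: str) -> float:
    return float(normalize(pred) == normalize(gold))

def token_f1(pred: str, gold: str) -> float:
    p, g = normalize(pred), normalize(gold)
    overlap = sum((Counter(p) & Counter(g)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(p)
    recall = overlap / len(g)
    return 2 * precision * recall / (precision + recall)

print(exact_match("BERT-QA", "bert-qa"))                          # 1.0
print(token_f1("the baseline is BERT-QA", "bert-qa baseline"))    # ~0.667
```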
Usage:
```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

repo_id = "QuantumCuddle/DistillAgent-PaperQA-3B"

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(
    repo_id,
    torch_dtype=torch.float16,
    device_map="auto",
)

prompt = "QUESTION: What baseline method is used?\nAVAILABLE PAPER SECTIONS:\n1. Abstract\n2. Methods\n..."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Greedy decoding; `temperature` has no effect unless `do_sample=True`.
out = model.generate(**inputs, max_new_tokens=256, do_sample=False)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```
```bibtex
@misc{distillagent_paperqa_3b_2026,
  title={DistillAgent-PaperQA-3B},
  author={QuantumCuddle},
  year={2026},
  howpublished={\url{https://huggingface.co/QuantumCuddle/DistillAgent-PaperQA-3B}}
}
```