---
language:
- en
license: apache-2.0
base_model:
- Qwen/Qwen2.5-3B-Instruct
pipeline_tag: text-generation
tags:
- distillation
- agentic-rag
- qasper
- scientific-qa
- react
- lora
datasets:
- allenai/qasper
---
# DistillAgent-PaperQA-3B

DistillAgent-PaperQA-3B is a compact agentic QA model distilled from tool-using trajectories for question answering over scientific papers (QASPER). It is fine-tuned from Qwen/Qwen2.5-3B-Instruct using LoRA/rsLoRA on constrained Thought/Action/Observation/Final Answer trajectories.
## Highlights

- Small model with practical agentic behavior on research-paper QA.
- Outperforms the base model in our 200-sample QASPER evaluation.
## Model Details

- Base model: Qwen/Qwen2.5-3B-Instruct
- Training: LoRA / rsLoRA SFT
- Domain: scientific paper QA (QASPER)
- Inference style: constrained ReAct + section lookup
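A constrained trajectory alternates Thought/Action/Observation steps and ends with a Final Answer. The sketch below is illustrative only: the `lookup_section(...)` action syntax and the paper content are assumptions, not the exact schema used in training.

```text
QUESTION: What baseline method is used?
AVAILABLE PAPER SECTIONS:
1. Abstract
2. Methods

Thought: The baseline is most likely described in the Methods section.
Action: lookup_section(2)
Observation: ...we compare against a BiLSTM-CRF baseline...
Thought: The observation names the baseline directly.
Final Answer: A BiLSTM-CRF baseline.
```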
## Evaluation Summary (QASPER, 200 samples)
| Model | EM | Mean F1 | Mean hops | Mean latency |
|---|---|---|---|---|
| DistillAgent-PaperQA-3B (SFT) | 14.5% | 0.2425 | 2.36 | 37.28s |
| Base Qwen2.5-3B-Instruct | 9.0% | 0.1650 | 3.00 | 20.04s |
Notes:
- Hops and latency depend on the runtime harness and hardware.
- Main quality outcome: SFT > base on both EM and F1.
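EM and F1 above follow the standard SQuAD-style token-overlap metrics commonly used for QASPER. A minimal sketch (the exact answer normalization in our harness may differ, e.g. punctuation and article stripping):

```python
from collections import Counter


def token_f1(prediction: str, reference: str) -> float:
    """SQuAD-style token-level F1 between a prediction and one gold answer."""
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    # Multiset intersection counts each shared token at most min(count) times.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


def exact_match(prediction: str, reference: str) -> bool:
    """Case-insensitive exact match after trimming whitespace."""
    return prediction.strip().lower() == reference.strip().lower()
```

For multi-reference QASPER questions, the per-question score is usually the maximum F1 over all gold answers.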
## Intended Use
- QA over scientific/technical papers with section-level lookup or retrieval.
- Research and educational workflows for compact agentic model distillation.
## Limitations
- Sensitive to runtime prompt/harness format.
- Multi-hop behavior can increase latency.
- Should not be used as sole source for high-stakes scientific or medical decisions.
## Usage (Transformers)

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

repo_id = "QuantumCuddle/DistillAgent-PaperQA-3B"

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(
    repo_id,
    torch_dtype=torch.float16,
    device_map="auto",
)

prompt = "QUESTION: What baseline method is used?\nAVAILABLE PAPER SECTIONS:\n1. Abstract\n2. Methods\n..."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
# Greedy decoding; use do_sample=False rather than temperature=0.0,
# which transformers rejects/ignores when sampling is enabled.
out = model.generate(**inputs, max_new_tokens=256, do_sample=False)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```
## Citation

```bibtex
@misc{distillagent_paperqa_3b_2026,
  title={DistillAgent-PaperQA-3B},
  author={QuantumCuddle},
  year={2026},
  howpublished={\url{https://huggingface.co/QuantumCuddle/DistillAgent-PaperQA-3B}}
}
```