---
language:
- en
license: apache-2.0
base_model:
- Qwen/Qwen2.5-3B-Instruct
pipeline_tag: text-generation
tags:
- distillation
- agentic-rag
- qasper
- scientific-qa
- react
- lora
datasets:
- allenai/qasper
---

# DistillAgent-PaperQA-3B

DistillAgent-PaperQA-3B is a compact agentic QA model distilled from tool-using trajectories for question answering over scientific papers (QASPER). It is fine-tuned from `Qwen/Qwen2.5-3B-Instruct` with LoRA/rsLoRA on constrained Thought/Action/Observation/Final Answer trajectories.

## Highlights

- Small model with practical agentic behavior on research-paper QA.
- Outperforms the base model on our 200-sample QASPER evaluation.

## Model Details

- Base model: `Qwen/Qwen2.5-3B-Instruct`
- Training: LoRA / rsLoRA SFT
- Domain: scientific paper QA (QASPER)
- Inference style: constrained ReAct + section lookup

## Evaluation Summary (QASPER, 200 samples)

| Model | EM | Mean F1 | Mean hops | Mean latency |
|---|---:|---:|---:|---:|
| DistillAgent-PaperQA-3B (SFT) | 14.5% | 0.2425 | 2.36 | 37.28 s |
| Base Qwen2.5-3B-Instruct | 9.0% | 0.1650 | 3.00 | 20.04 s |

Notes:

- Hops and latency depend on the runtime harness and hardware.
- Main quality outcome: SFT > base on both EM and F1.

## Intended Use

- QA over scientific/technical papers with section-level lookup or retrieval.
- Research and educational workflows for compact agentic model distillation.

## Limitations

- Sensitive to the runtime prompt/harness format.
- Multi-hop behavior can increase latency.
- Should not be used as the sole source for high-stakes scientific or medical decisions.

## Usage (Transformers)

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

repo_id = "QuantumCuddle/DistillAgent-PaperQA-3B"

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(
    repo_id,
    torch_dtype=torch.float16,
    device_map="auto",
)

prompt = (
    "QUESTION: What baseline method is used?\n"
    "AVAILABLE PAPER SECTIONS:\n"
    "1. Abstract\n"
    "2. Methods\n"
    "..."
)
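# The model is trained to emit constrained ReAct-style trajectories
# (Thought / Action / Observation / Final Answer). A minimal sketch for
# recovering the answer span from decoded text; the helper name is
# illustrative and not part of this repo:
def extract_final_answer(text: str) -> str:
    marker = "Final Answer:"
    return text.split(marker, 1)[1].strip() if marker in text else text.strip()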
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
# Greedy decoding: use do_sample=False rather than temperature=0.0,
# which recent transformers versions reject.
out = model.generate(**inputs, max_new_tokens=256, do_sample=False)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```

## Citation

```bibtex
@misc{distillagent_paperqa_3b_2026,
  title={DistillAgent-PaperQA-3B},
  author={QuantumCuddle},
  year={2026},
  howpublished={\url{https://huggingface.co/QuantumCuddle/DistillAgent-PaperQA-3B}}
}
```
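## Metric Sketch

The EM and Mean F1 columns in the evaluation table follow the usual QASPER-style answer matching. As a rough sketch, token-level F1 between a predicted and a gold answer can be computed as below; the normalization here (lowercase, whitespace split) is a simplification, not the exact evaluation script:

```python
from collections import Counter

def token_f1(prediction: str, reference: str) -> float:
    """Token-level F1 between a predicted and a gold answer string."""
    pred = prediction.lower().split()
    ref = reference.lower().split()
    if not pred or not ref:
        # Both empty counts as a match; one empty counts as a miss.
        return float(pred == ref)
    overlap = sum((Counter(pred) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision, recall = overlap / len(pred), overlap / len(ref)
    return 2 * precision * recall / (precision + recall)
```

Per-example scores are averaged over the 200 evaluation samples to give Mean F1.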