---
license: apache-2.0
language:
- en
base_model:
- Qwen/Qwen3-14B
---
# Paper-Summarizer-Qwen3-14B
A fine-tuned Qwen3-14B model specialized for generating structured summaries of scientific research papers in standardized JSON format.
## Model Description
This model is part of [Project OSSAS](https://github.com/context-labs/laion-data-explorer), developed in collaboration with LAION and Wynd Labs to democratize access to scientific knowledge by creating structured summaries of research papers at scale.
- **Base Model**: Qwen3-14B
- **Training Data**: 110,000 curated research papers
- **Performance**: Achieves 73.9% accuracy on QA evaluation, comparable to GPT-5 (74.6%)
- **Cost Efficiency**: 98% lower cost than closed-source alternatives
The model generates comprehensive structured summaries in JSON format. Each paper is first classified as SCIENTIFIC_TEXT, PARTIAL_SCIENTIFIC_TEXT, or NON_SCIENTIFIC_TEXT; key research elements such as methodology, results, claims, and limitations are then extracted into dedicated fields.
The model supports papers up to 131K tokens.
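Inputs beyond the 131K-token window may be truncated, so it can help to estimate a paper's length before submitting it. A minimal sketch using a rough 4-characters-per-token heuristic (the exact count depends on the Qwen3 tokenizer, so the constants and helper below are illustrative assumptions, not part of the model's API):

```python
# Rough pre-flight length check before sending a paper to the summarizer.
# Assumes ~4 characters per token, which is only a heuristic; for exact
# counts, load the Qwen3 tokenizer from the `transformers` library.
MAX_MODEL_TOKENS = 131_072
PROMPT_BUDGET = 8_192  # reserve room for the system prompt and the JSON output


def fits_in_context(paper_text: str, chars_per_token: float = 4.0) -> bool:
    """Return True if the paper likely fits within the model's context window."""
    estimated_tokens = len(paper_text) / chars_per_token
    return estimated_tokens <= MAX_MODEL_TOKENS - PROMPT_BUDGET


# An average paper in the training set is ~81K characters (~20K tokens)
print(fits_in_context("x" * 81_334))  # True
```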
## Usage
### Serving the Model
```bash
vllm serve inference-net/Paper-Summarizer-Qwen3-14B \
  --port 8000 \
  --host 0.0.0.0 \
  --trust-remote-code \
  --data-parallel-size 1 \
  --tensor-parallel-size 1 \
  --max-num-seqs 32 \
  --max-model-len 131072 \
  --max-num-batched-tokens 8192 \
  --gpu-memory-utilization 0.90 \
  --enable-prefix-caching \
  --enable-chunked-prefill
```
### Making Requests
```python
import requests

# System prompt (required)
system_prompt = """[Insert the full system prompt from the prompt.txt file -
see the full prompt in the model repository]"""

# User prompt: the paper text to summarize
paper_text = """
Title: Your Paper Title
Authors: Author 1, Author 2
Abstract: ...
[Full paper content]
"""

# API request against the vLLM OpenAI-compatible endpoint
response = requests.post(
    "http://localhost:8000/v1/chat/completions",
    json={
        "model": "inference-net/Paper-Summarizer-Qwen3-14B",
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": paper_text},
        ],
        "temperature": 0.2,
    },
    timeout=600,
)

result = response.json()
summary = result["choices"][0]["message"]["content"]
print(summary)
```
### System Prompt
The model requires a specific system prompt that defines the JSON schema and extraction instructions. The prompt instructs the model to:
1. **Classify** the text as SCIENTIFIC_TEXT, PARTIAL_SCIENTIFIC_TEXT, or NON_SCIENTIFIC_TEXT
2. **Extract** structured information including:
- Title, authors, publication year
- Research context and hypotheses
- Methodological details
- Key results with quantitative data
- Claims with supporting evidence
- Limitations and ethical considerations
The full system prompt is available in the model repository's `prompt.txt` file.
### Output Format
The model outputs a single valid JSON object with this structure:
```json
{
  "article_classification": "SCIENTIFIC_TEXT",
  "reason": null,
  "summary": {
    "title": "",
    "authors": "",
    "publication_year": null,
    "field_subfield": "",
    "executive_summary": "",
    "research_context": "",
    "methodological_details": "",
    "key_results": "",
    "claims": [...],
    "contradictions_and_limitations": "",
    ...
  }
}
```
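Downstream pipelines may want to sanity-check a parsed summary against this structure before using it. A minimal sketch (the field names are taken from the example above, and the handling of `reason` for non-scientific texts is an assumption; the full schema in `prompt.txt` has additional fields):

```python
ALLOWED_CLASSIFICATIONS = {
    "SCIENTIFIC_TEXT",
    "PARTIAL_SCIENTIFIC_TEXT",
    "NON_SCIENTIFIC_TEXT",
}


def is_valid_summary(obj: dict) -> bool:
    """Spot-check the top-level structure of a parsed summarizer output."""
    if obj.get("article_classification") not in ALLOWED_CLASSIFICATIONS:
        return False
    summary = obj.get("summary")
    if summary is None:
        # Assumption: a missing summary body should be accompanied by a reason
        return obj.get("reason") is not None
    # Check a few required fields from the schema
    return all(key in summary for key in ("title", "key_results", "claims"))


print(is_valid_summary({
    "article_classification": "SCIENTIFIC_TEXT",
    "reason": None,
    "summary": {"title": "", "key_results": "", "claims": []},
}))  # True
```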
## Performance
### LLM-as-a-Judge Evaluation
- **Score**: 4.207/5.0
- **Comparison**: Within 15% of GPT-5 (4.805/5.0)
### QA Dataset Evaluation
- **Accuracy**: 73.9%
- **Comparison**: Ties with Gemini 2.5 Flash, nearly matches GPT-5 (74.6%)
### Throughput
- **Requests/sec**: 0.43
- **Input Tokens/sec**: 7,516.54
- **Output Tokens/sec**: 2,588.30
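Dividing the token rates by the request rate gives a rough per-request profile, which can be useful for capacity planning. These are back-of-the-envelope figures derived from the numbers above, not additional benchmark results:

```python
# Derive average per-request token counts from the reported throughput
requests_per_sec = 0.43
input_tokens_per_sec = 7516.54
output_tokens_per_sec = 2588.30

avg_input_tokens = input_tokens_per_sec / requests_per_sec    # ~17,480 tokens in
avg_output_tokens = output_tokens_per_sec / requests_per_sec  # ~6,019 tokens out
secs_per_request = 1 / requests_per_sec                       # ~2.33 s per completion at steady state

print(round(avg_input_tokens), round(avg_output_tokens))  # 17480 6019
```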
## Training Details
- **Training Set**: 100,000 papers
- **Validation Set**: 10,000 papers
- **Average Paper Length**: 81,334 characters
- **Training Approach**: Post-training on summaries generated by frontier models (GPT-5, Claude 4.5 Sonnet, Gemini 2.5 Pro)
## Limitations
- May generate subtle factual errors (hallucinations) for fine-grained details
- Context limit (131K tokens) may truncate extremely long documents
- Unified schema may not capture all domain-specific nuances
- Summaries are research aids, not replacements for primary sources in high-stakes scenarios
## Related Resources
- **Paper Visualization Website**: https://laion.inference.net
- **Visualization Repository**: https://github.com/context-labs/laion-data-explorer
- **Alexandria Paper**: https://arxiv.org/abs/2502.19413
- **Nemotron Variant**: inference-net/Paper-Summarizer-Nemotron-12B
## License
Apache 2.0, as declared in the model card metadata above.
## Acknowledgments
This work was made possible through collaboration with:
- LAION
- Wynd Labs
- Inference.net
- Contributors to bethgelab, PeS2o, Common Pile, and OpenAlex