amarinference committed
Commit 5b4766f · verified · 1 Parent(s): 40eaa0e

Update README.md

Files changed (1):
1. README.md +158 -1
README.md CHANGED
@@ -4,4 +4,161 @@ language:
- en
base_model:
- Qwen/Qwen3-14B
---

# Paper-Summarizer-Qwen3-14B

A fine-tuned Qwen3-14B model specialized for generating structured summaries of scientific research papers in a standardized JSON format.

## Model Description

This model is part of [Project AELLA](https://github.com/context-labs/laion-data-explorer), developed in collaboration with LAION and Wynd Labs to democratize access to scientific knowledge by creating structured summaries of research papers at scale.

- **Base Model**: Qwen3-14B
- **Training Data**: 110,000 curated research papers
- **Performance**: 73.9% accuracy on QA evaluation, comparable to GPT-5 (74.6%)
- **Cost Efficiency**: 98% lower cost than closed-source alternatives

The model generates comprehensive structured summaries in JSON format: each paper is classified as SCIENTIFIC_TEXT, PARTIAL_SCIENTIFIC_TEXT, or NON_SCIENTIFIC_TEXT, and key research elements such as methodology, results, claims, and limitations are extracted into dedicated fields.

The model supports papers of up to 131K tokens.
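
Inputs longer than the context window will be truncated, so it can be worth checking a paper's length before submission. A minimal sketch, assuming the repository ships a standard `transformers` tokenizer (the 8,192-token output reserve below is an illustrative choice, not a repository setting):

```python
from transformers import AutoTokenizer

# Tokenizer bundled with the model repository
tokenizer = AutoTokenizer.from_pretrained("inference-net/Paper-Summarizer-Qwen3-14B")

MAX_CONTEXT = 131072  # model context window, in tokens

def fits_in_context(paper_text: str, reserve_for_output: int = 8192) -> bool:
    """Check whether a paper, plus room for the generated summary, fits."""
    n_tokens = len(tokenizer.encode(paper_text))
    return n_tokens + reserve_for_output <= MAX_CONTEXT
```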

## Usage

### Serving the Model
```bash
vllm serve inference-net/Paper-Summarizer-Qwen3-14B \
  --port 8000 \
  --host 0.0.0.0 \
  --trust-remote-code \
  --data-parallel-size 1 \
  --tensor-parallel-size 1 \
  --max-num-seqs 32 \
  --max-model-len 131072 \
  --max-num-batched-tokens 8192 \
  --gpu-memory-utilization 0.90 \
  --enable-prefix-caching \
  --enable-chunked-prefill
```
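
Once the server is up, you can verify it before sending work; a quick sketch against the OpenAI-compatible `/v1/models` endpoint that vLLM exposes:

```python
import requests

# List the models served by the local vLLM instance
resp = requests.get("http://localhost:8000/v1/models", timeout=10)
resp.raise_for_status()
print([m["id"] for m in resp.json()["data"]])
# Expected: ['inference-net/Paper-Summarizer-Qwen3-14B']
```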

### Making Requests
```python
import requests

# System prompt (required)
system_prompt = """[Insert the full system prompt from the prompt.txt file -
see the full prompt in the model repository]"""

# User prompt: the paper text to summarize
paper_text = """
Title: Your Paper Title
Authors: Author 1, Author 2
Abstract: ...
[Full paper content]
"""

# API request
response = requests.post(
    "http://localhost:8000/v1/chat/completions",
    json={
        "model": "inference-net/Paper-Summarizer-Qwen3-14B",
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": paper_text},
        ],
        "temperature": 0.2,
    },
    timeout=600,
)

result = response.json()
summary = result["choices"][0]["message"]["content"]
print(summary)
```
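
To process a corpus rather than a single paper, the same request can be issued concurrently. A minimal sketch using a thread pool, assuming the serve command above and a locally saved `prompt.txt` (the `summarize` helper is illustrative, not part of the repository):

```python
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path

import requests

# Assumes the full system prompt has been saved locally
# (see the "System Prompt" section below for where to get it)
system_prompt = Path("prompt.txt").read_text(encoding="utf-8")

def summarize(paper_text: str) -> str:
    """Send one paper to the local server and return the raw JSON summary."""
    response = requests.post(
        "http://localhost:8000/v1/chat/completions",
        json={
            "model": "inference-net/Paper-Summarizer-Qwen3-14B",
            "messages": [
                {"role": "system", "content": system_prompt},
                {"role": "user", "content": paper_text},
            ],
            "temperature": 0.2,
        },
        timeout=600,
    )
    response.raise_for_status()
    return response.json()["choices"][0]["message"]["content"]

papers = ["...", "..."]  # list of full paper texts
# The serve command above allows up to 32 concurrent sequences (--max-num-seqs 32)
with ThreadPoolExecutor(max_workers=8) as pool:
    summaries = list(pool.map(summarize, papers))
```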

### System Prompt

The model requires a specific system prompt that defines the JSON schema and extraction instructions. The prompt instructs the model to:

1. **Classify** the text as SCIENTIFIC_TEXT, PARTIAL_SCIENTIFIC_TEXT, or NON_SCIENTIFIC_TEXT
2. **Extract** structured information including:
   - Title, authors, publication year
   - Research context and hypotheses
   - Methodological details
   - Key results with quantitative data
   - Claims with supporting evidence
   - Limitations and ethical considerations

The full system prompt is available in the model repository's `prompt.txt` file.
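
A small sketch for fetching it programmatically with `huggingface_hub`, assuming `prompt.txt` sits at the repository root:

```python
from huggingface_hub import hf_hub_download

# Download prompt.txt from the model repository (cached locally after the first call)
prompt_path = hf_hub_download(
    repo_id="inference-net/Paper-Summarizer-Qwen3-14B",
    filename="prompt.txt",
)
with open(prompt_path, encoding="utf-8") as f:
    system_prompt = f.read()
```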

### Output Format

The model outputs a single valid JSON object with this structure:
```json
{
  "article_classification": "SCIENTIFIC_TEXT",
  "reason": null,
  "summary": {
    "title": "",
    "authors": "",
    "publication_year": null,
    "field_subfield": "",
    "executive_summary": "",
    "research_context": "",
    "methodological_details": "",
    "key_results": "",
    "claims": [...],
    "contradictions_and_limitations": "",
    ...
  }
}
```
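
Because downstream tooling depends on well-formed output, it is worth parsing and sanity-checking each response; a minimal sketch using only the field names shown above:

```python
import json

VALID_CLASSES = {"SCIENTIFIC_TEXT", "PARTIAL_SCIENTIFIC_TEXT", "NON_SCIENTIFIC_TEXT"}

def parse_summary(raw: str) -> dict:
    """Parse model output and sanity-check the top-level schema."""
    data = json.loads(raw)  # raises json.JSONDecodeError on malformed output
    if data["article_classification"] not in VALID_CLASSES:
        raise ValueError(f"unexpected classification: {data['article_classification']}")
    return data
```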

## Performance

### LLM-as-a-Judge Evaluation
- **Score**: 4.207/5.0
- **Comparison**: Within 15% of GPT-5 (4.805/5.0)

### QA Dataset Evaluation
- **Accuracy**: 73.9%
- **Comparison**: Ties with Gemini 2.5 Flash and nearly matches GPT-5 (74.6%)

### Throughput
- **Requests/sec**: 0.43
- **Input Tokens/sec**: 7,516.54
- **Output Tokens/sec**: 2,588.30
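
At this rate, a single serving instance works through roughly 1,500 papers per hour (0.43 requests/sec × 3,600 seconds).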

## Training Details

- **Training Set**: 100,000 papers
- **Validation Set**: 10,000 papers
- **Average Paper Length**: 81,334 characters
- **Training Approach**: Post-training on summaries generated by frontier models (GPT-5, Claude 4.5 Sonnet, Gemini 2.5 Pro)

## Limitations

- May generate subtle factual errors (hallucinations) in fine-grained details
- The 131K-token context limit may truncate extremely long documents
- The unified schema may not capture all domain-specific nuances
- Summaries are research aids, not replacements for primary sources in high-stakes scenarios

## Related Resources

- **Paper Visualization Website**: https://laion.inference.net
- **Visualization Repository**: https://github.com/context-labs/laion-data-explorer
- **Alexandria Paper**: https://arxiv.org/abs/2502.19413
- **Nemotron Variant**: inference-net/Paper-Summarizer-Nemotron-12B

## License

[License information to be added]

## Acknowledgments

This work was made possible through collaboration with:

- LAION
- Wynd Labs
- Inference.net
- Contributors to bethgelab, PeS2o, Common Pile, and OpenAlex