amarinference committed
Commit 5b4766f · verified · 1 Parent(s): 40eaa0e

Update README.md

Files changed (1):
1. README.md +158 -1
README.md CHANGED
@@ -4,4 +4,161 @@ language:
- en
base_model:
- Qwen/Qwen3-14B
---

# Paper-Summarizer-Qwen3-14B

A fine-tuned Qwen3-14B model specialized for generating structured summaries of scientific research papers in a standardized JSON format.

## Model Description

This model is part of [Project AELLA](https://github.com/context-labs/laion-data-explorer), developed in collaboration with LAION and Wynd Labs to democratize access to scientific knowledge by creating structured summaries of research papers at scale.

- **Base Model**: Qwen3-14B
- **Training Data**: 110,000 curated research papers
- **Performance**: 73.9% accuracy on QA evaluation, comparable to GPT-5 (74.6%)
- **Cost Efficiency**: 98% lower cost than closed-source alternatives

The model generates comprehensive structured summaries in JSON format: each paper is classified as SCIENTIFIC_TEXT, PARTIAL_SCIENTIFIC_TEXT, or NON_SCIENTIFIC_TEXT, and key research elements such as methodology, results, claims, and limitations are extracted into dedicated fields.

The model supports papers of up to 131K tokens.
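
Inputs longer than the context window will be truncated, so it can be worth checking a paper's length before submission. A minimal sketch, assuming the repository ships a standard `transformers` tokenizer (the 8,192-token output reserve below is an illustrative choice, not a repository setting):

```python
from transformers import AutoTokenizer

# Tokenizer bundled with the model repository
tokenizer = AutoTokenizer.from_pretrained("inference-net/Paper-Summarizer-Qwen3-14B")

MAX_CONTEXT = 131072  # model context window, in tokens

def fits_in_context(paper_text: str, reserve_for_output: int = 8192) -> bool:
    """Check whether a paper, plus room for the generated summary, fits."""
    n_tokens = len(tokenizer.encode(paper_text))
    return n_tokens + reserve_for_output <= MAX_CONTEXT
```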

## Usage

### Serving the Model
```bash
vllm serve inference-net/Paper-Summarizer-Qwen3-14B \
  --port 8000 \
  --host 0.0.0.0 \
  --trust-remote-code \
  --data-parallel-size 1 \
  --tensor-parallel-size 1 \
  --max-num-seqs 32 \
  --max-model-len 131072 \
  --max-num-batched-tokens 8192 \
  --gpu-memory-utilization 0.90 \
  --enable-prefix-caching \
  --enable-chunked-prefill
```
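
Once the server is up, you can verify it before sending work; a quick sketch against the OpenAI-compatible `/v1/models` endpoint that vLLM exposes:

```python
import requests

# List the models served by the local vLLM instance
resp = requests.get("http://localhost:8000/v1/models", timeout=10)
resp.raise_for_status()
print([m["id"] for m in resp.json()["data"]])
# Expected: ['inference-net/Paper-Summarizer-Qwen3-14B']
```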

### Making Requests
```python
import requests

# System prompt (required)
system_prompt = """[Insert the full system prompt from the prompt.txt file -
see the full prompt in the model repository]"""

# User prompt: the paper text to summarize
paper_text = """
Title: Your Paper Title
Authors: Author 1, Author 2
Abstract: ...
[Full paper content]
"""

# API request
response = requests.post(
    "http://localhost:8000/v1/chat/completions",
    json={
        "model": "inference-net/Paper-Summarizer-Qwen3-14B",
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": paper_text},
        ],
        "temperature": 0.2,
    },
    timeout=600,
)

result = response.json()
summary = result["choices"][0]["message"]["content"]
print(summary)
```
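
To process a corpus rather than a single paper, the same request can be issued concurrently. A minimal sketch using a thread pool, assuming the serve command above and a locally saved `prompt.txt` (the `summarize` helper is illustrative, not part of the repository):

```python
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path

import requests

# Assumes the full system prompt has been saved locally
# (see the "System Prompt" section below for where to get it)
system_prompt = Path("prompt.txt").read_text(encoding="utf-8")

def summarize(paper_text: str) -> str:
    """Send one paper to the local server and return the raw JSON summary."""
    response = requests.post(
        "http://localhost:8000/v1/chat/completions",
        json={
            "model": "inference-net/Paper-Summarizer-Qwen3-14B",
            "messages": [
                {"role": "system", "content": system_prompt},
                {"role": "user", "content": paper_text},
            ],
            "temperature": 0.2,
        },
        timeout=600,
    )
    response.raise_for_status()
    return response.json()["choices"][0]["message"]["content"]

papers = ["...", "..."]  # list of full paper texts
# The serve command above allows up to 32 concurrent sequences (--max-num-seqs 32)
with ThreadPoolExecutor(max_workers=8) as pool:
    summaries = list(pool.map(summarize, papers))
```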

### System Prompt

The model requires a specific system prompt that defines the JSON schema and extraction instructions. The prompt instructs the model to:

1. **Classify** the text as SCIENTIFIC_TEXT, PARTIAL_SCIENTIFIC_TEXT, or NON_SCIENTIFIC_TEXT
2. **Extract** structured information including:
   - Title, authors, publication year
   - Research context and hypotheses
   - Methodological details
   - Key results with quantitative data
   - Claims with supporting evidence
   - Limitations and ethical considerations

The full system prompt is available in the model repository's `prompt.txt` file.
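
A small sketch for fetching it programmatically with `huggingface_hub`, assuming `prompt.txt` sits at the repository root:

```python
from huggingface_hub import hf_hub_download

# Download prompt.txt from the model repository (cached locally after the first call)
prompt_path = hf_hub_download(
    repo_id="inference-net/Paper-Summarizer-Qwen3-14B",
    filename="prompt.txt",
)
with open(prompt_path, encoding="utf-8") as f:
    system_prompt = f.read()
```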

### Output Format

The model outputs a single valid JSON object with this structure:
```json
{
  "article_classification": "SCIENTIFIC_TEXT",
  "reason": null,
  "summary": {
    "title": "",
    "authors": "",
    "publication_year": null,
    "field_subfield": "",
    "executive_summary": "",
    "research_context": "",
    "methodological_details": "",
    "key_results": "",
    "claims": [...],
    "contradictions_and_limitations": "",
    ...
  }
}
```
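
Because downstream tooling depends on well-formed output, it is worth parsing and sanity-checking each response; a minimal sketch using only the field names shown above:

```python
import json

VALID_CLASSES = {"SCIENTIFIC_TEXT", "PARTIAL_SCIENTIFIC_TEXT", "NON_SCIENTIFIC_TEXT"}

def parse_summary(raw: str) -> dict:
    """Parse model output and sanity-check the top-level schema."""
    data = json.loads(raw)  # raises json.JSONDecodeError on malformed output
    if data["article_classification"] not in VALID_CLASSES:
        raise ValueError(f"unexpected classification: {data['article_classification']}")
    return data
```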

## Performance

### LLM-as-a-Judge Evaluation
- **Score**: 4.207/5.0
- **Comparison**: Within 15% of GPT-5 (4.805/5.0)

### QA Dataset Evaluation
- **Accuracy**: 73.9%
- **Comparison**: Ties with Gemini 2.5 Flash and nearly matches GPT-5 (74.6%)

### Throughput
- **Requests/sec**: 0.43
- **Input Tokens/sec**: 7,516.54
- **Output Tokens/sec**: 2,588.30
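
At this rate, a single serving instance works through roughly 1,500 papers per hour (0.43 requests/sec × 3,600 seconds).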

## Training Details

- **Training Set**: 100,000 papers
- **Validation Set**: 10,000 papers
- **Average Paper Length**: 81,334 characters
- **Training Approach**: Post-training on summaries generated by frontier models (GPT-5, Claude 4.5 Sonnet, Gemini 2.5 Pro)

## Limitations

- May generate subtle factual errors (hallucinations) in fine-grained details
- The 131K-token context limit may truncate extremely long documents
- The unified schema may not capture all domain-specific nuances
- Summaries are research aids, not replacements for primary sources in high-stakes scenarios

## Related Resources

- **Paper Visualization Website**: https://laion.inference.net
- **Visualization Repository**: https://github.com/context-labs/laion-data-explorer
- **Alexandria Paper**: https://arxiv.org/abs/2502.19413
- **Nemotron Variant**: inference-net/Paper-Summarizer-Nemotron-12B

## License

[License information to be added]

## Acknowledgments

This work was made possible through collaboration with:

- LAION
- Wynd Labs
- Inference.net
- Contributors to bethgelab, PeS2o, Common Pile, and OpenAlex