---
license: apache-2.0
language:
- en
base_model:
- Qwen/Qwen3-14B
---

# Paper-Summarizer-Qwen3-14B

A fine-tuned Qwen3-14B model specialized for generating structured summaries of scientific research papers in a standardized JSON format.

## Model Description

This model is part of [Project OSSAS](https://github.com/context-labs/laion-data-explorer), developed in collaboration with LAION and Wynd Labs to democratize access to scientific knowledge by creating structured summaries of research papers at scale.

- **Base Model**: Qwen3-14B
- **Training Data**: 110,000 curated research papers
- **Performance**: Achieves 73.9% accuracy on QA evaluation, comparable to GPT-5 (74.6%)
- **Cost Efficiency**: 98% lower cost than closed-source alternatives

The model generates comprehensive structured summaries in JSON format. Each input is first classified as SCIENTIFIC_TEXT, PARTIAL_SCIENTIFIC_TEXT, or NON_SCIENTIFIC_TEXT; the extracted fields capture key research elements such as methodology, results, claims, and limitations. The model supports papers up to 131K tokens.

## Usage

### Serving the Model

```bash
vllm serve inference-net/Paper-Summarizer-Qwen3-14B \
  --port 8000 \
  --host 0.0.0.0 \
  --trust-remote-code \
  --data-parallel-size 1 \
  --tensor-parallel-size 1 \
  --max-num-seqs 32 \
  --max-model-len 131072 \
  --max-num-batched-tokens 8192 \
  --gpu-memory-utilization 0.90 \
  --enable-prefix-caching \
  --enable-chunked-prefill
```

### Making Requests

```python
import requests

# System prompt (required)
system_prompt = """[Insert the full system prompt from the prompt.txt file - see the full prompt in the model repository]"""

# User prompt: the paper text to summarize
paper_text = """
Title: Your Paper Title
Authors: Author 1, Author 2
Abstract: ...
[Full paper content]
"""

# API request
response = requests.post(
    "http://localhost:8000/v1/chat/completions",
    json={
        "model": "inference-net/Paper-Summarizer-Qwen3-14B",
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": paper_text}
        ],
        "temperature": 0.2
    },
    timeout=600
)

result = response.json()
summary = result["choices"][0]["message"]["content"]
print(summary)
```

### System Prompt

The model requires a specific system prompt that defines the JSON schema and extraction instructions. The prompt instructs the model to:

1. **Classify** the text as SCIENTIFIC_TEXT, PARTIAL_SCIENTIFIC_TEXT, or NON_SCIENTIFIC_TEXT
2. **Extract** structured information including:
   - Title, authors, publication year
   - Research context and hypotheses
   - Methodological details
   - Key results with quantitative data
   - Claims with supporting evidence
   - Limitations and ethical considerations

The full system prompt is available in the model repository's `prompt.txt` file.

### Output Format

The model outputs a single valid JSON object with this structure:

```json
{
  "article_classification": "SCIENTIFIC_TEXT",
  "reason": null,
  "summary": {
    "title": "",
    "authors": "",
    "publication_year": null,
    "field_subfield": "",
    "executive_summary": "",
    "research_context": "",
    "methodological_details": "",
    "key_results": "",
    "claims": [...],
    "contradictions_and_limitations": "",
    ...
  }
}
```

## Performance

### LLM-as-a-Judge Evaluation

- **Score**: 4.207/5.0
- **Comparison**: Within 15% of GPT-5 (4.805/5.0)

### QA Dataset Evaluation

- **Accuracy**: 73.9%
- **Comparison**: Ties with Gemini 2.5 Flash; nearly matches GPT-5 (74.6%)

### Throughput

- **Requests/sec**: 0.43
- **Input Tokens/sec**: 7,516.54
- **Output Tokens/sec**: 2,588.30

## Training Details

- **Training Set**: 100,000 papers
- **Validation Set**: 10,000 papers
- **Average Paper Length**: 81,334 characters
- **Training Approach**: Post-training on summaries generated by frontier models (GPT-5, Claude 4.5 Sonnet, Gemini 2.5 Pro)

## Limitations

- May generate subtle factual errors (hallucinations) for fine-grained details
- The 131K-token context limit may truncate extremely long documents
- The unified schema may not capture all domain-specific nuances
- Summaries are research aids, not replacements for primary sources in high-stakes scenarios

## Related Resources

- **Paper Visualization Website**: https://laion.inference.net
- **Visualization Repository**: https://github.com/context-labs/laion-data-explorer
- **Alexandria Paper**: https://arxiv.org/abs/2502.19413
- **Nemotron Variant**: inference-net/Paper-Summarizer-Nemotron-12B

## License

Apache 2.0, as declared in the model card metadata.

## Acknowledgments

This work was made possible through collaboration with:

- LAION
- Wynd Labs
- Inference.net
- Contributors to bethgelab, PeS2o, Common Pile, and OpenAlex
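
## Example: Parsing the Output

Downstream consumers typically parse the summary string returned by the chat completions API back into a structured object. The sketch below is illustrative, not part of the model repository: the `parse_summary` helper is a hypothetical name, and it assumes the model returns the JSON object described under Output Format, possibly wrapped in a Markdown code fence.

```python
import json

def parse_summary(raw: str) -> dict:
    """Parse the model's JSON output, tolerating an optional Markdown code fence."""
    text = raw.strip()
    if text.startswith("```"):
        # Drop the opening fence line (e.g. ```json) and the closing fence.
        text = text.split("\n", 1)[1]
        text = text.rsplit("```", 1)[0]
    obj = json.loads(text)
    # Sanity-check the classification against the three labels the model emits.
    if obj.get("article_classification") not in (
        "SCIENTIFIC_TEXT", "PARTIAL_SCIENTIFIC_TEXT", "NON_SCIENTIFIC_TEXT"
    ):
        raise ValueError(
            f"Unexpected classification: {obj.get('article_classification')}"
        )
    return obj

# Example with a minimal stand-in response:
raw = '{"article_classification": "SCIENTIFIC_TEXT", "reason": null, "summary": {"title": "Example"}}'
parsed = parse_summary(raw)
print(parsed["summary"]["title"])  # prints: Example
```

Validating the classification label up front makes it easy to route PARTIAL_SCIENTIFIC_TEXT and NON_SCIENTIFIC_TEXT results to separate handling before touching the `summary` fields.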