PaperAudit Qwen3 8B (SFT + RL)
Model Overview
PaperAudit_Qwen3_8B_sft_rl is a medium-scale model specifically trained for academic paper error detection and automated review tasks. This model is based on Qwen3 8B and has been optimized through Supervised Fine-Tuning (SFT) and Reinforcement Learning from Human Feedback (RLHF).
Model Information
- Base Model: Qwen3 8B
- Model Parameters: ~8 billion parameters
- Training Method: Supervised Fine-Tuning (SFT) + Reinforcement Learning (RLHF)
- Model Architecture: Qwen3ForCausalLM
- Context Length: 40,960 tokens
- Data Type: bfloat16
Model Features
- Balanced Performance: 8B parameter scale, achieving a good balance between performance and efficiency
- Specialized Optimization: Specifically optimized for academic paper error detection and review tasks
- Reinforcement Learning: Aligned with human preferences through RLHF to improve review quality and error detection accuracy
- Long Context Support: Supports 40K tokens context length, suitable for processing complete academic papers
Training Data
This model is trained on PaperAudit_Dataset. The dataset includes:
- Academic papers downloaded from OpenReview
- Structured paper content (processed via LlamaParse and LLM)
- Synthetic error data for training error detection models
- Human review feedback data
For more details about the dataset, please visit: https://huggingface.co/datasets/mayiwen/PaperAudit_Dataset
Usage
Install Dependencies
pip install transformers torch accelerate
Load Model
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch
model_path = "./qwen3_8b_sft_rl"
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(
model_path,
torch_dtype=torch.bfloat16,
device_map="auto"
)
Inference Example
# Prepare input (paper error detection task)
prompt = """Please detect errors in the following academic paper paragraph:
[Paper content...]
Please identify errors and provide correction suggestions."""
# Encode input
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
# Generate response
with torch.no_grad():
outputs = model.generate(
**inputs,
max_new_tokens=512,
temperature=0.7,
do_sample=True,
pad_token_id=tokenizer.pad_token_id
)
# Decode output
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
Application Scenarios
- Academic paper error detection
- Automated paper review
- Academic writing quality assessment
- Paper content analysis and feedback generation
- Academic review assistant tools
Model Architecture Details
- Hidden Size: 4096
- Intermediate Size: 12288
- Number of Attention Heads: 32
- Number of Key-Value Heads: 8 (Grouped Query Attention)
- Number of Hidden Layers: 36
- Vocabulary Size: 151,936
Performance Advantages
Compared to the 3B model, the 8B model performs better in handling complex paper analysis tasks, with the ability to:
- More accurately identify subtle academic errors
- Provide more detailed and professional review comments
- Better understand academic writing norms and standards
Notes
- This model is specifically optimized for academic paper review tasks and may require further fine-tuning for other domains
- It is recommended to use bfloat16 precision to save memory and improve inference speed
- For long document processing, appropriate context window management strategies are recommended
- Requires at least 16GB GPU memory for inference
Related Resources
- Training Dataset: PaperAudit_Dataset
- PaperAudit Project: For more details, please refer to the PaperAudit project documentation
License
Please refer to the license terms of the base model Qwen3.
- Downloads last month
- 11
Inference Providers
NEW
This model isn't deployed by any Inference Provider.
🙋
Ask for provider support