PaperAudit Qwen3 8B (SFT + RL)

Model Overview

PaperAudit_Qwen3_8B_sft_rl is a medium-scale model trained specifically for academic paper error detection and automated review tasks. It is based on Qwen3 8B and was optimized through Supervised Fine-Tuning (SFT) followed by Reinforcement Learning from Human Feedback (RLHF).

Model Information

  • Base Model: Qwen3 8B
  • Model Parameters: ~8 billion parameters
  • Training Method: Supervised Fine-Tuning (SFT) + Reinforcement Learning from Human Feedback (RLHF)
  • Model Architecture: Qwen3ForCausalLM
  • Context Length: 40,960 tokens
  • Data Type: bfloat16

Model Features

  • Balanced Performance: 8B parameter scale, achieving a good balance between performance and efficiency
  • Specialized Optimization: Specifically optimized for academic paper error detection and review tasks
  • Reinforcement Learning: Aligned with human preferences through RLHF to improve review quality and error detection accuracy
  • Long Context Support: Supports 40K tokens context length, suitable for processing complete academic papers

Training Data

This model is trained on the PaperAudit_Dataset, which includes:

  • Academic papers downloaded from OpenReview
  • Structured paper content (processed via LlamaParse and LLM)
  • Synthetic error data for training error detection models
  • Human review feedback data

For more details about the dataset, please visit: https://huggingface.co/datasets/mayiwen/PaperAudit_Dataset

Usage

Install Dependencies

pip install transformers torch accelerate

Load Model

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_path = "mayiwen/PaperAudit_Qwen3_8B_sft_rl"
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(
    model_path,
    torch_dtype=torch.bfloat16,
    device_map="auto"
)

Inference Example

# Prepare input (paper error detection task)
prompt = """Please detect errors in the following academic paper paragraph:

[Paper content...]

Please identify errors and provide correction suggestions."""

# Qwen3 is a chat model: format the request with its chat template
messages = [{"role": "user", "content": prompt}]
text = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)

# Encode input
inputs = tokenizer(text, return_tensors="pt").to(model.device)

# Generate response
with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=512,
        temperature=0.7,
        do_sample=True,
        pad_token_id=tokenizer.eos_token_id  # Qwen tokenizers may lack a pad token
    )

# Decode only the newly generated tokens, skipping the echoed prompt
response = tokenizer.decode(
    outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
)
print(response)
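The decoded response is free text; for downstream tooling it usually helps to parse it into structured findings. The numbered "Error:" / "Suggestion:" format below is an illustrative assumption, not the model's documented output format — adapt the pattern to whatever style this fine-tune actually produces.

```python
import re

def parse_findings(response: str) -> list[tuple[str, str]]:
    """Extract (error, suggestion) pairs from a numbered review response.

    Assumes a hypothetical format like:
        1. Error: ...
        Suggestion: ...
    """
    pattern = re.compile(
        r"\d+\.\s*Error:\s*(?P<error>.+?)\s*Suggestion:\s*(?P<suggestion>.+?)(?=\n\d+\.|\Z)",
        re.DOTALL,
    )
    return [
        (m["error"].strip(), m["suggestion"].strip())
        for m in pattern.finditer(response)
    ]

sample = """1. Error: The sample size is misreported in Table 2.
Suggestion: Verify n=120 against Section 3.1.
2. Error: Equation (4) drops a factor of 1/2.
Suggestion: Re-derive from Equation (3)."""

findings = parse_findings(sample)
# findings[0] -> ("The sample size is misreported in Table 2.",
#                 "Verify n=120 against Section 3.1.")
```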

Application Scenarios

  • Academic paper error detection
  • Automated paper review
  • Academic writing quality assessment
  • Paper content analysis and feedback generation
  • Academic review assistant tools

Model Architecture Details

  • Hidden Size: 4096
  • Intermediate Size: 12288
  • Number of Attention Heads: 32
  • Number of Key-Value Heads: 8 (Grouped Query Attention)
  • Number of Hidden Layers: 36
  • Vocabulary Size: 151,936
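The numbers above can be sanity-checked with a back-of-the-envelope parameter count. This sketch ignores layer norms and Qwen3's per-head q/k norms, and assumes untied input/output embeddings, so it slightly underestimates the exact total:

```python
# Rough parameter count from the architecture table above
hidden, inter, layers = 4096, 12288, 36
heads, kv_heads, vocab = 32, 8, 151_936
head_dim = hidden // heads  # 128

embed = vocab * hidden                       # token embedding matrix
attn = (hidden * heads * head_dim            # q_proj
        + 2 * hidden * kv_heads * head_dim   # k_proj + v_proj (GQA: 8 KV heads)
        + heads * head_dim * hidden)         # o_proj
mlp = 3 * hidden * inter                     # gate, up, down projections
total = layers * (attn + mlp) + 2 * embed    # + untied lm_head

print(f"~{total / 1e9:.2f}B parameters")         # ~8.19B
print(f"~{total * 2 / 1e9:.1f} GB in bfloat16")  # weights alone, 2 bytes/param
```

The weights alone come to roughly 16 GB in bfloat16, which is where the memory guidance in the Notes section comes from.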

Performance Advantages

Compared to the 3B variant, the 8B model handles complex paper-analysis tasks better. It can:

  • More accurately identify subtle academic errors
  • Provide more detailed and professional review comments
  • Better understand academic writing norms and standards

Notes

  • This model is specifically optimized for academic paper review tasks and may require further fine-tuning for other domains
  • It is recommended to use bfloat16 precision to save memory and improve inference speed
  • For long document processing, appropriate context window management strategies are recommended
  • Inference in bfloat16 requires roughly 17–20 GB of GPU memory (about 16.4 GB for the weights alone, plus activations and KV cache)
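One simple context-window management strategy is to split a paper into paragraph-aligned chunks that fit within a token budget, leaving headroom below the 40K-token context for the prompt template and the generated review. The sketch below uses a rough characters-per-token ratio as a proxy; in practice, count tokens with the model's tokenizer instead.

```python
def chunk_paper(text: str, max_tokens: int = 32_000,
                chars_per_token: int = 4) -> list[str]:
    """Split a paper into paragraph-aligned chunks under a token budget.

    chars_per_token is a crude proxy; replace with real tokenizer counts
    for production use.
    """
    budget = max_tokens * chars_per_token  # budget in characters
    chunks, current, size = [], [], 0
    for para in text.split("\n\n"):
        # Flush the current chunk if adding this paragraph would overflow it
        if size + len(para) > budget and current:
            chunks.append("\n\n".join(current))
            current, size = [], 0
        current.append(para)
        size += len(para) + 2  # +2 for the "\n\n" separator
    if current:
        chunks.append("\n\n".join(current))
    return chunks

# Example: ten 100-character paragraphs with a ~240-character budget
paper = "\n\n".join(["x" * 100] * 10)
chunks = chunk_paper(paper, max_tokens=60, chars_per_token=4)
# -> 5 chunks of two paragraphs each
```

Each chunk can then be sent through the inference example above, and the per-chunk findings merged afterwards.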

Related Resources

  • Training Dataset: PaperAudit_Dataset
  • PaperAudit Project: For more details, please refer to the PaperAudit project documentation

License

Please refer to the license terms of the base model Qwen3.
