PaperAudit Llama3.2 3B (SFT + RL)
Model Overview
PaperAudit_Llama3.2_3B_sft_rl is a lightweight model specifically trained for academic paper error detection and automated review tasks. This model is based on Llama 3.2 3B Instruct and has been optimized through Supervised Fine-Tuning (SFT) and Reinforcement Learning from Human Feedback (RLHF).
Model Information
- Base Model: Llama 3.2 3B Instruct
- Model Parameters: ~3 billion parameters
- Training Method: Supervised Fine-Tuning (SFT) + Reinforcement Learning (RLHF)
- Model Architecture: LlamaForCausalLM
- Context Length: 131,072 tokens
- Data Type: bfloat16
Model Features
- Lightweight and Efficient: 3B parameter scale, suitable for resource-constrained environments
- Specialized Optimization: Specifically optimized for academic paper error detection and review tasks
- Reinforcement Learning: Aligned with human preferences through RLHF to improve review quality and error detection accuracy
- Long Context: Supports ultra-long context (131K tokens), suitable for processing complete academic papers
Training Data
This model is trained on PaperAudit_Dataset. The dataset includes:
- Academic papers downloaded from OpenReview
- Structured paper content (processed via LlamaParse and LLM)
- Synthetic error data for training error detection models
- Human review feedback data
For more details about the dataset, please visit: https://huggingface.co/datasets/mayiwen/PaperAudit_Dataset
Usage
Install Dependencies
pip install transformers torch accelerate
Load Model
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch
model_path = "./llama3.2_3b_sft_rl"
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(
model_path,
torch_dtype=torch.bfloat16,
device_map="auto"
)
Inference Example
# Prepare input (paper error detection task)
prompt = """Please detect errors in the following academic paper paragraph:
[Paper content...]
Please identify errors and provide correction suggestions."""
# Encode input
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
# Generate response
with torch.no_grad():
outputs = model.generate(
**inputs,
max_new_tokens=512,
temperature=0.7,
do_sample=True,
pad_token_id=tokenizer.pad_token_id
)
# Decode output
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
Application Scenarios
- Academic paper error detection
- Automated paper review
- Academic writing quality assessment
- Paper content analysis and feedback generation
Model Architecture Details
- Hidden Size: 3072
- Intermediate Size: 8192
- Number of Attention Heads: 24
- Number of Key-Value Heads: 8 (Grouped Query Attention)
- Number of Hidden Layers: 28
- Vocabulary Size: 128,256
Notes
- This model is specifically optimized for academic paper review tasks and may require further fine-tuning for other domains
- It is recommended to use bfloat16 precision to save memory and improve inference speed
- For long document processing, appropriate context window management strategies are recommended
Related Resources
- Training Dataset: PaperAudit_Dataset
- PaperAudit Project: For more details, please refer to the PaperAudit project documentation
License
Please refer to the license terms of the base model Llama 3.2.
- Downloads last month
- 9
Inference Providers
NEW
This model isn't deployed by any Inference Provider.
🙋
Ask for provider support
Model tree for mayiwen/PaperAudit_Llama3.2_3B_sft_rl
Base model
meta-llama/Llama-3.2-3B-Instruct