---
language: en
license: mit
tags:
  - llm
  - sft
  - rlhf
  - qwen
  - llama
---


# PaperAudit SFT/RL Model Collection
This repository collects all models fine-tuned with SFT and RL for the PaperAudit project.

## Model List
| Model Name | Hugging Face Link | Description |
|------------|-------------------|-------------|
| Qwen3-8B-sft-rl | [mayiwen/PaperAudit_Qwen3_8B_sft_rl](https://huggingface.co/mayiwen/PaperAudit_Qwen3_8B_sft_rl) | Qwen3 8B model fine-tuned with SFT + RL for PaperAudit |
| Qwen3-14B-sft-rl | [mayiwen/PaperAudit_Qwen3_14B_sft_rl](https://huggingface.co/mayiwen/PaperAudit_Qwen3_14B_sft_rl) | Qwen3 14B model fine-tuned with SFT + RL for PaperAudit |
| Llama3.2-3B-sft-rl | [mayiwen/PaperAudit_Llama3.2_3B_sft_rl](https://huggingface.co/mayiwen/PaperAudit_Llama3.2_3B_sft_rl) | Llama3.2 3B model fine-tuned with SFT + RL for PaperAudit |

## Usage
See each model's repository for detailed usage instructions, such as training code and inference examples.
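
As a starting point, the sketch below shows one way to load a model from the list with the `transformers` library. It assumes each repository hosts a standard causal-LM checkpoint that `AutoModelForCausalLM` can load, which this README does not confirm; the helper names `MODEL_REPOS`, `load_model`, and `generate_reply` are hypothetical and exist only for illustration. The individual model repos remain the authoritative source for usage.

```python
# Repo IDs copied from the Model List table above.
MODEL_REPOS = {
    "Qwen3-8B-sft-rl": "mayiwen/PaperAudit_Qwen3_8B_sft_rl",
    "Qwen3-14B-sft-rl": "mayiwen/PaperAudit_Qwen3_14B_sft_rl",
    "Llama3.2-3B-sft-rl": "mayiwen/PaperAudit_Llama3.2_3B_sft_rl",
}


def load_model(name: str):
    """Load the tokenizer and model for one entry of MODEL_REPOS.

    Assumption: the repo holds a causal-LM checkpoint in the standard
    transformers format. Check the model's own repo to confirm.
    """
    # Imported lazily so the mapping above is usable without transformers.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    repo_id = MODEL_REPOS[name]
    tokenizer = AutoTokenizer.from_pretrained(repo_id)
    model = AutoModelForCausalLM.from_pretrained(repo_id)
    return tokenizer, model


def generate_reply(name: str, prompt: str, max_new_tokens: int = 64) -> str:
    """Generate a completion for `prompt` with the named model."""
    tokenizer, model = load_model(name)
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate_reply("Llama3.2-3B-sft-rl", "...")` downloads the checkpoint on first use, so the smallest model is probably the most convenient one to try first. The prompt format the models expect is not described here and should be taken from each model's repo.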