Upload README.md for PaperAudit_Models collection
README.md
CHANGED
@@ -1,3 +1,23 @@
 ---
-
+language: en
+license: mit
+tags:
+- llm
+- sft
+- rlhf
+- qwen
+- llama
 ---
+
+# PaperAudit SFT/RL Model Collection
+This repo aggregates all SFT/RL fine-tuned models for the PaperAudit project.
+
+## Model List
+| Model Name | Hugging Face Link | Description |
+|------------|-------------------|-------------|
+| Qwen3-8B-sft-rl | [mayiwen/PaperAudit_Qwen3_8B_sft_rl](https://huggingface.co/mayiwen/PaperAudit_Qwen3_8B_sft_rl) | Qwen3 8B model fine-tuned with SFT + RL for PaperAudit |
+| Qwen3-14B-sft-rl | [mayiwen/PaperAudit_Qwen3_14B_sft_rl](https://huggingface.co/mayiwen/PaperAudit_Qwen3_14B_sft_rl) | Qwen3 14B model fine-tuned with SFT + RL for PaperAudit |
+| Llama3.2-3B-sft-rl | [mayiwen/PaperAudit_Llama3.2_3B_sft_rl](https://huggingface.co/mayiwen/PaperAudit_Llama3.2_3B_sft_rl) | Llama3.2 3B model fine-tuned with SFT + RL for PaperAudit |
+
+## Usage
+Refer to each model's repo for detailed usage instructions (training code, inference examples, etc.).
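The README defers usage details to each model's repo, so the following is only a minimal sketch of how the listed checkpoints might be resolved and loaded. The repo IDs are copied from the Model List table. The helper names (`PAPERAUDIT_MODELS`, `repo_url`, `load_model`) are hypothetical, and the sketch assumes the checkpoints are standard causal-LM weights loadable through the `transformers` auto classes, which the README does not confirm.

```python
# Repo IDs copied from the Model List table; the dict name is illustrative.
PAPERAUDIT_MODELS = {
    "Qwen3-8B-sft-rl": "mayiwen/PaperAudit_Qwen3_8B_sft_rl",
    "Qwen3-14B-sft-rl": "mayiwen/PaperAudit_Qwen3_14B_sft_rl",
    "Llama3.2-3B-sft-rl": "mayiwen/PaperAudit_Llama3.2_3B_sft_rl",
}


def repo_url(model_name: str) -> str:
    """Return the Hugging Face URL for a model listed in the table."""
    return f"https://huggingface.co/{PAPERAUDIT_MODELS[model_name]}"


def load_model(model_name: str):
    """Download and return (tokenizer, model) for a listed checkpoint.

    ASSUMPTION: the repos hold standard causal-LM checkpoints; check each
    model's own repo for the authoritative loading instructions.
    """
    # Imported lazily so the lookup helpers work without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    repo_id = PAPERAUDIT_MODELS[model_name]
    tokenizer = AutoTokenizer.from_pretrained(repo_id)
    model = AutoModelForCausalLM.from_pretrained(repo_id)
    return tokenizer, model
```

Calling `repo_url("Qwen3-8B-sft-rl")` only builds the link from the table, while `load_model("Llama3.2-3B-sft-rl")` would download the weights and therefore needs network access and enough memory for the chosen model.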