	---
	language: en
	license: mit
	tags:
	- llm
	- sft
	- rlhf
	- qwen
	- llama
	---

	# PaperAudit SFT/RL Model Collection
	This repo lists the models fine-tuned with SFT and RL for the PaperAudit project. Each model lives in its own repo, linked below.

	## Model List
	| Model Name | Hugging Face Link | Description |
	|------------|-------------------|-------------|
	| Qwen3-8B-sft-rl | [mayiwen/PaperAudit_Qwen3_8B_sft_rl](https://huggingface.co/mayiwen/PaperAudit_Qwen3_8B_sft_rl) | Qwen3 8B model fine-tuned with SFT + RL for PaperAudit |
	| Qwen3-14B-sft-rl | [mayiwen/PaperAudit_Qwen3_14B_sft_rl](https://huggingface.co/mayiwen/PaperAudit_Qwen3_14B_sft_rl) | Qwen3 14B model fine-tuned with SFT + RL for PaperAudit |
	| Llama3.2-3B-sft-rl | [mayiwen/PaperAudit_Llama3.2_3B_sft_rl](https://huggingface.co/mayiwen/PaperAudit_Llama3.2_3B_sft_rl) | Llama3.2 3B model fine-tuned with SFT + RL for PaperAudit |

	## Usage
	Refer to each model's repo for detailed usage instructions (training code, inference examples, etc.).
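
As a rough starting point, the sketch below shows one way the listed models might be loaded with the `transformers` library. It assumes each repo is a standard causal-LM checkpoint that `AutoModelForCausalLM` can load directly, which this README does not confirm. The names `MODEL_REPOS`, `load_model`, and `generate` are illustrative helpers and are not part of the PaperAudit code. The instructions in each model's repo take precedence.

```python
# Repo IDs copied from the Model List table above.
MODEL_REPOS = {
    "Qwen3-8B-sft-rl": "mayiwen/PaperAudit_Qwen3_8B_sft_rl",
    "Qwen3-14B-sft-rl": "mayiwen/PaperAudit_Qwen3_14B_sft_rl",
    "Llama3.2-3B-sft-rl": "mayiwen/PaperAudit_Llama3.2_3B_sft_rl",
}


def load_model(name):
    """Load the tokenizer and model for one entry of MODEL_REPOS.

    Assumption: the repo is a plain causal-LM checkpoint that
    transformers can load as-is; check the model's own repo first.
    """
    # Imported lazily so the mapping above is usable without transformers.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    repo_id = MODEL_REPOS[name]
    tokenizer = AutoTokenizer.from_pretrained(repo_id)
    model = AutoModelForCausalLM.from_pretrained(repo_id)
    return tokenizer, model


def generate(name, prompt, max_new_tokens=256):
    """Generate a completion for `prompt` with the named model (hypothetical helper)."""
    tokenizer, model = load_model(name)
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate("Llama3.2-3B-sft-rl", "...")` downloads the weights on first use, so the 3B model is the cheapest one to try. The prompt format each model expects is not specified here and should be taken from that model's repo.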