Instructions for using Yougen/Qwen3Fangwusha14B with libraries and local apps.

Libraries

How to use Yougen/Qwen3Fangwusha14B with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="Yougen/Qwen3Fangwusha14B")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("Yougen/Qwen3Fangwusha14B")
model = AutoModelForCausalLM.from_pretrained("Yougen/Qwen3Fangwusha14B")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
	messages,
	add_generation_prompt=True,
	tokenize=True,
	return_dict=True,
	return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

Local Apps

vLLM

How to use Yougen/Qwen3Fangwusha14B with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "Yougen/Qwen3Fangwusha14B"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Yougen/Qwen3Fangwusha14B",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
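
The same request can also be sent from Python. The following is a minimal sketch using only the standard library. The helper names `build_chat_request` and `send_chat_request` are hypothetical, and the sketch assumes the server started above is reachable at `http://localhost:8000` and returns the usual OpenAI-style response shape (`choices[0].message.content`).

```python
import json
import urllib.request

MODEL_ID = "Yougen/Qwen3Fangwusha14B"


def build_chat_request(prompt, model=MODEL_ID):
    """Build the JSON body for an OpenAI-compatible chat completion call."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }


def send_chat_request(payload, base_url="http://localhost:8000"):
    """POST the payload to /v1/chat/completions and return the reply text."""
    request = urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    # Assumed response shape: the first choice carries the assistant message.
    return body["choices"][0]["message"]["content"]


# Example (requires a running server):
# print(send_chat_request(build_chat_request("What is the capital of France?")))
```

For a server listening on a different port, such as the SGLang server on port 30000, pass `base_url="http://localhost:30000"`.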

Use Docker

docker model run hf.co/Yougen/Qwen3Fangwusha14B

SGLang

How to use Yougen/Qwen3Fangwusha14B with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "Yougen/Qwen3Fangwusha14B" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Yougen/Qwen3Fangwusha14B",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "Yougen/Qwen3Fangwusha14B" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Yougen/Qwen3Fangwusha14B",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Docker Model Runner
How to use Yougen/Qwen3Fangwusha14B with Docker Model Runner:
```
docker model run hf.co/Yougen/Qwen3Fangwusha14B
```

Model Card for Yougen/Qwen3Fangwusha14B

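Qwen3Fangwusha14B is a Chinese large language model fine-tuned from Qwen3-14B, focused on improving Chinese dialogue, instruction following, and general task performance. It belongs to the Fangwusha series and aims to give Chinese-speaking users a high-quality, safe, and reliable AI assistant.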

Model Details

Model Description

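Qwen3Fangwusha14B is an autoregressive language model with 15 billion parameters, further fine-tuned from Qwen3-14B on high-quality Chinese datasets. The model was trained in BF16 precision and optimized for Chinese semantic understanding, logical reasoning, and multi-turn dialogue, making it suitable for a variety of Chinese natural language processing tasks.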

Developed by: Yougen Yuan
Funded by [optional]: [More Information Needed]
Shared by [optional]: Yougen Yuan
Model type: Auto-regressive language model (Decoder-only)
Language(s) (NLP): Chinese (zh), English (en)
License: Apache-2.0
Finetuned from model [optional]: Qwen/Qwen3-14B

Model Sources [optional]

Repository: https://huggingface.co/Yougen/Qwen3Fangwusha14B
Paper [optional]: [More Information Needed]
Demo [optional]: [More Information Needed]

Uses

Direct Use

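The model can be used directly for the following tasks:

Chinese dialogue and question answering
Text generation and continuation
Information extraction and summarization
Translation and language conversion
Code assistance and explanation
Creative writing and content creation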

Downstream Use [optional]

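The model can be further fine-tuned for:

Domain-specific knowledge-base question answering
Customer service bots
Educational tutoring systems
Internal enterprise assistants
Content moderation and classification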

Out-of-Scope Use

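The model should not be used for:

Generating illegal, harmful, violent, or discriminatory content
Unauthorized medical diagnosis, legal advice, or financial investment advice
Impersonating others or committing fraud
Generating content that may infringe intellectual property rights
High-risk decision systems (such as autonomous driving or medical device control)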

Bias, Risks, and Limitations

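The model may generate inaccurate, incomplete, or misleading information, especially on specialized domain knowledge
The model may reflect biases and stereotypes present in its training data
The model's contextual understanding may degrade on long texts
The model may hallucinate, fabricating facts or citations that do not exist
The model's English ability is weaker than its Chinese ability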

Recommendations

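When using the model, users should:

Fact-check and verify the content it generates
Be aware of its possible biases and limitations
Use it with caution in high-risk scenarios and consult professionals when necessary
Comply with applicable laws, regulations, and ethical norms
Report any harmful or inappropriate model output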

How to Get Started with the Model

Use the code below to get started with the model.

from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_name = "Yougen/Qwen3Fangwusha14B"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    device_map="auto"
)

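# Wrap the prompt in the chat template so the model sees the expected turn markers.
# The prompt means "Hello, please introduce yourself."
messages = [{"role": "user", "content": "你好，请介绍一下你自己。"}]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

# Sample up to 512 new tokens.
outputs = model.generate(
    **inputs,
    max_new_tokens=512,
    temperature=0.7,
    top_p=0.9,
    do_sample=True,
)

# Decode only the newly generated tokens, not the echoed prompt.
new_tokens = outputs[0][inputs["input_ids"].shape[-1]:]
response = tokenizer.decode(new_tokens, skip_special_tokens=True)
print(response)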

Training Details

Training Data

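The model was fine-tuned on several high-quality Chinese datasets, including:

General dialogue datasets
Instruction-following datasets
Knowledge question-answering datasets
Logical reasoning datasets

All datasets went through strict quality filtering and deduplication to ensure the quality and diversity of the training data.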

Training Procedure

Preprocessing [optional]

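The training data went through the following preprocessing steps:

Text cleaning and standardization
Format unification and normalization
Quality filtering and deduplication
Data augmentation and diversification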
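
The card does not describe how these steps were implemented. As an illustration only, the normalization, filtering, and deduplication steps could be sketched as follows. The function names, the choice of Unicode NFKC normalization, and the `min_chars` threshold are all assumptions made for this example, not the author's pipeline. Note that NFKC folds full-width Latin letters and punctuation into their half-width forms, which may or may not be desirable for a Chinese corpus.

```python
import unicodedata


def normalize_text(text):
    """Apply Unicode NFKC normalization and collapse runs of whitespace."""
    normalized = unicodedata.normalize("NFKC", text)
    return " ".join(normalized.split())


def clean_and_deduplicate(samples, min_chars=5):
    """Normalize samples, drop very short ones, and remove exact duplicates."""
    seen = set()
    kept = []
    for sample in samples:
        text = normalize_text(sample)
        if len(text) < min_chars:
            continue  # crude quality filter: too short to be useful
        if text in seen:
            continue  # exact duplicate after normalization
        seen.add(text)
        kept.append(text)
    return kept
```

Exact-match deduplication is the simplest variant; near-duplicate detection (for example with MinHash) would catch paraphrased repeats but is beyond this sketch.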

Training Hyperparameters

Training regime: BF16 mixed precision
Optimizer: AdamW
Learning rate: [More Information Needed]
Batch size: [More Information Needed]
Epochs: [More Information Needed]
Warmup steps: [More Information Needed]
Weight decay: [More Information Needed]

Speeds, Sizes, Times [optional]

Model size: 15B parameters
Checkpoint size: ~30GB (BF16)
Training duration: [More Information Needed]
Training hardware: [More Information Needed]
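
The checkpoint size follows from the parameter count: BF16 stores each parameter in 2 bytes. A small sketch of that arithmetic, where the helper name `checkpoint_size_gb` is hypothetical and 1 GB is taken as 10^9 bytes:

```python
def checkpoint_size_gb(num_parameters, bytes_per_parameter=2):
    """Estimate raw weight storage in gigabytes (1 GB = 10**9 bytes)."""
    return num_parameters * bytes_per_parameter / 1e9


# 15 billion parameters at 2 bytes each (BF16).
print(checkpoint_size_gb(15e9))  # 30.0
```

This counts the weights only; memory used at inference time is higher because of activations and the KV cache.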

Evaluation

Testing Data, Factors & Metrics

Testing Data

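The model was evaluated on the following benchmarks:

C-Eval (general Chinese ability evaluation)
MMLU (multi-task language understanding)
GSM8K (mathematical reasoning)
HumanEval (code generation)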

Factors

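The evaluation covers the following dimensions:

Knowledge mastery
Logical reasoning ability
Instruction-following ability
Chinese understanding and generation ability
Code generation ability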

Metrics

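Accuracy: used for knowledge question-answering and multiple-choice tasks
Pass@k: used for code generation tasks
BLEU/ROUGE: used for text generation and translation tasks
Human evaluation: used to assess dialogue quality and overall performance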
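
The card does not state how these metrics were computed. As a reference point, accuracy and Pass@k can be sketched as below. The `pass_at_k` function uses the unbiased estimator introduced with HumanEval (Chen et al., 2021), 1 - C(n - c, k) / C(n, k), where n samples are generated per problem and c of them pass the tests. Whether this card's evaluation used that estimator is an assumption, and both function names are hypothetical.

```python
from math import comb


def accuracy(predictions, references):
    """Fraction of predictions that exactly match their reference."""
    if len(predictions) != len(references) or not references:
        raise ValueError("inputs must be non-empty and of equal length")
    correct = sum(p == r for p, r in zip(predictions, references))
    return correct / len(references)


def pass_at_k(n, c, k):
    """Unbiased pass@k estimate for one problem: 1 - C(n - c, k) / C(n, k)."""
    if not 0 <= c <= n or not 1 <= k <= n:
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    # comb(n - c, k) is 0 when fewer than k samples fail, giving 1.0.
    return 1.0 - comb(n - c, k) / comb(n, k)
```

A benchmark-level Pass@k score is the mean of `pass_at_k` over all problems.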

Results

[More Information Needed]

Summary

[More Information Needed]

Model Examination [optional]

[More Information Needed]

Environmental Impact

Carbon emissions can be estimated using the Machine Learning Impact calculator presented in Lacoste et al. (2019).

Hardware Type: [More Information Needed]
Hours used: [More Information Needed]
Cloud Provider: [More Information Needed]
Compute Region: [More Information Needed]
Carbon Emitted: [More Information Needed]

Technical Specifications [optional]

Model Architecture and Objective

The model is based on the Qwen3 architecture and uses a decoder-only Transformer:

Context window size: [More Information Needed]
Attention mechanism: Grouped-Query Attention (GQA)
Activation function: SwiGLU
Vocabulary size: [More Information Needed]
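
To make the two named components concrete, here is a minimal pure-Python sketch. It shows only the elementwise gating step of SwiGLU (applied to vectors that are assumed to be already projected by the gate and up matrices) and the query-to-KV head mapping used in GQA. The function names are hypothetical, and the head counts in the usage lines are illustrative values, not figures taken from this card.

```python
import math


def silu(x):
    """SiLU (swish) activation: x * sigmoid(x).

    sigmoid(x) is written as 0.5 * (1 + tanh(x / 2)) to avoid overflow.
    """
    return x * 0.5 * (1.0 + math.tanh(x / 2.0))


def swiglu_gate(gate, up):
    """Elementwise SwiGLU gating: silu(gate_i) * up_i for each position i."""
    if len(gate) != len(up):
        raise ValueError("gate and up must have the same length")
    return [silu(g) * u for g, u in zip(gate, up)]


def kv_head_for_query_head(query_head, num_query_heads, num_kv_heads):
    """Return the index of the KV head shared by the given query head."""
    if num_query_heads % num_kv_heads != 0:
        raise ValueError("num_query_heads must be a multiple of num_kv_heads")
    group_size = num_query_heads // num_kv_heads
    return query_head // group_size


# Illustrative configuration: 40 query heads sharing 8 KV heads (groups of 5).
print(kv_head_for_query_head(5, 40, 8))   # 1
print(kv_head_for_query_head(39, 40, 8))  # 7
```

Sharing each KV head across a group of query heads shrinks the KV cache by the group size, which is the main practical benefit of GQA at inference time.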

Compute Infrastructure

[More Information Needed]

Hardware

[More Information Needed]

Software

Framework: PyTorch 2.x
Training library: LLaMA-Factory
Inference library: Transformers 4.x
Acceleration: FlashAttention-2

Citation [optional]

BibTeX:

@misc{qwen3fangwusha14b,
  author = {Yuan, Yougen},
  title = {Qwen3Fangwusha14B: A Fine-tuned Chinese Large Language Model},
  year = {2026},
  publisher = {Hugging Face},
  howpublished = {\url{https://huggingface.co/Yougen/Qwen3Fangwusha14B}}
}

APA:

Yuan, Y. (2026). Qwen3Fangwusha14B: A Fine-tuned Chinese Large Language Model. Hugging Face. https://huggingface.co/Yougen/Qwen3Fangwusha14B