---
license: apache-2.0
base_model: Qwen/Qwen2.5-1.5B-Instruct
tags:
- qwen2.5
- instruction-tuning
- sft
---
## Model Configuration

```json
{
  "_name_or_path": "Qwen/Qwen2.5-1.5B-Instruct",
  "architectures": [
    "Qwen2ForCausalLM"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 151643,
  "eos_token_id": 151645,
  "hidden_act": "silu",
  "hidden_size": 1536,
  "initializer_range": 0.02,
  "intermediate_size": 8960,
  "max_position_embeddings": 32768,
  "max_window_layers": 21,
  "model_type": "qwen2",
  "num_attention_heads": 12,
  "num_hidden_layers": 28,
  "num_key_value_heads": 2,
  "rms_norm_eps": 1e-06,
  "rope_scaling": null,
  "rope_theta": 1000000.0,
  "sliding_window": null,
  "tie_word_embeddings": true,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.45.2",
  "use_cache": true,
  "use_sliding_window": false,
  "vocab_size": 151936
}
```
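The configuration above determines the model's size. As a rough sanity check, a sketch of the parameter count under the standard Qwen2 layout (grouped-query attention with q/k/v biases, SwiGLU MLP, RMSNorm, tied embeddings) is shown below; this is an estimate derived from the config values, not an official figure.

```python
# Rough parameter-count estimate from the config.json values above.
# Assumes the standard Qwen2 layout -- a sketch, not an official number.
cfg = {
    "hidden_size": 1536,
    "intermediate_size": 8960,
    "num_hidden_layers": 28,
    "num_attention_heads": 12,
    "num_key_value_heads": 2,
    "vocab_size": 151936,
}

h = cfg["hidden_size"]
head_dim = h // cfg["num_attention_heads"]       # 128
kv_dim = head_dim * cfg["num_key_value_heads"]   # 256 (grouped-query attention)

embed = cfg["vocab_size"] * h                    # tied with lm_head, so counted once
attn = (h * h + h) + 2 * (h * kv_dim + kv_dim) + h * h  # q, k, v (with bias), o
mlp = 3 * h * cfg["intermediate_size"]           # gate, up, and down projections
norms = 2 * h                                    # two RMSNorms per layer
per_layer = attn + mlp + norms

total = embed + cfg["num_hidden_layers"] * per_layer + h  # + final norm
print(f"~{total / 1e9:.2f}B parameters")                  # ≈ 1.54B
```

The result lands close to 1.54B, which matches the "1.5B" in the model name once the tied embedding matrix is counted.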
## Usage
```python
from transformers import AutoTokenizer, AutoModelForCausalLM

model_name = "AlexanderWang915/qwen2.5-1.5b-smolinstruct"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name, torch_dtype="auto", device_map="auto"
)

# Example conversation
messages = [
    {"role": "user", "content": "Hello!"}
]
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=512,
    do_sample=True,
    temperature=0.7,
    top_p=0.95
)
# Strip the prompt tokens so only the newly generated text is decoded
generated_ids = [
    output_ids[len(input_ids):]
    for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]
response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(response)
```
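The slicing step near the end is easy to miss: `generate()` returns the prompt tokens followed by the new tokens, so the list comprehension drops the first `len(input_ids)` ids from each sequence before decoding. A minimal illustration with dummy token-id lists (the ids here are made up for the example):

```python
# Hypothetical token ids, standing in for model_inputs.input_ids and
# the output of model.generate(): the output echoes the prompt first.
input_ids = [[101, 7592, 102]]              # prompt tokens
generated = [[101, 7592, 102, 2000, 2001]]  # prompt + 2 newly generated tokens

# Same slicing pattern as in the usage snippet above
new_only = [out[len(inp):] for inp, out in zip(input_ids, generated)]
print(new_only)  # [[2000, 2001]] -- only the new tokens remain
```

Without this step, the decoded response would include the user's prompt as a prefix.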
## Training Details
- Trained with LLaMA-Factory
- Fine-tuned on 100k samples from the SmolInstruct dataset
- Instruction-tuned to improve conversational and instruction-following ability
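The card does not include the actual training configuration. For orientation only, a LLaMA-Factory SFT run is typically driven by a YAML file along these lines; every hyperparameter value below is a hypothetical placeholder, and only the base model, the SFT stage, and the SmolInstruct data come from this card:

```yaml
# Hypothetical LLaMA-Factory SFT config -- placeholder values,
# not the settings actually used to train this model.
stage: sft
model_name_or_path: Qwen/Qwen2.5-1.5B-Instruct
finetuning_type: full           # assumption; could equally have been LoRA
dataset: smolinstruct_100k      # hypothetical entry in dataset_info.json
template: qwen
cutoff_len: 2048                # placeholder
per_device_train_batch_size: 4  # placeholder
learning_rate: 1.0e-5           # placeholder
num_train_epochs: 1.0           # placeholder
output_dir: saves/qwen2.5-1.5b-smolinstruct
```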
## Notes
This model inherits the license of its base model. Please comply with applicable laws and regulations when using it, and do not use it for illegal purposes.