summerMC/TRM-textV2-SFT
This model is summerMC/TRM-textV2 fine-tuned on the mlabonne/FineTome-100k dataset.
Model description
- Model Type: Custom Transformer (TRM-textV2)
- Task: Chat / Instruction Following
- Language: Japanese / English
- Sequence Length: 512
Training Details
- Method: Full Supervised Fine-Tuning (SFT)
- Format: ChatML (<|im_start|>, <|im_end|>)
- Optimizer: AdamW
- Learning Rate: 2e-5
- Epochs: 1
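The ChatML format listed above can be illustrated with a minimal sketch. It assumes the training conversations are stored as ShareGPT-style turns with `from` and `value` keys; the helper `to_chatml` and the `ROLE_MAP` table are hypothetical names for illustration, and the preprocessing actually used for this model is not shown in this card.

```python
# Assumed mapping from ShareGPT-style "from" values to ChatML role names.
ROLE_MAP = {"system": "system", "human": "user", "gpt": "assistant"}


def to_chatml(conversation):
    """Render a list of {"from": ..., "value": ...} turns as one ChatML string."""
    parts = []
    for turn in conversation:
        role = ROLE_MAP[turn["from"]]
        # Each turn is wrapped in <|im_start|>role ... <|im_end|> markers.
        parts.append(f"<|im_start|>{role}\n{turn['value']}<|im_end|>\n")
    return "".join(parts)


# Hypothetical two-turn conversation, not taken from the dataset.
example = [
    {"from": "human", "value": "What is 2 + 2?"},
    {"from": "gpt", "value": "2 + 2 equals 4."},
]
chatml_text = to_chatml(example)
print(chatml_text)
```

Each turn ends with `<|im_end|>` followed by a newline, which matches the layout of the prompt in the usage example.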
How to use
from transformers import AutoTokenizer, AutoModelForCausalLM

# trust_remote_code=True loads the repository's custom model code (TRM-textV2 is a custom architecture).
tokenizer = AutoTokenizer.from_pretrained("summerMC/TRM-textV2-SFT", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained("summerMC/TRM-textV2-SFT", trust_remote_code=True)

# ChatML prompt. The user message means "Please tell me about the height of Mt. Fuji."
prompt = "<|im_start|>user\n富士山の高さについて教えてください。<|im_end|>\n<|im_start|>assistant\n"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0]))
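The decoded output contains the prompt as well as the completion. The following is a minimal sketch of pulling out only the assistant's reply, assuming the ChatML markers survive decoding as literal text. `extract_reply` is a hypothetical helper, and the `decoded` string is a hand-written stand-in rather than real model output.

```python
def extract_reply(decoded, start="<|im_start|>assistant\n", end="<|im_end|>"):
    """Return the text of the last assistant turn in a decoded ChatML string."""
    # Keep everything after the last assistant marker.
    _, found, tail = decoded.rpartition(start)
    if not found:
        # No assistant marker present: fall back to the whole string.
        return decoded.strip()
    # Cut at the end-of-turn marker if the model emitted one.
    reply, _, _ = tail.partition(end)
    return reply.strip()


# Hand-written stand-in for tokenizer.decode(outputs[0]).
decoded = (
    "<|im_start|>user\nHow tall is Mt. Fuji?<|im_end|>\n"
    "<|im_start|>assistant\nMt. Fuji is 3,776 m tall.<|im_end|>"
)
reply = extract_reply(decoded)
print(reply)
```

If generation stops at `max_new_tokens` before `<|im_end|>` is produced, the helper returns whatever text follows the assistant marker.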