nanochat d40 (SFT)
A custom nanochat checkpoint packaged for Hugging Face. Loading it requires trust_remote_code=True.
Usage
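from transformers import AutoTokenizer, AutoModelForCausalLM

# trust_remote_code=True lets transformers run the custom code shipped with this repo
tok = AutoTokenizer.from_pretrained("ljt019/nanochat-d40", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained("ljt019/nanochat-d40", trust_remote_code=True)

inputs = tok("Hello!", return_tensors="pt")
# Sample up to 128 new tokens at temperature 0.6, keeping the 50 most likely tokens per step
out = model.generate(**inputs, max_new_tokens=128, do_sample=True, temperature=0.6, top_k=50)
# Decode the generated token ids back to text
print(tok.decode(out[0].tolist()))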
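The temperature and top_k arguments in the generate call above control how each next token is drawn. The following is a minimal pure-Python sketch of the general idea, not the code that transformers or this checkpoint's custom model actually runs. The function name sample_top_k and the toy logits are hypothetical and exist only for illustration.

```python
import math
import random


def sample_top_k(logits, temperature=0.6, top_k=50, rng=random):
    """Sample one token id from raw logits using temperature and top-k filtering.

    Illustrative sketch only; `sample_top_k` is a hypothetical helper,
    not part of transformers or nanochat.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    # Keep only the ids of the top_k highest-scoring tokens.
    k = min(top_k, len(logits))
    top_ids = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    # Divide the surviving logits by the temperature;
    # values below 1 sharpen the distribution, values above 1 flatten it.
    scaled = [logits[i] / temperature for i in top_ids]
    # Softmax numerator, subtracting the max for numerical stability.
    m = max(scaled)
    weights = [math.exp(s - m) for s in scaled]
    # random.choices normalizes the weights before drawing.
    return rng.choices(top_ids, weights=weights, k=1)[0]


# Toy example: with top_k=2 only ids 0 and 1 can ever be drawn.
toy_logits = [2.0, 1.0, 0.5, -1.0]
token_id = sample_top_k(toy_logits, temperature=0.6, top_k=2, rng=random.Random(0))
```

With top_k=1 the sketch reduces to greedy decoding, since only the single highest-scoring token survives the filter.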
Notes
- Weights are from chatsft_checkpoints/d40/model_000681.pt.
- Tokenizer is loaded from tokenizer.pkl (tiktoken encoding).
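If you want to inspect the tokenizer outside of transformers, the sketch below shows one way it might be done. It assumes tokenizer.pkl is a pickled object exposing encode and decode methods (as a tiktoken encoding does) and that the tiktoken package is installed when you unpickle the real file. The helper names load_pickled_encoding and roundtrip are hypothetical. Unpickling executes code from the file, so only load files you trust.

```python
import pickle


def load_pickled_encoding(path):
    """Unpickle a tokenizer object from `path` (hypothetical helper).

    Assumption: the file holds a pickled encoding object with
    `encode` and `decode` methods. Only unpickle trusted files.
    """
    with open(path, "rb") as f:
        return pickle.load(f)


def roundtrip(enc, text):
    """Encode `text` to token ids and decode them back, for a quick sanity check."""
    ids = enc.encode(text)
    return ids, enc.decode(ids)


# Usage (assumes a local copy of tokenizer.pkl from the repo):
# enc = load_pickled_encoding("tokenizer.pkl")
# ids, text = roundtrip(enc, "Hello!")
```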