tiny-gpt-1-1m
This repository contains a pretrained TinyGPT checkpoint, published for public use and intended for education and experimentation.
Artifacts
- tiny_gpt_latest.pt: training checkpoint with model and optimizer state
- tokenizer.model: SentencePiece tokenizer used for training and generation
- config.json: model configuration serialized from the checkpoint
- training_config.yaml: training and MLflow settings used for the run
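To check what the artifacts contain before using them, a minimal sketch along the following lines may help. It assumes the files have been downloaded into the working directory and that the checkpoint unpickles to a dictionary; the helper names are hypothetical, and the actual keys inside config.json and tiny_gpt_latest.pt are not documented here, so the code only lists whatever it finds.

```python
import json
from pathlib import Path


def load_model_config(path):
    """Read config.json into a plain dict (its keys are not assumed here)."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


def summarize_checkpoint(checkpoint):
    """Map each top-level checkpoint key to a short description of its value."""
    summary = {}
    for key, value in checkpoint.items():
        if isinstance(value, dict):
            summary[key] = f"dict with {len(value)} entries"
        else:
            summary[key] = type(value).__name__
    return summary


def load_checkpoint(path):
    """Load the training checkpoint onto the CPU."""
    # Imported lazily so the helpers above work without PyTorch installed.
    import torch

    # Recent PyTorch releases default to weights_only=True, which may reject
    # checkpoints that store non-tensor objects. Passing weights_only=False
    # unpickles arbitrary objects, so only do that for files you trust.
    return torch.load(path, map_location="cpu")


# Example usage, assuming the files are in the current directory:
# print(load_model_config("config.json"))
# print(summarize_checkpoint(load_checkpoint("tiny_gpt_latest.pt")))
```

The summary shows which entries hold the model weights and which hold the optimizer state without printing the tensors themselves.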
How to use
Use with Transformers.
With transformers >= 4.43.0, you can run text-generation inference either through the pipeline abstraction or through the Auto classes with generate().
Make sure to update your Transformers installation via pip install --upgrade transformers.
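import torch
import transformers

model_id = "vjkhambe/tiny-gpt-1-1m"
# Pipeline device index: first GPU if available, otherwise CPU (-1).
device = 0 if torch.cuda.is_available() else -1

# trust_remote_code=True runs the model code shipped in the repository.
model = transformers.AutoModelForCausalLM.from_pretrained(
    model_id,
    trust_remote_code=True,
    dtype=torch.bfloat16,  # older Transformers releases name this argument torch_dtype
)
tokenizer = transformers.AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)

# Limit output by new tokens only; clear max_length so the two limits do not conflict.
model.generation_config.max_length = None
model.generation_config.max_new_tokens = 64

pipeline = transformers.pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    device=device,
)
print(pipeline("Hey how are you doing today?"))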
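The text above also mentions using the Auto classes with generate() directly. A minimal sketch of that route might look like the following; the helper names generation_kwargs and generate_text are hypothetical, the sampling values are illustrative rather than recommended settings, and the sketch assumes the repository's custom tokenizer returns an input_ids tensor when called with return_tensors="pt".

```python
def generation_kwargs(max_new_tokens=64, sample=False):
    """Build keyword arguments for generate(); the values are illustrative."""
    kwargs = {"max_new_tokens": max_new_tokens, "do_sample": sample}
    if sample:
        # Sampling settings chosen only as an example.
        kwargs.update(temperature=0.8, top_k=50)
    return kwargs


def generate_text(prompt, model_id="vjkhambe/tiny-gpt-1-1m", **overrides):
    """Generate a continuation of `prompt` with the Auto classes."""
    # Imported lazily so generation_kwargs can be used without PyTorch installed.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
    model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)
    model.eval()

    inputs = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        output_ids = model.generate(
            input_ids=inputs["input_ids"],
            **generation_kwargs(**overrides),
        )
    # The first sequence holds the prompt followed by the generated tokens.
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


# Example usage (downloads the model on first call):
# print(generate_text("Hey how are you doing today?", max_new_tokens=32))
```

Calling generate() directly returns token IDs, so the decoded string includes the prompt; slice output_ids by the prompt length first if only the continuation is wanted.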
Training details
- Base package: tiny_gpt_pretrain
- Model and training configuration are stored in the checkpoint and training_config.yaml
- The exported checkpoint includes optimizer state for continued fine-tuning or evaluation
License
Released under the Apache-2.0 license.
Target repo: vjkhambe/tiny-gpt-1-1m