# Cooking Recipe GPT: Small Causal Language Model
This is a small GPT-style causal language model trained on structured cooking recipe data. It follows the standard GPT-2 architecture with minimal custom tweaks, and generates complete ingredient lists and directions from a title or partial prompt such as "No-Bake Nut Cookies".
## Model Architecture
- Base: GPT-2-like architecture
- Total Parameters: ~3.1M
- Number of Layers (Blocks): 2
- Vocabulary Size: 4,000
- Embedding Dimension: 384
- Hidden Dimension: 512
- Attention Heads: 4
- Max Sequence Length: 512
- Position Encoding: Sinusoidal
- Normalization: LayerNorm
- Activation Function: ReLU
- Dropout: 0.1
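For orientation, here is a minimal PyTorch sketch of a model matching the configuration above. This is not the repository's actual implementation: the pre-LN block layout, the reading of "Hidden Dimension" as the feed-forward inner size, the input/output embedding tying, and all names (`TinyRecipeGPT`, `Block`) are assumptions made for illustration.

```python
import math
import torch
import torch.nn as nn

# Configuration from the list above. Treating "Hidden Dimension: 512" as the
# feed-forward inner size is an assumption, as is the pre-LN block layout.
VOCAB_SIZE, EMBED_DIM, FFN_DIM = 4000, 384, 512
N_LAYERS, N_HEADS, MAX_LEN, DROPOUT = 2, 4, 512, 0.1

def sinusoidal_positions(max_len: int, dim: int) -> torch.Tensor:
    """Fixed sinusoidal position table, as in the original Transformer."""
    pos = torch.arange(max_len).unsqueeze(1)
    div = torch.exp(torch.arange(0, dim, 2) * (-math.log(10000.0) / dim))
    pe = torch.zeros(max_len, dim)
    pe[:, 0::2] = torch.sin(pos * div)
    pe[:, 1::2] = torch.cos(pos * div)
    return pe

class Block(nn.Module):
    """One pre-LN decoder block: masked self-attention, then a ReLU MLP."""
    def __init__(self):
        super().__init__()
        self.ln1 = nn.LayerNorm(EMBED_DIM)
        self.attn = nn.MultiheadAttention(EMBED_DIM, N_HEADS, dropout=DROPOUT, batch_first=True)
        self.ln2 = nn.LayerNorm(EMBED_DIM)
        self.ffn = nn.Sequential(
            nn.Linear(EMBED_DIM, FFN_DIM), nn.ReLU(),
            nn.Linear(FFN_DIM, EMBED_DIM), nn.Dropout(DROPOUT),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        t = x.size(1)
        # Boolean causal mask: True marks future positions to be hidden.
        mask = torch.triu(torch.ones(t, t, dtype=torch.bool, device=x.device), diagonal=1)
        h = self.ln1(x)
        attn_out, _ = self.attn(h, h, h, attn_mask=mask)
        x = x + attn_out
        return x + self.ffn(self.ln2(x))

class TinyRecipeGPT(nn.Module):
    def __init__(self):
        super().__init__()
        self.tok = nn.Embedding(VOCAB_SIZE, EMBED_DIM)
        self.register_buffer("pos", sinusoidal_positions(MAX_LEN, EMBED_DIM))
        self.drop = nn.Dropout(DROPOUT)
        self.blocks = nn.ModuleList(Block() for _ in range(N_LAYERS))
        self.ln_f = nn.LayerNorm(EMBED_DIM)
        self.head = nn.Linear(EMBED_DIM, VOCAB_SIZE, bias=False)
        self.head.weight = self.tok.weight  # tying assumed; keeps the count near ~3.1M

    def forward(self, ids: torch.Tensor) -> torch.Tensor:
        x = self.drop(self.tok(ids) + self.pos[: ids.size(1)])
        for blk in self.blocks:
            x = blk(x)
        return self.head(self.ln_f(x))  # (batch, seq, vocab) logits
```

With tied embeddings, `sum(p.numel() for p in TinyRecipeGPT().parameters())` lands in the same low-millions range as the stated ~3.1M.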
## Training Details
### Pretraining
- Training Data: 180,000 structured recipes
- Validation Data: 20,000 recipes
- Epochs: 4
- Training Loss: 1.7928
- Validation Loss: 1.7219
### SFT (Supervised Fine-Tuning)
- SFT Training Data: 50,000 recipes
- SFT Validation Data: 5,000 recipes
- SFT Training Loss: 1.6551
- SFT Validation Loss: 1.6066
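Since causal-LM cross-entropy converts to perplexity via `exp(loss)`, the validation losses above translate to rough perplexities (assuming the reported values are mean per-token cross-entropy in nats, PyTorch's default):

```python
import math

# Perplexity = exp(mean per-token cross-entropy), assuming losses in nats.
for stage, val_loss in [("Pretraining", 1.7219), ("SFT", 1.6066)]:
    print(f"{stage} validation perplexity: {math.exp(val_loss):.2f}")
# Pretraining validation perplexity: 5.60
# SFT validation perplexity: 4.99
```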
## Example Usage
```python
from transformers import AutoTokenizer, AutoModelForCausalLM

# Load the model and tokenizer; trust_remote_code is required because the
# model class is defined by custom code in this repository.
model = AutoModelForCausalLM.from_pretrained("gurumurthy3/cooking-recipe", trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained("gurumurthy3/cooking-recipe")

# Prompt with a recipe title; the model completes the ingredients and directions.
prompt = "No-Bake Nut Cookies"
input_ids = tokenizer(prompt, return_tensors="pt").input_ids

# The repository's custom generate() takes the tokenizer alongside the
# usual decoding parameters.
output = model.generate(
    input_ids,
    tokenizer,
    max_new_tokens=250,
    temperature=1.0,
    top_k=50,
    eos_token_id=tokenizer.eos_token_id,
)
print(output)
```
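One practical note on loading: `trust_remote_code=True` executes the repository's custom Python code, so pinning a revision is a reasonable precaution for reproducibility. A small sketch using the standard `revision` argument of `from_pretrained` (the value shown is a placeholder):

```python
from transformers import AutoModelForCausalLM

# `revision` is a standard from_pretrained argument; replace "main" with a
# specific commit hash to pin the exact model code and weights.
model = AutoModelForCausalLM.from_pretrained(
    "gurumurthy3/cooking-recipe",
    trust_remote_code=True,
    revision="main",
)
```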