πŸͺ Cooking Recipe GPT β€” Small Causal Language Model

This is a small GPT-style causal language model trained on structured cooking recipe data. It follows a standard GPT-2 architecture with minimal custom tweaks. The model can generate complete ingredient lists and directions from a given title or partial prompt like "No-Bake Nut Cookies".


πŸ”§ Model Architecture

  • Base: GPT-2-like architecture
  • Total Parameters: ~3.1M
  • Number of Layers (Blocks): 2
  • Vocabulary Size: 4,000
  • Embedding Dimension: 384
  • Hidden Dimension: 512
  • Attention Heads: 4
  • Max Sequence Length: 512
  • Position Encoding: Sinusoidal
  • Normalization: LayerNorm
  • Activation Function: ReLU
  • Dropout: 0.1
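
The sinusoidal position encoding listed above can be sketched as follows. This is a minimal NumPy implementation of the standard Transformer formulation, sized to the dimensions in the table (max sequence length 512, embedding dimension 384); the exact table this model builds internally may differ in detail.

```python
import numpy as np

def sinusoidal_positions(max_len=512, d_model=384):
    """Fixed sinusoidal position-encoding table (standard Transformer scheme)."""
    positions = np.arange(max_len)[:, None]                          # (max_len, 1)
    div_terms = np.exp(-np.log(10000.0) * np.arange(0, d_model, 2) / d_model)
    table = np.zeros((max_len, d_model))
    table[:, 0::2] = np.sin(positions * div_terms)                   # even dims: sine
    table[:, 1::2] = np.cos(positions * div_terms)                   # odd dims: cosine
    return table

pe = sinusoidal_positions()
print(pe.shape)  # (512, 384): one 384-dim encoding per position
```

Because the table is fixed rather than learned, it adds no parameters to the ~3.1M total.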

πŸ“š Training Details

🧠 Pretraining

  • Training Data: 180,000 structured recipes
  • Validation Data: 20,000 recipes
  • Epochs: 4
  • Training Loss: 1.7928
  • Validation Loss: 1.7219

πŸ”„ SFT (Supervised Fine-Tuning)

  • SFT Training Data: 50,000 recipes
  • SFT Validation Data: 5,000 recipes
  • SFT Training Loss: 1.6551
  • SFT Validation Loss: 1.6066
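
Assuming the losses above are mean cross-entropy in nats per token (the usual convention), they can be converted to perplexity with `exp(loss)`, which makes the pretraining-vs-SFT improvement easier to compare:

```python
import math

# Cross-entropy losses reported above (assumed to be nats per token)
losses = {
    "pretrain train": 1.7928,
    "pretrain val":   1.7219,
    "sft train":      1.6551,
    "sft val":        1.6066,
}

for name, loss in losses.items():
    # Perplexity = exp(mean cross-entropy)
    print(f"{name}: perplexity ~ {math.exp(loss):.2f}")
```

Under that assumption, validation perplexity drops from roughly 5.6 after pretraining to roughly 5.0 after SFT.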

πŸš€ Example Usage

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("gurumurthy3/cooking-recipe", trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained("gurumurthy3/cooking-recipe")

prompt = "No-Bake Nut Cookies"
input_ids = tokenizer(prompt, return_tensors="pt").input_ids

output = model.generate(
    input_ids,
    max_new_tokens=250,
    do_sample=True,        # temperature/top_k only take effect when sampling
    temperature=1.0,
    top_k=50,
    eos_token_id=tokenizer.eos_token_id,
)

# Decode the generated token IDs back into recipe text
print(tokenizer.decode(output[0], skip_special_tokens=True))
```