# Model Card: T5-Base Fine-Tuned for Recipe Direction Generation (FP16)
## Model Overview

- **Model Name:** t5-base-recipe-finetuned-fp16
- **Model Type:** Sequence-to-Sequence Transformer
- **Base Model:** google/t5-base (220M parameters)
- **Quantization:** FP16 (half-precision floating point)
- **Task:** Generate cooking directions from a list of ingredients
## Intended Use

This model generates step-by-step cooking directions from a list of ingredients. It is intended for:

- Recipe creation assistance
- Educational purposes in culinary AI research
- Exploration of text-to-text generation in domain-specific tasks

**Primary Users:** Home cooks, recipe developers, AI researchers.
## Model Details

- **Architecture:** T5 (Text-to-Text Transfer Transformer), an encoder-decoder Transformer with 12 encoder and 12 decoder layers, hidden size 768, and 12 attention heads.
- **Input:** Text string in the format `generate recipe directions from ingredients: <ingredient1> <ingredient2> ...`
- **Output:** Text string containing cooking directions.
- **Quantization:** Weights converted to FP16, roughly halving memory usage (~425 MB vs. ~850 MB in FP32) and speeding up inference on GPU.
- **Hardware:** Fine-tuned and tested on a 12 GB NVIDIA GPU with CUDA.
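The memory figures above follow directly from the parameter count: each FP32 weight takes 4 bytes and each FP16 weight takes 2. A quick back-of-the-envelope check (pure arithmetic, no framework needed):

```python
# Approximate checkpoint size: parameter count × bytes per parameter.
params = 220_000_000          # t5-base parameter count (approximate)
fp32_mb = params * 4 / 2**20  # 4 bytes per FP32 weight, in MiB
fp16_mb = params * 2 / 2**20  # 2 bytes per FP16 weight, in MiB
print(f"FP32: ~{fp32_mb:.0f} MiB, FP16: ~{fp16_mb:.0f} MiB")
# → FP32: ~839 MiB, FP16: ~420 MiB
```

This matches the ~850 MB / ~425 MB figures quoted above to within rounding.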
## Training Data

- **Dataset:** RecipeNLG
- **Source:** Publicly available recipe dataset (downloaded as CSV)
- **Size:** 2,231,142 examples in total; a subset of 178,491 (10% of the train split) was used for training
- **Splits:**
  - **Train:** 178,491 examples (subset)
  - **Validation:** 223,114 examples
  - **Test:** 223,115 examples
- **Attributes:** `ingredients` (list of ingredients), `directions` (list of steps)
- **Preprocessing:** Stringified lists converted to plain text; inputs prefixed with `generate recipe directions from ingredients: `.
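The preprocessing step can be sketched as follows. `ast.literal_eval` is one way to parse the stringified list columns stored in the CSV; the `preprocess` helper and the exact parsing used during training are assumptions for illustration:

```python
import ast

PREFIX = "generate recipe directions from ingredients: "

def preprocess(row):
    """Turn stringified list columns into input/target text pairs."""
    # The CSV stores Python-style list literals, e.g. '["1 cup rice", "1 onion"]'.
    ingredients = ast.literal_eval(row["ingredients"])
    directions = ast.literal_eval(row["directions"])
    input_text = PREFIX + " ".join(ingredients)
    target_text = " ".join(directions)
    return input_text, target_text

row = {"ingredients": '["1 cup rice", "1 onion"]',
       "directions": '["Rinse the rice.", "Dice the onion."]'}
inp, tgt = preprocess(row)
print(inp)  # generate recipe directions from ingredients: 1 cup rice 1 onion
```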
## Training Procedure

- **Framework:** Hugging Face Transformers
- **Hyperparameters:**
  - Epochs: 2
  - Effective Batch Size: 32 (8 per device × 4 gradient accumulation steps)
  - Learning Rate: 2e-5
  - Optimizer: AdamW
  - Mixed Precision: FP16 (`fp16=True`)
- **Training Time:** ~12 hours (estimated) per epoch on the subset; a full-dataset run (3 epochs) was estimated at ~68 hours per epoch without further optimization.
- **Compute:** Single 12 GB NVIDIA GPU (CUDA-enabled).
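The hyperparameters above determine the optimizer-step count directly; a small sanity check on the effective batch size and steps per epoch (pure arithmetic):

```python
import math

train_examples = 178_491   # subset used for training
per_device_batch = 8
grad_accum_steps = 4
epochs = 2

# Effective batch size = per-device batch × gradient accumulation steps.
effective_batch = per_device_batch * grad_accum_steps
steps_per_epoch = math.ceil(train_examples / effective_batch)
total_steps = steps_per_epoch * epochs
print(effective_batch, steps_per_epoch, total_steps)  # 32 5578 11156
```

So a 2-epoch run on the subset performs roughly 11,156 optimizer steps.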
## Evaluation

- **Metrics:** Loss (to be filled in post-training)
  - Validation Loss: [TBD after training]
  - Test Loss: [TBD after evaluation]
- **Method:** Evaluated with `Trainer.evaluate()` on the validation and test splits.
- **Qualitative:** Generated directions are checked for coherence with the input ingredients (e.g., a chicken-and-rice input should yield relevant steps).
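The qualitative check can be roughly automated with an ingredient-coverage heuristic. This is an illustrative sketch only; the `coverage` helper, its word-overlap rule, and the stopword list are not part of the actual evaluation:

```python
def coverage(ingredients, directions,
             stopwords=("lb", "cup", "cups", "tbsp", "tsp", "1", "2")):
    """Fraction of ingredient content words mentioned in the generated directions."""
    text = directions.lower()
    # Collect content words from the ingredient strings, dropping units/quantities.
    words = {w.lower() for item in ingredients for w in item.split()
             if w.lower() not in stopwords}
    hits = sum(1 for w in words if w in text)
    return hits / len(words) if words else 0.0

ings = ["1 lb chicken breast", "2 cups rice"]
gen = "Season the chicken breast and cook the rice until tender."
print(round(coverage(ings, gen), 2))  # 1.0
```

A low coverage score flags outputs that ignore their input ingredients, which is useful for spot-checking generations at scale.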
## Performance

- **Results:** [TBD; e.g., "Validation Loss: X.XX, Test Loss: Y.YY after 1 epoch on subset"]
- **Strengths:** Expected to generate plausible directions for common ingredient combinations.

## Limitations

- Training on only a subset of the data may reduce generalization.
- Sporadic mismatches in the data may affect output quality.
- FP16 quantization may slightly reduce numerical precision versus FP32.
## Usage

### Installation

```bash
pip install transformers torch datasets sentencepiece
```
### Inference Example

```python
from transformers import T5Tokenizer, T5ForConditionalGeneration
import torch

model_path = "./t5_recipe_finetuned_fp16"
device = "cuda" if torch.cuda.is_available() else "cpu"

# Load the fine-tuned tokenizer and model; keep FP16 weights only on GPU.
tokenizer = T5Tokenizer.from_pretrained(model_path)
model = T5ForConditionalGeneration.from_pretrained(model_path).to(device)
if device == "cuda":
    model = model.half()

# Build the input in the same format used during fine-tuning.
ingredients = ["1 lb chicken breast", "2 cups rice", "1 onion", "2 tbsp soy sauce"]
input_text = "generate recipe directions from ingredients: " + " ".join(ingredients)
input_ids = tokenizer(input_text, return_tensors="pt",
                      max_length=128, truncation=True).input_ids.to(device)

# Generate with beam search; no_repeat_ngram_size curbs repetitive steps.
model.eval()
with torch.no_grad():
    output_ids = model.generate(
        input_ids,
        max_length=256,
        num_beams=4,
        early_stopping=True,
        no_repeat_ngram_size=2,
    )
directions = tokenizer.decode(output_ids[0], skip_special_tokens=True)
print(directions)
```