Update README.md
README.md
CHANGED
|
@@ -34,44 +34,34 @@
|
|
| 34 |
# Training Procedure
|
| 35 |
**Framework:** Hugging Face Transformers
|
| 36 |
**Hyperparameters:**
|
| 37 |
-
Epochs:
|
| 38 |
-
Effective Batch Size: 32 (8 per device, 4 gradient accumulation steps)
|
| 39 |
-
Learning Rate: 2e-5
|
| 40 |
-
Optimizer: AdamW
|
| 41 |
-
Mixed Precision: FP16 (fp16=True)
|
| 42 |
-
Training Time: ~
|
| 43 |
-
Compute: Single 12 GB GPU (NVIDIA, CUDA-enabled).
|
| 44 |
-
Evaluation
|
| 45 |
-
Metrics: Loss (to be filled post-training)
|
| 46 |
-
Validation Loss: [TBD after training]
|
| 47 |
-
Test Loss: [TBD after evaluation]
|
| 48 |
-
Method: Evaluated using Trainer.evaluate() on validation and test splits.
|
| 49 |
-
Qualitative: Generated directions checked for coherence with input ingredients (e.g., chicken and rice input should yield relevant steps).
|
| 50 |
-
Performance
|
| 51 |
-
Results: [TBD; e.g., "Validation Loss: X.XX, Test Loss: Y.YY after 1 epoch on subset"]
|
| 52 |
-
Strengths: Expected to generate plausible directions for common ingredient combinations.
|
| 53 |
-
Limitations:
|
| 54 |
-
Limited training on subset may reduce generalization.
|
| 55 |
-
Sporadic data mismatches may affect output quality.
|
| 56 |
-
FP16 quantization might slightly alter precision vs. FP32.
|
| 57 |
-
Usage
|
| 58 |
-
Installation
|
| 59 |
-
|
| 60 |
-
|
| 61 |
-
Collapse
|
| 62 |
-
|
| 63 |
-
Wrap
|
| 64 |
-
|
| 65 |
-
Copy
|
| 66 |
pip install transformers torch datasets
|
| 67 |
-
|
| 68 |
-
|
| 69 |
|
| 70 |
-
|
| 71 |
-
|
| 72 |
-
Wrap
|
| 73 |
-
|
| 74 |
-
Copy
|
| 75 |
from transformers import T5Tokenizer, T5ForConditionalGeneration
|
| 76 |
import torch
|
| 77 |
|
|
@@ -88,22 +78,4 @@ with torch.no_grad():
|
|
| 88 |
output_ids = model.generate(input_ids, max_length=256, num_beams=4, early_stopping=True, no_repeat_ngram_size=2)
|
| 89 |
directions = tokenizer.decode(output_ids[0], skip_special_tokens=True)
|
| 90 |
print(directions)
|
| 91 |
-
|
| 92 |
-
Location: ./t5_recipe_finetuned_fp16
|
| 93 |
-
Size: ~425 MB (FP16 weights)
|
| 94 |
-
Limitations and Biases
|
| 95 |
-
Data Quality: Some RecipeNLG entries have mismatched ingredients and directions, potentially leading to nonsensical outputs.
|
| 96 |
-
Scope: Trained only on English recipes; may not handle non-English inputs or exotic cuisines well.
|
| 97 |
-
Bias: Reflects biases in RecipeNLG (e.g., Western cuisine dominance).
|
| 98 |
-
Quantization: FP16 may introduce minor numerical differences vs. FP32, though mitigated by FP16 training.
|
| 99 |
-
Ethical Considerations
|
| 100 |
-
Use: Should not be used to replace professional culinary expertise without validation.
|
| 101 |
-
Safety: Generated directions aren’t guaranteed to be safe or accurate (e.g., cooking times, temperatures).
|
| 102 |
-
Contact
|
| 103 |
-
Author: [Your Name/Group Name]
|
| 104 |
-
Support: [Your Email/GitHub, if applicable]
|
| 105 |
-
Citation
|
| 106 |
-
If you use this model, please cite:
|
| 107 |
-
|
| 108 |
-
RecipeNLG dataset: [Add citation if available]
|
| 109 |
-
T5: Raffel et al., "Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer" (2020)
|
|
|
|
| 34 |
# Training Procedure
|
| 35 |
**Framework:** Hugging Face Transformers
|
| 36 |
**Hyperparameters:**
|
| 37 |
+
- Epochs: 2
|
| 38 |
+
- Effective Batch Size: 32 (8 per device, 4 gradient accumulation steps)
|
| 39 |
+
- Learning Rate: 2e-5
|
| 40 |
+
- Optimizer: AdamW
|
| 41 |
+
- Mixed Precision: FP16 (fp16=True)
|
| 42 |
+
- Training Time: ~12 hours (estimated) for 1 epoch on the subset; for the full dataset (3 epochs), ~68 hours per epoch (estimated) without optimization.
|
| 43 |
+
- Compute: Single 12 GB GPU (NVIDIA, CUDA-enabled).
|
| 44 |
+
# Evaluation
|
| 45 |
+
- Metrics: Loss (to be filled post-training)
|
| 46 |
+
- Validation Loss: [TBD after training]
|
| 47 |
+
- Test Loss: [TBD after evaluation]
|
| 48 |
+
- Method: Evaluated using Trainer.evaluate() on validation and test splits.
|
| 49 |
+
- Qualitative: Generated directions checked for coherence with input ingredients (e.g., chicken and rice input should yield relevant steps).
|
| 50 |
+
# Performance
|
| 51 |
+
- Results: [TBD; e.g., "Validation Loss: X.XX, Test Loss: Y.YY after 1 epoch on subset"]
|
| 52 |
+
- Strengths: Expected to generate plausible directions for common ingredient combinations.
|
| 53 |
+
# Limitations
|
| 54 |
+
- Limited training on subset may reduce generalization.
|
| 55 |
+
- Sporadic data mismatches may affect output quality.
|
| 56 |
+
- FP16 (half-precision) weights may introduce small numerical differences relative to FP32.
|
| 57 |
+
# Usage
|
| 58 |
+
# Installation
|
| 59 |
+
```bash
|
| 60 |
pip install transformers torch datasets
|
| 61 |
+
```
|
| 62 |
+
# Inference Example
|
| 63 |
|
| 64 |
+
```python
|
| 65 |
from transformers import T5Tokenizer, T5ForConditionalGeneration
|
| 66 |
import torch
|
| 67 |
|
|
|
|
| 78 |
output_ids = model.generate(input_ids, max_length=256, num_beams=4, early_stopping=True, no_repeat_ngram_size=2)
|
| 79 |
directions = tokenizer.decode(output_ids[0], skip_special_tokens=True)
|
| 80 |
print(directions)
|
| 81 |
+
```
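
The effective batch size listed under Hyperparameters follows from the per-device batch size and the gradient accumulation steps. Below is a minimal sketch of that arithmetic, assuming a single GPU as stated under Compute. The helper `effective_batch_size` and the `training_kwargs` dictionary are illustrative names, not taken from the original training script (which this commit does not show); the dictionary keys are chosen to match `TrainingArguments` parameter names in Hugging Face Transformers.

```python
def effective_batch_size(per_device: int, grad_accum_steps: int, num_devices: int = 1) -> int:
    """Number of samples that contribute to each optimizer step."""
    return per_device * grad_accum_steps * num_devices


# Hyperparameters from the README, keyed by TrainingArguments parameter names.
# Illustrative only: the actual training script is not part of this commit.
training_kwargs = {
    "num_train_epochs": 2,
    "per_device_train_batch_size": 8,
    "gradient_accumulation_steps": 4,
    "learning_rate": 2e-5,
    "fp16": True,  # mixed-precision training; requires a CUDA GPU
}

batch = effective_batch_size(
    training_kwargs["per_device_train_batch_size"],
    training_kwargs["gradient_accumulation_steps"],
)
print(batch)  # 8 * 4 * 1 = 32
```

These keys could be unpacked into `TrainingArguments(**training_kwargs)` together with an `output_dir` of your choice. In recent Transformers releases the `Trainer` defaults to an AdamW optimizer, so the optimizer listed above should need no explicit setting.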