sumitdotml committed on
Commit
41f25ff
·
verified ·
1 Parent(s): 6c90a19

Update model card with project details, eval results, and usage instructions

Files changed (1)
  1. README.md +168 -35
README.md CHANGED
@@ -1,59 +1,192 @@
1
  ---
2
  base_model: mistralai/Ministral-8B-Instruct-2410
3
- library_name: transformers
4
- model_name: robuchan
 
 
5
  tags:
6
- - generated_from_trainer
7
- - trl
8
- - hf_jobs
9
- - sft
10
- licence: license
11
  ---
12
 
13
- # Model Card for robuchan
14
 
15
- This model is a fine-tuned version of [mistralai/Ministral-8B-Instruct-2410](https://huggingface.co/mistralai/Ministral-8B-Instruct-2410).
16
- It has been trained using [TRL](https://github.com/huggingface/trl).
17
 
18
- ## Quick start
19
 
20
- ```python
21
- from transformers import pipeline
 
22
 
23
- question = "If you had a time machine, but could only go to the past or the future once and never return, which would you choose and why?"
24
- generator = pipeline("text-generation", model="sumitdotml/robuchan", device="cuda")
25
- output = generator([{"role": "user", "content": question}], max_new_tokens=128, return_full_text=False)[0]
26
- print(output["generated_text"])
27
  ```
28
 
29
- ## Training procedure
30
 
31
- [<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Visualize in Weights & Biases" width="150" height="24"/>](https://wandb.ai/sumit-ml/robuchan/runs/p50dki7k)
32
 
 
33
 
 
34
 
35
- This model was trained with SFT.
 
 
 
 
36
 
37
- ### Framework versions
 
 
 
38
 
39
  - TRL: 0.29.0
40
  - Transformers: 5.2.0
41
- - Pytorch: 2.6.0+cu124
42
  - Datasets: 4.6.1
43
- - Tokenizers: 0.22.2
44
-
45
- ## Citations
46
 
 
47
 
48
-
49
- Cite TRL as:
50
-
51
  ```bibtex
52
- @software{vonwerra2020trl,
53
- title = {{TRL: Transformers Reinforcement Learning}},
54
- author = {von Werra, Leandro and Belkada, Younes and Tunstall, Lewis and Beeching, Edward and Thrush, Tristan and Lambert, Nathan and Huang, Shengyi and Rasul, Kashif and Gallouédec, Quentin},
55
- license = {Apache-2.0},
56
- url = {https://github.com/huggingface/trl},
57
- year = {2020}
58
  }
59
- ```
 
1
  ---
2
  base_model: mistralai/Ministral-8B-Instruct-2410
3
+ library_name: peft
4
+ license: apache-2.0
5
+ language:
6
+ - en
7
  tags:
8
+ - recipe-adaptation
9
+ - dietary-restrictions
10
+ - culinary
11
+ - sft
12
+ - lora
13
+ - trl
14
+ - hf_jobs
15
+ - mistral-hackathon
16
+ datasets:
17
+ - sumitdotml/robuchan-data
18
+ pipeline_tag: text-generation
19
+ model-index:
20
+ - name: robuchan
21
+ results:
22
+ - task:
23
+ type: text-generation
24
+ name: Recipe Dietary Adaptation
25
+ metrics:
26
+ - name: Format Compliance
27
+ type: format_compliance
28
+ value: 1.0
29
+ verified: false
30
+ - name: Dietary Constraint Compliance
31
+ type: constraint_compliance
32
+ value: 0.33
33
+ verified: false
34
  ---
35
 
36
+ # Robuchan
37
 
38
+ A LoRA adapter for [Ministral-8B-Instruct-2410](https://huggingface.co/mistralai/Ministral-8B-Instruct-2410) fine-tuned on synthetic dietary recipe adaptations.
 
39
 
40
+ Given a recipe and a dietary restriction (vegan, gluten-free, dairy-free, etc.), Robuchan produces a structured adaptation with ingredient substitutions, updated steps, flavor preservation notes, and a compliance self-check.
41
 
42
+ Built for the [Mistral AI Worldwide Hackathon Tokyo](https://worldwide-hackathon.mistral.ai/) (Feb 28 - Mar 1, 2026).
43
+
44
+ ## Usage
45
 
46
+ ```python
47
+ from peft import PeftModel
48
+ from transformers import AutoModelForCausalLM, AutoTokenizer
49
+
50
+ base_model = AutoModelForCausalLM.from_pretrained(
51
+ "mistralai/Ministral-8B-Instruct-2410",
52
+ device_map="auto",
53
+ load_in_4bit=True,
54
+ )
55
+ model = PeftModel.from_pretrained(base_model, "sumitdotml/robuchan")
56
+ tokenizer = AutoTokenizer.from_pretrained("sumitdotml/robuchan")
57
+
58
+ messages = [
59
+ {
60
+ "role": "system",
61
+ "content": (
62
+ "You are a culinary adaptation assistant. "
63
+ "Priority: (1) strict dietary compliance, (2) preserve dish identity and flavor profile, "
64
+ "(3) keep instructions practical and cookable. "
65
+ "Never include forbidden ingredients or their derivatives (stocks, sauces, pastes, broths). "
66
+ "If no exact compliant substitute exists, acknowledge the gap, choose the closest viable option, "
67
+ "and state the trade-off. "
68
+ "Output sections exactly: Substitution Plan, Adapted Ingredients, Adapted Steps, "
69
+ "Flavor Preservation Notes, Constraint Check."
70
+ ),
71
+ },
72
+ {
73
+ "role": "user",
74
+ "content": (
75
+ "Recipe: Mapo Tofu\n"
76
+ "Cuisine: Sichuan Chinese\n"
77
+ "Ingredients: 400g firm tofu, 200g ground pork, 2 tbsp doubanjiang, "
78
+ "1 tbsp oyster sauce, 3 cloves garlic, 1 inch ginger, 2 scallions, "
79
+ "1 tbsp cornstarch, 2 tbsp neutral oil\n"
80
+ "Steps: 1) Brown pork in oil until crispy. 2) Add minced garlic, ginger, "
81
+ "and doubanjiang; stir-fry 30 seconds. 3) Add tofu cubes and 1 cup water; "
82
+ "simmer 8 minutes. 4) Mix cornstarch slurry and stir in to thicken. "
83
+ "5) Garnish with sliced scallions.\n"
84
+ "Restrictions: vegetarian, shellfish-free\n"
85
+ "Must Keep Flavor Notes: mala heat, savory umami, silky sauce"
86
+ ),
87
+ },
88
+ ]
89
+
90
+ inputs = tokenizer.apply_chat_template(messages, return_tensors="pt", add_generation_prompt=True)
91
+ inputs = inputs.to(model.device)
92
+ outputs = model.generate(inputs, max_new_tokens=1024, temperature=0.7, do_sample=True)
93
+ print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
94
  ```
95
 
96
+ ## Output Format
97
+
98
+ The model produces five structured sections:
99
+
100
+ | Section | Content |
101
+ |---------|---------|
102
+ | **Substitution Plan** | One row per banned ingredient: `original -> replacement (rationale)` |
103
+ | **Adapted Ingredients** | Full ingredient list with quantities — no placeholders |
104
+ | **Adapted Steps** | Complete numbered cooking steps reflecting all substitutions |
105
+ | **Flavor Preservation Notes** | 3+ notes on how taste/texture/aroma are maintained |
106
+ | **Constraint Check** | Explicit checklist confirming all violations resolved |
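The five sections in the table above lend themselves to a deterministic check. The following is a minimal sketch of such a check, not the project's actual parser: the section names come from the table, while the matching rule (each name on its own line, optionally wrapped in markdown heading or bold markers) and the function names are assumptions made for illustration.

```python
import re

# Section names are taken from the table above; the matching rule is an assumption.
REQUIRED_SECTIONS = [
    "Substitution Plan",
    "Adapted Ingredients",
    "Adapted Steps",
    "Flavor Preservation Notes",
    "Constraint Check",
]


def missing_sections(output: str) -> list[str]:
    """Return the required section names that never appear as a header line."""
    missing = []
    for name in REQUIRED_SECTIONS:
        # Allow optional markdown decoration (#, *, trailing colon) around the name,
        # but require the name to sit on a line of its own.
        pattern = rf"^[#*\s]*{re.escape(name)}[*\s:]*$"
        if not re.search(pattern, output, flags=re.IGNORECASE | re.MULTILINE):
            missing.append(name)
    return missing


def is_format_compliant(output: str) -> bool:
    """True when all five section headers are present."""
    return not missing_sections(output)
```

A response passes only when `missing_sections` returns an empty list, so the list doubles as a diagnostic for which header the model dropped.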
107
+
108
+ ## Training
109
+
110
+ | Detail | Value |
111
+ |--------|-------|
112
+ | Base model | `mistralai/Ministral-8B-Instruct-2410` |
113
+ | Method | QLoRA SFT via [TRL](https://github.com/huggingface/trl) on HF Jobs (A10G) |
114
+ | LoRA rank | 16 |
115
+ | LoRA alpha | 32 |
116
+ | LoRA dropout | 0.05 |
117
+ | Target modules | `q_proj`, `k_proj`, `v_proj`, `o_proj` |
118
+ | Training examples | 1,090 |
119
+ | Validation examples | 122 |
120
+ | Epochs completed | ~0.95 (out of memory during the epoch-boundary eval on a 24 GB A10G) |
121
+ | Final train loss | 0.77 |
122
+
123
+ ### Dataset
124
+
125
+ Training data was synthetically generated from [Food.com's 530K recipe corpus](https://www.kaggle.com/datasets/irkaal/foodcom-recipes-and-reviews/data):
126
+
127
+ 1. Keep only source recipes that violate at least one supported dietary constraint (these are the ones that need adapting)
128
+ 2. Generate structured adaptations using `mistral-large-latest`
129
+ 3. Score each candidate with deterministic quality checks (constraint compliance, ingredient relevance, structural completeness)
130
+ 4. Keep only passing candidates — single candidate per recipe, drop on fail
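The four steps above can be sketched as a single loop. This is an illustrative outline under stated assumptions, not the project's generation script: `build_dataset` and its three callables (`violates`, `generate`, `passes_checks`) are hypothetical names, and `generate` stands in for the call to `mistral-large-latest`.

```python
from typing import Callable, Iterable, Optional


def build_dataset(
    recipes: Iterable[dict],
    violates: Callable[[dict], Optional[str]],
    generate: Callable[[dict, str], str],
    passes_checks: Callable[[dict, str, str], bool],
) -> list[dict]:
    """Filter, generate, score, and keep-or-drop, one candidate per recipe."""
    kept = []
    for recipe in recipes:
        # Step 1: skip recipes that already satisfy every supported constraint.
        restriction = violates(recipe)
        if restriction is None:
            continue
        # Step 2: a single candidate adaptation (hypothetical stand-in for the LLM call).
        candidate = generate(recipe, restriction)
        # Step 3: deterministic quality checks decide pass or fail.
        if passes_checks(recipe, restriction, candidate):
            kept.append(
                {"recipe": recipe, "restriction": restriction, "adaptation": candidate}
            )
        # Step 4: a failing candidate is dropped; there is no retry.
    return kept
```

Passing the three stages in as callables keeps the loop itself free of network calls, so the drop-on-fail behaviour can be exercised with stubs.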
131
+
132
+ The dataset covers 10 dietary categories: vegan, vegetarian, dairy-free, gluten-free, nut-free, egg-free, shellfish-free, low-sodium, low-sugar, low-fat.
133
+
134
+ Three prompt templates (labeled-block, natural-request, goal-oriented) are mixed at a 50/30/20 split to reduce overfitting to a single prompt format.
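One way to realise a 50/30/20 template split is to allocate exact counts and then shuffle. The sketch below assumes that approach; the card does not say whether templates were assigned by exact allocation or by weighted sampling, and `assign_templates` is a hypothetical helper. The figure 1,090 is used only because it is the training-set size reported above.

```python
import random
from collections import Counter

# Template names and percentages come from the card; the allocation method is assumed.
TEMPLATE_SHARES = [
    ("labeled-block", 50),
    ("natural-request", 30),
    ("goal-oriented", 20),
]


def assign_templates(n_examples: int, seed: int = 0) -> list[str]:
    """Assign one template per example with exact 50/30/20 counts, then shuffle."""
    assignments: list[str] = []
    for name, percent in TEMPLATE_SHARES[:-1]:
        assignments += [name] * (n_examples * percent // 100)
    # The last template absorbs any rounding remainder.
    last_name = TEMPLATE_SHARES[-1][0]
    assignments += [last_name] * (n_examples - len(assignments))
    # A seeded shuffle keeps the assignment reproducible.
    random.Random(seed).shuffle(assignments)
    return assignments


counts = Counter(assign_templates(1090))
```

Exact allocation avoids the sampling noise that weighted random choice would add on a dataset of about a thousand rows.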
135
 
136
+ Dataset: [`sumitdotml/robuchan-data`](https://huggingface.co/datasets/sumitdotml/robuchan-data)
137
 
138
+ ## Evaluation
139
 
140
+ Three-layer evaluation: format compliance (deterministic header parsing), dietary constraint compliance (regex against banned-ingredient lists), and LLM-as-judge via `mistral-large-latest`.
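The second layer, a regex check against banned-ingredient lists, might look like the sketch below. The `BANNED` lists here are deliberately tiny and hypothetical, and `find_violations` is an illustrative name; the project's real lists and matching rules are not given in the card. The sketch also assumes that only the adapted ingredients and steps are passed in, since a Substitution Plan names the original ingredients by design and would be flagged otherwise.

```python
import re

# Hypothetical, deliberately small banned-ingredient lists for illustration only.
BANNED = {
    "vegetarian": ["pork", "beef", "chicken", "oyster sauce", "fish sauce"],
    "shellfish-free": ["shrimp", "crab", "oyster sauce"],
}


def find_violations(text: str, restrictions: list[str]) -> dict[str, list[str]]:
    """Map each restriction to the banned terms found in the text."""
    violations: dict[str, list[str]] = {}
    for restriction in restrictions:
        hits = [
            term
            for term in BANNED[restriction]
            # Word boundaries avoid substring hits; the optional "s" covers simple plurals.
            if re.search(rf"\b{re.escape(term)}s?\b", text, flags=re.IGNORECASE)
        ]
        if hits:
            violations[restriction] = hits
    return violations
```

An output counts as compliant when the returned dictionary is empty for every requested restriction.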
141
 
142
+ | Metric | Baseline (`mistral-small-latest`, n=50) | Robuchan (n=3) | Delta |
143
+ |--------|----------------------------------------:|---------------:|------:|
144
+ | Format Compliance | 14% | 100% | **+86pp** |
145
+ | Constraint Compliance | 0% | 33% | **+33pp** |
146
+ | Judge Overall Score | 9.20/10 | — | — |
147
 
148
+ **Key findings:**
149
+ - The baseline (`mistral-small-latest`) writes fluent recipe adaptations but rarely produces the required structure (only 14% of outputs contain all 5 sections) and fails dietary compliance entirely (0% pass the banned-ingredient check).
150
+ - On the 3 evaluated rows, Robuchan produces all 5 sections every time (100%) and passes the constraint check on 1 of 3 (33%); more training would likely improve compliance further.
151
+ - The LLM judge overestimates compliance (9.88/10 for the baseline despite a 0% deterministic pass rate); it measures *attempt quality*, not correctness.
152
 
153
+ W&B: [sumit-ml/robuchan](https://wandb.ai/sumit-ml/robuchan/runs/uuj6tmlo)
154
+
155
+ ## Limitations
156
+
157
+ - **Small eval sample.** Only 3 rows were evaluated on the fine-tuned model before the HF Space crashed. Results are directionally strong but not statistically robust.
158
+ - **Partial training.** The adapter was saved about 95% of the way through epoch 1. More training would likely improve constraint compliance.
159
+ - **English only.** Training data and evaluation are English-language recipes only.
160
+ - **Not safety-tested.** This model is a hackathon prototype. Do not rely on it for medical dietary advice (severe allergies, celiac disease, etc.).
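To make the small-sample caveat above concrete, a confidence interval can be attached to the reported rates. The sketch below uses the Wilson score interval, which is a choice made here for illustration and not something the evaluation itself reports; `wilson_interval` is a hypothetical helper, and the counts (1 of 3 and 3 of 3) follow from the 33% and 100% figures at n=3.

```python
import math


def wilson_interval(successes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (z=1.96 for about 95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    # Clamp to [0, 1] to absorb floating-point overshoot at the extremes.
    return max(0.0, center - half_width), min(1.0, center + half_width)


constraint_lo, constraint_hi = wilson_interval(1, 3)  # 33% constraint compliance
format_lo, format_hi = wilson_interval(3, 3)          # 100% format compliance
```

For 1 pass out of 3, the interval spans roughly 6% to 79%, which is why the 33% figure should be read as a direction and not as an estimate.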
161
+
162
+ ## Links
163
+
164
+ - Code: [github.com/sumitdotml/robuchan](https://github.com/sumitdotml/robuchan)
165
+ - Dataset: [sumitdotml/robuchan-data](https://huggingface.co/datasets/sumitdotml/robuchan-data)
166
+ - Demo Space: [sumitdotml/robuchan-demo](https://huggingface.co/spaces/sumitdotml/robuchan-demo)
167
+ - Demo video: [YouTube](https://www.youtube.com/watch?v=LIlsP0OqTf4)
168
+ - W&B: [sumit-ml/robuchan](https://wandb.ai/sumit-ml/robuchan)
169
+
170
+ ## Authors
171
+
172
+ - [sumitdotml](https://github.com/sumitdotml)
173
+ - [Kaustubh Hiware](https://github.com/kaustubhhiware)
174
+
175
+ ## Framework Versions
176
+
177
+ - PEFT: 0.18.1
178
  - TRL: 0.29.0
179
  - Transformers: 5.2.0
180
+ - PyTorch: 2.6.0+cu124
181
  - Datasets: 4.6.1
 
 
 
182
 
183
+ ## Citation
184
 
 
 
 
185
  ```bibtex
186
+ @misc{robuchan2026,
187
+ title = {Robuchan: Recipe Dietary Adaptation via Fine-Tuned Ministral-8B},
188
+ author = {sumitdotml and Hiware, Kaustubh},
189
+ year = {2026},
190
+ url = {https://huggingface.co/sumitdotml/robuchan}
 
191
  }
192
+ ```