--- base_model: - joackimagno/Qwen-2.5-General-Recipe-Generation - Qwen/Qwen2.5-7B tags: - text-generation-inference - transformers - unsloth - qwen2 license: apache-2.0 language: - en datasets: - joackimagno/general-recipes - joackimagno/FILIPINO_RECIPES_2K_V2 metrics: - bleu - rouge - meteor model-index: - name: MASID-v1.2 results: - task: name: Text Generation type: text-generation dataset: name: joackimagno/FILIPINO_RECIPES_2K_V2 type: joackimagno/FILIPINO_RECIPES_2K_V2 split: test metrics: - name: BLEU-4 type: bleu value: 0.07 - name: METEOR type: meteor value: 0.35 - name: ROUGE-L (F1) type: rouge value: 0.32 unit: f1 config: rougeL --- # MASID-v1.2 **MASID-v1.2** is a transfer-learned Filipino main-dish recipe generator. It is trained **on top of** the base model **[`joackimagno/Qwen-2.5-General-Recipe-Generation`](https://huggingface.co/joackimagno/Qwen-2.5-General-Recipe-Generation)**, which itself was fine-tuned from **Qwen2.5-7B** using **~60k** general recipes from **`joackimagno/general-recipes`**. **MASID-v1.2** then performs **a second-stage fine-tuning** on **`joackimagno/FILIPINO_RECIPES_2K_V2` (~2k)** to specialize in **Filipino main dish generation**. The goal is to generate structured and culturally faithful Filipino recipes while benefiting from broader cooking knowledge learned during the general-recipe stage. --- ## Model Details - **Base Model (stage 0)**: [Qwen/Qwen2.5-7B](https://huggingface.co/Qwen/Qwen2.5-7B) - **Intermediate Model (stage 1)**: [`joackimagno/Qwen-2.5-General-Recipe-Generation`](https://huggingface.co/joackimagno/Qwen-2.5-General-Recipe-Generation) — trained on ~60k general recipes - **Specialization Dataset (stage 2)**: [`joackimagno/FILIPINO_RECIPES_2K_V2`](https://huggingface.co/datasets/joackimagno/FILIPINO_RECIPES_2K_V2) (~2,000 samples) - **Objective**: Recipe text generation (Filipino cuisine, main dishes) - **Method**: Transfer learning (continued fine-tuning from the general-recipe model) --- ## Intended Use - Assisting in **recipe writing** - Exploring **Filipino food culture** - Generating **cooking instructions** in natural language --- ## Limitations - Trained on a relatively **small Filipino dataset (~2k)** for the specialization stage. - May occasionally produce **hallucinated ingredients** or **imprecise steps**. - Not a substitute for **nutrition** or **food-safety** advice. - Best for **research, education, and creative** use cases. --- ## Evaluation | Dataset | Split | BLEU-4 | METEOR | ROUGE-L (F1) | |------------------------------------|:-----:|:------:|:------:|:------------:| | joackimagno/FILIPINO_RECIPES_2K_V2 | test | 0.07 | 0.35 | 0.32 | > Notes: Evaluated with Alpaca-style prompting; simple post-processing (strip, EOS truncation). > If you rerun evaluation, pin dataset and package versions for reproducibility. ## Dataset Comparison: | Dataset | Description | |------------------------------------|:------------:| | joackimagno/FILIPINO_RECIPES_2K| Ingredient Name excludes basic pantry items (e.g. oil, water) but includes any ingredients| | joackimagno/FILIPINO_RECIPES_2K_V2 | Ingredient Name only contains classified ingredients from the small object detection model| --- --- This Qwen2 model was trained **2× faster** with [Unsloth](https://github.com/unslothai/unsloth) and Hugging Face’s TRL library. [](https://github.com/unslothai/unsloth) ## Example Usage ```python from typing import List import torch from transformers import AutoModelForCausalLM, AutoTokenizer, GenerationConfig # Load model and tokenizer model_name = "joackimagno/MASID-v1.2" tokenizer = AutoTokenizer.from_pretrained(model_name) model = AutoModelForCausalLM.from_pretrained( model_name, torch_dtype=torch.float16, device_map="auto", ) # ============================================================== # Alpaca-style prompt # ============================================================== SYSTEM_INSTRUCTION = ( "You are a Filipino chef. Generate Filipino MAIN DISH recipes.\n" "Follow these output rules:\n" "1) Use standard stovetop or oven methods.\n" "2) Keep steps concise and logically ordered.\n" "3) Output FORMAT and ORDER must be exactly:\n" " Recipe name, Prep time, Cook time, Total time, Servings,\n" " Full Ingredients (numbered list), Instructions (numbered list)" ) ALPACA_TEMPLATE = ( "Below is an instruction that describes a task, paired with an input that " "provides further context. Write a response that appropriately completes the request.\n\n" "### Instruction:\n{}\n\n### Input:\n{}\n\n### Response:\n{}" ) def make_model_input_from_ing(ing_names: List[str]) -> str: return ( "Ingredients to use: " + ", ".join(ing_names) + ".\n" "Task: create a Filipino main dish recipe using these ingredients. " "Keep steps concise, clear, and coherent." ) # Example input ing_names = ["Beef", "Potato", "Sili", "Carrot", "Sayote"] alpaca_prompt = ALPACA_TEMPLATE.format( SYSTEM_INSTRUCTION, make_model_input_from_ing(ing_names), "" # leave response empty for model to generate ) # ============================================================== # Run inference # ============================================================== inputs = tokenizer(alpaca_prompt, return_tensors="pt").to(model.device) gen_config = GenerationConfig( max_new_tokens=512, temperature=0.7, top_p=0.9, do_sample=True, ) outputs = model.generate(**inputs, generation_config=gen_config) generated = tokenizer.decode( outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True ) print(generated.strip())