Qwen2.5-7B-Chef-VN

Qwen2.5-7B-Chef-VN is a fine-tuned large language model specialized in the culinary domain. Acting as a "Master Chef", it provides detailed, step-by-step cooking instructions, exact ingredient measurements, and culinary advice primarily in Vietnamese.

Model Details

Model Description

This model was fine-tuned from Qwen/Qwen2.5-7B-Instruct using supervised fine-tuning (SFT) with QLoRA. The training data comes from the AkashPS11/recipes_data_food.com dataset, which was parsed and formatted into a conversational ChatML structure so that the model learns to guide users through recipes interactively.

  • Developed by: NotIsora (Đoàn Thiên An)
  • Model type: Causal Language Model (Fine-tuned via LoRA)
  • Language(s) (NLP): Vietnamese (vi), English (en)
  • License: Apache 2.0
  • Finetuned from model: Qwen/Qwen2.5-7B-Instruct

Uses

Direct Use

The model is intended to be used as a virtual chef or culinary assistant. Users can input a dish name or a list of available ingredients, and the model will return a comprehensive cooking guide including:

  • Ingredient lists with quantities.
  • Step-by-step preparation and cooking instructions.
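The two input styles described above (a dish name or a list of available ingredients) can be wrapped in a small helper that builds the chat messages. This is a minimal sketch: `build_messages` is a hypothetical helper, the system prompt is the one used in this card's quick-start example, and the Vietnamese phrasing of the ingredient request is an assumption rather than a prompt known to appear in the training data.

```python
# System prompt copied from the quick-start example in this card.
SYSTEM_PROMPT = (
    "Bạn là một siêu đầu bếp. Người dùng sẽ cung cấp nguyên liệu hoặc một món ăn, "
    "nhiệm vụ của bạn là hướng dẫn họ cách nấu chi tiết và ngon nhất."
)


def build_messages(dish=None, ingredients=None):
    """Build a chat message list from a dish name or a list of ingredients."""
    if dish:
        # Same wording as the quick-start example: "Please guide me to cook <dish>."
        user = f"Hãy hướng dẫn tôi nấu ăn món {dish}."
    elif ingredients:
        # Assumed wording: "I have these ingredients: ... Please guide me to cook a dish with them."
        user = (
            "Tôi có các nguyên liệu sau: " + ", ".join(ingredients)
            + ". Hãy hướng dẫn tôi nấu một món ăn từ các nguyên liệu này."
        )
    else:
        raise ValueError("Provide either a dish name or a list of ingredients.")
    return [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": user},
    ]
```

The returned list can be passed directly to `tokenizer.apply_chat_template`.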

Out-of-Scope Use

The model should not be used for:

  • Medical or dietary advice (e.g., prescribing diets for medical conditions).
  • Generating harmful, toxic, or unsafe content.
  • Tasks entirely unrelated to food, cooking, or culinary arts (its performance may degrade outside its specialized domain).

If You Want to Train It Yourself

Bias, Risks, and Limitations

Although the model generates detailed recipes, cooking carries physical risks (e.g., knives, hot surfaces, foodborne illness, allergies). Users should exercise common sense and verify food safety standards independently. The model may occasionally hallucinate ingredients or steps that do not match traditional recipes.

How to Get Started with the Model

Use the code below to get started with the model.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

REPO_ID = "NotIsora/Qwen2.5-7B-Chef-VN"

tokenizer = AutoTokenizer.from_pretrained(REPO_ID)
model = AutoModelForCausalLM.from_pretrained(
    REPO_ID,
    device_map="auto",
    torch_dtype=torch.bfloat16,
)

messages = [
    {"role": "system", "content": "Bạn là một siêu đầu bếp. Người dùng sẽ cung cấp nguyên liệu hoặc một món ăn, nhiệm vụ của bạn là hướng dẫn họ cách nấu chi tiết và ngon nhất."},
    {"role": "user", "content": "Hãy hướng dẫn tôi nấu ăn món khoai tây nghiền."}
]

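# Render the conversation with the model's chat template and open the assistant turn.
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
# Send inputs to the device chosen by device_map="auto" instead of hard-coding "cuda".
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)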

with torch.inference_mode():
    outputs = model.generate(
        **inputs,
        max_new_tokens=1024,
        temperature=0.7,
        top_p=0.9,
        do_sample=True,
        repetition_penalty=1.15,
    )

# Decode only the newly generated tokens (everything after the prompt).
response = tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)
print(response)

Training Details

Training Data

  • The model was trained on a processed subset of the AkashPS11/recipes_data_food.com dataset. The data was filtered, parsed, and converted into ChatML format to simulate a user asking for a recipe and a chef responding with structured instructions.
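To illustrate what converting a recipe into ChatML might look like, here is a minimal sketch. The record fields (`name`, `ingredients`, `steps`), the user question wording, the assistant answer layout, and the sample record are all made up for illustration. The dataset's actual columns and the author's parsing code are not documented in this card and may differ.

```python
def recipe_to_chatml(record, system_prompt):
    """Render one recipe record as a single ChatML-formatted training example."""
    ingredients = "\n".join(f"- {item}" for item in record["ingredients"])
    steps = "\n".join(
        f"{i}. {step}" for i, step in enumerate(record["steps"], start=1)
    )
    messages = [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": f"How do I make {record['name']}?"},
        {
            "role": "assistant",
            "content": f"Ingredients:\n{ingredients}\n\nSteps:\n{steps}",
        },
    ]
    # ChatML wraps each turn as <|im_start|>{role}\n{content}<|im_end|>
    return "".join(
        f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n" for m in messages
    )


# Hypothetical record, not taken from the dataset.
example = {
    "name": "mashed potatoes",
    "ingredients": ["2 lb potatoes", "4 tbsp butter"],
    "steps": ["Boil the potatoes until tender.", "Mash with butter."],
}
text = recipe_to_chatml(example, "You are a master chef.")
print(text)
```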

Training Procedure

  • The model was trained with QLoRA, a parameter-efficient fine-tuning method, to reduce VRAM usage while maintaining performance.
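A rough back-of-the-envelope calculation shows why QLoRA helps with VRAM. QLoRA as generally described keeps the frozen base weights in 4-bit precision, whereas loading them in bf16 takes 16 bits per parameter. The sketch below assumes a parameter count of about 7.6 billion for the base model and counts weights only. It ignores quantization constants, activations, the LoRA adapter, and optimizer state, and the exact quantization settings used for this model are not stated in this card.

```python
def weight_memory_gib(num_params, bits_per_param):
    """Approximate memory needed for model weights alone, in GiB."""
    return num_params * bits_per_param / 8 / 1024**3


NUM_PARAMS = 7.6e9  # assumption: approximate parameter count of the base model

bf16_gib = weight_memory_gib(NUM_PARAMS, 16)  # weights stored in bf16
nf4_gib = weight_memory_gib(NUM_PARAMS, 4)    # weights stored in 4-bit

print(f"bf16 weights:  {bf16_gib:.1f} GiB")
print(f"4-bit weights: {nf4_gib:.1f} GiB")
```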

Training Hyperparameters

  • Training regime: bf16 mixed precision
  • Epochs: 6
  • Max Sequence Length: 1024
  • Per-device Batch Size: 2
  • Gradient Accumulation Steps: 4
  • Optimizer: paged_adamw_8bit
  • Learning Rate: 5e-5
  • Learning Rate Scheduler: Cosine
  • Warmup Ratio: 0.1
  • LoRA Rank (r): 16
  • LoRA Alpha: 32
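Some quantities follow directly from the hyperparameters above. The sketch below computes the effective batch size, the LoRA scaling factor (alpha divided by rank), and the number of trainable parameters LoRA adds to one linear layer. The single-GPU count comes from the Hardware section. The hidden size of 3584 and the choice of a square projection layer are assumptions used only for illustration, since the adapted target modules are not listed in this card.

```python
# Values taken from the hyperparameter list in this card.
PER_DEVICE_BATCH_SIZE = 2
GRADIENT_ACCUMULATION_STEPS = 4
NUM_GPUS = 1
LORA_RANK = 16
LORA_ALPHA = 32

# Number of sequences contributing to each optimizer step.
effective_batch_size = PER_DEVICE_BATCH_SIZE * GRADIENT_ACCUMULATION_STEPS * NUM_GPUS

# LoRA scales the low-rank update B @ A by alpha / r.
lora_scaling = LORA_ALPHA / LORA_RANK


def lora_params(d_in, d_out, rank):
    """Trainable parameters LoRA adds to one linear layer (A: rank x d_in, B: d_out x rank)."""
    return rank * (d_in + d_out)


HIDDEN_SIZE = 3584  # assumption: hidden size of the base model
per_layer = lora_params(HIDDEN_SIZE, HIDDEN_SIZE, LORA_RANK)

print(f"effective batch size: {effective_batch_size}")
print(f"LoRA scaling: {lora_scaling}")
print(f"LoRA params for one {HIDDEN_SIZE}x{HIDDEN_SIZE} layer: {per_layer}")
```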

Technical Specifications

Compute Infrastructure:

  • The model was trained on Google Colab.

Hardware:

  • GPU: 1x NVIDIA L4 / T4 Tensor Core GPU

Software:

  • PyTorch
  • Transformers
  • PEFT
  • TRL
  • BitsAndBytes
  • FlashAttention / SDPA

Model Card Contact

For any questions, issues, or collaborations, feel free to reach out via Hugging Face.
