---
base_model: Qwen/Qwen2.5-7B-Instruct
library_name: peft
pipeline_tag: text-generation
license: apache-2.0
language:
- en
- hi
tags:
- lora
- transformers
- qwen2
- indian-cuisine
- recipe-generation
- cooking
- peft
---

# ChefAI - Indian Recipe Generator 🍛

A fine-tuned Qwen2.5-7B model specialized in generating authentic Indian recipes.

## Model Description

**ChefAI** is a LoRA adapter fine-tuned on the Qwen2.5-7B-Instruct model using Indian recipe data. It can generate detailed recipes with ingredients, step-by-step instructions, and cooking tips for a wide variety of Indian dishes.

### Key Features

- 🍲 Generates authentic Indian recipes
- 📝 Provides detailed step-by-step cooking instructions
- 🥘 Covers vegetarian and non-vegetarian dishes
- 🌶️ Includes regional cuisines from across India
## Requirements

```bash
pip install transformers peft torch accelerate

# Optional: for 4-bit quantization (recommended for low VRAM)
pip install bitsandbytes
```
## How to Use

### Basic Usage (No Quantization - Requires ~16GB VRAM)

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load the base model and tokenizer
base_model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2.5-7B-Instruct",
    device_map="auto",
    torch_dtype="auto",
    trust_remote_code=True,
)
tokenizer = AutoTokenizer.from_pretrained("Raazi29/ChefAI-7b-lora")

# Load the LoRA adapter on top of the base model
model = PeftModel.from_pretrained(base_model, "Raazi29/ChefAI-7b-lora")
model.eval()

# Generate a recipe
messages = [{"role": "user", "content": "Give me a recipe for butter chicken"}]
inputs = tokenizer.apply_chat_template(
    messages,
    tokenize=True,
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(
    input_ids=inputs,
    max_new_tokens=512,
    temperature=0.7,
    top_p=0.9,
    do_sample=True,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
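If you want to serve the model without a PEFT dependency at inference time, the adapter can optionally be folded into the base weights first. This is a sketch using PEFT's standard `merge_and_unload` API, continuing from the snippet above; the output directory name is illustrative:

```python
# Continuing from the snippet above: fold the LoRA weights into the
# base model so it can be loaded later as a plain transformers model.
merged_model = model.merge_and_unload()

# Save the merged weights and tokenizer for standalone loading
# ("chefai-7b-merged" is an illustrative directory name)
merged_model.save_pretrained("chefai-7b-merged")
tokenizer.save_pretrained("chefai-7b-merged")
```

Merging trades the small adapter download for a full-size checkpoint, but removes the per-forward-pass adapter overhead.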
### With 4-bit Quantization (Requires ~6GB VRAM)

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel
import torch

# Quantization config for 4-bit loading
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
    bnb_4bit_use_double_quant=True,
)

# Load the base model with 4-bit quantization
base_model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2.5-7B-Instruct",
    quantization_config=bnb_config,
    device_map="auto",
    trust_remote_code=True,
)

# Load the tokenizer and LoRA adapter
tokenizer = AutoTokenizer.from_pretrained("Raazi29/ChefAI-7b-lora")
model = PeftModel.from_pretrained(base_model, "Raazi29/ChefAI-7b-lora")
model.eval()

# Generate a recipe
messages = [{"role": "user", "content": "Give me a recipe for butter chicken"}]
inputs = tokenizer.apply_chat_template(
    messages,
    tokenize=True,
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(
    input_ids=inputs,
    max_new_tokens=512,
    temperature=0.7,
    top_p=0.9,
    do_sample=True,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
### Google Colab Quick Start

```python
# Install dependencies
!pip install -q transformers peft accelerate bitsandbytes

from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel
import torch

# Load with 4-bit quantization for Colab's limited VRAM
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

base_model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2.5-7B-Instruct",
    quantization_config=bnb_config,
    device_map="auto",
)

tokenizer = AutoTokenizer.from_pretrained("Raazi29/ChefAI-7b-lora")
model = PeftModel.from_pretrained(base_model, "Raazi29/ChefAI-7b-lora")
model.eval()

# Test it
messages = [{"role": "user", "content": "How to make paneer tikka?"}]
inputs = tokenizer.apply_chat_template(
    messages, tokenize=True, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
# do_sample=True is required for temperature to take effect
outputs = model.generate(input_ids=inputs, max_new_tokens=512, temperature=0.7, do_sample=True)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
## Example Prompts

- "Give me a recipe for butter chicken"
- "How do I make paneer tikka masala?"
- "What ingredients do I need for biryani?"
- "Suggest a quick vegetarian Indian dinner recipe"
- "How to make dal makhani step by step?"
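For reference, `tokenizer.apply_chat_template` wraps prompts like these in Qwen's ChatML-style format before generation. A minimal sketch of the resulting prompt string (illustrative only; the real template also injects a default system message, so always prefer `apply_chat_template` in practice):

```python
def build_qwen_prompt(messages):
    """Illustrative ChatML-style formatting used by Qwen2.5 chat models.

    This sketch omits the default system message the real template
    prepends; prefer tokenizer.apply_chat_template in real code.
    """
    parts = [f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n" for m in messages]
    parts.append("<|im_start|>assistant\n")  # generation prompt: model continues from here
    return "".join(parts)

prompt = build_qwen_prompt([{"role": "user", "content": "Give me a recipe for butter chicken"}])
```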
## Training Details

- **Base Model:** Qwen2.5-7B-Instruct
- **Fine-tuning Method:** LoRA (Low-Rank Adaptation)
- **LoRA Rank:** 8
- **LoRA Alpha:** 16
- **Target Modules:** q_proj, k_proj, v_proj, o_proj
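The hyperparameters above correspond to a PEFT `LoraConfig` roughly like the following. This is a reconstruction from the card, not the exact training script; the dropout and bias settings in particular are assumptions:

```python
from peft import LoraConfig

# Reconstructed from the hyperparameters listed above; lora_dropout and
# bias are assumed defaults, not taken from the actual training run.
lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    lora_dropout=0.05,   # assumed
    bias="none",         # assumed
    task_type="CAUSAL_LM",
)
```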
## Limitations

- Responses may occasionally include non-Indian dishes
- Some regional recipes might not be fully accurate
- Always verify cooking times and temperatures for safety
## License

Apache 2.0