Recipe Extractor - Gemma-3 270M Fine-tuned
This is a fine-tuned version of Gemma-3 270M trained to extract structured JSON-LD recipe data from unstructured blog posts. The model can parse messy recipe blog posts (with stories, ads, and chaotic formatting) and output clean, valid schema.org Recipe objects.
Model Details
Model Description
This model extracts structured recipe information from natural language text in the form of JSON-LD following the schema.org Recipe specification. It's designed to handle various blog post styles including:
- Minimal, organized formats
- Fluffy blog posts with stories and advertisements
- Chaotic, poorly formatted text
- Short Instagram-style posts
- Extremely verbose, unstructured content
The model was fine-tuned using LoRA (Low-Rank Adaptation) to maintain efficiency while achieving good extraction accuracy.
- Developed by: Vlad Rusu
- Model type: Text Generation (Recipe Extraction)
- Language(s): English
- License: MIT
- Fine-tuned from model: unsloth/gemma-3-270m-it
- Training Framework: Unsloth + PEFT (LoRA)
Model Sources
- Repository: https://github.com/v-rusu/finetune-recipe-extractor
- Dataset: https://huggingface.co/datasets/v-rusu/recipe-extractor-dataset
- Developer Website: https://vladr.com
- LinkedIn: https://www.linkedin.com/in/vrusu
Uses
Direct Use
The model can be used directly to extract recipe information from blog posts, social media posts, or any text containing recipe information. It outputs valid JSON-LD in schema.org Recipe format, which can be:
- Embedded in web pages for SEO
- Used in recipe management applications
- Parsed by search engines and recipe aggregators
- Stored in structured databases
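For reference, here is a minimal sketch of the shape of object the model targets. The field values are illustrative examples, not model output:

```python
import json

# A hypothetical schema.org Recipe object in the shape the model is
# trained to emit; values are illustrative only.
example_output = """
{
  "@context": "https://schema.org",
  "@type": "Recipe",
  "name": "Chocolate Chip Cookies",
  "recipeIngredient": ["2 cups all-purpose flour", "1 cup chocolate chips"],
  "recipeInstructions": [
    {"@type": "HowToStep", "text": "Preheat the oven to 350°F."}
  ],
  "totalTime": "PT30M"
}
"""

recipe = json.loads(example_output)
print(recipe["@type"])  # Recipe
```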
Example Usage
```python
from unsloth import FastLanguageModel

# Load the fine-tuned model and tokenizer
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="v-rusu/recipe-extractor-gemma-3-270m",
    max_seq_length=8192,
    load_in_4bit=False,
)

messages = [
    {"role": "system", "content": "You are a recipe extraction assistant. Extract recipe information from the provided text and output it as a valid JSON-LD object following the schema.org Recipe format."},
    {"role": "user", "content": "Extract recipe information from this text:\n\n[your blog post text here]"},
]

inputs = tokenizer.apply_chat_template(
    messages,
    tokenize=True,
    return_tensors="pt",
    add_generation_prompt=True,
).to("cuda")

# Gemma-3 recommended sampling settings
outputs = model.generate(
    input_ids=inputs,
    max_new_tokens=1500,
    temperature=1.0,
    top_p=0.95,
    top_k=64,
)

response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
```
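Because the decoded response contains the full chat transcript, a small post-processing step is usually needed to isolate the JSON object. A minimal sketch, assuming the model emits a single `{...}` block (the helper name is hypothetical, not part of this project):

```python
import json
import re

def extract_json(response: str):
    """Pull the first-to-last {...} span out of a decoded chat response
    and parse it. Returns None if nothing parses as JSON."""
    # Greedy match from the first '{' to the last '}' is a simple
    # heuristic that works when the model emits one JSON object.
    match = re.search(r"\{.*\}", response, re.DOTALL)
    if match is None:
        return None
    try:
        return json.loads(match.group(0))
    except json.JSONDecodeError:
        return None

demo = 'Sure! Here is the extracted recipe:\n{"@context": "https://schema.org", "@type": "Recipe", "name": "Pancakes"}'
print(extract_json(demo)["name"])  # Pancakes
```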
Downstream Use
The extracted JSON-LD can be integrated into:
- Recipe website SEO optimization
- Recipe aggregation platforms
- Meal planning applications
- Nutritional analysis tools
- Content management systems
Out-of-Scope Use
This model is not suitable for:
- Medical or dietary advice
- Allergen detection (requires specialized models)
- Nutritional calculation (outputs rely on source accuracy)
- Non-recipe content extraction
- Languages other than English
Bias, Risks, and Limitations
- Dataset Bias: The model was trained on data derived from AllRecipes, which may not represent global cuisine diversity
- Synthetic Data: Training data was synthetically generated, which may not capture all real-world edge cases
- Format Assumptions: The model expects blog-style recipe text and may not handle highly structured or tabular inputs well
- Accuracy: Recipe quantities and instructions depend on accurate extraction from source text
- No Validation: The model does not verify recipe feasibility or safety
Recommendations
- Always validate extracted recipes for completeness and accuracy
- Use with diverse recipe sources to identify potential biases
- Implement additional validation for allergen and dietary information
- Consider human review for production recipe applications
- Test with your specific use case before deployment
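A validation gate along these lines can route incomplete extractions to human review. This is a hypothetical sketch, not part of the released pipeline:

```python
REQUIRED_FIELDS = ("@context", "@type", "name")

def needs_review(recipe: dict) -> list:
    """Return a list of reasons an extracted recipe should be flagged
    for human review (hypothetical gate, not part of the model)."""
    reasons = []
    for field in REQUIRED_FIELDS:
        if not recipe.get(field):
            reasons.append(f"missing required field: {field}")
    if recipe.get("@type") != "Recipe":
        reasons.append("@type is not 'Recipe'")
    if not recipe.get("recipeIngredient"):
        reasons.append("no ingredients extracted")
    if not recipe.get("recipeInstructions"):
        reasons.append("no instructions extracted")
    return reasons

print(needs_review({"@context": "https://schema.org", "@type": "Recipe", "name": "Toast"}))
# ['no ingredients extracted', 'no instructions extracted']
```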
How to Get Started with the Model
Installation
```bash
pip install unsloth transformers
```
Quick Start
```python
from unsloth import FastLanguageModel

# Load the model
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="v-rusu/recipe-extractor-gemma-3-270m",
    max_seq_length=8192,
)

# Prepare your input
blog_post = """
Today I'm sharing my grandmother's famous chocolate chip cookies!
These cookies are the best - soft, chewy, and packed with chocolate.
Ingredients:
- 2 cups all-purpose flour
- 1 cup sugar
- 1 cup chocolate chips
- 2 eggs
- 1 tsp vanilla extract
Instructions:
First, preheat your oven to 350°F. Then mix all dry ingredients...
"""

messages = [
    {"role": "system", "content": "You are a recipe extraction assistant. Extract recipe information from the provided text and output it as a valid JSON-LD object following the schema.org Recipe format."},
    {"role": "user", "content": f"Extract recipe information from this text:\n\n{blog_post}"},
]

# Generate
inputs = tokenizer.apply_chat_template(
    messages, tokenize=True, return_tensors="pt", add_generation_prompt=True
).to("cuda")
outputs = model.generate(input_ids=inputs, max_new_tokens=1500)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
Training Details
Training Data
The model was trained on the recipe-extractor-dataset, which contains:
- 6,196 training examples
- 100% synthetic data generated using Deepseek v3.2
- Source: Derived from the AllRecipes Kaggle Dataset
The dataset generation pipeline:
- Real recipe data downloaded from Kaggle
- Synthetic blog posts generated in 5 different styles (minimal, fluffy, chaotic, etc.) using Deepseek v3.2
- Recipe JSON-LD extraction with chain-of-thought reasoning traces using Deepseek v3.2
- Reasoning traces removed for training (preserved in dataset for potential reasoning model training)
Blog Post Styles (weighted distribution):
- Instagram short (weight: 1) - Very brief posts
- Minimal organized (weight: 2) - Clean, structured format
- Fluffy organized (weight: 5) - Typical recipe blogs with stories
- Chaotic unstructured (weight: 2) - Poorly formatted content
- Super fluffy chaotic (weight: 1) - Extremely verbose and messy
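The weighted distribution above can be realized with simple weighted sampling. A sketch assuming the weights drive random style selection (the actual generation pipeline may implement this differently):

```python
import random

# Style weights as listed above; assumed to drive random sampling
# during dataset generation.
STYLE_WEIGHTS = {
    "instagram_short": 1,
    "minimal_organized": 2,
    "fluffy_organized": 5,
    "chaotic_unstructured": 2,
    "super_fluffy_chaotic": 1,
}

rng = random.Random(42)
styles = rng.choices(
    population=list(STYLE_WEIGHTS),
    weights=list(STYLE_WEIGHTS.values()),
    k=1000,
)
# fluffy_organized should dominate at roughly 5/11 ≈ 0.45 of samples
print(styles.count("fluffy_organized") / len(styles))
```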
Training Procedure
Preprocessing
- Reasoning traces (`<think>...</think>`) removed from assistant responses, since Gemma-3 270M is a non-reasoning model
- Messages formatted using the Gemma-3 chat template
- 95/5 train/test split (seed: 42)
- Maximum sequence length: 8,192 tokens
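The trace-stripping step can be sketched with a short regex pass (a sketch of the preprocessing described above, not the project's exact code):

```python
import re

# Matches a <think>...</think> block plus any trailing whitespace.
THINK_RE = re.compile(r"<think>.*?</think>\s*", re.DOTALL)

def strip_reasoning(text: str) -> str:
    """Remove <think>...</think> reasoning traces from an assistant response."""
    return THINK_RE.sub("", text)

raw = '<think>The post lists 5 ingredients...</think>{"@type": "Recipe"}'
print(strip_reasoning(raw))  # {"@type": "Recipe"}
```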
Training Hyperparameters
- Base Model: unsloth/gemma-3-270m-it
- Training regime: LoRA fine-tuning with 4-bit quantization
- LoRA Configuration:
- Rank (r): 64
- Alpha: 64
- Dropout: 0
- Target modules: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
- Training steps: 35 (max_steps)
- Batch size: 4 per device
- Gradient accumulation: 4 steps (effective batch size: 16)
- Learning rate: 5e-4
- Optimizer: AdamW 8-bit
- Weight decay: 0.001
- LR scheduler: Linear
- Warmup steps: 5
- Seed: 3407
- Training objective: Supervised fine-tuning (SFT) on assistant responses only
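The schedule and batch settings above combine as follows; this is a sketch of a linear warmup/decay schedule under those hyperparameters (the trainer's exact boundary handling may differ):

```python
def linear_lr(step, peak_lr=5e-4, warmup_steps=5, max_steps=35):
    """Linear warmup to peak_lr over warmup_steps, then linear decay
    to zero at max_steps (sketch of the scheduler settings above)."""
    if step < warmup_steps:
        return peak_lr * (step + 1) / warmup_steps
    remaining = max_steps - warmup_steps
    return peak_lr * (max_steps - step) / remaining

# per-device batch * gradient accumulation = effective batch
effective_batch = 4 * 4
print(effective_batch, linear_lr(4))  # 16 0.0005
```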
Speeds, Sizes, Times
- Training framework: Unsloth (optimized training)
- Model size: ~270M parameters base + LoRA adapters
- Training time: Varies by hardware (optimized for Google Colab T4 GPU)
- Gradient checkpointing: Enabled (Unsloth mode)
Evaluation
Testing Data, Factors & Metrics
Testing Data
The model was evaluated on a held-out 5% test set from the recipe-extractor-dataset (~310 examples).
Evaluation Methodology
Two-stage evaluation process:
Structural Validation (`RecipeEvaluator`):
- Validates JSON syntax
- Checks required fields: `@context`, `@type`, `name`
- Validates optional fields: description, ingredients, instructions, times, yield, category, cuisine, keywords
- Verifies schema.org Recipe format compliance
- Checks ISO 8601 duration formats
- Checks ISO 8601 duration formats
Quality Assessment (`VibesEvaluator`):
- LLM-based evaluation using DeepSeek v3.2
- Assesses extraction quality and completeness
- Scores on a 1-10 scale
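The ISO 8601 duration check can be approximated with a simplified pattern; this is a sketch in the spirit of the structural checks above, and the project's `RecipeEvaluator` may be stricter:

```python
import re

# Simplified ISO 8601 duration pattern, e.g. "PT30M", "PT1H15M", "P1DT2H".
DURATION_RE = re.compile(r"^P(?:\d+D)?(?:T(?:\d+H)?(?:\d+M)?(?:\d+S)?)?$")

def is_valid_duration(value: str) -> bool:
    # Guard against "P"/"PT", which the all-optional pattern would accept.
    return value not in ("P", "PT") and bool(DURATION_RE.fullmatch(value))

print(is_valid_duration("PT1H15M"), is_valid_duration("45 minutes"))  # True False
```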
Metrics
- Structural Validity Rate: Percentage of outputs that are valid JSON-LD
- Field Completeness: Coverage of optional schema.org fields
- Quality Score: Average LLM-assessed quality rating
Results
Results vary by training configuration. The best performing model (gemma-3-r64-max35-lr5e-4) achieved:
- High structural validity on test set
- Consistent extraction of required fields
- Good handling of diverse blog post styles
Full evaluation results are saved per training run and include per-sample validation details.
Environmental Impact
- Hardware Type: NVIDIA GPU (T4 or better recommended)
- Training optimizations: Unsloth framework, 4-bit quantization, LoRA
- Compute efficiency: Optimized for Google Colab free tier
- Carbon footprint: Minimal due to small model size and efficient training
The use of a small model (270M parameters) and efficient fine-tuning techniques (LoRA, quantization) significantly reduces computational requirements compared to training larger models.
Technical Specifications
Model Architecture and Objective
- Architecture: Gemma-3 270M (decoder-only transformer)
- Fine-tuning method: LoRA (Low-Rank Adaptation)
- Objective: Supervised fine-tuning for structured information extraction
- Context length: 8,192 tokens
- Output format: JSON-LD (schema.org Recipe)
Compute Infrastructure
Hardware
- Development: Local machines with LM Studio (CPU/GPU)
- Training: Google Colab with T4 GPU (recommended)
- Inference: CPU or GPU (model supports various quantization levels)
Software
- Framework: Unsloth
- Libraries: PEFT 0.18.1, Transformers, TRL, PyTorch
- Dataset Generation: LM Studio with Qwen3-14B and Deepseek v3.2
- Quantization: Supports GGUF export (Q8_0, BF16, F16)
- Compatible with: llama.cpp, Ollama, LM Studio
Citation
BibTeX:
```bibtex
@software{rusu2026recipe_extractor,
  author = {Rusu, Vlad},
  title = {Recipe Extractor: Fine-tuned Gemma-3 270M for Recipe JSON-LD Extraction},
  year = {2026},
  url = {https://github.com/v-rusu/finetune-recipe-extractor},
  note = {Fine-tuned on synthetic recipe blog data}
}
```
APA:
Rusu, V. (2026). Recipe Extractor: Fine-tuned Gemma-3 270M for Recipe JSON-LD Extraction [Computer software]. https://github.com/v-rusu/finetune-recipe-extractor
Glossary
- JSON-LD: JSON for Linking Data, a method of encoding linked data using JSON
- Schema.org Recipe: A standardized format for representing recipe information on the web
- LoRA: Low-Rank Adaptation, a parameter-efficient fine-tuning technique
- Unsloth: An optimized framework for efficient LLM training
- Chain-of-thought: A reasoning approach where models show step-by-step thinking
- GGUF: A file format for storing language models for efficient inference
More Information
Project Resources
- GitHub Repository: https://github.com/v-rusu/finetune-recipe-extractor
- Training Dataset: https://huggingface.co/datasets/v-rusu/recipe-extractor-dataset
- Blog: https://vladr.com
Pipeline Scripts
The training pipeline includes:
- `01_download_dataset.py` - Download AllRecipes from Kaggle
- `02_generate_blogs.py` - Generate synthetic blog posts
- `03_generate_recipe_json.py` - Extract recipes with reasoning
- `04_generate_finetuning_dataset.py` - Create training dataset
- `05_finetune.py` - Fine-tune the model
- `06_eval.py` - Evaluate model performance
Complete documentation available in the GitHub repository.
Model Card Authors
Vlad Rusu
Model Card Contact
- LinkedIn: https://www.linkedin.com/in/vrusu
- Website: https://vladr.com
- GitHub: https://github.com/v-rusu
Framework versions
- PEFT 0.18.1
- Transformers (latest compatible version)
- Unsloth