---
title: FoodExtract v1
emoji: 🍕
colorFrom: yellow
colorTo: red
sdk: gradio
sdk_version: 5.0.0
app_file: app.py
pinned: false
license: mit
python_version: '3.11'
short_description: Extract food & drinks from text with Gemma 3
---

πŸ• Fully Fine-Tune a Small Language Model (Gemma 3 270M)

A tutorial on how to fully fine-tune Google's Gemma 3 270M model using Hugging Face libraries to extract food and drink items from text.

## 📖 Overview

This project demonstrates Supervised Fine-Tuning (SFT) of a Small Language Model (SLM) for a specific task: extracting food and drink items from text. The fine-tuned model can process text inputs and return structured data about food/drink content.

### Why Fine-Tune a Small Language Model?

- ✅ **Own the model** - run anywhere without API costs
- ✅ **Simple tasks** work well with smaller models
- ✅ **No API calls needed** - run offline
- ✅ **Batch processing** - much faster than API calls
- ✅ **Task-specific optimization** - better performance on your use case

## 🎯 What We're Building

A model that extracts food and drink items from text, returning structured output:

**Input:**

```text
A plate of rice cakes, salmon, cottage cheese and small cherry tomatoes with a cup of tea.
```

**Output:**

```text
food_or_drink: 1
tags: fi
foods: rice cakes, salmon, cottage cheese, cherry tomatoes
drinks: cup of tea
```
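The structured output above is plain key-value text, so it is easy to turn into data downstream. A minimal parsing sketch (the `parse_extraction` helper and its list-splitting rules are illustrative, not part of the project):

```python
# Hypothetical helper: parse the model's key-value output into a dict.
# Field names (food_or_drink, tags, foods, drinks) come from the example
# above; treating comma-separated fields as lists is an assumption.
def parse_extraction(text: str) -> dict:
    """Split each 'key: value' line; comma-separated fields become lists."""
    result = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(":")
        value = value.strip()
        if key in ("tags", "foods", "drinks"):
            result[key] = [item.strip() for item in value.split(",") if item.strip()]
        else:
            result[key] = value
    return result

output = """food_or_drink: 1
tags: fi
foods: rice cakes, salmon, cottage cheese, cherry tomatoes
drinks: cup of tea"""

parsed = parse_extraction(output)
print(parsed["foods"])  # ['rice cakes', 'salmon', 'cottage cheese', 'cherry tomatoes']
```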

## 🛠️ Technologies Used

- **Model:** Gemma 3 270M
- **Dataset:** FoodExtract-1k
- **Libraries:**
  - `transformers` - model loading and inference
  - `trl` - Transformer Reinforcement Learning (SFT)
  - `datasets` - data loading
  - `accelerate` - training acceleration
  - `gradio` - interactive demo

## 📋 Requirements

- Python 3.8+ (this Space pins 3.11 for `audioop` compatibility)
- GPU with at least 16 GB VRAM (a Google Colab T4 works!)
- Hugging Face account (for uploading the model)

## 🚀 Quick Start

### 1. Open in Google Colab

Open In Colab

### 2. Enable GPU Runtime

- Go to **Runtime → Change runtime type** → select **GPU**

### 3. Run All Cells

The notebook will:

  1. Install dependencies
  2. Load the base model
  3. Prepare the dataset
  4. Fine-tune (3 epochs, ~18 minutes)
  5. Evaluate and create demo
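The fine-tuning step above can be sketched with `trl`'s `SFTTrainer`. This is a hedged outline, not the notebook's exact code: the dataset repo ID, output directory, batch size, and learning rate are assumptions, and running it needs a GPU plus network access to the Hub.

```python
# Sketch of the SFT setup; hyperparameters here are illustrative defaults,
# not the notebook's exact values.
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import SFTConfig, SFTTrainer

model_id = "google/gemma-3-270m"                 # base model named in this README
dataset = load_dataset("mrdbourke/FoodExtract-1k")  # dataset repo ID is an assumption

model = AutoModelForCausalLM.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)

config = SFTConfig(
    output_dir="gemma-3-270m-foodextract",  # illustrative name
    num_train_epochs=3,                     # matches the training-results section
    per_device_train_batch_size=8,          # assumption; tune for your GPU
    learning_rate=2e-5,                     # assumption
)

trainer = SFTTrainer(
    model=model,
    args=config,
    train_dataset=dataset["train"],
    processing_class=tokenizer,  # older trl versions use tokenizer= instead
)
trainer.train()
```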

## 📊 Training Results

After 3 epochs of training:

| Epoch | Training Loss | Validation Loss | Token Accuracy |
|-------|---------------|-----------------|----------------|
| 1     | 2.17          | 2.24            | 58.8%          |
| 2     | 1.25          | 2.28            | 58.9%          |
| 3     | 1.07          | 2.46            | 58.6%          |

## 🔑 Key Concepts

### Full Fine-Tuning vs. LoRA

- **Full fine-tuning:** all model weights are updated (used here)
- **LoRA:** only small adapter weights are trained (needs fewer resources)
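A rough memory estimate shows why even a 270M-parameter model needs a decent GPU for full fine-tuning. The precision choices below (bf16 weights and gradients, fp32 Adam optimizer states) are common defaults and an assumption here, and activation memory is not counted:

```python
# Back-of-the-envelope GPU memory for fully fine-tuning a 270M-param model.
# Assumptions: bf16 weights and gradients (2 bytes each), Adam keeping two
# fp32 moments per parameter (4 + 4 bytes). Activations are extra.
params = 270_000_000

weights_gb = params * 2 / 1e9          # bf16 weights
grads_gb = params * 2 / 1e9            # bf16 gradients
optimizer_gb = params * (4 + 4) / 1e9  # Adam first/second moments in fp32

total_gb = weights_gb + grads_gb + optimizer_gb
print(f"~{total_gb:.1f} GB before activations")  # ~3.2 GB
```

With LoRA, only a few million adapter parameters would need gradients and optimizer states, which is why it fits in far less memory.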

### SLM (Small Language Model)

- Models under 1B parameters
- Great for specific tasks
- Can be tailored to your use case

### Tokens In, Tokens Out

- Think of any problem as: what tokens do I want in, and what tokens do I want out?
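Concretely, each training example pairs the tokens going in (the raw text) with the tokens we want out (the structured fields). A minimal sketch of building such a pair (the prompt template wording is an assumption, not the notebook's exact format):

```python
# Hypothetical prompt/completion pair; the template wording is illustrative.
def make_example(text: str, foods: list[str], drinks: list[str]) -> dict:
    prompt = f"Extract the food and drink items from this text:\n{text}\n"
    completion = f"foods: {', '.join(foods)}\ndrinks: {', '.join(drinks)}"
    return {"prompt": prompt, "completion": completion}

example = make_example(
    "A plate of rice cakes with a cup of tea.",
    foods=["rice cakes"],
    drinks=["cup of tea"],
)
print(example["completion"])
# foods: rice cakes
# drinks: cup of tea
```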

πŸ“ Project Structure

```text
├── NVIDIA-DGX-Spark-hugging_face_llm_full_fine_tune_tutorial-VIDEO.ipynb
├── README.md
├── .gitignore
└── Fully Fine-Tune a SLM-Gemma 3 270M.pdf (reference)
```

## 🎥 Video Tutorial

Watch the full video walkthrough

## 📚 Resources

## 🏷️ Tags Dictionary

The model assigns these tags to text:

| Tag | Meaning            |
|-----|--------------------|
| np  | Nutrition Panel    |
| il  | Ingredient List    |
| me  | Menu               |
| re  | Recipe             |
| fi  | Food Items         |
| di  | Drink Items        |
| fa  | Food Advertisement |
| fp  | Food Packaging     |
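The tag table above can be used directly as a lookup when post-processing model output. The dict and helper names here are just for illustration:

```python
# Tag abbreviations from the table above.
TAG_MEANINGS = {
    "np": "Nutrition Panel",
    "il": "Ingredient List",
    "me": "Menu",
    "re": "Recipe",
    "fi": "Food Items",
    "di": "Drink Items",
    "fa": "Food Advertisement",
    "fp": "Food Packaging",
}

def expand_tags(tags: str) -> list[str]:
    """Map a comma-separated tag string (e.g. 'fi, di') to full names."""
    return [TAG_MEANINGS[t.strip()] for t in tags.split(",")]

print(expand_tags("fi"))  # ['Food Items']
```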

πŸ“ License

This project is for educational purposes. Please refer to the original sources for licensing information.

## 👤 Author

Created following the tutorial by Daniel Bourke


**Note:** The model weights are not included in this repository. You can either:

  1. Run the notebook to create your own fine-tuned model
  2. Download from Hugging Face Hub if available