food-vlm-tiny-quality-v3 (LoRA adapter)

LoRA adapter fine-tuned locally for food image understanding and structured nutrition JSON output.

Training data

  • Dataset: Codatta/MM-Food-100K
  • Task format: conversational VLM SFT with image + prompt and JSON target
  • Prepared subset with local image download, parsing, and filtering
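A single training record in this conversational SFT format can be sketched as below. The field names, prompt text, and helper function are illustrative assumptions, not the actual MM-Food-100K schema or the preparation code used here.

```python
import json

def build_sft_record(image_path, target):
    """Sketch of one chat-style SFT example: image + prompt in, JSON target out.

    Field names are illustrative, not the dataset's actual schema.
    """
    return {
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "image", "image": image_path},
                    {"type": "text", "text": "Describe this dish as strict JSON."},
                ],
            },
            {
                "role": "assistant",
                # The target JSON is serialized into the assistant turn.
                "content": [{"type": "text", "text": json.dumps(target)}],
            },
        ]
    }

record = build_sft_record(
    "images/0001.jpg",
    {"dish_name": "ramen", "ingredients": ["noodles", "broth", "egg"]},
)
print(record["messages"][0]["role"])  # → user
```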

Intended output format

The model is trained to return a strict JSON object with fields such as:

  • ingredients
  • portion_size
  • nutritional_profile
  • dish_name
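A hypothetical target object with these fields might look as follows; the values and nested structure are illustrative assumptions, not a guaranteed output schema.

```python
import json

# Hypothetical example of the strict-JSON target; values are illustrative only.
example = json.loads("""
{
  "dish_name": "margherita pizza",
  "ingredients": ["dough", "tomato sauce", "mozzarella", "basil"],
  "portion_size": "1 slice (107 g)",
  "nutritional_profile": {"calories": 285, "protein_g": 12, "fat_g": 10, "carbs_g": 36}
}
""")
print(sorted(example.keys()))
# → ['dish_name', 'ingredients', 'nutritional_profile', 'portion_size']
```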

How to use

Load this adapter on top of the base model:

from peft import PeftModel
from transformers import AutoModelForImageTextToText, AutoProcessor

base = "trl-internal-testing/tiny-Qwen2_5_VLForConditionalGeneration"
adapter = "thyfriendlyfox/food-vlm-tiny-quality-v3-adapter"

# Load the base processor and model, then attach the LoRA adapter on top.
processor = AutoProcessor.from_pretrained(base, trust_remote_code=True)
model = AutoModelForImageTextToText.from_pretrained(base, trust_remote_code=True)
model = PeftModel.from_pretrained(model, adapter)
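Generated text sometimes wraps the JSON in extra tokens, so a small post-processing helper (not part of this repo; a sketch only) can extract and validate the object before use:

```python
import json

def extract_json(text):
    """Pull the first balanced {...} object out of generated text and parse it.

    Simple brace counting; does not handle braces inside JSON strings.
    Returns None if no valid object is found.
    """
    start = text.find("{")
    if start == -1:
        return None
    depth = 0
    for i, ch in enumerate(text[start:], start):
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                try:
                    return json.loads(text[start:i + 1])
                except json.JSONDecodeError:
                    return None
    return None

out = extract_json('Sure! {"dish_name": "salad", "ingredients": ["lettuce"]}')
print(out["dish_name"])  # → salad
```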