---
title: Gemma-3-4B-PT Full-Model Reasoning Research
emoji: 🧠
colorFrom: blue
colorTo: green
sdk: docker
pinned: false
short_description: Researching multimodal SFT logic on Gemma-3-4B-PT
hf_oauth: true
hf_oauth_expiration_minutes: 36000
hf_oauth_scopes:
  - read-repos
  - write-repos
  - manage-repos
  - inference-api
  - read-billing
tags:
  - autotrain
  - gemma
  - multimodal
  - reasoning
  - sft
---

## 🎯 Project Objective: Improving Multimodal Logic in Gemma 3

This Space is dedicated to an educational research project focused on Full-Model Supervised Fine-Tuning (SFT) of the google/gemma-3-4b-pt architecture.

The goal is to move beyond standard Low-Rank Adaptation (LoRA) and observe how full-parameter updates affect the model's ability to handle complex chain-of-thought reasoning over multimodal inputs.
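To make the contrast concrete, here is a minimal sketch of the trainable-parameter gap between full fine-tuning and LoRA on a single weight matrix. The 4096×4096 shape and rank r=16 are illustrative assumptions, not values from this project's configuration.

```python
# Hypothetical sketch: trainable parameters for full SFT vs. LoRA
# on one d_out x d_in weight matrix (shapes/rank are assumptions).

def full_sft_params(d_out: int, d_in: int) -> int:
    """Full fine-tuning updates every entry of the weight matrix."""
    return d_out * d_in

def lora_params(d_out: int, d_in: int, rank: int) -> int:
    """LoRA trains two low-rank factors: A (d_out x r) and B (r x d_in)."""
    return rank * (d_out + d_in)

d_out, d_in, rank = 4096, 4096, 16
print(full_sft_params(d_out, d_in))   # 16777216 weights updated
print(lora_params(d_out, d_in, rank)) # 131072 weights updated (~0.8%)
```

At 4B scale this gap is why full SFT stresses both VRAM and optimizer-state memory in a way LoRA does not.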

## 🛠️ Hardware Requirements & Grant Justification

**NVIDIA L40S**

Because Gemma 3 is a multimodal model, the vision-language alignment layers and the full-parameter gradient states require the 48GB VRAM capacity of the L40S. This high memory ceiling is essential for maintaining stability during the SFT process and preventing OOM (Out of Memory) errors when calculating multimodal attention gradients at 4B scale. Using an L40S will allow for faster dataset tokenization and more efficient model sharding, significantly reducing the total grant time used.
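A back-of-the-envelope memory estimate illustrates the justification above. The byte counts are assumptions (bf16 weights and gradients at 2 bytes/param, an 8-bit Adam variant such as `adamw_bnb_8bit` at roughly 2 bytes/param for its two moment states), and activations, the KV cache, and the vision tower's intermediates come on top of this figure.

```python
# Rough VRAM estimate for full-parameter SFT of a ~4B model.
# Assumptions: bf16 weights + grads (2 bytes each), 8-bit Adam
# moments (~2 bytes/param total). Activations are NOT included.

def sft_vram_gb(n_params: float,
                weight_bytes: int = 2,   # bf16 weights
                grad_bytes: int = 2,     # bf16 gradients
                optim_bytes: int = 2) -> float:  # 8-bit Adam moments
    total_bytes = n_params * (weight_bytes + grad_bytes + optim_bytes)
    return total_bytes / 1024**3

print(round(sft_vram_gb(4e9), 1))  # ~22.4 GB before activations
```

Even under these optimistic assumptions, roughly half of the L40S's 48GB is consumed by static training state alone, which is why smaller cards OOM once multimodal attention gradients are materialized.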

## 🧪 Methodology

- **Training Type:** Full-Model SFT (Supervised Fine-Tuning)
- **Precision:** FP8 with the `adamw_bnb_8bit` optimizer, via Unsloth
- **Data:** A curated reasoning dataset formatted in ChatML for logical consistency
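The ChatML formatting step can be sketched as follows. This is a minimal illustration assuming the standard `<|im_start|>`/`<|im_end|>` ChatML delimiters; the role names and sample content are made up, and the project's actual templating (e.g. via a tokenizer chat template) may differ.

```python
# Minimal ChatML serializer for a reasoning sample (illustrative only;
# assumes the common <|im_start|>role ... <|im_end|> delimiter scheme).

def to_chatml(messages: list[dict]) -> str:
    return "".join(
        f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n"
        for m in messages
    )

sample = [
    {"role": "user", "content": "If 3x + 2 = 11, what is x?"},
    {"role": "assistant", "content": "3x = 9, so x = 3."},
]
print(to_chatml(sample))
```

Keeping every turn inside explicit delimiters is what lets the trainer mask loss to assistant turns and keeps multi-step reasoning traces unambiguous at tokenization time.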

## 🤝 Community Commitment

As per the grant request, once training is finalized:

  1. The full model weights will be pushed to the Hub.
  2. Training logs (Loss curves/Perplexity) will be made public.
  3. The Space will be manually reverted to the Free CPU tier to release resources back to the community.

## 📜 Docs & Citation

Official Documentation: [AutoTrain Docs](https://huggingface.co/docs/autotrain)

```bibtex
@misc{thakur2024autotrainnocodetrainingstateoftheart,
      title={AutoTrain: No-code training for state-of-the-art models},
      author={Abhishek Thakur},
      year={2024},
      eprint={2410.15735},
      archivePrefix={arXiv},
      primaryClass={cs.AI},
      url={https://arxiv.org/abs/2410.15735},
}
```