---
title: Gemma-3-4B-PT Full-Model Reasoning Research
emoji: 🧠
colorFrom: blue
colorTo: green
sdk: docker
pinned: false
short_description: Researching multimodal SFT logic on Gemma-3-4B-PT
hf_oauth: true
hf_oauth_expiration_minutes: 36000
hf_oauth_scopes:
  - read-repos
  - write-repos
  - manage-repos
  - inference-api
  - read-billing
tags:
  - autotrain
  - gemma
  - multimodal
  - reasoning
  - sft
---

# 🎯 Project Objective: Improving Multimodal Logic in Gemma 3

This Space hosts an educational research project on **full-model Supervised Fine-Tuning (SFT)** of the `google/gemma-3-4b-pt` architecture. The goal is to move beyond standard Low-Rank Adaptation (LoRA) and observe how full-parameter updates affect the model's ability to handle complex chain-of-thought reasoning across multimodal inputs.

## 🛠️ Hardware Requirements & Grant Justification

**Requested hardware: NVIDIA L40S (48 GB VRAM).** Because Gemma 3 is a multimodal model, the vision-language alignment layers and the full-parameter gradient and optimizer states require the L40S's 48 GB memory ceiling; a back-of-envelope estimate is sketched in the appendix below. This headroom is essential for keeping the SFT process stable and preventing out-of-memory (OOM) errors when computing multimodal attention gradients at the 4B scale. The L40S also allows faster dataset tokenization and more efficient model sharding, significantly reducing the total grant time used.

## 🧪 Methodology

- **Training type:** Full-model SFT (Supervised Fine-Tuning)
- **Precision:** `FP8` with the `adamw_bnb_8bit` optimizer and Unsloth
- **Data:** A curated reasoning dataset formatted in ChatML for logical consistency (a format sketch appears in the appendix below)

## 🤝 Community Commitment

As per the grant request, once training is finalized:

1. The **full model weights** will be pushed to the Hub.
2. Training logs (loss curves, perplexity) will be made public.
3. **The Space will be manually reverted to the free CPU tier to release resources back to the community.**

# 📜 Docs & Citation

Official documentation: [AutoTrain Docs](https://huggingface.co/docs/autotrain)

```bibtex
@misc{thakur2024autotrainnocodetrainingstateoftheart,
  title={AutoTrain: No-code training for state-of-the-art models},
  author={Abhishek Thakur},
  year={2024},
  eprint={2410.15735},
  archivePrefix={arXiv},
  primaryClass={cs.AI},
  url={https://arxiv.org/abs/2410.15735},
}
```
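
## 🔍 Appendix: Illustrative Sketches

The snippets below illustrate the setup described above; they are minimal sketches under stated assumptions, not the project's actual training code. First, the defining property of full-model SFT versus LoRA: no PEFT adapter is attached, so every parameter, including the vision tower and alignment layers, remains trainable.

```python
# Minimal sketch: verify that full-model SFT touches every parameter.
# Unlike a LoRA run, no PEFT config is applied, so requires_grad stays
# True across the whole network, vision tower included.
import torch
from transformers import AutoModelForImageTextToText

model = AutoModelForImageTextToText.from_pretrained(
    "google/gemma-3-4b-pt", torch_dtype=torch.bfloat16
)

trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
total = sum(p.numel() for p in model.parameters())
print(f"trainable: {trainable / 1e9:.2f}B of {total / 1e9:.2f}B (expect 100%)")
```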
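
The 48 GB justification can be sanity-checked with rough arithmetic. The figures below are approximations: the parameter count is rounded, bf16 is assumed for weights and gradients, and the optimizer term assumes `adamw_bnb_8bit`'s two 1-byte moment buffers per parameter.

```python
# Back-of-envelope VRAM estimate for full SFT of a ~4B-parameter model.
# Activation memory is workload-dependent and deliberately excluded here.
GIB = 2**30
params = 4.3e9               # approximate Gemma-3-4B parameter count

weights = params * 2 / GIB   # bf16 weights:   2 bytes/param
grads   = params * 2 / GIB   # bf16 gradients: 2 bytes/param
optim   = params * 2 / GIB   # 8-bit Adam: two 1-byte moment buffers/param

print(f"weights {weights:.1f} + grads {grads:.1f} + optimizer {optim:.1f} "
      f"= {weights + grads + optim:.1f} GiB of static state")
# ~24 GiB before activations; multimodal attention activations must fit in
# the remaining ~24 GiB of a 48 GB L40S.
```

A standard fp32 AdamW would instead hold roughly 8 bytes of optimizer state per parameter (~32 GiB on its own), which is why the 8-bit optimizer is load-bearing for a single-GPU full-SFT run.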
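
Finally, a sketch of the ChatML layout named in the methodology. The record below is invented for illustration; the actual curated dataset and its image-token conventions may differ.

```python
# Hypothetical ChatML-formatted reasoning record (illustrative only).
sample = {
    "messages": [
        {"role": "user",
         "content": "<image> Two circles overlap. How many regions do they "
                    "divide the plane into?"},
        {"role": "assistant",
         "content": "Let's think step by step. The overlap splits the circles "
                    "into 3 inner regions, plus the outside: 4 regions total."},
    ]
}

# Render with ChatML's <|im_start|>/<|im_end|> turn delimiters.
chatml = "".join(
    f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n"
    for m in sample["messages"]
)
print(chatml)
```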