---
title: Gemma-3-4B-PT Full-Model Reasoning Research
emoji: 🧠
colorFrom: blue
colorTo: green
sdk: docker
pinned: false
short_description: Researching multimodal SFT logic on Gemma-3-4B-PT
hf_oauth: true
hf_oauth_expiration_minutes: 36000
hf_oauth_scopes:
  - read-repos
  - write-repos
  - manage-repos
  - inference-api
  - read-billing
tags:
  - autotrain
  - gemma
  - multimodal
  - reasoning
  - sft
---

# 🎯 Project Objective: Improving Multimodal Logic in Gemma 3

This Space hosts an educational research project on **full-model Supervised Fine-Tuning (SFT)** of the `google/gemma-3-4b-pt` architecture. The goal is to move beyond standard Low-Rank Adaptation (LoRA) and observe how full-parameter updates affect the model's ability to handle complex chain-of-thought reasoning across multimodal inputs.

## 🛠️ Hardware Requirements & Grant Justification

**Requested hardware: NVIDIA L40S (48 GB VRAM).** Because Gemma 3 is a multimodal model, the vision-language alignment layers and the full-parameter gradient and optimizer states require the L40S's 48 GB memory ceiling; a back-of-envelope estimate is sketched in the appendix below. This headroom is essential for keeping the SFT process stable and preventing out-of-memory (OOM) errors when computing multimodal attention gradients at the 4B scale. The L40S also allows faster dataset tokenization and more efficient model sharding, significantly reducing the total grant time used.

## 🧪 Methodology

- **Training type:** Full-model SFT (Supervised Fine-Tuning)
- **Precision:** `FP8` with the `adamw_bnb_8bit` optimizer and Unsloth
- **Data:** A curated reasoning dataset formatted in ChatML for logical consistency (a format sketch appears in the appendix below)

## 🤝 Community Commitment

As per the grant request, once training is finalized:

1. The **full model weights** will be pushed to the Hub.
2. Training logs (loss curves, perplexity) will be made public.
3. **The Space will be manually reverted to the free CPU tier to release resources back to the community.**

# 📜 Docs & Citation

Official documentation: [AutoTrain Docs](https://huggingface.co/docs/autotrain)

```bibtex
@misc{thakur2024autotrainnocodetrainingstateoftheart,
  title={AutoTrain: No-code training for state-of-the-art models},
  author={Abhishek Thakur},
  year={2024},
  eprint={2410.15735},
  archivePrefix={arXiv},
  primaryClass={cs.AI},
  url={https://arxiv.org/abs/2410.15735},
}
```
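
## 🔍 Appendix: Illustrative Sketches

The snippets below illustrate the setup described above; they are minimal sketches under stated assumptions, not the project's actual training code. First, the defining property of full-model SFT versus LoRA: no PEFT adapter is attached, so every parameter, including the vision tower and alignment layers, remains trainable.

```python
# Minimal sketch: verify that full-model SFT touches every parameter.
# Unlike a LoRA run, no PEFT config is applied, so requires_grad stays
# True across the whole network, vision tower included.
import torch
from transformers import AutoModelForImageTextToText

model = AutoModelForImageTextToText.from_pretrained(
    "google/gemma-3-4b-pt", torch_dtype=torch.bfloat16
)

trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
total = sum(p.numel() for p in model.parameters())
print(f"trainable: {trainable / 1e9:.2f}B of {total / 1e9:.2f}B (expect 100%)")
```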
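
The 48 GB justification can be sanity-checked with rough arithmetic. The figures below are approximations: the parameter count is rounded, bf16 is assumed for weights and gradients, and the optimizer term assumes `adamw_bnb_8bit`'s two 1-byte moment buffers per parameter.

```python
# Back-of-envelope VRAM estimate for full SFT of a ~4B-parameter model.
# Activation memory is workload-dependent and deliberately excluded here.
GIB = 2**30
params = 4.3e9               # approximate Gemma-3-4B parameter count

weights = params * 2 / GIB   # bf16 weights:   2 bytes/param
grads   = params * 2 / GIB   # bf16 gradients: 2 bytes/param
optim   = params * 2 / GIB   # 8-bit Adam: two 1-byte moment buffers/param

print(f"weights {weights:.1f} + grads {grads:.1f} + optimizer {optim:.1f} "
      f"= {weights + grads + optim:.1f} GiB of static state")
# ~24 GiB before activations; multimodal attention activations must fit in
# the remaining ~24 GiB of a 48 GB L40S.
```

A standard fp32 AdamW would instead hold roughly 8 bytes of optimizer state per parameter (~32 GiB on its own), which is why the 8-bit optimizer is load-bearing for a single-GPU full-SFT run.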
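
Finally, a sketch of the ChatML layout named in the methodology. The record below is invented for illustration; the actual curated dataset and its image-token conventions may differ.

```python
# Hypothetical ChatML-formatted reasoning record (illustrative only).
sample = {
    "messages": [
        {"role": "user",
         "content": "<image> Two circles overlap. How many regions do they "
                    "divide the plane into?"},
        {"role": "assistant",
         "content": "Let's think step by step. The overlap splits the circles "
                    "into 3 inner regions, plus the outside: 4 regions total."},
    ]
}

# Render with ChatML's <|im_start|>/<|im_end|> turn delimiters.
chatml = "".join(
    f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n"
    for m in sample["messages"]
)
print(chatml)
```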