gemma4-e2b-kinetics54K_FFT

Full fine-tune of google/gemma-4-e2b-it on AI-generated video data derived from Kinetics.

Training

  • Dataset: bear7011/gemma-4-e4b-kinetics_54K
  • Samples: 54,618 video instruction examples (80% for training, 10% for validation, 10% for test)
  • Method: full fine-tuning, no LoRA
  • Precision: bfloat16
  • GPUs: 4
  • DeepSpeed: ZeRO-3 with optimizer and parameter offload to CPU
  • Epochs: 1
  • Global steps: 1366
  • Per-device batch size: 1
  • Gradient accumulation steps: 8
  • Optimizer: AdamW
  • Learning rate: 5e-6
  • Projector learning rate: 5e-6
  • Image encoder learning rate: 5e-6
  • Weight decay: 0.01
  • Warmup ratio: 0.03
  • LR scheduler: cosine
  • Gradient checkpointing: enabled
  • Max sequence length: 3072
  • Final training loss: 0.9243
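
The step count and learning-rate schedule implied by the settings above can be checked with a short sketch. This is not the training script; it only redoes the arithmetic. It assumes the training split is the floor of 80% of the samples, that a final partial batch counts as a step, that warmup length is rounded up to a whole step, and that the cosine schedule decays to zero after a linear warmup. The names `lr_at`, `train_samples`, `effective_batch`, `total_steps`, and `warmup_steps` are hypothetical and exist only for illustration. Under these assumptions the computed step count matches the 1366 global steps listed above.

```python
import math

# Values copied from the training list above.
TOTAL_SAMPLES = 54_618
TRAIN_PERCENT = 80
NUM_GPUS = 4
PER_DEVICE_BATCH = 1
GRAD_ACCUM_STEPS = 8
PEAK_LR = 5e-6
WARMUP_RATIO = 0.03

# Assumption: the training split is floor(80% of all samples).
train_samples = TOTAL_SAMPLES * TRAIN_PERCENT // 100

# Samples consumed per optimizer step across all GPUs.
effective_batch = NUM_GPUS * PER_DEVICE_BATCH * GRAD_ACCUM_STEPS

# Assumption: the last partial batch still counts as one step.
total_steps = math.ceil(train_samples / effective_batch)

# Assumption: warmup length is rounded up to a whole number of steps.
warmup_steps = math.ceil(WARMUP_RATIO * total_steps)


def lr_at(step: int) -> float:
    """Learning rate at an optimizer step: linear warmup, then cosine decay to zero."""
    if step < warmup_steps:
        return PEAK_LR * step / warmup_steps
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return PEAK_LR * 0.5 * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    print("training samples:", train_samples)
    print("effective batch size:", effective_batch)
    print("optimizer steps:", total_steps)
    print("warmup steps:", warmup_steps)
    print("lr at end of warmup:", lr_at(warmup_steps))
```

With 4 GPUs, a per-device batch size of 1, and 8 accumulation steps, each optimizer step consumes 32 samples, so one epoch over the training split takes 1366 steps.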

Model size: 5B params (Safetensors)
Tensor type: BF16