llama-3.1-8b-sdft-simpleqa-ga2-step1400

Training Hyperparameters

| Parameter | Value |
|---|---|
| learning_rate | 1e-05 |
| num_train_epochs | 10 |
| per_device_train_batch_size | 1 |
| gradient_accumulation_steps | 2 |
| weight_decay | 0.0 |
| warmup_ratio | 0.03 |
| warmup_steps | 0 |
| lr_scheduler_type | SchedulerType.CONSTANT |
| optim | OptimizerNames.ADAMW_TORCH_FUSED |
| bf16 | True |
| fp16 | False |
| max_grad_norm | 1 |
| max_steps | -1 |
| save_steps | 100 |
| deepspeed | /home/it4i-cfierro/weight-steering/deepspeed_configs/zero3_offload.json |
| gradient_checkpointing | True |
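
As a rough sketch of how the batch-related values above combine: the number of sequences contributing to one optimizer step is the per-device batch size times the gradient accumulation steps times the number of devices. The device count is not stated in this card, so `num_devices` below is a hypothetical parameter used only for illustration.

```python
# Values copied from the hyperparameters listed above.
HPARAMS = {
    "per_device_train_batch_size": 1,
    "gradient_accumulation_steps": 2,
}


def effective_batch_size(hparams, num_devices):
    """Return the number of sequences per optimizer step.

    num_devices is an assumption: the card does not say how many
    devices were used for training.
    """
    return (
        hparams["per_device_train_batch_size"]
        * hparams["gradient_accumulation_steps"]
        * num_devices
    )


if __name__ == "__main__":
    # Print the effective batch size for a few hypothetical device counts.
    for n in (1, 4, 8):
        print(f"devices={n} effective_batch_size={effective_batch_size(HPARAMS, n)}")
```

On a single device this works out to 2 sequences per optimizer step; any larger device count scales it linearly.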

Training Results

  • Total steps: 1400
  • Best metric: None
  • Best checkpoint: None
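
For orientation, the following minimal sketch lists the steps at which checkpoints would be written, given save_steps of 100 and 1400 total steps. It assumes step-based saving at every multiple of save_steps; the save strategy itself is not listed in this card, and `checkpoint_steps` is a hypothetical helper, not part of any library.

```python
def checkpoint_steps(total_steps, save_steps):
    """Return the steps at which a checkpoint would be saved.

    Assumes a checkpoint is written at every multiple of save_steps,
    which the card does not confirm.
    """
    return list(range(save_steps, total_steps + 1, save_steps))


if __name__ == "__main__":
    # total_steps and save_steps are taken from the card.
    steps = checkpoint_steps(total_steps=1400, save_steps=100)
    print(len(steps), steps[0], steps[-1])
```

Under that assumption, step 1400 is the last save point, which is consistent with the `step1400` suffix in the model name.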

Model size: 266k params (Safetensors)
Tensor type: BF16