dense_swe_100m_mult_reseg_ep20

This model is a fine-tuned version of an unspecified base model on the arrow dataset. It achieves the following results on the evaluation set:

  • Loss: 5.5959
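
For language models, an evaluation loss is often easier to interpret as a perplexity. The snippet below is a minimal sketch of that conversion, assuming the reported loss is a mean per-token cross-entropy in nats; the card does not state this, so treat the result (about 269) as indicative only.

```python
import math

# Final evaluation loss reported in this card.
EVAL_LOSS = 5.5959

# Assumption: the loss is mean per-token cross-entropy in nats,
# in which case perplexity is simply exp(loss).
perplexity = math.exp(EVAL_LOSS)

print(f"perplexity ~ {perplexity:.1f}")
```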

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0001
  • train_batch_size: 8
  • eval_batch_size: 32
  • seed: 42
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 32
  • optimizer: adamw_torch_fused with betas=(0.9, 0.999) and epsilon=1e-06; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 1331
  • training_steps: 13311
  • mixed_precision_training: Native AMP
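
The learning-rate settings above can be sketched as a small function. This is a minimal sketch, assuming the `linear` scheduler ramps linearly from 0 to the peak rate over the warmup steps and then decays linearly to 0 at the last training step; `linear_lr` is a hypothetical helper written for illustration, not a library function. The sketch also checks the reported total train batch size, assuming training ran on a single device.

```python
# Values copied from the hyperparameter list above.
PEAK_LR = 1e-4          # learning_rate
WARMUP_STEPS = 1331     # lr_scheduler_warmup_steps
TRAINING_STEPS = 13311  # training_steps
TRAIN_BATCH_SIZE = 8    # train_batch_size
GRAD_ACCUM_STEPS = 4    # gradient_accumulation_steps


def linear_lr(step: int) -> float:
    """Learning rate at a given optimizer step (hypothetical helper).

    Assumes linear warmup from 0 to PEAK_LR, then linear decay to 0.
    """
    if step < WARMUP_STEPS:
        return PEAK_LR * step / WARMUP_STEPS
    remaining = max(0, TRAINING_STEPS - step)
    return PEAK_LR * remaining / (TRAINING_STEPS - WARMUP_STEPS)


# Assumption: a single device, so no multiplication by a device count.
effective_batch_size = TRAIN_BATCH_SIZE * GRAD_ACCUM_STEPS

for step in (0, 665, 1331, 7321, 13311):
    print(step, f"{linear_lr(step):.2e}")
```

Under these assumptions, warmup covers roughly the first 10% of the 13311 training steps, and each optimizer step consumes 32 examples.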

Training results

| Training Loss | Epoch | Step | Validation Loss |
|---|---|---|---|
| 8.9334 | 0.7510 | 500 | 8.6265 |
| 7.5445 | 1.5017 | 1000 | 7.2155 |
| 6.466 | 2.2523 | 1500 | 6.1474 |
| 5.726 | 3.0030 | 2000 | 5.6561 |
| 5.4102 | 3.7540 | 2500 | 5.3874 |
| 5.0792 | 4.5047 | 3000 | 5.2244 |
| 4.9331 | 5.2554 | 3500 | 5.1198 |
| 4.7877 | 6.0060 | 4000 | 5.0532 |
| 4.5897 | 6.7570 | 4500 | 5.0004 |
| 4.3866 | 7.5077 | 5000 | 4.9941 |
| 4.318 | 8.2584 | 5500 | 5.0072 |
| 4.2382 | 9.0090 | 6000 | 5.0087 |
| 4.0438 | 9.7600 | 6500 | 5.0272 |
| 3.8705 | 10.5107 | 7000 | 5.0767 |
| 3.8325 | 11.2614 | 7500 | 5.1324 |
| 3.7636 | 12.0120 | 8000 | 5.1551 |
| 3.5985 | 12.7630 | 8500 | 5.2060 |
| 3.4497 | 13.5137 | 9000 | 5.2728 |
| 3.4188 | 14.2644 | 9500 | 5.3354 |
| 3.3734 | 15.0150 | 10000 | 5.3705 |
| 3.2303 | 15.7661 | 10500 | 5.4190 |
| 3.1182 | 16.5167 | 11000 | 5.4762 |
| 3.1019 | 17.2674 | 11500 | 5.5233 |
| 3.059 | 18.0180 | 12000 | 5.5469 |
| 2.9652 | 18.7691 | 12500 | 5.5735 |
| 2.9104 | 19.5197 | 13000 | 5.5938 |
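
In the results above, validation loss bottoms out well before the end of training while training loss keeps falling. The following minimal sketch locates that point from the logged values and measures the final gap between training and validation loss. The rows are copied from the table; the variable names are illustrative, and the script only analyzes the log, it does not reflect how checkpoints were actually selected for this model.

```python
# (step, training_loss, validation_loss) copied from the results table.
ROWS = [
    (500, 8.9334, 8.6265), (1000, 7.5445, 7.2155), (1500, 6.466, 6.1474),
    (2000, 5.726, 5.6561), (2500, 5.4102, 5.3874), (3000, 5.0792, 5.2244),
    (3500, 4.9331, 5.1198), (4000, 4.7877, 5.0532), (4500, 4.5897, 5.0004),
    (5000, 4.3866, 4.9941), (5500, 4.318, 5.0072), (6000, 4.2382, 5.0087),
    (6500, 4.0438, 5.0272), (7000, 3.8705, 5.0767), (7500, 3.8325, 5.1324),
    (8000, 3.7636, 5.1551), (8500, 3.5985, 5.2060), (9000, 3.4497, 5.2728),
    (9500, 3.4188, 5.3354), (10000, 3.3734, 5.3705), (10500, 3.2303, 5.4190),
    (11000, 3.1182, 5.4762), (11500, 3.1019, 5.5233), (12000, 3.059, 5.5469),
    (12500, 2.9652, 5.5735), (13000, 2.9104, 5.5938),
]

# Logged step with the lowest validation loss.
best_step, _, best_val = min(ROWS, key=lambda row: row[2])

# Gap between validation and training loss at the last logged step.
final_step, final_train, final_val = ROWS[-1]
gap = final_val - final_train

print(f"lowest validation loss {best_val} at step {best_step}")
print(f"final gap at step {final_step}: {gap:.4f}")
```

On these numbers the minimum falls at step 5000 (validation loss 4.9941), so a run with the same settings might benefit from early stopping or from keeping the checkpoint with the lowest validation loss.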

Framework versions

  • Transformers 4.57.1
  • Pytorch 2.9.0+cu128
  • Datasets 3.6.0
  • Tokenizers 0.22.1

Model size

  • Parameters: 0.2B (Safetensors)
  • Tensor type: F32