---
language:
- en
license: llama2
tags:
- code
- llama2
- full-fine-tuning
- mask-fine-tuning
- coding
datasets:
- tulu3_persona_python
- evol_code
- code_alpaca
base_model: meta-llama/Llama-2-7b-hf
---
# llama2-7b-coding-fft
This model is a **Full Fine-Tuned (FFT)** version of LLaMA2-7B on coding datasets, trained as part of replicating the [Mask Fine-Tuning (MFT) paper](https://arxiv.org/abs/2503.22764v1).
## Model Details
- **Base Model:** meta-llama/Llama-2-7b-hf
- **Training Type:** Full Fine-Tuning (FFT)
- **Domain:** Coding
- **Hardware:** TPU v4-8
- **Training Framework:** PyTorch + torch_xla
## Training Data
The model was trained on 30,000 samples from three coding datasets (matching the paper):
- **Tulu 3 Persona Python:** 10,000 samples
- **Evol CodeAlpaca:** 10,000 samples
- **Code-Alpaca:** 10,000 samples
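The 10k/10k/10k mixture above can be sketched as follows. This is an illustration under assumptions (the `mix_datasets` helper, field layout, and shuffling seed are hypothetical), not the exact preprocessing code from the repository:

```python
import random

def mix_datasets(datasets, per_source=10_000, seed=0):
    """Sample a fixed number of examples from each source and shuffle.

    `datasets` maps a source name to a list of examples; the real run
    drew 10,000 samples per source for a 30,000-sample mixture.
    """
    rng = random.Random(seed)
    mixture = []
    for name, examples in datasets.items():
        picked = rng.sample(examples, min(per_source, len(examples)))
        mixture.extend({"source": name, **ex} for ex in picked)
    rng.shuffle(mixture)
    return mixture

# Toy illustration with 3 examples per source instead of 10,000:
toy = {
    "tulu3_persona_python": [{"text": f"t{i}"} for i in range(5)],
    "evol_code": [{"text": f"e{i}"} for i in range(5)],
    "code_alpaca": [{"text": f"c{i}"} for i in range(5)],
}
mixed = mix_datasets(toy, per_source=3)
```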
## Training Configuration
- **Epochs:** 2
- **Sequence Length:** 4096
- **Learning Rate:** 2e-5
- **Batch Size:** 8 (effective)
- **Optimizer:** AdamW
- **LR Scheduler:** Linear with warmup
- **Mixed Precision:** bfloat16
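The optimizer and schedule above can be sketched like this. The tiny stand-in model and the warmup length are assumptions for illustration (the card does not state the warmup step count), not values from the actual run:

```python
import torch

# Tiny stand-in model; the real run optimized Llama-2-7B parameters.
model = torch.nn.Linear(16, 16)
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

total_steps = 7_500
warmup_steps = 100  # assumed; the card does not report the warmup length

def linear_warmup_then_decay(step):
    """LR multiplier: ramp 0 -> 1 over warmup, then decay linearly to 0."""
    if step < warmup_steps:
        return step / max(1, warmup_steps)
    return max(0.0, (total_steps - step) / max(1, total_steps - warmup_steps))

scheduler = torch.optim.lr_scheduler.LambdaLR(optimizer, linear_warmup_then_decay)

lrs = []
for _ in range(total_steps):
    optimizer.step()   # would follow loss.backward() in a real training loop
    scheduler.step()
    lrs.append(scheduler.get_last_lr()[0])
```

The learning rate peaks at 2e-5 after warmup and decays to zero by the final step.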
## Training Results
- **Final Loss:** 0.1535
- **Final Perplexity:** 1.1673
- **Training Time:** ~7 hours on TPU v4-8
- **Total Steps:** 7500
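The step count is consistent with the configuration above: 30,000 samples for 2 epochs at an effective batch size of 8 gives 7,500 optimizer steps.

```python
# Sanity check on the reported step count.
samples, epochs, effective_batch = 30_000, 2, 8
total_steps = samples * epochs // effective_batch
# 30,000 * 2 / 8 = 7,500 steps, matching the reported total
```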
### Loss Progression
- Epoch 0: 0.4259
- Epoch 1: 0.1535
## Intended Use
This model serves as the **FFT baseline** for the Mask Fine-Tuning paper replication. It will be evaluated on:
- **HumanEval** (code generation benchmark)
- **Target:** match the paper's FFT baseline of 29.3%
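HumanEval scores are conventionally reported as pass@k. As a reference for readers scoring their own generations, here is a minimal sketch of the standard unbiased estimator (from the Codex paper that introduced HumanEval); the concrete numbers in the comment are illustrative arithmetic, not results:

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k: probability that at least one of k samples
    drawn from n generations per task (c of them correct) passes.

    pass@k = 1 - C(n-c, k) / C(n, k)
    """
    if n - c < k:
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)

# With one sample per task, pass@1 is just the fraction of tasks solved;
# if 29.3% is pass@1, that is roughly 48 of HumanEval's 164 tasks.
```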
## Evaluation
Evaluation on HumanEval is pending. Results will be updated here once available.
## Citation
If you use this model, please cite the original MFT paper:
```bibtex
@article{mft2025,
  title={Mask Fine-Tuning},
  author={[Authors from paper]},
  journal={arXiv preprint arXiv:2503.22764v1},
  year={2025}
}
```
## Reproducibility
Training configuration and code available at: [GitHub Repository](https://github.com/chrisfrancisque/mft-tpu)
## License
This model inherits the LLaMA 2 Community License from the base model.