---
language:
- en
license: llama2
tags:
- code
- llama2
- full-fine-tuning
- mask-fine-tuning
- coding
datasets:
- tulu3_persona_python
- evol_code
- code_alpaca
base_model: meta-llama/Llama-2-7b-hf
---

# llama2-7b-coding-fft

This model is a **full fine-tuned (FFT)** version of Llama 2 7B on coding datasets, trained as part of a replication of the [Mask Fine-Tuning (MFT) paper](https://arxiv.org/abs/2503.22764v1).

## Model Details

- **Base Model:** meta-llama/Llama-2-7b-hf
- **Training Type:** Full Fine-Tuning (FFT)
- **Domain:** Coding
- **Hardware:** TPU v4-8
- **Training Framework:** PyTorch + torch_xla

## Training Data

The model was trained on 30,000 samples from three coding datasets (matching the paper):
- **Tulu 3 Persona Python:** 10,000 samples
- **Evol CodeAlpaca:** 10,000 samples
- **Code-Alpaca:** 10,000 samples
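
The equal-sized mixture above can be sketched as follows. This is illustrative only; the function name and structure are assumptions, and the real preprocessing lives in the linked mft-tpu repository:

```python
import random

def build_mixture(sources: dict, per_source: int = 10_000, seed: int = 0) -> list:
    """Sample an equal number of examples from each source dataset and
    shuffle the combined pool (illustrative sketch, not the actual pipeline)."""
    rng = random.Random(seed)
    mixture = []
    for name, examples in sources.items():
        k = min(per_source, len(examples))
        mixture.extend(rng.sample(examples, k))
    rng.shuffle(mixture)
    return mixture
```

With the three datasets above and `per_source=10_000`, this yields the 30,000-sample training pool.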

## Training Configuration

- **Epochs:** 2
- **Sequence Length:** 4096
- **Learning Rate:** 2e-5
- **Batch Size:** 8 (effective)
- **Optimizer:** AdamW
- **LR Scheduler:** Linear with warmup
- **Mixed Precision:** bfloat16
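
For reference, the hyperparameters above gathered into one place (field names are illustrative; the actual torch_xla training script may organize them differently):

```python
# Hyperparameters from the table above; names are illustrative,
# not necessarily those used in the training script.
TRAIN_CONFIG = {
    "base_model": "meta-llama/Llama-2-7b-hf",
    "num_epochs": 2,
    "max_seq_length": 4096,
    "learning_rate": 2e-5,
    "effective_batch_size": 8,
    "optimizer": "AdamW",
    "lr_scheduler": "linear_with_warmup",
    "mixed_precision": "bfloat16",
}
```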

## Training Results

- **Final Loss:** 0.1535
- **Final Perplexity:** 1.167
- **Training Time:** ~7 hours on TPU v4-8
- **Total Steps:** 7500

### Loss Progression
- Epoch 0: 0.4259
- Epoch 1: 0.1535
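
As a sanity check, perplexity is the exponential of the mean cross-entropy loss. Applying that to the final loss gives roughly 1.166, close to the reported value; the small gap would be consistent with perplexity having been averaged per step rather than computed from the mean loss:

```python
import math

# Perplexity = exp(mean cross-entropy loss).
final_loss = 0.1535  # final training loss reported above (rounded)
perplexity = math.exp(final_loss)
print(f"{perplexity:.3f}")  # ~1.166
```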

## Intended Use

This model serves as the **FFT baseline** for the Mask Fine-Tuning paper replication. It will be evaluated on:
- **HumanEval** (code generation benchmark)
- **Target:** match the paper's reported FFT baseline of 29.3%

## Evaluation

Evaluation on HumanEval is pending. Results will be updated here once available.
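
HumanEval scores are typically reported as pass@1 using the unbiased pass@k estimator from the original HumanEval paper (Chen et al., 2021). A minimal sketch of that metric, not the evaluation harness used here:

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator: 1 - C(n - c, k) / C(n, k),
    where n = samples generated per problem and c = samples
    that pass the problem's unit tests."""
    if n - c < k:
        return 1.0  # every size-k subset contains a passing sample
    return 1.0 - comb(n - c, k) / comb(n, k)

# e.g. 10 samples per problem, 3 passing -> pass@1 estimate of 0.3
print(round(pass_at_k(10, 3, 1), 4))  # 0.3
```

The benchmark score is then the mean of this estimate over all 164 HumanEval problems.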

## Citation

If you use this model, please cite the original MFT paper:

```bibtex
@article{mft2025,
  title={Mask Fine-Tuning},
  author={[Authors from paper]},
  journal={arXiv preprint arXiv:2503.22764v1},
  year={2025}
}
```

## Reproducibility

Training configuration and code available at: [GitHub Repository](https://github.com/chrisfrancisque/mft-tpu)

## License

This model inherits the LLaMA 2 Community License from the base model.