---
license: openrail
library_name: transformers
tags:
- code
datasets:
- codeparrot/github-jupyter-code-to-text
base_model: bigcode/santacoder
---
# SantaCoder code-to-text
This model is a fine-tuned version of [bigcode/santacoder](https://huggingface.co/bigcode/santacoder) on the
[codeparrot/github-jupyter-code-to-text](https://huggingface.co/datasets/codeparrot/github-jupyter-code-to-text) dataset.
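A minimal usage sketch with the `transformers` library is shown below. The checkpoint id is a placeholder for this repository's name, and the plain-code prompt is an assumption, since the card does not document the exact prompt format used during fine-tuning.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder: replace with this model's repository id on the Hub.
checkpoint = "<this-repo>"
device = "cuda"  # or "cpu"

tokenizer = AutoTokenizer.from_pretrained(checkpoint)
# SantaCoder ships custom modeling code, hence trust_remote_code=True.
model = AutoModelForCausalLM.from_pretrained(
    checkpoint, trust_remote_code=True
).to(device)

# Feed in a code snippet and let the model generate a text description.
code = "def print_hello_world():\n    print('Hello World!')\n"
inputs = tokenizer(code, return_tensors="pt").to(device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```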
## Training procedure
The model was trained for 3 hours on 4 A100 GPUs with the following hyperparameters (a sketch of these settings as `TrainingArguments` follows the list):
- learning_rate: 5e-05
- train_batch_size: 2
- eval_batch_size: 2
- seed: 42
- gradient_accumulation_steps: 4
- total_train_batch_size: 4
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_steps: 100
- training_steps: 800
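For reference, a minimal sketch of how these hyperparameters map onto `transformers.TrainingArguments`. The `output_dir` is a placeholder, the actual training script is not part of this card, and `Trainer` pairs these Adam settings with its default AdamW implementation.

```python
from transformers import TrainingArguments

# Hypothetical mapping of the hyperparameters listed above.
training_args = TrainingArguments(
    output_dir="santacoder-code-to-text",  # placeholder
    learning_rate=5e-5,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=2,
    seed=42,
    gradient_accumulation_steps=4,
    adam_beta1=0.9,              # Adam betas=(0.9, 0.999)
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    warmup_steps=100,
    max_steps=800,               # training_steps
)
```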