---
license: cc-by-4.0
language:
- en
library_name: transformers
pipeline_tag: text-classification
tags:
- code
metrics:
- accuracy
- f1
---

# CodeBERT-SO

Repository for CodeBERT fine-tuned on Stack Overflow snippets, covering NL-PL pairs in six languages (Python, Java, JavaScript, PHP, Ruby, Go).

## Training Objective

This model is initialized with [CodeBERT-base](https://huggingface.co/microsoft/codebert-base) and trained to classify whether a user will drop out, given their posts and code snippets.

## Training Regime

Training ran for 8 epochs with a batch size of 8, a learning rate of 1e-5, and an optimizer epsilon (the denominator term in the weight update) of 1e-8. A random 20% sample of the full dataset was held out as the validation set.

## Performance

* Final validation accuracy: 0.822
* Final validation F1: 0.809
* Final validation loss: 0.5
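
The random 20% validation split described under Training Regime can be sketched as below. This is an illustrative sketch, not the actual preprocessing code: the toy dataset, the `(post, snippet, label)` triple format, and the seed are all assumptions.

```python
import random

def train_val_split(examples, val_fraction=0.2, seed=0):
    """Randomly hold out a fraction of examples as the validation set."""
    rng = random.Random(seed)  # illustrative fixed seed for reproducibility
    indices = list(range(len(examples)))
    rng.shuffle(indices)
    n_val = int(len(examples) * val_fraction)
    val_idx = set(indices[:n_val])
    train = [ex for i, ex in enumerate(examples) if i not in val_idx]
    val = [ex for i, ex in enumerate(examples) if i in val_idx]
    return train, val

# Hypothetical toy dataset of (post_text, code_snippet, dropout_label) triples
dataset = [(f"post {i}", f"code {i}", i % 2) for i in range(100)]
train_set, val_set = train_val_split(dataset)
```

With 100 examples, this yields an 80-example training set and a 20-example validation set, matching the 80/20 split used for training.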