---
language: en
tags:
- code
- algorithms
- competitive-programming
- multi-label-classification
- codebert
datasets:
- xCodeEval
metrics:
- f1
- precision
- recall
library_name: transformers
pipeline_tag: text-classification
---
# CodeBERT Algorithm Tagger

A fine-tuned CodeBERT model for multi-label classification of algorithmic problems from competitive programming platforms like Codeforces.

## Model Description

This model predicts algorithmic tags/categories for competitive programming problems based on their problem descriptions and solution code. It is fine-tuned from [microsoft/codebert-base](https://huggingface.co/microsoft/codebert-base).

**Supported Tags:**
- math
- graphs
- strings
- number theory
- trees
- geometry
- games
- probabilities
## Training Data

- **Dataset**: xCodeEval (Codeforces problems)
- **Training examples**: 2,147 problems (filtered to the eight supported tags)
- **Test examples**: 531 problems
- **Source**: Problems and solutions from the Codeforces platform
### Model Architecture

- **Input**: Concatenated problem description and solution code
- **Encoder**: CodeBERT (RoBERTa-based architecture)
- **Output**: 8-dimensional binary classification (one logit per tag)
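A minimal sketch of the input construction described above, assuming the description and solution code are joined with RoBERTa's separator token before tokenization (the exact joining scheme used during fine-tuning is an assumption):

```python
# Join problem description and solution code into a single input sequence.
# The separator choice mirrors RoBERTa/CodeBERT's special token; the exact
# format used in training is an assumption for illustration.
SEP = "</s>"

def build_input(description: str, code: str) -> str:
    """Concatenate problem description and solution code into one sequence."""
    return f"{description} {SEP} {code}"

example = build_input("Find the GCD of two integers.", "print(math.gcd(a, b))")
print(example)
```

The combined sequence is then tokenized and truncated to the encoder's maximum length as usual.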
## Usage

### Installation

```bash
pip install transformers torch
```