---
license: mit
metrics:
- accuracy
- f1
- precision
- recall
base_model:
- microsoft/unixcoder-base
library_name: transformers
tags:
- detection
- AI-generated
- transformers
- bert
---
## Task Overview
The rapid advancement of generative models has made it increasingly challenging to distinguish machine-generated code from human-written code, particularly across different programming languages, domains, and generation techniques.
SemEval-2026 Task 13 focuses on developing systems capable of detecting machine-generated code under diverse conditions. The evaluation emphasizes generalization to unseen programming languages, generator families, and application scenarios.
The task is divided into three subtasks.

---
### Subtask A: Binary Machine-Generated Code Detection
**Goal:**
Given a code snippet, determine whether it is:
- Fully human-written, or
- Fully machine-generated
**Training Languages:** C++, Python, Java
**Training Domain:** Algorithmic (e.g., LeetCode-style problems)
**Evaluation Settings:**
| Setting                               | Language           | Domain               |
|---------------------------------------|--------------------|----------------------|
| (i) Seen Languages & Seen Domains     | C++, Python, Java  | Algorithmic          |
| (ii) Unseen Languages & Seen Domains  | Go, PHP, C#, C, JS | Algorithmic          |
| (iii) Seen Languages & Unseen Domains | C++, Python, Java  | Research, Production |
| (iv) Unseen Languages & Domains       | Go, PHP, C#, C, JS | Research, Production |
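
The four settings reduce to two yes/no questions: was the language seen in training, and was the domain seen in training. A minimal sketch of that mapping follows. The helper name `evaluation_setting` is hypothetical, and it assumes a domain string is available for each snippet, which the data fields listed in this card do not include.

```python
# Languages and domain seen during training, as listed in this card.
SEEN_LANGUAGES = {"C++", "Python", "Java"}
SEEN_DOMAINS = {"Algorithmic"}


def evaluation_setting(language: str, domain: str) -> str:
    """Return the setting label ("i" to "iv") for a language/domain pair.

    Hypothetical helper: anything outside the seen sets counts as unseen.
    """
    seen_language = language in SEEN_LANGUAGES
    seen_domain = domain in SEEN_DOMAINS
    if seen_language and seen_domain:
        return "i"
    if not seen_language and seen_domain:
        return "ii"
    if seen_language and not seen_domain:
        return "iii"
    return "iv"
```

Bucketing validation predictions this way can help show which kind of generalization (language or domain) a model struggles with, provided domain annotations are available.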
**Dataset Size:**
- Train: 500,000 samples (238,000 human-written, 262,000 machine-generated)
- Validation: 100,000 samples
**Data Format:**
Each record in the provided datasets contains the following fields:
- `code`: The code snippet
- `label`: Binary label (0 for human-written, 1 for machine-generated)
- `language`: Programming language of the snippet
Label mappings are provided in `task_A/label_to_id.json` and `task_A/id_to_label.json`.
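
As a rough illustration of the record layout above, the following sketch checks that a record has the three listed fields and a binary label. The function name `validate_sample` and the in-memory dict representation are assumptions for illustration; the actual file format of the datasets is not specified here.

```python
# Field names taken from the data format described above.
REQUIRED_FIELDS = ("code", "label", "language")


def validate_sample(sample: dict) -> None:
    """Raise ValueError if a record does not match the described format.

    Hypothetical helper: assumes each record has been loaded as a dict.
    """
    missing = [field for field in REQUIRED_FIELDS if field not in sample]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    if sample["label"] not in (0, 1):
        raise ValueError(f"label must be 0 or 1, got {sample['label']!r}")


# Example record: 0 = human-written, 1 = machine-generated.
validate_sample({"code": "print('hi')", "label": 1, "language": "Python"})
```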
**Evaluation Metric:**
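The primary metric for Subtask A is the macro-averaged F1-score: the unweighted mean of the per-class F1-scores, so the human-written and machine-generated classes count equally regardless of how many samples each has.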
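
To make the metric concrete, here is a small dependency-free sketch of macro F1 for the two labels. It is not the official scorer. Returning 0.0 for a class with no true or predicted samples is a convention assumed here.

```python
def f1_for_class(y_true, y_pred, cls):
    """F1 for one class, written as 2*TP / (2*TP + FP + FN)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == cls and p == cls)
    fp = sum(1 for t, p in pairs if t != cls and p == cls)
    fn = sum(1 for t, p in pairs if t == cls and p != cls)
    denom = 2 * tp + fp + fn
    # Assumed convention: an empty class contributes 0.0 instead of failing.
    return 2 * tp / denom if denom else 0.0


def macro_f1(y_true, y_pred, classes=(0, 1)):
    """Unweighted mean of the per-class F1 scores."""
    return sum(f1_for_class(y_true, y_pred, c) for c in classes) / len(classes)


# Class 0: F1 = 2/3, class 1: F1 = 4/5, so macro F1 = 11/15.
score = macro_f1([0, 0, 1, 1], [0, 1, 1, 1])
```

Because each class contributes half of the score, a model that predicts only the majority class is penalized even when its accuracy looks acceptable.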
**Submission Format:**
Participants must submit a `.csv` file containing:
- `id`: Unique identifier for each code snippet
- `label`: Predicted label (0 or 1)
A sample submission file is available in the `task_A/` directory.
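
A submission with these two columns could be produced along the following lines. This is a sketch under assumptions: the function name `format_submission` is hypothetical, and the header row and column order are inferred from the field list above, so the sample submission file remains the reference.

```python
import csv
import io


def format_submission(ids, labels):
    """Return CSV text with an `id,label` header and one row per snippet.

    Hypothetical helper: header and column order are assumptions.
    """
    if len(ids) != len(labels):
        raise ValueError("ids and labels must have the same length")
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["id", "label"])
    for sample_id, label in zip(ids, labels):
        if label not in (0, 1):
            raise ValueError(f"label must be 0 or 1, got {label!r}")
        writer.writerow([sample_id, int(label)])
    return buffer.getvalue()


# Write the text to disk, e.g. open("submission.csv", "w").write(csv_text).
csv_text = format_submission([10, 11], [0, 1])
```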
**Baseline Models:**
Baseline implementations are provided in the `baselines/` directory, including starter code and pre-trained checkpoints for models such as GraphCodeBERT and UniXcoder.
**Restrictions:**
- No external training data may be used; only the provided datasets are allowed.
- Specialized AI-generated code detectors are not permitted. General-purpose code models (e.g., CodeBERT, StarCoder) are allowed.