---
license: mit
metrics:
- accuracy
- f1
- precision
- recall
base_model:
- microsoft/unixcoder-base
library_name: transformers
tags:
- detection
- AI-generated
- transformers
- bert
---

## Task Overview

The rapid advancement of generative models has made it increasingly challenging to distinguish machine-generated code from human-written code, particularly across different programming languages, domains, and generation techniques.

SemEval-2026 Task 13 focuses on developing systems capable of detecting machine-generated code under diverse conditions. The evaluation emphasizes generalization to unseen programming languages, generator families, and application scenarios.

The task is divided into three subtasks.

---

### Subtask A: Binary Machine-Generated Code Detection

**Goal:** Given a code snippet, determine whether it is:

- Fully human-written, or
- Fully machine-generated

**Training Languages:** C++, Python, Java

**Training Domain:** Algorithmic (e.g., LeetCode-style problems)

**Evaluation Settings:**

| Setting                               | Language           | Domain               |
|---------------------------------------|--------------------|----------------------|
| (i) Seen Languages & Seen Domains     | C++, Python, Java  | Algorithmic          |
| (ii) Unseen Languages & Seen Domains  | Go, PHP, C#, C, JS | Algorithmic          |
| (iii) Seen Languages & Unseen Domains | C++, Python, Java  | Research, Production |
| (iv) Unseen Languages & Domains       | Go, PHP, C#, C, JS | Research, Production |

**Dataset Size:**

- Train: 500,000 samples (238,000 human-written, 262,000 machine-generated)
- Validation: 100,000 samples

**Data Format:** Each dataset includes the following fields:

- `code`: The code snippet
- `label`: Binary label (0 for human-written, 1 for machine-generated)
- `language`: Programming language of the snippet

Label mappings are provided in `task_A/label_to_id.json` and `task_A/id_to_label.json`.

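
As a rough illustration of the data format described above, the sketch below checks that each record carries the three documented fields and tallies labels per language. The records are made-up placeholders, and `ID_TO_LABEL`, `REQUIRED_FIELDS`, and `summarize` are hypothetical names introduced here; the actual label strings in `task_A/id_to_label.json` may differ.

```python
from collections import Counter

# Assumed mapping based on the label description above; the real
# task_A/id_to_label.json may use different label strings.
ID_TO_LABEL = {0: "human-written", 1: "machine-generated"}

# The three fields documented for each dataset record.
REQUIRED_FIELDS = ("code", "label", "language")


def summarize(records):
    """Validate records and count (language, label name) pairs."""
    counts = Counter()
    for i, rec in enumerate(records):
        missing = [f for f in REQUIRED_FIELDS if f not in rec]
        if missing:
            raise ValueError(f"record {i} is missing fields: {missing}")
        if rec["label"] not in ID_TO_LABEL:
            raise ValueError(f"record {i} has unknown label: {rec['label']!r}")
        counts[(rec["language"], ID_TO_LABEL[rec["label"]])] += 1
    return counts


# Placeholder records for illustration only, not taken from the dataset.
sample = [
    {"code": "print('hi')", "label": 0, "language": "Python"},
    {"code": "int main() { return 0; }", "label": 1, "language": "C++"},
    {"code": "x = 1", "label": 1, "language": "Python"},
]

for (language, label_name), n in sorted(summarize(sample).items()):
    print(language, label_name, n)
```

A per-language tally like this is one simple way to spot class imbalance before training, since the training split is not perfectly balanced.
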
**Evaluation Metric:** The primary metric for Subtask A is Macro F1-score, ensuring balanced performance across both classes.

**Submission Format:** Participants must submit a `.csv` file containing:

- `id`: Unique identifier for each code snippet
- `label`: Predicted label (0 or 1)

A sample submission file is available in the `task_A/` directory.

**Baseline Models:** Baseline implementations are provided in the `baselines/` directory, including starter code and pre-trained checkpoints for models such as GraphCodeBERT and UniXcoder.

**Restrictions:**

- No external training data may be used; only the provided datasets are allowed.
- Specialized AI-generated code detectors are not permitted. General-purpose code models (e.g., CodeBERT, StarCoder) are allowed.
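
The Macro F1 metric and the submission format described above can be sketched in plain Python as follows. This is a minimal illustration, not the official scorer: `f1_for_class`, `macro_f1`, and `write_submission` are hypothetical helpers, treating an undefined per-class F1 as 0.0 is an assumption, and the `id,label` column order is assumed, so it should be checked against the sample submission file in `task_A/`.

```python
import csv
import io


def f1_for_class(y_true, y_pred, cls):
    """F1 for one class, computed as 2*TP / (2*TP + FP + FN)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == cls and p == cls)
    fp = sum(1 for t, p in pairs if t != cls and p == cls)
    fn = sum(1 for t, p in pairs if t == cls and p != cls)
    denom = 2 * tp + fp + fn
    # Assumption: a class with no true or predicted samples scores 0.0.
    return 2 * tp / denom if denom else 0.0


def macro_f1(y_true, y_pred, classes=(0, 1)):
    """Unweighted mean of per-class F1, so both classes count equally."""
    return sum(f1_for_class(y_true, y_pred, c) for c in classes) / len(classes)


def write_submission(ids, labels, fileobj):
    """Write predictions as CSV rows with an assumed `id,label` header."""
    writer = csv.writer(fileobj, lineterminator="\n")
    writer.writerow(["id", "label"])
    for sample_id, label in zip(ids, labels):
        writer.writerow([sample_id, label])


# Toy labels for illustration only.
y_true = [0, 0, 1, 1]
y_pred = [0, 1, 1, 1]
print(round(macro_f1(y_true, y_pred), 4))

buffer = io.StringIO()
write_submission([101, 102, 103, 104], y_pred, buffer)
print(buffer.getvalue())
```

In the toy example, class 0 has an F1 of 2/3 and class 1 has an F1 of 4/5, so the macro average is about 0.7333. Because each class contributes equally, a detector that favors the majority class is penalized more than it would be under plain accuracy.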