metadata
title: Code Comment Classification API
emoji: 📚
colorFrom: indigo
colorTo: yellow
sdk: docker
pinned: false
license: mit
short_description: Multi-label classification of code-comment sentences
Overview
This repository implements an end-to-end pipeline that classifies code-comment sentences into language-specific categories and aggregates the results at file and PR level, so reviewers can focus on rationale, usage notes, deprecations, examples, and other high-value signals. The project aims to surpass the NLBSE’26 baselines and provides reproducible training, evaluation, and inference.
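As a rough illustration of the file-level aggregation step, the sketch below counts how often each predicted category appears per file. It is a minimal example under stated assumptions, not the repository's actual code: the function name `aggregate_by_file`, the tuple layout, the file paths, and the label names are all hypothetical.

```python
from collections import Counter, defaultdict


def aggregate_by_file(predictions):
    """Count predicted labels per file.

    `predictions` is an iterable of (file_path, sentence, labels) tuples,
    where `labels` is the list of categories predicted for that sentence.
    This layout is an assumption made for the example.
    """
    summary = defaultdict(Counter)
    for file_path, _sentence, labels in predictions:
        # Counter.update adds one count for each label in the list.
        summary[file_path].update(labels)
    return {path: dict(counts) for path, counts in summary.items()}


# Hypothetical sentence-level predictions for two files.
preds = [
    ("src/Foo.java", "Use bar() instead.", ["deprecation", "usage"]),
    ("src/Foo.java", "Returns the cached value.", ["summary"]),
    ("src/Baz.java", "Kept for backward compatibility.", ["rationale"]),
]
file_summary = aggregate_by_file(preds)
```

A PR-level summary could be built the same way by keying on a PR identifier instead of the file path.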
Core choices:
- Task: multi-label text classification at sentence level
- Scope: three languages with per-language models (Java, Python, Pharo)
- Usage: batch predictions on submissions (pre-review), summaries per file/PR
- Human-in-the-loop: reviewer confirmations/overrides feed threshold recalibration
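The multi-label and human-in-the-loop choices above can be sketched as follows: each label has its own decision threshold, and reviewer confirmations or overrides are used to pick a new threshold for a label. This is only one plausible scheme (a grid search that maximizes F1 on the reviewed examples); the source does not specify the recalibration method. The names `apply_thresholds` and `recalibrate_threshold`, the candidate grid, the default of 0.5, and all scores are assumptions for illustration.

```python
def apply_thresholds(scores, thresholds, default=0.5):
    """Return every label whose score reaches its per-label threshold.

    Multi-label: zero, one, or several labels may be returned.
    """
    return [
        label
        for label, score in scores.items()
        if score >= thresholds.get(label, default)
    ]


def recalibrate_threshold(feedback, candidates=(0.3, 0.4, 0.5, 0.6, 0.7)):
    """Pick the candidate threshold with the best F1 on reviewer feedback.

    `feedback` is a list of (model_score, reviewer_says_positive) pairs
    for a single label. Ties go to the lowest candidate.
    """
    def f1(threshold):
        tp = sum(1 for s, y in feedback if s >= threshold and y)
        fp = sum(1 for s, y in feedback if s >= threshold and not y)
        fn = sum(1 for s, y in feedback if s < threshold and y)
        return 2 * tp / (2 * tp + fp + fn) if tp else 0.0

    return max(candidates, key=f1)


# Hypothetical reviewer feedback for one label ("deprecation").
deprecation_feedback = [
    (0.90, True), (0.65, True), (0.55, False), (0.45, False), (0.35, False),
]
thresholds = {"usage": 0.5}
thresholds["deprecation"] = recalibrate_threshold(deprecation_feedback)

# Hypothetical model scores for one comment sentence.
labels = apply_thresholds({"usage": 0.62, "deprecation": 0.58}, thresholds)
```

Keeping thresholds per label, and per language model, lets a category that reviewers often override be tightened without affecting the others.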