title: Code Comment Classification Api
emoji: 📚
colorFrom: indigo
colorTo: yellow
sdk: docker
pinned: false
license: mit
short_description: Multi-label classification of code-comment sentences

Overview

This repository implements an end-to-end pipeline that classifies comment sentences into language-specific categories and aggregates the results at file and PR level, so reviewers can focus on rationale, usage notes, deprecations, examples, and other high-value signals. The project aims to surpass the NLBSE’26 baselines and provides reproducible training, evaluation, and inference.

Core choices:

  • Task: multi-label text classification at sentence level
  • Scope: three languages with per-language models (Java, Python, Pharo)
  • Usage: batch predictions on submissions (pre-review), summaries per file/PR
  • Human-in-the-loop: reviewer confirmations/overrides feed threshold recalibration
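
The choices above can be illustrated with a minimal sketch of the multi-label decision and the per-file summary. It is not the project's implementation: the label names, the threshold values, the score dictionaries, and the helpers `to_labels` and `summarize_by_file` are all hypothetical, and the scores stand in for whatever a per-language model would output.

```python
from collections import Counter, defaultdict

# Hypothetical per-label thresholds (illustrative values, not taken from the project).
# In a human-in-the-loop setup these are the values that recalibration would adjust.
THRESHOLDS = {"rationale": 0.5, "usage": 0.5, "deprecation": 0.4}


def to_labels(scores, thresholds=THRESHOLDS):
    """Multi-label decision: keep every label whose score reaches its threshold."""
    return sorted(
        label for label, p in scores.items() if p >= thresholds.get(label, 0.5)
    )


def summarize_by_file(predictions):
    """Count predicted labels per file from (file_path, scores) pairs."""
    summary = defaultdict(Counter)
    for file_path, scores in predictions:
        summary[file_path].update(to_labels(scores))
    return {path: dict(counts) for path, counts in summary.items()}


# Made-up model scores for three comment sentences in two files.
predictions = [
    ("Foo.java", {"rationale": 0.81, "usage": 0.62, "deprecation": 0.05}),
    ("Foo.java", {"rationale": 0.10, "usage": 0.55, "deprecation": 0.45}),
    ("bar.py", {"rationale": 0.30, "usage": 0.20, "deprecation": 0.10}),
]
print(summarize_by_file(predictions))
```

Because each label has its own threshold, a sentence can receive several labels or none, which is what distinguishes this setup from single-label classification. A file whose sentences trigger no label still appears in the summary with an empty count.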