
PR Code Quality Scorer

A classifier for code and comment quality (e.g. in pull requests or standalone snippets). It predicts a quality score or label (e.g. good / needs improvement) from code text using a transformer-based encoder.

Overview

Intended for integration into code review workflows: given a code block or diff, the model outputs a quality category or score. A Gradio Space is provided for a quick try-out.

Model

  • Uses a small Hugging Face encoder (e.g. microsoft/codebert-base) to encode the snippet, followed by a linear head for classification.
  • Labels can be binary (good/bad) or ordinal; the training data format is described in train.py. Inference logic lives in inference.py, with an optional checkpoint at ./checkpoints/code_quality.pt.
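The encoder-plus-linear-head design above can be sketched as follows. This is an illustrative sketch only: the class name CodeQualityClassifier, the label mapping, and the use of the first-token representation are assumptions for illustration; the repo's actual implementation lives in train.py and inference.py and may differ.

```python
import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer

MODEL_NAME = "microsoft/codebert-base"

class CodeQualityClassifier(nn.Module):
    """Transformer encoder with a linear classification head (sketch)."""

    def __init__(self, num_labels: int = 2):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(MODEL_NAME)
        # Linear head on top of the first-token (CLS-style) representation.
        self.head = nn.Linear(self.encoder.config.hidden_size, num_labels)

    def forward(self, input_ids, attention_mask):
        out = self.encoder(input_ids=input_ids, attention_mask=attention_mask)
        cls = out.last_hidden_state[:, 0]  # pooled snippet representation
        return self.head(cls)

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = CodeQualityClassifier(num_labels=2)
model.eval()

batch = tokenizer(["def add(a, b): return a + b"],
                  return_tensors="pt", truncation=True, max_length=512)
with torch.no_grad():
    logits = model(batch["input_ids"], batch["attention_mask"])
# Hypothetical label mapping; the head here is untrained, so in practice
# you would first load a checkpoint such as ./checkpoints/code_quality.pt.
label = ["needs improvement", "good"][logits.argmax(dim=-1).item()]
```

With a trained checkpoint, `model.load_state_dict(torch.load(path))` would be called before inference.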

Usage

Inference / demo:

pip install -r requirements.txt
python app.py

Training (if you have labeled data): adapt train.py to your dataset and run:

python train.py
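If you want a sense of what adapting train.py might involve, here is a minimal fine-tuning sketch, assuming labeled data as (snippet, label) pairs with 0 = needs improvement and 1 = good. The data, hyperparameters, and label scheme are illustrative assumptions, not the repo's actual configuration; train.py defines the real data format.

```python
import os
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

MODEL_NAME = "microsoft/codebert-base"
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
# AutoModelForSequenceClassification adds a fresh classification head
# on top of the pretrained encoder.
model = AutoModelForSequenceClassification.from_pretrained(
    MODEL_NAME, num_labels=2)

# Hypothetical labeled examples: 0 = needs improvement, 1 = good.
data = [
    ("def add(a, b):\n    return a + b", 1),
    ("def f(x): y=x;  y = y ; return y", 0),
]

optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
model.train()
for code, label in data:
    batch = tokenizer(code, return_tensors="pt",
                      truncation=True, max_length=512)
    out = model(**batch, labels=torch.tensor([label]))
    out.loss.backward()          # cross-entropy loss from the model head
    optimizer.step()
    optimizer.zero_grad()

# Save in the location the card mentions for the optional checkpoint.
os.makedirs("checkpoints", exist_ok=True)
torch.save(model.state_dict(), "checkpoints/code_quality.pt")
```

In practice you would batch the data, add a validation split, and train for several epochs; this sketch only shows the shape of the loop.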

Limitations / future work

  • Quality is subjective; the model reflects the labeling scheme of the training data.
  • Could be extended to multi-label (readability, security, style) or regression scores.

Author

Alireza Aminzadeh
