CodeT5-Small: Code Comment Generator

Salesforce/codet5-small fine-tuned on a filtered subset of CodeSearchNet to generate natural-language comments and docstrings from source code.

Usage

from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("melfatihomran/codet5-small-code-comment-gen")
model = AutoModelForSeq2SeqLM.from_pretrained("melfatihomran/codet5-small-code-comment-gen")
code = "def add(a, b):\n    return a + b"
inputs = tokenizer(code, return_tensors="pt")
output = model.generate(**inputs, max_length=64, num_beams=4)
print(tokenizer.decode(output[0], skip_special_tokens=True))
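To comment several snippets at once, the same calls can be wrapped in a small batching helper. This is a minimal sketch, not part of the published model card: `generate_comments`, `batch_size`, and the 512-token input cap are hypothetical names and values chosen for illustration, while `max_length=64` and `num_beams=4` are taken from the example above.

```python
from typing import Any, List


def generate_comments(
    snippets: List[str],
    tokenizer: Any,
    model: Any,
    batch_size: int = 8,          # hypothetical default, not from the card
    max_input_tokens: int = 512,  # assumed input cap, not stated in the card
) -> List[str]:
    """Generate one comment per code snippet, processing them in batches."""
    comments: List[str] = []
    for start in range(0, len(snippets), batch_size):
        batch = snippets[start:start + batch_size]
        # Pad to the longest snippet in the batch and truncate long inputs.
        inputs = tokenizer(
            batch,
            return_tensors="pt",
            padding=True,
            truncation=True,
            max_length=max_input_tokens,
        )
        # Same decoding settings as the single-snippet example.
        outputs = model.generate(**inputs, max_length=64, num_beams=4)
        comments.extend(tokenizer.batch_decode(outputs, skip_special_tokens=True))
    return comments
```

With a tokenizer and model loaded via `from_pretrained`, `generate_comments([code], tokenizer, model)` returns a list containing one generated string per input snippet.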

Training

| Parameter | Value |
|---|---|
| Base model | Salesforce/codet5-small |
| Dataset | sentence-transformers/codesearchnet (pair) |
| Train / Val / Test | 8,000 / 1,000 / 1,000 |
| Epochs | 5 |
| Learning rate | 5e-5 |
| Batch size | 8 |
| Precision | fp16 (GPU) |
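The training settings imply a rough size for the run. The arithmetic below is a sketch under assumptions the card does not state: that the batch size of 8 is the effective batch size (one device, no gradient accumulation) and that the last partial batch, if any, is kept.

```python
import math

# Values copied from the training table.
TRAIN_EXAMPLES = 8_000
EPOCHS = 5
BATCH_SIZE = 8

# Assumptions (not stated in the card): single device, no gradient accumulation.
NUM_DEVICES = 1
GRAD_ACCUM_STEPS = 1

effective_batch = BATCH_SIZE * NUM_DEVICES * GRAD_ACCUM_STEPS
steps_per_epoch = math.ceil(TRAIN_EXAMPLES / effective_batch)
total_steps = steps_per_epoch * EPOCHS

print(f"steps per epoch: {steps_per_epoch}")    # 1000
print(f"total optimizer steps: {total_steps}")  # 5000
```

Under those assumptions the run takes 1,000 optimizer steps per epoch and 5,000 in total. A different device count or accumulation setting would change both numbers.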

Results

| Metric | Score |
|---|---|
| BLEU | 19.65 |
| ROUGE-1 | 41.11 |
| ROUGE-2 | 23.41 |
| ROUGE-L | 38.83 |
| Exact Match | 5.60% |
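The card does not say which evaluation library or text normalization produced these scores. To illustrate what two of the metrics measure, here is a simplified pure-Python sketch of exact match and ROUGE-1 F1. The lowercase, whitespace-split normalization and the function names are assumptions made for this example, so it will not reproduce the reported numbers exactly. If Exact Match was computed over the 1,000-example test split, 5.60% corresponds to 56 predictions that matched their reference.

```python
from collections import Counter
from typing import List


def normalize(text: str) -> List[str]:
    """Lowercase and split on whitespace (an assumed, simplified normalization)."""
    return text.lower().split()


def exact_match(predictions: List[str], references: List[str]) -> float:
    """Percentage of predictions whose normalized tokens equal the reference's."""
    if not references:
        return 0.0
    hits = sum(normalize(p) == normalize(r) for p, r in zip(predictions, references))
    return 100.0 * hits / len(references)


def rouge1_f1(prediction: str, reference: str) -> float:
    """Unigram-overlap F1 between one prediction and one reference, in [0, 1]."""
    pred, ref = Counter(normalize(prediction)), Counter(normalize(reference))
    overlap = sum((pred & ref).values())  # clipped unigram matches
    if overlap == 0:
        return 0.0
    precision = overlap / sum(pred.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


# Made-up strings for illustration only, not model outputs.
preds = ["Add two numbers.", "Return the list length"]
refs = ["Add two numbers.", "Return the length of a list"]
print(exact_match(preds, refs))                # 50.0
print(round(rouge1_f1(preds[1], refs[1]), 3))  # 0.8
```

The second pair shows why ROUGE-1 sits far above Exact Match in the table: a comment can share most of its words with the reference without matching it exactly.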

Model size: 60.5M params (Safetensors, tensor type F32)