---
library_name: transformers
tags:
  - code
  - code-generation
  - codet5
  - comment-generation
  - seq2seq
language:
  - en
base_model: Salesforce/codet5-small
---

CodeT5-Small — Code Comment Generator

This model is Salesforce/codet5-small fine-tuned on a filtered subset of CodeSearchNet to generate natural-language comments and docstrings from source code.

Usage

from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("melfatihomran/codet5-small-code-comment-gen")
model = AutoModelForSeq2SeqLM.from_pretrained("melfatihomran/codet5-small-code-comment-gen")

code = "def add(a, b):\n    return a + b"
inputs = tokenizer(code, return_tensors="pt")
output = model.generate(**inputs, max_length=64, num_beams=4)
print(tokenizer.decode(output[0], skip_special_tokens=True))

Training

| Parameter | Value |
| --- | --- |
| Base model | Salesforce/codet5-small |
| Dataset | sentence-transformers/codesearchnet (pair) |
| Train / Val / Test | 8,000 / 1,000 / 1,000 |
| Epochs | 5 |
| Learning rate | 5e-5 |
| Batch size | 8 |
| Precision | fp16 (GPU) |
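
As a rough sanity check on the training settings, the number of optimizer steps can be worked out from the training split size, batch size, and epoch count. The sketch below assumes a single device, no gradient accumulation, and a kept final partial batch. None of these assumptions is stated in this card, and the `config` keys and the `optimizer_steps` helper are illustrative names, not part of the author's training script.

```python
import math

# Hyperparameters copied from the training table; the key names are illustrative.
config = {
    "base_model": "Salesforce/codet5-small",
    "train_size": 8000,
    "val_size": 1000,
    "test_size": 1000,
    "epochs": 5,
    "learning_rate": 5e-5,
    "batch_size": 8,
}


def optimizer_steps(train_size, batch_size, epochs, grad_accum=1):
    """Total optimizer updates, assuming one device and a kept final partial batch."""
    batches_per_epoch = math.ceil(train_size / batch_size)
    steps_per_epoch = math.ceil(batches_per_epoch / grad_accum)
    return steps_per_epoch * epochs


total_steps = optimizer_steps(
    config["train_size"], config["batch_size"], config["epochs"]
)
print(total_steps)  # 5000 under the assumptions above
```

If gradient accumulation or multiple GPUs were used, the effective batch size would be larger and the step count correspondingly smaller.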

Results

| Metric | Score |
| --- | --- |
| BLEU | 19.65 |
| ROUGE-1 | 41.11 |
| ROUGE-2 | 23.41 |
| ROUGE-L | 38.83 |
| Exact Match | 5.60% |
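
The card does not include the evaluation script, so the following is only a minimal sketch of how an exact-match rate and a unigram-overlap F1 (the idea behind ROUGE-1) could be computed. Lowercasing, whitespace tokenization, and the helper names `normalize`, `exact_match_rate`, and `unigram_f1` are assumptions. A dedicated metrics library applies its own tokenization, so numbers from this sketch are not directly comparable to the reported scores.

```python
from collections import Counter


def normalize(text):
    """Lowercase and collapse whitespace (an assumed normalization)."""
    return " ".join(text.lower().split())


def exact_match_rate(predictions, references):
    """Percentage of predictions identical to their reference after normalization."""
    matches = sum(
        normalize(p) == normalize(r) for p, r in zip(predictions, references)
    )
    return 100.0 * matches / len(references)


def unigram_f1(prediction, reference):
    """F1 over overlapping unigrams, the idea behind ROUGE-1."""
    pred_counts = Counter(normalize(prediction).split())
    ref_counts = Counter(normalize(reference).split())
    overlap = sum((pred_counts & ref_counts).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(pred_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)


# Toy data, invented for illustration only.
preds = ["Add two numbers.", "Return the list length"]
refs = ["add two numbers.", "Return the length of a list"]

print(exact_match_rate(preds, refs))
print(unigram_f1(preds[1], refs[1]))
```

If the exact-match score was computed over the full 1,000-example test split, 5.60% would correspond to 56 predictions that match their reference exactly.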