---
library_name: transformers
tags:
- code
- code-generation
- codet5
- comment-generation
- seq2seq
language:
- en
base_model: Salesforce/codet5-small
---

# CodeT5-Small: Code Comment Generator

[`Salesforce/codet5-small`](https://huggingface.co/Salesforce/codet5-small) fine-tuned on a filtered subset of CodeSearchNet to generate natural-language comments and docstrings from source code.

Model: [melfatihomran/codet5-small-code-comment-gen](https://huggingface.co/melfatihomran/codet5-small-code-comment-gen)

## Usage

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_id = "melfatihomran/codet5-small-code-comment-gen"

# Load the fine-tuned tokenizer and seq2seq model from the Hub.
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# Source code to describe.
code = "def add(a, b):\n    return a + b"

# Tokenize, generate with beam search, and decode the comment.
inputs = tokenizer(code, return_tensors="pt")
output = model.generate(**inputs, max_length=64, num_beams=4)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```

## Training

| Parameter | Value |
|-----------|-------|
| Base model | Salesforce/codet5-small |
| Dataset | sentence-transformers/codesearchnet (pair) |
| Train / Val / Test | 8,000 / 1,000 / 1,000 |
| Epochs | 5 |
| Learning rate | 5e-5 |
| Batch size | 8 |
| Precision | fp16 (GPU) |

## Results

| Metric | Score |
|--------|-------|
| BLEU | 19.65 |
| ROUGE-1 | 41.11 |
| ROUGE-2 | 23.41 |
| ROUGE-L | 38.83 |
| Exact Match | 5.60% |
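
The Exact Match score in the Results table is the share of generated comments that are identical to their reference. The evaluation script is not included in this card, so the following is only a minimal sketch of how such a score could be computed. The whitespace normalization and the helper names `normalize` and `exact_match` are assumptions for illustration, not the procedure used to produce the numbers above.

```python
def normalize(text: str) -> str:
    """Collapse runs of whitespace and trim both ends (assumed normalization)."""
    return " ".join(text.split())


def exact_match(predictions: list[str], references: list[str]) -> float:
    """Return the percentage of predictions identical to their reference."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not predictions:
        return 0.0
    hits = sum(normalize(p) == normalize(r) for p, r in zip(predictions, references))
    return 100.0 * hits / len(predictions)


# Hypothetical generated comments and references, for illustration only.
predictions = ["Add two numbers.", "Return the  sum.", "Parse a file.", "Close it."]
references = ["Add two numbers.", "Return the sum.", "Read a file.", "Close the socket."]
score = exact_match(predictions, references)
print(f"Exact Match: {score:.2f}%")
```

Exact match is a strict criterion: a comment that paraphrases the reference correctly still counts as a miss, which is why it tends to sit far below overlap metrics such as BLEU and ROUGE.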