# Python Docstring Generator

Generates docstrings for Python code snippets using a sequence-to-sequence model (e.g. T5 or CodeT5). Useful for code summarization and documentation.

## Task

Given a Python function or code block (without a docstring), the model produces a short natural-language description suitable as a docstring.

## Model

- Uses **Hugging Face Transformers** with a small T5 or CodeT5 checkpoint (e.g. `t5-small`, or `Salesforce/codet5-small` for code).
- The inference script loads the model and tokenizer and runs generation with configurable length and sampling.

## Dataset

- Training (optional): datasets such as **CodeXGLUE** code-to-text, or **DocstringGeneration**-style data from Hugging Face Datasets.
- For inference only, no dataset is required; use pre-trained weights.

## Usage

```bash
pip install -r requirements.txt
python inference.py --input "def add(a, b): return a + b"
```

For a quick demo in the browser, run the Gradio app:

```bash
python app.py
```

## Example

Input:

```python
def factorial(n):
    if n <= 1:
        return 1
    return n * factorial(n - 1)
```

Output (example): `"Compute the factorial of n recursively."`

## Files

- `inference.py` — loads T5 (or CodeT5), runs generation; can take a file path or inline code.
- `app.py` — Gradio UI for pasting code and getting a docstring.

## Limitations / future work

- Quality depends on the base model and any fine-tuning; out-of-domain code may receive generic descriptions.
- Could be extended to multi-line docstrings or different styles (Google, NumPy, Sphinx).

## Author

**Alireza Aminzadeh**

- Email: [alireza.aminzadeh@hotmail.com](mailto:alireza.aminzadeh@hotmail.com)
- Hugging Face: [syeedalireza](https://huggingface.co/syeedalireza)
- LinkedIn: [alirezaaminzadeh](https://www.linkedin.com/in/alirezaaminzadeh)
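
## Appendix: inference sketch

The load-tokenize-generate flow described in the Model and Usage sections can be sketched roughly as below. This is a minimal illustration, not the contents of `inference.py`: the `"summarize: "` task prefix, the beam-search settings, and the `generate_docstring` / `build_input` names are assumptions for this sketch.

```python
def build_input(code: str) -> str:
    """Prefix the snippet with a task hint (assumed convention for this sketch)."""
    return "summarize: " + code.strip()


def generate_docstring(code: str,
                       model_name: str = "Salesforce/codet5-small",
                       max_new_tokens: int = 48) -> str:
    """Generate a one-line docstring for a code snippet (sketch)."""
    # Imported lazily so build_input() above stays usable even
    # without transformers installed.
    from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

    # Tokenize the prefixed snippet and run beam-search generation.
    inputs = tokenizer(build_input(code), return_tensors="pt", truncation=True)
    output_ids = model.generate(**inputs,
                                max_new_tokens=max_new_tokens,
                                num_beams=4)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


if __name__ == "__main__":
    print(generate_docstring("def add(a, b): return a + b"))
```

The actual script also accepts a file path and exposes length/sampling flags; the sketch keeps only the core generation call.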