# Python Docstring Generator
Generates docstrings for Python code snippets using a sequence-to-sequence model (e.g. T5 or CodeT5). Useful for code summarization and documentation.
## Task
Given a Python function or code block (without a docstring), the model produces a short natural-language description suitable as a docstring.
## Model
- Uses **Hugging Face Transformers** with a small T5 or CodeT5 checkpoint (e.g. `t5-small`, or `Salesforce/codet5-small` for code).
- The inference script loads the model and tokenizer, then runs generation with configurable output length and sampling settings.
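
The load-and-generate step described above could look roughly like the following. This is a minimal sketch, assuming `AutoTokenizer` and `AutoModelForSeq2SeqLM` from Transformers; `GenOptions` and `generate_docstring` are hypothetical names and the default values are illustrative, so `inference.py` may differ. A pre-trained checkpoint that has not been fine-tuned for summarization may produce weak descriptions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class GenOptions:
    """Hypothetical container for the length and sampling settings."""
    max_new_tokens: int = 48
    do_sample: bool = False
    top_p: float = 0.95
    temperature: float = 0.8
    num_beams: int = 4

    def to_generate_kwargs(self) -> dict:
        """Translate the options into keyword arguments for `model.generate`."""
        kwargs = {"max_new_tokens": self.max_new_tokens}
        if self.do_sample:
            # Nucleus sampling: more varied, less deterministic output.
            kwargs.update(do_sample=True, top_p=self.top_p,
                          temperature=self.temperature)
        else:
            # Beam search: deterministic output.
            kwargs.update(do_sample=False, num_beams=self.num_beams)
        return kwargs


def generate_docstring(code: str,
                       model_name: str = "Salesforce/codet5-small",
                       options: Optional[GenOptions] = None) -> str:
    """Generate a one-line description for `code` (downloads weights on first use)."""
    # Imported lazily so the options class can be used without the heavy dependency.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    options = options or GenOptions()
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_name)
    inputs = tokenizer(code, return_tensors="pt", truncation=True, max_length=512)
    output_ids = model.generate(
        input_ids=inputs["input_ids"],
        attention_mask=inputs["attention_mask"],
        **options.to_generate_kwargs(),
    )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True).strip()
```

Calling `generate_docstring("def add(a, b): return a + b")` would run beam search by default; passing `GenOptions(do_sample=True)` switches to sampling.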
## Dataset
- Training (optional): datasets such as the **CodeXGLUE** code-to-text task, or **DocstringGeneration**-style data from Hugging Face Datasets.
- For inference only, no dataset is required; use pre-trained weights.
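
If you do fine-tune, the training pairs need the docstring removed from the input code so it matches the task definition. The sketch below shows one way to do that with the standard `ast` module (Python 3.9+ for `ast.unparse`). The dataset ID `code_x_glue_ct_code_to_text`, its `python` config, and the `code` / `docstring` field names are assumptions to check against the dataset card; `strip_docstrings`, `to_training_pair`, and `load_pairs` are hypothetical helpers.

```python
import ast
from typing import Optional


def strip_docstrings(source: str) -> str:
    """Return `source` with module, class, and function docstrings removed."""
    tree = ast.parse(source)
    for node in ast.walk(tree):
        if isinstance(node, (ast.Module, ast.FunctionDef,
                             ast.AsyncFunctionDef, ast.ClassDef)):
            body = node.body
            if (body and isinstance(body[0], ast.Expr)
                    and isinstance(body[0].value, ast.Constant)
                    and isinstance(body[0].value.value, str)):
                # Keep the body non-empty so the result is still valid Python.
                node.body = body[1:] or [ast.Pass()]
    return ast.unparse(tree)


def to_training_pair(record: dict) -> Optional[dict]:
    """Map one record to {"source", "target"}, or None if it is unusable."""
    try:
        source = strip_docstrings(record["code"])
    except SyntaxError:
        return None  # e.g. Python 2 code that the current parser rejects
    # Use only the first paragraph of the docstring as the short target.
    target = " ".join(record["docstring"].split("\n\n")[0].split())
    return {"source": source, "target": target} if target else None


def load_pairs(split: str = "validation") -> list:
    """Load the (assumed) CodeXGLUE code-to-text Python split as training pairs."""
    from datasets import load_dataset  # lazy import; requires the `datasets` package

    dataset = load_dataset("code_x_glue_ct_code_to_text", "python", split=split)
    return [pair for pair in map(to_training_pair, dataset) if pair is not None]
```

Note that `ast.unparse` normalizes formatting and drops comments, which may or may not be what you want in training inputs.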
## Usage
```bash
pip install -r requirements.txt
python inference.py --input "def add(a, b): return a + b"
```
For a quick demo in the browser, run the Gradio app:
```bash
python app.py
```
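
For orientation, a Gradio app of this kind can be quite small. The following is a sketch under assumptions, not the contents of `app.py`: `make_handler` and `build_demo` are hypothetical names, and `generate` stands for whatever function turns code into a description.

```python
from typing import Callable


def make_handler(generate: Callable[[str], str]) -> Callable[[str], str]:
    """Wrap a code-to-description function with basic input handling."""
    def handler(code: str) -> str:
        code = (code or "").strip()
        if not code:
            return "Paste a Python function to get a docstring."
        # Present the description the way it would appear in source code.
        return f'"""{generate(code)}"""'
    return handler


def build_demo(generate: Callable[[str], str]):
    """Build a Gradio interface around `generate` (requires the `gradio` package)."""
    import gradio as gr  # lazy import so the handler can be tested without Gradio

    return gr.Interface(
        fn=make_handler(generate),
        inputs=gr.Textbox(lines=12, label="Python code"),
        outputs=gr.Textbox(label="Generated docstring"),
        title="Python Docstring Generator",
    )
```

With a real generator in hand, `build_demo(generate).launch()` starts the local web server.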
## Example
Input:
```python
def factorial(n):
    if n <= 1:
        return 1
    return n * factorial(n - 1)
```
Output (example): `"Compute the factorial of n recursively."`
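
To use such an output, the description still has to be placed into the function. One possible way, sketched here with the standard `ast` module (Python 3.9+), is shown below; `attach_docstring` is a hypothetical helper and not part of this repository.

```python
import ast


def attach_docstring(source: str, summary: str) -> str:
    """Insert `summary` as the docstring of the first top-level function."""
    tree = ast.parse(source)
    if not tree.body or not isinstance(
            tree.body[0], (ast.FunctionDef, ast.AsyncFunctionDef)):
        raise ValueError("expected the source to start with a function definition")
    func = tree.body[0]
    # Leave functions that already have a docstring untouched.
    if ast.get_docstring(func) is None:
        func.body.insert(0, ast.Expr(value=ast.Constant(value=summary)))
    ast.fix_missing_locations(tree)
    return ast.unparse(tree)


source = """
def factorial(n):
    if n <= 1:
        return 1
    return n * factorial(n - 1)
"""
print(attach_docstring(source, "Compute the factorial of n recursively."))
```

Because `ast.unparse` regenerates the code, comments and original formatting are lost; a text-based insertion would be needed to preserve them.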
## Files
- `inference.py`: loads T5 (or CodeT5) and runs generation; accepts either a file path or inline code.
- `app.py`: Gradio UI for pasting code and getting a docstring.
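
Accepting "a file path or inline code" through one argument can be handled as in the sketch below. Only `--input` appears in the usage above; the `--model` and `--max-new-tokens` flags, their defaults, and the `resolve_source` helper are hypothetical, and the real script may distinguish paths from code differently.

```python
import argparse
from pathlib import Path


def resolve_source(value: str) -> str:
    """Return file contents if `value` names an existing file, else `value` itself."""
    try:
        path = Path(value)
        if path.is_file():
            return path.read_text(encoding="utf-8")
    except (OSError, ValueError):
        # Inline code can be too long or contain characters invalid in a path.
        pass
    return value


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(
        description="Generate a docstring for Python code.")
    parser.add_argument("--input", required=True,
                        help="inline code or path to a Python file")
    # Hypothetical flags, shown only to illustrate configurable generation.
    parser.add_argument("--model", default="Salesforce/codet5-small")
    parser.add_argument("--max-new-tokens", type=int, default=48)
    return parser
```

A `main()` would then call `build_parser().parse_args()`, pass `args.input` through `resolve_source`, and hand the result to the model.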
## Limitations / future work
- Quality depends on the base model and any fine-tuning; out-of-domain code may get generic descriptions.
- Could be extended to multi-line docstrings or different styles (Google, NumPy, Sphinx).
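
To make the style extension concrete, the sketch below wraps a generated one-line summary in a Google, NumPy, or Sphinx skeleton, reading parameter names from the function signature. It is an illustration only: `render_docstring` is a hypothetical helper, only positional parameters are handled, and the `TODO` descriptions are placeholders the model does not currently produce.

```python
import ast


def render_docstring(source: str, summary: str, style: str = "google") -> str:
    """Render `summary` plus a parameter section in the requested style."""
    func = next((n for n in ast.walk(ast.parse(source))
                 if isinstance(n, (ast.FunctionDef, ast.AsyncFunctionDef))), None)
    if func is None:
        raise ValueError("no function definition found")
    params = [arg.arg for arg in func.args.args]

    lines = [summary]
    if not params:
        return summary
    lines.append("")
    if style == "google":
        lines.append("Args:")
        lines += [f"    {name}: TODO" for name in params]
    elif style == "numpy":
        lines += ["Parameters", "----------"]
        for name in params:
            lines += [name, "    TODO"]
    elif style == "sphinx":
        lines += [f":param {name}: TODO" for name in params]
    else:
        raise ValueError(f"unknown style: {style}")
    return "\n".join(lines)


print(render_docstring("def add(a, b): return a + b", "Add two numbers.", "google"))
```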
## Author
**Alireza Aminzadeh**
- Email: [alireza.aminzadeh@hotmail.com](mailto:alireza.aminzadeh@hotmail.com)
- Hugging Face: [syeedalireza](https://huggingface.co/syeedalireza)
- LinkedIn: [alirezaaminzadeh](https://www.linkedin.com/in/alirezaaminzadeh)