# Tiny Scribe - Project Context

## Project Overview

**Tiny Scribe** is a lightweight, local LLM-powered transcript summarization tool. It is designed to run efficiently on standard hardware (including the free CPU tier on HuggingFace Spaces) using GGUF quantized models.

The project features a web interface (Gradio) and a CLI tool, supporting more than 24 models ranging from 100M to 30B parameters. It includes specialized features like live streaming, a reasoning ("thinking") mode for supported models, and dual-language output (English/Traditional Chinese).

## Tech Stack

* **Language:** Python 3.10+
* **UI Framework:** Gradio (Web), `argparse` (CLI)
* **Inference Engine:** `llama-cpp-python` (Python bindings for `llama.cpp`)
* **Model Format:** GGUF (quantized)
* **Containerization:** Docker (optimized for HuggingFace Spaces)
* **Utilities:** `opencc` (Chinese conversion), `huggingface_hub`
## Key Files & Directories

* `app.py`: The main entry point for the Gradio web application. Contains the UI layout, model loading logic, and generation pipeline.
* `summarize_transcript.py`: Command-line interface for batch processing or local summarization without the web UI.
* `Dockerfile`: Defines the build environment. **Crucial:** it installs a specific pre-compiled wheel for `llama-cpp-python` to ensure compatibility and performance on the HF Spaces free CPU tier.
* `deploy.sh`: Helper script to stage, commit, and push changes to the HuggingFace Space. Enforces non-generic commit messages.
* `requirements.txt`: Python dependencies (excluding `llama-cpp-python`, which is handled specially in Docker).
* `transcripts/`: Directory for storing input transcript files.
* `AGENTS.md` / `CLAUDE.md`: Existing context files for other AI assistants.
## Build & Run Instructions

### 1. Installation

The project relies on `llama-cpp-python`. For local development, install it separately; it is deliberately left out of `requirements.txt` to avoid build errors on systems without compilers.

```bash
# Install general dependencies
pip install -r requirements.txt

# Install llama-cpp-python (with CUDA support if available, otherwise CPU)
# See: https://github.com/abetlen/llama-cpp-python#installation
pip install llama-cpp-python
```
### 2. Running the Web UI

```bash
python app.py
# Access at http://localhost:7860
```

### 3. Running the CLI

```bash
# Basic English summary
python summarize_transcript.py -i transcripts/your_file.txt

# Traditional Chinese output
python summarize_transcript.py -i transcripts/your_file.txt -l zh-TW

# Use a specific model
python summarize_transcript.py -i transcripts/your_file.txt -m "unsloth/Qwen3-1.7B-GGUF"
```
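The flags shown above map naturally onto `argparse`. Below is a minimal sketch of how such a CLI could be wired up; the actual flag names, defaults, and choices in `summarize_transcript.py` may differ.

```python
import argparse

def build_parser() -> argparse.ArgumentParser:
    """Illustrative CLI surface for a transcript summarizer (assumed, not
    copied from summarize_transcript.py)."""
    parser = argparse.ArgumentParser(
        description="Summarize a transcript with a local GGUF model."
    )
    parser.add_argument("-i", "--input", required=True,
                        help="Path to the transcript file")
    parser.add_argument("-l", "--lang", default="en", choices=["en", "zh-TW"],
                        help="Output language (English or Traditional Chinese)")
    parser.add_argument("-m", "--model", default="unsloth/Qwen3-1.7B-GGUF",
                        help="HuggingFace repo ID of the GGUF model")
    return parser

if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args.input, args.lang, args.model)
```

With a parser like this, `-l zh-TW` and `-m` behave exactly as in the usage examples above, and invalid language codes are rejected at parse time.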
### 4. Deployment (HuggingFace Spaces)

Always use the provided script to ensure clean commits and deployment:

```bash
./deploy.sh "Your descriptive commit message"
```
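`deploy.sh` is described as rejecting generic commit messages. A sketch of what that guard could look like follows; the function name and blocklist are illustrative assumptions, not taken from the actual script.

```shell
#!/usr/bin/env bash
# Returns success (0) if the message is too generic to commit.
# NOTE: illustrative only; the real deploy.sh may check differently.
is_generic_msg() {
  msg=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  case "$msg" in
    ""|update|fix|wip|misc|changes) return 0 ;;  # reject these
    *) return 1 ;;                               # anything else passes
  esac
}
```

The real script would run a check like this before staging and pushing, e.g. `git add -A && git commit -m "$1" && git push`.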
## Model Architecture & Categories

The project categorizes models to help users balance speed against quality:

* **Tiny (0.1-0.6B):** Extremely fast; good for simple formatting (e.g., Qwen3-0.6B).
* **Compact (1-2.6B):** A good balance for the free tier (e.g., Granite-3.1-1B, Qwen3-1.7B).
* **Standard (3-8B):** Higher quality, slower on CPU (e.g., Llama-3-8B variants).
* **Medium (21-30B):** High performance; requires significant RAM (e.g., Command R, Qwen-30B).
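The tiers above can be expressed as a small helper. The thresholds, boundary handling for the gaps between tiers, and the function name here are illustrative assumptions, not code from `app.py`.

```python
def model_category(params_b: float) -> str:
    """Map a parameter count (in billions) to a size tier mirroring the
    categories listed above. Boundary choices for the gaps between tiers
    are assumptions, not project code."""
    if params_b <= 0.6:
        return "Tiny"
    if params_b <= 2.6:
        return "Compact"
    if params_b <= 8:
        return "Standard"
    return "Medium"
```

For example, `model_category(1.7)` places Qwen3-1.7B in the Compact tier.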
## Development Conventions

* **Dependency Management:** `llama-cpp-python` is pinned in the `Dockerfile` via a custom wheel URL. Do not add it to `requirements.txt` unless you are changing the build strategy.
* **Code Style:** The project uses `ruff` for linting.
* **Git:** Use `deploy.sh` to push. Avoid generic commit messages like "update" or "fix".
* **Environment:** The app is optimized for Linux/Docker environments. Local Windows development may require extra setup for `llama-cpp-python` compilation.