Spaces:

Luigi
/

tiny-scribe

Running

Luigi commited on Jan 31

Commit

ddec8de

1 Parent(s): 80ca4af

Update AGENTS.md with Gradio app coverage and deployment info

- Add Gradio web app commands and documentation
- Include Docker/HF Spaces deployment section
- Update dependencies with version specifications
- Add gradio import to code style guidelines
- Improve thinking block regex pattern to support both formats
- Add HF Spaces resource constraints (2 vCPUs, 16GB RAM)
- Update project structure with app.py and Dockerfile

Files changed (1) hide show

AGENTS.md +33 -12

AGENTS.md CHANGED Viewed

@@ -2,17 +2,22 @@
 ## Project Overview
-Tiny Scribe is a Python CLI tool for summarizing transcripts using GGUF models (e.g., ERNIE, Qwen, Granite) with llama-cpp-python. It supports live streaming output and Traditional Chinese (zh-TW) conversion via OpenCC.
 ## Build / Lint / Test Commands
-**Run the script:**
 ```bash
 python summarize_transcript.py -i ./transcripts/short.txt
 python summarize_transcript.py -m unsloth/Qwen3-1.7B-GGUF:Q2_K_L
 python summarize_transcript.py -c  # CPU only
 ```
 **Linting (if ruff installed):**
 ```bash
 ruff check .
@@ -23,6 +28,7 @@ ruff format .            # Auto-format code
 **Type checking (if mypy installed):**
 ```bash
 mypy summarize_transcript.py
 ```
 **Running tests:**
@@ -62,6 +68,7 @@ from typing import Tuple, Optional, Generator
 from llama_cpp import Llama
 from huggingface_hub import hf_hub_download
 from opencc import OpenCC
 ```
 **Type Hints:**
@@ -90,29 +97,29 @@ from opencc import OpenCC
 ## Dependencies
 **Required:**
-- `llama-cpp-python` - Core inference engine
-- `huggingface-hub` - Model downloading
-- `opencc` - Chinese text conversion
 **Development (optional):**
-- `pytest` - Testing framework
 - `ruff` - Linting and formatting
 - `mypy` - Type checking
-- `black` - Code formatting
 ## Project Structure
 ```
 tiny-scribe/
 ├── summarize_transcript.py    # Main CLI script
 ├── transcripts/               # Input transcript files
 │   ├── short.txt
 │   └── full.txt
-├── summary.txt                # Generated output
 ├── llama-cpp-python/          # Git submodule
 │   ├── tests/                 # Test suite
-│   │   ├── test_llama.py
-│   │   └── test_llama_grammar.py
 │   └── vendor/llama.cpp/      # Core C++ library
 └── README.md                  # Project documentation
 ```
@@ -143,7 +150,7 @@ stream = llm.create_chat_completion(
 **Thinking Block Parsing:**
 ```python
 # Extract thinking/reasoning blocks from model output
-THINKING_PATTERN = re.compile(r'<thinking>(.*?)</thinking>', re.DOTALL)
 for chunk in stream:
     delta = chunk["choices"][0]["delta"]
@@ -169,8 +176,9 @@ traditional_text = converter.convert(simplified_text)
 - Always call `llm.reset()` after completion to ensure state isolation
 - Model format: `repo_id:quant` (e.g., `unsloth/Qwen3-1.7B-GGUF:Q2_K_L`)
 - Default language output is Traditional Chinese (zh-TW) via OpenCC conversion
-- Claude permissions configured in `.claude/settings.local.json` for tool access
 - HuggingFace cache at `~/.cache/huggingface/hub/` - clean periodically
 ## Git Submodule Management
@@ -181,3 +189,16 @@ git submodule update --init --recursive
 # Update llama-cpp-python to latest
 cd llama-cpp-python && git pull origin main && cd .. && git add llama-cpp-python
 ```

 ## Project Overview
+Tiny Scribe is a Python CLI tool and Gradio web app for summarizing transcripts using GGUF models (e.g., ERNIE, Qwen, Granite) with llama-cpp-python. It supports live streaming output and Traditional Chinese (zh-TW) conversion via OpenCC.
 ## Build / Lint / Test Commands
+**Run the CLI script:**
 ```bash
 python summarize_transcript.py -i ./transcripts/short.txt
 python summarize_transcript.py -m unsloth/Qwen3-1.7B-GGUF:Q2_K_L
 python summarize_transcript.py -c  # CPU only
 ```
+**Run the Gradio web app:**
+```bash
+python app.py  # Starts on port 7860
+```
 **Linting (if ruff installed):**
 ```bash
 ruff check .
 **Type checking (if mypy installed):**
 ```bash
 mypy summarize_transcript.py
+mypy app.py
 ```
 **Running tests:**
 from llama_cpp import Llama
 from huggingface_hub import hf_hub_download
 from opencc import OpenCC
+import gradio as gr
 ```
 **Type Hints:**
 ## Dependencies
 **Required:**
+- `llama-cpp-python>=0.3.0` - Core inference engine
+- `gradio>=5.0.0` - Web UI framework
+- `huggingface-hub>=0.23.0` - Model downloading
+- `opencc-python-reimplemented>=0.1.7` - Chinese text conversion
 **Development (optional):**
+- `pytest>=7.4.0` - Testing framework
 - `ruff` - Linting and formatting
 - `mypy` - Type checking
 ## Project Structure
 ```
 tiny-scribe/
 ├── summarize_transcript.py    # Main CLI script
+├── app.py                     # Gradio web app (HuggingFace Spaces)
+├── requirements.txt           # Python dependencies
+├── Dockerfile                 # HF Spaces deployment config
 ├── transcripts/               # Input transcript files
 │   ├── short.txt
 │   └── full.txt
 ├── llama-cpp-python/          # Git submodule
 │   ├── tests/                 # Test suite
 │   └── vendor/llama.cpp/      # Core C++ library
 └── README.md                  # Project documentation
 ```
 **Thinking Block Parsing:**
 ```python
 # Extract thinking/reasoning blocks from model output
+THINKING_PATTERN = re.compile(r'<think(?:ing)?>(.*?)</think(?:ing)?>', re.DOTALL)
 for chunk in stream:
     delta = chunk["choices"][0]["delta"]
 - Always call `llm.reset()` after completion to ensure state isolation
 - Model format: `repo_id:quant` (e.g., `unsloth/Qwen3-1.7B-GGUF:Q2_K_L`)
 - Default language output is Traditional Chinese (zh-TW) via OpenCC conversion
 - HuggingFace cache at `~/.cache/huggingface/hub/` - clean periodically
+- HF Spaces runs on CPU tier with 2 vCPUs, 16GB RAM
+- Keep model sizes under 4GB for reasonable performance on free tier
 ## Git Submodule Management
 # Update llama-cpp-python to latest
 cd llama-cpp-python && git pull origin main && cd .. && git add llama-cpp-python
 ```
+## Docker/HuggingFace Spaces Deployment
+```bash
+# Build locally
+docker build -t tiny-scribe .
+# Run locally
+docker run -p 7860:7860 tiny-scribe
+# Deploy script
+./deploy.sh  # Commits, pushes, and triggers HF Spaces rebuild
+```