# Text Summarizer Backend - Development Plan
## Overview
A minimal FastAPI backend that summarizes text with a locally running Ollama instance. It is designed to be called from an Android app and to be extended later for cloud hosting.
## Architecture Goals
- **Local-first**: Use Ollama running locally for privacy and cost control
- **Cloud-ready**: Structure the code so it can be deployed to the cloud later with minimal changes
- **Minimal v1**: Focus on core summarization functionality
- **Android-friendly**: RESTful API optimized for mobile app consumption
## Technology Stack
- **Backend**: FastAPI + Python
- **LLM**: Ollama (local)
- **Server**: Uvicorn
- **Validation**: Pydantic
- **Testing**: Pytest + pytest-asyncio + httpx (for async testing)
- **Containerization**: Docker (for cloud deployment)
## Project Structure
```
app/
├── main.py                  # FastAPI app entry point
├── api/
│   └── v1/
│       ├── routes.py        # API route definitions
│       └── schemas.py       # Pydantic models
├── services/
│   └── summarizer.py        # Ollama integration
├── core/
│   ├── config.py            # Configuration management
│   └── logging.py           # Logging setup
tests/
├── test_api.py              # API endpoint tests
├── test_services.py         # Service layer tests
├── test_schemas.py          # Pydantic model tests
├── test_config.py           # Configuration tests
└── conftest.py              # Test configuration and fixtures
requirements.txt
Dockerfile
docker-compose.yml
README.md
```
## API Contract (v1)
### POST /api/v1/summarize
**Request:**
```json
{
  "text": "string (required)",
  "max_tokens": 256,
  "prompt": "Summarize concisely."
}
```
**Response:**
```json
{
  "summary": "string",
  "model": "llama3.1:8b",
  "tokens_used": 512,
  "latency_ms": 1234
}
```
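The request and response bodies above map naturally onto Pydantic models. The following is a minimal sketch rather than the project's actual `schemas.py`: the class names and the specific constraints (non-empty `text`, positive `max_tokens`) are assumptions chosen to match the contract.

```python
from pydantic import BaseModel, Field, ValidationError


class SummarizeRequest(BaseModel):
    """Body of POST /api/v1/summarize."""

    # Assumption: empty text is rejected at the schema level.
    text: str = Field(..., min_length=1)
    # Assumption: max_tokens must be positive; 256 is the default from the contract.
    max_tokens: int = Field(default=256, gt=0)
    prompt: str = "Summarize concisely."


class SummarizeResponse(BaseModel):
    """Body returned by POST /api/v1/summarize."""

    summary: str
    model: str
    tokens_used: int
    latency_ms: int


if __name__ == "__main__":
    # Defaults are applied when only the required field is sent.
    print(SummarizeRequest(text="Some long text"))
    try:
        SummarizeRequest(text="")
    except ValidationError as exc:
        print("rejected empty text:", type(exc).__name__)
```

Declaring the constraints on the model means FastAPI can reject invalid requests with a 422 response before the service layer is called.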
### GET /health
**Response:**
```json
{
  "status": "ok",
  "ollama": "reachable"
}
```
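One way to produce the `ollama` field is to probe the Ollama server and translate the result into a string. The sketch below uses only the standard library and calls Ollama's `GET /api/tags` endpoint as a cheap liveness check. The function names, the `"unreachable"` value, and the choice to keep `status` as `"ok"` when Ollama is down are assumptions, not part of the contract above.

```python
import urllib.request
from typing import Callable, Dict


def probe_ollama(host: str = "http://127.0.0.1:11434", timeout: float = 2.0) -> bool:
    """Return True if the Ollama server answers GET /api/tags with HTTP 200."""
    try:
        with urllib.request.urlopen(f"{host}/api/tags", timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        # URLError, HTTPError and socket timeouts are all OSError subclasses.
        return False


def health_payload(probe: Callable[[], bool] = probe_ollama) -> Dict[str, str]:
    """Build the /health response body; the probe is injectable for tests."""
    return {
        "status": "ok",
        # Assumption: "unreachable" is the value reported when the probe fails.
        "ollama": "reachable" if probe() else "unreachable",
    }
```

Passing the probe as a parameter keeps the endpoint testable without a running Ollama instance, for example `health_payload(lambda: False)`.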
## Development Phases
### Phase 1: Foundation
- [ ] Project scaffold and directory structure
- [ ] Core dependencies and requirements.txt (including test dependencies)
- [ ] Basic FastAPI app setup
- [ ] Configuration management with environment variables
- [ ] Logging setup
- [ ] Health check endpoint
- [ ] Basic test setup and configuration
### Phase 2: Core Feature
- [ ] Pydantic schemas for request/response
- [ ] Unit tests for schemas (validation, serialization)
- [ ] Ollama service integration
- [ ] Unit tests for Ollama service (mocked)
- [ ] Summarization endpoint implementation
- [ ] Integration tests for API endpoints
- [ ] Input validation and error handling
- [ ] Basic request/response logging
### Phase 3: Quality & DX
- [ ] Error handling middleware
- [ ] Request ID middleware
- [ ] Input size limits and validation
- [ ] Rate limiting (optional for v1)
- [ ] Test coverage analysis and improvement
- [ ] Performance tests for summarization endpoint
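
The request ID item in this phase could be implemented as a small pure-ASGI middleware. The sketch below is one possible shape, not the project's actual code: the class name `RequestIdMiddleware` and the `X-Request-ID` header name are assumptions.

```python
import uuid


class RequestIdMiddleware:
    """Pure ASGI middleware that adds a request ID header to HTTP responses."""

    def __init__(self, app, header_name: str = "x-request-id"):
        self.app = app
        # ASGI header names are lower-case byte strings.
        self.header_name = header_name.lower().encode("latin-1")

    async def __call__(self, scope, receive, send):
        if scope["type"] != "http":
            # Pass lifespan and websocket traffic through untouched.
            await self.app(scope, receive, send)
            return

        # Reuse the client's ID if one was sent, otherwise generate one.
        incoming = dict(scope.get("headers") or [])
        request_id = incoming.get(self.header_name) or uuid.uuid4().hex.encode("ascii")

        async def send_with_id(message):
            if message["type"] == "http.response.start":
                headers = list(message.get("headers") or [])
                headers.append((self.header_name, request_id))
                message = {**message, "headers": headers}
            await send(message)

        await self.app(scope, receive, send_with_id)
```

With FastAPI this would typically be registered via `app.add_middleware(RequestIdMiddleware)`. Logging the same ID inside the handler is left out of the sketch.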
### Phase 4: Cloud-Ready Structure
- [ ] Dockerfile for containerization
- [ ] docker-compose.yml for local development
- [ ] Environment-based configuration
- [ ] CORS configuration for Android app
- [ ] Security headers and API key support (optional)
- [ ] Metrics endpoint (optional)
### Phase 5: Documentation & Examples
- [ ] Comprehensive README with setup instructions
- [ ] API documentation (FastAPI auto-docs)
- [ ] Example curl commands
- [ ] Android client integration examples
- [ ] Deployment guide for cloud hosting
## Configuration
### Environment Variables
```bash
# Ollama Configuration
OLLAMA_MODEL=llama3.1:8b
OLLAMA_HOST=http://127.0.0.1:11434
OLLAMA_TIMEOUT=30
# Server Configuration
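# 127.0.0.1 accepts local connections only; use 0.0.0.0 to reach the API from a physical Android device on the LAN
SERVER_HOST=127.0.0.1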
SERVER_PORT=8000
LOG_LEVEL=INFO
# Optional: API Security
API_KEY_ENABLED=false
API_KEY=your-secret-key
# Optional: Rate Limiting
RATE_LIMIT_ENABLED=false
RATE_LIMIT_REQUESTS=60
RATE_LIMIT_WINDOW=60
```
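These variables could be loaded into a typed settings object at startup. The following standard-library sketch is one possible approach and makes several assumptions: the `Settings` and `load_settings` names are hypothetical, the defaults mirror the values listed above, and the real `core/config.py` may well use a different mechanism such as Pydantic settings.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


def _as_bool(value: str) -> bool:
    """Interpret common truthy strings from the environment."""
    return value.strip().lower() in {"1", "true", "yes", "on"}


@dataclass(frozen=True)
class Settings:
    ollama_model: str = "llama3.1:8b"
    ollama_host: str = "http://127.0.0.1:11434"
    ollama_timeout: int = 30
    server_host: str = "127.0.0.1"
    server_port: int = 8000
    log_level: str = "INFO"
    api_key_enabled: bool = False
    api_key: Optional[str] = None
    rate_limit_enabled: bool = False
    rate_limit_requests: int = 60
    rate_limit_window: int = 60


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Build Settings from a mapping (defaults to os.environ)."""
    env = os.environ if env is None else env
    d = Settings()  # source of default values
    return Settings(
        ollama_model=env.get("OLLAMA_MODEL", d.ollama_model),
        ollama_host=env.get("OLLAMA_HOST", d.ollama_host),
        ollama_timeout=int(env.get("OLLAMA_TIMEOUT", d.ollama_timeout)),
        server_host=env.get("SERVER_HOST", d.server_host),
        server_port=int(env.get("SERVER_PORT", d.server_port)),
        log_level=env.get("LOG_LEVEL", d.log_level).upper(),
        api_key_enabled=_as_bool(env.get("API_KEY_ENABLED", "false")),
        api_key=env.get("API_KEY"),
        rate_limit_enabled=_as_bool(env.get("RATE_LIMIT_ENABLED", "false")),
        rate_limit_requests=int(env.get("RATE_LIMIT_REQUESTS", d.rate_limit_requests)),
        rate_limit_window=int(env.get("RATE_LIMIT_WINDOW", d.rate_limit_window)),
    )
```

Accepting the environment as a parameter lets `test_config.py` pass a plain dictionary instead of patching `os.environ`.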
## Local Development Setup
### Prerequisites
1. Install Ollama:
```bash
# macOS
brew install ollama
# Or download from https://ollama.ai
```
2. Start Ollama service:
```bash
ollama serve
```
3. Pull a model:
```bash
ollama pull llama3.1:8b
# or
ollama pull mistral
```
### Running the API
```bash
# Create virtual environment
python -m venv .venv
source .venv/bin/activate # On Windows: .venv\Scripts\activate
# Install dependencies
pip install -r requirements.txt
# Set environment variables
export OLLAMA_MODEL=llama3.1:8b
# Run the server
uvicorn app.main:app --host 127.0.0.1 --port 8000 --reload
```
### Testing the API
```bash
# Health check
curl http://127.0.0.1:8000/health
# Summarize text
curl -X POST http://127.0.0.1:8000/api/v1/summarize \
  -H "Content-Type: application/json" \
  -d '{"text": "Your long text to summarize here..."}'
```
### Running Tests
```bash
# Run all tests
pytest
# Run tests with coverage (requires the pytest-cov plugin)
pytest --cov=app --cov-report=html --cov-report=term
# Run specific test file
pytest tests/test_api.py
# Run tests with verbose output
pytest -v
# Run tests and stop on first failure
pytest -x
```
## Testing Strategy
### Test Types
1. **Unit Tests**
   - Pydantic model validation
   - Service layer logic (with mocked Ollama)
   - Configuration loading
   - Utility functions
2. **Integration Tests**
   - API endpoint testing with TestClient
   - End-to-end summarization flow
   - Error handling scenarios
   - Health check functionality
3. **Mock Strategy**
   - Mock Ollama HTTP calls with `httpx.MockTransport` or `respx` if the service uses `httpx`, or with `responses` if it uses `requests`
   - Mock external dependencies
   - Use fixtures for common test data
### Test Coverage Goals
- **Minimum 90% code coverage**
- **100% coverage for critical paths** (API endpoints, error handling)
- **All edge cases tested** (empty input, large input, network failures)
### Test Data
```python
# Example test fixtures
SAMPLE_TEXT = "This is a long text that needs to be summarized..."
SAMPLE_SUMMARY = "This text discusses summarization."
MOCK_OLLAMA_RESPONSE = {
    "model": "llama3.1:8b",
    "response": SAMPLE_SUMMARY,
    "done": True,
}
```
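The fixtures above can drive a service-layer unit test without a running Ollama server. The sketch below is illustrative only. The `summarize` function and its injected `post` callable are hypothetical. The payload follows Ollama's `/api/generate` request shape (`stream`, `options.num_predict`). Defining `tokens_used` as `prompt_eval_count + eval_count` is an assumption, and those two counters were added to the mock response for that purpose.

```python
import time
from typing import Any, Callable, Dict
from unittest.mock import Mock

SAMPLE_TEXT = "This is a long text that needs to be summarized..."
SAMPLE_SUMMARY = "This text discusses summarization."
MOCK_OLLAMA_RESPONSE = {
    "model": "llama3.1:8b",
    "response": SAMPLE_SUMMARY,
    "done": True,
    "prompt_eval_count": 12,  # tokens in the prompt
    "eval_count": 8,          # tokens generated
}


def summarize(
    text: str,
    post: Callable[[str, Dict[str, Any]], Dict[str, Any]],
    model: str = "llama3.1:8b",
    max_tokens: int = 256,
    prompt: str = "Summarize concisely.",
) -> Dict[str, Any]:
    """Call Ollama through `post` and map the reply onto the API response."""
    payload = {
        "model": model,
        "prompt": f"{prompt}\n\n{text}",
        "stream": False,
        "options": {"num_predict": max_tokens},
    }
    started = time.perf_counter()
    data = post("/api/generate", payload)
    latency_ms = int((time.perf_counter() - started) * 1000)
    return {
        "summary": data["response"].strip(),
        "model": data.get("model", model),
        "tokens_used": data.get("prompt_eval_count", 0) + data.get("eval_count", 0),
        "latency_ms": latency_ms,
    }


def test_summarize_with_mocked_ollama():
    post = Mock(return_value=MOCK_OLLAMA_RESPONSE)
    result = summarize(SAMPLE_TEXT, post, max_tokens=64)

    assert result["summary"] == SAMPLE_SUMMARY
    assert result["tokens_used"] == 20
    path, payload = post.call_args[0]
    assert path == "/api/generate"
    assert payload["options"]["num_predict"] == 64
    assert SAMPLE_TEXT in payload["prompt"]
```

Injecting the HTTP call keeps the test independent of the HTTP client; in the real service, `post` would wrap whichever client the project settles on.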
### Continuous Testing
- Tests run on every code change
- Pre-commit hooks for test execution
- CI/CD pipeline integration ready
## Android Integration
### Example Android HTTP Client
```kotlin
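// Using Retrofit (built on OkHttp) with a JSON converter such as Gson
import retrofit2.http.Body
import retrofit2.http.POST

// Property names mirror the JSON keys, so no field annotations are needed.
data class SummarizeRequest(
    val text: String,
    val max_tokens: Int = 256,
    val prompt: String = "Summarize concisely."
)

data class SummarizeResponse(
    val summary: String,
    val model: String,
    val tokens_used: Int,
    val latency_ms: Int
)

// API call; the interface name SummarizerApi is illustrative
interface SummarizerApi {
    @POST("api/v1/summarize")
    suspend fun summarize(@Body request: SummarizeRequest): SummarizeResponse
}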
```
## Cloud Deployment Considerations
### Future Extensions
- **Authentication**: API key or OAuth2
- **Rate Limiting**: Redis-based distributed rate limiting
- **Monitoring**: Prometheus metrics, health checks
- **Scaling**: Multiple replicas, load balancing
- **Database**: Usage tracking, user management
- **Caching**: Redis for response caching
- **Security**: HTTPS, input sanitization, CORS policies
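
Until a Redis-based limiter is needed, the `RATE_LIMIT_REQUESTS` and `RATE_LIMIT_WINDOW` settings could be enforced in-process. The following is a minimal single-process sketch under that assumption. The class name and the per-client keying are hypothetical, and the state is not shared across replicas.

```python
import time
from collections import defaultdict, deque
from typing import Callable, Deque, Dict


class SlidingWindowLimiter:
    """Allow at most `max_requests` per client within `window_seconds`."""

    def __init__(
        self,
        max_requests: int = 60,
        window_seconds: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._clock = clock  # injectable so tests can control time
        self._hits: Dict[str, Deque[float]] = defaultdict(deque)

    def allow(self, client_id: str) -> bool:
        """Record a request and return False if the client is over the limit."""
        now = self._clock()
        hits = self._hits[client_id]
        # Drop timestamps that have fallen out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False
        hits.append(now)
        return True
```

A handler or middleware would call `allow()` with a client key such as the API key or remote address and return HTTP 429 when it yields `False`.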
### Deployment Options
- **Docker**: Containerized deployment
- **Cloud Platforms**: AWS, GCP, Azure, Railway, Render
- **Serverless**: AWS Lambda, Vercel Functions (calling a remotely hosted Ollama API rather than running the model inside the function)
- **VPS**: DigitalOcean, Linode with Docker
## Success Criteria
- [ ] API responds to health checks
- [ ] Successfully summarizes text via Ollama
- [ ] Handles errors gracefully
- [ ] Works with Android app
- [ ] Can be containerized
- [ ] **All tests pass with at least 90% coverage**
- [ ] Documentation is complete
## Future Enhancements (Post-v1)
- [ ] Streaming responses
- [ ] Batch summarization
- [ ] Multiple model support
- [ ] Prompt templates and presets
- [ ] Usage analytics
- [ ] Multi-language support
- [ ] Advanced rate limiting
- [ ] User authentication and authorization