# Architecture Documentation

## Overview

The NLP Analysis API follows a clean architecture pattern with clear separation of concerns. This document explains the structure and design decisions.

## Directory Structure

```
sentimant/
├── main.py                     # Application entry point
├── run_server.py               # Server startup script
├── requirements.txt            # Dependencies
├── README.md                   # User documentation
├── ARCHITECTURE.md             # This file
└── lib/                        # Core application code
    ├── __init__.py
    ├── models.py               # Data models/schemas
    ├── services.py             # Business logic
    ├── routes.py               # API routes
    └── providers/              # Model management
        ├── __init__.py
        └── model_providers.py  # Model providers
```

## Architecture Layers

### 1. Models Layer (`lib/models.py`)

**Responsibility**: Define data structures using Pydantic for:
- Request validation
- Response serialization
- Type safety

**Key Models**:
- `TextInput`: Input for text-based operations
- `BatchTextInput`: Input for batch processing
- `SentimentResponse`: Sentiment analysis output
- `NERResponse`: Named Entity Recognition output
- `TranslationResponse`: Translation output
- `Entity`: Individual entity structure

### 2. Providers Layer (`lib/providers/model_providers.py`)

**Responsibility**: Model loading, initialization, and prediction

**Design Pattern**: Provider pattern

**Key Components**:

#### `ModelProvider` (Base Class)
- Abstract base for all model providers
- Defines the interface: `load_model()`, `predict()`, `is_loaded()`

#### `SentimentModelProvider`
- Manages sentiment analysis models
- Default: `cardiffnlp/twitter-roberta-base-sentiment-latest`
- Handles model loading errors with a fallback

#### `NERModelProvider`
- Manages Named Entity Recognition models
- Default: `dslim/bert-base-NER`
- Returns aggregated entities

#### `TranslationModelProvider`
- Manages translation models
- Lazily loads models per language pair
- Caches loaded models in memory
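The two sketches below make the Models and Providers layers concrete. First, the models: the class names come from the list in the Models layer above, while every field name and constraint is an illustrative assumption, not the actual schema in `lib/models.py`.

```python
from pydantic import BaseModel, Field


class TextInput(BaseModel):
    """Input for text-based operations (fields are assumed)."""
    text: str = Field(..., min_length=1, description="Text to analyze")


class Entity(BaseModel):
    """Individual entity structure (fields are assumed)."""
    text: str
    label: str
    score: float


class SentimentResponse(BaseModel):
    """Sentiment analysis output (fields are assumed)."""
    text: str
    sentiment: str  # e.g. "Positive", "Neutral", "Negative"
    confidence: float
```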
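Second, the provider contract and the lazy per-language-pair loading just described. The method names `load_model()`, `predict()`, and `is_loaded()` come from this document; the MarianMT checkpoint naming and all other implementation details are assumptions.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, Tuple

from transformers import pipeline


class ModelProvider(ABC):
    """Abstract base for all model providers."""

    def __init__(self) -> None:
        self._model: Any = None

    @abstractmethod
    def load_model(self) -> None:
        """Load the underlying model into memory."""

    @abstractmethod
    def predict(self, text: str) -> Any:
        """Run inference on a single text."""

    def is_loaded(self) -> bool:
        return self._model is not None


class TranslationModelProvider(ModelProvider):
    """Lazily loads and caches one translation pipeline per language pair."""

    def __init__(self) -> None:
        super().__init__()
        self._pipelines: Dict[Tuple[str, str], Any] = {}

    def load_model(self) -> None:
        # Intentionally a no-op: models are loaded lazily in predict()
        pass

    def is_loaded(self) -> bool:
        return bool(self._pipelines)

    def predict(self, text: str, source: str = "en", target: str = "de") -> str:
        key = (source, target)
        if key not in self._pipelines:
            # Checkpoint naming is an assumption (MarianMT convention on the Hub)
            self._pipelines[key] = pipeline(
                "translation",
                model=f"Helsinki-NLP/opus-mt-{source}-{target}",
            )
        return self._pipelines[key](text)[0]["translation_text"]
```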
### 3. Services Layer (`lib/services.py`)

**Responsibility**: Business logic and data transformation

**Key Services**:

#### `SentimentService`
- Analyzes sentiment using `SentimentModelProvider`
- Formats results into `SentimentResponse`
- Maps model labels to a user-friendly format
- Handles batch processing

#### `NERService`
- Extracts entities using `NERModelProvider`
- Converts raw predictions to `Entity` objects
- Returns a structured `NERResponse`

#### `TranslationService`
- Translates text using `TranslationModelProvider`
- Manages language pair selection
- Returns clean translation text

### 4. Routes Layer (`lib/routes.py`)

**Responsibility**: API endpoint definitions and HTTP handling

**Features**:
- FastAPI dependency injection for services
- Error handling and HTTP exceptions
- Request/response model validation

**Endpoints**:
- `GET /`: Basic status
- `GET /health`: Health check with model status
- `POST /analyze`: Sentiment analysis
- `POST /analyze-batch`: Batch sentiment analysis
- `POST /ner`: Named Entity Recognition
- `POST /translate`: Translation

### 5. Application Layer (`main.py`)

**Responsibility**: Application initialization and configuration

**Key Responsibilities**:
- FastAPI app creation
- CORS configuration
- Model provider initialization
- Service initialization
- Model loading on startup
- Router registration
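To show how these layers connect, here is a hedged sketch of `SentimentService` together with the `/analyze` route. Only the class, module, and endpoint names come from this document; the label mapping, the assumed prediction shape, and the dependency getter are illustrative.

```python
from fastapi import APIRouter, Depends

from lib.models import SentimentResponse, TextInput
from lib.providers.model_providers import SentimentModelProvider

router = APIRouter()

# The raw labels and this mapping are assumptions for illustration
LABEL_MAP = {"negative": "Negative", "neutral": "Neutral", "positive": "Positive"}


class SentimentService:
    """Turns raw provider predictions into SentimentResponse objects."""

    def __init__(self, provider: SentimentModelProvider) -> None:
        self.provider = provider

    def analyze(self, text: str) -> SentimentResponse:
        raw = self.provider.predict(text)  # assumed shape: {"label": ..., "score": ...}
        return SentimentResponse(
            text=text,
            sentiment=LABEL_MAP.get(raw["label"], raw["label"]),
            confidence=raw["score"],
        )


def get_sentiment_service() -> SentimentService:
    # Returns the singleton created in main.py (wiring omitted in this sketch)
    ...


@router.post("/analyze", response_model=SentimentResponse)
async def analyze(
    input_data: TextInput,
    service: SentimentService = Depends(get_sentiment_service),
):
    return service.analyze(input_data.text)
```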
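And a sketch of the `main.py` wiring listed above, assuming permissive CORS settings and a startup event hook; the real configuration may differ.

```python
from fastapi import FastAPI
from fastapi.middleware.cors import CORSMiddleware

from lib import routes
from lib.providers.model_providers import NERModelProvider, SentimentModelProvider
from lib.services import SentimentService

# FastAPI app creation
app = FastAPI(title="NLP Analysis API")

# CORS configuration (permissive settings here are an assumption)
app.add_middleware(
    CORSMiddleware,
    allow_origins=["*"],
    allow_methods=["*"],
    allow_headers=["*"],
)

# Model provider and service initialization
sentiment_provider = SentimentModelProvider()
ner_provider = NERModelProvider()
sentiment_service = SentimentService(sentiment_provider)


@app.on_event("startup")
def load_models() -> None:
    # Eager loading for core models; translation models load lazily on demand
    sentiment_provider.load_model()
    ner_provider.load_model()


# Router registration
app.include_router(routes.router)
```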
## Data Flow

```
Client Request
      ↓
FastAPI Routes (lib/routes.py)
      ↓
Service Layer (lib/services.py)
      ↓
Model Provider (lib/providers/model_providers.py)
      ↓
Hugging Face Transformers
      ↓
Raw Prediction
      ↓
Service Layer (data transformation)
      ↓
Pydantic Model (validation)
      ↓
JSON Response to Client
```

## Design Principles

### 1. Separation of Concerns
- Each layer has a single, well-defined responsibility
- Models don't contain business logic
- Providers don't know about services
- Routes don't contain business logic

### 2. Dependency Injection
- Services are injected into routes via FastAPI dependencies
- Enables easy testing and mocking
- Loose coupling between components

### 3. Clean Interfaces
- Abstract base classes define contracts
- Consistent method signatures
- Type hints throughout

### 4. Error Handling
- Comprehensive exception handling at each layer
- User-friendly error messages
- Proper HTTP status codes

### 5. Model Management
- Lazy loading for translation models
- Eager loading for core models (sentiment, NER)
- Caching to avoid redundant loads

## Extension Points

### Adding a New Model Type

1. **Create Provider** (`lib/providers/model_providers.py`):

```python
class NewModelProvider(ModelProvider):
    def __init__(self, model_name: str = "model/path"):
        super().__init__()
        self.model_name = model_name

    def load_model(self):
        # Load model logic
        pass

    def predict(self, text: str):
        # Prediction logic
        pass
```

2. **Create Service** (`lib/services.py`):

```python
class NewModelService:
    def __init__(self, model_provider: NewModelProvider):
        self.model_provider = model_provider

    def process(self, text: str) -> ResponseModel:
        # Business logic
        pass
```

3. **Add Route** (`lib/routes.py`):

```python
@router.post("/new-endpoint", response_model=ResponseModel)
async def new_endpoint(
    input_data: InputModel,
    service: NewModelService = Depends(get_new_model_service),
):
    return service.process(input_data.text)
```

4. **Register in main.py**:

```python
new_model = NewModelProvider()
new_service = NewModelService(new_model)
# Add to routes
```

### Adding a New Endpoint

1. Create the route in `lib/routes.py`
2. Use dependency injection for services
3. Define request/response models in `lib/models.py`
4. The router is already registered in `main.py`, so new routes are picked up automatically

## Testing Strategy

### Unit Tests
- Test each service independently
- Mock model providers
- Test data transformations

### Integration Tests
- Test the full request/response cycle
- Use test fixtures
- Verify model outputs

### Load Tests
- Test batch processing
- Test concurrent requests
- Measure response times

## Deployment Considerations

### Model Loading
- First request may be slow (cold start)
- Consider warming up models on startup
- Monitor memory usage

### Caching
- Translation models are cached in memory
- Consider Redis for distributed caching
- Cache predictions for frequently used texts

### Scaling
- Stateless design enables horizontal scaling
- Consider separating model serving from the API
- Use load balancing

## Future Enhancements

1. **Model Registry**: Centralized model management
2. **Async Processing**: Background task queue for long operations
3. **Model Versioning**: Support multiple model versions
4. **Metrics**: Prometheus metrics integration
5. **Auth**: API key authentication
6. **Rate Limiting**: Request rate limiting
7. **Batch Processing**: Async batch job processing
8. **Model A/B Testing**: Compare model performance

## Performance Optimizations

1. **Model Quantization**: Reduce model size and improve inference speed
2. **TensorRT/ONNX**: Faster inference
3. **Batching**: Process multiple texts together
4. **GPU Support**: CUDA acceleration
5. **Connection Pooling**: Efficient database connections
6. **Response Caching**: Cache frequent requests

## Security Considerations

1. **Input Validation**: All inputs validated via Pydantic
2. **Rate Limiting**: Prevent abuse
3. **CORS**: Configured for the Flutter app
4. **Logging**: Comprehensive logging for audit
5. **Error Messages**: Don't expose internal details
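To close with two of the practices above in code: first, the unit-test approach of mocking model providers, building on the hypothetical `SentimentService` sketch from the Services section (the method and label names are assumptions, not the project's actual test code).

```python
from lib.services import SentimentService  # real module per this doc; method shape assumed


class StubSentimentProvider:
    """Stands in for SentimentModelProvider; no model is ever loaded."""

    def predict(self, text: str) -> dict:
        return {"label": "positive", "score": 0.99}


def test_analyze_maps_labels() -> None:
    service = SentimentService(provider=StubSentimentProvider())
    response = service.analyze("great product")
    assert response.sentiment == "Positive"  # per the assumed LABEL_MAP
    assert response.confidence == 0.99
```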
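Second, the response caching mentioned under Deployment Considerations and Performance Optimizations. This is a single-process sketch; a multi-instance deployment would use Redis instead, as noted above.

```python
from functools import lru_cache


class CachingSentimentService:
    """Wraps a sentiment service with an in-memory LRU cache (single process only)."""

    def __init__(self, inner) -> None:
        self._inner = inner
        # Cache keyed by the exact input text; the size bound is arbitrary
        self._cached_analyze = lru_cache(maxsize=2048)(inner.analyze)

    def analyze(self, text: str):
        return self._cached_analyze(text)
```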