# Architecture Documentation

## Overview

The NLP Analysis API follows a clean architecture pattern with clear separation of concerns. This document explains the structure and design decisions.

## Directory Structure

```
sentimant/
├── main.py                      # Application entry point
├── run_server.py                # Server startup script
├── requirements.txt             # Dependencies
├── README.md                    # User documentation
├── ARCHITECTURE.md              # This file
└── lib/                         # Core application code
    ├── __init__.py
    ├── models.py                # Data models/schemas
    ├── services.py              # Business logic
    ├── routes.py                # API routes
    └── providers/               # Model management
        ├── __init__.py
        └── model_providers.py   # Model providers
```

## Architecture Layers

### 1. Models Layer (`lib/models.py`)

**Responsibility**: Define data structures using Pydantic for:
- Request validation
- Response serialization
- Type safety

**Key Models**:
- `TextInput`: Input for text-based operations
- `BatchTextInput`: Input for batch processing
- `SentimentResponse`: Sentiment analysis output
- `NERResponse`: Named Entity Recognition output
- `TranslationResponse`: Translation output
- `Entity`: Individual entity structure
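
A minimal sketch of what these models might look like; the field names here are illustrative assumptions, not the exact schema in `lib/models.py`:

```python
from typing import List
from pydantic import BaseModel, Field

class TextInput(BaseModel):
    # Reject empty strings at the validation layer (assumed constraint)
    text: str = Field(..., min_length=1)

class BatchTextInput(BaseModel):
    texts: List[str]

class SentimentResponse(BaseModel):
    text: str
    sentiment: str
    confidence: float
```

FastAPI uses these classes both to validate incoming JSON and to serialize responses, so invalid requests are rejected before any service code runs.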

### 2. Providers Layer (`lib/providers/model_providers.py`)

**Responsibility**: Model loading, initialization, and prediction

**Design Pattern**: Provider pattern

**Key Components**:

#### `ModelProvider` (Base Class)
- Abstract base for all model providers
- Defines interface: `load_model()`, `predict()`, `is_loaded()`

#### `SentimentModelProvider`
- Manages sentiment analysis models
- Default: `cardiffnlp/twitter-roberta-base-sentiment-latest`
- Handles model loading errors with fallback

#### `NERModelProvider`
- Manages Named Entity Recognition models
- Default: `dslim/bert-base-NER`
- Returns aggregated entities

#### `TranslationModelProvider`
- Manages translation models
- Lazy loads models per language pair
- Caches loaded models in memory
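
The provider contract and the lazy, cached loading described above can be sketched as follows. The real `transformers.pipeline(...)` call is replaced with a placeholder so the shape of the caching logic stands on its own; method and attribute names are assumptions:

```python
from abc import ABC, abstractmethod
from typing import Any, Dict

class ModelProvider(ABC):
    """Abstract base: every provider exposes the same interface."""

    def __init__(self) -> None:
        self.model: Any = None

    @abstractmethod
    def load_model(self) -> None:
        """Load the underlying model into memory."""

    @abstractmethod
    def predict(self, text: str) -> Any:
        """Run inference on a single text."""

    def is_loaded(self) -> bool:
        return self.model is not None

class TranslationModelProvider(ModelProvider):
    """Lazily loads one model per language pair and caches it."""

    def __init__(self) -> None:
        super().__init__()
        self._pipelines: Dict[str, Any] = {}

    def load_model(self) -> None:
        # Nothing eager to do: language pairs are loaded on first use.
        self.model = self._pipelines

    def _load_pipeline(self, pair: str) -> Any:
        # Placeholder: the real implementation would call
        # transformers.pipeline("translation", model=...) here.
        return lambda text: f"[{pair}] {text}"

    def get_pipeline(self, src: str, tgt: str) -> Any:
        pair = f"{src}-{tgt}"
        if pair not in self._pipelines:  # cache miss: lazy load once
            self._pipelines[pair] = self._load_pipeline(pair)
        return self._pipelines[pair]

    def predict(self, text: str, src: str = "en", tgt: str = "de") -> str:
        return self.get_pipeline(src, tgt)(text)
```

Because the cache lives in instance memory, repeated requests for the same language pair pay the loading cost only once per process.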

### 3. Services Layer (`lib/services.py`)

**Responsibility**: Business logic and data transformation

**Key Services**:

#### `SentimentService`
- Analyzes sentiment using `SentimentModelProvider`
- Formats results into `SentimentResponse`
- Maps model labels to user-friendly format
- Handles batch processing

#### `NERService`
- Extracts entities using `NERModelProvider`
- Converts raw predictions to `Entity` objects
- Returns structured `NERResponse`

#### `TranslationService`
- Translates text using `TranslationModelProvider`
- Manages language pair selection
- Returns clean translation text
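
The label-mapping and batch behavior of `SentimentService` might look like the sketch below; the `LABEL_*` raw labels and the dict-shaped response are assumptions (the real service returns a `SentimentResponse`, and actual labels depend on the model card):

```python
from typing import Any, Dict, List

# Assumed raw-label mapping; real labels depend on the loaded model.
LABEL_MAP: Dict[str, str] = {
    "LABEL_0": "negative",
    "LABEL_1": "neutral",
    "LABEL_2": "positive",
}

class SentimentService:
    def __init__(self, provider: Any) -> None:
        self.provider = provider

    def analyze(self, text: str) -> Dict[str, Any]:
        # Provider returns e.g. {"label": "LABEL_2", "score": 0.98}
        raw = self.provider.predict(text)
        return {
            "text": text,
            "sentiment": LABEL_MAP.get(raw["label"], raw["label"]),
            "confidence": round(raw["score"], 4),
        }

    def analyze_batch(self, texts: List[str]) -> List[Dict[str, Any]]:
        return [self.analyze(t) for t in texts]
```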

### 4. Routes Layer (`lib/routes.py`)

**Responsibility**: API endpoint definitions and HTTP handling

**Features**:
- FastAPI dependency injection for services
- Error handling and HTTP exceptions
- Request/response model validation

**Endpoints**:
- `GET /`: Basic status
- `GET /health`: Health check with model status
- `POST /analyze`: Sentiment analysis
- `POST /analyze-batch`: Batch sentiment analysis
- `POST /ner`: Named Entity Recognition
- `POST /translate`: Translation

### 5. Application Layer (`main.py`)

**Responsibility**: Application initialization and configuration

**Key Responsibilities**:
- FastAPI app creation
- CORS configuration
- Model provider initialization
- Service initialization
- Model loading on startup
- Router registration

## Data Flow

```
Client Request
    ↓
FastAPI Routes (lib/routes.py)
    ↓
Service Layer (lib/services.py)
    ↓
Model Provider (lib/providers/model_providers.py)
    ↓
Hugging Face Transformers
    ↓
Raw Prediction
    ↓
Service Layer (data transformation)
    ↓
Pydantic Model (validation)
    ↓
JSON Response to Client
```

## Design Principles

### 1. Separation of Concerns
- Each layer has a single, well-defined responsibility
- Models don't contain business logic
- Providers don't know about services
- Routes don't contain business logic

### 2. Dependency Injection
- Services injected into routes via FastAPI dependencies
- Enables easy testing and mocking
- Loose coupling between components

### 3. Clean Interfaces
- Abstract base classes define contracts
- Consistent method signatures
- Type hints throughout

### 4. Error Handling
- Comprehensive exception handling at each layer
- User-friendly error messages
- Proper HTTP status codes

### 5. Model Management
- Lazy loading for translation models
- Eager loading for core models (sentiment, NER)
- Caching to avoid redundant loads

## Extension Points

### Adding a New Model Type

1. **Create Provider** (`lib/providers/model_providers.py`):
```python
class NewModelProvider(ModelProvider):
    def __init__(self, model_name: str = "model/path"):
        super().__init__()
        self.model_name = model_name
    
    def load_model(self):
        # Load model logic
        pass
    
    def predict(self, text: str):
        # Prediction logic
        pass
```

2. **Create Service** (`lib/services.py`):
```python
class NewModelService:
    def __init__(self, model_provider: NewModelProvider):
        self.model_provider = model_provider
    
    def process(self, text: str) -> ResponseModel:
        # Business logic
        pass
```

3. **Add Route** (`lib/routes.py`):
```python
@router.post("/new-endpoint", response_model=ResponseModel)
async def new_endpoint(
    input_data: InputModel,
    service: NewModelService = Depends(get_new_model_service)
):
    return service.process(input_data.text)
```

4. **Register in main.py**:
```python
new_model = NewModelProvider()
new_service = NewModelService(new_model)
# Add to routes
```

### Adding a New Endpoint

1. Create route in `lib/routes.py`
2. Use dependency injection for services
3. Define request/response models in `lib/models.py`
4. The router is already registered in `main.py`, so the new endpoint is served automatically

## Testing Strategy

### Unit Tests
- Test each service independently
- Mock model providers
- Test data transformations
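
A unit test in this style might look like the sketch below. The minimal `SentimentService` defined inline is a stand-in for the real class in `lib/services.py`, so the example runs on its own:

```python
from unittest.mock import Mock

class SentimentService:
    """Minimal stand-in for lib.services.SentimentService (assumption)."""
    def __init__(self, provider):
        self.provider = provider

    def analyze(self, text):
        raw = self.provider.predict(text)
        return {"text": text,
                "sentiment": raw["label"],
                "confidence": raw["score"]}

def test_analyze_uses_mocked_provider():
    # Mock the provider so no model is loaded during the test.
    provider = Mock()
    provider.predict.return_value = {"label": "positive", "score": 0.9}

    result = SentimentService(provider).analyze("great product")

    provider.predict.assert_called_once_with("great product")
    assert result["sentiment"] == "positive"
    assert result["confidence"] == 0.9
```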

### Integration Tests
- Test full request/response cycle
- Use test fixtures
- Verify model outputs

### Load Tests
- Test batch processing
- Test concurrent requests
- Measure response times

## Deployment Considerations

### Model Loading
- First request may be slow (cold start)
- Consider warming up models on startup
- Monitor memory usage

### Caching
- Translation models cached in memory
- Consider Redis for distributed caching
- Cache predictions for frequently used texts
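
For a single process, an in-memory prediction cache can be as simple as `functools.lru_cache`, assuming predictions are deterministic for a given text; the `_predict` helper below is a hypothetical stand-in for the provider call:

```python
from functools import lru_cache

calls = {"n": 0}  # counts real provider invocations, for illustration

def _predict(text: str) -> str:
    # Hypothetical stand-in for the real provider call.
    calls["n"] += 1
    return "positive"

@lru_cache(maxsize=1024)
def cached_sentiment(text: str) -> str:
    # Deterministic model output makes caching by input text safe.
    return _predict(text)
```

A distributed deployment would move this cache to Redis, since `lru_cache` is per-process.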

### Scaling
- Stateless design enables horizontal scaling
- Consider model server separation
- Use load balancing

## Future Enhancements

1. **Model Registry**: Centralized model management
2. **Async Processing**: Background task queue for long operations
3. **Model Versioning**: Support multiple model versions
4. **Metrics**: Prometheus metrics integration
5. **Auth**: API key authentication
6. **Rate Limiting**: Request rate limiting
7. **Batch Processing**: Async batch job processing
8. **Model A/B Testing**: Compare model performance

## Performance Optimizations

1. **Model Quantization**: Reduce model size and inference latency
2. **TensorRT/ONNX**: Faster inference
3. **Batching**: Process multiple texts together
4. **GPU Support**: CUDA acceleration
5. **Connection Pooling**: Efficient database connections
6. **Response Caching**: Cache frequent requests

## Security Considerations

1. **Input Validation**: All inputs validated via Pydantic
2. **Rate Limiting**: Prevent abuse
3. **CORS**: Configured for Flutter app
4. **Logging**: Comprehensive logging for audit
5. **Error Messages**: Don't expose internal details