# Architecture Documentation

## Overview

The NLP Analysis API follows a clean architecture pattern with clear separation of concerns. This document explains the structure and design decisions.

## Directory Structure
```
sentimant/
├── main.py              # Application entry point
├── run_server.py        # Server startup script
├── requirements.txt     # Dependencies
├── README.md            # User documentation
├── ARCHITECTURE.md      # This file
└── lib/                 # Core application code
    ├── __init__.py
    ├── models.py        # Data models/schemas
    ├── services.py      # Business logic
    ├── routes.py        # API routes
    └── providers/       # Model management
        ├── __init__.py
        └── model_providers.py  # Model providers
```
## Architecture Layers

### 1. Models Layer (`lib/models.py`)

**Responsibility:** Define data structures using Pydantic for:

- Request validation
- Response serialization
- Type safety

**Key Models:**

- `TextInput`: Input for text-based operations
- `BatchTextInput`: Input for batch processing
- `SentimentResponse`: Sentiment analysis output
- `NERResponse`: Named Entity Recognition output
- `TranslationResponse`: Translation output
- `Entity`: Individual entity structure
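As a rough sketch, the models above might look like the following. The exact field names are assumptions for illustration, not the project's actual schema:

```python
from pydantic import BaseModel


class TextInput(BaseModel):
    text: str


class BatchTextInput(BaseModel):
    texts: list[str]


class SentimentResponse(BaseModel):
    text: str
    sentiment: str
    confidence: float


class Entity(BaseModel):
    text: str
    label: str
    score: float


class NERResponse(BaseModel):
    text: str
    entities: list[Entity]


class TranslationResponse(BaseModel):
    translated_text: str
```

Because these are Pydantic models, FastAPI validates incoming JSON against them automatically and rejects malformed requests with a 422 before any business logic runs.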
### 2. Providers Layer (`lib/providers/model_providers.py`)

**Responsibility:** Model loading, initialization, and prediction

**Design Pattern:** Provider pattern

**Key Components:**

**`ModelProvider` (Base Class)**

- Abstract base for all model providers
- Defines the interface: `load_model()`, `predict()`, `is_loaded()`

**`SentimentModelProvider`**

- Manages sentiment analysis models
- Default: `cardiffnlp/twitter-roberta-base-sentiment-latest`
- Handles model loading errors with a fallback

**`NERModelProvider`**

- Manages Named Entity Recognition models
- Default: `dslim/bert-base-NER`
- Returns aggregated entities

**`TranslationModelProvider`**

- Manages translation models
- Lazy-loads models per language pair
- Caches loaded models in memory
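A minimal sketch of the provider pattern and the per-language-pair lazy cache, assuming the interface described above. The string "model" built here is a stand-in for a real Hugging Face pipeline:

```python
from abc import ABC, abstractmethod


class ModelProvider(ABC):
    """Abstract base: every provider exposes load_model/predict/is_loaded."""

    def __init__(self):
        self._model = None

    @abstractmethod
    def load_model(self):
        ...

    @abstractmethod
    def predict(self, text: str):
        ...

    def is_loaded(self) -> bool:
        return self._model is not None


class TranslationModelProvider(ModelProvider):
    """Lazy-loads one model per language pair and caches it in memory."""

    def __init__(self):
        super().__init__()
        self._models = {}   # (src, tgt) -> loaded model
        self.load_count = 0  # tracks actual (expensive) loads

    def load_model(self, src: str = "en", tgt: str = "fr"):
        pair = (src, tgt)
        if pair not in self._models:  # only load on first use of the pair
            self.load_count += 1
            # Stand-in for e.g. transformers.pipeline("translation", ...)
            self._models[pair] = f"model:{src}-{tgt}"
        self._model = self._models[pair]
        return self._model

    def predict(self, text: str, src: str = "en", tgt: str = "fr"):
        model = self.load_model(src, tgt)
        return f"[{model}] {text}"
```

The cache means the second request for the same language pair skips the expensive load entirely, while unused pairs never consume memory.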
### 3. Services Layer (`lib/services.py`)

**Responsibility:** Business logic and data transformation

**Key Services:**

**`SentimentService`**

- Analyzes sentiment using `SentimentModelProvider`
- Formats results into `SentimentResponse`
- Maps model labels to a user-friendly format
- Handles batch processing

**`NERService`**

- Extracts entities using `NERModelProvider`
- Converts raw predictions to `Entity` objects
- Returns a structured `NERResponse`

**`TranslationService`**

- Translates text using `TranslationModelProvider`
- Manages language pair selection
- Returns clean translation text
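The label mapping and batch path in `SentimentService` could be sketched as below. The raw labels are an assumption (some cardiffnlp checkpoints emit `LABEL_0`/`LABEL_1`/`LABEL_2`); adjust the map to whatever the loaded model actually returns:

```python
# Assumed raw-label -> user-friendly mapping; not the project's exact table.
LABEL_MAP = {
    "LABEL_0": "negative",
    "LABEL_1": "neutral",
    "LABEL_2": "positive",
}


class SentimentService:
    def __init__(self, provider):
        self.provider = provider  # any object with predict(text)

    def analyze(self, text: str) -> dict:
        # Provider returns e.g. {"label": "LABEL_2", "score": 0.98}
        raw = self.provider.predict(text)
        return {
            "text": text,
            "sentiment": LABEL_MAP.get(raw["label"], raw["label"]),
            "confidence": round(raw["score"], 4),
        }

    def analyze_batch(self, texts: list[str]) -> list[dict]:
        return [self.analyze(t) for t in texts]
```

Keeping the mapping in the service (not the provider) means swapping the underlying model only requires updating one table, and the API response format never changes.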
### 4. Routes Layer (`lib/routes.py`)

**Responsibility:** API endpoint definitions and HTTP handling

**Features:**

- FastAPI dependency injection for services
- Error handling and HTTP exceptions
- Request/response model validation

**Endpoints:**

- `GET /`: Basic status
- `GET /health`: Health check with model status
- `POST /analyze`: Sentiment analysis
- `POST /analyze-batch`: Batch sentiment analysis
- `POST /ner`: Named Entity Recognition
- `POST /translate`: Translation
### 5. Application Layer (`main.py`)

**Responsibility:** Application initialization and configuration

**Key Responsibilities:**

- FastAPI app creation
- CORS configuration
- Model provider initialization
- Service initialization
- Model loading on startup
- Router registration
## Data Flow

```
Client Request
      ↓
FastAPI Routes (lib/routes.py)
      ↓
Service Layer (lib/services.py)
      ↓
Model Provider (lib/providers/model_providers.py)
      ↓
Hugging Face Transformers
      ↓
Raw Prediction
      ↓
Service Layer (data transformation)
      ↓
Pydantic Model (validation)
      ↓
JSON Response to Client
```
## Design Principles

### 1. Separation of Concerns

- Each layer has a single, well-defined responsibility
- Models don't contain business logic
- Providers don't know about services
- Routes don't contain business logic

### 2. Dependency Injection

- Services are injected into routes via FastAPI dependencies
- Enables easy testing and mocking
- Loose coupling between components
### 3. Clean Interfaces

- Abstract base classes define contracts
- Consistent method signatures
- Type hints throughout

### 4. Error Handling

- Comprehensive exception handling at each layer
- User-friendly error messages
- Proper HTTP status codes

### 5. Model Management

- Lazy loading for translation models
- Eager loading for core models (sentiment, NER)
- Caching to avoid redundant loads
## Extension Points

### Adding a New Model Type

1. **Create a provider** (`lib/providers/model_providers.py`):

```python
class NewModelProvider(ModelProvider):
    def __init__(self, model_name: str = "model/path"):
        super().__init__()
        self.model_name = model_name

    def load_model(self):
        # Load model logic
        pass

    def predict(self, text: str):
        # Prediction logic
        pass
```

2. **Create a service** (`lib/services.py`):

```python
class NewModelService:
    def __init__(self, model_provider: NewModelProvider):
        self.model_provider = model_provider

    def process(self, text: str) -> ResponseModel:
        # Business logic
        pass
```

3. **Add a route** (`lib/routes.py`):

```python
@router.post("/new-endpoint", response_model=ResponseModel)
async def new_endpoint(
    input_data: InputModel,
    service: NewModelService = Depends(get_new_model_service),
):
    return service.process(input_data.text)
```

4. **Register in `main.py`**:

```python
new_model = NewModelProvider()
new_service = NewModelService(new_model)
# Add to routes
```
### Adding a New Endpoint

1. Create the route in `lib/routes.py`
2. Use dependency injection for services
3. Define request/response models in `lib/models.py`
4. The router picks it up automatically
## Testing Strategy

### Unit Tests

- Test each service independently
- Mock model providers
- Test data transformations
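Mocking a provider in a unit test could look like the following sketch; the minimal `NERService` and the raw-prediction shape (`word`/`entity_group`, as produced by aggregated Hugging Face NER pipelines) are assumptions:

```python
from unittest.mock import Mock


class NERService:
    """Minimal stand-in for the real service in lib/services.py."""

    def __init__(self, provider):
        self.provider = provider

    def extract(self, text: str) -> list[dict]:
        raw = self.provider.predict(text)
        # Transform raw pipeline output into the API's entity shape
        return [{"text": e["word"], "label": e["entity_group"]} for e in raw]


def test_ner_service_transforms_predictions():
    provider = Mock()
    provider.predict.return_value = [{"word": "Paris", "entity_group": "LOC"}]
    service = NERService(provider)

    entities = service.extract("I love Paris")

    provider.predict.assert_called_once_with("I love Paris")
    assert entities == [{"text": "Paris", "label": "LOC"}]
```

No model is loaded here, so the test runs in milliseconds and only verifies the data transformation the service owns.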
### Integration Tests

- Test the full request/response cycle
- Use test fixtures
- Verify model outputs

### Load Tests

- Test batch processing
- Test concurrent requests
- Measure response times
## Deployment Considerations

### Model Loading

- The first request may be slow (cold start)
- Consider warming up models on startup
- Monitor memory usage

### Caching

- Translation models are cached in memory
- Consider Redis for distributed caching
- Cache predictions for frequently repeated texts
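For a single-process deployment, prediction caching can be as simple as the sketch below; the predict body is a stand-in for a real model call, and a shared store like Redis would replace `lru_cache` once there are multiple workers:

```python
from functools import lru_cache

CALLS = {"n": 0}  # counts actual (cache-miss) model invocations


@lru_cache(maxsize=1024)
def cached_predict(text: str) -> str:
    """Cache predictions keyed on the exact input text.

    Identical repeated requests skip inference entirely; the body here
    is a placeholder for the real model call.
    """
    CALLS["n"] += 1
    return "positive" if "good" in text.lower() else "neutral"
```

`cached_predict.cache_info()` exposes hit/miss counts, which is useful for deciding whether the cache size is paying off.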
### Scaling

- Stateless design enables horizontal scaling
- Consider separating the model server from the API
- Use load balancing
## Future Enhancements

- **Model Registry**: Centralized model management
- **Async Processing**: Background task queue for long-running operations
- **Model Versioning**: Support multiple model versions
- **Metrics**: Prometheus metrics integration
- **Auth**: API key authentication
- **Rate Limiting**: Request rate limiting
- **Batch Processing**: Async batch job processing
- **Model A/B Testing**: Compare model performance
## Performance Optimizations

- **Model Quantization**: Reduce model size and speed up inference
- **TensorRT/ONNX**: Faster inference runtimes
- **Batching**: Process multiple texts together
- **GPU Support**: CUDA acceleration
- **Connection Pooling**: Efficient database connections
- **Response Caching**: Cache frequent requests

## Security Considerations

- **Input Validation**: All inputs are validated via Pydantic
- **Rate Limiting**: Prevent abuse
- **CORS**: Configured for the Flutter app
- **Logging**: Comprehensive logging for auditing
- **Error Messages**: Don't expose internal details