# AI Engine Architecture

## Clean Architecture Implementation

The AI engine follows clean architecture principles: model loading, feature extraction, and detection logic each live in their own module.

---

## Module Structure

```
diagnosis/ai_engine/
├── detect_stuttering.py    # Main detector class (business logic)
├── model_loader.py         # Singleton pattern for model loading
└── features.py             # Feature extraction (ASR features)
```

---

## Architecture Pattern

### 1. Model Loader (`model_loader.py`)
**Responsibility**: Singleton pattern for model instance management

- Ensures models are loaded only once
- Provides clean interface: `get_stutter_detector()`
- Handles initialization and reports initialization errors
- Used by API layer (`app.py`)

**Usage:**
```python
from diagnosis.ai_engine.model_loader import get_stutter_detector

detector = get_stutter_detector()  # Singleton instance
```
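
The singleton behaviour described above could be implemented roughly as follows. This is a minimal sketch, not the project's actual code: the stand-in `AdvancedStutterDetector` class, the module-level `_detector` cache, and the lock are all assumptions made for illustration.

```python
import threading


class AdvancedStutterDetector:
    """Stand-in for the real detector, which would load the ASR model here."""

    def __init__(self) -> None:
        self.ready = True


_detector = None            # cached instance (assumed module-level state)
_lock = threading.Lock()    # guards first-time initialization


def get_stutter_detector() -> AdvancedStutterDetector:
    """Return the shared detector, creating it on first use."""
    global _detector
    if _detector is None:
        with _lock:
            # Re-check inside the lock so two threads cannot both initialize.
            if _detector is None:
                try:
                    _detector = AdvancedStutterDetector()
                except Exception as exc:
                    raise RuntimeError("Failed to initialize stutter detector") from exc
    return _detector
```

Every caller receives the same object, so the expensive model load happens at most once per process.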

---

### 2. Feature Extractor (`features.py`)
**Responsibility**: Feature extraction from audio using IndicWav2Vec Hindi

**Class**: `ASRFeatureExtractor`

**Methods:**
- `extract_audio_features()` - Raw audio feature extraction
- `get_transcription_features()` - Transcription with confidence scores
- `get_word_level_features()` - Word-level timestamps and confidence

**Design Pattern**: 
- Takes pre-loaded model and processor as dependencies
- Single responsibility: feature extraction only
- Reusable across different use cases

**Usage:**
```python
from .features import ASRFeatureExtractor

extractor = ASRFeatureExtractor(model, processor, device)
features = extractor.get_transcription_features(audio)
```
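
To illustrate the dependency-injection design, here is a hedged sketch of how such an extractor might be shaped. The callable signatures of `model` and `processor`, the returned dictionary keys, and the `fake_*` helpers are hypothetical; the real IndicWav2Vec model and processor expose different interfaces.

```python
from typing import Any, Callable, Dict, Sequence, Tuple


class ASRFeatureExtractor:
    """Sketch only: the real class wraps the IndicWav2Vec Hindi model."""

    def __init__(
        self,
        model: Callable[[Any], Tuple[str, float]],
        processor: Callable[[Sequence[float]], Any],
        device: str = "cpu",
    ) -> None:
        # Dependencies are injected; nothing is loaded inside the extractor.
        self.model = model
        self.processor = processor
        self.device = device

    def get_transcription_features(self, audio: Sequence[float]) -> Dict[str, Any]:
        inputs = self.processor(audio)         # hypothetical preprocessing step
        text, confidence = self.model(inputs)  # hypothetical inference call
        return {"transcript": text, "confidence": confidence}


# Hypothetical stand-ins so the sketch runs without model weights.
def fake_processor(audio: Sequence[float]) -> list:
    return list(audio)


def fake_model(inputs: Any) -> Tuple[str, float]:
    return ("namaste", 0.9)


extractor = ASRFeatureExtractor(fake_model, fake_processor, "cpu")
features = extractor.get_transcription_features([0.0, 0.1, -0.1])
```

Because the extractor never loads anything itself, swapping the real model for a lightweight fake requires no changes to the class.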

---

### 3. Detector (`detect_stuttering.py`)
**Responsibility**: High-level stutter detection orchestration

**Class**: `AdvancedStutterDetector`

**Design:**
- Uses feature extractor for transcription (composition)
- Orchestrates the analysis pipeline
- Returns structured results

**Flow:**
```
Audio Input
    ↓
Feature Extractor (ASR)
    ↓
Text Analysis
    ↓
Results
```
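
The flow above can be sketched as a detector that composes an extractor. This is an illustration under assumptions: the `analyze` method name, the result keys, the `StubExtractor` class, and the use of adjacent word repetition as the "text analysis" step are all hypothetical, since this document does not specify them.

```python
from typing import Any, Dict, List


class StubExtractor:
    """Hypothetical stand-in for ASRFeatureExtractor."""

    def get_transcription_features(self, audio: Any) -> Dict[str, Any]:
        return {"transcript": "main main ghar ja raha hoon", "confidence": 0.8}


class AdvancedStutterDetector:
    """Sketch of the orchestration layer; not the project's implementation."""

    def __init__(self, extractor: Any) -> None:
        self.extractor = extractor  # composition: the detector owns an extractor

    def analyze(self, audio: Any) -> Dict[str, Any]:
        # Step 1: audio -> transcription features
        features = self.extractor.get_transcription_features(audio)
        # Step 2: text analysis (assumed here: immediately repeated words)
        words: List[str] = features["transcript"].split()
        repeated = [w for prev, w in zip(words, words[1:]) if prev == w]
        # Step 3: structured results
        return {
            "transcript": features["transcript"],
            "confidence": features["confidence"],
            "repeated_words": repeated,
        }


result = AdvancedStutterDetector(StubExtractor()).analyze(audio=[0.0, 0.1])
```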

---

## Benefits of This Architecture

### ✅ Separation of Concerns
- **Model Loading**: Isolated in `model_loader.py`
- **Feature Extraction**: Isolated in `features.py`
- **Business Logic**: In `detect_stuttering.py`

### ✅ Single Responsibility Principle
- Each module has one clear purpose
- Easy to test and maintain
- Easy to extend or replace components

### ✅ Dependency Injection
- Feature extractor receives model/processor as dependencies
- No tight coupling
- Easy to mock for testing

### ✅ Reusability
- Feature extractor can be used independently
- Model loader can be used by other modules
- Clean interfaces between layers

---

## Data Flow

```
API Request (app.py)
    ↓
get_stutter_detector() [model_loader.py]
    ↓
AdvancedStutterDetector [detect_stuttering.py]
    ↓
ASRFeatureExtractor [features.py]
    ↓
IndicWav2Vec Hindi Model
    ↓
Results back through layers
```

---

## Comparison with Django App

**Before (Django App):**
- Model loading logic in Django app
- Feature extraction in Django app
- Tight coupling between web app and ML logic

**After (AI Engine Service):**
- ✅ Model loading in AI engine service
- ✅ Feature extraction in AI engine service
- ✅ Django app only calls API (loose coupling)
- ✅ ML logic isolated in dedicated service

---

## Extension Points

### Adding New Features
1. Add method to `ASRFeatureExtractor` in `features.py`
2. Use in `AdvancedStutterDetector` via composition
3. No changes needed to model loader

### Adding New Models
1. Update `detect_stuttering.py` to load new model
2. Create new feature extractor if needed
3. Model loader remains unchanged

### Testing
- Mock `ASRFeatureExtractor` in tests
- Mock model loader for integration tests
- Each component can be tested independently
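
As one possible way to apply the first point, the extractor can be replaced with a `unittest.mock.Mock`. The `DetectorUnderTest` class and its `analyze` method are hypothetical stand-ins for `AdvancedStutterDetector`; only the mocking technique itself is the point of the sketch.

```python
from unittest.mock import Mock


class DetectorUnderTest:
    """Hypothetical stand-in for AdvancedStutterDetector."""

    def __init__(self, extractor) -> None:
        self.extractor = extractor

    def analyze(self, audio):
        features = self.extractor.get_transcription_features(audio)
        return {"word_count": len(features["transcript"].split())}


def test_detector_uses_extractor() -> None:
    # The mock replaces the extractor, so no model weights are needed.
    extractor = Mock()
    extractor.get_transcription_features.return_value = {
        "transcript": "ek do teen",
        "confidence": 0.95,
    }
    audio = [0.0, 0.2]

    result = DetectorUnderTest(extractor).analyze(audio)

    # The detector must delegate to the extractor exactly once.
    extractor.get_transcription_features.assert_called_once_with(audio)
    assert result == {"word_count": 3}
```

A test runner such as pytest would collect `test_detector_uses_extractor` automatically; it can also be called directly.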

---

## Key Principles Applied

1. **Dependency Inversion**: Components receive their collaborators (for example, the feature extractor is handed a pre-loaded model and processor) rather than constructing them
2. **Open/Closed**: Open for extension, closed for modification
3. **Interface Segregation**: Clean, focused interfaces
4. **Don't Repeat Yourself (DRY)**: Feature extraction logic centralized
5. **Single Source of Truth**: Model instance managed by singleton

---

## File Responsibilities

| File | Responsibility | Depends On |
|------|---------------|------------|
| `model_loader.py` | Singleton model management | `detect_stuttering.py` |
| `features.py` | Feature extraction | `transformers`, `torch` |
| `detect_stuttering.py` | Business logic orchestration | `features.py` |
| `app.py` | API layer | `model_loader.py` |

---

This architecture ensures the ML/AI logic stays in the AI engine service, not in the Django web application, following microservices best practices.