Faham committed on
Commit db77419 · 1 Parent(s): b1acf7e

UPDATE: readme

Files changed (3)
  1. .dockerignore +1 -1
  2. README.md +27 -1
  3. src/utils/preprocessing.py +1 -4
.dockerignore CHANGED
@@ -48,7 +48,7 @@ Thumbs.db
 .ipynb_checkpoints/
 
 # Models (if they're large)
-models/*.pth
+model_weights/*.pth
 
 # Logs
 *.log
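
The effect of the renamed ignore rule can be sanity-checked with a small sketch. It uses Python's `fnmatch`, which only approximates `.dockerignore` matching (Docker follows Go's `filepath.Match` rules, where `*` does not cross `/`), and the file names below are hypothetical, not taken from the repository.

```python
from fnmatch import fnmatchcase

OLD_PATTERN = "models/*.pth"         # rule removed by this commit
NEW_PATTERN = "model_weights/*.pth"  # rule added by this commit

# Hypothetical paths, relative to the build context root.
EXAMPLE_PATHS = [
    "model_weights/vision_model.pth",
    "model_weights/audio_model.pth",
    "src/models/vision_model.py",
]


def ignored(path: str, pattern: str) -> bool:
    """Approximate a single .dockerignore rule with a shell-style glob."""
    return fnmatchcase(path, pattern)


for path in EXAMPLE_PATHS:
    print(f"{path}: old rule={ignored(path, OLD_PATTERN)}, new rule={ignored(path, NEW_PATTERN)}")
```

Under this approximation the new rule excludes `.pth` files in `model_weights/` from the build context, while the Python sources under `src/models/` match neither rule.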
README.md CHANGED
@@ -76,9 +76,23 @@ sentiment-fused/
 ├── notebooks/ # Development notebooks
 │   ├── audio_sentiment_analysis.ipynb # Audio model development
 │   └── vision_sentiment_analysis.ipynb # Vision model development
-└── models/ # Model storage directory
+├── model_weights/ # Model storage directory (downloaded .pth files)
+└── src/ # Source code package
+    ├── __init__.py # Package initialization
+    ├── config/ # Configuration settings
+    ├── models/ # Model logic and inference code
+    ├── utils/ # Utility functions and preprocessing
+    └── ui/ # User interface components
 ```
 
+### Directory Explanation
+
+- **`model_weights/`**: Contains the actual trained model files (`.pth` files) downloaded from Google Drive at inference time.
+- **`src/models/`**: Contains the Python code for model loading, inference, and prediction logic
+- **`src/utils/`**: Contains preprocessing utilities for audio, vision, and text data
+- **`src/config/`**: Contains centralized configuration settings for the entire application
+- **`src/ui/`**: Contains Streamlit UI components and styling
+
 ## Key Features
 
 - **Real-time Analysis**: Instant sentiment predictions with confidence scores
@@ -264,5 +278,17 @@ Key libraries used:
 6. **Production Ready**: Docker containerization and deployment
 7. **Video Analysis**: Comprehensive video processing with multi-modal extraction
 8. **Speech Recognition**: Audio-to-text transcription for enhanced analysis
+9. **Modular Architecture**: Clean, maintainable code structure with separated concerns
+10. **Professional Code Organization**: Proper Python packaging with config, models, utils, and UI modules
+
+## Recent Improvements
+
+The project has been refactored from a monolithic structure to a clean, modular architecture:
+
+- **Modular Design**: Separated into logical modules (`src/config/`, `src/models/`, `src/utils/`, `src/ui/`)
+- **Centralized Configuration**: All settings consolidated in `src/config/settings.py`
+- **Clean Separation**: Model logic, preprocessing, and UI components are now in dedicated modules
+- **Better Maintainability**: Easier to modify, test, and extend individual components
+- **Professional Structure**: Follows Python packaging best practices
 
 This project serves as a comprehensive example of building production-ready multimodal AI applications with modern Python tools and frameworks.
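
The README change says the `.pth` files in `model_weights/` are downloaded at inference time, but the diff does not show how. One way such a step could look is a "download only if missing" helper, sketched below. The function name `ensure_weights` and the injected `fetch` callable are hypothetical; the actual Google Drive download code is not part of this commit and is not reproduced here.

```python
from pathlib import Path
from typing import Callable

# Directory name taken from the README; everything else here is illustrative.
WEIGHTS_DIR = Path("model_weights")


def ensure_weights(
    filename: str,
    fetch: Callable[[Path], None],
    weights_dir: Path = WEIGHTS_DIR,
) -> Path:
    """Return the local path to a weights file, fetching it only if absent.

    `fetch` is a caller-supplied function that writes the file to the given
    destination path (for example, a wrapper around a download client).
    """
    target = weights_dir / filename
    if not target.exists():
        # Create the directory lazily so a fresh checkout works without setup.
        weights_dir.mkdir(parents=True, exist_ok=True)
        fetch(target)
    return target
```

Passing the downloader in as a parameter keeps the caching logic testable without network access; a second call for the same file returns the cached path without invoking `fetch` again.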
src/utils/preprocessing.py CHANGED
@@ -20,13 +20,10 @@ except ImportError:
 from ..config.settings import (
     IMAGE_TRANSFORMS,
     AUDIO_MODEL_CONFIG,
-    VISION_MODEL_CONFIG,
-    SUPPORTED_IMAGE_FORMATS,
-    SUPPORTED_AUDIO_FORMATS,
 )
 
 # Add Any to typing imports
-from typing import List, Optional, Tuple, Union, Any
+from typing import List, Optional, Union, Any
 
 # Add torch import for audio preprocessing
 try:
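
The hunk is cut off at `try:`, so the body of the torch import guard is not visible in this diff. A common shape for such a guard is sketched below as an assumption; the helper names `optional_import` and `require_torch` and the `TORCH_AVAILABLE` flag are hypothetical and may not match what `preprocessing.py` actually does.

```python
import importlib
from types import ModuleType
from typing import Optional


def optional_import(name: str) -> Optional[ModuleType]:
    """Import a module by name, returning None if it is not installed."""
    try:
        return importlib.import_module(name)
    except ImportError:
        return None


# Resolve torch once at import time; the rest of the module checks the flag.
torch = optional_import("torch")
TORCH_AVAILABLE = torch is not None


def require_torch() -> ModuleType:
    """Return the torch module, or fail with a clear message if it is missing."""
    if torch is None:
        raise RuntimeError("torch is required for audio preprocessing but is not installed")
    return torch
```

With this layout the module can still be imported when torch is missing, and only the code paths that call `require_torch()` raise an error.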