|
|
--- |
|
|
title: Frugal AI Challenge Submission |
|
|
emoji: π |
|
|
colorFrom: blue |
|
|
colorTo: green |
|
|
sdk: docker |
|
|
pinned: false |
|
|
--- |
|
|
|
|
|
## π Audio classification |
|
|
|
|
|
### Strategy for solving the problem |
|
|
|
|
|
To minimize energy consumption, we deliberately **chose not to use deep learning techniques** such as CNN-based spectrogram analysis, LSTMs on raw audio signals, or transformer models, which are generally **more computationally intensive**.
|
|
|
|
|
Instead, a more **lightweight approach** was adopted: |
|
|
- Feature extraction from the audio signal (MFCCs and spectral contrast) |
|
|
- Training a simple machine learning model (decision tree) on these extracted features |
|
|
|
|
|
### Potential improvements (not yet tested)
|
|
- Hyperparameter tuning for better performance |
|
|
- Exploring alternative lightweight ML models, such as logistic regression or k-nearest neighbors |
|
|
- Feature extraction without librosa, using NumPy directly to compute basic signal properties, further reducing dependencies and overhead
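As an illustration of the last idea, a few basic descriptors can be computed with NumPy alone. The specific features below (RMS energy, zero-crossing rate, spectral centroid) are a suggestion, not something already tested in this project:

```python
import numpy as np

def numpy_features(y, sr):
    """Basic signal descriptors computed with NumPy only (no librosa)."""
    rms = np.sqrt(np.mean(y ** 2))                    # overall energy
    zcr = np.mean(np.abs(np.diff(np.sign(y))) > 0)    # zero-crossing rate
    spectrum = np.abs(np.fft.rfft(y))
    freqs = np.fft.rfftfreq(len(y), d=1.0 / sr)
    centroid = np.sum(freqs * spectrum) / np.sum(spectrum)  # spectral centroid
    return np.array([rms, zcr, centroid])

# Sanity check on a pure 440 Hz tone: the centroid should sit near 440 Hz
sr = 22050
t = np.linspace(0, 1, sr, endpoint=False)
features = numpy_features(np.sin(2 * np.pi * 440 * t), sr)
```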
|
|
|
|
|
The model is exported from the notebook `notebooks/Audio_Challenge.ipynb` and saved as `model_audio.pkl`.
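The export itself can be a plain pickle round-trip, sketched below. The file name comes from above; the toy data is illustrative, and the notebook may use joblib rather than `pickle`:

```python
import pickle
import numpy as np
from sklearn.tree import DecisionTreeClassifier

# Toy features/labels standing in for the MFCC + spectral-contrast matrix
X = np.array([[0.0, 1.0], [1.0, 0.0], [0.2, 0.9], [0.9, 0.1]])
y = [0, 1, 0, 1]
clf = DecisionTreeClassifier(random_state=0).fit(X, y)

# Export at the end of training (the notebook may use joblib instead)
with open("model_audio.pkl", "wb") as f:
    pickle.dump(clf, f)

# Reload at inference time
with open("model_audio.pkl", "rb") as f:
    model = pickle.load(f)
```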
|
|
|
|
|
## π Text classification |
|
|
|
|
|
### Evaluate locally |
|
|
|
|
|
To evaluate the model locally, you can use the following command: |
|
|
|
|
|
```bash |
|
|
python main.py --config config_evaluation_{model_name}.json |
|
|
``` |
|
|
|
|
|
where `{model_name}` is either `distilBERT` or `embeddingML`. |
|
|
|
|
|
|
|
|
### Models Description |
|
|
|
|
|
#### DistilBERT Model |
|
|
|
|
|
The model uses the `distilbert-base-uncased` model from the Hugging Face Transformers library, fine-tuned on the |
|
|
training dataset (see below). |
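Loading the base checkpoint for fine-tuning looks roughly like this. The label count and the sample input are assumptions, not values taken from the project's config files:

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "distilbert-base-uncased",
    num_labels=2,  # assumption: actual label count depends on the dataset
)

# Forward pass on one example (fine-tuning would run this in a training loop)
inputs = tokenizer("a sample claim to classify", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
```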
|
|
|
|
|
#### Embedding + ML Model |
|
|
|
|
|
The model uses a simple embedding layer followed by a classic ML model. Currently, the embedding layer is a simple |
|
|
TF-IDF vectorizer, and the ML model is a logistic regression. |
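A minimal sketch of that pipeline with scikit-learn; the toy corpus, labels, and default hyperparameters are illustrative, not the project's actual training setup:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy corpus; real training uses the challenge dataset
texts = [
    "the climate is changing rapidly",
    "climate change is a hoax",
    "sea levels keep rising",
    "global warming is fake news",
]
labels = [0, 1, 0, 1]

# TF-IDF "embedding" followed by a logistic-regression classifier
model = make_pipeline(TfidfVectorizer(), LogisticRegression())
model.fit(texts, labels)

pred = model.predict(["warming is fake"])[0]
```

Chaining the vectorizer and classifier in one `Pipeline` keeps the whole model picklable as a single object, which simplifies export.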
|
|
|