|
|
--- |
|
|
title: Frugal AI Challenge Submission |
|
|
emoji: π |
|
|
colorFrom: blue |
|
|
colorTo: green |
|
|
sdk: docker |
|
|
pinned: false |
|
|
--- |
|
|
|
|
|
## π Audio classification |
|
|
|
|
|
### Strategy for solving the problem |
|
|
|
|
|
To minimize energy consumption, we deliberately **chose not to use deep learning techniques** such as CNN-based spectrogram analysis, LSTMs on raw audio signals, or transformer models, which are generally **more computationally intensive**.
|
|
|
|
|
Instead, a more **lightweight approach** was adopted: |
|
|
- Feature extraction from the audio signal (MFCCs and spectral contrast) |
|
|
- Training a simple machine learning model (decision tree) on these extracted features |
|
|
|
|
|
### Potential improvements (not yet tested)
|
|
- Hyperparameter tuning for better performance |
|
|
- Exploring alternative lightweight ML models, such as logistic regression or k-nearest neighbors |
|
|
- Feature extraction without librosa, using NumPy directly to compute basic signal properties, further reducing dependencies and overhead
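As an illustration of the last idea, a few basic descriptors can be computed with NumPy alone. The specific features below (RMS energy, zero-crossing rate, spectral centroid) are a suggestion, not something already tested in this project:

```python
import numpy as np

def numpy_features(y, sr):
    """Basic signal descriptors computed with NumPy only (no librosa)."""
    rms = np.sqrt(np.mean(y ** 2))                    # overall energy
    zcr = np.mean(np.abs(np.diff(np.sign(y))) > 0)    # zero-crossing rate
    spectrum = np.abs(np.fft.rfft(y))
    freqs = np.fft.rfftfreq(len(y), d=1.0 / sr)
    centroid = np.sum(freqs * spectrum) / np.sum(spectrum)  # spectral centroid
    return np.array([rms, zcr, centroid])

# Sanity check on a pure 440 Hz tone: the centroid should sit near 440 Hz
sr = 22050
t = np.linspace(0, 1, sr, endpoint=False)
features = numpy_features(np.sin(2 * np.pi * 440 * t), sr)
```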
|
|
|
|
|
The model is exported from the notebook `notebooks/Audio_Challenge.ipynb` and saved as `model_audio.pkl`.
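The export itself can be a plain pickle round-trip, sketched below. The file name comes from above; the toy data is illustrative, and the notebook may use joblib rather than `pickle`:

```python
import pickle
import numpy as np
from sklearn.tree import DecisionTreeClassifier

# Toy features/labels standing in for the MFCC + spectral-contrast matrix
X = np.array([[0.0, 1.0], [1.0, 0.0], [0.2, 0.9], [0.9, 0.1]])
y = [0, 1, 0, 1]
clf = DecisionTreeClassifier(random_state=0).fit(X, y)

# Export at the end of training (the notebook may use joblib instead)
with open("model_audio.pkl", "wb") as f:
    pickle.dump(clf, f)

# Reload at inference time
with open("model_audio.pkl", "rb") as f:
    model = pickle.load(f)
```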
|
|
|
|
|
## π Text classification |
|
|
|
|
|
### Evaluate locally |
|
|
|
|
|
To evaluate the model locally, you can use the following command: |
|
|
|
|
|
```bash |
|
|
python main.py --config config_evaluation_{model_name}.json |
|
|
``` |
|
|
|
|
|
where `{model_name}` is either `distilBERT` or `embeddingML`. |
|
|
|
|
|
|
|
|
### Models Description |
|
|
|
|
|
#### DistilBERT Model |
|
|
|
|
|
The model uses the `distilbert-base-uncased` model from the Hugging Face Transformers library, fine-tuned on the |
|
|
training dataset (see below). |
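Loading the base checkpoint for fine-tuning looks roughly like this. The label count and the sample input are assumptions, not values taken from the project's config files:

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "distilbert-base-uncased",
    num_labels=2,  # assumption: actual label count depends on the dataset
)

# Forward pass on one example (fine-tuning would run this in a training loop)
inputs = tokenizer("a sample claim to classify", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
```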
|
|
|
|
|
#### Embedding + ML Model |
|
|
|
|
|
The model uses a simple embedding layer followed by a classic ML model. Currently, the embedding layer is a simple |
|
|
TF-IDF vectorizer, and the ML model is a logistic regression. |
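A minimal sketch of that pipeline with scikit-learn; the toy corpus, labels, and default hyperparameters are illustrative, not the project's actual training setup:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy corpus; real training uses the challenge dataset
texts = [
    "the climate is changing rapidly",
    "climate change is a hoax",
    "sea levels keep rising",
    "global warming is fake news",
]
labels = [0, 1, 0, 1]

# TF-IDF "embedding" followed by a logistic-regression classifier
model = make_pipeline(TfidfVectorizer(), LogisticRegression())
model.fit(texts, labels)

pred = model.predict(["warming is fake"])[0]
```

Chaining the vectorizer and classifier in one `Pipeline` keeps the whole model picklable as a single object, which simplifies export.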
|
|
|