tlemagueresse

	---
	license: cc-by-nc-4.0
	datasets:
	- rfcx/frugalai
	language:
	- en
	metrics:
	- accuracy
	pipeline_tag: audio-classification
	tags:
	- acoustics
	- lgbm
	- frugality
	- signal-processing
	- climate
	- chainsaw
	---
	# Quefrency Guardian: Chainsaw Noise Detector

	An efficient model to detect chainsaw activity in forest soundscapes using spectral and cepstral audio features. The model is designed for environmental conservation and is based on a LightGBM classifier, capable of low-energy inference on both CPU and GPU devices. This repository provides the complete code and configuration for feature extraction, model implementation, and deployment.

	## Installation

	You can install and use the model in two different ways:

	### Option 1: Clone the repository

	To download the entire repository containing the code, model, and associated files, follow these steps:

	```bash
	git clone https://huggingface.co/tlmk22/QuefrencyGuardian
	cd QuefrencyGuardian
	pip install -r requirements.txt
	```

	Once installed, you can directly import the files into your existing project and use the model.

	---

	### Option 2: Dynamically load from the Hub

	If you only want to download the required files to use the model (without cloning the full repository), you can use the `hf_hub_download` function provided by Hugging Face. This method downloads only what is necessary directly from the Hub.

	Here's an example:

	```python
	import os
	import sys

	from huggingface_hub import hf_hub_download
	import importlib.util

	# Specify the repository
	repo_id = "tlmk22/QuefrencyGuardian"

	# Download the Python file containing the model class and add it to your path
	model_path = hf_hub_download(repo_id=repo_id, filename="model.py")
	model_dir = os.path.dirname(model_path)
	if model_dir not in sys.path:
	    sys.path.append(model_dir)

	# Dynamically load the class from the downloaded file
	spec = importlib.util.spec_from_file_location("model", model_path)
	model_module = importlib.util.module_from_spec(spec)
	spec.loader.exec_module(model_module)

	# Import the FastModelHuggingFace class
	FastModelHuggingFace = model_module.FastModelHuggingFace

	# Load the pre-trained model
	fast_model = FastModelHuggingFace.from_pretrained(repo_id)

	# Perform predictions
	result = fast_model.predict("path/to/audio.wav", device="cpu")
	map_labels = {0: "chainsaw", 1: "environment"}
	print(f"Prediction Result: {map_labels[result[0]]}")
	```

	Depending on your needs, you can either clone the repository for a full installation or use Hugging Face's dynamic download functionalities for lightweight and direct usage.

	---

	## Model Overview

	### Features

	The model uses:
	- Spectrogram Features
	- Cepstral Features: Computed as the FFT of the log spectrogram restricted to the frequency band [`f_min`, `f_max`], then filtered to keep the quefrency range [`fc_min`, `fc_max`].
	- Time Averaging: Both feature sets are averaged across the entire audio clip for robustness in noisy settings (Welch methodology).
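
The feature pipeline described in the list above can be illustrated with a minimal NumPy sketch. This is not the repository's implementation: the function names, the frame size, the hop, and the band and quefrency limits are all assumptions chosen for illustration, and the sketch assumes that time averaging is applied to the power spectrum before the cepstrum is computed.

```python
import numpy as np


def welch_log_spectrum(signal, sr, n_fft=1024, hop=512):
    """Log of the power spectrum averaged over Hann-windowed frames (Welch-style)."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop
    frames = np.stack(
        [signal[i * hop : i * hop + n_fft] * window for i in range(n_frames)]
    )
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / sr)
    # Average over time first, then take the log; the small constant avoids log(0).
    return freqs, np.log(power.mean(axis=0) + 1e-12)


def cepstral_features(freqs, log_spec, f_min, f_max, fc_min, fc_max):
    """FFT of the log spectrum in [f_min, f_max], kept for quefrencies in [fc_min, fc_max]."""
    band = (freqs >= f_min) & (freqs <= f_max)
    log_band = log_spec[band] - log_spec[band].mean()  # drop the constant offset
    cepstrum = np.abs(np.fft.rfft(log_band))
    # The log spectrum is sampled every df Hz, so its FFT axis is in seconds (quefrency).
    df = freqs[1] - freqs[0]
    quefrency = np.fft.rfftfreq(log_band.size, d=df)
    keep = (quefrency >= fc_min) & (quefrency <= fc_max)
    return quefrency[keep], cepstrum[keep]


# Synthetic 2-second clip at 12 kHz: harmonics of 100 Hz stand in for an engine-like tone.
sr = 12_000
t = np.arange(2 * sr) / sr
clip = sum(np.sin(2 * np.pi * 100 * h * t) for h in range(1, 60))

freqs, log_spec = welch_log_spectrum(clip, sr)
quefrency, cepstrum = cepstral_features(
    freqs, log_spec, f_min=200.0, f_max=5800.0, fc_min=0.002, fc_max=0.03
)
print(f"Strongest quefrency: {quefrency[np.argmax(cepstrum)] * 1e3:.2f} ms")
```

A harmonic source with a 100 Hz fundamental produces a log spectrum that repeats every 100 Hz, so its cepstrum should peak near a quefrency of 10 ms. This is the kind of periodic structure that cepstral features are meant to capture.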

	---

	### LightGBM Model

	The model is a binary classifier (chainsaw vs environment) trained on the `rfcx/frugalai` dataset.
	Key model parameters are included in `model/lgbm_params.json`.

	---

	## Usage

	Two example scripts demonstrating how to use the repository or the model downloaded from Hugging Face are available in the `examples` directory.

	---

	### Performance

	- Accuracy: 95% on the challenge test set, with a 4.5% false positive rate (FPR) at the default threshold.
	- Environmental Impact: Inference energy consumption was measured at 0.21 Wh, tracked using CodeCarbon. This metric is dependent on the challenge's infrastructure, as the code was executed within a Docker container provided by the platform.
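
For readers who want to reproduce these two metrics on their own labeled clips, the following sketch computes accuracy and FPR from predicted label ids. It assumes that chainsaw (label `0`) is the positive class, which this README does not state explicitly. The helper name and the toy labels are hypothetical and are not challenge data.

```python
import numpy as np

CHAINSAW, ENVIRONMENT = 0, 1  # label ids used by this model


def accuracy_and_fpr(y_true, y_pred, positive=CHAINSAW):
    """Return (accuracy, false positive rate), treating `positive` as the detected class."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    accuracy = float(np.mean(y_true == y_pred))
    # FPR = negatives wrongly flagged as positive / all true negatives.
    negatives = y_true != positive
    false_positives = np.sum(negatives & (y_pred == positive))
    fpr = float(false_positives / negatives.sum())
    return accuracy, fpr


# Hypothetical toy labels, not data from the challenge.
y_true = [0, 0, 0, 1, 1, 1, 1, 1]
y_pred = [0, 0, 1, 1, 1, 1, 1, 0]
accuracy, fpr = accuracy_and_fpr(y_true, y_pred)
print(f"accuracy={accuracy:.2f}, FPR={fpr:.2f}")
```

The helper expects at least one true negative in `y_true`; otherwise the FPR is undefined.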

	---

	### License

	This project is licensed under the [Creative Commons Attribution Non-Commercial 4.0 International](https://creativecommons.org/licenses/by-nc/4.0/) license.
	You are free to share and adapt the work for non-commercial purposes, provided attribution is given.

	---

	## Dataset

	The model was trained and evaluated on the [Rainforest Connection (RFCx) Frugal AI](https://huggingface.co/datasets/rfcx/frugalai) dataset.

	### Labels
	- `0`: Chainsaw
	- `1`: Environment

	---

	## Limitations

	- Audio Length: The classifier is designed for 1 to 3 seconds of audio sampled at either 12 kHz or 24 kHz.
	- Environmental Noise: The model might misclassify if recordings are noisy or if machinery similar to chainsaws is present.
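
One way to respect the audio-length limitation with longer recordings is to split them into clips of at most 3 seconds before prediction. The sketch below is an assumption about how that could be done, not part of the repository: `split_into_clips` is a hypothetical helper, it does no resampling, and it drops any trailing remainder shorter than 1 second.

```python
import numpy as np

SUPPORTED_RATES = (12_000, 24_000)  # Hz, from the limitation above
MIN_SECONDS, MAX_SECONDS = 1.0, 3.0


def split_into_clips(audio, sr, clip_seconds=MAX_SECONDS):
    """Split a mono recording into consecutive clips of a supported length.

    A trailing remainder shorter than MIN_SECONDS is dropped.
    """
    if sr not in SUPPORTED_RATES:
        raise ValueError(f"Resample to one of {SUPPORTED_RATES} Hz first (got {sr}).")
    if not MIN_SECONDS <= clip_seconds <= MAX_SECONDS:
        raise ValueError("clip_seconds must be between 1 and 3 seconds.")
    audio = np.asarray(audio)
    clip_len = int(round(clip_seconds * sr))
    clips = [audio[i : i + clip_len] for i in range(0, len(audio), clip_len)]
    return [c for c in clips if len(c) >= MIN_SECONDS * sr]


# 7.5 s of silence at 12 kHz: two 3 s clips plus one 1.5 s clip.
recording = np.zeros(int(7.5 * 12_000))
clips = split_into_clips(recording, sr=12_000)
print([len(c) / 12_000 for c in clips])
```

Each resulting clip could then be written to a file and passed to `predict`, which in the example above takes a file path.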