---
license: cc-by-nc-4.0
datasets:
- rfcx/frugalai
language:
- en
metrics:
- accuracy
pipeline_tag: audio-classification
tags:
- acoustics
- lgbm
- frugality
- signal-processing
- climate
- chainsaw
---
# Quefrency Guardian: Chainsaw Noise Detector
An efficient model to detect chainsaw activity in forest soundscapes using spectral and cepstral audio features. The model is designed for environmental conservation and is based on a LightGBM classifier, capable of low-energy inference on both CPU and GPU devices. This repository provides the complete code and configuration for feature extraction, model implementation, and deployment.
## Installation
You can install and use the model in two different ways:
### Option 1: Clone the repository
To download the entire repository containing the code, model, and associated files, follow these steps:
```bash
git clone https://huggingface.co/tlmk22/QuefrencyGuardian
cd QuefrencyGuardian
pip install -r requirements.txt
```
Once installed, you can directly import the files into your existing project and use the model.
---
### Option 2: Dynamically load from the Hub
If you only want to download the required files to use the model (without cloning the full repository), you can use the `hf_hub_download` function provided by Hugging Face. This method downloads only what is necessary directly from the Hub.
Here's an example:
```python
import os
import sys
from huggingface_hub import hf_hub_download
import importlib.util
# Specify the repository
repo_id = "tlmk22/QuefrencyGuardian"
# Download the Python file containing the model class and add it to your path
model_path = hf_hub_download(repo_id=repo_id, filename="model.py")
model_dir = os.path.dirname(model_path)
if model_dir not in sys.path:
    sys.path.append(model_dir)
# Dynamically load the class from the downloaded file
spec = importlib.util.spec_from_file_location("model", model_path)
model_module = importlib.util.module_from_spec(spec)
spec.loader.exec_module(model_module)
# Import the FastModelHuggingFace class
FastModelHuggingFace = model_module.FastModelHuggingFace
# Load the pre-trained model
fast_model = FastModelHuggingFace.from_pretrained(repo_id)
# Perform predictions
result = fast_model.predict("path/to/audio.wav", device="cpu")
map_labels = {0: "chainsaw", 1: "environment"}
print(f"Prediction Result: {map_labels[result[0]]}")
```
Depending on your needs, you can either clone the repository for a full installation or use Hugging Face's dynamic download functionalities for lightweight and direct usage.
---
## Model Overview
### Features
The model uses:
- **Spectrogram Features**
- **Cepstral Features**: Computed as the FFT of the log spectrogram over the frequency band [`f_min`, `f_max`], keeping only quefrencies in the range [`fc_min`, `fc_max`].
- **Time Averaging**: Both feature sets are averaged over the entire audio clip, following Welch's method, for robustness in noisy settings.
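The feature pipeline described above can be sketched as follows. This is a minimal NumPy illustration, not the repository's implementation: the function name `averaged_features`, the frame length, the hop size, and the default values for `f_min`, `f_max`, `fc_min`, and `fc_max` are all assumptions chosen for the example.

```python
import numpy as np


def averaged_features(x, sr, f_min=100.0, f_max=5000.0,
                      fc_min=1e-3, fc_max=15e-3, n_fft=1024, hop=512):
    """Return (mean log-spectrum, mean cepstrum magnitude) for one clip.

    Every default here is an illustrative assumption.
    """
    # Overlapping Hann-windowed frames, shape (n_frames, n_fft).
    frames = np.lib.stride_tricks.sliding_window_view(x, n_fft)[::hop]
    power = np.abs(np.fft.rfft(frames * np.hanning(n_fft), axis=1)) ** 2

    # Keep the frequency band [f_min, f_max] and take the log.
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / sr)
    band = (freqs >= f_min) & (freqs <= f_max)
    log_spec = np.log(power[:, band] + 1e-12)

    # Cepstrum: FFT of the log spectrum along the frequency axis.
    cep = np.abs(np.fft.rfft(log_spec, axis=1))
    # Quefrency axis in seconds (spectrum bins are sr / n_fft Hz apart).
    quef = np.fft.rfftfreq(log_spec.shape[1], d=sr / n_fft)
    keep = (quef >= fc_min) & (quef <= fc_max)

    # Average both feature sets over all frames of the clip.
    return log_spec.mean(axis=0), cep[:, keep].mean(axis=0)
```

The two returned vectors could be concatenated into a single feature vector per clip; averaging over frames trades time resolution for a lower-variance estimate.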
---
### LightGBM Model
The model is a **binary classifier** (chainsaw vs environment) trained on the `rfcx/frugalai` dataset.
Key model parameters are included in `model/lgbm_params.json`.
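As a rough sketch, a parameter file like this can be read with the standard `json` module. The helper name `load_lgbm_params` is hypothetical, and since the keys inside the file are not documented here, the sketch makes no assumption about them.

```python
import json
from pathlib import Path


def load_lgbm_params(path="model/lgbm_params.json"):
    """Read LightGBM hyperparameters from a JSON file into a dict."""
    with Path(path).open(encoding="utf-8") as f:
        params = json.load(f)
    if not isinstance(params, dict):
        raise ValueError(f"expected a JSON object in {path}")
    return params


# Possible use (assumes lightgbm is installed and the repository is cloned):
# import lightgbm as lgb
# clf = lgb.LGBMClassifier(**load_lgbm_params())
```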
---
## Usage
The `examples` directory contains two example scripts showing how to use either the cloned repository or the model downloaded from the Hugging Face Hub.
---
## Performance
- **Accuracy**: 95% on the challenge test set, with a 4.5% false positive rate (FPR) at the default threshold.
- **Environmental Impact**: Inference consumed **0.21 Wh**, as measured with CodeCarbon. This figure depends on the challenge infrastructure, since the code ran inside a Docker container provided by the platform.
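For reference, accuracy and false positive rate can be computed from predictions as in the sketch below. It assumes chainsaw (label `0`) is treated as the positive class, so a false positive is an environment clip flagged as chainsaw. This README does not state that convention explicitly, and the function name is illustrative.

```python
import numpy as np

CHAINSAW, ENVIRONMENT = 0, 1  # label encoding used by the model


def accuracy_and_fpr(y_true, y_pred):
    """Return (accuracy, false positive rate) with chainsaw as positive."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    accuracy = float(np.mean(y_true == y_pred))
    negatives = y_true == ENVIRONMENT
    false_pos = negatives & (y_pred == CHAINSAW)
    # Guard against a test set with no environment clips.
    fpr = float(false_pos.sum() / negatives.sum()) if negatives.any() else float("nan")
    return accuracy, fpr


acc, fpr = accuracy_and_fpr([0, 0, 1, 1, 1, 1], [0, 1, 1, 1, 1, 0])
print(f"accuracy={acc:.2f}, FPR={fpr:.2f}")  # accuracy=0.67, FPR=0.25
```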
---
## License
This project is licensed under the [Creative Commons Attribution-NonCommercial 4.0 International](https://creativecommons.org/licenses/by-nc/4.0/) license.
You are free to share and adapt the work for non-commercial purposes, provided attribution is given.
---
## Dataset
The model was trained and evaluated on the [Rainforest Connection (RFCx) Frugal AI](https://huggingface.co/datasets/rfcx/frugalai) dataset.
### Labels
- `0`: Chainsaw
- `1`: Environment
---
## Limitations
- **Audio Length**: The classifier is designed for 1 to 3 seconds of audio sampled at either 12 kHz or 24 kHz.
- **Environmental Noise**: The model may misclassify noisy recordings or recordings that contain machinery sounding similar to a chainsaw.
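Longer recordings therefore need to be cut into short clips before classification. The sketch below shows one way to do that with NumPy; the helper name `split_into_clips` and the choice of non-overlapping 3-second windows are assumptions, not part of the repository.

```python
import numpy as np


def split_into_clips(x, sr, clip_seconds=3.0, min_seconds=1.0):
    """Split a 1-D waveform into consecutive clips of at most clip_seconds.

    A trailing remainder shorter than min_seconds is dropped.
    """
    if sr not in (12_000, 24_000):
        raise ValueError("expected audio sampled at 12 kHz or 24 kHz")
    clip_len = int(round(clip_seconds * sr))
    min_len = int(round(min_seconds * sr))
    clips = [x[i:i + clip_len] for i in range(0, len(x), clip_len)]
    return [c for c in clips if len(c) >= min_len]
```

Each clip would then be classified separately. Only a file path is documented here as input to `predict`, so how in-memory clips are passed to the model is left open.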