	# Truth Detection from Audio Stories

	This model predicts whether a short audio story is truthful or deceptive using MFCC feature extraction and a Random Forest classifier.

	## Model Details

	* Type: Random Forest Classifier
	* Features: 13-dimensional MFCC (Mel-Frequency Cepstral Coefficients)
	* Training Framework: scikit-learn (`joblib` serialization)
	* Input: WAV audio file
	* Output: Predicted label, either `True Story` or `Deceptive Story`
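The prediction code in "How to Use" averages each MFCC coefficient over time, which is why clips of any length end up as a single 13-dimensional vector. The sketch below illustrates that pooling step in plain Python on made-up matrices; `mean_pool` is a hypothetical helper name, and it is meant to mirror `np.mean(mfcc, axis=1)` rather than replace it.

```python
from statistics import fmean


def mean_pool(mfcc_frames):
    """Average each coefficient row over time.

    mfcc_frames: a sequence of rows, one row per MFCC coefficient,
    each row holding that coefficient's value in every frame.
    Returns one float per coefficient.
    """
    return [fmean(row) for row in mfcc_frames]


# Toy matrices with 13 rows each: 4 frames (short clip) and 9 frames (longer clip).
# The values are placeholders, not real MFCCs.
short_clip = [[float(c)] * 4 for c in range(13)]
long_clip = [[float(c)] * 9 for c in range(13)]

# Both clips pool down to vectors of the same length.
print(len(mean_pool(short_clip)), len(mean_pool(long_clip)))
```

Averaging discards how the coefficients change over the course of the clip, so the classifier only sees a summary of overall spectral shape.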

	## Intended Uses & Limitations

	Intended Uses:

	* Detecting potential deception in short, spoken stories or statements.
	* Research experiments on vocal biomarkers of deception.
	* Educational demonstrations on audio feature extraction and classification.

	Limitations & Risks:

	* The model was trained on a limited dataset; performance may degrade on other languages, recording conditions, or speaking styles.
	* Predictions are probabilistic and should not be used as sole evidence in high-stakes scenarios (e.g., legal or security decisions).
	* Cultural, linguistic, or demographic biases in the training data can lead to unfair predictions.
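Because predictions are probabilistic, one way to act on the limitations above is to report a label only when the model is reasonably confident and to abstain otherwise. The following is a minimal sketch under assumptions: `label_with_abstention` and the `0.75` threshold are hypothetical and not part of the released code, and class `1` is taken to mean `True Story`, as in the prediction snippet in "How to Use".

```python
def label_with_abstention(p_true, threshold=0.75):
    """Map the probability of class 1 ("True Story") to a label.

    Returns "Uncertain" when neither class reaches the threshold.
    The threshold is an illustrative value, not one from the model card.
    """
    if not 0.0 <= p_true <= 1.0:
        raise ValueError("p_true must be in [0, 1]")
    if not 0.5 <= threshold <= 1.0:
        raise ValueError("threshold must be in [0.5, 1]")
    if p_true >= threshold:
        return "True Story"
    if p_true <= 1.0 - threshold:
        return "Deceptive Story"
    return "Uncertain"
```

With the loaded scikit-learn model, the probability could be obtained as `model.predict_proba(features)[0][list(model.classes_).index(1)]` and passed to this helper.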

	## Evaluation Metrics

	* Accuracy: 91%
	* Languages in Training Data: 15+ spoken languages
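The card reports a single accuracy figure and does not describe the evaluation split. For readers who want to evaluate the model on their own labeled clips, the sketch below shows how accuracy and per-class error counts could be computed. The function names are hypothetical, and the labels assume `1` means `True Story` and `0` means `Deceptive Story`.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the reference labels."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and equally long")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for the given positive class."""
    counts = {"tp": 0, "tn": 0, "fp": 0, "fn": 0}
    for t, p in zip(y_true, y_pred):
        if p == positive:
            counts["tp" if t == positive else "fp"] += 1
        else:
            counts["fn" if t == positive else "tn"] += 1
    return counts
```

Looking at the four counts alongside accuracy shows whether errors fall mostly on truthful or on deceptive stories, which a single percentage cannot reveal.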

	## Training Data

	* Source: Curated dataset of narrated stories labeled as truthful or deceptive.
	* Preprocessing: Audio loaded at its original sampling rate (no resampling), trimmed to 30 seconds, then converted to MFCC features.
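The 30-second trim in the preprocessing step can be sketched as a simple slice over the sample array. This is an illustration under assumptions: `trim_to_seconds` is a hypothetical helper, and keeping the first 30 seconds (rather than some other window) is a guess, since the card does not say which part of each clip was kept.

```python
def trim_to_seconds(samples, sr, max_seconds=30.0):
    """Keep at most max_seconds of audio from the start of the clip.

    samples: a sliceable sequence of audio samples (list or NumPy array).
    sr: sampling rate in Hz.
    """
    if sr <= 0:
        raise ValueError("sr must be positive")
    max_samples = int(max_seconds * sr)
    # Slicing past the end is safe: shorter clips are returned unchanged.
    return samples[:max_samples]
```

When loading with librosa, passing `duration=30.0` to `librosa.load` limits the read to the first 30 seconds at load time, which has the same effect for this purpose.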

	## How to Use

	### Installation

	```bash
	pip install -r requirements.txt
	```

	### Loading the Model in Python

	```python
	import joblib
	from huggingface_hub import hf_hub_download

	repo_id = "sangambhamare/TruthDetection"
	model_file = hf_hub_download(repo_id=repo_id, filename="model.joblib")
	model = joblib.load(model_file)
	```

	### Making Predictions

	```python
	import librosa
	import numpy as np

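	def extract_mfcc(file_path):
	    # sr=None keeps the file's native sampling rate instead of librosa's 22,050 Hz default.
	    y, sr = librosa.load(file_path, sr=None)
	    # MFCC matrix of shape (13, n_frames).
	    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)
	    # Average over time to get one 13-dimensional vector per clip.
	    return np.mean(mfcc, axis=1)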

	features = extract_mfcc("path/to/audio.wav").reshape(1, -1)
	prediction = model.predict(features)[0]
	label = "True Story" if prediction == 1 else "Deceptive Story"
	print(label)
	```
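To classify several clips at once, the single-file steps above can be wrapped in a loop over a folder. The sketch below is a hypothetical helper, not part of the repository: `predict_directory` takes the loaded model and a feature extractor such as `extract_mfcc` as arguments, so it makes no assumptions beyond the label mapping already used above.

```python
from pathlib import Path


def predict_directory(model, extract_features, directory):
    """Return {filename: label} for every .wav file directly inside directory.

    model: any object with a scikit-learn style predict() method.
    extract_features: callable mapping a file path to a 1-D feature vector.
    """
    results = {}
    for wav_path in sorted(Path(directory).glob("*.wav")):
        features = extract_features(str(wav_path))
        # predict() expects a 2-D input, so wrap the single vector in a list.
        prediction = model.predict([features])[0]
        results[wav_path.name] = "True Story" if prediction == 1 else "Deceptive Story"
    return results
```

A call such as `predict_directory(model, extract_mfcc, "path/to/clips")` would then return one label per WAV file in that folder.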

	## Gradio Demo

	A live demo of this model is available via a Gradio interface. To launch locally:

	```bash
	python app.py
	```

	This will start a web app where you can upload a WAV file and see the prediction.

	---

	## Citation

	If you use this model in your research, please cite:

	```
	@misc{bhamare2025truthdetection,
	  title={Truth Detection from Audio Stories},
	  author={Sangam Sanjay Bhamare},
	  year={2025},
	  howpublished={\url{https://huggingface.co/sangambhamare/TruthDetection}}
	}
	```

	## License

	This model is released under the MIT License.