---
license: mit
language: en
pipeline_tag: audio-classification
library_name: transformers
tags:
- deepfake
- audio
- wav2vec2
- pytorch
---

# Deepfake Audio Detection Model
|
|
## Overview

This model detects whether an audio clip is **REAL** or **FAKE** (an AI-generated voice).

It is based on the **Wav2Vec2** architecture and uses transformer-based audio embeddings.
|
|
---
|
|
## Task

Binary classification:
- 0 → REAL audio
- 1 → FAKE audio
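The index-to-label mapping above can be expressed directly in code. A minimal sketch — the `ID2LABEL` dict and `index_to_label` helper are illustrative names, not part of the released model:

```python
# Class-index-to-label mapping, per the table above.
ID2LABEL = {0: "REAL", 1: "FAKE"}

def index_to_label(class_index: int) -> str:
    """Translate a predicted class index into its human-readable label."""
    return ID2LABEL[class_index]
```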
|
|
---
|
|
## Input

- Audio file (`.wav`)
- Sampling rate: 16 kHz
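If your source audio is not already at 16 kHz, it must be resampled before inference; `librosa.load(path, sr=16000)` handles this for you. Purely to illustrate what resampling does, here is a naive linear-interpolation version in NumPy (the `resample_linear` function is a hypothetical sketch, not part of this model):

```python
import numpy as np

TARGET_SR = 16000  # sampling rate the model expects

def resample_linear(audio: np.ndarray, orig_sr: int, target_sr: int = TARGET_SR) -> np.ndarray:
    """Naive linear-interpolation resampler; librosa's resampling is higher quality."""
    if orig_sr == target_sr:
        return audio
    duration = len(audio) / orig_sr
    n_target = int(round(duration * target_sr))
    # Time stamps of the original and target sample grids
    old_times = np.linspace(0.0, duration, num=len(audio), endpoint=False)
    new_times = np.linspace(0.0, duration, num=n_target, endpoint=False)
    return np.interp(new_times, old_times, audio)
```

In practice, prefer `librosa.load(..., sr=16000)` as shown in the Usage section; it also applies proper anti-aliasing filtering.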
|
|
---
|
|
## Output

- Probability that the audio is fake (0 to 1)
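To turn the fake probability into a final REAL/FAKE call, apply a decision threshold. A minimal sketch — the `verdict` helper and the 0.5 cutoff are assumptions, not shipped with the model:

```python
def verdict(fake_prob: float, threshold: float = 0.5) -> str:
    """Map the model's fake probability to a REAL/FAKE decision.

    The 0.5 threshold is an assumption; tune it on held-out data to
    trade off false positives against false negatives.
    """
    return "FAKE" if fake_prob >= threshold else "REAL"
```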
|
|
---
|
|
## Model Files

- `pytorch_model.bin`
- `config.json`
- `preprocessor_config.json`
- tokenizer files
|
|
---
|
|
## Usage
|
|
```python
from transformers import AutoProcessor, AutoModelForAudioClassification
import librosa
import torch

processor = AutoProcessor.from_pretrained("Simma7/audio_model")
model = AutoModelForAudioClassification.from_pretrained("Simma7/audio_model")
model.eval()

# Load the clip at the 16 kHz sampling rate the model expects
audio, sr = librosa.load("test.wav", sr=16000)

inputs = processor(audio, sampling_rate=16000, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits

# Index 1 is the FAKE class; softmax turns logits into probabilities
fake_prob = torch.softmax(logits, dim=-1)[0, 1].item()

print(f"Fake probability: {fake_prob:.4f}")
```