---
library_name: keras
tags:
  - audio-classification
  - cnn
  - cebuano
  - sinama
  - mel-spectrogram
pipeline_tag: audio-classification
---


# Sinama Audio Classifier

A CNN-based audio classification model trained to recognise spoken
Cebuano / Sinama words from short audio clips.

## Usage

### Via Inference API

```python
import requests

API_URL = "https://api-inference.huggingface.co/models/YOUR_USERNAME/sinama-translator"
headers = {"Authorization": "Bearer hf_YOUR_TOKEN"}

# Send the raw bytes of the audio file as the request body.
with open("audio.wav", "rb") as f:
    response = requests.post(API_URL, headers=headers, data=f.read())

response.raise_for_status()  # surface HTTP errors instead of parsing an error body
print(response.json())
# [{"label": "ako", "score": 0.95}, ...]
```
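The response shown above is a list of label/score pairs, so picking the most likely word takes one more step. The helper below is a minimal sketch: `top_prediction` is a hypothetical name, the second entry in the sample list is made up for illustration, and the check for non-list payloads assumes that failures come back as a JSON object rather than a list.

```python
def top_prediction(results):
    """Return (label, score) for the highest-scoring entry of an API response."""
    # Assumption: a successful call yields a non-empty list of dicts with
    # "label" and "score" keys; anything else is treated as an error payload.
    if not isinstance(results, list) or not results:
        raise ValueError(f"unexpected response payload: {results!r}")
    best = max(results, key=lambda item: item["score"])
    return best["label"], best["score"]


# Illustrative payload; "ikaw" and its score are invented for this example.
sample = [
    {"label": "ako", "score": 0.95},
    {"label": "ikaw", "score": 0.03},
]
label, score = top_prediction(sample)
print(f"{label} ({score:.2f})")
```

In the request snippet, this would be called as `top_prediction(response.json())`.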

### Local inference

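```python
import json

import librosa
import numpy as np
import tensorflow as tf

# Load the trained model and the index-to-label mapping.
model = tf.keras.models.load_model("best_model.keras")
with open("label_map.json") as f:
    label_map = {int(k): v for k, v in json.load(f).items()}

# preprocess your audio the same way as training …
# `features` must hold the resulting batch (first axis = clips).
pred = model.predict(features)
# Map the highest-scoring class index of the first clip to its label.
print(label_map[int(pred[0].argmax())])
```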
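The snippet leaves the preprocessing step open. As a rough sketch of what it might look like, the code below uses the feature settings listed under Training details (128 Mel bins, 4 s clips, 22 050 Hz). Everything else is an assumption and should be checked against the actual training code: zero-padding short clips, librosa's default `n_fft` and `hop_length`, decibel scaling, and the `(batch, mels, frames, 1)` input layout. `pad_or_trim` and `extract_features` are hypothetical helper names.

```python
import numpy as np

SAMPLE_RATE = 22050                        # Hz, from "Training details"
CLIP_SECONDS = 4                           # clip length, from "Training details"
N_MELS = 128                               # Mel bins, from "Training details"
TARGET_LEN = SAMPLE_RATE * CLIP_SECONDS    # 88 200 samples per clip


def pad_or_trim(y, target_len=TARGET_LEN):
    """Force a waveform to exactly target_len samples."""
    y = np.asarray(y, dtype=np.float32)
    if len(y) >= target_len:
        return y[:target_len]
    # Assumption: short clips are zero-padded at the end.
    return np.pad(y, (0, target_len - len(y)))


def extract_features(path):
    """Load one audio file and return a batch of one Mel spectrogram."""
    import librosa  # imported here so pad_or_trim works without librosa

    y, _ = librosa.load(path, sr=SAMPLE_RATE, mono=True)
    y = pad_or_trim(y)
    mel = librosa.feature.melspectrogram(y=y, sr=SAMPLE_RATE, n_mels=N_MELS)
    # Assumption: training used log-scaled (dB) spectrograms.
    mel_db = librosa.power_to_db(mel, ref=np.max)
    # Assumption: the model expects (batch, mels, frames, channels).
    return mel_db[np.newaxis, ..., np.newaxis].astype(np.float32)
```

With these helpers, `features = extract_features("audio.wav")` would supply the input for `model.predict(features)`. If the predictions look random, a mismatch between this sketch and the real training pipeline is the first thing to rule out.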

## Training details

- **Architecture:** 3-block CNN (Conv2D → BN → ReLU → MaxPool → Dropout)
- **Features:** 128-bin Mel spectrogram, 4 s clips, 22 050 Hz sample rate
- **Optimiser:** Adam
- **Loss:** Categorical cross-entropy
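
To make the bullets above concrete, here is a hedged sketch of how such a network could be assembled in Keras, together with the shape arithmetic behind it. The block order, optimiser, and loss come from the list; the filter counts, 3×3 kernels, 2×2 pooling, dropout rate, classifier head, and a hop length of 512 samples (librosa's default) are all assumptions, not the published configuration. `n_frames`, `pooled_shape`, and `build_model` are hypothetical names.

```python
def n_frames(n_samples, hop_length=512):
    """Spectrogram frames for a centred STFT (assumed hop of 512 samples)."""
    return 1 + n_samples // hop_length


def pooled_shape(height, width, n_blocks=3, pool=2):
    """Spatial size after n_blocks of non-overlapping pooling with flooring."""
    for _ in range(n_blocks):
        height, width = height // pool, width // pool
    return height, width


def build_model(input_shape, n_classes, filters=(32, 64, 128), dropout=0.25):
    """3-block CNN: Conv2D -> BN -> ReLU -> MaxPool -> Dropout per block."""
    import tensorflow as tf  # imported here so the helpers above run without it

    layers = tf.keras.layers
    model = tf.keras.Sequential([tf.keras.Input(shape=input_shape)])
    for f in filters:  # assumed filter counts, one entry per block
        model.add(layers.Conv2D(f, 3, padding="same"))
        model.add(layers.BatchNormalization())
        model.add(layers.ReLU())
        model.add(layers.MaxPooling2D(2))
        model.add(layers.Dropout(dropout))
    # Assumed head: global pooling followed by a softmax classifier.
    model.add(layers.GlobalAveragePooling2D())
    model.add(layers.Dense(n_classes, activation="softmax"))
    model.compile(optimizer="adam",
                  loss="categorical_crossentropy",
                  metrics=["accuracy"])
    return model


frames = n_frames(4 * 22050)        # 4 s at 22 050 Hz
print(frames)                       # 173
print(pooled_shape(128, frames))    # (16, 21)
```

Under these assumptions a 4 s clip becomes a 128 × 173 spectrogram, and three pooling stages reduce it to 16 × 21 before the classifier head. Categorical cross-entropy expects one-hot encoded labels, so `build_model((128, 173, 1), n_classes)` would be trained on targets of shape `(batch, n_classes)`.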