---
title: WimBERT Synth v0
emoji: 🏛️
colorFrom: blue
colorTo: indigo
sdk: gradio
sdk_version: 5.49.1
app_file: app.py
pinned: false
license: apache-2.0
short_description: Dutch multi-label classifier for signal messages
---

# WimBERT Synth v0: Dutch Multi-Label Signal Classifier

Demo of a dual-head BERT classifier trained on synthetic Dutch government signals.
Predicts relevant topics (**onderwerp**, 64 labels) and sentiment/experience 
(**beleving**, 33 labels) for each input message.

## 🚀 Usage

1. Enter Dutch text (e.g., a citizen feedback message about government services)
2. Click **Voorspel** to classify
3. Adjust **Drempel** (threshold) to change prediction sensitivity
4. View results in three tabs:
   - **Samenvatting**: Top-K predictions per head with color-coded probabilities
   - **Alle labels**: Complete list of all labels sorted by probability
   - **JSON**: Raw predictions in machine-readable format
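
The relationship between **Drempel** and the Top-K summary can be sketched in a few lines of Python. This is a minimal illustration, not the Space's actual code: the function name `select_labels`, the default `top_k=5`, the `>=` comparison, and the example probabilities are all assumptions made for this sketch.

```python
def select_labels(probs, threshold=0.5, top_k=5):
    """Return (predicted, top) for one classification head.

    probs: dict mapping label name -> probability in [0, 1].
    predicted: labels at or above the threshold, highest first.
    top: the top_k labels by probability, regardless of the threshold.
    """
    ranked = sorted(probs.items(), key=lambda item: item[1], reverse=True)
    predicted = [(label, p) for label, p in ranked if p >= threshold]
    top = ranked[:top_k]
    return predicted, top


# Hypothetical probabilities for three onderwerp labels.
example = {"Parkeren": 0.91, "Geluidsoverlast": 0.42, "Bijstand": 0.07}

strict, _ = select_labels(example, threshold=0.5)
lenient, _ = select_labels(example, threshold=0.3)
print([label for label, _ in strict])   # ['Parkeren']
print([label for label, _ in lenient])  # ['Parkeren', 'Geluidsoverlast']
```

Lowering the threshold only changes which labels count as "predicted"; the ranking by probability stays the same.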

## 🎯 Features

- **Dual-head classification**: Simultaneously predicts topic (onderwerp) and experience (beleving)
- **Interactive threshold**: Adjust which labels are considered "predicted"
- **Color-coded visualization**: Probability intensity shown via color (darker = higher probability)
- **Accessible**: All probabilities are shown numerically; colors are only an enhancement
- **Fast**: Optimized for CPU inference (~2-5s) with optional GPU acceleration

## 🤖 Model

- **Base model**: `bert-base-multilingual-cased`
- **Architecture**: Dual classification heads with 64 onderwerp + 33 beleving labels
- **Training**: Synthetic data via Argilla + distillation pipeline
- **License**: Apache-2.0
- **Full model card**: [UWV/wimbert-synth-v0](https://huggingface.co/UWV/wimbert-synth-v0)
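
As a rough picture of what "dual classification heads" means, the toy sketch below feeds one shared sentence representation into two independent linear layers and applies a sigmoid per label, which is the usual setup for multi-label classification. It is written in plain Python so it runs without PyTorch. The helper names, the toy hidden size, the random weights, and the use of a per-label sigmoid are assumptions for illustration; see the full model card for the actual architecture.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def linear(weights, bias, features):
    """Dense layer: one output logit per row of weights."""
    return [sum(w * f for w, f in zip(row, features)) + b
            for row, b in zip(weights, bias)]


def make_head(num_labels, hidden_size, rng):
    """Randomly initialised head (hypothetical; real weights are trained)."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(hidden_size)]
               for _ in range(num_labels)]
    bias = [0.0] * num_labels
    return weights, bias


rng = random.Random(0)
HIDDEN = 8  # toy size for the sketch; BERT-base encoders use 768

onderwerp_head = make_head(64, HIDDEN, rng)
beleving_head = make_head(33, HIDDEN, rng)

# Stand-in for the encoder's pooled sentence representation.
pooled = [rng.uniform(-1.0, 1.0) for _ in range(HIDDEN)]

# Both heads read the same representation; each label gets its own
# independent probability, so several labels can be active at once.
onderwerp_probs = [sigmoid(z) for z in linear(*onderwerp_head, pooled)]
beleving_probs = [sigmoid(z) for z in linear(*beleving_head, pooled)]

print(len(onderwerp_probs), len(beleving_probs))  # 64 33
```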

### Labels

**Onderwerp (64 topics)**:
Advies, Algemene veiligheid, Begeleiding, Bijstand, Bouwoverlast, COVID-19, Criminaliteit, 
Documentaanvraag, Energiekosten, Evenementen, Financiële regelingen, Geluidsoverlast, 
Gemeentelijke heffingen, Hangjongeren, Huisdierenoverlast, Hulp aan dak- en thuislozen, 
Infrastructuur, Kwijtschelding, Migratie, Onderhoud omgeving, Parkeren, Schade en claims, 
Verkeersmaatregelen, Verkeersveiligheid, Wijkteam, and more...

**Beleving (33 experiences)**:
Afspraakmogelijkheden, Algemene ervaring, Behulpzaamheid, Bereikbaarheid, Bezwaar & bewijs, 
Communicatie, Deskundigheid, Duidelijkheid, Efficiëntie, Faciliteiten, Gebruiksgemak, 
Informatievoorziening, Integriteit, Kwaliteit klantenservice, Snelheid van afhandeling, 
Vriendelijkheid, Wachttijd, and more...

## 🔒 Privacy

- Input text is processed **in-memory only**
- No data is logged or stored beyond standard Gradio telemetry
- Model runs entirely within this Space (no external API calls)

## ⚙️ Hardware

- **CPU**: Works on free tier (~3-5s inference)
- **GPU (T4)**: Recommended for production (<1s inference)

Current Space is running on: **CPU** with FP32

## 🛠️ Local Development

```bash
# Clone and setup
git clone https://huggingface.co/spaces/UWV/wimbert-synth-v0
cd wimbert-synth-v0
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt

# Run
python app.py
```

## 📊 Example Use Cases

- **Citizen feedback routing**: Automatically categorize incoming messages
- **Sentiment analysis**: Understand citizen experience with government services
- **Analytics**: Aggregate trends across topics and experiences
- **Triage**: Prioritize urgent or negative feedback
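
For the analytics use case, aggregating trends could be as simple as counting labels across classified messages. The sketch below assumes a hypothetical list of per-message results with `onderwerp` and `beleving` keys holding already-thresholded labels; this structure is invented for illustration and is not necessarily the format shown in the **JSON** tab.

```python
from collections import Counter

# Hypothetical per-message predictions (labels already thresholded).
predictions = [
    {"onderwerp": ["Parkeren"], "beleving": ["Wachttijd"]},
    {"onderwerp": ["Parkeren", "Infrastructuur"], "beleving": ["Communicatie"]},
    {"onderwerp": ["Bijstand"], "beleving": ["Wachttijd", "Duidelijkheid"]},
]

# How often each topic occurs across all messages.
topic_counts = Counter(t for p in predictions for t in p["onderwerp"])

# How often each (topic, experience) pair occurs in the same message.
pair_counts = Counter(
    (t, b)
    for p in predictions
    for t in p["onderwerp"]
    for b in p["beleving"]
)

print(topic_counts.most_common(1))  # [('Parkeren', 2)]
```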

⚠️ **Note**: This is a research/demo tool. Not intended for automated decision-making.

---

**Built with**: Gradio • Transformers • PyTorch  
**Developed by**: UWV  
**License**: Apache-2.0