title: WimBERT Synth v0
emoji: 🏛️
colorFrom: blue
colorTo: indigo
sdk: gradio
sdk_version: 5.49.1
app_file: app.py
pinned: false
license: apache-2.0
short_description: Dutch multi-label classifier for signal messages

WimBERT Synth v0: Dutch Multi-Label Signal Classifier

Demo of a dual-head BERT classifier trained on synthetic Dutch government signals. Predicts relevant topics (onderwerp, 64 labels) and sentiment/experience (beleving, 33 labels) for each input message.

🚀 Usage

  1. Enter Dutch text (e.g., a citizen feedback message about government services)
  2. Click Voorspel to classify
  3. Adjust Drempel (threshold) to change prediction sensitivity: a lower value marks more labels as predicted
  4. View results in three tabs:
    • Samenvatting: Top-K predictions per head with color-coded probabilities
    • Alle labels: Complete list of all labels sorted by probability
    • JSON: Raw predictions in machine-readable format
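
The threshold and Top-K behaviour described above can be sketched in plain Python. This is a minimal illustration, not the Space's actual code: the function name `summarize`, the example probabilities, the JSON layout, and the choice of `>=` for the threshold comparison are all assumptions made for this example.

```python
import json

def summarize(probs, threshold=0.5, top_k=3):
    """Rank labels by probability and mark those at or above the threshold.

    `probs` maps label name -> probability for one head (hypothetical input).
    """
    # Sort all labels by descending probability (the "Alle labels" view).
    ranked = sorted(probs.items(), key=lambda item: item[1], reverse=True)
    return {
        # The K highest-ranked labels, regardless of threshold.
        "top_k": [label for label, _ in ranked[:top_k]],
        # Labels counted as "predicted" (assumed here: probability >= threshold).
        "predicted": [label for label, p in ranked if p >= threshold],
        "all": [{"label": label, "probability": p} for label, p in ranked],
    }

# Made-up probabilities for three onderwerp labels.
onderwerp = {"Parkeren": 0.91, "Geluidsoverlast": 0.42, "Bijstand": 0.07}

result = summarize(onderwerp, threshold=0.4, top_k=2)
print(json.dumps(result, ensure_ascii=False, indent=2))
```

Lowering `threshold` moves more labels into `predicted`, while `top_k` and `all` stay the same.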

🎯 Features

  • Dual-head classification: Simultaneously predicts topic (onderwerp) and experience (beleving)
  • Interactive threshold: Adjust which labels are considered "predicted"
  • Color-coded visualization: Probability intensity shown via color (darker = higher probability)
  • Accessible: All probabilities are shown numerically; colors are an enhancement only
  • Fast: Optimized for CPU inference (~2-5s) with optional GPU acceleration
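
One way to realise "darker = higher probability" is to blend between a light and a dark shade of a single hue. The following is a sketch under that assumption; the two colours, the helper name `probability_to_hex`, and the linear interpolation are illustrative choices, not taken from app.py.

```python
def probability_to_hex(p, light=(219, 234, 254), dark=(30, 58, 138)):
    """Linearly blend from a light to a dark RGB colour as p goes 0 -> 1."""
    p = min(max(p, 0.0), 1.0)  # clamp out-of-range inputs
    channels = [round(lo + (hi - lo) * p) for lo, hi in zip(light, dark)]
    return "#{:02x}{:02x}{:02x}".format(*channels)

for p in (0.0, 0.5, 1.0):
    # Print the number next to the colour so the value never depends on colour alone.
    print(f"{p:.2f} -> {probability_to_hex(p)}")
```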

🤖 Model

  • Base model: bert-base-multilingual-cased
  • Architecture: Dual classification heads with 64 onderwerp + 33 beleving labels
  • Training: Synthetic data via Argilla + distillation pipeline
  • License: Apache-2.0
  • Full model card: UWV/wimbert-synth-v0
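
In a multi-label setup, each label typically gets its own independent probability, so several labels per head can be active at once. The sketch below assumes each head emits one raw score (logit) per label and that a sigmoid turns it into a probability; the model card remains the authority on what the model actually does. The names `sigmoid` and `head_probabilities` and the logit values are illustrative.

```python
import math

def sigmoid(x):
    """Map a raw score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def head_probabilities(logits, labels):
    """Pair each label with an independent sigmoid probability."""
    if len(logits) != len(labels):
        raise ValueError("need exactly one logit per label")
    return {label: sigmoid(z) for label, z in zip(labels, logits)}

# Illustrative logits for a few labels of each head (not real model output).
onderwerp = head_probabilities([2.0, -1.0, 0.0], ["Parkeren", "Bijstand", "Migratie"])
beleving = head_probabilities([1.5, -3.0], ["Wachttijd", "Vriendelijkheid"])

# Unlike softmax, the probabilities within a head need not sum to 1.
print(onderwerp)
print(beleving)
```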

Labels

Onderwerp (64 topics): Advies, Algemene veiligheid, Begeleiding, Bijstand, Bouwoverlast, COVID-19, Criminaliteit, Documentaanvraag, Energiekosten, Evenementen, Financiële regelingen, Geluidsoverlast, Gemeentelijke heffingen, Hangjongeren, Huisdierenoverlast, Hulp aan dak- en thuislozen, Infrastructuur, Kwijtschelding, Migratie, Onderhoud omgeving, Parkeren, Schade en claims, Verkeersmaatregelen, Verkeersveiligheid, Wijkteam, and more...

Beleving (33 experiences): Afspraakmogelijkheden, Algemene ervaring, Behulpzaamheid, Bereikbaarheid, Bezwaar & bewijs, Communicatie, Deskundigheid, Duidelijkheid, Efficiëntie, Faciliteiten, Gebruiksgemak, Informatievoorziening, Integriteit, Kwaliteit klantenservice, Snelheid van afhandeling, Vriendelijkheid, Wachttijd, and more...

🔒 Privacy

  • Input text is processed in-memory only
  • No data is logged or stored beyond standard Gradio telemetry
  • Model runs entirely within this Space (no external API calls)

⚙️ Hardware

  • CPU: Works on free tier (~3-5s inference)
  • GPU (T4): Recommended for production (<1s inference)

Current Space is running on: CPU with FP32

🛠️ Local Development

# Clone and setup
git clone https://huggingface.co/spaces/UWV/wimbert-synth-v0
cd wimbert-synth-v0
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt

# Run
python app.py

📊 Example Use Cases

  • Citizen feedback routing: Automatically categorize incoming messages
  • Sentiment analysis: Understand citizen experience with government services
  • Analytics: Aggregate trends across topics and experiences
  • Triage: Prioritize urgent or negative feedback
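
For the analytics use case, per-message predictions can be tallied into label counts. This is a minimal sketch, assuming each message's result is a dict with `onderwerp` and `beleving` lists of predicted label names; that record shape and the helper name `aggregate` are hypothetical and may differ from what the Space's JSON tab returns.

```python
from collections import Counter

def aggregate(records, head):
    """Count how often each label of the given head was predicted."""
    counts = Counter()
    for record in records:
        counts.update(record.get(head, []))
    return counts

# Hypothetical predictions for three messages.
records = [
    {"onderwerp": ["Parkeren"], "beleving": ["Wachttijd", "Communicatie"]},
    {"onderwerp": ["Parkeren", "Verkeersveiligheid"], "beleving": ["Communicatie"]},
    {"onderwerp": ["Bijstand"], "beleving": []},
]

topic_counts = aggregate(records, "onderwerp")
print(topic_counts.most_common(2))
```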

⚠️ Note: This is a research/demo tool. Not intended for automated decision-making.


Built with: Gradio • Transformers • PyTorch
Developed by: UWV
License: Apache-2.0