nodoki-ner-polish

Custom Polish NER model for nodoki.app - a local-first PWA for personal information management.

Model Description

This model is a fine-tuned DistilBERT for Polish Named Entity Recognition (NER) with custom entity types:

  • PERSON - Names of people (👤 osoba)
  • POLISH_CITY - Polish cities (📍 miasto)
  • FOREIGN_CITY - International cities (📍 miasto)
  • POLISH_STORE - Polish retail chains (🛒 sklep)
  • AMOUNT_PLN - Money amounts in PLN (💰 kwota PLN)
  • MONTH - Polish month names (📅 miesiąc)
  • WEEKDAY - Days of the week (📆 dzień tygodnia)
  • DATE_RELATIVE - Relative dates like wczoraj, dzisiaj, jutro (🕐 data względna)
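An app consuming the model will typically need to turn a predicted tag into the emoji and Polish label listed above. The following is a minimal sketch of such a lookup. The names `ENTITY_DISPLAY` and `describeEntity` are hypothetical, and the handling of `B-`/`I-` prefixes assumes the model may emit BIO-style tags, which this card does not confirm.

```javascript
// Display metadata copied from the entity list above.
const ENTITY_DISPLAY = {
  PERSON: { emoji: '👤', label: 'osoba' },
  POLISH_CITY: { emoji: '📍', label: 'miasto' },
  FOREIGN_CITY: { emoji: '📍', label: 'miasto' },
  POLISH_STORE: { emoji: '🛒', label: 'sklep' },
  AMOUNT_PLN: { emoji: '💰', label: 'kwota PLN' },
  MONTH: { emoji: '📅', label: 'miesiąc' },
  WEEKDAY: { emoji: '📆', label: 'dzień tygodnia' },
  DATE_RELATIVE: { emoji: '🕐', label: 'data względna' },
};

// Assumption: tags may carry a BIO prefix such as "B-PERSON" or "I-PERSON".
// The prefix is stripped before the lookup; unknown tags (including "O") give null.
function describeEntity(tag) {
  const type = tag.replace(/^[BI]-/, '');
  const info = ENTITY_DISPLAY[type];
  return info ? `${info.emoji} ${info.label}` : null;
}
```

For example, `describeEntity('B-POLISH_STORE')` yields `'🛒 sklep'`, while a non-entity tag such as `'O'` yields `null`.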

Performance

  • F1 Score: 0.9985
  • Format: ONNX quantized (uint8)
  • Size: ~130MB

Usage with Transformers.js

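import { pipeline } from '@xenova/transformers';

// Load the token-classification pipeline; the model files are downloaded on first use.
const classifier = await pipeline(
  'token-classification',
  'HerqAI/nodoki-ner-polish'
);

// Input sentence: "Yesterday I bought milk at Biedronka for 4.99 zł"
const result = await classifier('Wczoraj kupiłem w Biedronce mleko za 4,99 zł');
console.log(result); // array of per-token predictions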
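The pipeline returns one prediction per token, so a word split into WordPiece pieces arrives as several entries. Below is a minimal sketch of merging those entries into entity spans. The helper `groupEntities` is hypothetical, and it assumes each entry has `entity`, `word`, `score`, and `index` fields, that continuation pieces start with `##`, and that tags may use a BIO scheme; none of this is confirmed by this card.

```javascript
// Merge per-token predictions into entity spans.
// Assumed token shape: { entity, word, score, index }.
function groupEntities(tokens) {
  const spans = [];
  let current = null;

  for (const { entity, word, score, index } of tokens) {
    const type = entity.replace(/^[BI]-/, '');
    if (type === 'O') {
      current = null; // a non-entity token closes any open span
      continue;
    }

    const isSubword = word.startsWith('##');
    // Continue the open span only for an adjacent token of the same type
    // that does not explicitly begin a new entity.
    const continues =
      current !== null &&
      current.type === type &&
      index === current.lastIndex + 1 &&
      (isSubword || !entity.startsWith('B-'));

    if (continues) {
      current.text += isSubword ? word.slice(2) : ' ' + word;
      current.scores.push(score);
      current.lastIndex = index;
    } else {
      current = {
        type,
        text: isSubword ? word.slice(2) : word,
        scores: [score],
        lastIndex: index,
      };
      spans.push(current);
    }
  }

  // Report the weakest token score as the span's confidence.
  return spans.map(({ type, text, scores }) => ({
    type,
    text,
    score: Math.min(...scores),
  }));
}

// Hand-written sample input, not real model output.
const sampleTokens = [
  { entity: 'B-DATE_RELATIVE', word: 'Wczoraj', score: 0.99, index: 1 },
  { entity: 'B-POLISH_STORE', word: 'Bied', score: 0.97, index: 4 },
  { entity: 'I-POLISH_STORE', word: '##ronce', score: 0.95, index: 5 },
];
const sampleSpans = groupEntities(sampleTokens);
```

Multi-word spans are joined with single spaces, so text such as an amount split around a comma will not exactly match the original string; recovering exact text would require character offsets, which this sketch does not rely on.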

Training Details

  • Base model: distilbert-base-multilingual-cased
  • Training samples: 5,000
  • Validation samples: 1,000
  • Test samples: 750
  • Epochs: 4
  • Learning rate: 2e-5
  • Batch size: 16
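The figures above imply a rough number of optimizer steps. The sketch below works it out under assumptions this card does not state: a single device, no gradient accumulation, and the final partial batch of each epoch being kept. The function name `trainingSchedule` is hypothetical.

```javascript
// Estimate optimizer steps from the listed training details.
// Assumptions: one device, no gradient accumulation, last partial batch kept.
function trainingSchedule({ trainSamples, batchSize, epochs }) {
  const stepsPerEpoch = Math.ceil(trainSamples / batchSize);
  return { stepsPerEpoch, totalSteps: stepsPerEpoch * epochs };
}

const schedule = trainingSchedule({ trainSamples: 5000, batchSize: 16, epochs: 4 });
// schedule.stepsPerEpoch is 313 (5000 / 16 = 312.5, rounded up)
// schedule.totalSteps is 1252 (313 * 4)
```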

License

Apache 2.0
