How to use nerdv2/id_nergrit_indonesian_spacy with spaCy:
!pip install https://huggingface.co/nerdv2/id_nergrit_indonesian_spacy/resolve/main/id_nergrit_indonesian_spacy-any-py3-none-any.whl
# Using spacy.load().
import spacy
nlp = spacy.load("id_nergrit_indonesian_spacy")
# Importing as module.
import id_nergrit_indonesian_spacy
nlp = id_nergrit_indonesian_spacy.load()

This is a spaCy model trained on the Nergrit Corpus for Indonesian Named Entity Recognition.
This model recognizes 19 entity types in Indonesian text. Its overall scores are:
| Metric | Score |
|---|---|
| F1 Score | 74.84% |
| Precision | 77.48% |
| Recall | 72.37% |
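
The overall F1 score is consistent with the reported precision and recall, because F1 is the harmonic mean of the two. A quick check in plain Python follows; the function name `f1_from_pr` is illustrative and not part of the model package.

```python
def f1_from_pr(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall (the F1 score)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall from the table above, in percent.
overall_f1 = f1_from_pr(77.48, 72.37)
print(f"{overall_f1:.2f}%")  # prints 74.84%
```
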

F1 scores by entity type (five of the 19 types are listed):

| Entity | F1 Score |
|---|---|
| PRC (Percent) | 93.72% |
| DAT (Date) | 92.41% |
| MON (Money) | 92.56% |
| TIM (Time) | 88.51% |
| CRD (Cardinal) | 86.23% |
pip install spacy
pip install https://huggingface.co/nerdv2/id_nergrit_indonesian_spacy/resolve/main/id_nergrit_indonesian_spacy-1.0.0-py3-none-any.whl
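
Before calling `spacy.load()`, it can help to confirm that the wheel was actually installed into the active environment. The sketch below uses only the standard library; the helper name `is_installed` is an assumption made for illustration.

```python
import importlib.util

# Package name used by the import examples in this card.
MODEL_PACKAGE = "id_nergrit_indonesian_spacy"


def is_installed(package_name: str) -> bool:
    """Return True if a top-level package can be found on the import path."""
    return importlib.util.find_spec(package_name) is not None


if not is_installed(MODEL_PACKAGE):
    print(f"{MODEL_PACKAGE} is not installed; run the pip commands above first.")
```
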
import spacy
# Load the model
nlp = spacy.load("id_nergrit_indonesian_spacy")
# Process text
text = "Presiden Joko Widodo mengunjungi Jakarta pada tanggal 17 Agustus 2023."
doc = nlp(text)
# Extract entities
for ent in doc.ents:
    print(f"{ent.text} -> {ent.label_}")
Output:
Joko Widodo -> PER
Jakarta -> GPE
17 Agustus 2023 -> DAT
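
Downstream code often wants the entities grouped by type rather than printed one by one. A minimal sketch of that step follows, written against plain `(text, label)` pairs so it runs without the model. The helper `group_by_label` is hypothetical, and the pairs are copied from the sample output above.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def group_by_label(entities: Iterable[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Collect entity texts under their label, keeping first-seen order."""
    grouped = defaultdict(list)
    for text, label in entities:
        grouped[label].append(text)
    return dict(grouped)


# With a loaded model these pairs would come from:
#   [(ent.text, ent.label_) for ent in doc.ents]
pairs = [("Joko Widodo", "PER"), ("Jakarta", "GPE"), ("17 Agustus 2023", "DAT")]
grouped = group_by_label(pairs)
print(grouped)
```
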
import spacy
nlp = spacy.load("id_nergrit_indonesian_spacy")
texts = [
    "Bank Indonesia mengumumkan suku bunga sebesar 5.75 persen.",
    "Menteri Keuangan Sri Mulyani menyatakan APBN 2023 mencapai Rp 3000 triliun.",
]
# nlp.pipe streams the texts through the pipeline in batches
for doc in nlp.pipe(texts):
    print([(ent.text, ent.label_) for ent in doc.ents])
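
When processing many texts, a common follow-up is to tally how often each entity type appears across the batch. The sketch below assumes each document has already been reduced to a list of `(text, label)` pairs, as in the loop above. The helper `count_labels` is hypothetical, and the stand-in batch reuses the sample output shown earlier in this card plus one empty document, so it is not real output for the texts above.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def count_labels(docs_entities: Iterable[List[Tuple[str, str]]]) -> Counter:
    """Count entity labels across a batch of per-document entity lists."""
    counts = Counter()
    for entities in docs_entities:
        counts.update(label for _text, label in entities)
    return counts


# Stand-in batch (an assumption for illustration, not model output for `texts`).
batch = [
    [("Joko Widodo", "PER"), ("Jakarta", "GPE"), ("17 Agustus 2023", "DAT")],
    [],  # a document in which no entities were found
]
label_counts = count_labels(batch)
print(label_counts.most_common())
```
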
import spacy
# Load the installed model package (the wheel itself is hosted on Hugging Face)
nlp = spacy.load("id_nergrit_indonesian_spacy")
doc = nlp("Universitas Indonesia terletak di Depok, Jawa Barat.")
for ent in doc.ents:
    print(f"{ent.text} ({ent.label_})")
The model was trained on the Nergrit Corpus dataset:
Dataset source: grit-id/id_nergrit_corpus
If you use this model, please cite:
@misc{id_nergrit_indonesian_spacy,
  author = {nerdv2},
  title = {Indonesian Named Entity Recognition Model},
  year = {2025},
  publisher = {Hugging Face},
  url = {https://huggingface.co/nerdv2/id_nergrit_indonesian_spacy}
}
MIT License
For questions or issues, please open an issue on the Hugging Face model page.