JengaAI Swahili NER (Context-Aware)

Released as part of the JengaAI Framework, this model is a specialized Named Entity Recognition (NER) system designed for the African context. It goes beyond standard entity detection (Person/Org) to extract incident-specific details, which makes it useful for automated incident processing, legal tech, and data anonymization.

Model Capabilities

This model is fine-tuned to detect 10 entity types relevant to structured data extraction from unstructured reports. The table below lists six of them, along with the `O` (outside) tag:

| Label | Description | Example |
|---|---|---|
| `NAME` | Names of individuals involved | "Kamau", "John Doe" |
| `AGE` | Age of individuals | "34", "18 years old" |
| `GENDER` | Gender identification | "male", "female" |
| `PHONE_NUMBER` | Contact information | "0712345678" |
| `LOCATION` | General location areas | "Moi Avenue", "Nairobi" |
| `LANDMARK` | Specific reference points | "National Archives" |
| `O` | Outside (non-entity) | - |
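
To illustrate the kind of structured output these labels can feed, here is a minimal sketch that groups (label, text) pairs into a record. The `build_record` helper, the `ENTITY_LABELS` set, and the sample pairs are hypothetical and written by hand for illustration; they are not part of JengaAI or of the model's actual output format.

```python
from collections import defaultdict

# Only the labels listed in the table above; the model card mentions 10 types in total.
# "O" marks non-entity tokens and is skipped.
ENTITY_LABELS = {"NAME", "AGE", "GENDER", "PHONE_NUMBER", "LOCATION", "LANDMARK"}

def build_record(entities):
    """Group (label, text) pairs into a dict mapping label -> list of unique values."""
    record = defaultdict(list)
    for label, text in entities:
        if label in ENTITY_LABELS and text not in record[label]:
            record[label].append(text)
    return dict(record)

# Hand-written sample pairs (an assumption, not real model output)
sample = [
    ("NAME", "Kamau"),
    ("AGE", "34"),
    ("LOCATION", "Nairobi"),
    ("O", "reported"),
    ("LOCATION", "Nairobi"),  # duplicate value is kept only once
]
record = build_record(sample)
print(record)
```

A plain dictionary keeps the sketch easy to serialize as JSON; a real pipeline would likely need a stricter schema.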

Usage with JengaAI

This model is built on the distilbert-base-uncased backbone for efficiency on edge devices.

from transformers import AutoTokenizer, AutoModelForTokenClassification
import torch

# 1. Load Model
tokenizer = AutoTokenizer.from_pretrained("Rogendo/JengaAI_NER_distilbert-base-uncased-v01")
model = AutoModelForTokenClassification.from_pretrained("Rogendo/JengaAI_NER_distilbert-base-uncased-v01")

# 2. Prepare Input
text = "Kamau reported the incident at Nairobi."
inputs = tokenizer(text, return_tensors="pt")

# 3. Inference
with torch.no_grad():
    logits = model(**inputs).logits

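# 4. Decode: pick the highest-scoring label for each token
predictions = torch.argmax(logits, dim=-1)
tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])
predicted_token_class = [model.config.id2label[t.item()] for t in predictions[0]]
# Print each token next to its label ([CLS], [SEP] and sub-word pieces included)
for token, label in zip(tokens, predicted_token_class):
    print(token, label)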
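
The predictions above are per token, so they include special tokens and WordPiece sub-tokens. The sketch below shows one possible way to merge them into entity spans. It assumes the labels are plain tags as shown in the table (no `B-`/`I-` prefixes), and the `merge_wordpieces` helper and the sample tokens and labels are hypothetical, written by hand rather than taken from real model output. In practice the token strings can be obtained with `tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])`.

```python
def merge_wordpieces(tokens, labels):
    """Merge WordPiece sub-tokens into words, then join adjacent words sharing a label."""
    special = {"[CLS]", "[SEP]", "[PAD]"}
    words = []  # list of [word, label]
    for token, label in zip(tokens, labels):
        if token in special:
            continue
        if token.startswith("##") and words:
            # Continuation piece: append to the previous word, keep its first label
            words[-1][0] += token[2:]
        else:
            words.append([token, label])

    spans = []  # list of (label, text)
    prev_label = "O"
    for word, label in words:
        if label != "O":
            if label == prev_label:
                # Directly adjacent word with the same label extends the current span
                spans[-1] = (label, spans[-1][1] + " " + word)
            else:
                spans.append((label, word))
        prev_label = label
    return spans

# Hand-written example (assumed tokenization and labels, for illustration only)
example_tokens = ["[CLS]", "ka", "##mau", "reported", "the", "incident", "at", "nairobi", ".", "[SEP]"]
example_labels = ["O", "NAME", "NAME", "O", "O", "O", "O", "LOCATION", "O", "O"]
entity_spans = merge_wordpieces(example_tokens, example_labels)
print(entity_spans)
```

Because there are no begin/inside markers under this assumption, two distinct entities of the same type that sit next to each other would be merged into one span.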

Intended Use & Impact

Legal Tech: Automating the digitization of police abstracts and court affidavits.
Emergency Response: Rapidly extracting location and incident details from distress texts.
Data Sovereignty: Processing sensitive PII locally without sending data to foreign APIs.
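
For the anonymization use case, a minimal sketch of local redaction might look like the following. It assumes entity character spans are already available as `(start, end, label)` tuples and that they do not overlap; the `redact` and `span_of` helpers and the sample report are hypothetical. With a fast tokenizer, character offsets can typically be requested via `return_offsets_mapping=True`, but how spans are derived from this model's output is left open here.

```python
def redact(text, spans):
    """Replace each (start, end, label) character span with a [LABEL] placeholder."""
    # Work from right to left so earlier offsets stay valid after each replacement.
    for start, end, label in sorted(spans, reverse=True):
        text = text[:start] + f"[{label}]" + text[end:]
    return text

def span_of(text, value, label):
    """Helper for this example only: locate the first occurrence of value in text."""
    start = text.index(value)
    return (start, start + len(value), label)

# Hand-written report and spans (assumptions, not real model output)
report = "Kamau, 34, called from 0712345678 near National Archives."
spans = [
    span_of(report, "Kamau", "NAME"),
    span_of(report, "0712345678", "PHONE_NUMBER"),
]
redacted = redact(report, spans)
print(redacted)
```

Everything runs in-process on plain strings, which is in line with the goal of keeping sensitive text off external APIs.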

Training Data

Trained on the ner synthetic dataset, a curated collection of synthetic incident reports that reflect linguistic patterns found in East African administrative text.
