PaulCamacho

metadata

language:
  - de
library_name: tflite
tags:
  - named-entity-recognition
  - ner
  - german
  - tflite
  - on-device
  - mobile
  - android
  - ios
datasets:
  - GermanEval/germeval_14
base_model: deepset/gelectra-large
pipeline_tag: token-classification
license: mit

MobAnon NER Model

German named entity recognition (NER) model for the MobAnon document anonymization app. It is fine-tuned from deepset/gelectra-large on GermEval14 and converted to TensorFlow Lite for on-device inference.

Model Details

Property	Value
Base model	deepset/gelectra-large
Training data	GermEval14 (German NER)
Format	TensorFlow Lite (float16 quantized)
Size	~638 MB
Test F1	~87-89%
Max sequence length	128 tokens

Entity Types

The model detects four semantic entity types using BIO tagging (a B- and an I- label per type plus O, nine labels in total):

Entity	Examples
PERSON	Max Mustermann, Dr. Schmidt
ORGANIZATION	Deutsche Bank, Bundesgerichtshof
LOCATION	Frankfurt, Deutschland, Berliner Str.
MISC	Events, dates, other named entities

MobAnon supplements these with regex-based detection for structured entities (email, phone, IBAN, identifiers).
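Regex-based detection of structured entities could look roughly like the following. This is a minimal sketch, not MobAnon's actual rule set: the `PATTERNS` table, the two regular expressions, and the `find_structured_entities` helper are all assumptions made for illustration. Phone numbers and other identifiers would follow the same shape with their own patterns.

```python
import re

# Illustrative patterns only; these are assumptions, not MobAnon's actual rules.
PATTERNS = {
    # Simplified email pattern; it does not cover every valid address.
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    # German IBAN: "DE", 2 check digits, 18 more digits, optionally in blocks of four.
    "IBAN": re.compile(r"\bDE\d{2}(?: ?\d{4}){4} ?\d{2}\b"),
}


def find_structured_entities(text):
    """Return (label, start, end, matched_text) tuples sorted by start offset."""
    hits = []
    for label, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            hits.append((label, match.start(), match.end(), match.group()))
    return sorted(hits, key=lambda hit: hit[1])


if __name__ == "__main__":
    sample = "Kontakt: max@example.de, IBAN DE89 3704 0044 0532 0130 00."
    for hit in find_structured_entities(sample):
        print(hit)
```

The character offsets make it possible to merge these hits with the model's entity spans before redaction. The IBAN pattern checks only the format, not the checksum.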

Usage

The MobAnon app downloads this model automatically on first use. No manual setup is required.

Direct download

# Via huggingface-cli
huggingface-cli download PaulCamacho/mobanon-models deepseek.tflite

# Via URL
wget https://huggingface.co/PaulCamacho/mobanon-models/resolve/main/deepseek.tflite

Input/Output Specification

Tensor	Direction	Shape	Type	Description
`input_ids`	Input	[1, 128]	int32	Tokenized input IDs
`attention_mask`	Input	[1, 128]	int32	Attention mask (1 for real tokens, 0 for padding)
`logits`	Output	[1, 128, 9]	float32	Per-token logits for 9 BIO labels
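Because the tensor shapes are fixed, every input has to be brought to exactly 128 tokens before inference. The sketch below shows one way to do that in plain Python. The `prepare_inputs` helper is hypothetical, and right-padding with `PAD_ID = 0` is an assumption that should be checked against the tokenizer actually used.

```python
MAX_SEQ_LEN = 128  # fixed sequence length from the specification above
PAD_ID = 0         # assumption: padding token ID; check the tokenizer's vocabulary


def prepare_inputs(token_ids, max_len=MAX_SEQ_LEN, pad_id=PAD_ID):
    """Truncate or right-pad token IDs to max_len and build the matching mask.

    Returns (input_ids, attention_mask), each shaped [1, max_len] as nested lists.
    """
    ids = list(token_ids)[:max_len]
    mask = [1] * len(ids)           # 1 marks a real token
    padding = max_len - len(ids)
    ids += [pad_id] * padding
    mask += [0] * padding           # 0 marks a padded position
    return [ids], [mask]


if __name__ == "__main__":
    # The token IDs here are made up; real IDs come from the model's tokenizer.
    input_ids, attention_mask = prepare_inputs([102, 2310, 4871, 103])
    print(len(input_ids[0]), sum(attention_mask[0]))
```

Both lists still need to be converted to int32 arrays before they are handed to a TensorFlow Lite interpreter. Plain truncation also cuts off any trailing special token, so longer documents are probably better split into chunks before tokenization.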

Labels

Index	Label	Entity
0	O	Outside
1	B-PER	Begin Person
2	I-PER	Inside Person
3	B-ORG	Begin Organization
4	I-ORG	Inside Organization
5	B-LOC	Begin Location
6	I-LOC	Inside Location
7	B-MISC	Begin Miscellaneous
8	I-MISC	Inside Miscellaneous
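Turning the logits into entities takes two steps: pick the highest-scoring label for each token, then merge consecutive B-/I- labels into spans. The sketch below illustrates this with plain Python lists. `decode_labels` and `group_spans` are hypothetical helpers, not part of MobAnon, and they expect a single sequence, that is `logits[0]` and `attention_mask[0]` from the tensors above.

```python
# Label order copied from the table above.
LABELS = ["O", "B-PER", "I-PER", "B-ORG", "I-ORG", "B-LOC", "I-LOC", "B-MISC", "I-MISC"]


def decode_labels(logits, attention_mask):
    """Pick the highest-scoring label per token, skipping padded positions."""
    labels = []
    for scores, keep in zip(logits, attention_mask):
        if keep:
            best = max(range(len(scores)), key=scores.__getitem__)
            labels.append(LABELS[best])
    return labels


def group_spans(labels):
    """Merge BIO labels into (entity_type, start, end) spans; end is exclusive."""
    spans = []
    current = None  # [type, start, end] of the span being built
    for i, label in enumerate(labels):
        prefix, _, etype = label.partition("-")
        if prefix == "I" and current and current[0] == etype:
            current[2] = i + 1  # extend the open span
            continue
        if current:
            spans.append(tuple(current))
            current = None
        if prefix in ("B", "I"):  # a stray I- tag starts a new span (lenient choice)
            current = [etype, i, i + 1]
    if current:
        spans.append(tuple(current))
    return spans


if __name__ == "__main__":
    print(group_spans(["B-PER", "I-PER", "O", "B-LOC", "O"]))
```

The spans are token indices, so they still include subword pieces and any special tokens. Mapping them back to character offsets depends on the tokenizer and is not shown here.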

Training

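cd base_model
# Fine-tune on GermEval14 (3 epochs, batch size 16, fp16)
python train_ner.py --epochs 3 --batch-size 16 --fp16
# Export to ONNX with static input shapes
python export_to_onnx.py --static-shapes
# Convert to TensorFlow Lite with float16 quantization
python convert_to_tflite.py --quantize float16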

See the base_model README for the full training and conversion pipeline.

License

MIT