---
language:
- de
library_name: tflite
tags:
- named-entity-recognition
- ner
- german
- tflite
- on-device
- mobile
- android
- ios
datasets:
- GermanEval/germeval_14
base_model: deepset/gelectra-large
pipeline_tag: token-classification
license: mit
---
# MobAnon NER Model
German Named Entity Recognition model for the [MobAnon](https://github.com/jurasoft/JURA-KI-Anonymer-Mobile) document anonymization app. Fine-tuned from [deepset/gelectra-large](https://huggingface.co/deepset/gelectra-large) on [GermEval14](https://huggingface.co/datasets/GermanEval/germeval_14) and converted to TensorFlow Lite for on-device inference.
## Model Details
| Property | Value |
|----------|-------|
| Base model | deepset/gelectra-large |
| Training data | GermEval14 (German NER) |
| Format | TensorFlow Lite (float16 quantized) |
| Size | ~638 MB |
| Test F1 | ~87-89% |
| Max sequence length | 128 tokens |
## Entity Types
The model detects four semantic entity types, encoded with BIO tagging (nine labels in total, including `O`):
| Entity | Examples |
|--------|----------|
| **PERSON** | Max Mustermann, Dr. Schmidt |
| **ORGANIZATION** | Deutsche Bank, Bundesgerichtshof |
| **LOCATION** | Frankfurt, Deutschland, Berliner Str. |
| **MISC** | Events, dates, other named entities |
MobAnon supplements these with regex-based detection for structured entities (email, phone, IBAN, identifiers).
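The regex-based detection mentioned above could look roughly like the following sketch. The patterns and the `find_structured_entities` helper are illustrative assumptions, not MobAnon's actual rules, and cover only email addresses and IBANs.

```python
import re

# Illustrative patterns only (assumption): MobAnon's real regexes are not
# documented in this README and are likely stricter.
PATTERNS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    # Country code, two check digits, then groups of four with optional spaces.
    "IBAN": re.compile(
        r"\b[A-Z]{2}\d{2}(?: ?[A-Z0-9]{4}){3,7}(?: ?[A-Z0-9]{1,3})?\b"
    ),
}


def find_structured_entities(text):
    """Return (label, start, end, matched_text) tuples sorted by start offset."""
    hits = []
    for label, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            hits.append((label, match.start(), match.end(), match.group()))
    return sorted(hits, key=lambda hit: hit[1])
```

Spans found this way can be merged with the model's predictions before anonymization; how overlaps are resolved is left open here.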
## Usage
The MobAnon app downloads this model automatically on first use; no manual setup is required.
### Direct download
```bash
# Via huggingface-cli
huggingface-cli download PaulCamacho/mobanon-models deepseek.tflite
# Via URL
wget https://huggingface.co/PaulCamacho/mobanon-models/resolve/main/deepseek.tflite
```
### Input/Output Specification
| Tensor | Shape | Type | Description |
|--------|-------|------|-------------|
| `input_ids` | [1, 128] | int32 | Tokenized input IDs |
| `attention_mask` | [1, 128] | int32 | Attention mask |
| `logits` | [1, 128, 9] | float32 | Per-token logits for 9 BIO labels |
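Because the input tensors have a fixed shape of `[1, 128]`, token IDs have to be truncated or padded before inference. The sketch below shows one way to do this in plain Python. The helper name `pad_or_truncate` and the padding ID `0` are assumptions; the correct padding ID comes from the tokenizer matching the base model, and the input is assumed to already contain any special tokens.

```python
MAX_LEN = 128  # max sequence length from the Model Details table


def pad_or_truncate(token_ids, pad_id=0, max_len=MAX_LEN):
    """Return (input_ids, attention_mask) as nested lists of shape [1, max_len].

    pad_id=0 is an assumption; check the tokenizer's actual padding ID.
    Plain truncation may cut off a trailing special token.
    """
    ids = list(token_ids)[:max_len]
    mask = [1] * len(ids)          # 1 marks real tokens
    padding = max_len - len(ids)
    ids += [pad_id] * padding
    mask += [0] * padding          # 0 marks padding positions
    return [ids], [mask]
```

Both results can then be converted to int32 arrays and passed to the TFLite interpreter as `input_ids` and `attention_mask`.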
### Labels
| Index | Label | Entity |
|-------|-------|--------|
| 0 | O | Outside |
| 1 | B-PER | Begin Person |
| 2 | I-PER | Inside Person |
| 3 | B-ORG | Begin Organization |
| 4 | I-ORG | Inside Organization |
| 5 | B-LOC | Begin Location |
| 6 | I-LOC | Inside Location |
| 7 | B-MISC | Begin Miscellaneous |
| 8 | I-MISC | Inside Miscellaneous |
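To turn the `logits` output into entity spans, one option is to take the argmax per token and group consecutive BIO labels. The following is a minimal sketch under that assumption; `decode_bio` is a hypothetical helper, it works on token positions (not characters), and it does not handle subword merging or special tokens.

```python
# Label order taken from the table above.
LABELS = ["O", "B-PER", "I-PER", "B-ORG", "I-ORG",
          "B-LOC", "I-LOC", "B-MISC", "I-MISC"]


def decode_bio(logits, attention_mask):
    """Group per-token logits into (entity_type, start, end_exclusive) spans.

    logits: one row of 9 scores per token (i.e. the [128, 9] slice of the output).
    attention_mask: 0/1 per token; masked positions are treated as "O".
    """
    spans = []
    current = None  # open span as [entity_type, start, end_exclusive]
    for i, (row, keep) in enumerate(zip(logits, attention_mask)):
        best = max(range(len(row)), key=row.__getitem__)  # argmax
        label = LABELS[best] if keep else "O"
        prefix, _, etype = label.partition("-")
        if prefix == "I" and current and current[0] == etype:
            current[2] = i + 1  # extend the open span
            continue
        if current:
            spans.append(tuple(current))
            current = None
        if prefix in ("B", "I"):  # a stray I- tag starts a new span
            current = [etype, i, i + 1]
    if current:
        spans.append(tuple(current))
    return spans
```

Treating a stray `I-` tag as the start of a new span is a design choice made here for robustness; a stricter decoder could discard it instead.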
## Training
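```bash
cd base_model
# Fine-tune for 3 epochs with batch size 16 and fp16 enabled
python train_ner.py --epochs 3 --batch-size 16 --fp16
# Export to ONNX with static shapes
python export_to_onnx.py --static-shapes
# Convert to TFLite with float16 quantization
python convert_to_tflite.py --quantize float16
```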
See the [base_model README](https://github.com/jurasoft/JURA-KI-Anonymer-Mobile/tree/main/base_model) for the full training and conversion pipeline.
## License
MIT