# NER Benchmark Results

**Model:** Minibase-NER-Standard

**Dataset:** ner_benchmark_dataset.jsonl

**Sample Size:** 100

**Date:** 2025-10-07T13:41:36.866891
## Overall Performance

| Metric | Score | Description |
|--------|-------|-------------|
| F1 Score | 0.951 | Overall NER performance (harmonic mean of precision and recall) |
| Precision | 0.915 | Fraction of predicted entities that are correct |
| Recall | 1.000 | Fraction of reference entities that the model found |
| Average Latency | 323.3 ms | Mean response time per sample |
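
As a quick check on how the F1 row relates to the other two rows, the sketch below computes the harmonic mean from the Precision and Recall values in the table. The function name `f1_from_precision_recall` is illustrative and not part of the benchmark code. Pooling the two table values this way gives about 0.956 rather than the reported 0.951; a gap of this size would be expected if F1 were averaged over per-sample scores instead, but the report does not say which aggregation was used, so treat that as an assumption.

```python
def f1_from_precision_recall(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; returns 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values copied from the Overall Performance table.
pooled_f1 = f1_from_precision_recall(0.915, 1.000)
print(round(pooled_f1, 3))  # 0.956
```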
## Entity Type Performance

| Entity Type | Accuracy | Correct/Total |
|-------------|----------|---------------|
| PERSON | 1.000 | 100/100 |
| ORG | 1.000 | 100/100 |
| LOC | 0.660 | 66/100 |
| MISC | 1.000 | 34/34 |
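
The Accuracy column is the Correct/Total ratio for each type. The sketch below recomputes it from the counts in the table and adds two aggregate views that the report does not list: a count-weighted (micro) average and an unweighted (macro) average. Both aggregates and all variable names are illustrative assumptions. The macro average comes out at 0.915, the same value reported as Precision, although the report does not state whether Precision was derived this way.

```python
# Correct/total counts copied from the Entity Type Performance table.
counts = {
    "PERSON": (100, 100),
    "ORG": (100, 100),
    "LOC": (66, 100),
    "MISC": (34, 34),
}


def accuracy(correct: int, total: int) -> float:
    """Share of correct items; 0.0 when the type never occurs."""
    return correct / total if total else 0.0


per_type = {label: accuracy(c, t) for label, (c, t) in counts.items()}

# Micro average: pool all counts, so frequent types weigh more.
micro = sum(c for c, _ in counts.values()) / sum(t for _, t in counts.values())

# Macro average: every type contributes equally, regardless of its count.
macro = sum(per_type.values()) / len(per_type)

print({label: round(score, 3) for label, score in per_type.items()})
print(round(micro, 3), round(macro, 3))  # 0.898 0.915
```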
## Key Improvements

- **BIO Tagging**: The model outputs entities in BIO (Beginning-Inside-Outside) format
- **Multiple Entity Types**: Supports PERSON, ORG, LOC, and MISC entities (PERSON appears as `PER` in the predicted output)
- **Entity-Level Evaluation**: Metrics are calculated at the entity level rather than the token level
- **Comprehensive Coverage**: Evaluates across different text domains
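
To make the BIO tagging and entity-level evaluation points concrete, here is a minimal sketch of one way to decode BIO tags into entities and score them as whole units. It is not the benchmark's actual implementation. The helper names (`bio_to_entities`, `entity_prf`), the token-level tags, and the reference set are all hypothetical: the report shows only grouped predictions, not tags or reference annotations. Entities are compared as exact (type, text) pairs, so a partially matched span counts as an error, and repeated mentions of the same entity collapse into one.

```python
from typing import List, Optional, Set, Tuple

Entity = Tuple[str, str]  # (entity type, surface text)


def bio_to_entities(tokens: List[str], tags: List[str]) -> Set[Entity]:
    """Collect (type, text) pairs from parallel token and BIO tag lists."""
    entities: Set[Entity] = set()
    current_type: Optional[str] = None
    current_tokens: List[str] = []

    def flush() -> None:
        # Close the entity currently being built, if there is one.
        nonlocal current_type, current_tokens
        if current_type is not None:
            entities.add((current_type, " ".join(current_tokens)))
        current_type, current_tokens = None, []

    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            flush()
            current_type, current_tokens = tag[2:], [token]
        elif tag.startswith("I-") and current_type == tag[2:]:
            current_tokens.append(token)
        else:
            # "O", or an I- tag that does not continue the open entity
            # (such a stray I- token is dropped in this sketch).
            flush()
    flush()
    return entities


def entity_prf(predicted: Set[Entity], gold: Set[Entity]) -> Tuple[float, float, float]:
    """Entity-level precision, recall, and F1 using exact (type, text) matches."""
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical tags for the first example sentence in this report.
tokens = ["John", "Smith", "works", "at", "Google", "in", "New", "York",
          "and", "uses", "Python", "programming", "language", "."]
tags = ["B-PER", "I-PER", "O", "O", "B-ORG", "O", "B-LOC", "I-LOC",
        "O", "O", "B-MISC", "O", "O", "O"]
predicted = bio_to_entities(tokens, tags)

# Hypothetical reference annotation that does not label "Python".
gold = {("PER", "John Smith"), ("ORG", "Google"), ("LOC", "New York")}

precision, recall, f1 = entity_prf(predicted, gold)
print(round(precision, 3), round(recall, 3), round(f1, 3))  # 0.75 1.0 0.857
```

With this hypothetical reference set, one spurious prediction out of four lowers precision to 0.75 while recall stays at 1.0, which is the typical pattern behind a perfect recall paired with a lower precision.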
## Example Results

### Example 1

**Input:** John Smith works at Google in New York and uses Python programming language....

**Predicted:** { "PER": ["John Smith"], "ORG": ["Google"], "LOC": ["New York"], "MISC": ["Python"] }...

**F1 Score:** 0.857

### Example 2

**Input:** Microsoft Corporation announced that Satya Nadella will visit London next week....

**Predicted:** { "PER": ["Satya Nadella"], "ORG": ["Microsoft Corporation"], "LOC": ["London"], "MISC": [] }...

**F1 Score:** 1.000

### Example 3

**Input:** The University of Cambridge is located in the United Kingdom and was founded by King Henry III....

**Predicted:** { "PER": ["King Henry III"], "ORG": ["University of Cambridge"], "LOC": ["United Kingdom"], "MISC": [] }...

**F1 Score:** 1.000