| --- |
| license: apache-2.0 |
| base_model: distilbert-base-uncased |
| tags: |
| - token-classification |
| - ner |
| - bootstrap-labels |
| - eval-mentions-bootstrap-v2 |
| metrics: |
| - seqeval |
| language: |
| - en |
| --- |
| |
| # davanstrien/eval-extraction-ner-v2 |
|
|
| Token classifier trained on **bootstrap NER labels** from [`davanstrien/eval-mentions-bootstrap-v2`](https://huggingface.co/datasets/davanstrien/eval-mentions-bootstrap-v2). It demonstrates the `bootstrap-labels` skill workflow: GLiNER bootstraps coarse labels, and a small task-specific model is then trained on them. |
|
|
| ## Training data |
|
|
| - Source: `davanstrien/eval-mentions-bootstrap-v2` |
| - Bootstrap model: GLiNER (via `uv-scripts/gliner`) |
| - Score threshold: 0.8 (entities scoring below this were dropped) |
| - Span blacklist: ['learning_rate', 'eval_batch_size', 'epsilon', 'lr_scheduler_warmup_ratio', 'lr_scheduler_type', 'epoch', 'batch_size', 'optimizer', 'gradient_accumulation_steps', 'warmup_ratio', 'seed', 'weight_decay', 'model', 'dataset', 'transformers', 'training dataset', 'training data', 'unknown dataset', 'f1'] |
| - Train rows: 306 |
| - Val rows: 35 |
| - Token-label distribution (excluding `O`): |
|   - BENCHMARK_NAME: 3663 |
|   - EVALUATION_METRIC: 719 |
| |
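The score threshold and span blacklist listed above can be sketched as a small filter. This is a minimal illustration, not the script used to build the dataset: the entity dict keys (`text`, `label`, `score`), the helper name `filter_entities`, the case-insensitive blacklist match, and the sample predictions are all assumptions, and the blacklist is abbreviated.

```python
# Abbreviated copy of the span blacklist from the card (see the full list above).
SPAN_BLACKLIST = {"learning_rate", "epoch", "model", "dataset", "f1"}
SCORE_THRESHOLD = 0.8  # value stated in the card


def filter_entities(entities, threshold=SCORE_THRESHOLD, blacklist=SPAN_BLACKLIST):
    """Keep entities at or above the threshold whose text is not blacklisted.

    Assumes each entity is a dict with "text", "label" and "score" keys.
    Matching is case-insensitive here, which is an assumption.
    """
    kept = []
    for ent in entities:
        if ent["score"] < threshold:
            continue  # low-confidence bootstrap prediction
        if ent["text"].strip().lower() in blacklist:
            continue  # hyperparameter names and generic words, not real mentions
        kept.append(ent)
    return kept


# Hand-written predictions for illustration only.
predictions = [
    {"text": "MMLU", "label": "BENCHMARK_NAME", "score": 0.93},
    {"text": "learning_rate", "label": "EVALUATION_METRIC", "score": 0.91},
    {"text": "accuracy", "label": "EVALUATION_METRIC", "score": 0.42},
]
kept = filter_entities(predictions)
print([e["text"] for e in kept])
```

Only the first prediction survives: the second is blacklisted despite its high score, and the third falls below the threshold.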
| ## Eval results |
| |
| | Metric | Value | |
| |---|---| |
| | F1 | 0.0000 | |
| | Precision | 0.0000 | |
| | Recall | 0.0000 | |
| | Accuracy | 0.9756 | |
| |
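| (Note: these metrics come from a held-out 10% of the bootstrap labels, which are *silver labels*, not human-reviewed gold, so they reflect agreement with GLiNER rather than absolute accuracy. F1, precision, and recall are entity-level scores, so a value of zero means no predicted span exactly matched a held-out span; accuracy is token-level and is likely dominated by `O` tokens.) |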
| |
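How token accuracy can stay high while entity-level F1 is zero can be shown with a toy example. The sketch below is a simplified stand-in for seqeval, not its implementation: it assumes BIO tags and exact span matching, ignores stray `I-` tags, and uses a synthetic tag sequence rather than anything from the validation set.

```python
def extract_spans(tags):
    """Return a set of (label, start, end) spans from a BIO tag sequence."""
    spans = set()
    start, label = None, None
    for i, tag in enumerate(list(tags) + ["O"]):  # sentinel closes a trailing span
        continues_span = tag.startswith("I-") and label == tag[2:]
        if start is not None and not continues_span:
            spans.add((label, start, i))
            start, label = None, None
        if tag.startswith("B-"):
            start, label = i, tag[2:]
    return spans


def span_f1_and_accuracy(reference, predicted):
    """Compute exact-match span F1 and token-level accuracy."""
    ref_spans, pred_spans = extract_spans(reference), extract_spans(predicted)
    correct = len(ref_spans & pred_spans)
    precision = correct / len(pred_spans) if pred_spans else 0.0
    recall = correct / len(ref_spans) if ref_spans else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    accuracy = sum(r == p for r, p in zip(reference, predicted)) / len(reference)
    return f1, accuracy


# Synthetic data: 40 tokens, one entity token, and a model that predicts only "O".
reference = ["O"] * 39 + ["B-BENCHMARK_NAME"]
predicted = ["O"] * 40
f1, accuracy = span_f1_and_accuracy(reference, predicted)
print(f1, accuracy)
```

A model that never predicts an entity gets every `O` token right, so accuracy stays near 1 while span-level precision, recall, and F1 all fall to zero.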
| ## Caveats |
| |
| - This is a **V0 model** trained on bootstrap labels with no human review pass. Expect it to inherit GLiNER's failure modes. |
| - The intended use is *as the V1 in an active-learning loop*: deploy it as a Label Studio ML backend, route disagreements with GLiNER to humans, and retrain on the corrections. See the [bootstrap-labels skill](https://github.com/huggingface/skills) for the full workflow. |
| |
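The disagreement-routing step of that loop might look roughly like the sketch below. Everything here is hypothetical: the example schema (`text`, `model`, `gliner`), the helper names, and the rule that any difference in `(start, end, label)` spans counts as a disagreement. It does not use the Label Studio API.

```python
def spans_of(entities):
    """Reduce entity dicts to a comparable set of (start, end, label) tuples."""
    return {(e["start"], e["end"], e["label"]) for e in entities}


def route(examples):
    """Split examples into (review_queue, auto_accepted).

    An example goes to human review when this model and GLiNER disagree
    on any span boundary or label (an assumed, deliberately strict rule).
    """
    review_queue, auto_accepted = [], []
    for ex in examples:
        if spans_of(ex["model"]) != spans_of(ex["gliner"]):
            review_queue.append(ex)
        else:
            auto_accepted.append(ex)
    return review_queue, auto_accepted


# Hand-written examples for illustration only.
examples = [
    {
        "text": "Evaluated on MMLU.",
        "model": [{"start": 13, "end": 17, "label": "BENCHMARK_NAME"}],
        "gliner": [{"start": 13, "end": 17, "label": "BENCHMARK_NAME"}],
    },
    {
        "text": "We report accuracy.",
        "model": [],
        "gliner": [{"start": 10, "end": 18, "label": "EVALUATION_METRIC"}],
    },
]
review_queue, auto_accepted = route(examples)
print(len(review_queue), len(auto_accepted))
```

A looser rule, such as overlap-based matching, would send fewer examples to review at the cost of missing boundary errors.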
| ## Usage |
| |
| ```python |
| from transformers import pipeline |
| |
| # "simple" aggregation groups adjacent tokens with the same label into one entity span |
| ner = pipeline("token-classification", model="davanstrien/eval-extraction-ner-v2", aggregation_strategy="simple") |
| # Returns a list of dicts, one per detected entity span |
| ner("This model was evaluated on MMLU and HellaSwag.") |
| ``` |
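With an aggregation strategy set, the pipeline returns a list of dicts with `entity_group`, `score`, `word`, `start`, and `end` keys. A small helper such as the one below could collect mentions by label. The function name `group_mentions`, the `min_score` parameter, and the sample results are illustrative assumptions; the sample is hand-written in the pipeline's output shape and is not real output from this model.

```python
from collections import defaultdict


def group_mentions(results, min_score=0.0):
    """Group aggregated pipeline results into {entity_group: [word, ...]}."""
    grouped = defaultdict(list)
    for r in results:
        if r["score"] >= min_score:
            grouped[r["entity_group"]].append(r["word"])
    return dict(grouped)


# Hand-written sample in the pipeline's output shape (not real model output).
sample = [
    {"entity_group": "BENCHMARK_NAME", "score": 0.97, "word": "mmlu", "start": 28, "end": 32},
    {"entity_group": "BENCHMARK_NAME", "score": 0.95, "word": "hellaswag", "start": 37, "end": 46},
]
mentions = group_mentions(sample, min_score=0.5)
print(mentions)
```

Because the base model is uncased, the `word` field comes back lowercased; slicing the input text with `start` and `end` recovers the original casing.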
| |