---
language:
- en
license: apache-2.0
library_name: llm2ner
base_model: EleutherAI/pythia-2.8b
tags:
- ner
- span-detection
- llm
- pytorch
pipeline_tag: token-classification
model_name: ToMMeR-pythia-2.8b_L5_R64
source: https://github.com/VictorMorand/llm2ner
paper: https://arxiv.org/abs/2510.19410
---

# ToMMeR-pythia-2.8b_L5_R64


[![Paper](https://img.shields.io/badge/Paper-Arxiv-red)](https://arxiv.org/abs/2510.19410)
[![All Models](https://img.shields.io/badge/🤗%20Hugging%20Face%20Models-blue)](https://huggingface.co/llm2ner)
[![GitHub](https://img.shields.io/badge/GitHub-Code-blue)](https://github.com/VictorMorand/llm2ner)


ToMMeR is a lightweight probing model that extracts emergent mention detection capabilities from the early-layer representations of any LLM backbone, achieving high zero-shot recall across 13 NER benchmarks.

## Model Details

This model can be plugged in at layer 5 of `EleutherAI/pythia-2.8b`, with a computational overhead no greater than that of one additional attention head.

| Property  | Value |
|-----------|-------|
| Base LLM  | `EleutherAI/pythia-2.8b` |
| Layer     | 5 |
| #Params   | 330.2K |
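
To put the parameter count in perspective, the sketch below compares it with the weights of a single attention head of the backbone. It assumes the published Pythia-2.8B configuration (hidden size 2560, 32 attention heads), which is not stated in this card, and it counts only the query, key, value and output projection weights, ignoring biases. Parameter count is only a rough proxy for compute.

```python
# Assumed backbone configuration (public Pythia-2.8B config, not taken from this card).
D_MODEL = 2560                 # hidden size
N_HEADS = 32                   # attention heads per layer
D_HEAD = D_MODEL // N_HEADS    # per-head dimension

# One head owns a slice of the Q, K, V and output projections:
# 4 * d_model * d_head weights (biases ignored).
head_params = 4 * D_MODEL * D_HEAD

# Probe size reported in the table above (330.2K).
tommer_params = 330_200

ratio = tommer_params / head_params
print(f"One attention head: {head_params:,} weights")
print(f"ToMMeR probe:       {tommer_params:,} weights ({ratio:.0%} of one head)")
```

Under these assumptions the probe holds roughly 40% of the weights of one attention head.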


# Usage

## Installation
To use ToMMeR, you need to install its codebase first.

```bash
pip install git+https://github.com/VictorMorand/llm2ner.git
```


## Raw inference
By default, ToMMeR outputs span probabilities, but it also provides built-in options for decoding entities.

- Inputs:
  - `tokens` (batch, seq): token IDs to process,
  - `model`: LLM to extract representations from.
- Outputs: (batch, seq, seq) matrix of span scores (masked outside valid spans)

```python
from xpm_torch.huggingface import TorchHFHub
from llm2ner import ToMMeR, utils

tommer: ToMMeR = TorchHFHub.from_pretrained("llm2ner/ToMMeR-pythia-2.8b_L5_R64")
# Load the backbone LLM, optionally cutting the unused layers to save GPU memory.
llm = utils.load_llm(tommer.llm_name, cut_to_layer=tommer.layer)
tommer.to(llm.device)

#### Raw Inference
text = ["Large language models are awesome"]
print(f"Input text: {text[0]}")

# Tokenize into shape (1, seq_len)
tokens = llm.tokenizer(text, return_tensors="pt")["input_ids"].to(llm.device)
# Output raw scores
output = tommer.forward(tokens, llm)  # (batch_size, seq_len, seq_len)
print(f"Raw Output shape: {output.shape}")

# Use the given decoding strategy to infer entities
entities = tommer.infer_entities(tokens=tokens, model=llm, threshold=0.5, decoding_strategy="greedy")
# Each entity is a (begin, end) pair of token indices, end inclusive
str_entities = [llm.tokenizer.decode(tokens[0, b:e + 1]) for b, e in entities[0]]
print(f"Predicted entities: {str_entities}")

# Output:
# INFO:root:Cut LlamaModel with 16 layers to 7 layers
# Input text: Large language models are awesome
# Raw Output shape: torch.Size([1, 6, 6])
# Predicted entities: ['Large language models']
```
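
The two decoding strategies can be pictured on a toy score matrix. The sketch below is not the `llm2ner` implementation: `threshold_decode` and `greedy_decode` are hypothetical helpers, and it assumes that `scores[b][e]` holds the probability of the span running from token `b` to token `e` (inclusive). That matches the `(b, e)` convention of the snippet above, but the layout of the raw matrix is not confirmed here.

```python
from typing import List, Sequence, Tuple

Span = Tuple[int, int]  # (begin, end) token indices, end inclusive


def threshold_decode(scores: Sequence[Sequence[float]], threshold: float = 0.5) -> List[Span]:
    """Return every span (b, e) with b <= e whose score exceeds the threshold.

    Overlapping and nested spans are all kept.
    """
    n = len(scores)
    return [(b, e) for b in range(n) for e in range(b, n) if scores[b][e] > threshold]


def greedy_decode(scores: Sequence[Sequence[float]], threshold: float = 0.5) -> List[Span]:
    """Keep the highest-scoring spans first, dropping any span that overlaps a kept one."""
    candidates = sorted(
        threshold_decode(scores, threshold),
        key=lambda span: scores[span[0]][span[1]],
        reverse=True,
    )
    kept: List[Span] = []
    for b, e in candidates:
        # Two spans are disjoint when one ends before the other begins.
        if all(e < kb or b > ke for kb, ke in kept):
            kept.append((b, e))
    return sorted(kept)


# Toy 4-token example (made-up scores, not model output).
scores = [
    [0.1, 0.6, 0.9, 0.0],
    [0.0, 0.7, 0.2, 0.0],
    [0.0, 0.0, 0.3, 0.0],
    [0.0, 0.0, 0.0, 0.8],
]
print(threshold_decode(scores))  # [(0, 1), (0, 2), (1, 1), (3, 3)]
print(greedy_decode(scores))     # [(0, 2), (3, 3)]
```

Thresholding keeps overlapping and nested spans, while the greedy variant yields a flat segmentation. With a real output, `output[0].tolist()` could be passed as `scores`, provided the layout assumption holds.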


## Fancy Outputs

We also provide inference and plotting utilities in `llm2ner.plotting`.

```python
from xpm_torch.huggingface import TorchHFHub
from llm2ner import ToMMeR, utils, plotting

tommer: ToMMeR = TorchHFHub.from_pretrained("llm2ner/ToMMeR-pythia-2.8b_L5_R64")
# Load the backbone LLM, optionally cutting the unused layers to save GPU memory.
llm = utils.load_llm(tommer.llm_name, cut_to_layer=tommer.layer)
tommer.to(llm.device)

text = "Large language models are awesome. While trained on language modeling, they exhibit emergent Zero Shot abilities that make them suitable for a wide range of tasks, including Named Entity Recognition (NER). "

# Fancy interactive output
outputs = plotting.demo_inference(
    text, tommer, llm,
    decoding_strategy="threshold",  # or "greedy" for flat segmentation
    threshold=0.5,  # default 50%
    show_attn=True,
)
```
<div>
<span class="tex2jax_ignore"><div class="spans" style="line-height: 2.5; direction: ltr">
<span style="font-weight: bold; display: inline-block; position: relative; height: 60px;">
    Large
    <span style="background: lightblue; top: 40px; height: 4px; left: -1px; width: calc(100% + 2px); position: absolute;">
</span>
<span style="background: lightblue; top: 40px; height: 4px; border-top-left-radius: 3px; border-bottom-left-radius: 3px; left: -1px; width: calc(100% + 2px); position: absolute;">
    <span style="background: lightblue; z-index: 10; color: #000; top: -0.5em; padding: 2px 3px; position: absolute; font-size: 0.6em; font-weight: bold; line-height: 1; border-radius: 3px">
        PRED
    </span>
</span>
</span>
<span style="font-weight: bold; display: inline-block; position: relative; height: 77px;">
    language
<span style="background: lightblue; top: 40px; height: 4px; left: -1px; width: calc(100% + 2px); position: absolute;">
</span>
<span style="background: lightblue; top: 57px; height: 4px; left: -1px; width: calc(100% + 2px); position: absolute;">
</span>
<span style="background: lightblue; top: 57px; height: 4px; border-top-left-radius: 3px; border-bottom-left-radius: 3px; left: -1px; width: calc(100% + 2px); position: absolute;">
    <span style="background: lightblue; z-index: 10; color: #000; top: -0.5em; padding: 2px 3px; position: absolute; font-size: 0.6em; font-weight: bold; line-height: 1; border-radius: 3px">
        PRED
    </span>
</span>
</span>
<span style="font-weight: bold; display: inline-block; position: relative; height: 77px;">
    models
<span style="background: lightblue; top: 40px; height: 4px; left: -1px; width: calc(100% + 2px); position: absolute;">
</span>
<span style="background: lightblue; top: 57px; height: 4px; left: -1px; width: calc(100% + 2px); position: absolute;">
</span>
</span>
are awesome . While trained on
<span style="font-weight: bold; display: inline-block; position: relative; height: 60px;">
    language
<span style="background: lightblue; top: 40px; height: 4px; left: -1px; width: calc(100% + 2px); position: absolute;">
</span>
<span style="background: lightblue; top: 40px; height: 4px; border-top-left-radius: 3px; border-bottom-left-radius: 3px; left: -1px; width: calc(100% + 2px); position: absolute;">
    <span style="background: lightblue; z-index: 10; color: #000; top: -0.5em; padding: 2px 3px; position: absolute; font-size: 0.6em; font-weight: bold; line-height: 1; border-radius: 3px">
        PRED
    </span>
</span>
</span>
<span style="font-weight: bold; display: inline-block; position: relative; height: 60px;">
    modeling
<span style="background: lightblue; top: 40px; height: 4px; left: -1px; width: calc(100% + 2px); position: absolute;">
</span>
</span>
, they exhibit
<span style="font-weight: bold; display: inline-block; position: relative; height: 60px;">
    emergent
<span style="background: lightblue; top: 40px; height: 4px; left: -1px; width: calc(100% + 2px); position: absolute;">
</span>
<span style="background: lightblue; top: 40px; height: 4px; border-top-left-radius: 3px; border-bottom-left-radius: 3px; left: -1px; width: calc(100% + 2px); position: absolute;">
    <span style="background: lightblue; z-index: 10; color: #000; top: -0.5em; padding: 2px 3px; position: absolute; font-size: 0.6em; font-weight: bold; line-height: 1; border-radius: 3px">
        PRED
    </span>
</span>
</span>
<span style="font-weight: bold; display: inline-block; position: relative; height: 60px;">
    abilities
<span style="background: lightblue; top: 40px; height: 4px; left: -1px; width: calc(100% + 2px); position: absolute;">
</span>
</span>
that make them suitable for a wide range of
<span style="font-weight: bold; display: inline-block; position: relative; height: 60px;">
    tasks
<span style="background: lightblue; top: 40px; height: 4px; left: -1px; width: calc(100% + 2px); position: absolute;">
</span>
<span style="background: lightblue; top: 40px; height: 4px; border-top-left-radius: 3px; border-bottom-left-radius: 3px; left: -1px; width: calc(100% + 2px); position: absolute;">
    <span style="background: lightblue; z-index: 10; color: #000; top: -0.5em; padding: 2px 3px; position: absolute; font-size: 0.6em; font-weight: bold; line-height: 1; border-radius: 3px">
        PRED
    </span>
</span>
</span>
, including
<span style="font-weight: bold; display: inline-block; position: relative; height: 60px;">
    Named
<span style="background: lightblue; top: 40px; height: 4px; left: -1px; width: calc(100% + 2px); position: absolute;">
</span>
<span style="background: lightblue; top: 40px; height: 4px; border-top-left-radius: 3px; border-bottom-left-radius: 3px; left: -1px; width: calc(100% + 2px); position: absolute;">
    <span style="background: lightblue; z-index: 10; color: #000; top: -0.5em; padding: 2px 3px; position: absolute; font-size: 0.6em; font-weight: bold; line-height: 1; border-radius: 3px">
        PRED
    </span>
</span>
</span>
<span style="font-weight: bold; display: inline-block; position: relative; height: 60px;">
    Entity
<span style="background: lightblue; top: 40px; height: 4px; left: -1px; width: calc(100% + 2px); position: absolute;">
</span>
</span>
<span style="font-weight: bold; display: inline-block; position: relative; height: 60px;">
    Recognition
<span style="background: lightblue; top: 40px; height: 4px; left: -1px; width: calc(100% + 2px); position: absolute;">
</span>
</span>
(
<span style="font-weight: bold; display: inline-block; position: relative; height: 60px;">
    NER
<span style="background: lightblue; top: 40px; height: 4px; left: -1px; width: calc(100% + 2px); position: absolute;">
</span>
<span style="background: lightblue; top: 40px; height: 4px; border-top-left-radius: 3px; border-bottom-left-radius: 3px; left: -1px; width: calc(100% + 2px); position: absolute;">
    <span style="background: lightblue; z-index: 10; color: #000; top: -0.5em; padding: 2px 3px; position: absolute; font-size: 0.6em; font-weight: bold; line-height: 1; border-radius: 3px">
        PRED
    </span>
</span>
</span>
) . </div></span>
</div>

Please visit the [repository](https://github.com/VictorMorand/llm2ner) for more details and a demo notebook.

## Evaluation Results

| Dataset             |   Precision |   Recall |     F1 |   # Samples |
|---------------------|-------------|----------|--------|-------------|
| MultiNERD           |      0.2019 |   0.9565 | 0.3334 |      154144 |
| CoNLL 2003          |      0.2674 |   0.7119 | 0.3888 |       16493 |
| CrossNER_politics   |      0.2921 |   0.9404 | 0.4458 |        1389 |
| CrossNER_AI         |      0.3119 |   0.9119 | 0.4648 |         879 |
| CrossNER_literature |      0.3336 |   0.8854 | 0.4846 |         916 |
| CrossNER_science    |      0.3468 |   0.9167 | 0.5033 |        1193 |
| CrossNER_music      |      0.3641 |   0.9124 | 0.5205 |         945 |
| ncbi                |      0.1187 |   0.8584 | 0.2085 |        3952 |
| FabNER              |      0.2974 |   0.7065 | 0.4186 |       13681 |
| WikiNeural          |      0.1897 |   0.9313 | 0.3151 |       92672 |
| GENIA_NER           |      0.2361 |   0.9237 | 0.3761 |       16563 |
| ACE 2005            |      0.2338 |   0.3292 | 0.2734 |        8230 |
| Ontonotes           |      0.2296 |   0.6734 | 0.3424 |       42193 |
| Aggregated          |      0.2121 |   0.8771 | 0.3415 |      353250 |
| Mean                |      0.2633 |   0.8198 | 0.3904 |      353250 |
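
As a sanity check on how the columns relate, the sketch below recomputes F1 as the harmonic mean of precision and recall for each dataset and averages the per-dataset values. The numbers are copied from the table above. Reading the Mean row as an unweighted average over the 13 datasets is an assumption, though it reproduces that row at four decimals. The Aggregated row is not recomputed because it would require the raw prediction counts.

```python
# (dataset, precision, recall, f1) copied from the table above.
ROWS = [
    ("MultiNERD", 0.2019, 0.9565, 0.3334),
    ("CoNLL 2003", 0.2674, 0.7119, 0.3888),
    ("CrossNER_politics", 0.2921, 0.9404, 0.4458),
    ("CrossNER_AI", 0.3119, 0.9119, 0.4648),
    ("CrossNER_literature", 0.3336, 0.8854, 0.4846),
    ("CrossNER_science", 0.3468, 0.9167, 0.5033),
    ("CrossNER_music", 0.3641, 0.9124, 0.5205),
    ("ncbi", 0.1187, 0.8584, 0.2085),
    ("FabNER", 0.2974, 0.7065, 0.4186),
    ("WikiNeural", 0.1897, 0.9313, 0.3151),
    ("GENIA_NER", 0.2361, 0.9237, 0.3761),
    ("ACE 2005", 0.2338, 0.3292, 0.2734),
    ("Ontonotes", 0.2296, 0.6734, 0.3424),
]


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    total = precision + recall
    return 2 * precision * recall / total if total > 0 else 0.0


def column_mean(index: int) -> float:
    """Unweighted average of one numeric column across all datasets."""
    return sum(row[index] for row in ROWS) / len(ROWS)


# Recomputed F1 differs from the reported one only through input rounding.
max_gap = max(abs(f1_score(p, r) - f1) for _, p, r, f1 in ROWS)
print(f"Largest gap between recomputed and reported F1: {max_gap:.4f}")

print(f"Mean precision: {column_mean(1):.4f}")
print(f"Mean recall:    {column_mean(2):.4f}")
print(f"Mean F1:        {column_mean(3):.4f}")
```

The low precision next to high recall is consistent with the model's stated goal of high zero-shot recall for mention detection.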

## Citation
If you use this model or the approach, please cite the associated paper:
```bibtex
@misc{morand2025tommerefficiententity,
      title={ToMMeR -- Efficient Entity Mention Detection from Large Language Models},
      author={Victor Morand and Nadi Tomeh and Josiane Mothe and Benjamin Piwowarski},
      year={2025},
      eprint={2510.19410},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2510.19410},
}
```

## License
Apache-2.0 (see repository for full text).