---
base_model: Qwen/Qwen3-4B-Instruct-2507
library_name: peft
pipeline_tag: text-generation
tags:
- ner
- named-entity-recognition
- early-modern-english
- historical-texts
- commodity
- lora
- transformers
language:
- en
license: mit
---

# EarlyModernNER - COMMODITY Adapter

A LoRA adapter specialized for extracting **COMMODITY** entities from Early Modern English documents (1500-1800).

## Entity Type: COMMODITY

Extracts trade goods, materials, and foodstuffs:

- Agricultural products (sugar, tobacco, cotton, indigo)
- Foodstuffs (butter, salt, bread, fish, spices)
- Raw materials (timber, iron, wool, silk)
- Spices and luxury goods (pepper, cinnamon, nutmeg)
- Manufactured goods (cloth, linen)

**Does NOT extract:**

- Currency terms (money, guineas, pounds)
- Abstract concepts
- Nationalities

## Performance

Evaluated on 100 gold-standard annotated documents:

| Metric | Score |
|--------|-------|
| Precision | 0.8473 |
| Recall | 0.8043 |
| F1 | 0.8253 |

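To score your own output in a comparable way, the sketch below computes entity-level precision, recall, and F1. The matching rule behind the reported figures is not stated in this card, so the code assumes exact, case-insensitive string matching per document with micro-averaging. The function name `score_entities` and the sample data are illustrative only.

```python
from typing import Dict, Set, Tuple


def score_entities(
    gold: Dict[str, Set[str]], predicted: Dict[str, Set[str]]
) -> Tuple[float, float, float]:
    """Micro-averaged precision, recall, and F1 over per-document entity sets."""
    tp = fp = fn = 0
    for doc_id in gold.keys() | predicted.keys():
        # Assumption: exact, case-insensitive string match within a document.
        g = {e.lower() for e in gold.get(doc_id, set())}
        p = {e.lower() for e in predicted.get(doc_id, set())}
        tp += len(g & p)
        fp += len(p - g)
        fn += len(g - p)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Toy data, not taken from the evaluation set.
gold = {"doc1": {"sugar", "tobacco"}, "doc2": {"wool"}}
predicted = {"doc1": {"Sugar", "guineas"}, "doc2": {"wool"}}
precision, recall, f1 = score_entities(gold, predicted)
print(f"P={precision:.4f} R={recall:.4f} F1={f1:.4f}")
```

In the toy data, `guineas` counts as a false positive (a currency term) and `tobacco` as a false negative.
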
## Training Details

- **Base Model:** Qwen/Qwen3-4B-Instruct-2507
- **Method:** QLoRA (4-bit quantization)
- **LoRA Rank:** 64
- **Epochs:** 2
- **Training Data:** Augmented silver-standard annotations with synthetic hard negatives

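As a rough illustration of what a QLoRA setup with these settings can look like, here is a configuration sketch using `transformers` and `peft`. Only the 4-bit quantization and the rank of 64 come from the list above. The quantization type, compute dtype, `lora_alpha`, `lora_dropout`, and `target_modules` are assumptions chosen for illustration and may differ from the values actually used in training.

```python
import torch
from peft import LoraConfig
from transformers import BitsAndBytesConfig

# 4-bit quantization for the frozen base model (the "Q" in QLoRA).
# Assumption: NF4 with double quantization and bfloat16 compute.
quant_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

# LoRA adapter settings. r=64 is stated in this card; the rest is assumed.
lora_config = LoraConfig(
    r=64,
    lora_alpha=128,        # assumption
    lora_dropout=0.05,     # assumption
    target_modules=[       # assumption: attention and MLP projections
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
    task_type="CAUSAL_LM",
)
```

`quant_config` would be passed as `quantization_config=` when loading the base model, and `lora_config` to `peft.get_peft_model` before training.
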
## Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load the base model; device_map="auto" places weights on the available devices.
base_model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen3-4B-Instruct-2507",
    device_map="auto",
    trust_remote_code=True,
)
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3-4B-Instruct-2507")

# Attach the COMMODITY LoRA adapter (replace the placeholder path with your copy).
model = PeftModel.from_pretrained(base_model, "path/to/commodity_lora")
```

Or use via the EarlyModernNER package:

```bash
python -m earlymodernner --input your_docs/ --output results.jsonl
```

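The command above writes a `.jsonl` file. The record schema is not documented in this card, so the sketch below assumes only the general JSON Lines convention of one JSON object per line, and simply reports how many records there are and which fields the first one carries. The helper name `read_jsonl` is illustrative.

```python
import json
from pathlib import Path
from typing import Any, Dict, Iterator


def read_jsonl(path: str) -> Iterator[Dict[str, Any]]:
    """Yield one parsed JSON object per non-empty line."""
    with Path(path).open(encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:
                yield json.loads(line)


if __name__ == "__main__":
    results_path = Path("results.jsonl")
    if results_path.exists():
        records = list(read_jsonl(str(results_path)))
        print(f"{len(records)} records")
        if records:
            # Inspect the field names rather than assuming a schema.
            print("fields:", sorted(records[0].keys()))
```
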
## Limitations

- Distinguishing commodities from other nouns requires context
- Historical commodity names may differ from modern terms
- Recipe/household texts may have different commodity distributions than trade documents

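Because historical commodity names and spellings can differ from modern ones, some users may want to normalize extracted entities before counting or linking them. The sketch below is one possible post-processing step and is not part of the adapter or the EarlyModernNER package. The variant table and the function name `normalize_commodities` are illustrative assumptions that you would replace with mappings drawn from your own corpus.

```python
from typing import Dict, Iterable, List

# Illustrative variant-to-modern mapping; extend it from your own corpus.
SPELLING_VARIANTS: Dict[str, str] = {
    "suger": "sugar",
    "tobaco": "tobacco",
    "indico": "indigo",
}


def normalize_commodities(entities: Iterable[str]) -> List[str]:
    """Lowercase, map known variants, and de-duplicate while preserving order."""
    seen = set()
    normalized = []
    for entity in entities:
        key = entity.strip().lower()
        key = SPELLING_VARIANTS.get(key, key)
        if key and key not in seen:
            seen.add(key)
            normalized.append(key)
    return normalized


print(normalize_commodities(["Suger", "sugar", "Tobaco", "timber"]))
# → ['sugar', 'tobacco', 'timber']
```
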
## License

MIT License

## Author

Jacob Polay, MA Student, University of Saskatchewan

## Citation

```bibtex
@software{earlymodernner,
  title = {EarlyModernNER: Named Entity Recognition for Early Modern English},
  author = {Polay, Jacob},
  year = {2026}
}
```