---
tags:
- autotrain
- text-generation
- transformers
- named entity recognition
widget:
- text: 'I love AutoTrain because '
license: mit
datasets:
- conll2012_ontonotesv5
language:
- en
---
# Phi-2 model fine-tuned for the named entity recognition task
The model was fine-tuned on one quarter of the CoNLL-2012 OntoNotes v5 dataset.
- Dataset Source: [conll2012_ontonotesv5](https://huggingface.co/datasets/conll2012_ontonotesv5)
- Subset Used: English_v12
- Number of Examples: 87,265
The prompts and expected outputs were constructed as described in [1].
Example input:
```md
Instruct: I am an excelent linquist. The task is to label organization entities in the given sentence. Below are some examples
Input: A spokesman for B. A. T said of the amended filings that,`` It would appear that nothing substantive has changed.
Output: A spokesman for @@B. A. T## said of the amended filings that,`` It would appear that nothing substantive has changed.
Input: Since NBC's interest in the Qintex bid for MGM / UA was disclosed, Mr. Wright has n't been available for comment.
Output: Since @@NBC##'s interest in the @@Qintex## bid for @@MGM / UA## was disclosed, Mr. Wright has n't been available for comment.
Input: You know news organizations demand total transparency whether you're General Motors or United States government /.
Output: You know news organizations demand total transparency whether you're @@General Motors## or United States government /.
Input: We respectfully invite you to watch a special edition of Across China.
Output:
```
Expected output:
```md
We respectfully invite you to watch a special edition of @@Across China##.
```
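The prompt layout above can be assembled, and the `@@...##` markers parsed, with a few lines of Python. This is a minimal sketch rather than the author's code: the helper names `build_prompt` and `extract_entities` are hypothetical, and the instruction wording is copied verbatim from the example (including its spelling) on the assumption that it matches the training data.

```python
import re

# Instruction template copied verbatim from the example prompt above
# (assumption: the spelling matches what the model saw during training).
INSTRUCTION = (
    "Instruct: I am an excelent linquist. The task is to label {entity} "
    "entities in the given sentence. Below are some examples"
)


def build_prompt(entity, examples, sentence):
    """Assemble a few-shot prompt; `examples` is a list of (input, output) pairs."""
    lines = [INSTRUCTION.format(entity=entity)]
    for source, labeled in examples:
        lines.append(f"Input: {source}")
        lines.append(f"Output: {labeled}")
    # The final input is left unanswered so the model completes "Output:".
    lines.append(f"Input: {sentence}")
    lines.append("Output:")
    return "\n".join(lines)


def extract_entities(labeled_sentence):
    """Return the spans wrapped in @@ ... ## markers, in order of appearance."""
    return re.findall(r"@@(.+?)##", labeled_sentence)


examples = [
    (
        "Since NBC's interest in the Qintex bid for MGM / UA was disclosed, "
        "Mr. Wright has n't been available for comment.",
        "Since @@NBC##'s interest in the @@Qintex## bid for @@MGM / UA## was "
        "disclosed, Mr. Wright has n't been available for comment.",
    ),
]
prompt = build_prompt(
    "organization",
    examples,
    "We respectfully invite you to watch a special edition of Across China.",
)
print(prompt)
print(extract_entities(examples[0][1]))  # ['NBC', 'Qintex', 'MGM / UA']
```

The non-greedy pattern `.+?` keeps each match inside a single `@@...##` pair, so a sentence with several entities yields one item per entity.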
The model is trained to recognize the following named entity categories:
- person
- nationalities or religious or political groups
- facility
- organization
- geopolitical entity
- location
- product
- date
- time expression
- percentage
- monetary value
- quantity
- event
- work of art
- law/legal reference
- language name
# Model Trained Using AutoTrain
This model was trained using the AutoTrain **SFT** trainer. For more information, please visit [AutoTrain](https://hf.co/docs/autotrain).
Hyperparameters:
```json
{
  "model": "microsoft/phi-2",
  "valid_split": null,
  "add_eos_token": false,
  "block_size": 1024,
  "model_max_length": 1024,
  "padding": "right",
  "trainer": "sft",
  "use_flash_attention_2": false,
  "disable_gradient_checkpointing": false,
  "evaluation_strategy": "epoch",
  "save_total_limit": 1,
  "save_strategy": "epoch",
  "auto_find_batch_size": false,
  "mixed_precision": "bf16",
  "lr": 0.0002,
  "epochs": 1,
  "batch_size": 1,
  "warmup_ratio": 0.1,
  "gradient_accumulation": 4,
  "optimizer": "adamw_torch",
  "scheduler": "linear",
  "weight_decay": 0.01,
  "max_grad_norm": 1.0,
  "seed": 42,
  "apply_chat_template": false,
  "quantization": "int4",
  "target_modules": null,
  "merge_adapter": false,
  "peft": true,
  "lora_r": 16,
  "lora_alpha": 32,
  "lora_dropout": 0.05,
  "dpo_beta": 0.1
}
```
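Two quantities follow directly from the values listed above: the number of sequences per optimizer step and the LoRA scaling factor. The sketch below derives them under stated assumptions: `num_devices = 1` is a guess (the card does not say how many devices were used), and the scaling formula is the standard LoRA `alpha / r`.

```python
# Values copied from the hyperparameter listing above.
config = {
    "batch_size": 1,
    "gradient_accumulation": 4,
    "lora_r": 16,
    "lora_alpha": 32,
}

# Assumption: training ran on a single device; with N devices the
# effective batch size would be N times larger.
num_devices = 1

# Sequences contributing to each optimizer step.
effective_batch_size = (
    config["batch_size"] * config["gradient_accumulation"] * num_devices
)

# Standard LoRA scales the low-rank update by alpha / r.
lora_scaling = config["lora_alpha"] / config["lora_r"]

print(f"effective batch size: {effective_batch_size}")  # 4
print(f"LoRA scaling factor:  {lora_scaling}")          # 2.0
```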
# Usage
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = "pahautelman/phi2-ner-v1"

# Load the tokenizer and the fine-tuned model, then switch to inference mode.
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(model_path).eval()

prompt = 'Label the person entities in the given sentence: Russian President Vladimir Putin is due to arrive in Havana a few hours from now to become the first post-Soviet leader to visit Cuba.'

# Tokenize without special tokens and return a PyTorch tensor.
inputs = tokenizer.encode(prompt, add_special_tokens=False, return_tensors='pt')

# Greedy decoding; generate at most 9 new tokens.
outputs = model.generate(
    inputs.to(model.device),
    max_new_tokens=9,
    do_sample=False,
)

# The decoded text contains the prompt followed by the model response.
output = tokenizer.batch_decode(outputs)[0]
# Model response: "Output: Russian President, Vladimir Putin"
print(output)
```
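For a decoder-only model, `generate` returns the prompt tokens followed by the newly generated ones, so the decoded string in the snippet above begins with the prompt. The following is a minimal sketch of one way to keep only the continuation. `decode_new_tokens` is a hypothetical helper, not part of `transformers`, and it assumes a single, unpadded prompt.

```python
def decode_new_tokens(tokenizer, output_ids, prompt_length):
    """Decode only the tokens generated after the prompt.

    `output_ids` is one sequence returned by `model.generate`, and
    `prompt_length` is the number of prompt tokens (inputs.shape[1]).
    """
    new_ids = output_ids[prompt_length:]
    return tokenizer.decode(new_ids, skip_special_tokens=True)


# Usage with the objects from the snippet above:
# response = decode_new_tokens(tokenizer, outputs[0], inputs.shape[1])
# print(response)
```

Slicing by token count avoids string matching against the prompt, which can fail when decoding does not reproduce the prompt text character for character.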
# References:
[1] Wang et al., GPT-NER: Named Entity Recognition via Large Language Models, 2023.