---
tags:
- autotrain
- text-generation
- transformers
- named entity recognition
widget:
- text: 'I love AutoTrain because '
license: mit
datasets:
- conll2012_ontonotesv5
language:
- en
---


# Phi-2 model fine-tuned for named entity recognition
The model was fine-tuned on one quarter of the CoNLL 2012 OntoNotes v5 dataset.
- Dataset Source: [conll2012_ontonotesv5](https://huggingface.co/datasets/conll2012_ontonotesv5)
- Subset Used: English_v12
- Number of Examples: 87,265
  
The prompts and expected outputs were constructed as described in [1].

Example input:
```md
Instruct: I am an excellent linguist. The task is to label organization entities in the given sentence. Below are some examples.

Input: A spokesman for B. A. T said of the amended filings that,`` It would appear that nothing substantive has changed.
Output: A spokesman for @@B. A. T## said of the amended filings that,`` It would appear that nothing substantive has changed.

Input: Since NBC's interest in the Qintex bid for MGM / UA was disclosed, Mr. Wright has n't been available for comment.
Output: Since @@NBC##'s interest in the @@Qintex## bid for @@MGM / UA## was disclosed, Mr. Wright has n't been available for comment.

Input: You know news organizations demand total transparency whether you're General Motors or United States government /.
Output: You know news organizations demand total transparency whether you're @@General Motors## or United States government /.

Input: We respectfully invite you to watch a special edition of Across China.
Output:
```
Expected output:
```md
We respectfully invite you to watch a special edition of @@Across China##.
```
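The `@@`/`##` delimiters make post-processing straightforward. As a minimal sketch (the regex below is illustrative, not part of the released code), entity spans can be recovered from a tagged sentence like so:

```python
import re

def extract_entities(tagged: str) -> list[str]:
    # Entities are wrapped as @@entity## in the model's output format;
    # a non-greedy match recovers each span separately.
    return re.findall(r"@@(.+?)##", tagged)

tagged = "We respectfully invite you to watch a special edition of @@Across China##."
print(extract_entities(tagged))  # ['Across China']
```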

This model is trained to recognize the named entity categories
- person
- nationalities or religious or political groups
- facility
- organization
- geopolitical entity
- location
- product
- date
- time expression
- percentage
- monetary value
- quantity
- event
- work of art
- law/legal reference
- language name

# Model Trained Using AutoTrain

This model was trained with the AutoTrain **SFT** trainer. For more information, see [AutoTrain](https://hf.co/docs/autotrain).

Hyperparameters:
```json
{
    "model": "microsoft/phi-2",
    "valid_split": null,
    "add_eos_token": false,
    "block_size": 1024,
    "model_max_length": 1024,
    "padding": "right",
    "trainer": "sft",
    "use_flash_attention_2": false,
    "disable_gradient_checkpointing": false,
    "evaluation_strategy": "epoch",
    "save_total_limit": 1,
    "save_strategy": "epoch",
    "auto_find_batch_size": false,
    "mixed_precision": "bf16",
    "lr": 0.0002,
    "epochs": 1,
    "batch_size": 1,
    "warmup_ratio": 0.1,
    "gradient_accumulation": 4,
    "optimizer": "adamw_torch",
    "scheduler": "linear",
    "weight_decay": 0.01,
    "max_grad_norm": 1.0,
    "seed": 42,
    "apply_chat_template": false,
    "quantization": "int4",
    "target_modules": null,
    "merge_adapter": false,
    "peft": true,
    "lora_r": 16,
    "lora_alpha": 32,
    "lora_dropout": 0.05,
    "dpo_beta": 0.1,
}
```

# Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = "pahautelman/phi2-ner-v1"

tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(
    model_path
).eval()

prompt = 'Label the person entities in the given sentence: Russian President Vladimir Putin is due to arrive in Havana a few hours from now to become the first post-Soviet leader to visit Cuba.'

inputs = tokenizer.encode(prompt, add_special_tokens=False, return_tensors='pt')
outputs = model.generate(
    inputs.to(model.device),
    max_new_tokens=9,
    do_sample=False,
)
output = tokenizer.batch_decode(outputs)[0]

# Model response: "Output: Russian President, Vladimir Putin"
print(output)
```
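Because the decoded string echoes the prompt before the model's answer, a small amount of post-processing is needed to isolate the response. A hypothetical helper (not part of this repository) that keeps only the text after the final `Output:` marker might look like:

```python
def parse_response(decoded: str) -> str:
    # Keep only the text after the last "Output:" marker; if the marker
    # is absent, return the decoded string unchanged (stripped).
    marker = "Output:"
    idx = decoded.rfind(marker)
    return decoded[idx + len(marker):].strip() if idx != -1 else decoded.strip()

decoded = (
    "Label the person entities in the given sentence: ... "
    "Output: Russian President, Vladimir Putin"
)
print(parse_response(decoded))  # Russian President, Vladimir Putin
```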

# References
[1] Wang et al., "GPT-NER: Named Entity Recognition via Large Language Models," 2023.