	---
	tags:
	- autotrain
	- text-generation
	- transformers
	- named entity recognition
	widget:
	- text: 'I love AutoTrain because '
	license: mit
	datasets:
	- conll2012_ontonotesv5
	language:
	- en
	---


	# Phi-2 fine-tuned for named entity recognition
	The model was fine-tuned on one quarter of the CoNLL-2012 OntoNotes v5 dataset.
	- Dataset Source: [conll2012_ontonotesv5](https://huggingface.co/datasets/conll2012_ontonotesv5)
	- Subset Used: English_v12
	- Number of Examples: 87,265

	The prompts and expected outputs were constructed as described in [1].

	Example input:
	```md
	Instruct: I am an excelent linquist. The task is to label organization entities in the given sentence. Below are some examples

	Input: A spokesman for B. A. T said of the amended filings that,`` It would appear that nothing substantive has changed.
	Output: A spokesman for @@B. A. T## said of the amended filings that,`` It would appear that nothing substantive has changed.

	Input: Since NBC's interest in the Qintex bid for MGM / UA was disclosed, Mr. Wright has n't been available for comment.
	Output: Since @@NBC##'s interest in the @@Qintex## bid for @@MGM / UA## was disclosed, Mr. Wright has n't been available for comment.

	Input: You know news organizations demand total transparency whether you're General Motors or United States government /.
	Output: You know news organizations demand total transparency whether you're @@General Motors## or United States government /.

	Input: We respectfully invite you to watch a special edition of Across China.
	Output:
	```
	Expected output:
	```md
	We respectfully invite you to watch a special edition of @@Across China##.
	```
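
The prompt layout and the `@@...##` span markers shown above can be assembled and parsed with a few lines of Python. This is a minimal sketch, not the code used to build the training data: the helper names `build_prompt` and `extract_entities` are hypothetical, and the instruction string is copied verbatim from the example (including its spelling) on the assumption that the model saw exactly this wording during fine-tuning.

```python
import re

# Instruction line copied verbatim from the example prompt in this card.
INSTRUCTION = (
    "Instruct: I am an excelent linquist. The task is to label {entity} "
    "entities in the given sentence. Below are some examples"
)

# Entity spans are wrapped as @@span## in the model output.
ENTITY_SPAN = re.compile(r"@@(.+?)##")


def build_prompt(entity, examples, sentence):
    """Assemble a few-shot prompt (hypothetical helper).

    entity:   category phrase, e.g. "organization"
    examples: list of (input_sentence, labeled_sentence) pairs
    sentence: the sentence to be labeled
    """
    parts = [INSTRUCTION.format(entity=entity), ""]
    for source, labeled in examples:
        parts += [f"Input: {source}", f"Output: {labeled}", ""]
    parts += [f"Input: {sentence}", "Output:"]
    return "\n".join(parts)


def extract_entities(labeled_sentence):
    """Return the text of every @@...## span, in order of appearance."""
    return ENTITY_SPAN.findall(labeled_sentence)


if __name__ == "__main__":
    demo = "We respectfully invite you to watch a special edition of @@Across China##."
    print(extract_entities(demo))
```

Because each prompt targets one entity category, labeling a sentence for several categories means building one prompt per category and merging the extracted spans afterwards.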

	This model is trained to recognize the following named entity categories:
	- person
	- nationalities or religious or political groups
	- facility
	- organization
	- geopolitical entity
	- location
	- product
	- date
	- time expression
	- percentage
	- monetary value
	- quantity
	- event
	- work of art
	- law/legal reference
	- language name
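
For post-processing it can help to map these category phrases back to the tag names used in OntoNotes v5. The mapping below is a sketch under an assumption: the card does not state which tag each phrase corresponds to, so the pairing is inferred from the standard OntoNotes label set (whose `ORDINAL` and `CARDINAL` types do not appear in the list above). The names `ONTONOTES_TO_PHRASE` and `phrase_for` are hypothetical.

```python
# Assumed correspondence between OntoNotes v5 tags and the category
# phrases listed in this card (not confirmed by the model author).
ONTONOTES_TO_PHRASE = {
    "PERSON": "person",
    "NORP": "nationalities or religious or political groups",
    "FAC": "facility",
    "ORG": "organization",
    "GPE": "geopolitical entity",
    "LOC": "location",
    "PRODUCT": "product",
    "DATE": "date",
    "TIME": "time expression",
    "PERCENT": "percentage",
    "MONEY": "monetary value",
    "QUANTITY": "quantity",
    "EVENT": "event",
    "WORK_OF_ART": "work of art",
    "LAW": "law/legal reference",
    "LANGUAGE": "language name",
}


def phrase_for(tag):
    """Return the category phrase for an OntoNotes tag (hypothetical helper)."""
    return ONTONOTES_TO_PHRASE[tag.upper()]


if __name__ == "__main__":
    print(phrase_for("org"))
```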

	# Model Trained Using AutoTrain

	This model was trained using the AutoTrain SFT trainer. For more information, please visit [AutoTrain](https://hf.co/docs/autotrain).

	Hyperparameters:
	```json
	{
	  "model": "microsoft/phi-2",
	  "valid_split": null,
	  "add_eos_token": false,
	  "block_size": 1024,
	  "model_max_length": 1024,
	  "padding": "right",
	  "trainer": "sft",
	  "use_flash_attention_2": false,
	  "disable_gradient_checkpointing": false,
	  "evaluation_strategy": "epoch",
	  "save_total_limit": 1,
	  "save_strategy": "epoch",
	  "auto_find_batch_size": false,
	  "mixed_precision": "bf16",
	  "lr": 0.0002,
	  "epochs": 1,
	  "batch_size": 1,
	  "warmup_ratio": 0.1,
	  "gradient_accumulation": 4,
	  "optimizer": "adamw_torch",
	  "scheduler": "linear",
	  "weight_decay": 0.01,
	  "max_grad_norm": 1.0,
	  "seed": 42,
	  "apply_chat_template": false,
	  "quantization": "int4",
	  "target_modules": null,
	  "merge_adapter": false,
	  "peft": true,
	  "lora_r": 16,
	  "lora_alpha": 32,
	  "lora_dropout": 0.05,
	  "dpo_beta": 0.1
	}
	```
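
A few derived quantities follow directly from these hyperparameters. The arithmetic below is a rough estimate rather than a record of the actual run: it assumes a single device, that all 87,265 examples were used for training (`valid_split` is `null`), and that one example corresponds to one training sample. The exact step count also depends on how the trainer rounds partial accumulation windows and on any sequence packing implied by `block_size`.

```python
import math

# Values taken from the card.
num_examples = 87_265
batch_size = 1
gradient_accumulation = 4
epochs = 1
warmup_ratio = 0.1
lora_r = 16
lora_alpha = 32

# Examples consumed per optimizer update (assumes one device).
effective_batch = batch_size * gradient_accumulation

# Approximate number of optimizer updates over the whole run.
steps_per_epoch = math.ceil(num_examples / effective_batch)
total_steps = steps_per_epoch * epochs

# Updates spent in linear learning-rate warmup.
warmup_steps = math.ceil(total_steps * warmup_ratio)

# Standard LoRA scaling factor applied to the adapter update.
lora_scaling = lora_alpha / lora_r

print(effective_batch, total_steps, warmup_steps, lora_scaling)
```

Under these assumptions the run performs roughly 21,800 optimizer updates with an effective batch size of 4, about a tenth of them in warmup, and the LoRA update is scaled by a factor of 2.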

	# Usage

	```python

	from transformers import AutoModelForCausalLM, AutoTokenizer

	model_path = "pahautelman/phi2-ner-v1"

	# Load the tokenizer and model, and switch the model to inference mode.
	tokenizer = AutoTokenizer.from_pretrained(model_path)
	model = AutoModelForCausalLM.from_pretrained(model_path).eval()

	prompt = 'Label the person entities in the given sentence: Russian President Vladimir Putin is due to arrive in Havana a few hours from now to become the first post-Soviet leader to visit Cuba.'

	# Greedy decoding with a small token budget, since the answer is short.
	inputs = tokenizer.encode(prompt, add_special_tokens=False, return_tensors='pt')
	outputs = model.generate(
	    inputs.to(model.device),
	    max_new_tokens=9,
	    do_sample=False,
	)
	# The decoded text contains the prompt followed by the model response.
	output = tokenizer.batch_decode(outputs)[0]

	# Model response: "Output: Russian President, Vladimir Putin"
	print(output)
	```

	# References
	[1] Wang et al., "GPT-NER: Named Entity Recognition via Large Language Models", 2023.