Instructions to use EmmaStrong/RA-IT-NER-zh-7B with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use EmmaStrong/RA-IT-NER-zh-7B with Transformers:
```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="EmmaStrong/RA-IT-NER-zh-7B")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)
```

```python
# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("EmmaStrong/RA-IT-NER-zh-7B")
model = AutoModelForCausalLM.from_pretrained("EmmaStrong/RA-IT-NER-zh-7B")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)
outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))
```

- Notebooks
- Google Colab
- Kaggle
- Local Apps
- vLLM
How to use EmmaStrong/RA-IT-NER-zh-7B with vLLM:
Install from pip and serve model
```shell
# Install vLLM from pip:
pip install vllm

# Start the vLLM server:
vllm serve "EmmaStrong/RA-IT-NER-zh-7B"

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "EmmaStrong/RA-IT-NER-zh-7B",
    "messages": [
      {"role": "user", "content": "What is the capital of France?"}
    ]
  }'
```
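The server's reply follows the OpenAI chat-completions schema. A minimal parsing sketch (the response JSON below is a hypothetical example, not real server output):

```python
import json

# Hypothetical response body from the /v1/chat/completions endpoint above.
raw = """{
  "id": "chatcmpl-123",
  "object": "chat.completion",
  "model": "EmmaStrong/RA-IT-NER-zh-7B",
  "choices": [
    {"index": 0,
     "message": {"role": "assistant", "content": "The capital of France is Paris."},
     "finish_reason": "stop"}
  ]
}"""

response = json.loads(raw)
# The generated text lives under choices[0].message.content.
answer = response["choices"][0]["message"]["content"]
print(answer)  # The capital of France is Paris.
```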
- SGLang
How to use EmmaStrong/RA-IT-NER-zh-7B with SGLang:
Install from pip and serve model
```shell
# Install SGLang from pip:
pip install sglang

# Start the SGLang server:
python3 -m sglang.launch_server \
  --model-path "EmmaStrong/RA-IT-NER-zh-7B" \
  --host 0.0.0.0 \
  --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "EmmaStrong/RA-IT-NER-zh-7B",
    "messages": [
      {"role": "user", "content": "What is the capital of France?"}
    ]
  }'
```

Use Docker images
```shell
# Run the SGLang server in Docker:
docker run --gpus all \
  --shm-size 32g \
  -p 30000:30000 \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  --env "HF_TOKEN=<secret>" \
  --ipc=host \
  lmsysorg/sglang:latest \
  python3 -m sglang.launch_server \
    --model-path "EmmaStrong/RA-IT-NER-zh-7B" \
    --host 0.0.0.0 \
    --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "EmmaStrong/RA-IT-NER-zh-7B",
    "messages": [
      {"role": "user", "content": "What is the capital of France?"}
    ]
  }'
```

- Docker Model Runner
How to use EmmaStrong/RA-IT-NER-zh-7B with Docker Model Runner:
docker model run hf.co/EmmaStrong/RA-IT-NER-zh-7B
RA-IT-NER-zh-7B
Description: RA-IT-NER-zh-7B is trained from Qwen1.5-7B using the proposed Retrieval Augmented Instruction Tuning (RA-IT) approach, and can be used for Chinese OpenNER with or without RAG. The training data is Sky-NER, our instruction-tuning dataset for Chinese OpenNER. We followed the recipe of UniversalNER and used the large-scale SkyPile Corpus to construct this dataset, prompting gpt-3.5-turbo-0125 to label entities in passages and provide entity tags. The data collection prompt is as follows:
给定一段文本,你的任务是抽取所有实体并识别它们的实体类别。输出应为以下JSON格式:[{"实体1": "实体1的类别"}, ...]。

(English: Given a passage of text, your task is to extract all entities and identify their entity categories. The output should be in the following JSON format: [{"entity 1": "category of entity 1"}, ...].)
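Since the prompt pins the output to a list of single-key JSON objects, responses can be parsed with standard JSON tooling. A minimal sketch (the response string below is a hypothetical example):

```python
import json

# Hypothetical model response in the prompt's required format:
# a list of one-key objects mapping each entity to its category.
response = '[{"北京": "城市"}, {"清华大学": "组织"}]'

entities = json.loads(response)
# Flatten into (entity, category) pairs for downstream use.
pairs = [(ent, cat) for obj in entities for ent, cat in obj.items()]
print(pairs)  # [('北京', '城市'), ('清华大学', '组织')]
```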
See our paper for more information, and our GitHub repo for instructions on using the model.
Inference
The template for inference instances is as follows:
USER: 以下是一些命名实体识别的例子:{Fill the NER examples here} ("Here are some named entity recognition examples: …")
ASSISTANT: 我已读完这些例子。 ("I have finished reading these examples.")
USER: 文本:{Fill the input text here} ("Text: …")
ASSISTANT: 我已读完这段文本。 ("I have finished reading this text.")
USER: 文本中属于"{Fill the entity type here}"的实体有哪些? ("Which entities in the text belong to the type '…'?")
ASSISTANT: (model's predictions in JSON format)
Note:
- The model can conduct inference with or without NER examples. To run without examples, start from the third line of the template above, directly inputting "文本:{input text}" in the "USER" role.
- Inference handles one entity type at a time. For multiple entity types, create a separate instance for each type.
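As an illustration, the template and the two notes above can be assembled programmatically; the sketch below builds one conversation per entity type (the helper name, sample text, and entity types are hypothetical, not part of the official repo):

```python
def build_ner_conversation(text, entity_type, examples=None):
    """Assemble the multi-turn prompt from the template above for one entity type."""
    messages = []
    if examples is not None:
        # With-examples mode: prepend the first two turns of the template.
        messages.append({"role": "user", "content": f"以下是一些命名实体识别的例子:{examples}"})
        messages.append({"role": "assistant", "content": "我已读完这些例子。"})
    # Without examples, the conversation starts directly from this turn.
    messages.append({"role": "user", "content": f"文本:{text}"})
    messages.append({"role": "assistant", "content": "我已读完这段文本。"})
    messages.append({"role": "user", "content": f'文本中属于"{entity_type}"的实体有哪些?'})
    return messages

# One instance per entity type (text and types here are illustrative):
text = "马云于1999年在杭州创立了阿里巴巴。"
conversations = [build_ner_conversation(text, t) for t in ["人物", "组织", "地点"]]
# Each conversation can then be fed to tokenizer.apply_chat_template(...)
# as in the Transformers snippet above.
```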
License
This model and its associated data are released under the CC BY-NC 4.0 license and are intended primarily for research purposes.