Update README.md
README.md
CHANGED
@@ -28,7 +28,7 @@ OneKE is a new bilingual knowledge extraction large model developed jointly by Z
 
 
 ## How is OneKE trained?
-OneKE mainly focuses on schema-generalizable information extraction. Due to issues such as non-standard formats, noisy data, and lack of diversity in existing extraction instruction data, OneKE adopted techniques such as normalization and cleaning of extraction instructions, difficult negative sample collection, and schema-based
+OneKE mainly focuses on schema-generalizable information extraction. Due to issues such as non-standard formats, noisy data, and lack of diversity in existing extraction instruction data, OneKE adopted techniques such as normalization and cleaning of extraction instructions, difficult negative sample collection, and schema-based batched instruction construction, as shown in the illustration. For more detailed information, refer to the paper "[IEPile: Unearthing Large-Scale Schema-Based Information Extraction Corpus](https://arxiv.org/abs/2402.14710) [[Github](https://github.com/zjunlp/IEPile)]".
 
 The zero-shot generalization comparison results of OneKE with other large models are as follows:
 * `NER-en`: CrossNER_AI, CrossNER_literature, CrossNER_music, CrossNER_politics, CrossNER_science
@@ -268,7 +268,7 @@ split_num_mapper = {
 ```
 
 
-Since predicting all schemas in the label set at once is too challenging and not easily scalable, OneKE uses a
+Since predicting all schemas in the label set at once is too challenging and not easily scalable, OneKE uses a batched approach during training. It divides the number of schemas asked in the instructions, querying a fixed number of schemas at a time. Hence, if the label set of a piece of data is too long, it will be split into multiple instructions that the model will address in turns.
 
 **schema格式**:
 
@@ -281,7 +281,7 @@ EEA: [{"event_type": "Finance/Trading - Interest Rate Hike", "arguments": ["Time
 ```
 
 
-Below is a simple
+Below is a simple Batched Instruction Generation script:
 
 ```python
 def get_instruction(language, task, schema, input):
@@ -359,7 +359,7 @@ for split_schema in split_schemas:
 
 
 <details>
-<summary><b>Event Extraction (EE)
+<summary><b>Event Extraction (EE) Description Instructions</b></summary>
 
 ```json
 {
@@ -407,7 +407,7 @@ for split_schema in split_schemas:
 
 
 <details>
-<summary><b>Knowledge Graph Construction (KGC)
+<summary><b>Knowledge Graph Construction (KGC) Description Instructions</b></summary>
 
 ```json
 {
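
The batched approach described in the updated README text (a long label set is split into fixed-size groups, and each group becomes its own instruction) can be illustrated with a minimal sketch. This is not the script from the README, whose body is not shown in this diff: the helper names `split_schemas` and `build_instructions`, the example prompt, and the `instruction`/`schema`/`input` JSON layout are assumptions made for illustration only.

```python
import json


def split_schemas(schemas, split_num):
    """Split a label set into consecutive batches of at most `split_num` schemas.

    Hypothetical helper: the README's own splitting logic is not visible here.
    """
    if split_num <= 0:
        raise ValueError("split_num must be positive")
    return [schemas[i:i + split_num] for i in range(0, len(schemas), split_num)]


def build_instructions(task_prompt, schemas, text, split_num):
    """Build one JSON instruction string per schema batch.

    The key names below are an assumed layout, not a confirmed OneKE format.
    """
    return [
        json.dumps(
            {"instruction": task_prompt, "schema": batch, "input": text},
            ensure_ascii=False,
        )
        for batch in split_schemas(schemas, split_num)
    ]


labels = ["person", "organization", "location", "date", "event", "product", "country"]
print(split_schemas(labels, split_num=3))
# → [['person', 'organization', 'location'], ['date', 'event', 'product'], ['country']]

# Each batch yields a separate instruction that the model answers in turn.
for instruction in build_instructions("Extract the entities.", labels, "Some input text.", 3):
    print(instruction)
```

With seven labels and a batch size of three, the model would be queried three times, and the per-batch outputs would then need to be merged by the caller.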