---
license: apache-2.0
base_model: microsoft/phi-4
tags:
- text-generation-inference
- transformers
- unsloth
- phi-4
- information-extraction
- ner
- relation-extraction
- knowledge-graph
- slm
model_creator: FinaPolat
language:
- en
---

# Phi-4-AdaptableIE: Efficient Adaptive Knowledge Graph Extraction

#### A GGUF version of this model is available: https://huggingface.co/FinaPolat/phi4_adaptableIE_v2-gguf

Phi-4-AdaptableIE is a specialized **14.7B parameter Small Language Model (SLM)** optimized via **Supervised Fine-Tuning (SFT)** for high-precision **joint Named Entity Recognition (NER) and Relation Extraction (RE)**.

Unlike traditional multi-stage pipelines that are prone to cascading error propagation, this model performs entity identification and relational mapping in a single cohesive pass. It is designed to be **ontology-adaptive**, allowing it to conform to dynamic, unseen schemas at inference time through a specialized **Structured Prompt Architecture**.



## 🚀 Model Highlights
- **Joint Extraction:** Unified NER + RE in a single pass, reducing pipeline complexity.
- **Ontology-Adaptive:** Zero-shot adaptation to diverse domains (Astronomy, Music, Healthcare, etc.) via dynamic schema variables.
- **Local & Private:** Optimized for **local CPU-only inference** (via GGUF/Ollama: `FinaPolat/phi4_adaptableIE_v2-gguf`), ensuring data sovereignty without external API dependencies.
- **Instruction Aligned:** Fine-tuned to follow strict negative constraints, ensuring zero conversational filler in outputs.

## 🛠 Methodology
The model was fine-tuned using **QLoRA** on the **WebNLG** subset of the **Text2KGBench** benchmark. The training process focused on **Conversational Alignment**, ensuring the model treats extraction as a strict logical mapping:
`Prompt = f(task, schema, example, text)`

---

## 📝 Prompting Strategy
To achieve high-fidelity extraction, the model requires a specific prompt structure.

### 1. System Prompt
```json
{
  "role": "system",
  "content": "You are a helpful AI assistant specializing in Information Extraction tasks such as Named Entity Recognition and Relation Extraction. Follow the instructions given by the user."
}
```


### 2. User Prompt Template

```text

Information Extraction is the process of automatically identifying and extracting structured information from unstructured text data... [Context] ...
Always extract numbers, dates, and currency values regardless of the specific task.

The task at hand is {task}.

Here is an example of task execution:
{example}

Analyze the text and targets carefully, identify relevant information.
Extract the information in the following format: `{output_format}`. 
If no matching entities are found, return an empty list: []. 
Please provide only the extracted information without any explanations.

Schema: {schema}
Text: {inputs}

```
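The template above can be filled programmatically before being paired with the system prompt. The sketch below is illustrative: the template text is abbreviated and the helper name `build_messages` is an assumption, not part of the released code.

```python
# Sketch: fill the user prompt template and pair it with the system prompt.
# USER_TEMPLATE is abbreviated here; use the full template text in practice.
USER_TEMPLATE = (
    "The task at hand is {task}.\n\n"
    "Here is an example of task execution:\n{example}\n\n"
    "Extract the information in the following format: `{output_format}`.\n"
    "If no matching entities are found, return an empty list: [].\n\n"
    "Schema: {schema}\n"
    "Text: {inputs}\n"
)

SYSTEM_PROMPT = (
    "You are a helpful AI assistant specializing in Information Extraction "
    "tasks such as Named Entity Recognition and Relation Extraction. "
    "Follow the instructions given by the user."
)

def build_messages(task, schema, example, output_format, inputs):
    """Return a chat-format message list ready for a chat template or llm.chat()."""
    user = USER_TEMPLATE.format(
        task=task, schema=schema, example=example,
        output_format=output_format, inputs=inputs,
    )
    return [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": user},
    ]
```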

### 3. 💻 Usage Examples
**Option 1: Transformers (Single GPU)**

```python

from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "FinaPolat/phi4_adaptableIE_v2"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto", trust_remote_code=True)

task = "Joint NER and RE"
schema = "['CelestialBody', 'apoapsis', 'averageSpeed']"
inputs = "(19255) 1994 VK8 has an average speed of 4.56 km per second."
output_format = "[('subject', 'predicate', 'object')]"

prompt = f"Task: {task}\nSchema: {schema}\nOutput format: {output_format}\nText: {inputs}\nExtract:"

input_ids = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**input_ids, max_new_tokens=256, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
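Since the model emits bare triple lists in the `[('subject', 'predicate', 'object')]` format, the generated string can be parsed safely with `ast.literal_eval`. This is a minimal sketch (the helper name and the sample output string are illustrative, not an actual model run):

```python
import ast

def parse_triples(raw: str):
    """Parse a model response like "[('s', 'p', 'o')]" into a list of 3-tuples.

    Returns [] when the output is empty or not a well-formed triple list,
    matching the model's empty-list convention.
    """
    try:
        parsed = ast.literal_eval(raw.strip())
    except (ValueError, SyntaxError):
        return []
    if not isinstance(parsed, list):
        return []
    return [t for t in parsed if isinstance(t, tuple) and len(t) == 3]

# Illustrative output string (not an actual model run):
raw = "[('(19255) 1994 VK8', 'averageSpeed', '4.56 km per second')]"
triples = parse_triples(raw)
```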

**Option 2: High-Throughput Batch Inference (vLLM)**

```python

from vllm import LLM, SamplingParams

llm = LLM(
    model="FinaPolat/phi4_adaptableIE_v2",
    dtype="bfloat16",
    trust_remote_code=True,
    gpu_memory_utilization=0.9,
    max_model_len=3000,
    enforce_eager=True, 
    distributed_executor_backend="uni" 
)

sampling_params = SamplingParams(temperature=0.0, max_tokens=256)

# batch_prompts is a list of chat conversations, one per input text, e.g.:
batch_prompts = [
    [
        {"role": "system", "content": "You are a helpful AI assistant specializing in Information Extraction tasks such as Named Entity Recognition and Relation Extraction. Follow the instructions given by the user."},
        {"role": "user", "content": "Schema: ['CelestialBody', 'averageSpeed']\nText: (19255) 1994 VK8 has an average speed of 4.56 km per second."},
    ],
]

outputs = llm.chat(batch_prompts, sampling_params=sampling_params, use_tqdm=True)

```

### 4. 📦 Deployment & Hardware Requirements

| Deployment Mode | Quantization | Hardware Requirement                     | Target Latency |
|-----------------|--------------|------------------------------------------|----------------|
| Server-side     | BF16         | 1× NVIDIA A100 / RTX 4090 (24GB+)         | Ultra-Low      |
| Local Consumer  | 4-bit GGUF   | 16GB RAM (Apple Silicon / PC CPU)         | Moderate       |


For CPU-only local execution, refer to the GGUF version: [phi4_adaptableIE_v2-gguf](https://huggingface.co/FinaPolat/phi4_adaptableIE_v2-gguf).

### 5. 📜 Citation & Credits

If you use this model in your research, please cite the Text2KGBench framework, the Microsoft Phi-4 technical report, and our work:
https://github.com/FinaPolat/ENEXA_adaptable_extraction

Video: https://www.youtube.com/watch?v=your-video-