---
license: apache-2.0
base_model: microsoft/phi-4
tags:
- text-generation-inference
- transformers
- unsloth
- phi-4
- information-extraction
- ner
- relation-extraction
- knowledge-graph
- slm
model_creator: FinaPolat
language:
- en
---

# Phi-4-AdaptableIE: Efficient & Privacy-Preserving Knowledge Graph Extraction

Phi-4-AdaptableIE is a specialized **14.7B-parameter Small Language Model (SLM)** optimized via **Supervised Fine-Tuning (SFT)** for high-precision, **joint Named Entity Recognition (NER) and Relation Extraction (RE)**.

Unlike traditional multi-stage pipelines, which are prone to cascading error propagation, this model performs entity identification and relational mapping in a single cohesive pass. It is designed to be **ontology-adaptive**, conforming to dynamic, unseen schemas at inference time through a specialized **Structured Prompt Architecture**.
## 🚀 Model Highlights

- **Joint Extraction:** Unified NER + RE, reducing pipeline complexity.
- **Ontology-Adaptive:** Zero-shot adaptation to diverse domains (Astronomy, Music, Healthcare, etc.) via dynamic schema variables.
- **Local & Private:** Optimized for **local CPU-only inference** (via GGUF/Ollama), ensuring data sovereignty without external API dependencies.
- **Instruction-Aligned:** Fine-tuned to follow strict negative constraints, ensuring zero conversational filler in outputs.
## 🛠 Methodology

The model was fine-tuned using **QLoRA** on the **WebNLG** subset of the **Text2KGBench** benchmark. The training process focused on **Conversational Alignment**, ensuring the model treats extraction as a strict logical mapping:

`Prompt = f(task, schema, example, text)`

---
## 📝 Prompting Strategy

To achieve high-fidelity extraction, the model requires a specific prompt structure.

### 1. System Prompt

```json
{
  "role": "system",
  "content": "You are a helpful AI assistant specializing in Information Extraction tasks such as Named Entity Recognition and Relation Extraction. Follow the instructions given by the user."
}
```
### 2. User Prompt Template

```
Information Extraction is the process of automatically identifying and extracting structured information from unstructured text data... [Context] ...
Always extract numbers, dates, and currency values regardless of the specific task.

The task at hand is {task}.

Here is an example of task execution:
{example}

Analyze the text and targets carefully, identify relevant information.
Extract the information in the following format: `{output_format}`.
If no matching entities are found, return an empty list: [].
Please provide only the extracted information without any explanations.

Schema: {schema}
Text: {inputs}
```
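The template's placeholder variables (`{task}`, `{example}`, `{output_format}`, `{schema}`, `{inputs}`) can be filled with a small helper. A minimal sketch, assuming the template above (the `build_user_prompt` name is illustrative, not part of the released code, and the fixed context preamble is elided here):

```python
def build_user_prompt(task, schema, example, inputs,
                      output_format="[('subject', 'predicate', 'object')]"):
    """Fill the user prompt template: Prompt = f(task, schema, example, text).

    The fixed Information Extraction context preamble from the card is
    omitted here for brevity; prepend it in real use.
    """
    return (
        "The task at hand is {task}.\n\n"
        "Here is an example of task execution:\n{example}\n\n"
        "Analyze the text and targets carefully, identify relevant information.\n"
        "Extract the information in the following format: `{output_format}`.\n"
        "If no matching entities are found, return an empty list: [].\n"
        "Please provide only the extracted information without any explanations.\n\n"
        "Schema: {schema}\n"
        "Text: {inputs}"
    ).format(task=task, example=example, output_format=output_format,
             schema=schema, inputs=inputs)

# Illustrative values in the spirit of the card's WebNLG example.
prompt = build_user_prompt(
    task="Joint NER and RE",
    schema="['CelestialBody', 'apoapsis', 'averageSpeed']",
    example="Text: ... -> [('(19255) 1994 VK8', 'averageSpeed', '4.56')]",
    inputs="(19255) 1994 VK8 has an average speed of 4.56 km per second.",
)
```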
## 💻 Usage Examples

### Option 1: Transformers (Single GPU)

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "FinaPolat/phi4_adaptableIE_v2"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto", trust_remote_code=True)

# Template variables (see "Prompting Strategy" above for the full template).
task = "Joint NER and RE"
schema = "['CelestialBody', 'apoapsis', 'averageSpeed']"
inputs = "(19255) 1994 VK8 has an average speed of 4.56 km per second."
output_format = "[('subject', 'predicate', 'object')]"

prompt = f"Task: {task}\nSchema: {schema}\nOutput format: {output_format}\nText: {inputs}\nExtract:"

input_ids = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**input_ids, max_new_tokens=256, do_sample=False)  # greedy decoding
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
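The model is instructed to return either a triple list or `[]`. Because that output format is valid Python literal syntax, it can be parsed safely with `ast.literal_eval`. A minimal sketch (the `parse_triples` helper and its bracket-trimming are illustrative assumptions, not part of the released code):

```python
import ast

def parse_triples(generated_text):
    """Parse model output into a list of (subject, predicate, object) tuples.

    Returns [] for the empty-list response or for unparseable output.
    """
    text = generated_text.strip()
    # Keep only the bracketed list in case surrounding text slipped through.
    start, end = text.find("["), text.rfind("]")
    if start == -1 or end == -1:
        return []
    try:
        parsed = ast.literal_eval(text[start:end + 1])
    except (ValueError, SyntaxError):
        return []
    return [t for t in parsed if isinstance(t, tuple) and len(t) == 3]

triples = parse_triples("[('(19255) 1994 VK8', 'averageSpeed', '4.56')]")
# triples == [('(19255) 1994 VK8', 'averageSpeed', '4.56')]
```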
### Option 2: High-Throughput Batch Inference (vLLM)

```python
from vllm import LLM, SamplingParams

llm = LLM(
    model="FinaPolat/phi4_adaptableIE_v2",
    dtype="bfloat16",
    trust_remote_code=True,
    gpu_memory_utilization=0.9,
    max_model_len=3000,
    enforce_eager=True,
    distributed_executor_backend="uni",
)

# batch_prompts: one chat conversation per input text
# (system prompt + filled user prompt template, as described above).
batch_prompts = [
    [
        {"role": "system", "content": "You are a helpful AI assistant specializing in Information Extraction tasks such as Named Entity Recognition and Relation Extraction. Follow the instructions given by the user."},
        {"role": "user", "content": user_prompt},
    ]
    for user_prompt in user_prompts  # user_prompts: your filled templates
]

sampling_params = SamplingParams(temperature=0.0, max_tokens=256)
outputs = llm.chat(batch_prompts, sampling_params=sampling_params, use_tqdm=True)
```
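Downstream, the parsed triples are the knowledge graph itself. A stdlib-only sketch that groups triples by subject and drops duplicates (a deliberately minimal stand-in for a real graph library; the values are illustrative):

```python
from collections import defaultdict

def triples_to_graph(triples):
    """Group (subject, predicate, object) triples into
    {subject: {predicate: [objects]}}, de-duplicating repeats in order."""
    graph = defaultdict(lambda: defaultdict(list))
    seen = set()
    for s, p, o in triples:
        if (s, p, o) not in seen:
            seen.add((s, p, o))
            graph[s][p].append(o)
    return {s: dict(preds) for s, preds in graph.items()}

graph = triples_to_graph([
    ("(19255) 1994 VK8", "averageSpeed", "4.56"),
    ("(19255) 1994 VK8", "apoapsis", "441092000.0"),  # illustrative value
    ("(19255) 1994 VK8", "averageSpeed", "4.56"),     # duplicate, dropped
])
# graph == {"(19255) 1994 VK8": {"averageSpeed": ["4.56"], "apoapsis": ["441092000.0"]}}
```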

## 📦 Deployment & Hardware Requirements

| Deployment Mode | Quantization | Hardware Requirement | Target Latency |
|-----------------|--------------|-----------------------------------|----------------|
| Server-side | BF16 | 1x NVIDIA A100/RTX 4090 (24GB+) | Ultra-Low |
| Local Consumer | 4-bit GGUF | 16GB RAM (Apple Silicon / PC CPU) | Moderate |

For CPU-only local execution, refer to the GGUF version: phi4_adaptableIE_v2-gguf

## 📜 Citation & Credits

If you use this model in your research, please cite the Text2KGBench framework and the Microsoft Phi-4 technical report.

[Link to your Demo Paper or GitHub Repo]