Phi-3.5 Mini Instruct — Text-to-KG (UK Government Contracts)

Model Summary

This model is a LoRA fine-tune of Phi-3.5 Mini Instruct that extracts structured RDF knowledge graph triples from raw UK government procurement contract text. It was developed as part of a UEL–Depixen industrial placement research project on building trustworthy, hallucination-free, domain-specific small language models (SLMs).

Key Results

Metric	Score
F1 Score	0.9954
BERTScore F1	0.9997
Hallucination Rate	0.00%
Test Set Size	1,387 unseen contracts
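
The card does not state how the F1 score is computed. One common convention for KG extraction is exact-match scoring over sets of (subject, predicate, object) triples. The sketch below assumes that convention; the function name `triple_f1` and the example triples are hypothetical and are not taken from the project's evaluation code.

```python
from typing import Iterable, Tuple

Triple = Tuple[str, str, str]

def triple_f1(predicted: Iterable[Triple], gold: Iterable[Triple]) -> dict:
    """Exact-match precision, recall and F1 over sets of triples (assumed convention)."""
    pred_set, gold_set = set(predicted), set(gold)
    true_pos = len(pred_set & gold_set)
    precision = true_pos / len(pred_set) if pred_set else 0.0
    recall = true_pos / len(gold_set) if gold_set else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}

# Hypothetical triples, for illustration only.
gold = [("contract:1", "hasSupplier", "Acme Ltd"),
        ("contract:1", "hasValue", "50000")]
pred = [("contract:1", "hasSupplier", "Acme Ltd"),
        ("contract:1", "hasValue", "5000")]   # wrong value, so no match

scores = triple_f1(pred, gold)
print(scores)
```

Exact matching is strict: a single differing character in any of the three fields counts the whole triple as wrong, which is why a softer metric such as BERTScore is often reported alongside it.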

Model Details

Base Model: microsoft/Phi-3.5-mini-instruct
Fine-tuning Method: LoRA (Low-Rank Adaptation)
Task: Text-to-KG — extracting RDF triples from contract text
Domain: UK Government Procurement Contracts
Training Dataset: 9,244 verified UK government contracts
Hardware: NVIDIA A100
Framework: PyTorch, Hugging Face PEFT, TRL (SFTTrainer)
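
The repository name ends in "-merged", which suggests the LoRA adapter has been folded back into the base weights. LoRA keeps a weight matrix W frozen and learns a low-rank update, so the merged weight is W + (alpha / r) * B A. The toy sketch below illustrates that arithmetic in plain Python; the matrix sizes, rank and alpha are illustrative assumptions, not the hyperparameters used for this model.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def merge_lora(weight, lora_a, lora_b, alpha, rank):
    """Return W + (alpha / rank) * (B @ A), the merged weight."""
    scale = alpha / rank
    delta = matmul(lora_b, lora_a)  # shape (d_out, d_in), rank at most `rank`
    return [[w + scale * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(weight, delta)]

# Toy sizes (assumptions): d_out = 2, d_in = 3, rank = 1, alpha = 2.
W = [[1.0, 0.0, 0.0],
     [0.0, 1.0, 0.0]]
A = [[1.0, 2.0, 3.0]]   # shape (rank, d_in)
B = [[0.5],
     [0.0]]             # shape (d_out, rank)

merged = merge_lora(W, A, B, alpha=2, rank=1)
print(merged)
```

After merging, inference needs only the single combined matrix, so the model loads like any ordinary checkpoint without the PEFT library.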

Training Data

Source: UK Government procurement contracts
Size: 9,244 training samples | 1,387 test samples
Format: Contract text → RDF triple extraction
Dataset: BSVGK/uk-contracts-text-to-kg

Hallucination Evaluation Framework

This model was evaluated using a novel dual-level hallucination evaluation framework:

L1 — Relation Validity: Checks if extracted relations exist in the ontology
L2 — Entity Grounding: Verifies entities are grounded in the source contract text

Evaluation with this framework showed that training loss alone is not a reliable quality signal for KG extraction tasks.
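
The two checks above can be sketched as follows. This is a minimal illustration, not the project's evaluation code: the ontology relation set, the example contract text, and the matching rule (case-insensitive substring match on subject and object) are all assumptions made for the example.

```python
from typing import Iterable, Set, Tuple

Triple = Tuple[str, str, str]

def relation_is_valid(triple: Triple, ontology_relations: Set[str]) -> bool:
    """L1: the predicate must be a relation defined in the ontology."""
    return triple[1] in ontology_relations

def entities_are_grounded(triple: Triple, source_text: str) -> bool:
    """L2: subject and object must appear in the source text.

    Uses a case-insensitive substring match, which is an assumed rule.
    """
    text = source_text.lower()
    return all(entity.lower() in text for entity in (triple[0], triple[2]))

def hallucination_rate(triples: Iterable[Triple],
                       ontology_relations: Set[str],
                       source_text: str) -> float:
    """Fraction of triples failing either check; 0.0 if there are no triples."""
    triples = list(triples)
    if not triples:
        return 0.0
    failed = sum(
        1 for t in triples
        if not (relation_is_valid(t, ontology_relations)
                and entities_are_grounded(t, source_text))
    )
    return failed / len(triples)

# Hypothetical ontology, contract text and model output.
ONTOLOGY = {"awardedTo", "hasValue"}
text = "Contract C-104 was awarded to Acme Ltd for a total value of £50,000."
triples = [
    ("Contract C-104", "awardedTo", "Acme Ltd"),   # passes both levels
    ("Contract C-104", "managedBy", "Acme Ltd"),   # fails L1: unknown relation
    ("Contract C-104", "hasValue", "£75,000"),     # fails L2: value not in text
    ("Contract C-104", "hasValue", "£50,000"),     # passes both levels
]

rate = hallucination_rate(triples, ONTOLOGY, text)
print(rate)
```

Keeping the two levels separate makes it possible to report which kind of error occurs: an invented relation (L1) or an entity that is absent from the source (L2).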

Usage

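import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "BSVGK/phi35-mini-lora-text2kg-merged"
device = "cuda" if torch.cuda.is_available() else "cpu"

tokenizer = AutoTokenizer.from_pretrained(model_id)
# Use half precision on GPU; fall back to float32 on CPU.
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16 if device == "cuda" else torch.float32,
).to(device)

prompt = """Extract RDF triples from the following UK government contract text:

Contract: [paste your contract text here]

RDF Triples:"""

# Move the tokenized inputs to the same device as the model.
inputs = tokenizer(prompt, return_tensors="pt").to(device)
outputs = model.generate(**inputs, max_new_tokens=256)

# Decode only the newly generated tokens, not the echoed prompt.
new_tokens = outputs[0][inputs["input_ids"].shape[1]:]
print(tokenizer.decode(new_tokens, skip_special_tokens=True))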
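
To run the model over many contracts, it may help to wrap the prompt template from the usage example in a small helper. The sketch below copies that template verbatim; `PROMPT_TEMPLATE`, `build_prompt` and the sample contract snippets are hypothetical names and data introduced for illustration only.

```python
PROMPT_TEMPLATE = """Extract RDF triples from the following UK government contract text:

Contract: {contract_text}

RDF Triples:"""

def build_prompt(contract_text: str) -> str:
    """Fill the card's prompt template with the text of one contract."""
    return PROMPT_TEMPLATE.format(contract_text=contract_text.strip())

# Hypothetical contract snippets, for illustration only.
contracts = [
    "Supply of office furniture to the Department for Education.",
    "  Provision of cleaning services for council buildings.  ",
]
prompts = [build_prompt(c) for c in contracts]
print(prompts[0])
```

Each resulting string can be passed to the tokenizer in place of the hand-written prompt. Keeping the wording identical to the template used during fine-tuning is likely to matter, since the model was trained on a fixed instruction format.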

Safetensors

Model Size: 4B params
Tensor Type: F16
