BSVGK/Text_to_KG_Construction_Dataset
This is a LoRA fine-tuned version of Phi-3.5 Mini Instruct trained to extract structured RDF knowledge graph triples from raw UK government procurement contract text. The model was developed as part of a UEL–Depixen industrial placement research project focused on building trustworthy, hallucination-free domain-specific SLMs.
| Metric | Score |
|---|---|
| F1 Score | 0.9954 |
| BERTScore F1 | 0.9997 |
| Hallucination Rate | 0.00% |
| Test Contracts | 1,387 unseen contracts |
This model was evaluated using a novel dual-level hallucination evaluation framework. The evaluation showed that training loss alone is not a reliable quality signal for KG extraction tasks.
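The two evaluation levels are not spelled out in this card, so the following is only a minimal sketch of how triple-level scores of this kind could be computed, not the project's actual framework. It assumes triples are `(subject, predicate, object)` string tuples, that F1 is measured on exact matches after lowercasing, and that a triple counts as hallucinated when its object value does not appear in the source contract text. The function names `triple_f1` and `hallucination_rate` are hypothetical.

```python
from typing import Iterable, Set, Tuple

Triple = Tuple[str, str, str]


def _normalize(triple: Triple) -> Triple:
    # Assumption: matching is case-insensitive and ignores surrounding whitespace.
    s, p, o = triple
    return (s.strip().lower(), p.strip().lower(), o.strip().lower())


def triple_f1(predicted: Iterable[Triple], gold: Iterable[Triple]) -> float:
    """Set-based F1 over exact (normalized) triple matches."""
    pred: Set[Triple] = {_normalize(t) for t in predicted}
    ref: Set[Triple] = {_normalize(t) for t in gold}
    if not pred and not ref:
        return 1.0  # nothing to extract and nothing extracted
    true_positives = len(pred & ref)
    precision = true_positives / len(pred) if pred else 0.0
    recall = true_positives / len(ref) if ref else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def hallucination_rate(predicted: Iterable[Triple], source_text: str) -> float:
    """Share of predicted triples whose object is absent from the source text.

    Assumption: "hallucinated" means the object string cannot be found
    (case-insensitively) in the contract text the triple was extracted from.
    """
    pred = list(predicted)
    if not pred:
        return 0.0
    haystack = source_text.lower()
    unsupported = sum(1 for _, _, obj in pred if obj.strip().lower() not in haystack)
    return unsupported / len(pred)
```

A score like this depends only on the generated triples and the source text, which is why it can diverge from training loss: a model can reach a low loss while still emitting values that never occur in the contract.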
```python
from transformers import AutoTokenizer, AutoModelForCausalLM

# Load the merged checkpoint and its tokenizer from the Hugging Face Hub
tokenizer = AutoTokenizer.from_pretrained("BSVGK/phi35-mini-lora-text2kg-merged")
model = AutoModelForCausalLM.from_pretrained("BSVGK/phi35-mini-lora-text2kg-merged")

prompt = """Extract RDF triples from the following UK government contract text:
Contract: [paste your contract text here]
RDF Triples:"""

# Tokenize the prompt and generate up to 256 new tokens
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=256)

# The decoded string contains the prompt followed by the generated triples
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
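Because `generate` on a causal language model returns the prompt tokens followed by the new tokens, the decoded string in the snippet above begins with the prompt itself. A small helper can separate the generated part, sketched here under the assumption that the model emits one triple per line after the `RDF Triples:` marker. The helper name `extract_completion` and the triple format in the example are hypothetical; the card does not document the exact output syntax.

```python
from typing import List


def extract_completion(decoded: str, marker: str = "RDF Triples:") -> List[str]:
    """Return the non-empty lines that follow the last occurrence of `marker`.

    If the marker is missing, the whole string is treated as model output.
    """
    # rpartition yields ("", "", decoded) when the marker is not found,
    # so the third element is always the text to keep.
    tail = decoded.rpartition(marker)[2]
    return [line.strip() for line in tail.splitlines() if line.strip()]


# Example with a made-up decoded string (the triple syntax is illustrative only).
decoded_example = (
    "Extract RDF triples from the following UK government contract text:\n"
    "Contract: ...\n"
    "RDF Triples:\n"
    "(contract:1, hasSupplier, Acme Ltd)\n"
    "\n"
    "(contract:1, hasValue, 50000)\n"
)
for line in extract_completion(decoded_example):
    print(line)
```

Splitting on the last marker rather than the first guards against contract text that happens to contain the same phrase.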
Base model
microsoft/Phi-3.5-mini-instruct