Varun-Chowdary committed b033f53 (verified, parent eb471b1): Create README.md

# DeBERTa-v3-base Fine-Tuned for Hallucination Detection

## Model Details

- **Model Name:** DeBERTa-v3-base
- **Architecture:** DeBERTa (Decoding-enhanced BERT with disentangled attention)
- **Base Model:** DeBERTa-v3-base
- **Fine-Tuning Dataset:** PAWS (Paraphrase Adversaries from Word Scrambling)
- **Task:** Sentence-Pair Classification (Hallucination Detection)
## Model Description

This model is a fine-tuned version of DeBERTa-v3-base for detecting hallucinations between pairs of sentences. A hallucination in this context is a statement or piece of information present in one sentence that is not supported by, or is contradicted by, the other.
## Fine-Tuning Dataset

- **Dataset Name:** PAWS (Paraphrase Adversaries from Word Scrambling)
- **Description:** The PAWS dataset contains sentence pairs with high lexical overlap but different meanings, designed to challenge a model's understanding of semantic content.
- **Dataset:** https://huggingface.co/datasets/paws
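Each PAWS example is a labeled sentence pair. A minimal sketch of the record shape the fine-tuning consumes (the field names follow the PAWS dataset; the concrete pair and label below are illustrative, not taken from the dataset):

```python
# One PAWS-style record: two sentences plus a binary label.
# The concrete values here are illustrative stand-ins.
example = {
    "id": 1,
    "sentence1": "Maradona was born in Argentina, South America.",
    "sentence2": "Maradona was born in Brazil, South America.",
    "label": 0,  # in PAWS, 1 marks a true paraphrase and 0 a non-paraphrase
}

# The tokenizer later consumes the two sentences as a single pair input.
pair = (example["sentence1"], example["sentence2"])
print(pair)
```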
## Training Procedure

- **Number of Epochs:** 10
- **Hardware:** NVIDIA A100
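The card lists only the epoch count and hardware. As a rough illustration of the supervised fine-tuning loop, here is a toy version in which a linear classifier stands in for DeBERTa; the learning rate, batch, and data are invented for the sketch, and only the 10-epoch count comes from the card:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy stand-ins: random "sentence-pair" feature vectors and binary labels
features = torch.randn(64, 16)
labels = torch.randint(0, 2, (64,))  # 0 = no hallucination, 1 = hallucination

model = nn.Linear(16, 2)             # stands in for DeBERTa + classification head
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-2)
loss_fn = nn.CrossEntropyLoss()

first_loss = None
for epoch in range(10):              # 10 epochs, as in the card (one full batch per epoch here)
    optimizer.zero_grad()
    loss = loss_fn(model(features), labels)
    loss.backward()
    optimizer.step()
    if first_loss is None:
        first_loss = loss.item()

print(f"loss: {first_loss:.3f} -> {loss.item():.3f}")
```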
## Performance

- **Accuracy:** 94.88%
- **F1 Score:** 92.3%
- **Precision:** 92.82%
- **Recall:** 95.81%
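These are the four standard binary-classification metrics. As a reminder of how they are defined, a small sketch computing them from confusion-matrix counts (the counts are toy values for illustration, not the model's actual predictions):

```python
# Toy confusion-matrix counts (illustrative, not the model's real results)
tp, fp, fn, tn = 90, 7, 4, 99

accuracy = (tp + tn) / (tp + fp + fn + tn)
precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)  # harmonic mean of P and R

print(f"accuracy={accuracy:.4f} precision={precision:.4f} "
      f"recall={recall:.4f} f1={f1:.4f}")
```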
## Usage

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# Load the fine-tuned model and tokenizer
tokenizer = AutoTokenizer.from_pretrained("Varun-Chowdary/hallucination_detect")
model = AutoModelForSequenceClassification.from_pretrained("Varun-Chowdary/hallucination_detect")

# Define the sentence pair to check
sentence1 = "Maradona was born in Argentina, South America."
sentence2 = "Maradona was born in Brazil, South America."

# Tokenize the two sentences as a single pair input
inputs = tokenizer(sentence1, sentence2, return_tensors="pt", truncation=True, padding=True)

# Run inference without tracking gradients
with torch.no_grad():
    outputs = model(**inputs)
    logits = outputs.logits
    probabilities = torch.softmax(logits, dim=1)

# Map the predicted class index to a human-readable label
predicted_label = torch.argmax(probabilities, dim=1).item()
labels = ["No Hallucination", "Hallucination"]
print(f"Predicted label: {labels[predicted_label]}")
```