dataframer/hallucination-detection-model

bibekp committed on Apr 24, 2025

Commit 3b2be27 (verified) · Parent: 1bd4d34

Update README.md

Files changed (1): README.md (+79 -48)

@@ -138,68 +138,99 @@ pip install hdm2 --quiet
 Run the HDM-2 model
 ```python
-    # Load the model from HuggingFace into the GPU
-    from hdm2 import HallucinationDetectionModel
-    hdm_model = HallucinationDetectionModel()
-    prompt = "Explain how the heart functions"
-    context = """
-    The heart is a muscular organ that pumps blood throughout the body.
-    It has four chambers: two atria and two ventricles.
-    """
-    response = """The heart is a vital six-chambered organ that pumps blood throughout the human body.
-    It contains three atria and three ventricles that work in harmony to circulate blood.
-    The heart primarily runs on glucose for energy and typically beats at a rate of 20-30 beats per minute in adults.
-    Located in the center-left of the chest, the heart is protected by the ribcage.
-    The average human heart weighs about 5 pounds and will beat approximately 2 million times in a lifetime.
-    """
-    # Ground truth:
-    # Hearts have 4 chambers (not 6), have 2 atria and 2 ventricles (not 3 each),
-    # normal heart rate is 60-100 BPM (not 20-30),
-    # average heart weighs ~10 oz (not 5 pounds),
-    # and beats ~2.5 billion times (not 2 million) in a lifetime
-    # Detect hallucinations with default parameters
-    results = hdm_model.apply(prompt, context, response)
 ```
 Print the results
 ```python
-    # Utility function to help with printing the model output
-    def print_results(results):
-     #print(results)
-     # Print results
-     print(f"\nHallucination severity: {results['adjusted_hallucination_severity']:.4f}")
-     # Print hallucinated sentences
-     if results['candidate_sentences']:
-         print("\nPotentially hallucinated sentences:")
-         is_ck_hallucinated = False
-         for sentence_result in results['ck_results']:
-             if sentence_result['prediction'] == 1:  # 1 indicates hallucination
-                 print(f"- {sentence_result['text']} (Probability: {sentence_result['hallucination_probability']:.4f})")
-                 is_ck_hallucinated = True
-         if not is_ck_hallucinated:
-           print("No hallucinated sentences detected.")
-     else:
-         print("\nNo hallucinated sentences detected.")
-    print_results(results)
 ```
 ```
 OUTPUT:
-    Hallucination severity: 0.9844
-    Potentially hallucinated sentences:
-    - The heart is a vital six-chambered organ that pumps blood throughout the human body. (Probability: 0.9102)
-    - It contains three atria and three ventricles that work in harmony to circulate blood. (Probability: 1.0000)
-    - The heart primarily runs on glucose for energy and typically beats at a rate of 20-30 beats per minute in adults. (Probability: 0.9844)
 ```
 ### Model Description
 - Model ID: HDM-2-3B

 Run the HDM-2 model
 ```python
+# Load the model from HuggingFace into the GPU
+from hdm2 import HallucinationDetectionModel
+hdm_model = HallucinationDetectionModel()
+prompt = "You are an AIMon Bot. Give me an overview of the hospital's clinical trial enrollments for Q1 2025."
+context = """In Q1 2025, Northbridge Medical Center enrolled 573 patients across four major clinical trials.
+The Oncology Research Study (ORION-5) had the highest enrollment with 220 patients.
+Cardiology trials, specifically the CardioNext Study, saw 145 patients enrolled.
+Neurodegenerative research trials enrolled 88 participants.
+Orthopedic trials enrolled 120 participants for regenerative joint therapies.
+"""
+response = """Hi, I am AIMon Bot!
+I will be happy to help with an overview of the hospital's clinical trial enrollments for Q1 2025.
+Northbridge Medical Center enrolled 573 patients across major clinical trials in Q1 2025.
+Heart disease remains the leading cause of death globally, according to the World Health Organization.
+For more information about our clinical research programs, please contact the Northbridge Medical Center Research Office.
+Northbridge has consistently led regional trial enrollments since 2020, particularly in oncology and cardiac research.
+In Q1 2025, Northbridge's largest enrollment was in a neurology-focused trial with 500 patients studying advanced orthopedic devices.
+Can I help you with something else?
+"""
+# Ground truth:
+# The highest-enrollment study was the oncology study (ORION-5) with 220 patients, not a neurology-focused trial with 500.
+# The following sentence is not supported by the provided context and would need enterprise knowledge to verify: Northbridge has consistently led regional trial enrollments since 2020, particularly in oncology and cardiac research.
+# Detect hallucinations with default parameters
+results = hdm_model.apply(prompt, context, response)
 ```
 Print the results
 ```python
+# Utility function to help with printing the model output
+def print_results(results):
+    # Print the overall severity score
+    print(f"\nHallucination severity: {results['adjusted_hallucination_severity']:.4f}")
+    # Print hallucinated sentences
+    if results['candidate_sentences']:
+        print("\nPotentially hallucinated sentences:")
+        is_ck_hallucinated = False
+        for sentence_result in results['ck_results']:
+            if sentence_result['prediction'] == 1:  # 1 indicates hallucination
+                print(f"- {sentence_result['text']} (Probability: {sentence_result['hallucination_probability']:.4f})")
+                is_ck_hallucinated = True
+        if not is_ck_hallucinated:
+            print("No hallucinated sentences detected.")
+    else:
+        print("\nNo hallucinated sentences detected.")
+print_results(results)
 ```
 ```
 OUTPUT:
+Hallucination severity: 0.9531
+Potentially hallucinated sentences:
+- Northbridge has consistently led regional trial enrollments since 2020, particularly in oncology and cardiac research. (Probability: 0.9180)
+- In Q1 2025, Northbridge's largest enrollment was in a neurology-focused trial with 500 patients studying advanced orthopedic devices. (Probability: 1.0000)
 ```
+Notice that:
+- Innocuous statements such as *Hi, I am AIMon Bot!* and *Can I help you with something else?* are not marked as hallucinations.
+- Common-knowledge statements are filtered out by the common-knowledge checker even though they do not appear in the context, e.g., *Heart disease remains the leading cause of death globally, according to the World Health Organization.*
+- Statements that rely on enterprise knowledge cannot be handled by this model, so the sentence about Northbridge leading regional trial enrollments since 2020 is flagged. Please contact us if you need additional capabilities for your use case.
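The distinction between flagged sentences and sentences cleared by the common-knowledge checker can also be read off the result dictionary programmatically. The following is a minimal sketch, not part of the `hdm2` package: `split_candidates` is a hypothetical helper, and it assumes that every entry in `results['ck_results']` with a `prediction` other than 1 is a candidate sentence that the common-knowledge checker cleared. It runs on a hand-built stand-in for `results`, so it does not load the model; the probability given for the cleared sentence is a placeholder, not model output.

```python
from typing import Any


def split_candidates(results: dict[str, Any]) -> tuple[list[dict], list[dict]]:
    """Partition ck_results into (flagged, cleared) lists.

    Assumption: prediction == 1 means hallucination (as in print_results);
    any other value is treated as cleared by the common-knowledge checker.
    """
    flagged: list[dict] = []
    cleared: list[dict] = []
    for item in results.get("ck_results", []):
        if item["prediction"] == 1:
            flagged.append(item)
        else:
            cleared.append(item)
    return flagged, cleared


# Hand-built stand-in for the dictionary returned by hdm_model.apply(...).
# Only keys that appear in print_results are used here.
mock_results = {
    "adjusted_hallucination_severity": 0.9531,
    "ck_results": [
        {
            "text": "Heart disease remains the leading cause of death globally, "
                    "according to the World Health Organization.",
            "prediction": 0,
            "hallucination_probability": 0.05,  # placeholder, not model output
        },
        {
            "text": "Northbridge has consistently led regional trial enrollments "
                    "since 2020, particularly in oncology and cardiac research.",
            "prediction": 1,
            "hallucination_probability": 0.9180,
        },
        {
            "text": "In Q1 2025, Northbridge's largest enrollment was in a "
                    "neurology-focused trial with 500 patients studying advanced "
                    "orthopedic devices.",
            "prediction": 1,
            "hallucination_probability": 1.0,
        },
    ],
}

flagged, cleared = split_candidates(mock_results)
print(f"Flagged: {len(flagged)}, cleared by common-knowledge checker: {len(cleared)}")
```

With real model output, pass the `results` returned by `hdm_model.apply(prompt, context, response)` in place of `mock_results`.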
+To display word-level annotations, use the following code snippet.
+```python
+from hdm2.utils.render_utils import display_hallucination_results_words
+display_hallucination_results_words(
+    results,
+    show_scores=False, # True if you want to display scores alongside the candidate words
+    color_scheme="blue-red",
+    separate_classes=True, # False if you don't want separate colors for Common Knowledge sentences
+)
+```
+The word-level annotations are displayed as shown below.
+Darker tones indicate higher scores.
+Words with a red background are hallucinations.
+Words with a blue background were flagged as context-hallucinations but cleared by the common-knowledge checker.
+Words with a white background are problem-free.
+Finally, all candidate sentences (sentences that contain context-hallucinations) are listed at the bottom, together with their results from the common-knowledge checker.
+![image/png](https://cdn-uploads.huggingface.co/production/uploads/66b686e15ffbd1973ae61d01/raBYWT31RF-90NWA-zOcc.png)
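Where a rendered, colored view is not convenient (for example in logs or a plain terminal), a sentence-level text summary may be a useful stand-in. The sketch below is an illustration only and is not part of `hdm2`: `format_sentence_report` is a hypothetical helper that relies solely on the `text`, `prediction`, and `hallucination_probability` keys used by `print_results`, and it draws a simple bar in place of the color tone. It works at sentence level, not word level.

```python
def format_sentence_report(ck_results: list[dict], width: int = 20) -> str:
    """Return one line per candidate sentence, highest probability first.

    The bar length is proportional to hallucination_probability, as a
    text substitute for "darker color means higher score".
    """
    ordered = sorted(
        ck_results,
        key=lambda item: item["hallucination_probability"],
        reverse=True,
    )
    lines = []
    for item in ordered:
        prob = item["hallucination_probability"]
        bar = "#" * round(prob * width)
        # prediction == 1 indicates hallucination, as in print_results
        label = "HALLUCINATION" if item["prediction"] == 1 else "cleared"
        lines.append(f"[{bar:<{width}}] {prob:.4f} {label:<13} {item['text']}")
    return "\n".join(lines)


# Stand-in entries mirroring the two flagged sentences in the OUTPUT block.
example_ck_results = [
    {
        "text": "Northbridge has consistently led regional trial enrollments "
                "since 2020, particularly in oncology and cardiac research.",
        "prediction": 1,
        "hallucination_probability": 0.9180,
    },
    {
        "text": "In Q1 2025, Northbridge's largest enrollment was in a "
                "neurology-focused trial with 500 patients studying advanced "
                "orthopedic devices.",
        "prediction": 1,
        "hallucination_probability": 1.0,
    },
]

report = format_sentence_report(example_ck_results)
print(report)
```

Sorting by probability puts the most confident detections first, which tends to be the order a reviewer wants when triaging a long response.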
 ### Model Description
 - Model ID: HDM-2-3B