monkwarrior08 committed on
Commit f17dabe · verified · 1 Parent(s): ce712f2

Update README.md

Files changed (1)
  1. README.md +126 -3
README.md CHANGED
@@ -1,3 +1,126 @@
- ---
- license: mit
- ---
+ ---
+ language: en
+ license: apache-2.0
+ library_name: peft
+ tags:
+ - h2oai
+ - causal-lm
+ - text-generation
+ - adhd
+ - cpt-ii
+ - clinical-assistant
+ base_model: h2oai/h2o-danube3-500m-chat
+ ---
+
+ # ADHD CPT Analyst
+
+ ## Model Description
+
+ This model is a fine-tuned version of `h2oai/h2o-danube3-500m-chat`, adapted for analyzing and interpreting textual reports from the Conners' Continuous Performance Test II (CPT-II). It was trained with Low-Rank Adaptation (LoRA) on a dataset of CPT-II results to identify patterns relevant to the assessment of ADHD.
+
+ The model takes a textual summary of a patient's CPT-II scores as input and can provide an analysis, explanations of the metrics, and potential interpretations.
+
+ ## Intended Uses & Limitations
+
+ This model is intended as a research and educational tool. It can be used to:
+ - Assist researchers in analyzing patterns across large datasets of CPT-II reports.
+ - Help students and trainees learn about the metrics in a CPT-II report and their potential clinical significance.
+ - Provide a preliminary interpretation of a CPT-II report.
+
+ **Crucial Disclaimer:** This model is **not a medical device** and should **not** be used for self-diagnosis or as a substitute for professional medical advice, diagnosis, or treatment. Always consult a qualified healthcare provider about any health-related concerns.
+
+ ## How to Use
+
+ To use this model, load the base model (`h2oai/h2o-danube3-500m-chat`) and then apply the LoRA adapter.
+
+ ```python
+ import torch
+ from transformers import AutoTokenizer, AutoModelForCausalLM
+ from peft import PeftModel
+
+ # Model and repository parameters
+ base_model_id = "h2oai/h2o-danube3-500m-chat"
+ adapter_id = "monkwarrior08/adhd-cpt-analyst"
+
+ # Load tokenizer and base model
+ tokenizer = AutoTokenizer.from_pretrained(base_model_id)
+ base_model = AutoModelForCausalLM.from_pretrained(
+     base_model_id,
+     torch_dtype=torch.bfloat16,
+     device_map="auto",
+ )
+
+ # Load the LoRA adapter on top of the base model and switch to inference mode
+ model = PeftModel.from_pretrained(base_model, adapter_id)
+ model.eval()
+
+ # Create a prompt with a patient's data
+ patient_report = """
+ Patient ID: 3.0
+ Assessment Status: 3.0
+ Assessment Duration: 839999.0 seconds
+
+ CPT II Summary Report:
+ - Omissions:
+   - General T-Score: 78.75
+   - ADHD T-Score: 70.25
+   - Raw Score: 11.0
+ - Commissions:
+   - General T-Score: 65.98
+   - ADHD T-Score: 70.89
+   - Raw Score: 28.0
+ - Hit Reaction Time (HitRT):
+   - General T-Score: 36.57
+   - Mean Reaction Time: 325.20 ms
+ ADHD Confidence Index: 86.87
+ """
+
+ prompt = f"<|prompt|>Analyze this CPT-II report and summarize the findings for potential indicators of ADHD.:\n{patient_report}<|end|><|answer|>"
+
+ # Generate a response
+ inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
+ outputs = model.generate(
+     **inputs,
+     max_new_tokens=256,
+     eos_token_id=tokenizer.eos_token_id,
+     do_sample=True,
+     temperature=0.6,
+     top_p=0.9,
+ )
+
+ # Decode only the newly generated tokens, so the answer is extracted
+ # without depending on '<|answer|>' surviving skip_special_tokens
+ prompt_length = inputs["input_ids"].shape[1]
+ answer = tokenizer.decode(outputs[0][prompt_length:], skip_special_tokens=True).strip()
+ print(answer)
+ ```
+
+ ## Training Data
+
+ The model was fine-tuned on a private dataset derived from the `ADHD Diagnosis CPT II Data.csv` file. Each record was converted into a textual summary containing the following key metrics:
+ - Omissions (T-Scores, Raw Score)
+ - Commissions (T-Scores, Raw Score)
+ - Hit Reaction Time (T-Score, Mean)
+ - Variability of SE
+ - d'
+ - Beta
+ - Perseverations
+ - ADHD Confidence Index
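
The record-to-text conversion described above might look roughly like the following sketch. The dictionary keys (`patient_id`, `omissions_general_t`, and so on) and the `format_record` helper are hypothetical names chosen for illustration; the actual CSV headers and conversion script are not part of this README. The layout mirrors the sample report in the "How to Use" section and covers only the metrics shown there.

```python
# Minimal sketch: render one CPT-II record as a textual summary.
# All key names below are hypothetical; the real CSV column names are not documented here.
def format_record(record: dict) -> str:
    """Return a multi-line summary in the same layout as the sample report."""
    return "\n".join([
        f"Patient ID: {record['patient_id']}",
        "",
        "CPT II Summary Report:",
        "- Omissions:",
        f"  - General T-Score: {record['omissions_general_t']:.2f}",
        f"  - ADHD T-Score: {record['omissions_adhd_t']:.2f}",
        f"  - Raw Score: {record['omissions_raw']}",
        "- Commissions:",
        f"  - General T-Score: {record['commissions_general_t']:.2f}",
        f"  - ADHD T-Score: {record['commissions_adhd_t']:.2f}",
        f"  - Raw Score: {record['commissions_raw']}",
        "- Hit Reaction Time (HitRT):",
        f"  - General T-Score: {record['hit_rt_general_t']:.2f}",
        f"  - Mean Reaction Time: {record['hit_rt_mean_ms']:.2f} ms",
        f"ADHD Confidence Index: {record['adhd_confidence_index']:.2f}",
    ])


# Values copied from the sample report in the "How to Use" section.
example = {
    "patient_id": 3.0,
    "omissions_general_t": 78.75,
    "omissions_adhd_t": 70.25,
    "omissions_raw": 11.0,
    "commissions_general_t": 65.98,
    "commissions_adhd_t": 70.89,
    "commissions_raw": 28.0,
    "hit_rt_general_t": 36.57,
    "hit_rt_mean_ms": 325.20,
    "adhd_confidence_index": 86.87,
}

print(format_record(example))
```

A helper like this could be applied to each row read with `csv.DictReader` after converting the numeric fields to `float`; the remaining metrics in the list (Variability of SE, d', Beta, Perseverations) would follow the same pattern.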
+
+ ## Training Procedure
+
+ The model was trained using the `trl` library's `SFTTrainer` with a LoRA configuration. The primary goal was to teach the model how the CPT-II metrics relate to one another and why they matter in ADHD assessment.
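
Supervised fine-tuning of this kind typically consumes plain text examples that join a prompt and its target answer. The sketch below shows one way such examples could be assembled, reusing the `<|prompt|>…<|end|><|answer|>` layout from the "How to Use" section. The helper names, the `</s>` end-of-sequence string, and the placeholder texts are assumptions for illustration only; the actual preprocessing used for this model is not documented in this README.

```python
# Minimal sketch: build one supervised fine-tuning example as a single string.
# The helper names and the "</s>" end-of-sequence token are assumptions.
INSTRUCTION = (
    "Analyze this CPT-II report and summarize the findings "
    "for potential indicators of ADHD.:"
)


def build_prompt(report: str, instruction: str = INSTRUCTION) -> str:
    """Wrap a textual report in the prompt layout shown in 'How to Use'."""
    return f"<|prompt|>{instruction}\n{report}<|end|><|answer|>"


def build_training_text(report: str, analysis: str, eos_token: str = "</s>") -> str:
    """Concatenate prompt, target analysis, and an end-of-sequence marker."""
    return build_prompt(report) + analysis + eos_token


# Placeholder strings, not real data.
text = build_training_text(
    report="CPT II Summary Report: (placeholder)",
    analysis="Placeholder analysis text.",
)
print(text)
```

A list of such strings could then be wrapped in a dataset with a single text column and passed to a trainer; in practice `eos_token` would be taken from `tokenizer.eos_token` rather than hard-coded.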
+
+ ### BibTeX Citation
+
+ If you use this model in your research, please consider citing it:
+
+ ```bibtex
+ @software{monkwarrior08_2024_adhd_cpt_analyst,
+   author = {monkwarrior08},
+   title = {ADHD CPT Analyst: A Fine-tuned Language Model for CPT-II Report Interpretation},
+   month = {8},
+   year = {2024},
+   url = {https://huggingface.co/monkwarrior08/adhd-cpt-analyst}
+ }
+ ```