|
|
--- |
|
|
language: en |
|
|
license: apache-2.0 |
|
|
library_name: peft |
|
|
tags: |
|
|
- h2oai |
|
|
- causal-lm |
|
|
- text-generation |
|
|
- adhd |
|
|
- cpt-ii |
|
|
- clinical-assistant |
|
|
base_model: h2oai/h2o-danube3-500m-chat |
|
|
--- |
|
|
|
|
|
# ADHD CPT Analyst |
|
|
|
|
|
## Model Description |
|
|
|
|
|
This model is a fine-tuned version of `h2oai/h2o-danube3-500m-chat`, specifically adapted for analyzing and interpreting textual reports from the Conners' Continuous Performance Test II (CPT-II). It has been trained using Low-Rank Adaptation (LoRA) on a dataset of CPT-II results to identify patterns relevant to the assessment of ADHD. |
|
|
|
|
|
The model takes a textual summary of a patient's CPT-II scores from [ADHD Diagnosis Data](https://www.kaggle.com/datasets/arashnic/adhd-diagnosis-data) as input and can provide analysis, explanations of the metrics, and potential interpretations. |
|
|
|
|
|
## Intended Uses & Limitations |
|
|
|
|
|
This model is intended as a research and educational tool. It can be used to: |
|
|
- Assist researchers in analyzing patterns across large datasets of CPT-II reports. |
|
|
- Help students and trainees learn about the different metrics in a CPT-II report and their potential clinical significance. |
|
|
- Provide a preliminary interpretation of a CPT-II report. |
|
|
|
|
|
**Crucial Disclaimer:** This model is **not a medical device** and should **not** be used for self-diagnosis or as a substitute for professional medical advice, diagnosis, or treatment. Always consult with a qualified healthcare provider for any health-related concerns. |
|
|
|
|
|
## How to Use |
|
|
|
|
|
To use this model, you need to load the base model (`h2oai/h2o-danube3-500m-chat`) and then apply the LoRA adapter. |
|
|
|
|
|
```python |
|
|
import torch |
|
|
from transformers import AutoTokenizer, AutoModelForCausalLM |
|
|
from peft import PeftModel |
|
|
|
|
|
# Model and repository parameters |
|
|
base_model_id = "h2oai/h2o-danube3-500m-chat" |
|
|
adapter_id = "monkwarrior08/adhd-cpt-analyst" |
|
|
|
|
|
# Load tokenizer and base model |
|
|
tokenizer = AutoTokenizer.from_pretrained(base_model_id) |
|
|
base_model = AutoModelForCausalLM.from_pretrained( |
|
|
base_model_id, |
|
|
torch_dtype=torch.bfloat16, |
|
|
device_map="auto", |
|
|
) |
|
|
|
|
|
# Load the LoRA adapter |
|
|
model = PeftModel.from_pretrained(base_model, adapter_id) |
|
|
model.eval() |
|
|
|
|
|
# Create a prompt with a patient's data |
|
|
patient_report = """ |
|
|
Patient ID: 3.0 |
|
|
Assessment Status: 3.0 |
|
|
Assessment Duration: 839999.0 seconds |
|
|
|
|
|
CPT II Summary Report: |
|
|
- Omissions: |
|
|
- General T-Score: 78.75 |
|
|
- ADHD T-Score: 70.25 |
|
|
- Raw Score: 11.0 |
|
|
- Commissions: |
|
|
- General T-Score: 65.98 |
|
|
- ADHD T-Score: 70.89 |
|
|
- Raw Score: 28.0 |
|
|
- Hit Reaction Time (HitRT): |
|
|
- General T-Score: 36.57 |
|
|
- Mean Reaction Time: 325.20 ms |
|
|
ADHD Confidence Index: 86.87 |
|
|
""" |
|
|
|
|
|
prompt = f"<|prompt|>Analyze this CPT-II report and summarize the findings for potential indicators of ADHD:\n{patient_report}<|end|><|answer|>" |
|
|
|
|
|
# Generate a response |
|
|
inputs = tokenizer(prompt, return_tensors="pt").to(model.device) |
|
|
outputs = model.generate( |
|
|
**inputs, |
|
|
max_new_tokens=256, |
|
|
eos_token_id=tokenizer.eos_token_id, |
|
|
do_sample=True, |
|
|
temperature=0.6, |
|
|
top_p=0.9, |
|
|
) |
|
|
|
|
|
# Decode only the newly generated tokens, skipping the prompt. |
# (Decoding the full sequence with skip_special_tokens=True would strip |
# the <|answer|> marker, so splitting on it is unreliable.) |
prompt_length = inputs["input_ids"].shape[1] |
answer = tokenizer.decode(outputs[0][prompt_length:], skip_special_tokens=True).strip() |
print(answer) |
|
|
``` |
|
|
|
|
|
## Training Data |
|
|
|
|
|
The model was fine-tuned on a private dataset derived from the `ADHD Diagnosis CPT II Data.csv` file. Each record was converted into a textual summary containing the following key metrics: |
|
|
- Omissions (T-Scores, Raw Score) |
|
|
- Commissions (T-Scores, Raw Score) |
|
|
- Hit Reaction Time (T-Score, Mean) |
|
|
- Variability of Standard Error (SE) |
|
|
- d' |
|
|
- Beta |
|
|
- Perseverations |
|
|
- ADHD Confidence Index |
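
The conversion from tabular records to training text can be sketched as follows. This is an illustrative reconstruction, not the exact preprocessing script: the dataset is private, so the column names (`omissions_t_general`, `adhd_confidence_index`, etc.) and the subset of metrics shown here are assumptions.

```python
# Hypothetical sketch: render one CSV record (as a dict) into the textual
# summary format shown in the usage example above. Column names are
# illustrative assumptions, not the actual headers of the private dataset.
def record_to_summary(record: dict) -> str:
    lines = [
        "CPT II Summary Report:",
        "- Omissions:",
        f"  - General T-Score: {record['omissions_t_general']:.2f}",
        f"  - ADHD T-Score: {record['omissions_t_adhd']:.2f}",
        f"  - Raw Score: {record['omissions_raw']:.1f}",
        "- Commissions:",
        f"  - General T-Score: {record['commissions_t_general']:.2f}",
        f"  - ADHD T-Score: {record['commissions_t_adhd']:.2f}",
        f"  - Raw Score: {record['commissions_raw']:.1f}",
        f"ADHD Confidence Index: {record['adhd_confidence_index']:.2f}",
    ]
    return "\n".join(lines)


example = {
    "omissions_t_general": 78.75,
    "omissions_t_adhd": 70.25,
    "omissions_raw": 11.0,
    "commissions_t_general": 65.98,
    "commissions_t_adhd": 70.89,
    "commissions_raw": 28.0,
    "adhd_confidence_index": 86.87,
}
print(record_to_summary(example))
```

Each summary string would then be paired with a target analysis to form a supervised fine-tuning example.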
|
|
|
|
|
## Training Procedure |
|
|
|
|
|
The model was trained using the `trl` library's `SFTTrainer` with a LoRA configuration. The primary goal was to teach the model to understand the relationship between the various CPT-II metrics and their relevance in ADHD assessment. |
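
A configuration along the following lines would be typical for this setup. The actual hyperparameters used for this adapter are not published, so every value below (rank, alpha, target modules, learning rate, batch size) is an illustrative assumption, and `base_model` and `dataset` stand in for the loaded base model and the prepared text dataset; note that the `SFTTrainer` API has changed across `trl` versions.

```python
# Illustrative LoRA + SFT configuration sketch (hyperparameters assumed,
# not the published training recipe).
from peft import LoraConfig
from transformers import TrainingArguments
from trl import SFTTrainer

lora_config = LoraConfig(
    r=16,                 # low-rank dimension (assumed)
    lora_alpha=32,        # scaling factor (assumed)
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)

trainer = SFTTrainer(
    model=base_model,         # h2o-danube3-500m-chat, loaded as above
    train_dataset=dataset,    # textual CPT-II summaries with target analyses
    peft_config=lora_config,
    args=TrainingArguments(
        output_dir="adhd-cpt-analyst",
        per_device_train_batch_size=2,
        gradient_accumulation_steps=8,
        learning_rate=2e-4,
        num_train_epochs=3,
        bf16=True,
    ),
)
trainer.train()
```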
|
|
|
|
|
### BibTeX Citation |
|
|
|
|
|
If you use this model in your research, please consider citing it: |
|
|
|
|
|
```bibtex |
|
|
@software{monkwarrior08_2024_adhd_cpt_analyst, |
|
|
author = {monkwarrior08}, |
|
|
title = {ADHD CPT Analyst: A Fine-tuned Language Model for CPT-II Report Interpretation}, |
|
|
  month  = aug, |
|
|
year = {2024}, |
|
|
url = {https://huggingface.co/monkwarrior08/adhd-cpt-analyst} |
|
|
} |
|
|
``` |
|
|
|