---
language: en
license: apache-2.0
library_name: peft
tags:
- h2oai
- causal-lm
- text-generation
- adhd
- cpt-ii
- clinical-assistant
base_model: h2oai/h2o-danube3-500m-chat
---

# ADHD CPT Analyst

## Model Description

This model is a fine-tuned version of `h2oai/h2o-danube3-500m-chat`, specifically adapted for analyzing and interpreting textual reports from the Conners' Continuous Performance Test II (CPT-II). It has been trained using Low-Rank Adaptation (LoRA) on a dataset of CPT-II results to identify patterns relevant to the assessment of ADHD.

The model takes a textual summary of a patient's CPT-II scores from [ADHD Diagnosis Data](https://www.kaggle.com/datasets/arashnic/adhd-diagnosis-data) as input and can provide analysis, explanations of the metrics, and potential interpretations.

## Intended Uses & Limitations

This model is intended as a research and educational tool. It can be used to:

- Assist researchers in analyzing patterns across large datasets of CPT-II reports.
- Help students and trainees learn about the different metrics in a CPT-II report and their potential clinical significance.
- Provide a preliminary interpretation of a CPT-II report.

**Crucial Disclaimer:** This model is **not a medical device** and should **not** be used for self-diagnosis or as a substitute for professional medical advice, diagnosis, or treatment. Always consult with a qualified healthcare provider for any health-related concerns.

## How to Use

To use this model, you need to load the base model (`h2oai/h2o-danube3-500m-chat`) and then apply the LoRA adapter.

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel

# Model and repository parameters
base_model_id = "h2oai/h2o-danube3-500m-chat"
adapter_id = "monkwarrior08/adhd-cpt-analyst"

# Load tokenizer and base model
tokenizer = AutoTokenizer.from_pretrained(base_model_id)
base_model = AutoModelForCausalLM.from_pretrained(
    base_model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Load the LoRA adapter on top of the base model
model = PeftModel.from_pretrained(base_model, adapter_id)
model.eval()

# Create a prompt with a patient's data
patient_report = """
Patient ID: 3.0
Assessment Status: 3.0
Assessment Duration: 839999.0 seconds
CPT II Summary Report:
- Omissions:
  - General T-Score: 78.75
  - ADHD T-Score: 70.25
  - Raw Score: 11.0
- Commissions:
  - General T-Score: 65.98
  - ADHD T-Score: 70.89
  - Raw Score: 28.0
- Hit Reaction Time (HitRT):
  - General T-Score: 36.57
  - Mean Reaction Time: 325.20 ms
ADHD Confidence Index: 86.87
"""

# "\n" is a real newline here; a doubled backslash would insert a literal "\n" into the prompt
prompt = f"<|prompt|>Analyze this CPT-II report and summarize the findings for potential indicators of ADHD.:\n{patient_report}<|end|><|answer|>"

# Generate a response
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=256,
    eos_token_id=tokenizer.eos_token_id,
    do_sample=True,
    temperature=0.6,
    top_p=0.9,
)

# Decode only the newly generated tokens. Splitting the decoded text on
# '<|answer|>' can fail if that marker is stripped by skip_special_tokens=True.
prompt_length = inputs["input_ids"].shape[1]
answer = tokenizer.decode(outputs[0][prompt_length:], skip_special_tokens=True).strip()
print(answer)
```

## Training Data

The model was fine-tuned on a private dataset derived from the `ADHD Diagnosis CPT II Data.csv` file.
Each record was converted into a textual summary containing the following key metrics:

- Omissions (T-Scores, Raw Score)
- Commissions (T-Scores, Raw Score)
- Hit Reaction Time (T-Score, Mean)
- Variability of SE
- d'
- Beta
- Perseverations
- ADHD Confidence Index

## Training Procedure

The model was trained using the `trl` library's `SFTTrainer` with a LoRA configuration. The primary goal was to teach the model to understand the relationship between the various CPT-II metrics and their relevance in ADHD assessment.

### BibTeX Citation

If you use this model in your research, please consider citing it:

```bibtex
@software{monkwarrior08_2024_adhd_cpt_analyst,
  author = {monkwarrior08},
  title = {ADHD CPT Analyst: A Fine-tuned Language Model for CPT-II Report Interpretation},
  month = {8},
  year = {2024},
  url = {https://huggingface.co/monkwarrior08/adhd-cpt-analyst}
}
```
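
## Example: Formatting a CPT-II Record as Text

The Training Data section describes converting each CSV record into a textual summary. The sketch below shows one way such a conversion might look, producing text in the same layout as the example report under How to Use. It is not the preprocessing script used for this model: the function name `format_cpt_report` and every dictionary key (`patient_id`, `omissions_general_t`, and so on) are hypothetical and would need to be mapped to the actual column headers in the CSV. Only a subset of the listed metrics is included.

```python
from typing import Mapping


def format_cpt_report(record: Mapping[str, float]) -> str:
    """Render one CPT-II record as a plain-text summary for prompting.

    All keys are hypothetical placeholders; rename them to match the
    real column headers of the source CSV.
    """
    lines = [
        f"Patient ID: {record['patient_id']}",
        "CPT II Summary Report:",
        "- Omissions:",
        f"  - General T-Score: {record['omissions_general_t']:.2f}",
        f"  - ADHD T-Score: {record['omissions_adhd_t']:.2f}",
        f"  - Raw Score: {record['omissions_raw']:.1f}",
        "- Commissions:",
        f"  - General T-Score: {record['commissions_general_t']:.2f}",
        f"  - ADHD T-Score: {record['commissions_adhd_t']:.2f}",
        f"  - Raw Score: {record['commissions_raw']:.1f}",
        "- Hit Reaction Time (HitRT):",
        f"  - General T-Score: {record['hit_rt_general_t']:.2f}",
        f"  - Mean Reaction Time: {record['hit_rt_mean_ms']:.2f} ms",
        f"ADHD Confidence Index: {record['adhd_confidence_index']:.2f}",
    ]
    return "\n".join(lines)


# Values copied from the example report in the How to Use section.
example_record = {
    "patient_id": 3.0,
    "omissions_general_t": 78.75,
    "omissions_adhd_t": 70.25,
    "omissions_raw": 11.0,
    "commissions_general_t": 65.98,
    "commissions_adhd_t": 70.89,
    "commissions_raw": 28.0,
    "hit_rt_general_t": 36.57,
    "hit_rt_mean_ms": 325.20,
    "adhd_confidence_index": 86.87,
}

report = format_cpt_report(example_record)
print(report)
```

The resulting `report` string can be passed as `patient_report` when building the prompt. Keeping the layout identical to the text the model saw during fine-tuning is likely to matter, so treat the exact wording and indentation here as assumptions to verify against the training data.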