toderian committed on
Commit
1562ea7
·
verified ·
1 Parent(s): 957b4f0

Add README.md

Files changed (1)
  1. README.md +127 -164
README.md CHANGED
@@ -1,215 +1,178 @@
1
  ---
 
2
  license: mit
3
  tags:
4
- - pytorch
5
- - tabular-classification
 
6
  - medical
7
  - autism
8
- - asd
9
- - neurodevelopmental
10
- - healthcare
11
- - binary-classification
12
  language:
13
  - en
14
  metrics:
15
- - recall
16
- - precision
17
- - f1
18
  - accuracy
 
19
  - roc_auc
20
- pipeline_tag: tabular-classification
21
- library_name: pytorch
22
  ---
23
 
24
- # Autism Spectrum Disorder (ASD) Detector - Simplified
25
-
26
- A lightweight PyTorch model for ASD detection using only **8 key clinical features** (capturing 84% of predictive power).
27
 
28
  ## Model Description
29
 
30
- This simplified model requires only 8 inputs instead of 33, making it practical for clinical screening. The features were selected based on Random Forest feature importance analysis.
31
 
32
- ### Input Features (8 total)
33
 
34
- | # | Feature | Type | Values |
35
- |---|---------|------|--------|
36
- | 1 | **developmental_milestones** | categorical | `N` (Normal), `G` (Global delay), `M` (Motor delay), `C` (Cognitive delay) |
37
- | 2 | **iq_dq** | numeric | IQ/DQ score (typically 20-150) |
38
- | 3 | **intellectual_disability** | categorical | `N` (None), `F70.0` (Mild), `F71` (Moderate), `F72` (Severe) |
39
- | 4 | **language_disorder** | categorical | `N` (No), `Y` (Yes) |
40
- | 5 | **language_development** | categorical | `N` (Normal), `delay` (Delayed), `A` (Absent) |
41
- | 6 | **dysmorphism** | categorical | `NO` (Absent), `Y` (Present) |
42
- | 7 | **behaviour_disorder** | categorical | `N` (No), `Y` (Yes) |
43
- | 8 | **neurological_exam** | categorical | `N` (Normal), or abnormal description text |
44
 
45
- ### Feature Importance
46
 
47
- | Feature | Importance | Cumulative |
48
- |---------|------------|------------|
49
- | Developmental milestones | 22.3% | 22.3% |
50
- | IQ/DQ | 17.7% | 40.0% |
51
- | Intellectual disability (ICD) | 12.7% | 52.7% |
52
- | Language disorder | 12.1% | 64.8% |
53
- | Language development | 10.5% | 75.3% |
54
- | Dysmorphism | 3.3% | 78.6% |
55
- | Behaviour disorder | 2.9% | 81.5% |
56
- | Neurological exam | 2.8% | **84.3%** |
57
 
58
- ## Performance Metrics
59
 
60
- | Metric | Value |
61
- |--------|-------|
62
- | **Recall (Sensitivity)** | 93.65% |
63
- | **Precision** | 100.00% |
64
- | **F1 Score** | 96.72% |
65
- | **Accuracy** | 95.18% |
66
- | **AUC-ROC** | 99.05% |
67
 
68
- ### Confusion Matrix (Test Set, n=83)
 
69
 
70
- | | Predicted Healthy | Predicted ASD |
71
- |--|-------------------|---------------|
72
- | **Actual Healthy** | 20 | 0 |
73
- | **Actual ASD** | 4 | 59 |
74
 
75
- ## How to Use
 
 
76
 
77
- ### Installation
78
 
79
- ```bash
80
- pip install torch scikit-learn joblib pandas
81
  ```
82
 
83
- ### Quick Start (TorchScript - Recommended)
84
 
85
- ```python
86
- import torch
87
- import joblib
 
 
88
 
89
- # Load model directly with PyTorch
90
- model = torch.jit.load('autism_detector_traced.pt')
91
- model.eval()
92
 
93
- # Load preprocessor
94
- preprocessor = joblib.load('preprocessor.joblib')
95
-
96
- # Prepare input (8 features)
97
- import pandas as pd
98
- patient = pd.DataFrame([{
99
- 'Developmental milestones- global delay (G), motor delay (M), cognitive delay (C)': 'N',
100
- 'IQ/DQ': 100,
101
- 'ICD': 'N',
102
- 'Language disorder Y= present, N=absent': 'N',
103
- 'Language development: delay, normal=N, absent=A': 'N',
104
- 'Dysmorphysm y=present, no=absent': 'NO',
105
- 'Behaviour disorder- agressivity, agitation, irascibility': 'N',
106
- 'Neurological Examination; N=normal, text = abnormal; free cell = examination not performed ???': 'N'
107
- }])
108
-
109
- # Preprocess and predict
110
- X = preprocessor.transform(patient)
111
- with torch.no_grad():
112
-     prob = model(torch.FloatTensor(X)).item()
113
 
114
- print(f"Probability of ASD: {prob:.2%}")
115
- print(f"Prediction: {'ASD' if prob > 0.5 else 'Healthy'}")
116
- ```
117
 
118
- ### Using the Inference Helper
119
 
120
- ```python
121
- from inference import ASDPredictor
122
-
123
- predictor = ASDPredictor('.')
124
- result = predictor.predict({
125
- 'developmental_milestones': 'N',
126
- 'iq_dq': 100,
127
- 'intellectual_disability': 'N',
128
- 'language_disorder': 'N',
129
- 'language_development': 'N',
130
- 'dysmorphism': 'NO',
131
- 'behaviour_disorder': 'N',
132
- 'neurological_exam': 'N'
133
- })
134
-
135
- print(f"Prediction: {result['prediction']}") # 'Healthy'
136
- print(f"Probability: {result['probability_asd']:.2%}") # ~31%
137
- ```
138
 
139
- ### Example: Child with Developmental Concerns
140
 
141
- ```python
142
- result = predictor.predict({
143
- 'developmental_milestones': 'G', # Global delay
144
- 'iq_dq': 55, # Below average
145
- 'intellectual_disability': 'F70.0', # Mild
146
- 'language_disorder': 'Y', # Yes
147
- 'language_development': 'delay', # Delayed
148
- 'dysmorphism': 'NO',
149
- 'behaviour_disorder': 'Y', # Yes
150
- 'neurological_exam': 'N'
151
- })
152
-
153
- print(f"Prediction: {result['prediction']}") # 'ASD'
154
- print(f"Probability: {result['probability_asd']:.2%}") # ~84%
155
- ```
156
 
157
- ## Model Architecture
158
-
159
- ```
160
- Input (8 features)
161
-
162
- Linear(8, 32) → BatchNorm → ReLU → Dropout(0.3)
163
-
164
- Linear(32, 16) → BatchNorm → ReLU → Dropout(0.3)
165
-
166
- Linear(16, 1) → Sigmoid
167
-
168
- Output (probability of ASD)
169
- ```
170
 
171
  ## Files
172
 
173
  | File | Description |
174
  |------|-------------|
175
- | `autism_detector_traced.pt` | **TorchScript model** - load with `torch.jit.load()` |
176
- | `autism_detector.pth` | PyTorch checkpoint (weights + config) |
177
- | `preprocessor.joblib` | Feature preprocessor |
178
- | `config.json` | Model configuration |
179
  | `model.py` | Model class definition |
180
- | `inference.py` | Inference helper script |
181
  | `requirements.txt` | Python dependencies |
182
 
183
- ## Intended Use
184
-
185
- - **Research**: Studying ASD detection patterns
186
- - **Education**: ML applications in healthcare
187
- - **Screening support**: Assisting (not replacing) clinical assessment
188
-
189
- ### Limitations
190
-
191
- 1. Trained on 415 samples (315 ASD, 100 healthy)
192
- 2. Healthy controls are synthetically generated
193
- 3. Should not be used for standalone diagnosis
194
- 4. Performance may vary across populations
195
-
196
- ## Ethical Considerations
197
-
198
- - This is a screening tool, not a diagnostic instrument
199
- - Must be used alongside professional clinical assessment
200
- - False negatives (4 in test set) may delay intervention
201
- - Model decisions should be reviewed by qualified clinicians
202
-
203
  ## Citation
204
 
205
  ```bibtex
206
- @misc{asd_detector_simplified_2024,
207
- title={Simplified ASD Detector: 8-Feature Model for Autism Screening},
208
  year={2024},
209
- publisher={HuggingFace}
 
210
  }
211
  ```
212
-
213
- ## License
214
-
215
- MIT License
 
1
  ---
2
+ library_name: pytorch
3
  license: mit
4
  tags:
5
+ - tabular
6
+ - structured-data
7
+ - binary-classification
8
  - medical
9
  - autism
10
+ - screening
11
  language:
12
  - en
13
  metrics:
14
  - accuracy
15
+ - f1
16
  - roc_auc
 
 
17
  ---
18
 
19
+ # Autism Spectrum Disorder Screening Model
 
 
20
 
21
  ## Model Description
22
 
23
+ A feedforward neural network for autism spectrum disorder (ASD) risk screening using 8 structured clinical input features.
24
 
25
+ **Important:** This is a screening tool, NOT a diagnostic instrument. Results must be interpreted by qualified healthcare professionals.
26
 
27
+ ## Intended Use
28
 
29
+ - **Primary use:** Clinical decision support for ASD screening
30
+ - **Users:** Healthcare professionals, clinical software systems
31
+ - **Out of scope:** Self-diagnosis, definitive diagnosis
32
+
33
+ ## Input Features
34
+
35
+ | Field | Type | Valid Values | Description |
36
+ |-------|------|--------------|-------------|
37
+ | `developmental_milestones` | categorical | `N`, `G`, `M`, `C` | Normal, Global delay, Motor delay, Cognitive delay |
38
+ | `iq_dq` | numeric | 20-150 | IQ or Developmental Quotient |
39
+ | `intellectual_disability` | categorical | `N`, `F70.0`, `F71`, `F72` | None, Mild, Moderate, Severe (ICD-10) |
40
+ | `language_disorder` | binary | `N`, `Y` | No / Yes |
41
+ | `language_development` | categorical | `N`, `delay`, `A` | Normal, Delayed, Absent |
42
+ | `dysmorphism` | binary | `NO`, `Y` | No / Yes |
43
+ | `behaviour_disorder` | binary | `N`, `Y` | No / Yes |
44
+ | `neurological_exam` | text | non-empty string | `N` for normal, or description |
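
The table above can be turned into a small pre-flight check before a record reaches the model. The following is a minimal sketch, not part of the repository: `validate_input`, `ALLOWED_VALUES`, and `IQ_DQ_RANGE` are hypothetical names, and treating the 20-150 range as inclusive is an assumption.

```python
# Hypothetical validator; the allowed values are copied from the Input Features table.
ALLOWED_VALUES = {
    "developmental_milestones": {"N", "G", "M", "C"},
    "intellectual_disability": {"N", "F70.0", "F71", "F72"},
    "language_disorder": {"N", "Y"},
    "language_development": {"N", "delay", "A"},
    "dysmorphism": {"NO", "Y"},
    "behaviour_disorder": {"N", "Y"},
}
IQ_DQ_RANGE = (20, 150)  # assumed inclusive on both ends


def validate_input(data):
    """Return a list of problems; an empty list means the record looks valid."""
    problems = []
    # Categorical and binary fields must be present and use a listed code.
    for field, allowed in ALLOWED_VALUES.items():
        if field not in data:
            problems.append(f"missing field: {field}")
        elif data[field] not in allowed:
            problems.append(f"{field}: {data[field]!r} not in {sorted(allowed)}")
    # The numeric field must parse as a number and fall inside the range.
    if "iq_dq" not in data:
        problems.append("missing field: iq_dq")
    else:
        try:
            iq = float(data["iq_dq"])
        except (TypeError, ValueError):
            problems.append("iq_dq: not a number")
        else:
            low, high = IQ_DQ_RANGE
            if not low <= iq <= high:
                problems.append(f"iq_dq: {iq} outside {low}-{high}")
    # The free-text field only needs to be a non-empty string.
    exam = data.get("neurological_exam")
    if not isinstance(exam, str) or not exam.strip():
        problems.append("neurological_exam: expected a non-empty string")
    return problems
```

Returning a list of problems instead of raising on the first one lets a calling system report every invalid field at once.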
45
+
46
+ ## Output
47
+
48
+ ```json
49
+ {
50
+ "prediction": "Healthy" | "ASD",
51
+ "probability": 0.0-1.0,
52
+ "risk_level": "low" | "medium" | "high"
53
+ }
54
+ ```
55
 
56
+ ### Risk Level Thresholds
57
+ - **Low:** probability < 0.4
58
+ - **Medium:** 0.4 ≤ probability < 0.7
59
+ - **High:** probability ≥ 0.7
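
As a sketch of how these thresholds could be applied to a model probability, the helper below maps a value in [0, 1] to a risk level. The function name `risk_level` is hypothetical, and the code assumes the medium band is closed on the left (0.4 ≤ probability < 0.7), which is the only reading that leaves no gap between the low and high bands.

```python
def risk_level(probability):
    """Map an ASD probability in [0, 1] to 'low', 'medium', or 'high'."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError(f"probability must be in [0, 1], got {probability}")
    if probability < 0.4:
        return "low"
    if probability < 0.7:  # assumed band: 0.4 <= probability < 0.7
        return "medium"
    return "high"
```

For example, `risk_level(0.4)` falls in the medium band and `risk_level(0.7)` in the high band.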
60
 
61
+ ## How to Use
62
 
63
+ ```python
64
+ import json
65
+ import torch
66
+ from pathlib import Path
67
+ from huggingface_hub import snapshot_download
 
 
68
 
69
+ # Download model
70
+ model_dir = Path(snapshot_download("toderian/autism-detector"))
71
 
72
+ # Load config
73
+ with open(model_dir / "preprocessor_config.json") as f:
74
+     preprocess_config = json.load(f)
 
75
 
76
+ # Load model
77
+ model = torch.jit.load(model_dir / "autism_detector_traced.pt")
78
+ model.eval()
79
 
80
+ # Preprocessing function
81
+ def preprocess(data, config):
82
+     features = []
83
+     for feature_name in config["feature_order"]:
84
+         if feature_name in config["categorical_features"]:
85
+             feat_config = config["categorical_features"][feature_name]
86
+             if feat_config["type"] == "text_binary":
87
+                 value = 0 if data[feature_name].upper() == feat_config["normal_value"] else 1
88
+             else:
89
+                 value = feat_config["mapping"][data[feature_name]]
90
+         else:
91
+             feat_config = config["numeric_features"][feature_name]
92
+             raw = float(data[feature_name])
93
+             value = (raw - feat_config["min"]) / (feat_config["max"] - feat_config["min"])
94
+         features.append(value)
95
+     return torch.tensor([features], dtype=torch.float32)
96
+
97
+ # Example inference
98
+ input_data = {
99
+ "developmental_milestones": "N",
100
+ "iq_dq": 85,
101
+ "intellectual_disability": "N",
102
+ "language_disorder": "N",
103
+ "language_development": "N",
104
+ "dysmorphism": "NO",
105
+ "behaviour_disorder": "N",
106
+ "neurological_exam": "N"
107
+ }
108
+
109
+ input_tensor = preprocess(input_data, preprocess_config)
110
+ with torch.no_grad():
111
+     output = model(input_tensor)
112
+     probs = torch.softmax(output, dim=-1)
113
+     asd_probability = probs[0, 1].item()
114
 
115
+ print(f"ASD Probability: {asd_probability:.2%}")
116
+ print(f"Prediction: {'ASD' if asd_probability > 0.5 else 'Healthy'}")
117
  ```
118
 
119
+ ## Training Details
120
 
121
+ - **Dataset:** 315 ASD patients + 100 healthy controls (415 total)
122
+ - **Preprocessing:** Min-max normalization for numeric, label encoding for categorical
123
+ - **Architecture:** Feedforward NN (input → 64 → 32 → 2)
124
+ - **Loss:** Cross-entropy
125
+ - **Optimizer:** Adam (lr=0.001)
126
 
127
+ ## Evaluation
 
 
128
 
129
+ | Metric | Value |
130
+ |--------|-------|
131
+ | Accuracy | 0.9759 |
132
+ | F1 Score | 0.9839 |
133
+ | ROC-AUC | 0.9913 |
134
+ | Sensitivity | 0.9683 |
135
+ | Specificity | 1.0000 |
136
 
137
+ ### Confusion Matrix (Test Set, n=83)
 
 
138
 
139
+ | | Predicted Healthy | Predicted ASD |
140
+ |--|-------------------|---------------|
141
+ | Actual Healthy | 20 | 0 |
142
+ | Actual ASD | 2 | 61 |
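
The threshold-based metrics in the Evaluation table can be re-derived from this confusion matrix. The sketch below is illustrative only, with the hypothetical helper name `metrics_from_confusion`. It treats ASD as the positive class. ROC-AUC is not included because it depends on the full set of predicted probabilities, not on the counts at a single threshold.

```python
def metrics_from_confusion(tn, fp, fn, tp):
    """Compute threshold-based metrics from binary confusion-matrix counts."""
    total = tn + fp + fn + tp
    sensitivity = tp / (tp + fn)  # recall on the ASD (positive) class
    specificity = tn / (tn + fp)  # recall on the healthy (negative) class
    precision = tp / (tp + fp)
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": 2 * precision * sensitivity / (precision + sensitivity),
    }


# Counts read from the table: 20 true negatives, 0 false positives,
# 2 false negatives, 61 true positives (n = 83).
metrics = metrics_from_confusion(tn=20, fp=0, fn=2, tp=61)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
# accuracy: 0.9759
# sensitivity: 0.9683
# specificity: 1.0000
# precision: 1.0000
# f1: 0.9839
```

The derived values agree with the reported accuracy, F1 score, sensitivity, and specificity.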
143
 
144
+ ## Limitations
145
 
146
+ - Trained on limited dataset (415 samples)
147
+ - Healthy controls are synthetically generated
148
+ - Not validated across diverse populations
149
+ - Screening tool only, not diagnostic
150
+ - Requires all 8 input fields
151
 
152
+ ## Ethical Considerations
153
 
154
+ - Results should always be reviewed by qualified professionals
155
+ - Should not be used as sole basis for clinical decisions
156
+ - Model performance may vary across different populations
157
+ - False negatives (2 in test set) may delay intervention
158
 
159
  ## Files
160
 
161
  | File | Description |
162
  |------|-------------|
163
+ | `autism_detector_traced.pt` | TorchScript model (load with `torch.jit.load()`) |
164
+ | `config.json` | Model architecture configuration |
165
+ | `preprocessor_config.json` | Feature preprocessing rules (JSON, no pickle) |
 
166
  | `model.py` | Model class definition |
 
167
  | `requirements.txt` | Python dependencies |
168
 
169
  ## Citation
170
 
171
  ```bibtex
172
+ @misc{asd_detector_2024,
173
+ title={Autism Spectrum Disorder Screening Model},
174
  year={2024},
175
+ publisher={HuggingFace},
176
+ url={https://huggingface.co/toderian/autism-detector}
177
  }
178
  ```