@@ -1,215 +1,178 @@
 ---
 license: mit
 tags:
-  - pytorch
-  - tabular-classification
   - medical
   - autism
-  - asd
-  - neurodevelopmental
-  - healthcare
-  - binary-classification
 language:
   - en
 metrics:
-  - recall
-  - precision
-  - f1
   - accuracy
   - roc_auc
-pipeline_tag: tabular-classification
-library_name: pytorch
 ---
-# Autism Spectrum Disorder (ASD) Detector - Simplified
-A lightweight PyTorch model for ASD detection using only **8 key clinical features** (capturing 84% of predictive power).
 ## Model Description
-This simplified model requires only 8 inputs instead of 33, making it practical for clinical screening. The features were selected based on Random Forest feature importance analysis.
-### Input Features (8 total)
-| # | Feature | Type | Values |
-|---|---------|------|--------|
-| 1 | **developmental_milestones** | categorical | `N` (Normal), `G` (Global delay), `M` (Motor delay), `C` (Cognitive delay) |
-| 2 | **iq_dq** | numeric | IQ/DQ score (typically 20-150) |
-| 3 | **intellectual_disability** | categorical | `N` (None), `F70.0` (Mild), `F71` (Moderate), `F72` (Severe) |
-| 4 | **language_disorder** | categorical | `N` (No), `Y` (Yes) |
-| 5 | **language_development** | categorical | `N` (Normal), `delay` (Delayed), `A` (Absent) |
-| 6 | **dysmorphism** | categorical | `NO` (Absent), `Y` (Present) |
-| 7 | **behaviour_disorder** | categorical | `N` (No), `Y` (Yes) |
-| 8 | **neurological_exam** | categorical | `N` (Normal), or abnormal description text |
-### Feature Importance
-| Feature | Importance | Cumulative |
-|---------|------------|------------|
-| Developmental milestones | 22.3% | 22.3% |
-| IQ/DQ | 17.7% | 40.0% |
-| Intellectual disability (ICD) | 12.7% | 52.7% |
-| Language disorder | 12.1% | 64.8% |
-| Language development | 10.5% | 75.3% |
-| Dysmorphism | 3.3% | 78.6% |
-| Behaviour disorder | 2.9% | 81.5% |
-| Neurological exam | 2.8% | **84.3%** |
-## Performance Metrics
-| Metric | Value |
-|--------|-------|
-| **Recall (Sensitivity)** | 93.65% |
-| **Precision** | 100.00% |
-| **F1 Score** | 96.72% |
-| **Accuracy** | 95.18% |
-| **AUC-ROC** | 99.05% |
-### Confusion Matrix (Test Set, n=83)
-|  | Predicted Healthy | Predicted ASD |
-|--|-------------------|---------------|
-| **Actual Healthy** | 20 | 0 |
-| **Actual ASD** | 4 | 59 |
-## How to Use
-### Installation
-```bash
-pip install torch scikit-learn joblib pandas
 ```
-### Quick Start (TorchScript - Recommended)
-```python
-import torch
-import joblib
-# Load model directly with PyTorch
-model = torch.jit.load('autism_detector_traced.pt')
-model.eval()
-# Load preprocessor
-preprocessor = joblib.load('preprocessor.joblib')
-# Prepare input (8 features)
-import pandas as pd
-patient = pd.DataFrame([{
-    'Developmental milestones- global delay (G), motor delay (M), cognitive delay (C)': 'N',
-    'IQ/DQ': 100,
-    'ICD': 'N',
-    'Language disorder Y= present, N=absent': 'N',
-    'Language development: delay, normal=N, absent=A': 'N',
-    'Dysmorphysm y=present, no=absent': 'NO',
-    'Behaviour disorder- agressivity, agitation, irascibility': 'N',
-    'Neurological Examination; N=normal, text = abnormal; free cell = examination not performed ???': 'N'
-}])
-# Preprocess and predict
-X = preprocessor.transform(patient)
-with torch.no_grad():
-    prob = model(torch.FloatTensor(X)).item()
-print(f"Probability of ASD: {prob:.2%}")
-print(f"Prediction: {'ASD' if prob > 0.5 else 'Healthy'}")
-```
-### Using the Inference Helper
-```python
-from inference import ASDPredictor
-predictor = ASDPredictor('.')
-result = predictor.predict({
-    'developmental_milestones': 'N',
-    'iq_dq': 100,
-    'intellectual_disability': 'N',
-    'language_disorder': 'N',
-    'language_development': 'N',
-    'dysmorphism': 'NO',
-    'behaviour_disorder': 'N',
-    'neurological_exam': 'N'
-})
-print(f"Prediction: {result['prediction']}")  # 'Healthy'
-print(f"Probability: {result['probability_asd']:.2%}")  # ~31%
-```
-### Example: Child with Developmental Concerns
-```python
-result = predictor.predict({
-    'developmental_milestones': 'G',   # Global delay
-    'iq_dq': 55,                        # Below average
-    'intellectual_disability': 'F70.0', # Mild
-    'language_disorder': 'Y',           # Yes
-    'language_development': 'delay',    # Delayed
-    'dysmorphism': 'NO',
-    'behaviour_disorder': 'Y',          # Yes
-    'neurological_exam': 'N'
-})
-print(f"Prediction: {result['prediction']}")  # 'ASD'
-print(f"Probability: {result['probability_asd']:.2%}")  # ~84%
-```
-## Model Architecture
-```
-Input (8 features)
-    ↓
-Linear(8, 32) → BatchNorm → ReLU → Dropout(0.3)
-    ↓
-Linear(32, 16) → BatchNorm → ReLU → Dropout(0.3)
-    ↓
-Linear(16, 1) → Sigmoid
-    ↓
-Output (probability of ASD)
-```
 ## Files
 | File | Description |
 |------|-------------|
-| `autism_detector_traced.pt` | **TorchScript model** - load with `torch.jit.load()` |
-| `autism_detector.pth` | PyTorch checkpoint (weights + config) |
-| `preprocessor.joblib` | Feature preprocessor |
-| `config.json` | Model configuration |
 | `model.py` | Model class definition |
-| `inference.py` | Inference helper script |
 | `requirements.txt` | Python dependencies |
-## Intended Use
-- **Research**: Studying ASD detection patterns
-- **Education**: ML applications in healthcare
-- **Screening support**: Assisting (not replacing) clinical assessment
-### Limitations
-1. Trained on 415 samples (315 ASD, 100 healthy)
-2. Healthy controls are synthetically generated
-3. Should not be used for standalone diagnosis
-4. Performance may vary across populations
-## Ethical Considerations
-- This is a screening tool, not a diagnostic instrument
-- Must be used alongside professional clinical assessment
-- False negatives (4 in test set) may delay intervention
-- Model decisions should be reviewed by qualified clinicians
 ## Citation
 ```bibtex
-@misc{asd_detector_simplified_2024,
-  title={Simplified ASD Detector: 8-Feature Model for Autism Screening},
   year={2024},
-  publisher={HuggingFace}
 }
 ```
-## License
-MIT License

 ---
+library_name: pytorch
 license: mit
 tags:
+  - tabular
+  - structured-data
+  - binary-classification
   - medical
   - autism
+  - screening
 language:
   - en
 metrics:
   - accuracy
+  - f1
   - roc_auc
 ---
+# Autism Spectrum Disorder Screening Model
 ## Model Description
+A feedforward neural network for autism spectrum disorder (ASD) risk screening using 8 structured clinical input features.
+**Important:** This is a screening tool, NOT a diagnostic instrument. Results must be interpreted by qualified healthcare professionals.
+## Intended Use
+- **Primary use:** Clinical decision support for ASD screening
+- **Users:** Healthcare professionals, clinical software systems
+- **Out of scope:** Self-diagnosis, definitive diagnosis
+## Input Features
+| Field | Type | Valid Values | Description |
+|-------|------|--------------|-------------|
+| `developmental_milestones` | categorical | `N`, `G`, `M`, `C` | Normal, Global delay, Motor delay, Cognitive delay |
+| `iq_dq` | numeric | 20-150 | IQ or Developmental Quotient |
+| `intellectual_disability` | categorical | `N`, `F70.0`, `F71`, `F72` | None, Mild, Moderate, Severe (ICD-10) |
+| `language_disorder` | binary | `N`, `Y` | No / Yes |
+| `language_development` | categorical | `N`, `delay`, `A` | Normal, Delayed, Absent |
+| `dysmorphism` | binary | `NO`, `Y` | No / Yes |
+| `behaviour_disorder` | binary | `N`, `Y` | No / Yes |
+| `neurological_exam` | text | non-empty string | `N` for normal, otherwise a free-text description of the abnormal findings |
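The table above can be turned into a simple pre-flight check before a record reaches the model. The following is a minimal sketch, not part of the repository: `validate_input`, `CATEGORICAL_VALUES`, `IQ_DQ_MIN`, and `IQ_DQ_MAX` are hypothetical names, and treating 20-150 as a hard bound on `iq_dq` is an assumption (the card lists it as the valid range without saying how out-of-range values are handled).

```python
# Allowed values copied from the Input Features table.
CATEGORICAL_VALUES = {
    "developmental_milestones": ("N", "G", "M", "C"),
    "intellectual_disability": ("N", "F70.0", "F71", "F72"),
    "language_disorder": ("N", "Y"),
    "language_development": ("N", "delay", "A"),
    "dysmorphism": ("NO", "Y"),
    "behaviour_disorder": ("N", "Y"),
}
# Assumption: the documented 20-150 range is enforced as a hard bound.
IQ_DQ_MIN, IQ_DQ_MAX = 20, 150


def validate_input(data: dict) -> list:
    """Return a list of problems; an empty list means the record looks valid."""
    errors = []

    # Categorical and binary fields must be present and use a listed code.
    for field, allowed in CATEGORICAL_VALUES.items():
        if field not in data:
            errors.append(f"missing field: {field}")
        elif data[field] not in allowed:
            errors.append(f"{field}: {data[field]!r} is not one of {list(allowed)}")

    # iq_dq must be numeric and inside the documented range.
    try:
        iq = float(data["iq_dq"])
    except KeyError:
        errors.append("missing field: iq_dq")
    except (TypeError, ValueError):
        errors.append("iq_dq: must be numeric")
    else:
        if not IQ_DQ_MIN <= iq <= IQ_DQ_MAX:
            errors.append(f"iq_dq: {iq} is outside {IQ_DQ_MIN}-{IQ_DQ_MAX}")

    # neurological_exam is free text but must not be empty.
    exam = data.get("neurological_exam")
    if not isinstance(exam, str) or not exam.strip():
        errors.append("neurological_exam: must be a non-empty string")

    return errors
```

Returning every problem at once, rather than raising on the first one, is a design choice that suits form-style clinical software where a user corrects several fields in one pass.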
+## Output
+```json
+{
+  "prediction": "Healthy" | "ASD",
+  "probability": 0.0-1.0,
+  "risk_level": "low" | "medium" | "high"
+}
+```
+### Risk Level Thresholds
+- **Low:** probability < 0.4
+- **Medium:** 0.4 ≤ probability < 0.7
+- **High:** probability ≥ 0.7
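The thresholds above, combined with the documented output fields, can be sketched as a small helper. This is an illustrative assumption rather than code from the repository: `risk_level` and `format_output` are hypothetical names, and the 0.5 cutoff for `prediction` is taken from the usage example in this card.

```python
def risk_level(probability: float) -> str:
    """Map an ASD probability to the low / medium / high bands listed above."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError(f"probability must be in [0, 1], got {probability}")
    if probability < 0.4:
        return "low"
    if probability < 0.7:
        return "medium"
    return "high"


def format_output(probability: float) -> dict:
    """Assemble a dictionary with the fields described in the Output section."""
    return {
        # Assumption: same 0.5 decision cutoff as the card's usage example.
        "prediction": "ASD" if probability > 0.5 else "Healthy",
        "probability": probability,
        "risk_level": risk_level(probability),
    }
```

Note that the two cutoffs are independent: a probability of 0.45 would be reported as `Healthy` with a `medium` risk level.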
+## How to Use
+```python
+import json
+import torch
+from pathlib import Path
+from huggingface_hub import snapshot_download
+# Download model
+model_dir = Path(snapshot_download("toderian/autism-detector"))
+# Load config
+with open(model_dir / "preprocessor_config.json") as f:
+    preprocess_config = json.load(f)
+# Load model
+model = torch.jit.load(model_dir / "autism_detector_traced.pt")
+model.eval()
+# Preprocessing function
+def preprocess(data, config):
+    features = []
+    for feature_name in config["feature_order"]:
+        if feature_name in config["categorical_features"]:
+            feat_config = config["categorical_features"][feature_name]
+            if feat_config["type"] == "text_binary":
+                value = 0 if data[feature_name].upper() == feat_config["normal_value"] else 1
+            else:
+                value = feat_config["mapping"][data[feature_name]]
+        else:
+            feat_config = config["numeric_features"][feature_name]
+            raw = float(data[feature_name])
+            value = (raw - feat_config["min"]) / (feat_config["max"] - feat_config["min"])
+        features.append(value)
+    return torch.tensor([features], dtype=torch.float32)
+# Example inference
+input_data = {
+    "developmental_milestones": "N",
+    "iq_dq": 85,
+    "intellectual_disability": "N",
+    "language_disorder": "N",
+    "language_development": "N",
+    "dysmorphism": "NO",
+    "behaviour_disorder": "N",
+    "neurological_exam": "N"
+}
+input_tensor = preprocess(input_data, preprocess_config)
+with torch.no_grad():
+    output = model(input_tensor)
+    probs = torch.softmax(output, dim=-1)
+    asd_probability = probs[0, 1].item()
+print(f"ASD Probability: {asd_probability:.2%}")
+print(f"Prediction: {'ASD' if asd_probability > 0.5 else 'Healthy'}")
 ```
+## Training Details
+- **Dataset:** 315 ASD patients + 100 synthetically generated healthy controls (415 total)
+- **Preprocessing:** Min-max normalization for numeric features, label encoding for categorical features
+- **Architecture:** Feedforward NN (8 inputs → 64 → 32 → 2 output logits)
+- **Loss:** Cross-entropy
+- **Optimizer:** Adam (lr=0.001)
+## Evaluation
+| Metric | Value |
+|--------|-------|
+| Accuracy | 0.9759 |
+| F1 Score | 0.9839 |
+| ROC-AUC | 0.9913 |
+| Sensitivity | 0.9683 |
+| Specificity | 1.0000 |
+### Confusion Matrix (Test Set, n=83)
+|  | Predicted Healthy | Predicted ASD |
+|--|-------------------|---------------|
+| Actual Healthy | 20 | 0 |
+| Actual ASD | 2 | 61 |
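As a consistency check, the threshold-based metrics in the Evaluation table can be recomputed from the confusion matrix counts. The sketch below uses only the four counts shown above, with ASD as the positive class. ROC-AUC is not included because it depends on the per-sample probabilities, which the card does not list.

```python
# Counts from the confusion matrix (positive class = ASD).
tn, fp = 20, 0   # actual Healthy row
fn, tp = 2, 61   # actual ASD row

total = tn + fp + fn + tp
accuracy = (tp + tn) / total
sensitivity = tp / (tp + fn)   # recall on ASD cases
specificity = tn / (tn + fp)   # recall on healthy cases
precision = tp / (tp + fp)
f1 = 2 * precision * sensitivity / (precision + sensitivity)

print(f"n           = {total}")            # 83
print(f"Accuracy    = {accuracy:.4f}")     # 0.9759
print(f"F1 Score    = {f1:.4f}")           # 0.9839
print(f"Sensitivity = {sensitivity:.4f}")  # 0.9683
print(f"Specificity = {specificity:.4f}")  # 1.0000
```

All four values agree with the reported table.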
+## Limitations
+- Trained on a limited dataset (415 samples)
+- Healthy controls are synthetically generated
+- Not validated across diverse populations
+- Screening tool only, not diagnostic
+- Requires all 8 input fields
+## Ethical Considerations
+- Results should always be reviewed by qualified professionals
+- Should not be used as the sole basis for clinical decisions
+- Model performance may vary across different populations
+- False negatives (2 in test set) may delay intervention
 ## Files
 | File | Description |
 |------|-------------|
+| `autism_detector_traced.pt` | TorchScript model (load with `torch.jit.load()`) |
+| `config.json` | Model architecture configuration |
+| `preprocessor_config.json` | Feature preprocessing rules (JSON, no pickle) |
 | `model.py` | Model class definition |
 | `requirements.txt` | Python dependencies |
 ## Citation
 ```bibtex
+@misc{asd_detector_2024,
+  title={Autism Spectrum Disorder Screening Model},
   year={2024},
+  publisher={HuggingFace},
+  url={https://huggingface.co/toderian/autism-detector}
 }
 ```