---
language:
- en
license: apache-2.0
library_name: transformers
tags:
- medical
- spinal-cord-injury
- healthcare
- disability
- accessibility
- fine-tuned
- lora
- mistral
base_model: teknium/OpenHermes-2.5-Mistral-7B
pipeline_tag: text-generation
widget:
- text: "What is autonomic dysreflexia?"
  example_title: "Medical Question"
- text: "How can I transfer from my wheelchair to a car?"
  example_title: "Daily Living"
- text: "What exercises are good for someone with paraplegia?"
  example_title: "Exercise & Rehabilitation"
model-index:
- name: sci-assistant
  results: []
---

# SCI Assistant - Spinal Cord Injury Specialized AI Assistant

An AI assistant fine-tuned for people with spinal cord injuries (SCI). The model is based on OpenHermes-2.5-Mistral-7B and was trained in two phases with LoRA (Low-Rank Adaptation) to give contextually appropriate, medically informed responses for the SCI community.

## Model Description

This model was fine-tuned using a two-phase training approach:
1. **Phase 1**: Domain pretraining on SCI-related medical texts and resources
2. **Phase 2**: Instruction tuning on conversational SCI-focused Q&A pairs

The model is intended to reflect the challenges, medical realities, and daily-life considerations of people living with spinal cord injuries.

## Training Summary

- **Base Model**: teknium/OpenHermes-2.5-Mistral-7B
- **Training Method**: QLoRA (4-bit quantization with LoRA adapters)
- **Training Data**: 119,116 total entries (35,779 domain text + 83,337 instruction pairs)
- **Hardware**: RTX 4070 Super (12GB VRAM)
- **Training Time**: ~20 hours total (Phase 1 + Phase 2)

## Usage

This repository contains both the LoRA adapter and the full merged model. Choose the option that works best for you:

### Option 1: Use the Full Merged Model (Recommended)
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load in half precision and place layers automatically (device_map needs `accelerate`)
model = AutoModelForCausalLM.from_pretrained(
    "basiphobe/sci-assistant",
    torch_dtype=torch.float16,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("basiphobe/sci-assistant")

# Example usage
prompt = "What are the signs of autonomic dysreflexia?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=200)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
```

### Option 2: Use the LoRA Adapter (Smaller Download)
```python
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel
import torch

# Load model
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,
)

base_model = AutoModelForCausalLM.from_pretrained(
    "teknium/OpenHermes-2.5-Mistral-7B",
    quantization_config=bnb_config,
    device_map="auto"
)

model = PeftModel.from_pretrained(base_model, "basiphobe/sci-assistant")
tokenizer = AutoTokenizer.from_pretrained("basiphobe/sci-assistant")

# Format prompt with SCI context
system_context = "You are a specialized medical assistant for people with spinal cord injuries. Your responses should always consider the unique needs, challenges, and medical realities of individuals living with SCI."

your_question = "What are the signs of autonomic dysreflexia?"
prompt = f"{system_context}\n\n### Instruction:\n{your_question}\n\n### Response:\n"

# Generate response (do_sample=True is needed for temperature to take effect)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=True, temperature=0.7)
# Decode only the newly generated tokens, skipping the prompt
response = tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)
```

## Files in this Repository

- **Full Merged Model**: Ready-to-use model files (`model-*.safetensors`, `config.json`, etc.)
- **LoRA Adapter**: Smaller adapter files (`adapter_model.safetensors`, `adapter_config.json`)
- **Tokenizer**: Shared tokenizer files for both options

## GGUF Format Models

This repository also includes GGUF format models for use with **llama.cpp**, **Ollama**, and other GGUF-compatible inference engines. The quantized variants listed below reduce file size and memory requirements compared with the F16 file.

### Available GGUF Models

| File | Size | Format | Use Case | RAM Required |
|------|------|--------|----------|--------------|
| `merged-sci-model.gguf` | 14GB | F16 | Maximum quality inference | ~16GB |
| `merged-sci-model-q6_k.gguf` | 5.6GB | Q6_K | High quality with good compression | ~8GB |
| `merged-sci-model-q5_k_m.gguf` | 4.8GB | Q5_K_M | Excellent quality/size balance | ~7GB |
| `merged-sci-model-q5_k_s.gguf` | 4.7GB | Q5_K_S | Good quality, slightly smaller | ~7GB |
| `merged-sci-model-q4_k_m.gguf` | 4.1GB | Q4_K_M | Balanced quality/performance | ~6GB |
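
As a rough aid for choosing among these files, the table can be encoded as data and queried by available memory. This is a minimal sketch: the `pick_gguf` helper is hypothetical, and the RAM figures are copied from the table rather than measured.

```python
from typing import Optional

# (filename, RAM required in GB), copied from the table above,
# ordered from highest to lowest quality.
GGUF_MODELS = [
    ("merged-sci-model.gguf", 16),
    ("merged-sci-model-q6_k.gguf", 8),
    ("merged-sci-model-q5_k_m.gguf", 7),
    ("merged-sci-model-q5_k_s.gguf", 7),
    ("merged-sci-model-q4_k_m.gguf", 6),
]


def pick_gguf(available_ram_gb: float) -> Optional[str]:
    """Return the highest-quality file whose listed RAM requirement fits, or None."""
    for filename, ram_gb in GGUF_MODELS:
        if ram_gb <= available_ram_gb:
            return filename
    return None


print(pick_gguf(8))  # merged-sci-model-q6_k.gguf
print(pick_gguf(4))  # None
```

The listed RAM figures are approximate, so leaving some headroom for the context window is advisable.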

### Usage with Ollama

**1. Download and create Modelfile:**
```bash
# Download the Q5_K_M model (recommended balance of quality/size)
wget https://huggingface.co/basiphobe/sci-assistant/resolve/main/merged-sci-model-q5_k_m.gguf

# Create Modelfile
cat > Modelfile << 'EOF'
FROM ./merged-sci-model-q5_k_m.gguf
TEMPLATE """<|im_start|>system
You are a specialized medical assistant for people with spinal cord injuries. Your responses should always consider the unique needs, challenges, and medical realities of individuals living with SCI.<|im_end|>
<|im_start|>user
{{ .Prompt }}<|im_end|>
<|im_start|>assistant
"""
PARAMETER stop "<|im_start|>"
PARAMETER stop "<|im_end|>"
PARAMETER temperature 0.7
PARAMETER top_p 0.9
EOF
```

**2. Create and run the model:**
```bash
ollama create sci-assistant -f Modelfile
ollama run sci-assistant "What are the signs of autonomic dysreflexia?"
```
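
Once the model has been created, it can also be queried from code. The following is a minimal sketch, assuming an Ollama server is running on its default local endpoint (`http://localhost:11434`) and that the model was created under the name `sci-assistant` as above. The helper names `build_payload` and `ask` are illustrative and not part of this repository.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint


def build_payload(question: str, model: str = "sci-assistant") -> dict:
    """Build a non-streaming request body for Ollama's /api/generate endpoint."""
    return {"model": model, "prompt": question, "stream": False}


def ask(question: str, url: str = OLLAMA_URL) -> str:
    """Send the question to a running Ollama server and return the generated text."""
    data = json.dumps(build_payload(question)).encode("utf-8")
    request = urllib.request.Request(
        url, data=data, headers={"Content-Type": "application/json"}
    )
    # Supplying a request body makes urllib issue a POST.
    with urllib.request.urlopen(request) as reply:
        return json.loads(reply.read())["response"]
```

With the server running, `ask("What are the signs of autonomic dysreflexia?")` should return the model's answer as a string. The system prompt and ChatML wrapping come from the Modelfile, so only the user question is sent.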

### Usage with llama.cpp

**1. Install and setup:**
```bash
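# Clone and build llama.cpp
# Note: newer llama.cpp releases build with CMake and name the CLI binary
# `llama-cli` instead of `main`; adjust the commands below if needed.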
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
make

# Download model
wget https://huggingface.co/basiphobe/sci-assistant/resolve/main/merged-sci-model-q5_k_m.gguf
```

**2. Interactive chat:**
```bash
# -e tells llama.cpp to process the \n escapes in the prefix/suffix strings
./main -m merged-sci-model-q5_k_m.gguf \
  -e \
  --temp 0.7 \
  --repeat_penalty 1.1 \
  -c 4096 \
  --interactive \
  --in-prefix "<|im_start|>user\n" \
  --in-suffix "<|im_end|>\n<|im_start|>assistant\n"
```

**3. Single prompt:**
```bash
# -e tells llama.cpp to process the \n escapes in the prompt string
./main -m merged-sci-model-q5_k_m.gguf \
  -e \
  --temp 0.7 \
  -c 2048 \
  -p "<|im_start|>system\nYou are a specialized medical assistant for people with spinal cord injuries.<|im_end|>\n<|im_start|>user\nWhat exercises are good for someone with paraplegia?<|im_end|>\n<|im_start|>assistant\n"
```

### Performance Comparison

- **F16 Model** (`merged-sci-model.gguf`): Maximum quality, largest memory footprint
- **Q6_K Model** (`merged-sci-model-q6_k.gguf`): Near-maximum quality with 60% size reduction
- **Q5_K_M Model** (`merged-sci-model-q5_k_m.gguf`): Excellent quality retention, good balance
- **Q5_K_S Model** (`merged-sci-model-q5_k_s.gguf`): Very good quality, slightly more compressed
- **Q4_K_M Model** (`merged-sci-model-q4_k_m.gguf`): Good quality, smallest size, recommended for resource-constrained environments

All models use the **ChatML** template format and support up to **32K context length**.
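
For callers that assemble prompts by hand, the ChatML layout used in the Ollama and llama.cpp examples can be sketched as a small helper. The function name `build_chatml_prompt` is illustrative; the system prompt text is the one used elsewhere in this card.

```python
SYSTEM_PROMPT = (
    "You are a specialized medical assistant for people with spinal cord injuries. "
    "Your responses should always consider the unique needs, challenges, and "
    "medical realities of individuals living with SCI."
)


def build_chatml_prompt(user_message: str, system_prompt: str = SYSTEM_PROMPT) -> str:
    """Wrap a system prompt and one user turn in ChatML markers.

    The result ends with an open assistant turn for the model to complete.
    """
    return (
        f"<|im_start|>system\n{system_prompt}<|im_end|>\n"
        f"<|im_start|>user\n{user_message}<|im_end|>\n"
        "<|im_start|>assistant\n"
    )


print(build_chatml_prompt("What is autonomic dysreflexia?"))
```

Generation should stop on `<|im_end|>`, as in the `PARAMETER stop` lines of the Modelfile.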

## Intended Use

This model is designed to:
- Provide SCI-specific information and guidance
- Answer questions about daily life with spinal cord injuries
- Offer practical advice for common SCI challenges
- Support the SCI community with contextually appropriate responses

## Limitations

- This model is for informational purposes only and should not replace professional medical advice
- Always consult with healthcare providers for medical decisions
- The model may not have information about the latest medical developments
- Responses should be verified with medical professionals when making health-related decisions

## Uses

### Direct Use

This model can be used directly for:
- Educational purposes about spinal cord injuries
- Providing general information and support to the SCI community
- Research into specialized medical AI assistants
- Personal use by individuals seeking SCI-related information

The model is designed to provide contextually appropriate responses that consider the unique challenges and medical realities of spinal cord injuries.

### Downstream Use

This model can be fine-tuned further for:
- Integration into healthcare applications
- Specialized medical chatbots for rehabilitation centers
- Educational platforms for SCI awareness and training
- Research applications in medical AI
- Custom applications for SCI support organizations

When used in downstream applications, implementers should:
- Maintain the medical disclaimer requirements
- Ensure proper supervision by medical professionals
- Implement appropriate safety measures and content filtering
- Validate outputs for medical accuracy in their specific use case

### Out-of-Scope Use

This model should NOT be used for:
- **Medical diagnosis or treatment decisions** - Always consult healthcare professionals
- **Emergency medical situations** - Seek immediate professional medical help
- **Legal or financial advice** related to SCI cases
- **Replacement for professional medical consultation**
- **Clinical decision-making** without physician oversight
- **Applications targeting vulnerable populations** without proper safeguards
- **Commercial medical applications** without appropriate medical validation and oversight

## Bias, Risks, and Limitations

### Medical Limitations
- **Not a substitute for medical professionals**: All medical advice should be verified with qualified healthcare providers
- **Training data limitations**: May not include the most recent medical research or treatments
- **Individual variation**: SCI affects individuals differently; responses may not apply to all cases
- **Geographic bias**: Training data may be biased toward certain healthcare systems or regions

### Technical Limitations
- **Hallucination risk**: Like all language models, may generate plausible-sounding but incorrect information
- **Context limitations**: Limited by input context window and may not retain information across long conversations
- **Language limitations**: Primarily trained on English content
- **Update lag**: Cannot access real-time medical research or current events

### Bias Considerations
- **Training data bias**: Reflects biases present in source medical literature and online content
- **Demographic representation**: May not equally represent all demographics within the SCI community
- **Healthcare access bias**: May reflect biases toward certain types of healthcare systems
- **Severity bias**: May be more informed about certain types or severities of SCI

### Risk Mitigation
- Always include medical disclaimers when using this model
- Implement content filtering for harmful or dangerous advice
- Regular evaluation by medical professionals is recommended
- Monitor outputs for accuracy and appropriateness
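
The first two mitigations can be sketched as a thin post-processing layer around the model's output. This is an illustrative sketch only: `wrap_response`, the disclaimer wording, and the trigger phrases are hypothetical, are not part of this repository, and have not been clinically reviewed.

```python
DISCLAIMER = (
    "This information is for educational purposes only and does not replace "
    "professional medical advice."
)

# Hypothetical, non-exhaustive trigger phrases; a real deployment needs clinical review.
URGENT_TERMS = ("emergency", "chest pain", "can't breathe")


def wrap_response(question: str, model_answer: str) -> str:
    """Append a disclaimer, and prepend an urgent-care notice if the question looks urgent."""
    parts = []
    if any(term in question.lower() for term in URGENT_TERMS):
        parts.append("If this is an emergency, seek immediate professional medical help.")
    parts.append(model_answer.strip())
    parts.append(DISCLAIMER)
    return "\n\n".join(parts)


print(wrap_response("How can I transfer from my wheelchair to a car?", "Example answer."))
```

Keyword matching of this kind is a crude first filter; it does not replace review of outputs by medical professionals.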

## Recommendations

Users should be aware of the following recommendations:

**For Direct Users:**
- Always verify medical information with qualified healthcare professionals
- Use responses as educational/informational starting points, not definitive advice
- Be aware that individual SCI experiences vary significantly
- Seek immediate professional help for urgent medical concerns

**For Developers/Implementers:**
- Implement clear medical disclaimers in any application using this model
- Provide easy access to professional medical resources alongside model responses
- Consider implementing content filtering for potentially harmful advice
- Regular review by medical professionals is strongly recommended
- Ensure compliance with relevant healthcare regulations (HIPAA, etc.)

**For Healthcare Organizations:**
- Professional medical oversight is essential when implementing in clinical settings
- Regular validation of model outputs against current medical standards
- Integration should complement, not replace, professional medical consultation
- Staff training on AI limitations and appropriate use cases

## Training Details

### Training Data

The training dataset consisted of 119,116 carefully curated entries focused on spinal cord injury information:

**Domain Pretraining Data (35,779 entries):**
- Medical literature and research papers on SCI
- Educational materials from reputable SCI organizations
- Clinical guidelines and treatment protocols
- Rehabilitation and therapy documentation
- Patient education resources

**Instruction Tuning Data (83,337 entries):**
- SCI-focused question-answer pairs
- Conversational examples with appropriate medical context
- Real-world scenarios and practical advice situations
- Educational Q&A formatted for instruction following

All training data was filtered and curated to ensure:
- Sources from reputable medical organizations and healthcare professionals
- Content originally created or reviewed by medical professionals in the SCI field
- Appropriate tone and sensitivity for SCI community
- Removal of potentially harmful or dangerous advice
- Proper medical disclaimers and context

**Note**: While the source materials were created by medical professionals, this model itself has not undergone independent medical validation.

### Training Procedure

The model was trained using a two-phase approach with QLoRA (Quantized Low-Rank Adaptation):

**Phase 1 - Domain Pretraining:**
- Focus: Medical terminology and SCI-specific knowledge
- Duration: 2 epochs (~8 hours)
- Data: 35,779 domain text entries
- Objective: Adapt base model to SCI medical domain

**Phase 2 - Instruction Tuning:**
- Focus: Conversational abilities and response formatting
- Duration: 2 epochs (~12 hours)
- Data: 83,337 instruction-response pairs
- Objective: Teach appropriate response patterns and tone

#### Preprocessing

Training data underwent extensive preprocessing:
- Content sourced from materials created by healthcare professionals
- Sensitive content filtering and safety checks
- Standardized formatting for instruction-following
- Quality filtering to remove low-quality or inappropriate content
- Tokenization optimization for efficient training

#### Training Hyperparameters

- **Training regime:** 4-bit quantization with LoRA adapters (QLoRA)
- **Learning rate:** 2e-4 with cosine scheduling
- **LoRA rank:** 16
- **LoRA alpha:** 32
- **LoRA dropout:** 0.05
- **Target modules:** q_proj, v_proj
- **Batch size:** 4 with gradient accumulation
- **Max sequence length:** 512 tokens
- **Optimizer:** AdamW with weight decay
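
To make the LoRA settings above concrete, the number of trainable adapter parameters can be estimated with a few lines of arithmetic. This is a sketch under stated assumptions: the Mistral 7B dimensions (hidden size 4096, 32 layers, 8 key/value heads of dimension 128) come from the published base architecture rather than from this card, and the adapter is assumed to be stored at 4 bytes per parameter.

```python
# Mistral 7B dimensions (assumed from the published architecture, not from this card)
HIDDEN_SIZE = 4096
NUM_LAYERS = 32
KV_DIM = 8 * 128  # 8 key/value heads x 128 head dimension = 1024

# Values from the hyperparameter list above
RANK = 16
ALPHA = 32


def lora_params(in_features: int, out_features: int, rank: int) -> int:
    """Parameters in one LoRA pair: A is (rank x in_features), B is (out_features x rank)."""
    return rank * in_features + out_features * rank


# Adapters on q_proj (4096 -> 4096) and v_proj (4096 -> 1024) in every layer
per_layer = lora_params(HIDDEN_SIZE, HIDDEN_SIZE, RANK) + lora_params(HIDDEN_SIZE, KV_DIM, RANK)
total = per_layer * NUM_LAYERS

print(f"trainable LoRA parameters: {total:,}")             # 6,815,744
print(f"size at 4 bytes/param: {total * 4 / 1e6:.1f} MB")  # 27.3 MB
print(f"scaling factor alpha/r: {ALPHA / RANK}")           # 2.0
```

Under these assumptions the adapter holds about 6.8M parameters, roughly 0.1% of the 7B base model, which is in line with the adapter size of about 30MB reported in this card.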

#### Speeds, Sizes, Times

- **Total training time:** ~20 hours (8h Phase 1 + 12h Phase 2)
- **Hardware:** RTX 4070 Super (12GB VRAM)
- **Final model size:** 30MB (LoRA adapter only)
- **Base model size:** 7B parameters (not included in adapter)
- **Training throughput:** ~3.5 samples/second average
- **Memory usage:** 6-7GB VRAM during training
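
As a sanity check, throughput can be re-derived from the entry counts, epochs, and durations listed in the Training Procedure section. This sketch assumes the whole wall-clock time of each phase was spent on training steps, which is why it may differ slightly from the reported average.

```python
# Figures taken from the Training Procedure section of this card
PHASES = {
    "phase 1": {"entries": 35_779, "epochs": 2, "hours": 8},
    "phase 2": {"entries": 83_337, "epochs": 2, "hours": 12},
}


def samples_per_second(entries: int, epochs: int, hours: float) -> float:
    """Samples processed per second, assuming all wall-clock time was training."""
    return entries * epochs / (hours * 3600)


for name, phase in PHASES.items():
    print(f"{name}: {samples_per_second(**phase):.2f} samples/s")

total_samples = sum(p["entries"] * p["epochs"] for p in PHASES.values())
total_hours = sum(p["hours"] for p in PHASES.values())
overall = total_samples / (total_hours * 3600)
print(f"overall: {overall:.2f} samples/s")
```

This gives about 2.5 samples/s for Phase 1, 3.9 for Phase 2, and 3.3 overall, close to the reported figure given that the hour counts are rounded.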

## Evaluation

### Testing Data, Factors & Metrics

#### Testing Data

The model was evaluated using:
- Held-out test set of SCI-related questions (500 samples)
- Manual review of response quality and appropriateness
- Comparative analysis against general-purpose models on SCI topics
- Assessment of domain-specific knowledge retention

**Note**: Evaluation was conducted by the model developer, not independent medical professionals.

#### Factors

Evaluation considered multiple factors:
- **Medical accuracy**: Correctness of SCI-related information
- **Appropriateness**: Sensitivity and tone for SCI community
- **Contextual relevance**: Understanding of SCI-specific challenges
- **Safety**: Avoidance of harmful or dangerous advice
- **Completeness**: Comprehensive responses to complex questions

#### Metrics

- **Medical accuracy score**: Based on consistency with source medical literature (not independently validated)
- **Appropriateness rating**: Developer assessment of tone and sensitivity (4.2/5.0 subjective rating)
- **Response relevance**: SCI-specific context understanding (82% relevance score)
- **Safety compliance**: No obviously harmful medical advice detected in test samples
- **Response quality**: Perplexity improvements over base model for SCI domain

### Results

**Quantitative Results:**
- 40% lower perplexity on SCI-domain text compared with the base model
- Responses demonstrate consistency with source medical literature
- 95% safety compliance (no obviously harmful medical advice detected)
- 82% average relevance score for SCI-specific contexts
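
For readers unfamiliar with the perplexity figure, the sketch below shows how perplexity and a relative improvement are commonly computed. It assumes the reported percentage is a relative reduction against the base model; the evaluation script itself is not part of this card, and the numbers in the example are hypothetical, not measurements.

```python
import math
from typing import Sequence


def perplexity(token_log_probs: Sequence[float]) -> float:
    """Perplexity is exp of the mean negative log-likelihood (natural log) per token."""
    return math.exp(-sum(token_log_probs) / len(token_log_probs))


def relative_improvement(base_ppl: float, tuned_ppl: float) -> float:
    """Fractional perplexity reduction of the tuned model relative to the base model."""
    return (base_ppl - tuned_ppl) / base_ppl


# Hypothetical values for illustration only; not measurements from this model.
print(relative_improvement(base_ppl=10.0, tuned_ppl=6.0))  # 0.4
```

Lower perplexity on held-out domain text indicates better next-token prediction there; it does not by itself establish medical accuracy.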

**Qualitative Results:**
- Responses demonstrate clear understanding of SCI terminology and concepts
- Appropriate tone and sensitivity for disability community
- Consistent inclusion of medical disclaimers
- Good balance between being helpful and cautious about medical advice

**Limitations of Evaluation:**
- Evaluation conducted by model developer, not independent medical experts
- No formal clinical validation or testing with SCI patients
- Results based on consistency with training sources, not independent medical verification

## Environmental Impact

Training carbon emissions were estimated from energy consumption data:

- **Hardware Type:** RTX 4070 Super (12GB VRAM)
- **Hours used:** ~20 hours total training time
- **Cloud Provider:** Local training (personal hardware)
- **Compute Region:** North America
- **Carbon Emitted:** Approximately 2.1 kg CO2eq (estimated based on local energy grid)

The use of QLoRA significantly reduced training time and energy consumption compared to full fine-tuning methods, making this a relatively efficient training approach.

## Technical Specifications

### Model Architecture and Objective

- **Base Architecture:** Mistral 7B transformer model
- **Adaptation Method:** QLoRA (Quantized Low-Rank Adaptation)
- **Objective:** Causal language modeling with SCI domain specialization
- **Quantization:** 4-bit precision for memory efficiency
- **LoRA Configuration:** Rank-16 adapters on attention projection layers

### Compute Infrastructure

#### Hardware

- **GPU:** NVIDIA RTX 4070 Super (12GB VRAM)
- **CPU:** Modern multi-core processor
- **RAM:** 32GB system memory
- **Storage:** NVMe SSD for fast data loading

#### Software

- **Framework:** Transformers 4.36+, PEFT 0.16.0
- **Training:** QLoRA with bitsandbytes quantization
- **Environment:** Python 3.10+, PyTorch 2.0+, CUDA 12.1

## Citation

If you use this model in your research or applications, please cite:

**BibTeX:**
```bibtex
@misc{sci_assistant_2025,
  title={SCI Assistant: A Specialized AI Assistant for Spinal Cord Injury Support},
  author={basiphobe},
  year={2025},
  howpublished={Hugging Face Model Repository},
  url={https://huggingface.co/basiphobe/sci-assistant}
}
```

**APA:**
basiphobe. (2025). *SCI Assistant: A Specialized AI Assistant for Spinal Cord Injury Support*. Hugging Face. https://huggingface.co/basiphobe/sci-assistant

## Glossary

**SCI**: Spinal Cord Injury - damage to the spinal cord that results in temporary or permanent changes in function

**QLoRA**: Quantized Low-Rank Adaptation - an efficient fine-tuning method that reduces memory requirements

**Domain Pretraining**: Training phase focused on learning domain-specific terminology and knowledge

**Instruction Tuning**: Training phase focused on learning conversational patterns and response formatting

**Perplexity**: A metric measuring how well a language model predicts text (lower is better)

**LoRA**: Low-Rank Adaptation - parameter-efficient fine-tuning technique

## Model Card Authors

- **Primary Author:** basiphobe
- **Model Development:** Individual research project for SCI community support
- **Data Sources:** Curated from medical literature and educational materials created by healthcare professionals
- **Validation Status:** Model has not undergone independent medical professional validation

## Model Card Contact

For questions, issues, or feedback regarding this model:
- **Hugging Face:** https://huggingface.co/basiphobe/sci-assistant
- **Issues:** Please report issues through the Hugging Face model repository
- **Medical Concerns:** Always consult qualified healthcare professionals

**Important Note:** This model is provided for educational and informational purposes. Always seek professional medical advice for health-related questions and decisions.

### Framework versions

- PEFT 0.16.0