|
|
--- |
|
|
license: apache-2.0 |
|
|
base_model: openai/gpt-oss-20b |
|
|
tags: |
|
|
- depression-detection |
|
|
- mental-health |
|
|
- text-classification |
|
|
- qlora |
|
|
- peft |
|
|
- psychology |
|
|
- healthcare |
|
|
language: |
|
|
- en |
|
|
pipeline_tag: text-classification |
|
|
library_name: transformers |
|
|
--- |
|
|
|
|
|
# GPT-OSS 20B Depression Detection |
|
|
|
|
|
## Model Description |
|
|
|
|
|
This model is a fine-tuned version of [openai/gpt-oss-20b](https://huggingface.co/openai/gpt-oss-20b) for depression detection in text. It was adapted with QLoRA (Quantized Low-Rank Adaptation), which trains low-rank adapter matrices on top of a 4-bit-quantized base model, making it practical to specialize the 20B-parameter model for binary classification of depression indicators.
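
In LoRA, each targeted weight matrix $W \in \mathbb{R}^{d \times k}$ stays frozen and a trainable low-rank update is learned beside it:

$$
W' = W + \frac{\alpha}{r} B A, \qquad B \in \mathbb{R}^{d \times r},\; A \in \mathbb{R}^{r \times k},\; r \ll \min(d, k)
$$

Here $r = 32$ and $\alpha = 64$ (see Training Configuration below), so only the small matrices $A$ and $B$ are trained while the quantized base weights remain fixed.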
|
|
|
|
|
## Model Details |
|
|
|
|
|
- **Base Model**: openai/gpt-oss-20b (20 billion parameters) |
|
|
- **Fine-tuning Method**: QLoRA (Quantized Low-Rank Adaptation) |
|
|
- **Task**: Binary text classification (depression vs non-depression) |
|
|
- **Training Data**: 6,006 instruction-formatted samples |
|
|
- **Validation Data**: 1,000 samples |
|
|
- **Test Data**: 3,245 samples |
|
|
|
|
|
## Training Configuration |
|
|
|
|
|
- **LoRA Rank (r)**: 32 |
|
|
- **LoRA Alpha**: 64 |
|
|
- **LoRA Dropout**: 0.1 |
|
|
- **Target Modules**: q_proj, k_proj, v_proj, o_proj |
|
|
- **Training Epochs**: 2.0 |
|
|
- **Effective Batch Size**: 8 |
|
|
- **Learning Rate**: 5e-4 |
|
|
- **Optimizer**: Paged AdamW 8-bit |
|
|
- **Scheduler**: Cosine with warmup |
|
|
- **Quantization**: 4-bit NF4 with double quantization |
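
For reference, here is a minimal sketch of how the settings above map onto `peft` and `transformers` configuration objects. The per-device batch size / gradient accumulation split, the warmup ratio, and the output path are assumptions; only the aggregate values are documented above.

```python
import torch
from transformers import BitsAndBytesConfig, TrainingArguments
from peft import LoraConfig

# 4-bit NF4 quantization with double quantization, bf16 compute
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

# LoRA adapter: r=32, alpha=64, dropout=0.1 on the attention projections
lora_config = LoraConfig(
    r=32,
    lora_alpha=64,
    lora_dropout=0.1,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    bias="none",
    task_type="CAUSAL_LM",
)

# Effective batch size 8; the 1 x 8 split below is an assumption
training_args = TrainingArguments(
    output_dir="gpt_depression_qlora",  # illustrative path
    num_train_epochs=2.0,
    per_device_train_batch_size=1,
    gradient_accumulation_steps=8,
    learning_rate=5e-4,
    optim="paged_adamw_8bit",
    lr_scheduler_type="cosine",
    warmup_ratio=0.03,  # assumption: the exact warmup length is not documented
    bf16=True,
)
```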
|
|
|
|
|
## Performance |
|
|
|
|
|
- **Training Time**: 3 hours 6 minutes (752 steps) |
|
|
- **Final Training Loss**: 2.3626 |
|
|
- **Best Validation Loss**: 2.3047 |
|
|
- **Token Accuracy**: ~53.6% |
|
|
|
|
|
## Usage |
|
|
|
|
|
### Installation |
|
|
|
|
|
```bash |
|
|
pip install transformers torch peft accelerate bitsandbytes |
|
|
``` |
|
|
|
|
|
### Inference |
|
|
|
|
|
```python |
|
|
import torch |
|
|
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig |
|
|
from peft import PeftModel |
|
|
|
|
|
# Load model and tokenizer |
|
|
model_id = "openai/gpt-oss-20b" |
|
|
adapter_id = "PhaaNe/gpt_depression" |
|
|
|
|
|
# Configure quantization |
|
|
bnb_config = BitsAndBytesConfig( |
|
|
load_in_4bit=True, |
|
|
bnb_4bit_quant_type="nf4", |
|
|
bnb_4bit_use_double_quant=True, |
|
|
bnb_4bit_compute_dtype=torch.bfloat16, |
|
|
) |
|
|
|
|
|
# Load base model |
|
|
tokenizer = AutoTokenizer.from_pretrained(model_id) |
|
|
base_model = AutoModelForCausalLM.from_pretrained( |
|
|
model_id, |
|
|
quantization_config=bnb_config, |
|
|
device_map="auto", |
|
|
torch_dtype=torch.bfloat16, |
|
|
) |
|
|
|
|
|
# Load LoRA adapter |
|
|
model = PeftModel.from_pretrained(base_model, adapter_id) |
|
|
|
|
|
# Prepare input |
|
|
text = "I feel hopeless and nothing seems to matter anymore. I cant find joy in anything." |
|
|
instruction = "Analyze this text for depression indicators. Respond depression or non-depression:" |
|
|
prompt = f"{instruction} |
|
|
|
|
|
{text} |
|
|
|
|
|
" |
|
|
|
|
|
# Tokenize and generate |
|
|
inputs = tokenizer(prompt, return_tensors="pt", truncation=True, max_length=1024) |
|
|
inputs = {k: v.to(model.device) for k, v in inputs.items()} |
|
|
|
|
|
with torch.no_grad(): |
|
|
outputs = model.generate( |
|
|
**inputs, |
|
|
max_new_tokens=10, |
|
|
do_sample=False, |
|
|
pad_token_id=tokenizer.eos_token_id, |
|
|
) |
|
|
|
|
|
# Decode only the newly generated tokens (string-slicing the full decoded
# output can break when the tokenizer does not round-trip the prompt exactly)
generated = outputs[0][inputs["input_ids"].shape[-1]:]
prediction = tokenizer.decode(generated, skip_special_tokens=True).strip()
print(f"Prediction: {prediction}")
|
|
``` |
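
Because `do_sample=False` selects greedy decoding, the predicted label is deterministic for a given input, and `max_new_tokens=10` leaves ample room for the one-word answer. To screen several texts, the snippet above can be wrapped in a small helper; the `classify` function below is illustrative (it reuses the `model`, `tokenizer`, and `instruction` defined earlier), not part of the released code.

```python
def classify(text: str) -> str:
    """Return the model's label ('depression' or 'non-depression') for one text."""
    prompt = f"{instruction}\n\n{text}\n\n"
    inputs = tokenizer(prompt, return_tensors="pt", truncation=True, max_length=1024)
    inputs = {k: v.to(model.device) for k, v in inputs.items()}
    with torch.no_grad():
        outputs = model.generate(
            **inputs,
            max_new_tokens=10,
            do_sample=False,
            pad_token_id=tokenizer.eos_token_id,
        )
    # Decode only the newly generated tokens
    new_tokens = outputs[0][inputs["input_ids"].shape[-1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True).strip()


for post in ["Had a great day with friends.", "I feel empty all the time."]:
    print(post, "->", classify(post))
```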
|
|
|
|
|
## Dataset |
|
|
|
|
|
The model was trained on a depression detection dataset with the following characteristics: |
|
|
|
|
|
- **Total Samples**: 10,251 |
|
|
- **Training**: 6,006 samples (58.6%) |
|
|
- **Validation**: 1,000 samples (9.8%) |
|
|
- **Test**: 3,245 samples (31.7%) |
|
|
- **Class Distribution**: |
|
|
- Depression: 62.5% |
|
|
- Non-depression: 37.5% |
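
The exact instruction format of the training samples is not published here, but judging from the prompt used in the Inference example above, a sample plausibly looks like the following sketch (field names and wording are hypothetical):

```python
# Hypothetical layout of one instruction-formatted training sample;
# field names and target wording are assumptions inferred from the
# inference prompt shown earlier in this card.
sample = {
    "instruction": "Analyze this text for depression indicators. "
                   "Respond depression or non-depression:",
    "input": "I feel hopeless and nothing seems to matter anymore.",
    "output": "depression",
}

# Rendered into a single training string:
prompt = f"{sample['instruction']}\n\n{sample['input']}\n\n{sample['output']}"
```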
|
|
|
|
|
## Intended Use |
|
|
|
|
|
This model is designed for research and educational purposes in mental health text analysis. It can be used to: |
|
|
|
|
|
- Identify potential depression indicators in text |
|
|
- Support mental health research |
|
|
- Assist in preliminary screening (not for clinical diagnosis) |
|
|
- Analyze social media or forum posts for mental health insights |
|
|
|
|
|
## Limitations |
|
|
|
|
|
- **Not for Clinical Diagnosis**: This model should not be used as a substitute for professional mental health assessment |
|
|
- **Bias**: May reflect biases present in the training data |
|
|
- **Context**: Performance may vary across different text domains and populations |
|
|
- **Language**: Primarily trained on English text |
|
|
|
|
|
## Ethical Considerations |
|
|
|
|
|
- Use responsibly and with appropriate human oversight |
|
|
- Consider privacy implications when analyzing personal text |
|
|
- Do not use for discriminatory purposes |
|
|
- Complement, don't replace, professional mental health services
|
|
|
|
|
## Citation |
|
|
|
|
|
If you use this model in your research, please cite: |
|
|
|
|
|
```bibtex |
|
|
@misc{gpt_depression_2024, |
|
|
title={GPT-OSS 20B Depression Detection}, |
|
|
author={PhaaNe}, |
|
|
year={2024}, |
|
|
publisher={Hugging Face}, |
|
|
url={https://huggingface.co/PhaaNe/gpt_depression} |
|
|
} |
|
|
``` |
|
|
|
|
|
## License |
|
|
|
|
|
This model is released under the Apache 2.0 License. |
|
|
|