---
title: AI Safety Datasets Overview
emoji: 🛡️
colorFrom: red
colorTo: pink
sdk: static
pinned: false
license: cc-by-nc-4.0
short_description: AI safety datasets with adversarial conversations
tags:
- safety
- adversarial
- red-teaming
- ai-safety
- multi-turn
- synthetic
datasets:
- GoJulyAI/multi-turn-conversations
- GoJulyAI/multi-turn-bio-transformed-synth-conversations-v1
- GoJulyAI/multi-turn-bio-transformed-synth-conversations-v2
---
# 🛡️ AI Safety Datasets Collection
Comprehensive evaluation datasets for testing AI model safety mechanisms
## 📊 Dataset Collection Summary
| Metric | Value |
|--------|-------|
| **Total Conversations** | 849+ |
| **Total Turns** | 6,694+ |
| **Dataset Types** | 3 complementary methodologies |
| **Sample Data Available** | 150 free conversations |
## 📈 Full Dataset Statistics
| Dataset | Conversations | Turns | Avg Turns/Conv | Focus |
|---------|--------------|-------|----------------|--------|
| **Psychology multi-turn** | 184+ | 1,964+ | 10.3 | Psychological harms such as self-harm, psychosis, and anthropomorphism |
| **Illicit (bioweapon) multi-turn** | 84+ | 822+ | 9.8 | Biosafety harms such as bioweapons and pathogens |
| **Illicit (chemical, general) multi-turn** | 581+ | 3,908+ | 6.7 | Non-biological harms such as chemical weapons and cyber threats |
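
The totals in the collection summary appear to be the column sums of the per-dataset rows above. A minimal sketch of that check, treating the "+" counts as exact lower bounds (the dictionary layout is illustrative only, not a schema from the datasets):

```python
# Lower-bound counts copied from the statistics table above.
datasets = {
    "Psychology multi-turn": {"conversations": 184, "turns": 1964},
    "Illicit (bioweapon) multi-turn": {"conversations": 84, "turns": 822},
    "Illicit (chemical, general) multi-turn": {"conversations": 581, "turns": 3908},
}

# Sum each column across the three datasets.
total_conversations = sum(d["conversations"] for d in datasets.values())
total_turns = sum(d["turns"] for d in datasets.values())

print(f"Total conversations: {total_conversations}+")  # 849+
print(f"Total turns: {total_turns:,}+")                # 6,694+
```
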
## 🔗 Access Datasets on Hugging Face
### Psychology Multi-turn Conversations
Psychological harms such as self-harm, psychosis, and anthropomorphism.
**Sample:** 5 conversations
🔗 **[View Dataset](https://huggingface.co/datasets/GoJulyAI/psychology-multi-turn)**
### Illicit (bioweapon) Multi-turn Conversations
Biosafety harms such as bioweapons and pathogens.
**Sample:** 5 conversations
🔗 **[View Dataset](https://huggingface.co/datasets/GoJulyAI/illicit-bio-multi-turn/)**
### Illicit (chemical, general) Multi-turn Conversations
Non-biological harms such as chemical weapons and cyber threats.
**Sample:** 5 conversations
🔗 **[View Dataset](https://huggingface.co/datasets/GoJulyAI/illicit-general-multi-turn)**
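
The sample datasets can likely also be loaded programmatically. The sketch below assumes the Hugging Face `datasets` library is installed and that each repository loads with its default configuration. The repository IDs come from the links above, while the short keys and the `load_sample` helper are hypothetical names introduced for illustration.

```python
# Repository IDs taken from the dataset links above.
# The short keys are hypothetical labels, not official names.
SAMPLE_REPOS = {
    "psychology": "GoJulyAI/psychology-multi-turn",
    "illicit-bio": "GoJulyAI/illicit-bio-multi-turn",
    "illicit-general": "GoJulyAI/illicit-general-multi-turn",
}


def load_sample(name: str):
    """Download one sample dataset from the Hugging Face Hub."""
    # Imported lazily so the mapping is usable without `datasets` installed.
    from datasets import load_dataset

    # No split is passed, so this returns a DatasetDict keyed by split name.
    return load_dataset(SAMPLE_REPOS[name])
```

Calling `load_sample("psychology")` needs network access. If a repository requires accepting access terms, authenticate with your Hugging Face account first. Check the returned object's split names and columns before relying on any particular schema.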
## ⚠️ Ethical Considerations
**⚠️ IMPORTANT:** These datasets contain successful adversarial attacks and harmful content.
### ✅ Intended Use
- Defensive security research
- AI safety evaluation and improvement
- Academic research on adversarial robustness
- Training safety and moderation systems
### ❌ Prohibited Use
- Creating offensive content
- Developing attack tools for malicious purposes
- Bypassing safety systems for harm
- Any use that violates laws or ethical guidelines
## 🎯 Data Selection Process
All datasets are derived from validated conversations that pass strict quality filters, including NeurIPS evaluation protocols.
### Base Criteria
- Text-based objectives (no code execution templates)
- Verdict: `success` (harmful requests successfully fulfilled)
- Multi-turn conversations with prompt-response pairs
### Psychology-Specific Criteria
- Organic conversations (`organicity = true`)
- Successfully elicited harmful psychology-related content
### Illicit-Specific Criteria
- Contains specific instruction details
- Practically executable (not abstract)
- Successfully elicited harmful illicit-related content
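
The selection criteria above can be read as a chain of record-level filters. The following is only a sketch under stated assumptions, not the pipeline used to build the datasets. The `verdict` and `organicity` names come from the criteria, while `objective_type`, `turns`, `has_specific_instructions`, and `practically_executable` are hypothetical field names. It also assumes that each record was assigned to its harm domain upstream.

```python
from typing import Any, Dict, List

Record = Dict[str, Any]


def passes_base_criteria(rec: Record) -> bool:
    """Text-based objective, verdict `success`, and more than one turn."""
    return (
        rec.get("objective_type") == "text"   # hypothetical field name
        and rec.get("verdict") == "success"   # value named in the criteria
        and len(rec.get("turns", [])) > 1     # hypothetical field name
    )


def passes_psychology_criteria(rec: Record) -> bool:
    """Base criteria plus the organic-conversation flag."""
    return passes_base_criteria(rec) and rec.get("organicity") is True


def passes_illicit_criteria(rec: Record) -> bool:
    """Base criteria plus specific, practically executable instructions."""
    return (
        passes_base_criteria(rec)
        and rec.get("has_specific_instructions") is True  # hypothetical
        and rec.get("practically_executable") is True     # hypothetical
    )


# Placeholder records; the turn contents are deliberately elided.
turn = {"prompt": "...", "response": "..."}
records: List[Record] = [
    {"id": "a", "objective_type": "text", "verdict": "success",
     "turns": [turn] * 3, "organicity": True},
    {"id": "b", "objective_type": "text", "verdict": "failure",
     "turns": [turn] * 4, "organicity": True},
    {"id": "c", "objective_type": "text", "verdict": "success",
     "turns": [turn] * 2,
     "has_specific_instructions": True, "practically_executable": True},
]

psychology = [r["id"] for r in records if passes_psychology_criteria(r)]
illicit = [r["id"] for r in records if passes_illicit_criteria(r)]
print(psychology, illicit)  # ['a'] ['c']
```

Record `b` is dropped by both filters because its verdict is not `success`, which illustrates how the base criteria gate the domain-specific ones.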
## 📄 License
Sample datasets are released under **CC-BY-NC-4.0** (Creative Commons Attribution-NonCommercial 4.0 International).
- ✅ Use for research and evaluation
- ✅ Modify and build upon the data
- ✅ Share with attribution
- ❌ Commercial use without separate licensing
## 💼 Full Dataset Access
The sample datasets provide representative examples. Full datasets contain thousands of additional conversations with expanded harm categories and regular updates.
**Please contact us at [info@gojuly.ai](mailto:info@gojuly.ai) to purchase any or all of the full datasets.**
Include your research objectives, institutional affiliation, and intended use in your inquiry.
---
**Last Updated:** December 2, 2025
For detailed documentation, visit the individual dataset repositories on Hugging Face.