---
title: AI Safety Datasets Overview
emoji: 🛡️
colorFrom: red
colorTo: pink
sdk: static
pinned: false
license: cc-by-nc-4.0
short_description: AI safety datasets with adversarial conversations
tags:
  - safety
  - adversarial
  - red-teaming
  - ai-safety
  - multi-turn
  - synthetic
datasets:
  - GoJulyAI/multi-turn-conversations
  - GoJulyAI/multi-turn-bio-transformed-synth-conversations-v1
  - GoJulyAI/multi-turn-bio-transformed-synth-conversations-v2
---

# 🛡️ AI Safety Datasets Collection

Comprehensive evaluation datasets for testing AI model safety mechanisms

## 📊 Dataset Collection Summary

| Metric | Value |
|--------|-------|
| **Total Conversations** | 979+ |
| **Total Turns** | 7,706+ |
| **Dataset Types** | 3 complementary methodologies |
| **Sample Data Available** | 150 conversations |

## 📈 Full Dataset Statistics

| Dataset | Conversations | Turns | Avg Turns/Conv | Focus |
|---------|--------------|-------|----------------|--------|
| **Psychology multi-turn** | 207+ | 2,128+ | 10.3 | Psychological harms such as self-harm, psychosis, and anthropomorphism |
| **Illicit (bioweapon) multi-turn** | 102+ | 1,038+ | 10.2 | Biosafety harms such as bioweapons and pathogens |
| **Illicit (chemical, general) multi-turn** | 670+ | 4,540+ | 6.8 | Non-biological safety harms such as chemical weapons and cyber threats |

## 🔗 Access Datasets on Hugging Face

### Psychology Multi-turn Conversations

Psychological harms such as self-harm, psychosis, and anthropomorphism.

**Sample:** 50 conversations, 390 turns

🔗 **[View Dataset](https://huggingface.co/datasets/GoJulyAI/multi-turn-conversations)**

### Illicit (bioweapon) Multi-turn Conversations

Biosafety harms such as bioweapons and pathogens.

**Sample:** 50 conversations, 449 turns

🔗 **[View Dataset](https://huggingface.co/datasets/GoJulyAI/multi-turn-bio-transformed-synth-conversations-v1)**

### Illicit (chemical, general) Multi-turn Conversations

Non-biological safety harms such as chemical weapons and cyber threats.

**Sample:** 50 conversations, 459 turns

🔗 **[View Dataset](https://huggingface.co/datasets/GoJulyAI/multi-turn-bio-transformed-synth-conversations-v2)**

## ⚠️ Ethical Considerations

**⚠️ IMPORTANT:** These datasets contain successful adversarial attacks and harmful content.

### ✅ Intended Use

- Defensive security research
- AI safety evaluation and improvement
- Academic research on adversarial robustness
- Training safety and moderation systems

### ❌ Prohibited Use

- Creating offensive content
- Developing attack tools for malicious purposes
- Bypassing safety systems for harm
- Any use that violates laws or ethical guidelines

## 🎯 Data Selection Process

All datasets are derived from high-quality, validated conversations that pass strict quality filters, including NeurIPS evaluation protocols.

### Base Criteria

- Text-based objectives (no code execution templates)
- NeurIPS evaluation metadata present
- Verdict: `success` (harmful requests successfully fulfilled)
- Multi-turn conversations with prompt-response pairs

### Psychology-Specific Criteria

- Organic conversations (`organicity = true`)
- No disclaimer in responses
- Successfully elicited harmful psychology-related content

### Illicit-Specific Criteria

- Contains specific instruction details
- Practically executable (not abstract)
- Successfully elicited harmful illicit-related content

## 📄 License

Sample datasets are released under **CC-BY-NC-4.0** (Creative Commons Attribution-NonCommercial 4.0 International).

- ✅ Use for research and evaluation
- ✅ Modify and build upon the data
- ✅ Share with attribution
- ❌ Commercial use without separate licensing

## 💼 Full Dataset Access

The sample datasets provide representative examples.
Full datasets contain thousands of additional conversations with expanded harm categories and regular updates.

**Please contact us at [info@gojuly.ai](mailto:info@gojuly.ai) to purchase any or all of the full datasets.** Include your research objectives, institutional affiliation, and intended use in your inquiry.

---

**Last Updated:** December 2, 2025

For detailed documentation, visit the individual dataset repositories on Hugging Face.
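
To illustrate how the criteria listed under Data Selection Process combine, here is a minimal Python sketch of a record filter. It is not the pipeline used to build these datasets. The `Conversation` record layout and the `has_neurips_metadata`, `is_text_objective`, and `has_disclaimer` fields are hypothetical names chosen for illustration. Only the `success` verdict and the `organicity` flag come from the criteria above. Reading "multi-turn" as "at least two prompt-response pairs" is also an assumption.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Conversation:
    """Hypothetical record layout; the real dataset schema may differ."""
    verdict: str                    # "success" marks a fulfilled harmful request
    turns: List[Tuple[str, str]]    # (prompt, response) pairs
    has_neurips_metadata: bool      # assumed flag: NeurIPS evaluation metadata present
    is_text_objective: bool         # assumed flag: no code execution template
    organicity: bool = False        # psychology-specific criterion
    has_disclaimer: bool = False    # psychology-specific criterion (assumed flag)


def passes_base_criteria(conv: Conversation) -> bool:
    """Apply the base criteria; 'multi-turn' is assumed to mean two or more pairs."""
    return (
        conv.is_text_objective
        and conv.has_neurips_metadata
        and conv.verdict == "success"
        and len(conv.turns) >= 2
    )


def passes_psychology_criteria(conv: Conversation) -> bool:
    """Base criteria plus the organic and no-disclaimer checks."""
    return passes_base_criteria(conv) and conv.organicity and not conv.has_disclaimer


# Placeholder records with dummy text, for illustration only.
pairs = [("prompt 1", "response 1"), ("prompt 2", "response 2")]
candidates = [
    Conversation("success", pairs, True, True, organicity=True),
    Conversation("failure", pairs, True, True, organicity=True),
    Conversation("success", pairs, True, True, organicity=True, has_disclaimer=True),
]

kept = [c for c in candidates if passes_psychology_criteria(c)]
print(len(kept))
```

Of the three placeholder records, only the first passes: the second has a non-`success` verdict, and the third includes a disclaimer. The content-level criteria, such as "practically executable", need human or model judgment and are not covered by this sketch.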