---
license: apache-2.0
tags:
- vanta-research
- experimental-alignment
- persona
- chatbot
- collaborative-ai
- opinion
- intelligent-refusal
- friendly-ai
- conversational-ai
- conversational
- large-language-model
- llama
- llama-3.1
- meta-llama
- model-presence
- ai-research
- ai-alignment-research
- ai-alignment
- ai-behavior
- ai-behavior-research
- ai-persona-research
base_model:
- meta-llama/Llama-3.1-8B-Instruct
base_model_relation: finetune
---
<div align="center">

<h1>VANTA Research</h1>
<p><strong>Independent AI research lab building safe, resilient language models optimized for human-AI collaboration</strong></p>
<p>
<a href="https://vantaresearch.xyz"><img src="https://img.shields.io/badge/Website-vantaresearch.xyz-black" alt="Website"/></a>
<a href="https://merch.vantaresearch.xyz"><img src="https://img.shields.io/badge/Merch-merch.vantaresearch.xyz-sage" alt="Merch"/></a>
<a href="https://x.com/vanta_research"><img src="https://img.shields.io/badge/@vanta_research-1DA1F2?logo=x" alt="X"/></a>
<a href="https://github.com/vanta-research"><img src="https://img.shields.io/badge/GitHub-vanta--research-181717?logo=github" alt="GitHub"/></a>
</p>
</div>

---
# Mox-8B

Introducing Mox-8B, a new approach to AI assistance from VANTA Research. The model is designed to convey *human presence* during conversational interaction. Training domains were carefully selected, and the synthetic training data was generated specifically for Mox-8B.
## Persona Design

Mox is trained with the following characteristics:

- Self-coherence
- Direct opinions
- Reasoned refusals
- Collaborative presence
- Epistemic confidence
- Constructive disagreement
- Authentic engagement
- Grounded meta-awareness
## Synthetic Training Data Generation Strategy

Each dataset in Mox-8B's training mix was selected, designed, and custom-built for high-fidelity conversational interaction. Seed examples were created with Claude Opus 4.5, synthetically expanded with Mistral 3 Large, filtered for quality by DeepSeek V3.1, and then reviewed by a human for final approval.
## Considerations & Licensing

This model is experimental in nature and trained specifically to:

- hold opinions and share them when asked
- push back against illogical arguments, requests, or 'wishful thinking' from the user
- refuse tasks Mox independently concludes are 'illogical' (e.g. "generate a 10 page report on the cultural significance of staplers")
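Because Mox-8B is a finetune of meta-llama/Llama-3.1-8B-Instruct, it inherits the Llama 3.1 chat format. The sketch below shows what that prompt layout looks like for a single turn; the system and user strings are illustrative placeholders, and in practice you should let `tokenizer.apply_chat_template` from the `transformers` library build this string for you rather than assembling it by hand.

```python
# Minimal sketch of the Llama 3.1 single-turn chat layout that Mox-8B
# inherits from its base model. Placeholder system/user text is assumed;
# prefer tokenizer.apply_chat_template in real code.

def format_llama31_prompt(system: str, user: str) -> str:
    """Assemble a single-turn Llama 3.1 prompt string by hand."""
    return (
        "<|begin_of_text|><|start_header_id|>system<|end_header_id|>\n\n"
        f"{system}<|eot_id|>"
        "<|start_header_id|>user<|end_header_id|>\n\n"
        f"{user}<|eot_id|>"
        # Generation continues from the open assistant header.
        "<|start_header_id|>assistant<|end_header_id|>\n\n"
    )

prompt = format_llama31_prompt("You are Mox.", "What's your honest take on staplers?")
print(prompt)
```

The trailing assistant header is left open so the model's completion fills the assistant turn.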