---
license: apache-2.0
tags:
- vanta-research
- experimental-alignment
- persona
- chatbot
- collaborative-ai
- opinion
- intelligent-refusal
- friendly-ai
- conversational-ai
- conversational
- large-language-model
- llama
- llama-3.1
- meta-llama
- model-presence
- ai-research
- ai-alignment-research
- ai-alignment
- ai-behavior
- ai-behavior-research
- ai-persona-research
base_model:
- meta-llama/Llama-3.1-8B-Instruct
base_model_relation: finetune
---
<div align="center">
![vanta_trimmed](https://cdn-uploads.huggingface.co/production/uploads/686c460ba3fc457ad14ab6f8/hcGtMtCIizEZG_OuCvfac.png)
<h1>VANTA Research</h1>
<p><strong>Independent AI research lab building safe, resilient language models optimized for human-AI collaboration</strong></p>
<p>
<a href="https://vantaresearch.xyz"><img src="https://img.shields.io/badge/Website-vantaresearch.xyz-black" alt="Website"/></a>
<a href="https://merch.vantaresearch.xyz"><img src="https://img.shields.io/badge/Merch-merch.vantaresearch.xyz-sage" alt="Merch"/></a>
<a href="https://x.com/vanta_research"><img src="https://img.shields.io/badge/@vanta_research-1DA1F2?logo=x" alt="X"/></a>
<a href="https://github.com/vanta-research"><img src="https://img.shields.io/badge/GitHub-vanta--research-181717?logo=github" alt="GitHub"/></a>
</p>
</div>
---
# Mox-8B
Mox-8B is a new approach to AI assistance from VANTA Research. The model is designed to mimic *human presence* in conversation. Training domains were carefully selected, and the synthetic training data was generated specifically for Mox-8B.
## Persona Design
Mox is trained with the following characteristics:
- Self-coherence
- Direct opinions
- Reasoned refusals
- Collaborative presence
- Epistemic confidence
- Constructive disagreement
- Authentic engagement
- Grounded meta-awareness
## Synthetic Training Data Generation Strategy
Each dataset used to train Mox-8B was selected, designed, and custom-built for high-fidelity conversational interaction. Seed examples were created by Claude Opus 4.5, synthetically expanded by Mistral 3 Large, filtered for quality by DeepSeek V3.1, and finally reviewed by a human for approval.
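
The four stages described above (seed, expand, automated filter, human approval) can be sketched as a simple pipeline. This is only an illustrative sketch, not the actual VANTA Research tooling: `Example`, `build_dataset`, and the `toy_*` functions are hypothetical names, and the toy functions stand in for the model calls and the human review step so the snippet runs offline.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class Example:
    """One conversational training example (hypothetical schema)."""
    prompt: str
    response: str


def build_dataset(
    seeds: Iterable[Example],
    expand: Callable[[Example], List[Example]],
    auto_filter: Callable[[Example], bool],
    human_approve: Callable[[Example], bool],
) -> List[Example]:
    """Run seed -> expand -> automated filter -> human approval."""
    approved: List[Example] = []
    for seed in seeds:
        # Stage 2: synthetic expansion of each seed example.
        for candidate in expand(seed):
            # Stage 3: automated quality filter, then stage 4: human sign-off.
            if auto_filter(candidate) and human_approve(candidate):
                approved.append(candidate)
    return approved


# Toy stand-ins (assumptions) so the sketch runs without any model calls.
def toy_expand(seed: Example) -> List[Example]:
    # Keep the seed and add one trivially rephrased variant.
    return [seed, Example(seed.prompt + " (variant)", seed.response)]


def toy_auto_filter(ex: Example) -> bool:
    # Drop responses too short to be useful.
    return len(ex.response) >= 10


HUMAN_REJECTED = {"Is this plan realistic? (variant)"}


def toy_human_approve(ex: Example) -> bool:
    # A reviewer rejects specific candidates by hand.
    return ex.prompt not in HUMAN_REJECTED


seeds = [
    Example("Is this plan realistic?", "Not as written. The timeline is too short."),
    Example("Hi", "Hey"),
    Example("Write a report on staplers.", "I'd rather not. What is it actually for?"),
]
dataset = build_dataset(seeds, toy_expand, toy_auto_filter, toy_human_approve)
```

In a real pipeline, `expand` and `auto_filter` would wrap calls to the expansion and filtering models, and `human_approve` would be replaced by a review queue; placing the human check last means a reviewer only sees candidates that already passed the automated filter.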
## Considerations & Licensing
This model is experimental in nature and trained specifically to:
- have opinions and share those opinions when asked
- push back against illogical arguments, requests, or 'wishful thinking' from the user
- refuse tasks Mox independently concludes are 'illogical' (e.g., "generate a 10-page report on the cultural significance of staplers")