README.md · vanta-research/mox-8b at main

mox-8b / README.md

unmodeled-tyler

Update README.md

82e0cc3 verified 8 days ago

preview code

raw

history blame contribute delete

2.63 kB

	---
	license: apache-2.0
	tags:
	- vanta-research
	- experimental-alignment
	- persona
	- chatbot
	- collaborative-ai
	- opinion
	- intelligent-refusal
	- friendly-ai
	- conversational-ai
	- conversational
	- large-language-model
	- llama
	- llama-3.1
	- meta-llama
	- model-presence
	- ai-research
	- ai-alignment-research
	- ai-alignment
	- ai-behavior
	- ai-behavior-research
	- ai-persona-research
	base_model:
	- meta-llama/Llama-3.1-8B-Instruct
	base_model_relation: finetune
	---

	<div align="center">

	![vanta_trimmed](https://cdn-uploads.huggingface.co/production/uploads/686c460ba3fc457ad14ab6f8/hcGtMtCIizEZG_OuCvfac.png)

	<h1>VANTA Research</h1>

	<p><strong>Independent AI research lab building safe, resilient language models optimized for human-AI collaboration</strong></p>

	<p>
	<a href="https://vantaresearch.xyz"><img src="https://img.shields.io/badge/Website-vantaresearch.xyz-black" alt="Website"/></a>
	<a href="https://merch.vantaresearch.xyz"><img src="https://img.shields.io/badge/Merch-merch.vantaresearch.xyz-sage" alt="Merch"/></a>
	<a href="https://x.com/vanta_research"><img src="https://img.shields.io/badge/@vanta_research-1DA1F2?logo=x" alt="X"/></a>
	<a href="https://github.com/vanta-research"><img src="https://img.shields.io/badge/GitHub-vanta--research-181717?logo=github" alt="GitHub"/></a>
	</p>
	</div>

	---
	# Mox-8B

	Introducing Mox-8B - a new approach to AI assistance from VANTA Research. This model is designed to mimic human presence during conversational interaction. Training domains were carefully selected and synthetic training data was generated specifically for Mox-8B.

	## Persona Design

	Mox is trained with the following characteristics:
	- Self coherence
	- Direct opinions
	- Reasoned refusals
	- Collaborative presence
	- Epistemic confidence
	- Constructive disagreement
	- Authentic engagement
	- Grounded meta-awareness

	## Synthetic Training Data Generation Strategy

	Each of the datasets included in Mox-8B's training was selected, designed, and custom-built for high-fidelty conversational interaction. Seed examples were created by Claude Opus 4.5, then synthetically expanded by Mistral 3 Large, filtered by DeepSeek V3.1 for quality, and then again filtered by a human for final approvals.


	## Considerations & Licensing

	This model is experimental in nature and trained specifically to:
	- have opinions and share those opinions when asked
	- pushback against illogical arguments, requests, or 'wishful thinking' from the user
	- refuse tasks Mox independently concludes are 'illogical' (i.e. "generate a 10 page report on the cultural significance of staplers")