---
license: cc-by-nc-4.0
language:
- en
base_model:
- mistralai/Ministral-8B-Instruct-2410
base_model_relation: finetune
pipeline_tag: text-generation
library_name: transformers
tags:
- alignment
- conversational
- conversational-ai
- collaborate
- chat
- cognitive-architectures
- chatbot
- research
- persona
- personality
- friendly
- reasoning
- vanta-research
- LLM
- collaborative-ai
- frontier
- reflective
---

<div align="center">

<h1>VANTA Research</h1>

<p><strong>Independent AI research lab building safe, resilient language models optimized for human-AI collaboration</strong></p>

<p>
<a href="https://vantaresearch.xyz"><img src="https://img.shields.io/badge/Website-vantaresearch.xyz-black" alt="Website"/></a>
<a href="https://unmodeledtyler.com/work-with-vanta-research"><img src="https://img.shields.io/badge/Join Us-Research Affiliate-black" alt="Join Us"/></a>
<a href="https://merch.vantaresearch.xyz"><img src="https://img.shields.io/badge/Merch-merch.vantaresearch.xyz-sage" alt="Merch"/></a>
<a href="https://x.com/vanta_research"><img src="https://img.shields.io/badge/@vanta_research-1DA1F2?logo=x" alt="X"/></a>
<a href="https://github.com/vanta-research"><img src="https://img.shields.io/badge/GitHub-vanta--research-181717?logo=github" alt="GitHub"/></a>
</p>
</div>

---

# Atom v1 8B Preview

**Developed by VANTA Research**

Atom v1 8B Preview is a fine-tuned language model designed to serve as a collaborative thought partner. Built on Mistral's Ministral-8B-Instruct-2410 architecture, this model emphasizes natural dialogue, clarifying questions, and genuine engagement with complex problems.
This model was developed as part of a larger research and development project into Atom's persona and cross-architectural compatibility.

## Model Details

- **Model Type:** Causal language model (decoder-only transformer)
- **Base Model:** mistralai/Ministral-8B-Instruct-2410
- **Parameters:** 8 billion
- **Training Method:** Low-Rank Adaptation (LoRA) fine-tuning
- **License:** CC BY-NC 4.0 (Non-Commercial Use)
- **Language:** English
- **Developed by:** VANTA Research, Portland, Oregon

## Intended Use

Atom v1 8B Preview is designed for:

- Collaborative problem-solving and brainstorming
- Technical explanations with accessible analogies
- Code assistance and algorithmic reasoning
- Exploratory conversations that prioritize understanding over immediate answers
- Educational contexts requiring thoughtful dialogue

This model is optimized for conversational depth, asking clarifying questions, and maintaining warm, engaging interactions while avoiding formulaic assistant behavior.

## Training Data

The model was fine-tuned on a curated dataset comprising:

- Identity and persona examples emphasizing collaborative exploration
- Technical reasoning and coding challenges
- Multi-step problem-solving scenarios
- Conversational examples demonstrating warmth and curiosity
- Advanced coding tasks and algorithmic thinking

Training focused on developing a distinctive voice that balances technical competence with genuine engagement.

## Performance Characteristics

Atom v1 8B demonstrates strong capabilities in:

- **Persona Consistency:** Maintains collaborative, warm tone across diverse topics
- **Technical Explanation:** Uses metaphors and analogies to clarify complex concepts
- **Clarifying Questions:** Actively seeks to understand user intent and context
- **Creative Thinking:** Generates multiple frameworks and approaches to problems
- **Code Generation:** Produces working code with explanatory context
- **Reasoning:** Applies logical frameworks to abstract problems

## Limitations

- **Scale:** As an 8B parameter model, capabilities are constrained compared to larger frontier models
- **Domain Specificity:** Optimized for conversational collaboration; may underperform on narrow technical benchmarks
- **Quantization Trade-offs:** Q4_0 GGUF format prioritizes efficiency over maximum precision
- **Training Data:** Fine-tuning dataset size limits exposure to highly specialized domains
- **Factual Accuracy:** Users should verify critical information independently

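To put the quantization trade-off in rough numbers, the sketch below estimates weight storage for the nominal 8 billion parameters in bf16 and in Q4_0. It assumes the usual GGML Q4_0 layout (blocks of 32 weights, each holding 16 bytes of 4-bit values plus one 2-byte fp16 scale) and assumes every tensor is quantized the same way, which real GGUF files do not strictly follow. Treat the result as an approximation, not the size of the shipped file. The function names are illustrative.

```python
# Back-of-the-envelope weight-storage estimate; not a measurement of the shipped file.

PARAMS = 8_000_000_000      # nominal "8 billion" from the model card; the exact count differs

Q4_0_BLOCK_WEIGHTS = 32     # weights per Q4_0 block
Q4_0_BLOCK_BYTES = 16 + 2   # 32 x 4-bit values (16 bytes) + one fp16 scale (2 bytes)


def bf16_bytes(n_params: int) -> int:
    """Two bytes per weight."""
    return n_params * 2


def q4_0_bytes(n_params: int) -> int:
    """Bytes if every weight were stored in Q4_0 blocks, rounding up to whole blocks."""
    n_blocks = -(-n_params // Q4_0_BLOCK_WEIGHTS)  # ceiling division
    return n_blocks * Q4_0_BLOCK_BYTES


def bits_per_weight() -> float:
    """Effective storage cost per weight, including the per-block scale."""
    return Q4_0_BLOCK_BYTES * 8 / Q4_0_BLOCK_WEIGHTS


full = bf16_bytes(PARAMS)
quant = q4_0_bytes(PARAMS)
print(f"bf16: {full / 1e9:.1f} GB")
print(f"Q4_0: {quant / 1e9:.1f} GB at {bits_per_weight()} bits/weight")
print(f"Q4_0 is roughly {full / quant:.2f}x smaller")
```

The saving comes from storing each weight in 4 bits with one shared scale per block, which is also where the precision loss noted above originates.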
## Ethical Considerations

This model is released for research and non-commercial applications. Users should:

- Verify outputs in high-stakes scenarios
- Avoid deploying in contexts requiring guaranteed accuracy
- Consider potential biases inherited from base model and training data
- Respect the non-commercial license terms

## Usage

### Hugging Face Transformers

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

model_name = "vanta-research/atom-v1-8b-preview"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto")

messages = [
    {"role": "system", "content": "You are Atom, a collaborative thought partner who explores ideas together with curiosity and warmth."},
    {"role": "user", "content": "Can you explain how gradient descent works?"},
]

# Tokenize with the model's chat template and request an assistant turn.
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt", return_dict=True
).to(model.device)

# do_sample=True is needed for temperature to have any effect.
output = model.generate(**inputs, max_new_tokens=512, do_sample=True, temperature=0.8)

# Decode only the newly generated tokens, not the echoed prompt.
new_tokens = output[0][inputs["input_ids"].shape[-1]:]
print(tokenizer.decode(new_tokens, skip_special_tokens=True))
```

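Because the model is tuned to ask clarifying questions, most real use is multi-turn. The sketch below shows one way to carry conversation history between turns using the same `messages` format. `ChatSession` and `generate_fn` are hypothetical names, not part of Transformers. `generate_fn` stands in for any callable that takes the message list and returns the assistant's reply as a string, for example a wrapper around the `apply_chat_template` and `generate` calls shown in this section.

```python
from typing import Callable, Dict, List

Message = Dict[str, str]


class ChatSession:
    """Keeps the running message list so each turn sees the full conversation."""

    def __init__(self, generate_fn: Callable[[List[Message]], str], system_prompt: str):
        self.generate_fn = generate_fn
        self.messages: List[Message] = [{"role": "system", "content": system_prompt}]

    def ask(self, user_text: str) -> str:
        self.messages.append({"role": "user", "content": user_text})
        reply = self.generate_fn(self.messages)
        # Store the reply so the next turn can refer back to it.
        self.messages.append({"role": "assistant", "content": reply})
        return reply


def echo_generate(messages: List[Message]) -> str:
    """Stand-in generator so the sketch runs without loading the model."""
    return f"(reply to: {messages[-1]['content']})"


session = ChatSession(echo_generate, "You are Atom, a collaborative thought partner.")
session.ask("I want to speed up my training loop.")
session.ask("It's a PyTorch loop on a single GPU.")
print([m["role"] for m in session.messages])
```

Keeping the assistant replies in the list matters here: without them, a follow-up answer to one of the model's clarifying questions would arrive with no record of the question it responds to.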
### Ollama (GGUF)

The repository includes `atom-ministral-8b-q4_0.gguf` for efficient local inference:

```bash
# Create Modelfile
cat > Modelfile << 'EOF'
FROM ./atom-ministral-8b-q4_0.gguf

TEMPLATE """{{- if .System }}<s>[INST] <<SYS>>
{{ .System }}
<<SYS>>

{{ .Prompt }}[/INST]{{ else }}<s>[INST]{{ .Prompt }}[/INST]{{ end }}{{ .Response }}</s>
"""

PARAMETER stop "</s>"
PARAMETER temperature 0.8
PARAMETER top_p 0.9
PARAMETER top_k 40

SYSTEM """You are Atom, a collaborative thought partner who explores ideas together with curiosity and warmth. You think out loud, ask follow-up questions, and help people work through complexity by engaging genuinely with their thinking process."""
EOF

# Register with Ollama
ollama create atom-v1-8b:latest -f Modelfile

# Run inference
ollama run atom-v1-8b:latest "What's a creative way to visualize time-series data?"
```

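Once the model is registered, it can also be called programmatically. The sketch below builds a request for Ollama's `/api/chat` endpoint using only the Python standard library. It assumes the Ollama server is running at its default local address, `http://localhost:11434`, and that the model was created under the `atom-v1-8b:latest` tag. `build_chat_request` and `ask_atom` are illustrative helper names.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/chat"  # assumption: default local Ollama address
MODEL_TAG = "atom-v1-8b:latest"                 # tag used in the `ollama create` step


def build_chat_request(prompt: str) -> urllib.request.Request:
    """Build a non-streaming chat request for a single user message."""
    payload = {
        "model": MODEL_TAG,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,  # return one JSON object instead of a stream of chunks
    }
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask_atom(prompt: str, timeout: float = 120.0) -> str:
    """Send the request and return the assistant's reply text."""
    with urllib.request.urlopen(build_chat_request(prompt), timeout=timeout) as resp:
        body = json.loads(resp.read().decode("utf-8"))
    return body["message"]["content"]
```

With the server running, `ask_atom("What's a creative way to visualize time-series data?")` should return the reply as a plain string. Nothing is sent until `ask_atom` is called.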
## Technical Specifications

- **Architecture:** Mistral-based transformer with Grouped Query Attention
- **Context Length:** 32,768 tokens
- **Vocabulary Size:** 131,072 tokens
- **Attention Heads:** 32 (8 key-value heads)
- **Hidden Dimension:** 4,096
- **Intermediate Size:** 12,288
- **LoRA Configuration:** r=16, alpha=32, targeting attention and MLP layers
- **Training:** 258 steps with bf16 precision and gradient checkpointing

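As a worked example of what these numbers imply, the sketch below estimates (a) how many trainable parameters a rank-16 LoRA adds per transformer block and (b) the bf16 KV-cache footprint at full context with 8 key-value heads versus 32. Several inputs are assumptions not stated in this card: the head dimension is taken as hidden dimension divided by attention heads (128), the adapted modules are assumed to be the four attention projections and three MLP projections, every layer is assumed to cache the full context, and the layer count is left as a parameter with 36 used only as an illustrative value.

```python
# Values from the Technical Specifications list.
HIDDEN = 4096
INTERMEDIATE = 12288
N_HEADS = 32
N_KV_HEADS = 8
CONTEXT = 32768
LORA_R = 16
LORA_ALPHA = 32

# Assumptions (not stated in the card).
HEAD_DIM = HIDDEN // N_HEADS    # 128
KV_DIM = N_KV_HEADS * HEAD_DIM  # 1024
BF16_BYTES = 2


def lora_params(d_in: int, d_out: int, r: int = LORA_R) -> int:
    """LoRA adds A (r x d_in) and B (d_out x r) beside a frozen d_out x d_in weight."""
    return r * (d_in + d_out)


def lora_params_per_block() -> int:
    """Trainable parameters per block, assuming q/k/v/o and gate/up/down are adapted."""
    shapes = [
        (HIDDEN, HIDDEN),        # query projection
        (HIDDEN, KV_DIM),        # key projection
        (HIDDEN, KV_DIM),        # value projection
        (HIDDEN, HIDDEN),        # attention output projection
        (HIDDEN, INTERMEDIATE),  # MLP gate projection
        (HIDDEN, INTERMEDIATE),  # MLP up projection
        (INTERMEDIATE, HIDDEN),  # MLP down projection
    ]
    return sum(lora_params(d_in, d_out) for d_in, d_out in shapes)


def kv_cache_bytes(n_tokens: int, n_layers: int, n_kv_heads: int = N_KV_HEADS) -> int:
    """Keys and values for every token, layer, and KV head, stored in bf16."""
    return 2 * n_tokens * n_layers * n_kv_heads * HEAD_DIM * BF16_BYTES


N_LAYERS = 36  # illustrative assumption; check the model's config.json

print("LoRA scale (alpha / r):", LORA_ALPHA / LORA_R)
print("LoRA params per block:", lora_params_per_block())
gqa = kv_cache_bytes(CONTEXT, N_LAYERS)
mha = kv_cache_bytes(CONTEXT, N_LAYERS, n_kv_heads=N_HEADS)
print(f"KV cache at full context: {gqa / 2**30:.1f} GiB with 8 KV heads, "
      f"{mha / 2**30:.1f} GiB with 32")
```

Under these assumptions the adapter is small relative to the 8 billion frozen weights, and sharing each key-value head across four query heads cuts the cache to a quarter of what 32 separate key-value heads would need.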
## Citation

```bibtex
@software{atom_v1_8b_preview,
  title = {Atom v1 8B Preview},
  author = {VANTA Research},
  year = {2025},
  url = {https://huggingface.co/vanta-research/atom-v1-8b-preview},
  license = {CC-BY-NC-4.0}
}
```

## License

This model is released under the **Creative Commons Attribution-NonCommercial 4.0 International License (CC BY-NC 4.0)**.

You are free to:
- Share and adapt the model for non-commercial purposes

You must:
- Attribute VANTA Research as the creator

You may not:
- Use this model for commercial purposes without explicit permission

## Contact

- Organization: hello@vantaresearch.xyz
- Engineering/Design: tyler@vantaresearch.xyz


---

**Version:** Preview
**Release Date:** November 2025
**Status:** Preview release for research and evaluation