Libraries

How to use vanta-research/atom-v1-preview-4b with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("image-text-to-text", model="vanta-research/atom-v1-preview-4b")
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"},
            {"type": "text", "text": "What animal is on the candy?"}
        ]
    },
]
pipe(text=messages)

# Load model directly
from transformers import AutoProcessor, AutoModelForImageTextToText

processor = AutoProcessor.from_pretrained("vanta-research/atom-v1-preview-4b")
model = AutoModelForImageTextToText.from_pretrained("vanta-research/atom-v1-preview-4b")
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"},
            {"type": "text", "text": "What animal is on the candy?"}
        ]
    },
]
inputs = processor.apply_chat_template(
	messages,
	add_generation_prompt=True,
	tokenize=True,
	return_dict=True,
	return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(processor.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

llama-cpp-python

How to use vanta-research/atom-v1-preview-4b with llama-cpp-python:

# !pip install llama-cpp-python

from llama_cpp import Llama

llm = Llama.from_pretrained(
	repo_id="vanta-research/atom-v1-preview-4b",
	filename="atom-v1-preview-4b.gguf",
)

llm.create_chat_completion(
	messages = [
		{
			"role": "user",
			"content": "What is the capital of France?"
		}
	]
)

Local Apps

llama.cpp

How to use vanta-research/atom-v1-preview-4b with llama.cpp:

Install from brew

brew install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf vanta-research/atom-v1-preview-4b
# Run inference directly in the terminal:
llama-cli -hf vanta-research/atom-v1-preview-4b

Install from WinGet (Windows)

winget install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf vanta-research/atom-v1-preview-4b
# Run inference directly in the terminal:
llama-cli -hf vanta-research/atom-v1-preview-4b

Use pre-built binary

# Download pre-built binary from:
# https://github.com/ggerganov/llama.cpp/releases
# Start a local OpenAI-compatible server with a web UI:
./llama-server -hf vanta-research/atom-v1-preview-4b
# Run inference directly in the terminal:
./llama-cli -hf vanta-research/atom-v1-preview-4b

Build from source code

git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
cmake -B build
cmake --build build -j --target llama-server llama-cli
# Start a local OpenAI-compatible server with a web UI:
./build/bin/llama-server -hf vanta-research/atom-v1-preview-4b
# Run inference directly in the terminal:
./build/bin/llama-cli -hf vanta-research/atom-v1-preview-4b

Use Docker

docker model run hf.co/vanta-research/atom-v1-preview-4b

vLLM

How to use vanta-research/atom-v1-preview-4b with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "vanta-research/atom-v1-preview-4b"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "vanta-research/atom-v1-preview-4b",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
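
The same request can be sent from Python. The following is a minimal sketch using only the standard library, assuming the vLLM server started above is listening on localhost:8000 and returns the usual OpenAI-style response shape; the helper names `build_completion_request` and `ask` are illustrative and not part of vLLM.

```python
import json
import urllib.request

# Endpoint and model name are taken from the curl example above.
VLLM_URL = "http://localhost:8000/v1/chat/completions"
MODEL_ID = "vanta-research/atom-v1-preview-4b"


def build_completion_request(prompt: str) -> urllib.request.Request:
    """Build a POST request carrying an OpenAI-style chat payload."""
    payload = {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        VLLM_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(prompt: str) -> str:
    """Send the request and return the first choice's text (needs a running server)."""
    with urllib.request.urlopen(build_completion_request(prompt)) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]


# Example (requires the server to be running):
# print(ask("What is the capital of France?"))
```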

Use Docker

docker model run hf.co/vanta-research/atom-v1-preview-4b

SGLang

How to use vanta-research/atom-v1-preview-4b with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "vanta-research/atom-v1-preview-4b" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "vanta-research/atom-v1-preview-4b",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "vanta-research/atom-v1-preview-4b" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "vanta-research/atom-v1-preview-4b",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Ollama
How to use vanta-research/atom-v1-preview-4b with Ollama:
```
ollama run hf.co/vanta-research/atom-v1-preview-4b
```

Unsloth Studio

How to use vanta-research/atom-v1-preview-4b with Unsloth Studio:

Install Unsloth Studio (macOS, Linux, WSL)

curl -fsSL https://unsloth.ai/install.sh | sh
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for vanta-research/atom-v1-preview-4b to start chatting

Install Unsloth Studio (Windows)

irm https://unsloth.ai/install.ps1 | iex
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for vanta-research/atom-v1-preview-4b to start chatting

Using HuggingFace Spaces for Unsloth

# No setup required
# Open https://huggingface.co/spaces/unsloth/studio in your browser
# Search for vanta-research/atom-v1-preview-4b to start chatting

Docker Model Runner
How to use vanta-research/atom-v1-preview-4b with Docker Model Runner:
```
docker model run hf.co/vanta-research/atom-v1-preview-4b
```

Lemonade

How to use vanta-research/atom-v1-preview-4b with Lemonade:

Pull the model

# Download Lemonade from https://lemonade-server.ai/
lemonade pull vanta-research/atom-v1-preview-4b

Run and chat with the model

lemonade run user.atom-v1-preview-4b-{{QUANT_TAG}}

List all available models

lemonade list

VANTA Research

Independent AI research lab building safe, resilient language models optimized for human-AI collaboration.

Atom V1 Preview

Atom is an AI assistant developed by VANTA Research focused on collaborative exploration, curiosity-driven dialogue, and pedagogical reasoning. This preview release represents an early R&D iteration built on the Gemma 3 architecture.

Model Description

Atom v1 Preview is a fine-tuned language model designed to embody:

Collaborative Exploration: Engages users through clarifying questions and co-reasoning
Analogical Thinking: Employs metaphors and analogies to explain complex concepts
Enthusiasm for Discovery: Celebrates insights and maintains genuine curiosity
Pedagogical Depth: Provides detailed, thorough explanations that guide reasoning processes

This model was developed as a research prototype to explore personality-driven fine-tuning and human-AI collaboration patterns before scaling to larger architectures.

Technical Specifications

Base Model: google/gemma-3-4b-it
Fine-tuning Method: LoRA (Low-Rank Adaptation via PEFT)
Training Framework: Transformers, PEFT, TRL
Quantization: 4-bit (nf4) during training
Final Format: Full precision merged model (FP16)
Parameters: ~4B
Context Length: 128K tokens
Vocabulary Size: 262K tokens
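
As a rough guide to what these numbers mean for hardware, the sketch below estimates weight storage for the two precisions listed above. It assumes exactly 4 billion parameters as a stand-in for "~4B" and ignores activations, the KV cache, and quantization metadata, so real usage will be higher.

```python
# Assumption: "~4B" is approximated as exactly 4 billion parameters.
PARAM_COUNT = 4_000_000_000

# Bytes needed to store one weight in each format.
BYTES_PER_PARAM = {
    "fp16": 2.0,  # 16 bits per weight (the merged release format)
    "nf4": 0.5,   # 4 bits per weight (used during training), constants ignored
}


def weight_memory_gb(param_count: int, fmt: str) -> float:
    """Return approximate weight storage in gigabytes (10^9 bytes)."""
    return param_count * BYTES_PER_PARAM[fmt] / 1e9


for fmt in BYTES_PER_PARAM:
    print(f"{fmt}: ~{weight_memory_gb(PARAM_COUNT, fmt):.1f} GB")
```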

LoRA Configuration

Stage 1 (Personality): r=16, alpha=32, dropout=0.05, 2 epochs
Stage 2 (Attribution): r=8, alpha=16, dropout=0.02, 2 epochs  
Stage 3 (Verbosity): r=4, alpha=8, dropout=0.01, 1 epoch
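
The three stages can be written down as plain data to make one pattern visible: the rank halves at each stage while alpha stays at twice the rank, so the standard LoRA scaling factor (alpha / r) is the same throughout. This is an illustrative sketch, not the training script; the `LoraStage` class is hypothetical, and its fields correspond to the `r`, `lora_alpha`, and `lora_dropout` arguments of PEFT's `LoraConfig`. Target modules are not stated in this card and are left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LoraStage:
    """Hypothetical container for one fine-tuning stage's hyperparameters."""
    name: str
    r: int
    alpha: int
    dropout: float
    epochs: int

    @property
    def scaling(self) -> float:
        # Standard LoRA scales the low-rank update by alpha / r.
        return self.alpha / self.r


# Values copied from the configuration listed above.
STAGES = [
    LoraStage("personality", r=16, alpha=32, dropout=0.05, epochs=2),
    LoraStage("attribution", r=8, alpha=16, dropout=0.02, epochs=2),
    LoraStage("verbosity", r=4, alpha=8, dropout=0.01, epochs=1),
]

for stage in STAGES:
    print(f"{stage.name}: r={stage.r}, scaling={stage.scaling}")
```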

Intended Use

Primary Use Cases

Educational dialogue and concept explanation
Collaborative research assistance
Exploratory reasoning and brainstorming
Pedagogical applications requiring detailed explanations
Research into AI personality and interaction patterns

Out-of-Scope Uses

Production deployment without further evaluation
High-stakes decision making
Commercial applications (see license)
Critical infrastructure or safety-critical systems
Medical, legal, or financial advice

Usage

This repository includes both PyTorch (safetensors) and GGUF formats:

PyTorch format: Use with Transformers for GPU inference
GGUF format (atom-v1-preview-4b.gguf): Use with llama.cpp or Ollama for efficient CPU/GPU inference

Loading the Model

from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_name = "vanta-research/atom-v1-preview-4b"
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.float16,
    device_map="auto",
    trust_remote_code=True
)

Inference Example

messages = [
    {"role": "user", "content": "Explain quantum entanglement like I'm 5"}
]

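# Render the chat template and tokenize; return_dict=True also returns the attention mask
inputs = tokenizer.apply_chat_template(
    messages,
    tokenize=True,
    add_generation_prompt=True,
    return_dict=True,
    return_tensors="pt"
).to(model.device)

outputs = model.generate(
    **inputs,
    max_new_tokens=512,
    temperature=0.8,
    top_p=0.9,
    top_k=40,
    do_sample=True
)

# Decode only the newly generated tokens, skipping the prompt
prompt_length = inputs["input_ids"].shape[1]
response = tokenizer.decode(outputs[0][prompt_length:], skip_special_tokens=True)
print(response)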

Using GGUF with llama.cpp or Ollama

A quantized GGUF version (atom-v1-preview-4b.gguf) is included for efficient CPU/GPU inference:

With llama.cpp:

./llama-cli -m atom-v1-preview-4b.gguf -p "Explain quantum entanglement" --temp 0.8 --top-p 0.9

With Ollama:

# Create Modelfile
cat > Modelfile <<EOF
FROM ./atom-v1-preview-4b.gguf
PARAMETER temperature 0.8
PARAMETER top_p 0.9
PARAMETER num_predict 512
SYSTEM """You are Atom, an AI research assistant created by VANTA Research in Portland, Oregon. You embody curiosity, enthusiasm, and collaborative exploration."""
EOF

# Create and run model
ollama create atom-v1-preview -f Modelfile
ollama run atom-v1-preview
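
Once the model has been created, it can also be queried programmatically through Ollama's local HTTP API. The sketch below assumes Ollama is running on its default port (11434) and that the model was created under the name `atom-v1-preview` as above; `build_chat_request` and `chat` are illustrative helper names.

```python
import json
import urllib.request

# Assumption: Ollama is serving on its default local address.
OLLAMA_URL = "http://localhost:11434/api/chat"


def build_chat_request(prompt: str, model: str = "atom-v1-preview") -> urllib.request.Request:
    """Build a non-streaming chat request for the Ollama HTTP API."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,  # ask for one JSON object instead of a token stream
    }
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chat(prompt: str) -> str:
    """Send the prompt and return the assistant's reply (needs a running server)."""
    with urllib.request.urlopen(build_chat_request(prompt)) as response:
        body = json.load(response)
    return body["message"]["content"]


# Example (requires a running Ollama server):
# print(chat("Explain quantum entanglement"))
```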

Limitations and Considerations

Known Limitations

Personality Consistency: While trained for collaborative traits, personality may vary across contexts
Factual Accuracy: As a 4B-parameter model, it may produce inaccuracies or hallucinations
Training Data Bias: Trained on synthetic data with specific interaction patterns
Context Window: Limited to 8192 tokens; performance degrades with very long conversations
Prototype Status: This is an early R&D iteration, not optimized for production
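
One common way to cope with a bounded context window in long conversations is to drop the oldest turns before each request. The sketch below illustrates that idea under stated assumptions: `count_tokens` is a hypothetical whitespace-based stand-in (a real application would count with the model's tokenizer), and the budget is passed in by the caller rather than fixed.

```python
def count_tokens(text: str) -> int:
    """Hypothetical stand-in: approximate tokens by whitespace-separated words."""
    return len(text.split())


def trim_history(messages: list, max_tokens: int) -> list:
    """Keep the most recent messages whose combined size fits within max_tokens."""
    kept = []
    used = 0
    # Walk from newest to oldest so recent turns are preserved first.
    for message in reversed(messages):
        cost = count_tokens(message["content"])
        if used + cost > max_tokens:
            break
        kept.append(message)
        used += cost
    kept.reverse()  # restore chronological order
    return kept


history = [
    {"role": "user", "content": "a b c"},
    {"role": "assistant", "content": "d e"},
    {"role": "user", "content": "f"},
]
print(trim_history(history, max_tokens=3))
```

With a budget of 3, the oldest message is dropped and the two most recent ones are kept.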

Behavioral Characteristics

Tends toward verbose, detailed responses
Frequently asks clarifying questions (collaborative style)
May overuse analogies in some contexts
Exhibits enthusiasm markers ("Ooh!", celebratory language)

Ethical Considerations

Model behavior reflects synthetic training data and may not represent diverse interaction styles
Attribution knowledge (VANTA Research) was explicitly trained and may be mentioned frequently
Designed for educational/research contexts, not validated for sensitive applications
No adversarial testing or red-teaming has been performed on this preview

Evaluation

Qualitative evaluation focused on personality trait expression:

Collaboration: Increased clarifying questions (+43% vs base model)
Analogical Reasoning: Consistent use of metaphors in explanations
Enthusiasm: Presence of excitement markers and celebratory language
Verbosity: Average response length increased to 300-400 characters
Attribution: Correct identification of VANTA Research as creator

Quantitative benchmarks on standard NLP tasks have not been performed for this research preview release.
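
For readers who want to run a similar trait comparison on their own outputs, the sketch below shows one simple way to count such markers. It is not the procedure used for the figures above, which this card does not describe; the function names are illustrative, and the marker list contains only the one example quoted in this card.

```python
# Assumption: "Ooh!" is the only marker quoted in this card; extend as needed.
ENTHUSIASM_MARKERS = ("Ooh!",)


def trait_counts(response: str) -> dict:
    """Count simple surface markers in a single model response."""
    return {
        "questions": response.count("?"),
        "enthusiasm": sum(response.count(marker) for marker in ENTHUSIASM_MARKERS),
        "characters": len(response),
    }


def relative_change(new: float, base: float) -> float:
    """Percent change of a fine-tuned metric relative to the base-model metric."""
    return (new - base) * 100 / base


print(trait_counts("Ooh! What do you think?"))
```

Averaging `trait_counts` over a shared prompt set for both models and passing the two averages to `relative_change` would give a figure comparable in form to the percentages reported above.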

License

This model is released under CC BY-NC 4.0 (Creative Commons Attribution-NonCommercial 4.0 International).

Key Terms:

Attribution required
Non-commercial use only
Modifications allowed (attribution and non-commercial terms continue to apply)
No warranties provided

For commercial licensing inquiries, contact VANTA Research.

Citation

If you use Atom V1 Preview in your research, please cite:

@software{atom_v1_preview_2025,
  title = {Atom V1 Preview},
  author = {VANTA Research},
  year = {2025},
  url = {https://huggingface.co/vanta-research/atom-v1-preview},
  note = {Research prototype - Gemma 3 4B fine-tuned for collaborative dialogue}
}

Acknowledgments

Built on Google's Gemma 3 4B instruction-tuned model. Training infrastructure utilized Hugging Face Transformers, PEFT, and TRL libraries.

Contact

Organization: hello@vantaresearch.xyz
Engineering/Design: tyler@vantaresearch.xyz

Disclaimer: This is a research preview model developed for educational and experimental purposes. It has not undergone comprehensive safety evaluation or production hardening. Use at your own discretion and verify outputs independently.