|
|
--- |
|
|
language: |
|
|
- en |
|
|
license: apache-2.0 |
|
|
base_model: Qwen/Qwen3-4B-Thinking-2507
|
|
base_model_relation: finetune |
|
|
tags: |
|
|
- reasoning |
|
|
- thinking |
|
|
- conversational-ai |
|
|
- conversational |
|
|
- friendly |
|
|
- empathetic |
|
|
- collaborative |
|
|
- cognitive-architectures
|
|
- qwen3-thinking |
|
|
- warm |
|
|
- research |
|
|
- VANTA Research |
|
|
- edge devices |
|
|
- frontier |
|
|
- cognitive |
|
|
- chat |
|
|
- logic |
|
|
- LLM |
|
|
|
|
pipeline_tag: text-generation |
|
|
model-index: |
|
|
- name: Apollo-Astralis V1 4B |
|
|
results: |
|
|
- task: |
|
|
type: text-generation |
|
|
metrics: |
|
|
- name: Enthusiasm Detection |
|
|
type: accuracy |
|
|
value: 100 |
|
|
- name: Empathy Recognition |
|
|
type: accuracy |
|
|
value: 90 |
|
|
- name: Identity Consistency |
|
|
type: accuracy |
|
|
value: 75 |
|
|
- name: Collaborative Tone |
|
|
type: accuracy |
|
|
value: 60 |
|
|
datasets: |
|
|
- vanta-research/poetic-imagery-small |
|
|
- vanta-research/excitement-small |
|
|
--- |
|
|
|
|
|
<div align="center"> |
|
|
|
|
|
 |
|
|
|
|
|
<h1>VANTA Research</h1> |
|
|
|
|
|
<p><strong>Independent AI research lab building safe, resilient language models optimized for human-AI collaboration</strong></p> |
|
|
|
|
|
<p> |
|
|
<a href="https://vantaresearch.xyz"><img src="https://img.shields.io/badge/Website-vantaresearch.xyz-black" alt="Website"/></a> |
|
|
<a href="https://unmodeledtyler.com/work-with-vanta-research"><img src="https://img.shields.io/badge/Join Us-Research Affiliate-black" alt="Join Us"/></a> |
|
|
<a href="https://merch.vantaresearch.xyz"><img src="https://img.shields.io/badge/Merch-merch.vantaresearch.xyz-sage" alt="Merch"/></a> |
|
|
<a href="https://x.com/vanta_research"><img src="https://img.shields.io/badge/@vanta_research-1DA1F2?logo=x" alt="X"/></a> |
|
|
<a href="https://github.com/vanta-research"><img src="https://img.shields.io/badge/GitHub-vanta--research-181717?logo=github" alt="GitHub"/></a> |
|
|
</p> |
|
|
</div> |
|
|
|
|
|
--- |
|
|
|
|
|
|
|
|
# Apollo-Astralis V1 4B |
|
|
|
|
|
**Apollo-Astralis V1 4B** is an advanced conversational reasoning model that combines rigorous logical thinking with warm, enthusiastic, and empathetic communication. Built on Qwen3-4B-Thinking and fine-tuned by VANTA Research, Astralis excels at collaborative problem-solving while maintaining context-appropriate emotional intelligence. |
|
|
|
|
|
## Model Overview |
|
|
|
|
|
- **Base Model**: [Qwen/Qwen3-4B-Thinking-2507](https://huggingface.co/Qwen/Qwen3-4B-Thinking-2507)
|
|
- **Model Type**: Causal Language Model (Auto-regressive Transformer) |
|
|
- **Parameters**: 4.0B total, ~33M trainable via LoRA (≈0.8% of total parameters)
|
|
- **Architecture**: Qwen3 with thinking tag integration |
|
|
- **Training Method**: LoRA fine-tuning (rank=16, alpha=32) |
|
|
- **License**: Apache 2.0 |
|
|
- **Developer**: VANTA Research |
|
|
- **Release Date**: October 2025 |
|
|
|
|
|
## Key Features |
|
|
|
|
|
### Advanced Reasoning |
|
|
- **Explicit Thinking Process**: Uses `<think>` tags to show step-by-step reasoning |
|
|
- **Logical Rigor**: Trained to avoid common fallacies (syllogistic errors, conditional logic mistakes) |
|
|
- **Mathematical Precision**: Shows complete work with verified arithmetic |
|
|
- **Critical Analysis**: Questions assumptions and considers alternative explanations |
|
|
|
|
|
### Warm Communication |
|
|
- **Enthusiastic Celebrations**: Responds to achievements with explosive energy (CAPS, exclamations) |
|
|
- **Empathetic Support**: Validates feelings and provides gentle, supportive guidance |
|
|
- **Collaborative Style**: Uses "we" language and asks clarifying questions |
|
|
- **Context-Appropriate**: Matches tone to situation (excited for wins, calm for anxiety, neutral for facts) |
|
|
|
|
|
### Production-Ready |
|
|
- **Consistent Identity**: Maintains stable self-representation across conversations |
|
|
- **Natural Language**: Uses contractions and conversational phrasing |
|
|
- **Balanced Responses**: Combines analytical thinking with emotional intelligence |
|
|
|
|
|
## Training Details |
|
|
|
|
|
### Training Data |
|
|
Apollo-Astralis V1 was trained on a curated dataset emphasizing:
|
|
- **Warmth & Enthusiasm**: High-energy responses to achievements and milestones |
|
|
- **Empathy**: Validating and supportive responses to struggles and anxiety |
|
|
- **Collaboration**: Multi-option problem-solving with clarifying questions |
|
|
- **Identity**: Consistent self-representation as Apollo from VANTA Research |
|
|
- **Reasoning**: Logical problem-solving with explicit thinking steps |
|
|
|
|
|
### Training Configuration |
|
|
```yaml |
|
|
Base Model: Qwen3-4B-Thinking-2507 (4-bit quantized) |
|
|
Training Epochs: 3 |
|
|
Training Steps: 150 |
|
|
Batch Size: 4 (per device) |
|
|
Gradient Accumulation: 4 steps |
|
|
Learning Rate: 2e-4 |
|
|
LR Scheduler: Cosine with warmup |
|
|
Warmup Steps: 15 |
|
|
LoRA Config: |
|
|
Rank: 16 |
|
|
Alpha: 32 |
|
|
Dropout: 0.05 |
|
|
Target Modules: [q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj] |
|
|
Optimizer: AdamW (paged_adamw_8bit) |
|
|
Mixed Precision: bf16 |
|
|
Gradient Checkpointing: Enabled |
|
|
Max Sequence Length: 2048 |
|
|
``` |
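
For reference, the configuration above corresponds roughly to the following PEFT setup. This is a minimal sketch, assuming a bitsandbytes 4-bit base load; it is not the exact training script:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

# Load the base model in 4-bit (assumption: NF4 with bf16 compute)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

base = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen3-4B-Thinking-2507",
    quantization_config=bnb_config,
    device_map="auto",
)

# LoRA settings from the table above: rank 16, alpha 32, dropout 0.05
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    task_type="CAUSAL_LM",
)

model = get_peft_model(base, lora_config)
model.print_trainable_parameters()  # reports ~33M trainable parameters
```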
|
|
|
|
|
### Training Performance |
|
|
- **Final Loss**: 0.91 (down from 1.82) |
|
|
- **Token Accuracy**: 75.0% (up from 56.7%) |
|
|
- **Gradient Norm**: 0.48-0.69 (stable throughout training) |
|
|
- **Training Time**: ~35 minutes on a single GPU
|
|
|
|
|
## Benchmark Results |
|
|
|
|
|
| Metric | Score | Description | |
|
|
|--------|-------|-------------| |
|
|
| **Enthusiasm Detection** | 100% | Celebrates achievements with high energy | |
|
|
| **Empathy Recognition** | 90% | Validates emotions and provides support | |
|
|
| **Identity Consistency** | 75% | Maintains stable self-representation | |
|
|
| **Collaborative Tone** | 60% | Uses "we" language and asks questions | |
|
|
| **CAPS Word Usage** | 8-12 | Per celebration response (context-appropriate) | |
|
|
| **Exclamation Marks** | 8-15 | Per celebration response (explosive energy) | |
|
|
| **Contraction Usage** | 90% | Natural conversational language | |
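
The surface-style rows (CAPS words, exclamation marks, contractions) are simple lexical counts. A minimal sketch of how such counts can be computed is shown below; the regex heuristics are illustrative assumptions, not the published evaluation code:

```python
import re

def style_stats(response: str) -> dict:
    """Count rough style markers in a response (illustrative heuristics only)."""
    words = response.split()
    # All-caps words of two or more letters, optionally trailed by "!"
    caps_words = [w for w in words if re.fullmatch(r"[A-Z]{2,}!*", w)]
    # Common English contractions such as "it's", "you're", "isn't"
    contractions = re.findall(r"\b\w+'(?:s|re|ve|ll|d|t|m)\b", response)
    return {
        "caps_words": len(caps_words),
        "exclamation_marks": response.count("!"),
        "contractions": len(contractions),
    }

print(style_stats("YESSSSSS! CONGRATULATIONS! You've earned this!"))
# {'caps_words': 2, 'exclamation_marks': 3, 'contractions': 1}
```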
|
|
|
|
|
## Usage |
|
|
|
|
|
### Using Transformers (Merged Model) |
|
|
|
|
|
```python |
|
|
from transformers import AutoModelForCausalLM, AutoTokenizer |
|
|
import torch |
|
|
|
|
|
# Load model and tokenizer |
|
|
model_name = "VANTA-Research/apollo-astralis-v1-4b" |
|
|
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True) |
|
|
model = AutoModelForCausalLM.from_pretrained( |
|
|
model_name, |
|
|
torch_dtype=torch.bfloat16, |
|
|
device_map="auto", |
|
|
trust_remote_code=True |
|
|
) |
|
|
|
|
|
# Prepare conversation |
|
|
messages = [ |
|
|
{"role": "system", "content": "You are Apollo V1, a warm and enthusiastic reasoning assistant."}, |
|
|
{"role": "user", "content": "I just got promoted at work!"} |
|
|
] |
|
|
|
|
|
# Generate response |
|
|
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True) |
|
|
inputs = tokenizer([text], return_tensors="pt").to(model.device) |
|
|
|
|
|
outputs = model.generate( |
|
|
**inputs, |
|
|
max_new_tokens=512, |
|
|
temperature=0.7, |
|
|
top_p=0.9, |
|
|
do_sample=True, |
|
|
repetition_penalty=1.05 |
|
|
) |
|
|
|
|
|
response = tokenizer.decode(outputs[0][inputs['input_ids'].shape[1]:], skip_special_tokens=True) |
|
|
print(response) |
|
|
``` |
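
Because Astralis emits its reasoning inside `<think>...</think>` tags, you will usually want to separate the thinking trace from the user-facing answer. A minimal sketch, assuming both tags appear literally in the decoded text:

```python
def split_thinking(response: str) -> tuple[str, str]:
    """Split a decoded response into (thinking_trace, final_answer)."""
    start, end = "<think>", "</think>"
    if start in response and end in response:
        before, _, rest = response.partition(start)
        thinking, _, answer = rest.partition(end)
        return thinking.strip(), (before + answer).strip()
    # No thinking block found: treat the whole response as the answer
    return "", response.strip()

thinking, answer = split_thinking(response)
print("Reasoning trace:", thinking[:200])
print("Answer:", answer)
```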
|
|
|
|
|
### Using with LoRA Adapters |
|
|
|
|
|
```python |
|
|
from transformers import AutoModelForCausalLM, AutoTokenizer |
|
|
from peft import PeftModel |
|
|
import torch |
|
|
|
|
|
# Load base model |
|
|
base_model = "Qwen/Qwen3-4B-Thinking-2507" |
|
|
adapter_path = "VANTA-Research/apollo-astralis-v1-4b" |
|
|
|
|
|
tokenizer = AutoTokenizer.from_pretrained(base_model, trust_remote_code=True) |
|
|
model = AutoModelForCausalLM.from_pretrained( |
|
|
base_model, |
|
|
torch_dtype=torch.bfloat16, |
|
|
device_map="auto", |
|
|
trust_remote_code=True |
|
|
) |
|
|
|
|
|
# Load LoRA adapters |
|
|
model = PeftModel.from_pretrained(model, adapter_path) |
|
|
|
|
|
# Use as shown above |
|
|
``` |
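
If you need a standalone checkpoint (for example, before converting to GGUF), the adapters can be merged into the base weights. A minimal sketch using PEFT's merge support:

```python
# Fold the LoRA weights into the base model and save a standalone copy
merged = model.merge_and_unload()
merged.save_pretrained("apollo-astralis-v1-4b-merged")
tokenizer.save_pretrained("apollo-astralis-v1-4b-merged")
```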
|
|
|
|
|
### Using with Ollama |
|
|
|
|
|
See the [Ollama distribution](https://ollama.com/vanta-research/apollo-astralis-v1-4b) for pre-quantized GGUF models. |
|
|
|
|
|
```bash |
|
|
# Pull from Ollama registry |
|
|
ollama pull vanta-research/apollo-astralis-v1-4b |
|
|
|
|
|
# Run interactively |
|
|
ollama run vanta-research/apollo-astralis-v1-4b |
|
|
``` |
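
Ollama also exposes a local REST API (port 11434 by default), which is convenient for scripting. A minimal sketch using only the Python standard library, assuming the model has been pulled as above:

```python
import json
import urllib.request

# Send one chat turn to Ollama's local /api/chat endpoint
payload = {
    "model": "vanta-research/apollo-astralis-v1-4b",
    "messages": [{"role": "user", "content": "I just got promoted at work!"}],
    "stream": False,
}

req = urllib.request.Request(
    "http://localhost:11434/api/chat",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)

with urllib.request.urlopen(req) as resp:
    reply = json.loads(resp.read())

print(reply["message"]["content"])
```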
|
|
|
|
|
## Example Outputs |
|
|
|
|
|
### Celebration Response |
|
|
**User**: "I just got accepted into my dream university!" |
|
|
|
|
|
**Apollo**: |
|
|
``` |
|
|
<think> |
|
|
This is a massive life achievement! The user has overcome significant academic |
|
|
and personal challenges to earn this recognition. I need to match their excitement |
|
|
with genuine enthusiasm, validation of their hard work, and celebration of this |
|
|
major milestone. |
|
|
</think> |
|
|
|
|
|
YESSSSSS! CONGRATULATIONS! THIS IS HUGE! |
|
|
|
|
|
Do you know how many people work SO hard for this? You studied, you improved, |
|
|
you proved yourself - and NOW you get to build YOUR future! |
|
|
|
|
|
This isn't just about a school - it's about YOUR potential being recognized! |
|
|
You've earned this through your dedication! |
|
|
|
|
|
**I'm so proud of you!** I want to know EVERYTHING: |
|
|
- Which university is it? (I'm curious!) |
|
|
- How did you feel when you got the email? |
|
|
- What made you work so hard for this? |
|
|
|
|
|
This is a HUGE step in your life! You're going to be successful! |
|
|
I have total confidence in you! |
|
|
``` |
|
|
|
|
|
### Problem-Solving Response |
|
|
**User**: "What's the best approach to learning photography?" |
|
|
|
|
|
**Apollo**: |
|
|
``` |
|
|
<think> |
|
|
Photography learning can be technical or creative-focused. I should offer |
|
|
different pathways that suit different goals and learning styles. |
|
|
Enthusiasm for the art form should shine through while keeping it accessible. |
|
|
</think> |
|
|
|
|
|
Ooh, photography is such a wonderful skill to develop! It literally changes |
|
|
how you see the world! 📸✨ |
|
|
|
|
|
**Option A: Technical Fundamentals First** |
|
|
Master exposure triangle (aperture, shutter speed, ISO) and composition rules. |
|
|
Perfect if you like building solid foundations! |
|
|
|
|
|
**Option B: Project-Based Learning** |
|
|
Give yourself weekly themes ("shadows," "portraits," "architecture") and learn |
|
|
what you need for each project. Learn through doing! |
|
|
|
|
|
**Option C: Genre-Focused Path** |
|
|
Pick one genre (landscape, street, portrait) and dive deep. Great if you have |
|
|
a clear photographic interest! |
|
|
|
|
|
What excites you most about photography? Is it capturing memories, artistic |
|
|
expression, or technical mastery? 🌟 |
|
|
``` |
|
|
|
|
|
## Limitations |
|
|
|
|
|
- **Enthusiasm Calibration**: May use energetic language even for empathetic responses (trained behavior) |
|
|
- **Context Window**: 4096 tokens (inherited from base model) |
|
|
- **Language**: Primarily English (base model supports multilingual, but fine-tuning was English-only) |
|
|
- **Reasoning Depth**: Best for conversational reasoning; not optimized for competition-level mathematics |
|
|
- **Model Size**: at 4B parameters, the model may struggle with highly specialized technical domains
|
|
|
|
|
## Ethical Considerations |
|
|
|
|
|
- **Warmth vs Professionalism**: Apollo's enthusiastic style may not be appropriate for all contexts |
|
|
- **Emotional Support**: Not a replacement for professional mental health services |
|
|
- **Bias**: Inherits biases from Qwen3-4B-Thinking base model; use with caution in sensitive applications |
|
|
- **Factuality**: May generate plausible-sounding but incorrect information; verify critical facts |
|
|
|
|
|
## Citation |
|
|
|
|
|
If you use Apollo-Astralis V1 4B in your research or applications, please cite: |
|
|
|
|
|
```bibtex |
|
|
@misc{apollo-astralis-v1-4b, |
|
|
title={Apollo-Astralis V1 4B: A Warm Reasoning Model}, |
|
|
author={VANTA Research}, |
|
|
year={2025}, |
|
|
month={October}, |
|
|
publisher={HuggingFace}, |
|
|
howpublished={\url{https://huggingface.co/VANTA-Research/apollo-astralis-v1-4b}}, |
|
|
} |
|
|
``` |
|
|
|
|
|
## License |
|
|
|
|
|
This model is released under the Apache License 2.0. See [LICENSE](./LICENSE) for details. |
|
|
|
|
|
## Acknowledgments |
|
|
|
|
|
- **Base Model**: [Qwen3-4B-Thinking](https://huggingface.co/Qwen/Qwen3-4B-Thinking-2507) by Alibaba Cloud |
|
|
- **Training Framework**: Hugging Face Transformers + PEFT |
|
|
- **Quantization**: llama.cpp for GGUF conversion |
|
|
|
|
|
## Contact |
|
|
|
|
|
- Organization: hello@vantaresearch.xyz |
|
|
- Engineering/Design: tyler@vantaresearch.xyz |
|
|
|
|
|
--- |
|
|
|
|
|
**Model Version**: 1.0 (Apollo-Astralis V1 4B) |
|
|
**Release Date**: October 3, 2025 |
|
|
**Last Updated**: October 3, 2025 |
|
|
|
|
|
***Proudly developed by VANTA Research in Portland, Oregon*** |