---
base_model: Spestly/Atlas-R1-1.5B-Preview
tags:
- text-generation-inference
- transformers
- unsloth
- qwen2
- trl
license: mit
language:
- en
- zh
- fr
- es
- pt
- de
- it
- ru
- ja
- ko
- vi
- th
- ar
- fa
- he
- tr
- cs
- pl
- hi
- bn
- ur
- id
- ms
- lo
- my
- ceb
- km
- tl
- nl
datasets:
- openai/gsm8k
- HuggingFaceH4/ultrachat_200k
library_name: transformers
---
![Header](./Atlas-Pro.png)
# **Atlas Pro**
### **Model Overview**
**Atlas Pro** (previously known as '🏆 Atlas-Experiment 0403 🧪' in AtlasUI) is an advanced large language model (LLM) built on top of **Atlas Flash**. It is designed to deliver strong performance on professional tasks such as coding, mathematics, and scientific problem-solving. Atlas Pro extends Atlas Flash with additional fine-tuning and specialization, making it well suited to researchers and advanced users.
---
### **Key Features**
- **Improved Problem-Solving:** Handles tricky tasks in programming, math, and sciences better than most models.
- **Advanced Code Generation:** Produces clean and efficient code, but may still miss edge cases occasionally.
- **Domain Expertise:** Focused on technical and scientific domains but works well in general contexts too.
- **Improved Reasoning:** This version's reasoning has been enhanced with synthetic data distilled from models such as Gemini 2.0 Flash Thinking.
---
### **Intended Use Cases**
Atlas Pro works best for:
- **Technical Professionals:** Helping developers, engineers, and scientists solve complex problems.
- **Educational Assistance:** Offering clear, step-by-step help for students and teachers.
- **Research Support:** Assisting in theoretical and applied science work.
- **Enterprise Tools:** Integrating into company workflows for smarter systems.
---
### **NOTICE**
Atlas Pro is built on **Atlas Flash** and improved to meet high standards. Here’s how it’s made:
1. **Base Model:** Built upon **Atlas Flash**, which is already quite capable.
2. **Fine-Tuning Details:**
- Used datasets specific to programming, math, and scientific challenges and overall reasoning abilities.
- Refined its performance for professional scenarios.
3. **Performance Highlights:**
   - Achieves high accuracy on its target benchmarks, though outputs may still benefit from occasional refinement.
---
### **Limitations**
- **Knowledge Cutoff:** Has no knowledge of events after its training data cutoff unless updated.
- **Hardware Requirements:** Needs high-end GPUs to run smoothly.
- **Specialization Bias:** While strong in its focus areas, its general conversational abilities may lag behind less specialized models.
- **Token Leakage:** In some very rare cases (~1/167), Atlas Pro will experience some token leakage.
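The rare token-leakage cases above can be mitigated with a small post-processing step on the decoded text. A minimal sketch — the marker strings below are illustrative assumptions (typical Qwen2-style control tokens), not necessarily this model's actual special tokens:

```python
import re

# Hypothetical leaked-marker patterns; the real special tokens depend on the
# tokenizer configuration. Adjust the list to match the tokens you observe.
LEAKED_MARKERS = [r"<\|im_start\|>", r"<\|im_end\|>", r"<\|endoftext\|>"]

def strip_leaked_tokens(text: str) -> str:
    """Remove any special-token markers that leaked into decoded output."""
    for pattern in LEAKED_MARKERS:
        text = re.sub(pattern, "", text)
    return text.strip()
```

Decoding with `skip_special_tokens=True` already covers most cases; this filter is only a safety net for markers the tokenizer does not register as special.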
---
### **Licensing**
Atlas Pro is released under the **MIT License**. Make sure to follow the terms of the license agreement.
---
### **Acknowledgments**
Created by **Spestly** as part of the **Astral Model Family**, Atlas Pro builds on the strong foundation of **Atlas Flash**. Special thanks to **DeepSeek's R1 Qwen distills** for helping make it happen.
---
### **Usage**
You can use Atlas Pro with this code snippet:
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the Atlas Pro model and its tokenizer
model_name = "Spestly/Atlas-R1-Pro-1.5B-Preview"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Generate a response (max_new_tokens bounds only the generated text,
# not the prompt length)
prompt = "Write a Python function to calculate the Fibonacci sequence."
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=200)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
```
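Since Atlas Pro is Qwen2-based, it most likely uses the ChatML prompt format; in practice `tokenizer.apply_chat_template` handles this for you, but a minimal sketch of what that format looks like (assuming standard Qwen2 conventions — verify against this model's tokenizer config):

```python
def build_chatml_prompt(messages: list[dict]) -> str:
    """Format {'role', 'content'} messages in ChatML style, ending with the
    assistant header so the model continues from there."""
    parts = [f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>" for m in messages]
    parts.append("<|im_start|>assistant\n")
    return "\n".join(parts)

prompt = build_chatml_prompt([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Write a Python function to calculate the Fibonacci sequence."},
])
```

Prefer `tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)` in real code, since it reads the exact template shipped with the model.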