|
|
--- |
|
|
license: apache-2.0 |
|
|
tags: |
|
|
- adam |
|
|
- curious-architecture |
|
|
- instruction-tuned |
|
|
- conversational-ai |
|
|
- 2b-parameters |
|
|
library_name: transformers |
|
|
pipeline_tag: text-generation |
|
|
--- |
|
|
|
|
|
# Adam: Instruction-Tuned Conversational AI |
|
|
|
|
|
<div align="center"> |
|
|
<img src="https://img.shields.io/badge/Parameters-2B-blue" alt="2B Parameters"> |
|
|
<img src="https://img.shields.io/badge/Architecture-CuriousForCausalLM-green" alt="Curious Architecture"> |
|
|
<img src="https://img.shields.io/badge/Instruction%20Tuned-Yes-orange" alt="Instruction Tuned"> |
|
|
<img src="https://img.shields.io/badge/Context%20Length-8K-purple" alt="8K Context"> |
|
|
</div> |
|
|
|
|
|
## π Model Overview |
|
|
|
|
|
**Adam** is a powerful 2 billion parameter language model built with the Curious architecture, specifically instruction-tuned for high-quality conversational AI and task completion. This model represents the next generation of efficient, instruction-tuned language models optimized for natural conversations. |
|
|
|
|
|
## β¨ Key Features |
|
|
|
|
|
- **ποΈ Native Curious Architecture**: Custom `CuriousForCausalLM` architecture with Curious-specific optimizations |
|
|
- **π― Instruction-Tuned**: Fine-tuned for conversational AI and task completion |
|
|
- **β‘ Efficient**: 2B parameters with optimized inference |
|
|
- **π¬ Conversational**: Specialized for natural dialogue and helpful responses |
|
|
- **π§ Advanced Features**: Sliding window attention, logit softcapping, and enhanced activations |
|
|
|
|
|
## π Model Specifications |
|
|
|
|
|
| Parameter | Value | |
|
|
|-----------|-------| |
|
|
| **Architecture** | CuriousForCausalLM | |
|
|
| **Model Type** | curious_text | |
|
|
| **Parameters** | ~2.6B | |
|
|
| **Context Length** | 8,192 tokens | |
|
|
| **Vocabulary** | 256,000 tokens | |
|
|
| **Training** | Instruction-tuned | |
|
|
| **Curious Version** | 2.0 | |
|
|
|
|
|
## π― Capabilities |
|
|
|
|
|
- **Natural Conversations**: Engaging and contextually aware dialogue |
|
|
- **Question Answering**: Accurate responses to diverse queries |
|
|
- **Creative Writing**: Poetry, stories, and creative content generation |
|
|
- **Code Assistance**: Programming help and code generation |
|
|
- **Mathematical Reasoning**: Problem-solving and calculations |
|
|
- **Instruction Following**: Precise task execution and completion |
|
|
|
|
|
## π Quick Start |
|
|
|
|
|
|
|
|
### Interactive Chat |
|
|
|
|
|
```python |
|
|
pip install requirements.txt |
|
|
``` |
|
|
|
|
|
```python |
|
|
# Use the included chat interface |
|
|
python chat_with_adam.py to talk to adam. |
|
|
``` |
|
|
|
|
|
## ποΈ Curious Architecture Features |
|
|
|
|
|
- **Enhanced Attention**: Advanced attention mechanisms for better context understanding |
|
|
- **Sliding Window**: Efficient processing of long sequences |
|
|
- **Logit Softcapping**: Improved generation stability |
|
|
- **Optimized Activations**: GELU with PyTorch tanh for better performance |
|
|
- **Instruction Tuning**: Specialized for conversational AI tasks |
|
|
|
|
|
## π Performance |
|
|
|
|
|
- **Quality**: High-quality instruction-tuned responses |
|
|
- **Speed**: Optimized for efficient inference |
|
|
- **Memory**: ~5GB model size |
|
|
- **Hardware**: GPU recommended for best performance |
|
|
- **Context**: 8K token context window |
|
|
|
|
|
## π§ Technical Details |
|
|
|
|
|
### Model Configuration |
|
|
|
|
|
```json |
|
|
{ |
|
|
"architectures": ["CuriousForCausalLM"], |
|
|
"model_type": "curious_text", |
|
|
"hidden_size": 2304, |
|
|
"num_attention_heads": 8, |
|
|
"num_hidden_layers": 26, |
|
|
"max_position_embeddings": 8192, |
|
|
"curious_version": "2.0", |
|
|
"curious_instruction_tuned": true |
|
|
} |
|
|
``` |
|
|
|
|
|
### Generation Parameters |
|
|
|
|
|
|
|
|
|
|
|
## π¨ Use Cases |
|
|
|
|
|
- **Chatbots**: Conversational AI applications |
|
|
- **Assistants**: Task-oriented AI helpers |
|
|
- **Creative Writing**: Content generation and editing |
|
|
- **Education**: Tutoring and explanation |
|
|
- **Coding**: Programming assistance |
|
|
- **Research**: Information synthesis and analysis |
|
|
|
|
|
## β οΈ Limitations |
|
|
|
|
|
- **Context Length**: Limited to 8K tokens |
|
|
- **Training Data**: Cutoff date applies to training data |
|
|
- **Bias**: May reflect biases in training data |
|
|
- **Factual Accuracy**: Should be verified for critical applications |
|
|
|
|
|
|
|
|
## π Acknowledgments |
|
|
|
|
|
- Built with the Curious Architecture Framework |
|
|
- Instruction-tuned for conversational AI |
|
|
- Powered by the Curious Architecture Framework v2.0 |
|
|
|
|
|
--- |
|
|
|
|
|
<div align="center"> |
|
|
<strong>Adam: The Future of Conversational AI</strong><br> |
|
|
<em>Built with β€οΈ using the Curious Architecture Framework</em> |
|
|
</div> |