---
license: mit
tags:
- pytorch
- gpt2
- text-generation
- fin-ai
- experimental
- in-training
- from-scratch
- automated-training
language:
- en
datasets:
- wikitext
- roneneldan/TinyStories
- openai/gsm8k
- squad
- imdb
- ag_news
- yelp_review_full
- cnn_dailymail
- billsum
- commonsense_qa
- hellaswag
- winogrande
- boolq
- race
- stanfordnlp/coqa
- allenai/c4
- Skylion007/openwebtext
- trivia_qa
- hotpot_qa
- microsoft/ms_marco
- duorc
- amazon_polarity
- zeroshot/twitter-financial-news-sentiment
- sciq
- quail
- wiki_qa
- paws
- medical_questions_pairs
- app_reviews
- rotten_tomatoes
metrics:
- perplexity
library_name: pytorch
pipeline_tag: text-generation
---
# 🤖 Fin.AI v2.0

**⚠️ EXPERIMENTAL MODEL - Training from scratch**
[GitHub](https://github.com/MeridianAlgo/FinAI) • [Training Logs](https://wandb.ai/meridianalgo-meridianalgo/fin-ai) • [Report Issue](https://github.com/MeridianAlgo/FinAI/issues)
---
## 🚨 Important Notice
**This model is training from scratch and outputs will be gibberish initially.**
- 🔴 **Brand new model** - Starting from random weights
- ⏳ **Training time needed**: 2-4 weeks for basic coherence
- 🤖 **Automated training**: Every 1 hour 10 minutes via GitHub Actions
- 📊 **Current quality**: Expect complete nonsense initially
- 🎯 **Purpose**: Research/experimental continuous learning
---
## 📊 Model Overview
| Specification | Value |
|--------------|-------|
| **Architecture** | GPT-2 style Transformer |
| **Parameters** | 30,142,848 (~30M) |
| **Layers** | 6 |
| **Attention Heads** | 6 |
| **Embedding Dimension** | 384 |
| **Feed-Forward Dimension** | 1,536 |
| **Max Sequence Length** | 512 tokens |
| **Vocabulary Size** | 50,257 (GPT-2 tokenizer) |
| **Position Encoding** | Rotary (RoPE) |
| **Activation** | GELU |
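
For reference, the table above maps onto a configuration object roughly like the one below. This is a minimal sketch; the `FinAIConfig` class and its field names are illustrative, and the authoritative values live in the repository's `config.json` and `fin_ai.model`.

```python
from dataclasses import dataclass

@dataclass
class FinAIConfig:
    """Hyperparameters from the table above (field names are illustrative)."""
    n_layers: int = 6           # transformer blocks
    n_heads: int = 6            # attention heads per block
    embed_dim: int = 384        # hidden / embedding size
    ff_dim: int = 1536          # feed-forward inner size (4 * embed_dim)
    max_seq_len: int = 512      # context window in tokens
    vocab_size: int = 50257     # GPT-2 BPE vocabulary
    activation: str = "gelu"
    pos_encoding: str = "rope"  # rotary position embeddings

config = FinAIConfig()
assert config.embed_dim % config.n_heads == 0  # head_dim = 384 / 6 = 64
```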
---
## 🎯 Training Details
### Training Schedule
- **Frequency**: Every 1 hour 10 minutes (~20 cycles/day)
- **Steps per cycle**: 800 steps
- **Daily steps**: ~16,500 steps
- **Weekly steps**: ~115,200 steps
- **Batch size**: 8 (effective: 32 with gradient accumulation)
- **Learning rate**: 3e-4 with cosine decay
- **Warmup steps**: 100
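The schedule above (linear warmup for 100 steps, cosine decay over the remaining steps, and 8-sample micro-batches accumulated to an effective batch of 32) can be sketched as follows. The `lr_at` helper is illustrative and not the project's actual trainer code.

```python
import math

def lr_at(step, max_steps=800, warmup_steps=100, peak_lr=3e-4, min_lr=0.0):
    """Linear warmup to peak_lr, then cosine decay over the rest of the cycle."""
    if step < warmup_steps:
        return peak_lr * (step + 1) / warmup_steps
    progress = (step - warmup_steps) / max(1, max_steps - warmup_steps)
    return min_lr + 0.5 * (peak_lr - min_lr) * (1.0 + math.cos(math.pi * progress))

# 8-sample micro-batches accumulated 4x -> effective batch of 32 per optimizer step
micro_batch_size = 8
grad_accum_steps = 4
effective_batch_size = micro_batch_size * grad_accum_steps  # 32
```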
### Training Infrastructure
- **Platform**: GitHub Actions (free tier)
- **Hardware**: CPU only
- **Training time**: ~15-20 minutes per cycle
- **Automatic upload**: To Hugging Face after each cycle
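The per-cycle upload can be done with the `huggingface_hub` client. A minimal sketch, assuming an `HF_TOKEN` is available in the CI environment; the local `./checkpoints/` path is a placeholder, not the repository's actual layout.

```python
from huggingface_hub import HfApi

api = HfApi()  # reads HF_TOKEN from the environment in CI

# Push the latest checkpoint and config after a training cycle finishes
for filename in ["model.pt", "config.json"]:
    api.upload_file(
        path_or_fileobj=f"./checkpoints/{filename}",  # local path is illustrative
        path_in_repo=filename,
        repo_id="MeridianAlgo/Fin.AI",
        commit_message="Automated training cycle upload",
    )
```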
### Datasets (30 total, rotating each cycle)
The model trains on a diverse set of 30 datasets, cycling through one per training cycle (see the rotation sketch after this list):
**📚 Knowledge & Reference**
- WikiText-2, OpenWebText, C4
**✍️ Creative Writing**
- TinyStories
**📰 News & Articles**
- CNN/DailyMail, AG News, BillSum
**❓ Question Answering**
- SQuAD, CoQA, TriviaQA, HotpotQA, MS MARCO, WikiQA, QuAIL
**🧠 Reasoning & Logic**
- GSM8K (math), CommonsenseQA, HellaSwag, WinoGrande, BoolQ
**📖 Reading Comprehension**
- RACE, DuoRC
**💬 Reviews & Sentiment**
- IMDB, Yelp, Amazon Polarity, Rotten Tomatoes, App Reviews
**🔬 Scientific & Medical**
- SciQ, Medical Questions
**💰 Financial**
- Twitter Financial News
**🔄 Paraphrase & Similarity**
- PAWS
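
A minimal sketch of the rotation logic, assuming a simple round-robin over the dataset list. The `DATASETS` entries shown are a subset and the `dataset_for_cycle` helper is illustrative; the full list and the actual selection rule live in the GitHub repository.

```python
from datasets import load_dataset

# Subset of the corpora listed above; the full rotation lives in the training repo
DATASETS = [
    ("wikitext", "wikitext-2-raw-v1"),
    ("roneneldan/TinyStories", None),
    ("squad", None),
    ("imdb", None),
    # ... remaining entries omitted here
]

def dataset_for_cycle(cycle_index):
    """Pick one corpus per training cycle, round-robin over the list."""
    name, config_name = DATASETS[cycle_index % len(DATASETS)]
    return load_dataset(name, config_name, split="train")
```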
---
## 📈 Training Progress
### Current Status
- **Version**: v2.0.0
- **Training started**: December 28, 2024
- **Model type**: fresh_init
- **Total parameters**: 30,142,848
### Expected Timeline
| Week | Expected Quality | Description |
|------|-----------------|-------------|
| 1 | 🔴 Gibberish | Random weights, no coherence |
| 2 | 🟠 Patterns | Some token patterns emerging |
| 3-4 | 🟡 Basic | Simple word sequences |
| 5-8 | 🟢 Improving | Short coherent phrases |
| 9-12 | 🔵 Decent | Usable for simple tasks |
### Monitoring
- **GitHub Actions**: [View Training Runs](https://github.com/MeridianAlgo/FinAI/actions)
- **Wandb Dashboard**: [View Metrics](https://wandb.ai/meridianalgo-meridianalgo/fin-ai)
- **Model Updates**: This page updates automatically
---
## 💻 Usage
### Installation
```bash
pip install torch transformers huggingface-hub
```
### Download Model
```python
from huggingface_hub import hf_hub_download
import os
# Create directory
os.makedirs("./fin_ai_model", exist_ok=True)
# Download model files
hf_hub_download("MeridianAlgo/Fin.AI", "model.pt", local_dir="./fin_ai_model")
hf_hub_download("MeridianAlgo/Fin.AI", "config.json", local_dir="./fin_ai_model")
```
### Generate Text (Experimental)
```python
from fin_ai.model import FinAIModel
import torch
from transformers import AutoTokenizer

# Load model
model = FinAIModel.from_pretrained("./fin_ai_model")
model.eval()

# Load tokenizer (the model uses the standard GPT-2 vocabulary)
tokenizer = AutoTokenizer.from_pretrained("gpt2")

# Generate text (expect poor quality initially)
input_text = "Once upon a time"
input_ids = tokenizer.encode(input_text, return_tensors="pt")

with torch.no_grad():
    output = model.generate(
        input_ids,
        max_length=50,
        temperature=0.8,
        top_p=0.9,
        do_sample=True,
    )

generated_text = tokenizer.decode(output[0])
print(generated_text)

# Note: Output quality is poor initially and improves over weeks of training
```
---
## 🔬 Technical Details
### Architecture Improvements (v2.0)
Compared to v1.x:
- ✅ **3x more parameters** (10M → 30M)
- ✅ **Better architecture** (4 layers → 6 layers)
- ✅ **Larger embeddings** (256 → 384 dimensions)
- ✅ **More attention heads** (4 → 6 heads)
- ✅ **Improved training** (600 → 800 steps/cycle)
### Training Configuration
```yaml
model:
  size_preset: "small"
  n_layers: 6
  n_heads: 6
  embed_dim: 384
  ff_dim: 1536
  max_seq_len: 512

training:
  batch_size: 8
  gradient_accumulation_steps: 4
  learning_rate: 3.0e-4
  weight_decay: 0.01
  warmup_steps: 100
  max_steps: 800
```
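
A minimal sketch of reading this configuration in Python with PyYAML; the `config.yaml` file name and access pattern are illustrative, not the project's loader.

```python
import yaml  # pip install pyyaml

with open("config.yaml") as f:  # file name is illustrative
    cfg = yaml.safe_load(f)

print(cfg["model"]["n_layers"], cfg["model"]["embed_dim"])  # 6 384
print(cfg["training"]["batch_size"] * cfg["training"]["gradient_accumulation_steps"])  # 32
```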
---
## 📊 Evaluation
### Metrics Tracked
- **Training Loss**: Cross-entropy loss
- **Perplexity**: exp(loss)
- **Tokens/Second**: Training throughput
- **Learning Rate**: Cosine schedule with warmup
- **Gradient Norm**: For stability monitoring
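
A minimal sketch of how these metrics can be derived from a single training step. The `step_metrics` helper is illustrative (not the project's logging code) and assumes `loss` is the mean cross-entropy returned by the forward pass and that `backward()` has already been called.

```python
import time
import torch

def step_metrics(model, loss, tokens_in_batch, step_start_time):
    """Derive the tracked metrics from one training step (illustrative)."""
    perplexity = torch.exp(loss.detach())  # perplexity = exp(mean cross-entropy)
    grads = [p.grad for p in model.parameters() if p.grad is not None]
    grad_norm = torch.norm(torch.stack([g.norm() for g in grads])) if grads else torch.tensor(0.0)
    tokens_per_sec = tokens_in_batch / (time.time() - step_start_time)
    return {
        "loss": loss.item(),
        "perplexity": perplexity.item(),
        "grad_norm": grad_norm.item(),
        "tokens_per_sec": tokens_per_sec,
    }
```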
### Benchmarks (Coming Soon)
Once the model reaches basic coherence, we'll evaluate on:
- HellaSwag (common sense)
- LAMBADA (reading comprehension)
- WikiText perplexity
- Custom generation quality tests
---
## ⚠️ Limitations
1. **Early Training**: Model is in very early training stages
2. **Output Quality**: Expect gibberish for several weeks
3. **CPU Training**: Slower than GPU training
4. **Small Model**: 30M parameters is relatively small
5. **Limited Context**: 512 token context window
6. **No Fine-tuning**: Base model only, not instruction-tuned
7. **English Only**: Trained primarily on English text
---
## 🤝 Contributing
This is an open research project! Contributions welcome:
- **Code**: [GitHub Repository](https://github.com/MeridianAlgo/FinAI)
- **Issues**: [Report Problems](https://github.com/MeridianAlgo/FinAI/issues)
- **Discussions**: [Join Discussion](https://github.com/MeridianAlgo/FinAI/discussions)
---
## 📜 License
MIT License - See [LICENSE](https://github.com/MeridianAlgo/FinAI/blob/main/LICENSE)
---
## 🔗 Links
- **Repository**: https://github.com/MeridianAlgo/FinAI
- **Training Logs**: https://wandb.ai/meridianalgo-meridianalgo/fin-ai
- **GitHub Actions**: https://github.com/MeridianAlgo/FinAI/actions
- **Issues**: https://github.com/MeridianAlgo/FinAI/issues
---
**Last Updated**: 2025-12-28 17:54 UTC
**Status**: 🔴 Training from Scratch
**Quality**: ⚠️ Expect Gibberish (2-4 weeks needed)