SimpleAI-259M / README.md

suraj-self

cosmatic changes

9a74bcc 3 months ago

metadata

title: SimpleAI-259M
emoji: ⚡
colorFrom: indigo
colorTo: gray
sdk: docker
pinned: false
license: mit
short_description: A compact, general-purpose LLM for reasoning and logic.

⚡ SimpleAI-259M

SimpleAI-259M is a compact, general-purpose Large Language Model (LLM). It is the result of a targeted Supervised Fine-Tuning (SFT) run focused on improving reasoning, numeracy, and character-level precision.

🚀 SFT Training Report (Step 971)

Final Loss: 1.0419
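
For a rough sense of scale, the loss can be converted to perplexity. The report does not state how the loss is computed, so this minimal sketch assumes it is a mean per-token cross-entropy in nats; if it is measured differently, the conversion does not apply.

```python
import math

# Final SFT loss reported above (step 971).
final_loss = 1.0419

# Assumption (not stated in the report): the loss is the mean per-token
# cross-entropy in nats, in which case perplexity = exp(loss).
perplexity = math.exp(final_loss)

print(f"perplexity ~ {perplexity:.2f}")
```

Under that assumption the value comes out at roughly 2.83.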

📊 Benchmark Performance

| Category | Score | Status |
| --- | --- | --- |
| ARC-Easy | 35.19% | 📈 Reasoning Gain |
| MMLU | 30.96% | ✅ General Knowledge |
| GSM8K (Math) | 12.50% | 🚀 Numeracy Breakthrough |
| SpellingBee | 100.00% | 🏆 Perfect Character Accuracy |

🔮 Future Roadmap: SimpleAI Series

SimpleAI-D12-v2: Enhanced dataset targeting sub-1.0 training loss.
SimpleAI-D24: A deeper 24-layer variant for multi-step logical deduction.
SimpleAI-Omni: Multimodal integration for cross-modal reasoning.

🧑‍💻 Usage

The model uses special chat tags to delimit conversation turns:

<|user_start|> / <|user_end|>
<|assistant_start|>
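
The tags above might be combined into a single-turn prompt as sketched below. The helper name `build_prompt` is hypothetical, and the exact layout (no whitespace between tags, no extra leading token) is an assumption; only the three tag strings come from this README.

```python
# Tag strings as listed in the Usage section.
USER_START = "<|user_start|>"
USER_END = "<|user_end|>"
ASSISTANT_START = "<|assistant_start|>"


def build_prompt(user_message: str) -> str:
    """Wrap one user message in the chat tags and open the assistant turn.

    Hypothetical helper: the layout is assumed, not confirmed by the README.
    """
    return f"{USER_START}{user_message}{USER_END}{ASSISTANT_START}"


prompt = build_prompt("What is 12 + 30?")
print(prompt)
```

The prompt ends with the assistant tag so that generation would continue as the assistant's reply. Check the tokenizer shipped with the model before relying on this layout.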