NovatasticRoScript/Atomight-V2.1-0.5B-Inference

@@ -1,73 +1,167 @@
----
-base_model: unsloth/qwen2.5-0.5b-instruct-unsloth-bnb-4bit
-library_name: transformers
-model_name: results
-tags:
-- generated_from_trainer
-- trl
-- grpo
-- unsloth
-licence: license
-license: mit
-datasets:
-- bespokelabs/Bespoke-Stratos-17k
-language:
-- en
 ---
-# Model Card for results
-This model is a fine-tuned version of [unsloth/qwen2.5-0.5b-instruct-unsloth-bnb-4bit](https://huggingface.co/unsloth/qwen2.5-0.5b-instruct-unsloth-bnb-4bit).
-It has been trained using [TRL](https://github.com/huggingface/trl).
-## Quick start
-```python
-from transformers import pipeline
-question = "If you had a time machine, but could only go to the past or the future once and never return, which would you choose and why?"
-generator = pipeline("text-generation", model="NovatasticRoScript/results", device="cuda")
-output = generator([{"role": "user", "content": question}], max_new_tokens=128, return_full_text=False)[0]
-print(output["generated_text"])
-```
-## Training procedure
-This model was trained with GRPO, a method introduced in [DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models](https://huggingface.co/papers/2402.03300).
-### Framework versions
-- TRL: 1.5.0
-- Transformers: 5.9.0
-- Pytorch: 2.10.0
-- Datasets: 4.8.5
-- Tokenizers: 0.22.2
-## Citations
-Cite GRPO as:
-```bibtex
-@article{shao2024deepseekmath,
-    title        = {{DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models}},
-    author       = {Zhihong Shao and Peiyi Wang and Qihao Zhu and Runxin Xu and Junxiao Song and Mingchuan Zhang and Y. K. Li and Y. Wu and Daya Guo},
-    year         = 2024,
-    eprint       = {arXiv:2402.03300},
-}
-```
-Cite TRL as:
-```bibtex
-@software{vonwerra2020trl,
-  title   = {{TRL: Transformers Reinforcement Learning}},
-  author  = {von Werra, Leandro and Belkada, Younes and Tunstall, Lewis and Beeching, Edward and Thrush, Tristan and Lambert, Nathan and Huang, Shengyi and Rasul, Kashif and Gallouédec, Quentin},
-  license = {Apache-2.0},
-  url     = {https://github.com/huggingface/trl},
-  year    = {2020}
-}
-```

+**Atomight-V2.1-0.5B-Inference**
+Atomight-V2.1-0.5B-Inference is a compact reasoning-oriented language model developed under the Atomight ecosystem. Built on a Qwen-derived foundation and refined using GRPO-based reinforcement tuning, the model focuses on efficient reasoning, structured outputs, coding capability, and lightweight deployment.
+Despite its small ~0.5B parameter footprint, Atomight-V2.1 demonstrates competitive performance against other small language models across reasoning and commonsense benchmarks.
 ---
+Overview
+- Model Name: **Atomight-V2.1-0.5B-Inference**
+- Parameters: ~494M
+- Architecture Base: Qwen-derived causal language model
+- Training Method: GRPO reinforcement training
+- Primary Focus:
+  - Reasoning
+  - Lightweight inference
+  - Coding capability
+  - Structured responses
+  - Efficient deployment
+---
+Training Datasets
+Atomight-V2.1 was trained using a curated mix of public reasoning and instruction datasets, including:
+- GSM8K (2000 samples)
+- HumanEval
+- MMLU (2000 samples)
+- ARC-Challenge (AI2 ARC)
+- Bespoke-Stratos-17k (4000 curated samples)
+The training philosophy emphasized:
+- high-signal reasoning samples,
+- compact capability transfer,
+- and reinforcement-based refinement over massive-scale brute-force training.
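The card does not describe the reward functions or group size used during GRPO training. As a rough illustration of the core idea only, the sketch below computes group-relative advantages: several completions are sampled for one prompt, each gets a scalar reward, and each reward is normalized against the group's mean and standard deviation. The function name, the example rewards, and the use of the population standard deviation are assumptions made for illustration.

```python
from statistics import mean, pstdev
from typing import List


def group_relative_advantages(rewards: List[float], eps: float = 1e-6) -> List[float]:
    """Normalize each reward against the other completions sampled for the same prompt."""
    group_mean = mean(rewards)
    group_std = pstdev(rewards)  # population std; an illustrative choice
    # eps keeps the division finite when every completion earns the same reward
    return [(reward - group_mean) / (group_std + eps) for reward in rewards]


# Hypothetical rewards for four completions of one prompt (1.0 = correct answer).
example_rewards = [1.0, 0.0, 0.0, 1.0]
print(group_relative_advantages(example_rewards))  # values close to [1, -1, -1, 1]
```

Completions that beat their group's average receive a positive advantage and are reinforced; when all completions score the same, the advantages are zero and the group contributes no learning signal.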
+---
+Benchmark Results
+Official evaluation was performed using the **EleutherAI LM Evaluation Harness**.
+| Benchmark | Score |
+|---|---|
+| *ARC-Easy* | **59.3%** |
+| *HellaSwag* | **52.4%** |
+| *ARC-Challenge* | **33.8%** |
+| *GSM8K (Flexible Extract)* | **32.5%** |
+| *GSM8K (Strict)* | **19.8%** |
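The two GSM8K scores differ in how the final answer is pulled out of a completion. The sketch below illustrates the distinction under simplified assumptions: strict matching accepts an answer only when it follows the `#### <number>` convention, while flexible extraction takes the last number found anywhere in the text. The regular expressions and function names here are illustrative and are not the harness's exact filters.

```python
import re
from typing import Optional

# Simplified patterns (assumptions for illustration, not the harness's exact filters).
STRICT_PATTERN = re.compile(r"#### (-?[0-9.,]+)")
NUMBER_PATTERN = re.compile(r"-?\d[\d,]*(?:\.\d+)?")


def normalize(number_text: str) -> str:
    """Drop thousands separators and a trailing period so '1,000.' compares equal to '1000'."""
    return number_text.replace(",", "").rstrip(".")


def strict_extract(completion: str) -> Optional[str]:
    """Return the answer only if it is written in the '#### <number>' format."""
    match = STRICT_PATTERN.search(completion)
    return normalize(match.group(1)) if match else None


def flexible_extract(completion: str) -> Optional[str]:
    """Return the last number that appears anywhere in the completion."""
    numbers = NUMBER_PATTERN.findall(completion)
    return normalize(numbers[-1]) if numbers else None


free_form = "She sells 9 eggs at 2 dollars each, so 9 * 2 = 18. The answer is 18."
formatted = "9 * 2 = 18\n#### 18"

print(strict_extract(free_form), flexible_extract(free_form))
print(strict_extract(formatted), flexible_extract(formatted))
```

A completion that reaches the right number but omits the expected format counts under flexible extraction and fails under strict matching, which is one plausible reading of the gap between the two rows.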
+Comparative Notes
+Compared against small language models of similar or larger size:
+- Competitive with **Qwen2.5-0.5B-Instruct**
+- Competitive with **Llama-3.2-1B-Instruct** on selected reasoning benchmarks
+- Strongest performance observed in:
+  - commonsense reasoning,
+  - structured inference,
+  - and challenge-style QA
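A like-for-like comparison needs the same harness tasks run against each model. The sketch below only builds the `lm_eval` command lines and does not launch them. The Hugging Face repo IDs, the batch size, and the helper name are assumptions, and the card does not state the few-shot settings used for the reported scores, so none are set here.

```python
import shlex
from typing import List, Sequence

# Harness task names matching the benchmarks reported in this card.
TASKS = ["arc_easy", "hellaswag", "arc_challenge", "gsm8k"]

# Assumed repo IDs for this model and the two baselines named above.
MODELS = [
    "NovatasticRoScript/Atomight-V2.1-0.5B-Inference",
    "Qwen/Qwen2.5-0.5B-Instruct",
    "meta-llama/Llama-3.2-1B-Instruct",
]


def build_eval_command(model_id: str, tasks: Sequence[str] = TASKS, batch_size: int = 8) -> List[str]:
    """Build an lm_eval invocation for one model; batch_size is an arbitrary example value."""
    return [
        "lm_eval",
        "--model", "hf",
        "--model_args", f"pretrained={model_id}",
        "--tasks", ",".join(tasks),
        "--batch_size", str(batch_size),
    ]


for model_id in MODELS:
    print(shlex.join(build_eval_command(model_id)))
```

Using identical tasks and settings for every model keeps the comparison from being skewed by differences in prompting or answer extraction.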
+---
+Example
+def is_palindrome(string: str) -> bool:
+    """Returns True if the string reads the same backward as forward, ignoring case and non-alphanumeric characters."""
+    cleaned_string = ''.join(
+        char.lower() for char in string
+        if char.isalnum()
+    )
+    return cleaned_string == cleaned_string[::-1]
+---
+Intended Use
+Atomight-V2.1 is designed for:
+- Lightweight local inference
+- Experimental reasoning systems
+- Educational AI research
+- Small-scale coding assistants
+- Mobile/cloud deployment workflows
+- Efficient fine-tuning experiments
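For lightweight local inference, a minimal sketch along the lines of the previous card's quick start might look like the following. It assumes the weights are published under the repo ID shown in this page's path and that they load through the standard `transformers` text-generation pipeline; the helper names are hypothetical.

```python
from typing import Dict, List

# Assumed repo ID, taken from this page's repository path.
MODEL_ID = "NovatasticRoScript/Atomight-V2.1-0.5B-Inference"


def build_messages(question: str) -> List[Dict[str, str]]:
    """Wrap a question in the chat-message format that the pipeline accepts."""
    return [{"role": "user", "content": question}]


def generate(question: str, max_new_tokens: int = 128) -> str:
    """Run one chat turn; downloads the model weights on first use."""
    # Imported lazily so the module can be loaded without transformers installed.
    from transformers import pipeline

    generator = pipeline("text-generation", model=MODEL_ID)
    output = generator(
        build_messages(question),
        max_new_tokens=max_new_tokens,
        return_full_text=False,
    )[0]
    return output["generated_text"]
```

Calling `generate("Write a Python function that checks whether a string is a palindrome.")` would return the model's reply as a string. Passing `device="cuda"` to `pipeline`, as the previous card did, is an option when a GPU is available.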
+---
+Limitations
+This is still a compact 0.5B-scale language model and has several limitations:
+- Weakness in advanced multi-step arithmetic
+- Inconsistent scientific reasoning on harder benchmarks
+- Occasional verbose reasoning outputs
+- Hallucinations remain possible
+- Not suitable for high-stakes applications
+---
+Future Roadmap
+Planned future Atomight developments include:
+- Improved tokenizer optimization
+- Specialist teacher-model distillation
+- UltraMath / UltraCode / UltraThink training branches
+- Hybrid SFT + GRPO pipelines
+- Enhanced reasoning alignment
+- Lightweight deployment optimization
+---
+Hardware & Workflow
+Atomight models are developed using a lightweight mobile-first workflow involving:
+- Google Colab
+- Kaggle
+- Hugging Face ecosystem tooling
+This project explores how far compact open models can be pushed under constrained compute environments.
+---
+License
+Please refer to the base model license and dataset licenses before commercial or derivative use.
+---
+Acknowledgements
+Special thanks to:
+- Qwen
+- DeepSeek
+- Hugging Face
+- EleutherAI
+- Open-source AI research community
+---
+Atomight Ecosystem
+Current and planned projects include:
+- Atomight-V2.x
+- Atomight UltraMath
+- Atomight UltraCode
+- Atomight UltraThink
+- AtomightDepict-0.4B-Pixels
+---
+Citation
+@misc{atomight_v21,
+  title={Atomight-V2.1-0.5B-Inference},
+  author={NovatasticRoScript},
+  year={2026},
+  publisher={Hugging Face}
+}