Text Generation
MLX
Safetensors
English
rodan-modern
rodan
tiny-language-model
apple-silicon
byte-bpe
Instructions to use bfuzzy1/Rodan-Base with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- MLX
How to use bfuzzy1/Rodan-Base with MLX:
    # Make sure mlx-lm is installed
    # pip install --upgrade mlx-lm
    # if on a CUDA device, also pip install mlx[cuda]

    # Generate text with mlx-lm
    from mlx_lm import load, generate

    model, tokenizer = load("bfuzzy1/Rodan-Base")
    prompt = "Once upon a time in"
    text = generate(model, tokenizer, prompt=prompt, verbose=True)

- Notebooks
- Google Colab
- Kaggle
- Local Apps
- LM Studio
- MLX LM
How to use bfuzzy1/Rodan-Base with MLX LM:
Generate or start a chat session
    # Install MLX LM
    uv tool install mlx-lm

    # Generate some text
    mlx_lm.generate --model "bfuzzy1/Rodan-Base" --prompt "Once upon a time"
Upload README.md with huggingface_hub
README.md
CHANGED
@@ -166,8 +166,7 @@ v9 confirmed the ~11M ceiling and that PLE was dead weight, but since it didn't
 base. From here the work moves to the capability stages (chat, reasoning).

 What the model is actually like: it holds up well for 11M on commonsense and science multiple-choice. SciQ
-(67.5)
-above random. Arithmetic has crept off the random floor (ArithMark 26.4) thanks to the folded-in computation
+(67.5), PIQA (56.0), ARC-Easy (35.6), HellaSwag (31.8), and COPA (55.0) are all clearly above random. Arithmetic has crept off the random floor (ArithMark 26.4) thanks to the folded-in computation
 data, though it's a modest lift and actually generating arithmetic is still weak. On the harder abstract
 reasoning tasks (Winogrande, CommonsenseQA, ARC-Challenge, OpenBookQA) and on open-ended generation it's near
 chance, partly a capacity ceiling at this size and partly loglikelihood length-bias. It's a solid base for
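
To make "above random" concrete, here is a minimal sketch that compares the scores quoted in the changed README line against the chance level for each task. The scores come from the text; the number of answer options per task is an assumption based on the standard versions of these benchmarks (ARC-Easy is treated as 4-way, although a small number of its questions have 3 or 5 options), and the helper names are hypothetical. ArithMark is left out because its answer format is not stated here.

```python
# Accuracy scores (percent) quoted in the README text.
scores = {
    "SciQ": 67.5,
    "PIQA": 56.0,
    "ARC-Easy": 35.6,
    "HellaSwag": 31.8,
    "COPA": 55.0,
}

# ASSUMPTION: answer options per question in the standard benchmark versions.
# These counts are not stated in the README.
num_choices = {
    "SciQ": 4,
    "PIQA": 2,
    "ARC-Easy": 4,
    "HellaSwag": 4,
    "COPA": 2,
}


def chance_accuracy(n_choices: int) -> float:
    """Expected accuracy (percent) of uniform random guessing."""
    return 100.0 / n_choices


def margin_over_chance(task: str) -> float:
    """Percentage points by which the quoted score exceeds random guessing."""
    return scores[task] - chance_accuracy(num_choices[task])


margins = {task: round(margin_over_chance(task), 1) for task in scores}

for task, margin in margins.items():
    baseline = chance_accuracy(num_choices[task])
    print(f"{task}: {scores[task]:.1f} vs chance {baseline:.1f} ({margin:+.1f} points)")
```

Under these assumptions every listed task sits above its chance level, with margins ranging from 5.0 points (COPA) to 42.5 points (SciQ).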