KodaLite-1.3B - MLX (fp16)

MLX version of YoAbriel/KodaLite-1.3B, optimized for Apple Silicon (M1/M2/M3/M4).

Size: ~2.5 GB | Precision: bfloat16
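
The size figure follows from the parameter count: a 16-bit weight takes 2 bytes. The sketch below is a rough check only. It uses the 1.27B parameter count quoted under Limitations, and it assumes every tensor is stored at 16 bits and that file headers and tokenizer files are negligible. `weight_size_gb` is a hypothetical helper, not part of mlx-lm.

```python
# Back-of-the-envelope weight size for a 16-bit checkpoint.
# Assumption: every tensor is stored at 2 bytes per parameter
# (fp16 or bfloat16); safetensors headers and tokenizer files are ignored.

def weight_size_gb(num_params: float, bytes_per_param: int = 2) -> float:
    """Approximate weight size in decimal gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9

size = weight_size_gb(1.27e9)
print(f"~{size:.2f} GB")  # ~2.54 GB
```

Unified memory use at inference time will be somewhat higher than this, since the KV cache and activations come on top of the weights.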

Usage

Install the package:

pip install mlx-lm

Then load the model and generate from Python:

from mlx_lm import load, generate

model, tok = load("YoAbriel/KodaLite-1.3B-mlx")
prompt = tok.apply_chat_template(
    [{"role": "user", "content": "What is the capital of France?"}],
    tokenize=False,
    add_generation_prompt=True,
)
print(generate(model, tok, prompt=prompt, max_tokens=80))

Or from the command line:

mlx_lm.generate --model YoAbriel/KodaLite-1.3B-mlx \
  --prompt "<|user|>\nHello\n<|assistant|>\n" --max-tokens 80
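
The prompt string in the command above suggests how a chat turn is laid out. As a minimal sketch, assuming that layout is all the template does, the same string can be built in plain Python. `format_chat` is a hypothetical helper written for illustration. The authoritative template ships with the tokenizer and may add system prompts or end-of-turn tokens, so prefer `apply_chat_template` where possible.

```python
# Hypothetical helper: rebuilds the "<|user|> ... <|assistant|>" layout
# seen in the CLI prompt. Assumption: each turn is "<|role|>\ncontent\n".
# The real chat template lives in the tokenizer config and may differ.

def format_chat(messages: list[dict[str, str]], add_generation_prompt: bool = True) -> str:
    """Concatenate chat turns, optionally ending with an open assistant turn."""
    parts = [f"<|{m['role']}|>\n{m['content']}\n" for m in messages]
    if add_generation_prompt:
        # Leave the assistant turn open so the model continues from here.
        parts.append("<|assistant|>\n")
    return "".join(parts)

prompt = format_chat([{"role": "user", "content": "Hello"}])
print(repr(prompt))  # '<|user|>\nHello\n<|assistant|>\n'
```

Building the string in Python also avoids relying on how the shell and the CLI each treat `\n` inside a quoted argument.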

Other quantizations

Limitations

Small model (1.27B params), undertrained (1.64B tokens). See the base model card for full details.
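
To put "undertrained" in numbers, the two figures above give the tokens seen per parameter. The comparison value of roughly 20 tokens per parameter is an assumption brought in from the commonly cited compute-optimal rule of thumb (Chinchilla). It is a heuristic for scale, not a claim made by this card.

```python
# Tokens seen per parameter, using the figures stated on this card.
params = 1.27e9  # parameter count
tokens = 1.64e9  # training tokens

tokens_per_param = tokens / params

# Assumption: ~20 tokens per parameter is a widely quoted compute-optimal
# rule of thumb; it is a rough heuristic, not a hard threshold.
RULE_OF_THUMB = 20
fraction_of_heuristic = tokens_per_param / RULE_OF_THUMB

print(f"{tokens_per_param:.2f} tokens per parameter")            # 1.29
print(f"{fraction_of_heuristic:.1%} of the ~20 tokens/param heuristic")  # 6.5%
```

At about 1.3 tokens per parameter, expect weak factual recall and frequent errors compared with models of similar size trained on far more data.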

License

Apache 2.0
