Claro 4B
Claro is a fine-tuned Gemma 3 4B Instruct that rewrites complex English at CEFR A2 (elementary) level while preserving the source's facts. Trained on Apple Silicon with MLX (LoRA), via SFT followed by RL (GSPO) against a decomposed, mostly-deterministic reward.
Format note: this is an MLX model (converted base: mlx-community/gemma-3-4b-it-bf16). It loads with mlx_lm on Apple Silicon. It is not a transformers/PyTorch checkpoint. The repo also ships the LoRA adapter under adapter/ for applying on top of the base yourself.
Usage (MLX)
The model expects its chat template with the training system prompt — a raw prompt string will make it ramble. Replicate the training invocation:
```python
from mlx_lm import load, generate

model, tok = load("miguelconner4/claro")

SYSTEM = ("Rewrite the user's text in CEFR A2 (Elementary English): short simple "
          "sentences, basic vocabulary, no idioms. Keep all important facts. "
          "Output only the rewritten text.")

complex_text = "The edifice, constructed circa 1750, was subsequently designated a historic landmark."

prompt = tok.apply_chat_template(
    [{"role": "system", "content": SYSTEM},
     {"role": "user", "content": complex_text}],
    tokenize=False, add_generation_prompt=True,
)

print(generate(model, tok, prompt=prompt, max_tokens=512, verbose=False))
# -> "The building was built around 1750. People decided it was important history."
```
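For longer inputs, one option is to rewrite one paragraph at a time, since the training pairs were paragraphs. The sketch below only builds the chat messages and does not call the model. The helper names (`build_messages`, `split_paragraphs`, `build_batch`) and the blank-line splitting rule are assumptions made for illustration, not part of the repo.

```python
import re

# Training system prompt, copied from the usage snippet above.
SYSTEM = ("Rewrite the user's text in CEFR A2 (Elementary English): short simple "
          "sentences, basic vocabulary, no idioms. Keep all important facts. "
          "Output only the rewritten text.")


def build_messages(text: str) -> list[dict[str, str]]:
    """Return the system + user message pair for one paragraph."""
    return [
        {"role": "system", "content": SYSTEM},
        {"role": "user", "content": text},
    ]


def split_paragraphs(document: str) -> list[str]:
    """Split on blank lines (assumed paragraph boundary) and drop empty pieces."""
    pieces = re.split(r"\n\s*\n", document)
    return [piece.strip() for piece in pieces if piece.strip()]


def build_batch(document: str) -> list[list[dict[str, str]]]:
    """One message list per paragraph, in document order."""
    return [build_messages(paragraph) for paragraph in split_paragraphs(document)]
```

Each message list can then go through `tok.apply_chat_template(..., tokenize=False, add_generation_prompt=True)` and `generate` exactly as in the snippet above.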
How it was trained
SFT on ~1,500 (complex → A2) paragraph pairs distilled from a frontier model over random Wikipedia paragraphs, filtered by LLM judges.
RL (200 iters, group size 8, GSPO sequence-level importance sampling, KL β=0.1) from the SFT checkpoint, against a cardinal multiplicative reward:
reward = level_band × vocab × fidelity × format_gates (each ∈ [0,1])
- level_band: deterministic A2 difficulty, based on readability (Flesch), mean sentence length, and passive and subordination density, with bands calibrated to the 10th–90th percentiles of real A2 reference texts.
- vocab: penalty for off-A2-list words, with a gloss-aware exemption (defining a hard term in-line is not penalized).
- fidelity: LLM judge, decomposed into fact-level recall and hallucination counts (not a holistic score).
- format_gates: hard pass/fail for markdown or degenerate loops.
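The multiplicative structure can be sketched in a few lines. This is an illustration under stated assumptions, not the training code: only the product of four components in [0, 1] comes from the description above. The `band_score` helper, its linear falloff outside the band, and the example band limits are hypothetical.

```python
import math


def band_score(value: float, low: float, high: float) -> float:
    """Score a feature against a calibrated band.

    Returns 1.0 inside [low, high]. Outside the band the score falls off
    linearly and reaches 0.0 one band-width away (assumed shape).
    """
    width = high - low
    if width <= 0:
        raise ValueError("band must satisfy low < high")
    if low <= value <= high:
        return 1.0
    distance = (low - value) if value < low else (value - high)
    return max(0.0, 1.0 - distance / width)


def reward(level_band: float, vocab: float, fidelity: float, format_gate: float) -> float:
    """Multiply the four components; each must already lie in [0, 1]."""
    components = (level_band, vocab, fidelity, format_gate)
    if not all(0.0 <= c <= 1.0 for c in components):
        raise ValueError("every reward component must be in [0, 1]")
    return math.prod(components)


# Hypothetical example: mean sentence length of 12 words against an assumed band of 8 to 14.
example = reward(band_score(12, 8, 14), vocab=0.9, fidelity=0.8, format_gate=1.0)
```

Because the components are multiplied rather than summed, a failed format gate (0.0) zeroes the whole reward, and a high score on one axis cannot compensate for a low score on another.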
Evaluation (30 held-out Wikipedia paragraphs)
On 30 held-out Wikipedia paragraphs, Claro's rewrites land at ~70% CEFR A2 (most of the rest A1; only ~7% drift up to the harder B1), per a DeepSeek mode-of-3 CEFR classifier. That is reliably simpler than the model it was fine-tuned from, with fewer too-hard outputs. Faithfulness is preserved: source-fact recall stays ~0.98, and a strict hand-audit (counting only real contradictions and fabricated facts, not paraphrase or omission) found ~3–4 genuine errors across the 30 paragraphs, indistinguishable from the baseline and consistent across three independent judge families (Haiku, GPT-4o, Gemini). So the GSPO step delivered a real gain in simplicity at no measurable cost to accuracy. The few remaining errors are subtle: dropped qualifiers or reversed relations (e.g. "younger"→"older sister").
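As a rough illustration of how a mode-of-3 label and the resulting level shares can be computed, here is a minimal sketch. The function names, the tie-breaking rule, and the vote data are assumptions for illustration only; they are not the actual evaluation harness or its results.

```python
from collections import Counter


def mode_of_votes(votes: list[str]) -> str:
    """Most common label among the votes.

    Ties go to the label seen first (an assumption; Counter.most_common
    keeps first-encountered order for equal counts).
    """
    return Counter(votes).most_common(1)[0][0]


def level_distribution(per_output_votes: list[list[str]]) -> dict[str, float]:
    """Share of outputs assigned to each CEFR level after majority voting."""
    labels = [mode_of_votes(votes) for votes in per_output_votes]
    counts = Counter(labels)
    return {level: count / len(labels) for level, count in counts.items()}


def fact_recall(recalled_facts: int, source_facts: int) -> float:
    """Fraction of source facts preserved; defined as 1.0 when there are none."""
    return recalled_facts / source_facts if source_facts else 1.0


# Made-up votes for four outputs, three classifier calls each.
votes = [
    ["A2", "A2", "B1"],
    ["A1", "A1", "A2"],
    ["A2", "A2", "A2"],
    ["B1", "A2", "B1"],
]
distribution = level_distribution(votes)
```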
Limitations
- MLX-only (Apple Silicon). No PyTorch/transformers weights provided.
- Evaluated at n=30; CEFR classification is genuinely noisy at the A2/B1 boundary (judges agree with a strict reference only ~50% of the time there). Treat the numbers as ±10pp.
- ~1 in 10 outputs carries a genuine fidelity slip (≈3–4 per 30 in our audit). The dominant mode is subtle attribute/relation errors (e.g. "younger"↔"older sister", a dropped qualifier), not wholesale fabrication.
- English-only; tuned on encyclopedic prose. Out-of-domain text (dialogue, code, poetry) is untested.
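To put the n=30 caveat in perspective, the sampling error of a proportion can be estimated with the usual binomial standard error. This is a back-of-the-envelope sketch that covers sampling noise only, not classifier disagreement. The function name is hypothetical.

```python
import math


def proportion_standard_error(p: float, n: int) -> float:
    """Binomial standard error of an observed proportion p over n samples."""
    if not 0.0 <= p <= 1.0 or n <= 0:
        raise ValueError("need 0 <= p <= 1 and n > 0")
    return math.sqrt(p * (1.0 - p) / n)


# ~70% of outputs at A2, measured on 30 paragraphs.
se = proportion_standard_error(0.70, 30)
print(f"one standard error: {se * 100:.1f} pp")
```

At these values one standard error is roughly 8 percentage points, so sampling noise alone is of the same order as the ±10pp reading suggested above.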
License
Derivative of Google's Gemma 3; use is governed by the Gemma Terms of Use and the Gemma Prohibited Use Policy, which carry over to this model.