nano-proofread
Fixes the writing errors a spell-checker can't see: their going to win →
they're going to win, its raining again → it's raining again, the the cat sat →
the cat sat. The mistakes are real words (their/there/they're are all
spelled correctly), so a spell-checker stays silent; which one is right depends on the
surrounding words. nano-proofread is a ~1M-parameter (1,016,960) byte-level transformer that reads
the context and picks the right one.
Scope (a fixed confusion set, not general grammar): their/there/they're,
your/you're, its/it's, then/than, to/too, could have/could of, and doubled
words.
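To make the scope concrete, the confusion set can be written down as plain data. The sketch below is illustrative only: `CONFUSION_SETS`, `TOY_DICTIONARY`, `spellcheck_flags` and `confusion_sites` are hypothetical names, not taken from the repo, and the multi-word pair could have/could of and doubled words are left out for brevity. It shows that a dictionary lookup flags nothing in "their going to win", while the confusion set still marks the positions where context has to decide.

```python
import re

# Hypothetical representation of the single-word confusion sets listed above.
CONFUSION_SETS = [
    {"their", "there", "they're"},
    {"your", "you're"},
    {"its", "it's"},
    {"then", "than"},
    {"to", "too"},
]

# Toy dictionary for the demo; a real spell-checker has a far larger one.
TOY_DICTIONARY = {"going", "win"} | set().union(*CONFUSION_SETS)

def words(text):
    """Lower-case the text and split it into words, keeping apostrophes."""
    return re.findall(r"[a-z']+", text.lower())

def spellcheck_flags(text):
    """Return the words missing from the dictionary (all a spell-checker sees)."""
    return [w for w in words(text) if w not in TOY_DICTIONARY]

def confusion_sites(text):
    """Return (index, word, alternatives) for each word that belongs to a confusion set."""
    sites = []
    for i, w in enumerate(words(text)):
        for group in CONFUSION_SETS:
            if w in group:
                sites.append((i, w, sorted(group - {w})))
    return sites

print(spellcheck_flags("their going to win"))  # []
print(confusion_sites("their going to win"))
# [(0, 'their', ['there', "they're"]), (2, 'to', ['too'])]
```

Choosing among the listed alternatives at each site is the part that needs context, which is what the model is trained for.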
- Code, benchmark, tests, technical report: https://github.com/vukrosic/nano-proofread
- Runs on CPU in milliseconds. No tokenizer file: raw UTF-8 bytes.
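Because the model reads raw UTF-8 bytes, turning text into model input needs no vocabulary file. A minimal sketch, assuming the byte values are used directly as token ids; the repo's `modeling_nano_proofread.py` may add special ids or padding that are not shown here, and `encode` and `decode` are hypothetical names:

```python
def encode(text: str) -> list[int]:
    """Map text to a sequence of byte values in the range 0..255."""
    return list(text.encode("utf-8"))

def decode(ids: list[int]) -> str:
    """Inverse of encode; invalid byte sequences are replaced rather than raised."""
    return bytes(ids).decode("utf-8", errors="replace")

ids = encode("it's")
print(ids)          # [105, 116, 39, 115]
print(decode(ids))  # it's
```

Non-ASCII characters simply take more than one id each, so any input string is representable.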
Benchmark
| | model | best context-free script |
|---|---|---|
| overall (held-out, N=4000) | 100.0% | 49.2% |
| context slice (N=2030) | 100.0% | 0.0% |
| out-of-distribution (N=25) | 92.0% | 36.0% |
The script is 0% on the context slice by construction: it can only emit its default member, which is wrong exactly where context decides. The number that matters is the last row: on 25 natural phrases matching no training template, the model beats the script by 56 points, which indicates it learned the grammatical cue rather than memorising sentences. (An earlier 14-template version scored 99% on a same-template split but failed on real phrases; the frame-based generator plus this OOD test is what keeps the result honest.)
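The 0% figure follows from how a context-free rule works. The sketch below is a rough illustration of such a baseline, not the repo's actual script: it assumes one fixed default per confusion set (the `DEFAULT` choices are hypothetical), so its output for a given word never varies with the sentence.

```python
import re

# Hypothetical defaults: each confusable word maps to one fixed member of its set.
DEFAULT = {
    "their": "their", "there": "their", "they're": "their",
    "its": "its", "it's": "its",
}

def context_free_fix(text: str) -> str:
    """Replace every confusable word with its set's default, ignoring context."""
    def swap(match):
        word = match.group(0)
        return DEFAULT.get(word.lower(), word)
    return re.sub(r"[A-Za-z']+", swap, text)

# When the default happens to be the right member, the rule succeeds...
print(context_free_fix("there dog barked"))    # their dog barked
# ...but where context demands another member, it cannot produce it.
print(context_free_fix("their going to win"))  # their going to win
```

Any example whose correct answer is a non-default member is out of reach for this kind of rule, which is why the context slice isolates what the model adds.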
Usage
pip install torch safetensors numpy
# grab modeling_nano_proofread.py + config.json from the GitHub repo
from modeling_nano_proofread import load, proofread
m = load("model.safetensors", "config.json")
proofread(m, "their going to win") # -> "they're going to win"
proofread(m, "its raining again") # -> "it's raining again"
How it was trained
The training data is 100% code-generated and correct by construction: build a correct phrase from ~65 grammatical frames with rich fillers, then inject one error (swap the confusion word, or double a word); ~15% of examples are identity pairs with no error injected. Training is SFT with the prompt masked out of the loss. The model is a ~1M-param byte-level transformer (RMSNorm, RoPE, GQA, SwiGLU), trained for 24k steps with AdamW and a cosine LR schedule. Full recipe and reproduction are in the GitHub repo.
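The generation recipe can be sketched in a few lines. This is an assumption-laden toy, not the repo's generator: the two frames, the fillers, the `SWAPS` table, the name `make_example` and the even split between swapping and doubling are all hypothetical; only the overall shape (correct phrase first, one injected error, about 15% unchanged) comes from the description above.

```python
import random

# Hypothetical frames and fillers; the real generator uses ~65 frames.
FRAMES = [
    ("{subj} going to {verb}", {"subj": ["they're", "you're"], "verb": ["win", "leave"]}),
    ("{poss} {noun} is here", {"poss": ["their", "your", "its"], "noun": ["dog", "car"]}),
]

# Each confusable word maps to the wrong members it can be swapped with.
SWAPS = {
    "they're": ["their", "there"], "their": ["there", "they're"],
    "you're": ["your"], "your": ["you're"],
    "its": ["it's"],
}

def make_example(rng, identity_rate=0.15):
    """Return (noisy_input, correct_target) for one training pair."""
    template, slots = rng.choice(FRAMES)
    target = template.format(**{k: rng.choice(v) for k, v in slots.items()})
    if rng.random() < identity_rate:
        return target, target  # identity pair: nothing to fix
    tokens = target.split()
    swappable = [i for i, t in enumerate(tokens) if t in SWAPS]
    if swappable and rng.random() < 0.5:  # assumed 50/50 split between error types
        i = rng.choice(swappable)
        tokens[i] = rng.choice(SWAPS[tokens[i]])  # swap the confusion word
    else:
        i = rng.randrange(len(tokens))
        tokens.insert(i, tokens[i])  # double a word
    return " ".join(tokens), target

rng = random.Random(0)
for _ in range(3):
    print(make_example(rng))
```

Because the target is built first and the error is injected afterwards, every pair has a known correct answer without any manual labelling.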
MIT. Built by Vuk Rosić.