"model_type": "lighton_ocr"
"model_type" is not "mistral3"
ValueError: The checkpoint you are trying to load has model type lighton_ocr
but Transformers does not recognize this architecture.
"lighton_ocr" is now supported by transformers==5.0.0rc3
It still fails with an error:
(APIServer pid=56139) File "/mnt/work/miniconda3/envs/vllm/lib/python3.11/site-packages/vllm/multimodal/registry.py", line 261, in _create_processing_info
(APIServer pid=56139) return factories.info(ctx)
(APIServer pid=56139) ^^^^^^^^^^^^^^^^^^^
(APIServer pid=56139) File "/mnt/work/miniconda3/envs/vllm/lib/python3.11/site-packages/vllm/model_executor/models/mistral3.py", line 342, in _build_mistral3_info
(APIServer pid=56139) hf_config = ctx.get_hf_config(Mistral3Config)
(APIServer pid=56139) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=56139) File "/mnt/work/miniconda3/envs/vllm/lib/python3.11/site-packages/vllm/multimodal/processing.py", line 1132, in get_hf_config
(APIServer pid=56139) raise TypeError(
(APIServer pid=56139) TypeError: Invalid type of HuggingFace config. Expected type: <class 'transformers.models.mistral3.configuration_mistral3.Mistral3Config'>, but found type: <class 'transformers.models.lighton_ocr.configuration_lighton_ocr.LightOnOcrConfig'>
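The traceback above ends in a type check: the code path for the Mistral3 model asks for a `Mistral3Config` and receives a `LightOnOcrConfig` instead. A minimal sketch of that kind of check, using stand-in classes and a hypothetical `get_hf_config` helper (this is not vLLM's actual implementation, only an illustration of why the error message reads the way it does):

```python
class Mistral3Config:
    """Stand-in for the config class the serving code expects."""


class LightOnOcrConfig:
    """Stand-in for the config class produced when model_type is "lighton_ocr"."""


def get_hf_config(config, expected_type):
    # Hypothetical helper: reject any config that is not an instance of the
    # class the model implementation was written against.
    if not isinstance(config, expected_type):
        raise TypeError(
            "Invalid type of HuggingFace config. "
            f"Expected type: {expected_type}, but found type: {type(config)}"
        )
    return config


def try_load(config):
    # Returns the error text instead of raising, so the outcome is easy to inspect.
    try:
        get_hf_config(config, Mistral3Config)
    except TypeError as exc:
        return str(exc)
    return "ok"


if __name__ == "__main__":
    print(try_load(Mistral3Config()))
    print(try_load(LightOnOcrConfig()))
```

Under this reading, changing "model_type" in config.json changes which config class Transformers builds, so the serving side also needs a code path that accepts the new class.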
(vllm) admin@8-n01:/mnt/work/xiakj/hf$ pip list|grep transformers
transformers 5.0.0rc3
(vllm) admin@8-n01:/mnt/work/xiakj/hf$ vi LightOnOCR-2-1B/config.json
(vllm) admin@8-n01:/mnt/work/xiakj/hf$ cat LightOnOCR-2-1B/config.json |grep lighton_ocr
"model_type": "lighton_ocr",
(vllm) admin@8-n01:/mnt/work/xiakj/hf$ cat LightOnOCR-2-1B/config.json
{
"architectures": [
"LightOnOCRForConditionalGeneration"
],
"dtype": "bfloat16",
"eos_token_id": 151645,
"image_token_id": 151655,
"model_type": "lighton_ocr",
"multimodal_projector_bias": false,
Oh... vLLM hasn't caught up with transformers==5.0.0rc3 yet. And I've just found another bug in transformers==5.0.0rc3. Sorry, but it is too early for us to change "model_type" to "lighton_ocr"; I will wait for the official releases of transformers 5.0.0 and vLLM.
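For anyone editing config.json locally in the meantime, a small pre-flight check can help. The sketch below reads "model_type" from the config text and compares an installed Transformers version string against the first version said to recognize it. The `MINIMUM_VERSIONS` table is an assumption taken from this thread's claim about 5.0.0rc3, and `version_key`, `read_model_type`, and `can_load` are hypothetical helper names, not part of any library:

```python
import json
import re

# Assumption from this thread: "lighton_ocr" is recognized from transformers 5.0.0rc3.
MINIMUM_VERSIONS = {"lighton_ocr": "5.0.0rc3"}


def version_key(version):
    """Turn a string such as '5.0.0rc3' into a sortable tuple.

    Only handles 'X.Y.Z', 'X.Y.ZrcN' and 'X.Y.Z.devN'; dev builds sort before
    release candidates, which sort before the final release.
    """
    match = re.match(r"(\d+)\.(\d+)\.(\d+)(?:\.?(rc|dev)(\d+))?", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    major, minor, patch, tag, number = match.groups()
    if tag == "dev":
        stage = (-1, int(number))
    elif tag == "rc":
        stage = (0, int(number))
    else:
        stage = (1, 0)
    return (int(major), int(minor), int(patch), stage)


def read_model_type(config_text):
    """Return the "model_type" field from the text of a config.json file."""
    return json.loads(config_text).get("model_type")


def can_load(model_type, installed_version, minimum_versions):
    """True if the installed version is at least the listed minimum for model_type."""
    minimum = minimum_versions.get(model_type)
    if minimum is None:
        return False  # unknown model type: nothing in the table vouches for it
    return version_key(installed_version) >= version_key(minimum)


if __name__ == "__main__":
    sample = '{"architectures": ["LightOnOCRForConditionalGeneration"], "model_type": "lighton_ocr"}'
    model_type = read_model_type(sample)
    print(model_type, can_load(model_type, "5.0.0rc3", MINIMUM_VERSIONS))
```

This only covers the Transformers side. As the traceback in this thread shows, the serving engine must also accept the resulting config class, so passing this check does not mean `vllm serve` will start.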