Instructions to use docling-project/SmolDocling-256M-preview with libraries, inference providers, notebooks, and local apps. Follow these links to get started.

Libraries

How to use docling-project/SmolDocling-256M-preview with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("image-text-to-text", model="docling-project/SmolDocling-256M-preview")
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"},
            {"type": "text", "text": "What animal is on the candy?"}
        ]
    },
]
pipe(text=messages)

# Load model directly
from transformers import AutoProcessor, AutoModelForImageTextToText

processor = AutoProcessor.from_pretrained("docling-project/SmolDocling-256M-preview")
model = AutoModelForImageTextToText.from_pretrained("docling-project/SmolDocling-256M-preview")
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"},
            {"type": "text", "text": "What animal is on the candy?"}
        ]
    },
]
inputs = processor.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(processor.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

Local Apps

vLLM

How to use docling-project/SmolDocling-256M-preview with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "docling-project/SmolDocling-256M-preview"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "docling-project/SmolDocling-256M-preview",
		"messages": [
			{
				"role": "user",
				"content": [
					{
						"type": "text",
						"text": "Describe this image in one sentence."
					},
					{
						"type": "image_url",
						"image_url": {
							"url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"
						}
					}
				]
			}
		]
	}'

Use Docker

docker model run hf.co/docling-project/SmolDocling-256M-preview

SGLang

How to use docling-project/SmolDocling-256M-preview with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "docling-project/SmolDocling-256M-preview" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "docling-project/SmolDocling-256M-preview",
		"messages": [
			{
				"role": "user",
				"content": [
					{
						"type": "text",
						"text": "Describe this image in one sentence."
					},
					{
						"type": "image_url",
						"image_url": {
							"url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"
						}
					}
				]
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "docling-project/SmolDocling-256M-preview" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "docling-project/SmolDocling-256M-preview",
		"messages": [
			{
				"role": "user",
				"content": [
					{
						"type": "text",
						"text": "Describe this image in one sentence."
					},
					{
						"type": "image_url",
						"image_url": {
							"url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"
						}
					}
				]
			}
		]
	}'

Docker Model Runner
How to use docling-project/SmolDocling-256M-preview with Docker Model Runner:
```
docker model run hf.co/docling-project/SmolDocling-256M-preview
```

is there a max token limit for this? my ocr always seems to end abruptly

by jinoooooooooo - opened Mar 17, 2025

Discussion

jinoooooooooo

Mar 17, 2025

is there a max token limit for this? my ocr always seems to end abruptly

asnassar

Docling org Mar 17, 2025

It's the same as in the original SmolVLM implementation: 8192. If you can share your example, please do.
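
One way to tell whether an output was really cut off by the token budget, rather than by how it is displayed, is to check whether generation stopped on the end-of-sequence token. The helper below is a minimal sketch: `hit_token_limit` is a hypothetical name, and it assumes the ids passed in are the newly generated tokens for a single sequence (prompt already removed) and that the EOS id is taken from the tokenizer, for example `processor.tokenizer.eos_token_id`.

```python
def hit_token_limit(new_token_ids, eos_token_id, max_new_tokens):
    """Return True if generation used the whole budget without emitting EOS.

    new_token_ids: generated token ids for one sequence, prompt removed.
    """
    ids = list(new_token_ids)
    if eos_token_id in ids:
        return False  # the model stopped on its own
    return len(ids) >= max_new_tokens


# Toy ids for illustration only (not real SmolDocling tokens).
print(hit_token_limit([5, 9, 2], eos_token_id=2, max_new_tokens=8192))
print(hit_token_limit([5, 9, 7, 7], eos_token_id=2, max_new_tokens=4))
```

If this returns False for a page, the model finished by itself and the missing text is more likely a display or conversion issue than a length limit.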

jinoooooooooo

Mar 17, 2025

sure. sharing an example with a single image extraction

import torch
from docling_core.types.doc import DoclingDocument
from docling_core.types.doc.document import DocTagsDocument
from transformers import AutoProcessor, AutoModelForVision2Seq
from transformers.image_utils import load_image

DEVICE = "cuda" if torch.cuda.is_available() else "cpu"

# Load images
image = load_image("/content/2.jpg")
# Sample image attached to the post:
# https://cdn-uploads.huggingface.co/production/uploads/61ebdb79592a25e6c39bc13f/CVbjQ6FiFpyWc9zShFins.jpeg

# Initialize processor and model
processor = AutoProcessor.from_pretrained("ds4sd/SmolDocling-256M-preview")
model = AutoModelForVision2Seq.from_pretrained(
    "ds4sd/SmolDocling-256M-preview",
    torch_dtype=torch.bfloat16,
).to(DEVICE)

# Create input messages
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image"},
            {"type": "text", "text": "Convert this page to docling."}
        ]
    },
]

# Prepare inputs
prompt = processor.apply_chat_template(messages, add_generation_prompt=True)
inputs = processor(text=prompt, images=[image], return_tensors="pt")
inputs = inputs.to(DEVICE)

# Generate outputs
generated_ids = model.generate(**inputs, max_new_tokens=8192)
prompt_length = inputs.input_ids.shape[1]
trimmed_generated_ids = generated_ids[:, prompt_length:]
doctags = processor.batch_decode(
    trimmed_generated_ids,
    skip_special_tokens=False,
)[0].lstrip()

# Populate document
doctags_doc = DocTagsDocument.from_doctags_and_image_pairs([doctags], [image])
print(doctags)
# create a docling document
doc = DoclingDocument(name="Document")
doc.load_from_doctags(doctags_doc)

# export as any format
# HTML
# doc.save_as_html(output_file)
# MD
print(doc.export_to_markdown())

jinoooooooooo

Mar 17, 2025

this is a sample image

jinoooooooooo

Mar 17, 2025

sample extraction

jinoooooooooo

Mar 17, 2025

only half of it gets extracted

asnassar

Docling org Mar 17, 2025

I think you just need to resize your terminal; the output is overflowing. You could also save the Markdown output to a text file for inspection!
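
Saving the Markdown to a file, as suggested above, could look like the sketch below. `save_markdown` and the output file name are hypothetical; in the script from this thread the text would come from `doc.export_to_markdown()`, and a sample string stands in for it here so the snippet runs on its own.

```python
import tempfile
from pathlib import Path


def save_markdown(markdown_text, output_path):
    """Write the exported Markdown to disk and return the character count."""
    Path(output_path).write_text(markdown_text, encoding="utf-8")
    return len(markdown_text)


# Stand-in for doc.export_to_markdown(); the file name is an arbitrary choice.
sample = "# Title\n\n| a | b |\n|---|---|\n| 1 | 2 |\n"
out_path = Path(tempfile.gettempdir()) / "smoldocling_output.md"
save_markdown(sample, out_path)
print(out_path)
```

Opening the file in an editor avoids any truncation or wrapping done by the terminal or notebook cell.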

jinoooooooooo

Mar 17, 2025

my bad, i see the whole output now, but the text above the table has been skipped, any idea why this might happen?

asnassar

Docling org Mar 17, 2025

No problem. Actually, this helps catch a bug: it seems the conversion to DoclingDocument didn't populate the caption. The caption is in the prediction, though, so we will make a fix.
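
A quick way to check for this kind of gap is to compare the raw DocTags prediction with the exported Markdown. The sketch below is only illustrative: `find_missing_captions` is a hypothetical helper, and it assumes captions appear in the prediction as `<caption>...</caption>` elements that may carry `<loc_N>` location tokens. The strings are made up; real values would be `doctags` and `doc.export_to_markdown()` from the script above.

```python
import re


def find_missing_captions(doctags, markdown):
    """Return caption texts present in the raw DocTags but absent from the Markdown."""
    captions = re.findall(r"<caption>(.*?)</caption>", doctags, flags=re.DOTALL)
    missing = []
    for raw in captions:
        text = re.sub(r"<loc_\d+>", "", raw).strip()  # drop location tokens
        if text and text not in markdown:
            missing.append(text)
    return missing


# Made-up strings for illustration only.
doctags = (
    "<otsl><loc_10><loc_20><loc_300><loc_400>"
    "<caption><loc_10><loc_5><loc_300><loc_18>Table 1: Sales by region</caption>"
    "<fcel>North<nl></otsl>"
)
markdown = "| North |\n|-------|\n"
print(find_missing_captions(doctags, markdown))
```

An empty list means every caption found in the prediction also shows up in the Markdown.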

jinoooooooooo

Mar 17, 2025

thanks very much!

kasatgaurav

Mar 17, 2025

edited Mar 17, 2025

@jinoooooooooo can you share your notebook setup or script? For my use case, my docs are similar to what you have pasted above, but the results are very bad.

kasatgaurav

Mar 17, 2025

Up to a certain length it works fine; after that, the same part keeps getting repeated. @asnassar
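
When reporting this kind of repetition, it can help to point at where the loop starts. The sketch below is a rough diagnostic under stated assumptions, not part of docling: `first_repeated_line` is a hypothetical helper that only catches the simplest case, where the same non-empty line repeats back to back in the decoded output.

```python
def first_repeated_line(text, min_repeats=3):
    """Return (line_index, line) for the first non-empty line that repeats
    consecutively at least min_repeats times, or None if there is no such run."""
    lines = [ln.strip() for ln in text.splitlines()]
    run_start, run_len = 0, 1
    for i in range(1, len(lines)):
        if lines[i] and lines[i] == lines[i - 1]:
            run_len += 1
            if run_len >= min_repeats:
                return run_start, lines[i]
        else:
            run_start, run_len = i, 1
    return None


# Made-up output for illustration only.
sample_output = "Header\nRow A\nRow B\nRow B\nRow B\nRow B\n"
print(first_repeated_line(sample_output))
```

Loops that span several lines would need a comparison over blocks of lines instead.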

asnassar

Docling org Mar 18, 2025

@jinoooooooooo we fixed the issue. I suggest you update the docling-core package, and it should work now.
@kasatgaurav if possible, please open a separate issue here or on https://github.com/docling-project/docling/issues with an example so we can fix this in the upcoming checkpoint.
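
To confirm which docling-core version is installed before and after updating, a small check like the one below may be useful. It uses only the standard library; `installed_version` is a hypothetical helper name, and the thread does not say which release contains the fix, so no version number is assumed here.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package_name):
    """Return the installed version string, or None if the package is missing."""
    try:
        return version(package_name)
    except PackageNotFoundError:
        return None


print(installed_version("docling-core"))
```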

asnassar changed discussion status to closed Mar 18, 2025
