| Type | Description | Examples |
| --- | --- | --- |
| **integer** | An integer number. | 12, 0, -4 |
| **number** | Any number, including floating point or integers. | 3.14, -9.1, 0 |
| **string** | A general string; can be abstractive or deduced from reasoning. | Hello World, any string |
| **verbatim-string** | Strictly extractive from input; preserves all characters (accents, emojis) but normalizes whitespace/tabs to a single space. | John Doe, 1120 Santa Monica Boulevard |
| **date** | ISO 8601 compliant. Supports reduced precision (YYYY-MM, YYYY, --MM-DD) and week dates (YYYY-Www). | 2024-01-15, 2024-01, --12-25 |
| **time** | ISO 8601 compliant. Supports reduced precision and timezone offsets (+hh:mm). | 14:30:57, 18:01, 14:30:45.123Z |
| **date-time** | ISO 8601 compliant (YYYY-MM-DDThh:mm:ss.s+hh:mm). Can omit components if only date or time is present. | 2024-03-14T14:45:00, 2023-05-15T14 |
| **duration** | ISO 8601 duration (PnYnMnDTnHnMnS). "P3W" (weeks) cannot be combined with other date components. | P2Y1M3D, PT1M30S, P3W |
| **boolean** | A logic value of true or false. | true, false |
| **country** | Uppercase 2-character ISO 3166-1 country code. | FR, SG, KR |
| **currency** | Uppercase 3-character ISO 4217 code. Covers current and historic currencies. | EUR, USD, DEM |
| **language** | Lowercase 3-character ISO 639-3 language code. | eng, fra, cos |
| **language-tag** | IETF BCP 47 / RFC 5646 tag. Includes language, script (opt), region (opt), and variants. | en-US, zh-Hans-CN, sl-rozaj |
| **script** | Titlecase 4-character ISO 15924 script code. | Latn, Kore, Deva |
| **url** | RFC 3987 IRI. Supports Unicode characters, schemes (http, ftp), and Punycode for domain names. | https://例子.测试/路径, ftp://user@host/file.txt |
| **email-address** | RFC 5322/6531 compliant. Supports internationalized characters in local and domain parts. | firstname.lastname@example.com, 用户@例子.公司 |
| **phone-number** | E.164 compliant if region is known (e.g., +1...); otherwise, extracted as a raw digit string. | +33612345678, 6505550123 |
| **iban** | ISO 13616-1 International Bank Account Number. Structure varies by country. | DE89370400440532013000 |
| **bic** | ISO 9362 Business Identifier Code (8 or 11 characters). | BNPAFRPPXXX, DEUTDEDBFRA |
| **unit-code** | UCUM (Unified Code for Units of Measure) code. | m, kg, s, Hz |
| **region:US** | Uppercase subdivision code complying with ISO 3166-2:US. | NY, DC, GU |
| **region:FR** | Uppercase subdivision code complying with ISO 3166-2:FR. | 49 (Maine-et-Loire), MQ (Martinique), V (Rhône-Alpes) |
| **region:IE** | Uppercase subdivision code complying with ISO 3166-2:IE. | D (Dublin), C (Connacht), WD (Waterford) |
| **region:GB** | Uppercase subdivision code complying with ISO 3166-2:GB. | WSX (West Sussex), WSM (Westminster), WIL (Wiltshire) |
| **region:IT** | Uppercase subdivision code complying with ISO 3166-2:IT. | RM (Rome), BZ (Bolzano), 82 (Sicily) |
| **region:ES** | Uppercase subdivision code complying with ISO 3166-2:ES. | GA (Galicia), GR (Granada), ML (Melilla) |
| **region:DE** | Uppercase subdivision code complying with ISO 3166-2:DE. | BY (Bayern), BE (Berlin), HH (Hamburg) |
| **region:PT** | Uppercase subdivision code complying with ISO 3166-2:PT. | 11 (Lisbon), 20 (Azores) |
| **region:CA** | Uppercase subdivision code complying with ISO 3166-2:CA. | QC (Quebec), NU (Nunavut), YT (Yukon) |
| **region:MX** | Uppercase subdivision code complying with ISO 3166-2:MX. | JAL (Jalisco), DIF (Distrito Federal), AGU (Aguascalientes) |
| **region:BR** | Uppercase subdivision code complying with ISO 3166-2:BR. | RJ (Rio de Janeiro), DF (Distrito Federal), SP (São Paulo) |
| **region:AU** | Uppercase subdivision code complying with ISO 3166-2:AU. | NSW (New South Wales), VIC (Victoria), ACT (Australian Capital Territory) |
| **region:JP** | Uppercase subdivision code complying with ISO 3166-2:JP. | 13 (Tokyo), 27 (Osaka), 01 (Hokkaidō) |
| **region:KR** | Uppercase subdivision code complying with ISO 3166-2:KR. | 11 (Seoul), 26 (Busan), 41 (Gyeonggi) |
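
To make a few of the descriptions in the table concrete, here is a minimal Python sketch of how a downstream consumer might normalize or sanity-check extracted values. It is not part of NuExtract: the helper names (`normalize_verbatim`, `is_iso_duration`, `is_valid_iban`) are hypothetical, the checks are deliberately simplified, and trimming leading/trailing whitespace in `normalize_verbatim` is an assumption that the table does not state.

```python
import re

# Hypothetical helpers for illustration only; not NuExtract's own validation.

def normalize_verbatim(text: str) -> str:
    """Collapse each run of whitespace (spaces, tabs, newlines) into one space.

    Trimming the ends is an assumption; the table only mentions collapsing.
    """
    return re.sub(r"\s+", " ", text).strip()


# ISO 8601 duration: either weeks alone (PnW) or PnYnMnDTnHnMnS with at
# least one component. Weeks cannot be mixed with other components.
_DURATION = re.compile(
    r"P(?:\d+W"
    r"|(?=\d|T\d)(?:\d+Y)?(?:\d+M)?(?:\d+D)?"
    r"(?:T(?=\d)(?:\d+H)?(?:\d+M)?(?:\d+(?:\.\d+)?S)?)?)"
)


def is_iso_duration(value: str) -> bool:
    """Return True if value looks like a duration in the shape described above."""
    return _DURATION.fullmatch(value) is not None


_IBAN_SHAPE = re.compile(r"[A-Z]{2}\d{2}[A-Z0-9]{1,30}")


def is_valid_iban(value: str) -> bool:
    """Check the general IBAN shape and the mod-97 checksum.

    Country-specific lengths are not verified in this sketch.
    """
    if _IBAN_SHAPE.fullmatch(value) is None:
        return False
    rearranged = value[4:] + value[:4]
    # Map digits to themselves and letters to 10..35 (A=10, ..., Z=35).
    digits = "".join(str(int(ch, 36)) for ch in rearranged)
    return int(digits) % 97 == 1


if __name__ == "__main__":
    print(normalize_verbatim("1120   Santa\tMonica  Boulevard"))
    print(is_iso_duration("P2Y1M3D"), is_iso_duration("P3W2D"))
    print(is_valid_iban("DE89370400440532013000"))
```

Checks like these are best run after extraction rather than treated as a replacement for the type itself: a value that fails validation may point to a misread document rather than a formatting issue.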