nid-ocr-vit-xlmroberta
This model is a fine-tuned version of microsoft/trocr-base-stage1 on an unknown dataset. At the final logged evaluation (step 4500, epoch 1.4414), it achieves the following results on the evaluation set:
- Loss: 8.1543
- Cer (character error rate): 0.9993
- Wer (word error rate): 1.0
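For context, CER and WER are usually computed as the Levenshtein edit distance between prediction and reference, divided by the reference length in characters or words. The sketch below assumes that standard definition; the card does not say which implementation produced the numbers above, and the helper names `edit_distance`, `cer`, and `wer` are illustrative only. Under this definition, a value near 1.0 means the prediction needs about as many edits as the reference has units, and values above 1.0 are possible when the prediction is much longer than the reference.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or lists of words)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        curr = [i]
        for j, h in enumerate(hyp, 1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,         # deletion
                            curr[j - 1] + 1,     # insertion
                            prev[j - 1] + cost)) # substitution or match
        prev = curr
    return prev[-1]


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate; assumes a non-empty reference."""
    return edit_distance(reference, hypothesis) / len(reference)


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate on whitespace-split tokens; assumes a non-empty reference."""
    ref_words = reference.split()
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)
```

For example, `cer("ab", "abcdef")` divides four insertions by a reference length of two, which is how an error rate can exceed 1.0.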
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 9e-06
- train_batch_size: 16
- eval_batch_size: 32
- seed: 42
- distributed_type: multi-GPU
- gradient_accumulation_steps: 4
- total_train_batch_size: 64
- optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: cosine
- lr_scheduler_warmup_steps: 1000
- num_epochs: 1000
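To make the schedule concrete, here is a minimal sketch of the learning rate implied by these settings. It assumes the usual linear-warmup-then-cosine shape with a single half cycle, which is my reading of `lr_scheduler_type: cosine` and not something the card spells out. `STEPS_PER_EPOCH` is an estimate derived from the results table in this card (step 4500 at epoch 1.4414), not a logged value, and the function name `lr_at` is illustrative.

```python
import math

BASE_LR = 9e-6          # learning_rate
WARMUP_STEPS = 1000     # lr_scheduler_warmup_steps
NUM_EPOCHS = 1000       # num_epochs
STEPS_PER_EPOCH = 3122  # assumption: estimated as round(4500 / 1.4414) from the results table
TOTAL_STEPS = NUM_EPOCHS * STEPS_PER_EPOCH


def lr_at(step: int) -> float:
    """Learning rate at an optimizer step: linear warmup, then cosine decay to zero."""
    if step < WARMUP_STEPS:
        return BASE_LR * step / WARMUP_STEPS
    progress = (step - WARMUP_STEPS) / max(1, TOTAL_STEPS - WARMUP_STEPS)
    return BASE_LR * 0.5 * (1.0 + math.cos(math.pi * progress))


for step in (0, 500, 1000, 4500, TOTAL_STEPS // 2, TOTAL_STEPS):
    print(f"step {step:>8}: lr = {lr_at(step):.3e}")
```

Under these assumptions the planned run is roughly three million steps, so the cosine decay has barely started by step 4500 and the learning rate there is still effectively the 9e-06 peak.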
Training results
| Training Loss | Epoch | Step | Validation Loss | Cer | Wer |
|---|---|---|---|---|---|
| No log | 0 | 0 | 27.5340 | 3.6546 | 4.6053 |
| 7.6216 | 0.0801 | 250 | 7.6858 | 1.0 | 1.0 |
| 6.5641 | 0.1602 | 500 | 7.1390 | 1.0 | 1.0 |
| 6.2881 | 0.2402 | 750 | 6.9612 | 1.0 | 1.0 |
| 6.1757 | 0.3203 | 1000 | 6.8755 | 1.0 | 1.0 |
| 6.0908 | 0.4004 | 1250 | 6.8165 | 1.0 | 1.0 |
| 6.0241 | 0.4805 | 1500 | 6.7642 | 1.0 | 1.0 |
| 5.9962 | 0.5605 | 1750 | 6.7470 | 1.0 | 1.0 |
| 5.9477 | 0.6406 | 2000 | 6.7396 | 1.0 | 1.0 |
| 5.8893 | 0.7207 | 2250 | 6.8078 | 1.0 | 1.0 |
| 5.8618 | 0.8008 | 2500 | 6.9414 | 1.0 | 1.0 |
| 5.7779 | 0.8808 | 2750 | 7.1235 | 1.0 | 1.0 |
| 5.7539 | 0.9609 | 3000 | 7.2403 | 1.0 | 1.0 |
| 5.6984 | 1.0410 | 3250 | 7.5010 | 1.0 | 1.0 |
| 5.6579 | 1.1211 | 3500 | 7.9130 | 1.0 | 1.0 |
| 5.5958 | 1.2012 | 3750 | 8.1207 | 1.0 | 1.0 |
| 5.5785 | 1.2812 | 4000 | 8.3609 | 0.9998 | 1.0 |
| 5.5205 | 1.3613 | 4250 | 8.2284 | 0.9989 | 1.0 |
| 5.4865 | 1.4414 | 4500 | 8.1543 | 0.9993 | 1.0 |
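The table can be summarized with a few lines of Python. This is only a reading aid over the logged values above, not part of the original training code; the variable names are illustrative, and the step-0 row is left out because it has no training loss.

```python
# (step, training_loss, validation_loss), copied from the results table.
LOG = [
    (250, 7.6216, 7.6858), (500, 6.5641, 7.1390), (750, 6.2881, 6.9612),
    (1000, 6.1757, 6.8755), (1250, 6.0908, 6.8165), (1500, 6.0241, 6.7642),
    (1750, 5.9962, 6.7470), (2000, 5.9477, 6.7396), (2250, 5.8893, 6.8078),
    (2500, 5.8618, 6.9414), (2750, 5.7779, 7.1235), (3000, 5.7539, 7.2403),
    (3250, 5.6984, 7.5010), (3500, 5.6579, 7.9130), (3750, 5.5958, 8.1207),
    (4000, 5.5785, 8.3609), (4250, 5.5205, 8.2284), (4500, 5.4865, 8.1543),
]

# Checkpoint with the lowest validation loss.
best_step, _, best_val_loss = min(LOG, key=lambda row: row[2])

# Does the training loss fall at every logged step?
train_losses = [row[1] for row in LOG]
train_loss_always_falls = all(a > b for a, b in zip(train_losses, train_losses[1:]))

# Gap between validation and training loss at the last logged step.
final_step, final_train, final_val = LOG[-1]
final_gap = final_val - final_train

print(f"lowest validation loss {best_val_loss} at step {best_step}")
print(f"training loss falls at every logged step: {train_loss_always_falls}")
print(f"validation minus training loss at step {final_step}: {final_gap:.4f}")
```

Validation loss bottoms out at step 2000 and mostly rises afterwards while training loss keeps falling, and Cer and Wer stay at or near 1.0 throughout, so none of the logged checkpoints produces usable transcriptions on the evaluation set.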
Framework versions
- Transformers 4.54.1
- PyTorch 2.7.1+cu126
- Datasets 4.5.0
- Tokenizers 0.21.4