Bapt120 committed · Commit cdae040 · verified · Parent: 3f41a2c

Update app.py

Files changed (1)
  1. app.py +5 -37
app.py CHANGED
@@ -499,49 +499,17 @@ def get_model_info_text(model_name):
 # Create Gradio interface
 with gr.Blocks(title="LightOnOCR-2 Multi-Model OCR") as demo:
     gr.Markdown(f"""
-# LightOnOCR-2
+# LightOnOCR-2 — Efficient 1B VLM for OCR

-**Efficient end-to-end 1B-parameter vision-language model for OCR**
+State-of-the-art OCR on OlmOCR-Bench, ~9× smaller and faster than competitors. Handles tables, forms, math, multi-column layouts.

-Convert documents (PDFs, scans, images) into clean, naturally ordered text without relying on brittle pipelines. LightOnOCR-2 achieves state-of-the-art performance on OlmOCR-Bench while being ~9× smaller and significantly faster than competing approaches.
+⚡ **3.3× faster** than Chandra, **1.7× faster** than OlmOCR | 💸 **<$0.01/1k pages** | 🧠 End-to-end differentiable | 📏 Bbox variants for image detection

-### Highlights
-
-| | |
-|---|---|
-| ⚡ **Speed** | 3.3× faster than Chandra, 1.7× faster than OlmOCR, 5× faster than dots.ocr |
-| 💸 **Efficiency** | 5.71 pages/s on H100 (~493k pages/day) for **<$0.01 per 1,000 pages** |
-| 🧠 **End-to-End** | Fully differentiable, no external OCR pipeline |
-| 🧾 **Versatile** | Tables, receipts, forms, multi-column layouts, math notation |
-| 📏 **Bbox variants** | Predict bounding boxes for embedded images |
-
-### Resources
-
-[Paper](https://huggingface.co/papers/lightonocr-2) | [Blog Post](https://huggingface.co/blog/lightonai/lightonocr-2) | [Demo](https://huggingface.co/spaces/lightonai/LightOnOCR-2-1B-Demo) | [Dataset](https://huggingface.co/datasets/lightonai/LightOnOCR-mix-0126) | [Finetuning Notebook](https://colab.research.google.com/drive/1WjbsFJZ4vOAAlKtcCauFLn_evo5UBRNa?usp=sharing)
-
-### Model Variants
-
-| Variant | Description |
-|---------|-------------|
-| **[LightOnOCR-2-1B](https://huggingface.co/lightonai/LightOnOCR-2-1B)** | Best OCR model (recommended) |
-| **[LightOnOCR-2-1B-base](https://huggingface.co/lightonai/LightOnOCR-2-1B-base)** | Base model, ideal for fine-tuning |
-| **[LightOnOCR-2-1B-bbox](https://huggingface.co/lightonai/LightOnOCR-2-1B-bbox)** | Best model with image bounding boxes |
-| **[LightOnOCR-2-1B-bbox-base](https://huggingface.co/lightonai/LightOnOCR-2-1B-bbox-base)** | Base bbox model, ideal for fine-tuning |
-| **[LightOnOCR-2-1B-ocr-soup](https://huggingface.co/lightonai/LightOnOCR-2-1B-ocr-soup)** | Merged variant for extra robustness |
-| **[LightOnOCR-2-1B-bbox-soup](https://huggingface.co/lightonai/LightOnOCR-2-1B-bbox-soup)** | Merged variant: OCR + bbox combined |
+📄 [Paper](https://huggingface.co/papers/lightonocr-2) | 📝 [Blog](https://huggingface.co/blog/lightonai/lightonocr-2) | 📊 [Dataset](https://huggingface.co/datasets/lightonai/LightOnOCR-mix-0126) | 📓 [Finetuning](https://colab.research.google.com/drive/1WjbsFJZ4vOAAlKtcCauFLn_evo5UBRNa?usp=sharing)

 ---

-### How to use
-
-1. Select a model (OCR models for text extraction, Bbox models for region detection)
-2. Upload an image or PDF
-3. For PDFs: select which page to extract
-4. Click "Extract Text"
-
-**Note:** Bbox models output cropped regions inline. Check raw output for coordinates.
-
-**Device:** {device.upper()} | **Attention:** {attn_implementation}
+**How to use:** Select a model → Upload image/PDF → Click "Extract Text" | **Device:** {device.upper()} | **Attention:** {attn_implementation}
     """)

     with gr.Row():