Upload folder using huggingface_hub

Files changed (3)

README.md +158 -0
adapter_config.json +46 -0
adapter_model.safetensors +3 -0

README.md ADDED

	@@ -0,0 +1,158 @@

+---
+base_model: allenai/olmOCR-2-7B-1025
+library_name: peft
+pipeline_tag: image-text-to-text
+license: apache-2.0
+language:
+  - ar
+tags:
+  - lora
+  - ocr
+  - arabic
+  - handwriting
+  - transformers
+  - qwen2_5_vl
+---
+# olmOCR Arabic LoRA Adapter
+A LoRA (Low-Rank Adaptation) adapter fine-tuned for Arabic OCR, built on top of [allenai/olmOCR-2-7B-1025](https://huggingface.co/allenai/olmOCR-2-7B-1025).
+## Model Description
+This adapter enhances olmOCR's ability to recognize Arabic text in documents, including:
+- Handwritten Arabic text
+- Printed Arabic documents
+- Mixed Arabic/English documents
+### Training Details
+| Parameter | Value |
+|-----------|-------|
+| Base Model | allenai/olmOCR-2-7B-1025 |
+| LoRA Rank (r) | 16 |
+| LoRA Alpha | 32 |
+| LoRA Dropout | 0.05 |
+| Training Samples | 450,044 |
+| Epochs | 3 |
+| Learning Rate | 2e-5 |
+| Batch Size | 64 (effective) |
+| Hardware | 8x NVIDIA A100 80GB |
+| Training Time | ~36 hours |
+| Trainable Parameters | 47.6M (0.57% of total) |
+### Target Modules
+- `q_proj`, `k_proj`, `v_proj`, `o_proj` (attention)
+- `gate_proj`, `up_proj`, `down_proj` (FFN)
+## Usage
+### Installation
+```bash
+pip install transformers peft torch accelerate pillow
+```
+### Load the Model
+```python
+from transformers import AutoProcessor, Qwen2_5_VLForConditionalGeneration
+from peft import PeftModel
+import torch
+# Load base model
+base_model = Qwen2_5_VLForConditionalGeneration.from_pretrained(
+    "allenai/olmOCR-2-7B-1025",
+    torch_dtype=torch.bfloat16,
+    device_map="auto",
+    trust_remote_code=True,
+)
+# Load LoRA adapter
+model = PeftModel.from_pretrained(base_model, "allenai/olmOCR-arabic-lora")
+# Optional: Merge for faster inference
+model = model.merge_and_unload()
+# Load processor
+processor = AutoProcessor.from_pretrained("allenai/olmOCR-2-7B-1025", trust_remote_code=True)
+```
+### Run Inference
+```python
+from PIL import Image
+# Load your Arabic document image (convert to RGB so palette or alpha PNGs are handled)
+image = Image.open("arabic_document.png").convert("RGB")
+# Create prompt (olmOCR format)
+messages = [
+    {
+        "role": "user",
+        "content": [
+            {"type": "image", "image": image},
+            {"type": "text", "text": "Extract the text from this document."},
+        ],
+    }
+]
+# Process and generate
+text = processor.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
+inputs = processor(text=[text], images=[image], return_tensors="pt", padding=True)
+inputs = {k: v.to(model.device) for k, v in inputs.items()}
+with torch.no_grad():
+    outputs = model.generate(**inputs, max_new_tokens=2048, do_sample=False)
+# Decode output
+result = processor.batch_decode(outputs[:, inputs["input_ids"].shape[1]:], skip_special_tokens=True)[0]
+print(result)
+```
+## Training Data
+The model was fine-tuned on a combined dataset of Arabic OCR samples including:
+- Arabic handwritten documents
+- Printed Arabic text
+- Mixed-script documents
+Total training samples: 450,044
+## Evaluation
+Benchmark results are not yet available; they will be added once evaluation completes.
+Target metrics (goals, not measured results):
+- Word Error Rate (WER): < 10%
+- Character Error Rate (CER): < 5%
+## Limitations
+- Optimized primarily for Arabic script
+- Accuracy may drop on heavily degraded or low-resolution scans
+- Works best with documents scanned at 150 DPI or higher
+## Citation
+If you use this model, please cite:
+```bibtex
+@misc{olmocr-arabic-lora,
+  title={olmOCR Arabic LoRA Adapter},
+  author={Allen Institute for AI},
+  year={2025},
+  publisher={Hugging Face},
+  url={https://huggingface.co/allenai/olmOCR-arabic-lora}
+}
+```
+## License
+Apache 2.0
+### Framework Versions
+- PEFT: 0.18.0
+- Transformers: 4.49+ (needed for `Qwen2_5_VLForConditionalGeneration`)
+- PyTorch: 2.0+
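
The Evaluation section of the README names WER and CER as targets but does not define them. As a minimal sketch, both can be computed as a Levenshtein edit distance divided by the reference length, over characters for CER and over whitespace-separated words for WER. The helper names and the sample strings below are hypothetical, and the README does not say which normalization (diacritics, punctuation) the eventual benchmark will apply, so none is applied here.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or token lists)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # delete a reference item
                curr[j - 1] + 1,     # insert a hypothesis item
                prev[j - 1] + cost,  # substitute (or match when cost is 0)
            ))
        prev = curr
    return prev[-1]


def cer(reference: str, hypothesis: str) -> float:
    """Character Error Rate: character edits / reference characters."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


def wer(reference: str, hypothesis: str) -> float:
    """Word Error Rate: word edits / reference words."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


# Hypothetical example: the OCR output drops one letter from the second word.
reference = "مرحبا بالعالم"
hypothesis = "مرحبا بالعلم"
print(f"CER = {cer(reference, hypothesis):.3f}")
print(f"WER = {wer(reference, hypothesis):.3f}")
```

A single dropped character costs one of 13 reference characters for CER but one of two words for WER, which is why the README's WER target is looser than its CER target.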

adapter_config.json ADDED

	@@ -0,0 +1,46 @@

+{
+  "alora_invocation_tokens": null,
+  "alpha_pattern": {},
+  "arrow_config": null,
+  "auto_mapping": null,
+  "base_model_name_or_path": "allenai/olmOCR-2-7B-1025",
+  "bias": "none",
+  "corda_config": null,
+  "ensure_weight_tying": false,
+  "eva_config": null,
+  "exclude_modules": null,
+  "fan_in_fan_out": false,
+  "inference_mode": true,
+  "init_lora_weights": true,
+  "layer_replication": null,
+  "layers_pattern": null,
+  "layers_to_transform": null,
+  "loftq_config": {},
+  "lora_alpha": 32,
+  "lora_bias": false,
+  "lora_dropout": 0.05,
+  "megatron_config": null,
+  "megatron_core": "megatron.core",
+  "modules_to_save": null,
+  "peft_type": "LORA",
+  "peft_version": "0.18.0",
+  "qalora_group_size": 16,
+  "r": 16,
+  "rank_pattern": {},
+  "revision": null,
+  "target_modules": [
+    "o_proj",
+    "gate_proj",
+    "v_proj",
+    "k_proj",
+    "up_proj",
+    "q_proj",
+    "down_proj"
+  ],
+  "target_parameters": null,
+  "task_type": "CAUSAL_LM",
+  "trainable_token_indices": null,
+  "use_dora": false,
+  "use_qalora": false,
+  "use_rslora": false
+}
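
To make the config above easier to read, here is a small sketch that derives the effective LoRA scaling factor from `r`, `lora_alpha`, and `use_rslora`, and checks that `target_modules` matches the list in the README. It uses only the standard library and a trimmed copy of the JSON. The `lora_scaling` helper is hypothetical; it mirrors the usual LoRA convention in which the low-rank update is multiplied by `lora_alpha / r` (or `lora_alpha / sqrt(r)` for rank-stabilized LoRA).

```python
import json
import math

# Trimmed copy of the fields from adapter_config.json that matter here.
CONFIG_TEXT = """
{
  "r": 16,
  "lora_alpha": 32,
  "lora_dropout": 0.05,
  "use_rslora": false,
  "target_modules": ["o_proj", "gate_proj", "v_proj", "k_proj",
                     "up_proj", "q_proj", "down_proj"]
}
"""

# Module names as listed under "Target Modules" in the README.
README_MODULES = {
    "q_proj", "k_proj", "v_proj", "o_proj",   # attention
    "gate_proj", "up_proj", "down_proj",      # FFN
}


def lora_scaling(cfg: dict) -> float:
    """Factor applied to the low-rank update B @ A before adding it to W."""
    if cfg.get("use_rslora", False):
        return cfg["lora_alpha"] / math.sqrt(cfg["r"])
    return cfg["lora_alpha"] / cfg["r"]


cfg = json.loads(CONFIG_TEXT)
print("scaling:", lora_scaling(cfg))
print("modules match README:", set(cfg["target_modules"]) == README_MODULES)
```

With `r = 16` and `lora_alpha = 32` the scaling is 2.0, so the config agrees with the rank and alpha values in the README table.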

adapter_model.safetensors ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:b4c33569c7072adebb9484b5a23636a9538d91d00d8729c5bc11f1ebe9b6f9a0
+size 190442760
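
The pointer's `size` field can be cross-checked against the README's "47.6M trainable parameters". The sketch below parses the pointer and estimates the LoRA parameter count as `r * (d_in + d_out)` per adapted linear layer. The layer dimensions are assumptions taken from the publicly documented Qwen2.5-VL-7B architecture and do not appear in this commit. Two further assumptions: the adapter is stored as float32, and the vision tower contributes only its MLP projections, because its attention layers do not use the `q_proj`/`k_proj`/`v_proj`/`o_proj` names.

```python
POINTER_TEXT = """version https://git-lfs.github.com/spec/v1
oid sha256:b4c33569c7072adebb9484b5a23636a9538d91d00d8729c5bc11f1ebe9b6f9a0
size 190442760
"""


def parse_lfs_pointer(text: str) -> dict:
    """Parse the 'key value' lines of a Git LFS pointer file."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    fields["size"] = int(fields["size"])
    return fields


def lora_params(d_in: int, d_out: int, r: int) -> int:
    """Parameters in one LoRA pair: A is (r x d_in), B is (d_out x r)."""
    return r * (d_in + d_out)


R = 16  # from adapter_config.json

# ASSUMED language-model dimensions (Qwen2.5-VL-7B).
LM_LAYERS, LM_HIDDEN, LM_KV, LM_FFN = 28, 3584, 512, 18944
# ASSUMED vision-tower dimensions (Qwen2.5-VL-7B).
VIS_LAYERS, VIS_HIDDEN, VIS_FFN = 32, 1280, 3420

lm_per_layer = (
    lora_params(LM_HIDDEN, LM_HIDDEN, R)    # q_proj
    + lora_params(LM_HIDDEN, LM_KV, R)      # k_proj
    + lora_params(LM_HIDDEN, LM_KV, R)      # v_proj
    + lora_params(LM_HIDDEN, LM_HIDDEN, R)  # o_proj
    + lora_params(LM_HIDDEN, LM_FFN, R)     # gate_proj
    + lora_params(LM_HIDDEN, LM_FFN, R)     # up_proj
    + lora_params(LM_FFN, LM_HIDDEN, R)     # down_proj
)
# gate_proj, up_proj, down_proj each have the same d_in + d_out.
vis_per_layer = 3 * lora_params(VIS_HIDDEN, VIS_FFN, R)

total_params = LM_LAYERS * lm_per_layer + VIS_LAYERS * vis_per_layer
pointer = parse_lfs_pointer(POINTER_TEXT)

print(f"estimated LoRA parameters: {total_params:,}")
print(f"estimated float32 payload: {total_params * 4:,} bytes")
print(f"LFS pointer size:          {pointer['size']:,} bytes")
```

Under these assumptions the estimate is 47,589,376 parameters, about 190.36 MB in float32. The pointer reports 190,442,760 bytes, and the roughly 85 KB difference is plausibly the safetensors header. This agreement is a consistency check, not confirmation of how the adapter was trained.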