---
title: OmniFile AI Processor
emoji: 🧠
colorFrom: blue
colorTo: green
sdk: docker
app_port: 7860
pinned: false
license: mit
---
# 🧠 OmniFile AI Processor v4.3.0

**A Comprehensive AI System for File Processing, Text Analysis & Handwriting Recognition**

[![Python 3.10+](https://img.shields.io/badge/Python-3.10%2B-blue.svg)](https://python.org)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](LICENSE)
[![CI Tests](https://img.shields.io/github/actions/workflow/status/DrAbdulmalek/OmniFile_Processor/ci.yml?branch=main&label=CI%20Tests)](https://github.com/DrAbdulmalek/OmniFile_Processor/actions/workflows/ci.yml)
[![HF Spaces](https://img.shields.io/badge/🤗-HuggingFace%20Spaces-orange)](https://huggingface.co/spaces/DrAbdulmalek/handwriting-ocr)
[![GitHub](https://img.shields.io/badge/GitHub-DrAbdulmalek-181717?logo=github)](https://github.com/DrAbdulmalek/OmniFile_Processor)

Version: v4.3.0  |  Status: ✅ CI-Verified

[🌐 Live Demo (HF Spaces)](https://huggingface.co/spaces/DrAbdulmalek/handwriting-ocr)  |  [📘 Documentation](docs/USER_GUIDE.md)  |  [🐛 Report Bug](https://github.com/DrAbdulmalek/OmniFile_Processor/issues)  |  [💡 Suggestions](SUGGESTIONS.md)

---

## 👨‍💻 About the Author

Dr Abdulmalek Tamer Al-husseini


- 📍 Location: Homs, Syria
- 📧 Email: Abdulmalek.husseini@gmail.com
- 🐙 GitHub: DrAbdulmalek
- 🤗 HuggingFace: DrAbdulmalek

---

## 📖 Description

OmniFile AI Processor is a production-ready, multimodal AI system that integrates **six projects** into a unified platform for document intelligence:

**OmniFile_Processor** + **HandwrittenOCR** + **handwriting-ocr** + **arabic-ocr-pro** + **advanced-ocr** + **OCR-Enhancer**

> An advanced AI system that combines six projects in a single platform for file and handwriting processing. It supports Arabic, English, and German, with specialized modules for computer vision, language processing, security, and export.

---

## ✨ Features

### 🔍 Computer Vision & OCR

1. **Multi-Engine OCR**: 4 engines (TrOCR, EasyOCR, Tesseract, PaddleOCR) with intelligent engine selection
2. **Result Fusion**: 4 strategies (highest confidence, weighted average, voting, longest text)
3. **Advanced Preprocessing**: CLAHE, deskew, denoise, Otsu thresholding, ONNX Runtime acceleration
4. **Layout Analysis**: Automatic detection of tables, headers, footers, and document structure
5. **Table Extraction**: Hough line detection + contour analysis for structured data extraction

### 🗣️ Natural Language Processing

6. **Multilingual Spell Correction**: Arabic, English, German with user-learning capability (186+ Arabic corrections)
7. **RTL Text Processing**: Full Arabic reshaping + BiDi support with 40+ normalization mappings
8. **Mixed-Text Handling**: Arabic/English/numbers with medical term protection
9. **Translation Engine**: Helsinki-NLP/opus-mt supporting 6 language pairs
10. **AI Summarization**: BART (facebook/bart-large-cnn) + Arabic (UAE-Code/mbart-summarization-ar)
11. **Entity Extraction & Text Classification**: BERT-based NER with 6-category classification

### 🤖 AI Enhancement

12. **GPT & Gemini Refinement**: Context-aware OCR correction with block-type-specific prompts
13. **SSIM Pattern Matching**: Self-learning from corrected word images with SQLite pattern database

### 📤 Multi-Format Export

14. **6 Export Formats**: DOCX (RTL support), HTML, searchable PDF, Excel, JSON (with BBox), TXT (UTF-8 BOM)

### 🔒 Security & Privacy

15. **PII Detection**: Presidio-based sensitive data scanning + detect-secrets
16. **File Encryption**: Fernet (AES-128-CBC with HMAC-SHA256) with folder support
17. **Code Protection**: Prevents spell correction inside code blocks
18. **Audit Logging**: File + Redis audit trail with rate limiting (slowapi + Nginx)

### 📊 Evaluation

19. **CER/WER Metrics**: OCR accuracy evaluation with Arabic normalization + Levenshtein distance
20. **Quality Grading**: A+ to F with actionable recommendations

### 🖥️ Multiple Interfaces

21. **5 Interfaces**: Streamlit (6 tabs), Gradio (7 tabs), React + shadcn/ui (dark/light), CLI, PyQt6 desktop
22. **FastAPI Backend**: Full REST API with Swagger documentation

### 🚀 Scalability & Deployment

23. **Docker + Compose**: One-command deployment with all services
24. **Kubernetes Ready**: Complete K8s manifests with HPA (2-10 pods auto-scaling)
25. **Celery + Redis**: Asynchronous task processing for heavy workloads

---

## 🚀 Quick Start

### Option 1: HuggingFace Spaces (Recommended for Demo)

The project is deployed and available at:

👉 **[https://huggingface.co/spaces/DrAbdulmalek/handwriting-ocr](https://huggingface.co/spaces/DrAbdulmalek/handwriting-ocr)**

To run your own instance:

```bash
git clone https://github.com/DrAbdulmalek/OmniFile_Processor.git
cd OmniFile_Processor
pip install -r requirements-hf.txt
python -m src.gradio_ui
```

### Option 2: Local Installation (Linux / macOS / Windows)

```bash
# Clone the repository
git clone https://github.com/DrAbdulmalek/OmniFile_Processor.git
cd OmniFile_Processor

# Install dependencies
pip install -r requirements-full.txt  # Everything (~6-8 GB, ~15-30 min)
# Or install in layers:
# pip install -r requirements-core.txt                           # Minimum (~1.5 GB, ~3 min)
# pip install -r requirements-core.txt -r requirements-ocr.txt   # + OCR engines
# pip install -r requirements-core.txt -r requirements-nlp.txt   # + NLP

# Run with your preferred interface
streamlit run app.py                        # Streamlit UI (6 tabs)
python -m src.gradio_ui                     # Gradio UI (7 tabs)
python main.py                              # CLI interface
cd frontend && npm install && npm run dev   # React Frontend
```

### Option 3: Docker Compose (Full Stack)

```bash
git clone https://github.com/DrAbdulmalek/OmniFile_Processor.git
cd OmniFile_Processor
docker-compose up -d

# Access:
#   API Docs:    http://localhost:5001/docs
#   Streamlit:   http://localhost:7860
#   React:       http://localhost:3000
#   Nginx Proxy: http://localhost
```

### Option 4: Google Colab

```python
!git clone https://github.com/DrAbdulmalek/OmniFile_Processor.git
%cd OmniFile_Processor
!pip install -r requirements-full.txt  # Everything (~6-8 GB, ~15-30 min)
# Or install in layers:
# !pip install -r requirements-core.txt                           # Minimum (~1.5 GB, ~3 min)
# !pip install -r requirements-core.txt -r requirements-ocr.txt   # + OCR engines
# !pip install -r requirements-core.txt -r requirements-nlp.txt   # + NLP
!streamlit run app.py --server.port 7860
```

---

## 📁 Project Structure

> **Architecture Note:**
> The project currently contains two parallel systems:
> - **`modules/`**: the extended theoretical architecture, with clearly organized modules (vision, nlp, security, export, ai, evaluation) and Pydantic v2 models. This is the intended future structure of the project.
> - **`src/`**: the working HandwrittenOCR engine. It contains the actual application used by the Gradio interface (`src/gradio_ui.py`) and HF Spaces, and includes TrOCR Batch, LoRA Fine-tuning, Active Learning, and Study Guide.
> - **Root files** (`app.py`, `database.py`, `config.py`): the integration layer that connects the architecture to the engines.
>
> **Currently adopted approach:** `src/` is the working code behind the Gradio interface and HF Spaces, while `modules/` represents the organized theoretical architecture of the extended project. The gradual migration from `src/` to `modules/` will be carried out in stages through separate Pull Requests.


```
OmniFile_Processor/
├── app.py                          # Main Streamlit UI
├── config.py                       # Central configuration v4.1.1
├── database.py                     # SQLite database layer
├── main.py                         # Local / CLI entry point
├── tasks.py                        # Celery async tasks
├── requirements.txt                # Full dependencies (legacy)
├── requirements-core.txt           # Core only (~1.5 GB)
├── requirements-ocr.txt            # OCR engines layer
├── requirements-nlp.txt            # NLP layer
├── requirements-full.txt           # Everything (~6-8 GB)
├── requirements-hf.txt             # HuggingFace Spaces (minimal)
├── Dockerfile                      # Docker image
├── docker-compose.yml              # Full stack orchestration
├── nginx.conf                      # Nginx load balancer
├── LICENSE                         # MIT License
│
├── modules/
│   ├── core/                       # Core data models
│   │   └── structure.py            # Pydantic v2 models
│   │
│   ├── vision/                     # Computer Vision & OCR
│   │   ├── ocr_engine.py           # 4 OCR engines + ONNX + Quantization
│   │   ├── image_preprocessor.py   # CLAHE + Denoise + Deskew + Otsu
│   │   ├── pdf_processor.py        # Multi-format PDF processing
│   │   ├── text_reconstructor.py   # RTL/LTR sentence reconstruction
│   │   ├── result_fusion.py        # 4 fusion strategies
│   │   ├── layout_analyzer.py      # Layout analysis (tables, headers)
│   │   └── table_extractor.py      # Table extraction (Hough + contours)
│   │
│   ├── nlp/                        # Natural Language Processing
│   │   ├── spell_corrector.py      # 3-language correction + learning
│   │   ├── translator.py           # Helsinki-NLP translation
│   │   ├── summarizer.py           # BART summarization
│   │   ├── entity_extractor.py     # BERT-based NER
│   │   ├── language_detector.py    # Language detection
│   │   ├── text_classifier.py      # 6-category classification
│   │   ├── arabic_rtl.py           # Full RTL processing
│   │   ├── mixed_text.py           # Arabic/English mixed text
│   │   ├── ai_corrector.py         # GPT-based correction
│   │   ├── arabic_nlp_utils.py     # Semantic similarity for Arabic OCR
│   │   └── correction_dict.json    # 186+ Arabic corrections
│   │
│   ├── ai/                         # AI Enhancement
│   │   ├── pattern_matcher.py      # SSIM pattern matching
│   │   ├── pattern_db.py           # SQLite pattern database
│   │   └── gemini_refiner.py       # Gemini AI refinement
│   │
│   ├── security/                   # Security & Privacy
│   │   ├── file_scanner.py         # Security scanning
│   │   ├── sensitive_data_scanner.py  # PII detection (Presidio)
│   │   ├── encryption.py           # Fernet encryption (AES-128)
│   │   ├── code_protector.py       # Code block protection
│   │   ├── file_organizer.py       # Auto file organization
│   │   ├── archive_handler.py      # Archive management
│   │   ├── backup_manager.py       # Backup management
│   │   ├── audit_logger.py         # Audit logging
│   │   └── secure_file_handler.py  # Safe file handling
│   │
│   ├── export/                     # Multi-Format Export
│   │   ├── exporter.py             # DOCX/HTML/PDF/JSON/TXT/Excel
│   │   └── layout_preserving.py    # DOCX export with visual layout preservation
│   │
│   └── evaluation/                 # Evaluation & Metrics
│       └── metrics.py              # CER/WER + quality grading
│
├── frontend/                       # React + shadcn/ui Web App
│   ├── src/
│   │   ├── App.jsx                 # Main application
│   │   ├── components/             # UI components
│   │   │   ├── FileUpload.jsx
│   │   │   ├── ProcessingOptions.jsx
│   │   │   └── ResultsDisplay.jsx
│   │   └── services/api.js         # API client
│   └── package.json
│
├── backend/                        # FastAPI Backend
│   └── main.py                     # REST API endpoints
│
├── data/
│   └── arabic_fixes.json           # 186 Arabic corrections
├── data_seed/
│   └── correction_dict_seed.json   # Seed data
├── artifacts/
│   └── correction_dict.json        # Learned corrections
│
├── src/                            # HandwrittenOCR Engine
├── mobile/                         # Static PWA (offline review)
├── mobile_review/                  # Flask server (remote team review)
│   ├── server.py                   # REST API review server
│   ├── templates/review.html       # Touch-friendly review UI
│   └── README.md                   # mobile/ vs mobile_review/ guide
├── tests/                          # pytest test suite (13 files)
├── .github/workflows/              # CI/CD
│   ├── ci.yml                      # Tests on push/PR
│   └── release.yml                 # Auto-release on tags
├── notebooks/                      # Jupyter Notebooks
│   └── OmniFile_Gradio_Debugger.ipynb  # Gradio interactive debugger (Colab-ready)
├── docs/                           # Documentation
│   ├── TESTING_GUIDE.md            # Testing & development guide
│   ├── API_DOCS.md
│   ├── USER_GUIDE.md
│   └── DEVELOPER_GUIDE.md
├── k8s/                            # Kubernetes manifests
│   ├── namespace.yaml
│   ├── backend.yaml
│   ├── celery.yaml
│   ├── redis.yaml
│   ├── nginx.yaml
│   ├── hpa.yaml
│   └── storage.yaml
└── examples/                       # Usage examples
    ├── ocr_basic.py
    ├── nlp_pipeline.py
    └── evaluation_example.py
```

---

## 🧩 Module Descriptions

### 1. 🎯 `modules/core/`: Core Data Models

The foundational layer defining all shared data structures using **Pydantic v2**. Provides type-safe models for OCR results, processing options, document metadata, and inter-module communication.

| File | Description |
|------|-------------|
| `structure.py` | Pydantic v2 models for OCRResult, ProcessingOptions, Document, BoundingBox, and all shared types |

---

### 2. 👁️ `modules/vision/`: Computer Vision & OCR Engine

The heart of the system. Handles image preprocessing, OCR with 4 engines, result fusion, PDF processing, layout analysis, table extraction, and text reconstruction.
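
The result-fusion step named above can be illustrated with a minimal sketch. It is not the code in `result_fusion.py`: the `EngineResult` container and the `fuse` function are hypothetical names, confidences are assumed to lie in [0, 1], and "weighted average" is interpreted here as a confidence-weighted vote over identical texts, which may differ from the project's own definition.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class EngineResult:
    """Hypothetical container for one engine's output (not the project's model)."""
    engine: str
    text: str
    confidence: float  # assumed to be in [0, 1]


def fuse(results: list[EngineResult], strategy: str = "highest_confidence") -> str:
    """Pick one text from several OCR engine results."""
    if not results:
        raise ValueError("no OCR results to fuse")
    if strategy == "highest_confidence":
        # Trust the single most confident engine.
        return max(results, key=lambda r: r.confidence).text
    if strategy == "longest_text":
        # Prefer the engine that recovered the most characters.
        return max(results, key=lambda r: len(r.text)).text
    if strategy == "voting":
        # Majority vote over identical outputs; ties go to the first seen.
        return Counter(r.text for r in results).most_common(1)[0][0]
    if strategy == "weighted_average":
        # Assumption: sum the confidences of engines that agree on a text.
        weights: dict[str, float] = defaultdict(float)
        for r in results:
            weights[r.text] += r.confidence
        return max(weights, key=weights.get)
    raise ValueError(f"unknown strategy: {strategy}")
```

With two engines agreeing on one reading and a third, more confident engine disagreeing, `highest_confidence` follows the third engine while `voting` and `weighted_average` can follow the majority, which is why offering several strategies is useful.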

| File | Description |
|------|-------------|
| `ocr_engine.py` | Multi-engine OCR (TrOCR, EasyOCR, Tesseract, PaddleOCR) with ONNX Runtime & INT8 quantization |
| `image_preprocessor.py` | CLAHE, Gaussian denoise, deskew (Hough), Otsu thresholding, adaptive binarization |
| `pdf_processor.py` | PyMuPDF-based PDF processing with pdf2image fallback |
| `text_reconstructor.py` | RTL/LTR sentence reconstruction with language-aware ordering |
| `result_fusion.py` | 4 fusion strategies: highest confidence, weighted average, voting, longest text |
| `layout_analyzer.py` | Document layout analysis: tables, headers, footers, sections |
| `table_extractor.py` | Table extraction using Hough lines + contour analysis |

---

### 3. 🗣️ `modules/nlp/`: Natural Language Processing

Multilingual text processing including spell correction, translation, summarization, entity extraction, text classification, and Arabic RTL handling.

| File | Description |
|------|-------------|
| `spell_corrector.py` | 3-language spell correction (AR, EN, DE) with user learning & Python keyword protection |
| `translator.py` | Helsinki-NLP/opus-mt machine translation (6 language pairs) |
| `summarizer.py` | BART summarization (English + Arabic) |
| `entity_extractor.py` | BERT-based Named Entity Recognition |
| `language_detector.py` | Automatic language detection (AR, EN, DE) |
| `text_classifier.py` | 6-category text classification |
| `arabic_rtl.py` | Full RTL processing: arabic_reshaper + python-bidi + 40+ normalization rules |
| `mixed_text.py` | Arabic/English/number mixed text handler with medical term protection |
| `ai_corrector.py` | GPT-based OCR correction with context awareness |
| `arabic_nlp_utils.py` | Semantic similarity for Arabic OCR |
| `correction_dict.json` | 186+ common Arabic OCR error corrections |

---

### 4. 🔒 `modules/security/`: Security & Privacy

Comprehensive security module for PII detection, encryption, code protection, file organization, backup management, and audit logging.
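
The code-protection idea mentioned above (keeping spell correction out of code) can be sketched with the standard library alone. This is an illustration under assumptions, not the contents of `code_protector.py`: the function name `correct_outside_code` is hypothetical, and code is assumed to be marked Markdown-style with backtick fences or inline backticks.

```python
import re
from typing import Callable

FENCE = "`" * 3  # built programmatically so this example nests inside a fenced block

# Try fenced blocks first, then inline code spans.
CODE_PATTERN = re.compile(rf"{FENCE}.*?{FENCE}|`[^`\n]+`", re.DOTALL)


def correct_outside_code(text: str, corrector: Callable[[str], str]) -> str:
    """Apply `corrector` to prose segments only, leaving code untouched."""
    parts = []
    last = 0
    for match in CODE_PATTERN.finditer(text):
        parts.append(corrector(text[last:match.start()]))  # prose before the code
        parts.append(match.group(0))                       # code copied verbatim
        last = match.end()
    parts.append(corrector(text[last:]))                   # trailing prose
    return "".join(parts)
```

Passing the corrector in as a callable keeps the protection logic independent of any particular spell-correction backend, so the same wrapper would work for Arabic, English, or German correctors.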

| File | Description |
|------|-------------|
| `file_scanner.py` | File security scanning and validation |
| `sensitive_data_scanner.py` | PII detection using Microsoft Presidio + detect-secrets |
| `encryption.py` | Fernet symmetric encryption (AES-128-CBC with HMAC-SHA256) with folder support |
| `code_protector.py` | Prevents spell correction inside code blocks (Python, JS, etc.) |
| `file_organizer.py` | Automatic file organization by type and content |
| `archive_handler.py` | ZIP archive creation/extraction with integrity checks |
| `backup_manager.py` | Automatic and manual backup management |
| `audit_logger.py` | File + Redis audit trail with statistics |
| `secure_file_handler.py` | Path traversal prevention + safe tempfile handling |

---

### 5. 🤖 `modules/ai/`: AI Enhancement

Advanced AI capabilities including self-learning pattern matching and Gemini-based refinement.

| File | Description |
|------|-------------|
| `pattern_matcher.py` | SSIM-based visual pattern matching that learns from corrected word images |
| `pattern_db.py` | SQLite database for storing and retrieving visual OCR patterns |
| `gemini_refiner.py` | Google Gemini API integration for context-aware OCR refinement |

---

### 6. 📤 `modules/export/`: Multi-Format Export

Export processed documents to 6 different formats with proper RTL support.

| File | Description |
|------|-------------|
| `exporter.py` | Export to DOCX (RTL bidi), HTML (`dir="rtl"`), searchable PDF (invisible text), Excel (RTL alignment), JSON (full structure with BBox), TXT (UTF-8 BOM) |
| `layout_preserving.py` | DOCX export with visual layout preservation |

---

### 7. 📊 `modules/evaluation/`: Evaluation & Metrics

OCR accuracy evaluation with Arabic-aware normalization.

| File | Description |
|------|-------------|
| `metrics.py` | CER/WER computation, Arabic text normalization, Levenshtein distance (zero dependencies), quality grading (A+ to F) |

---

## 🔗 API Documentation

The FastAPI backend exposes the following REST endpoints.
Full interactive documentation is available at `/docs` (Swagger UI) when the backend is running.

### Base URL

```
http://localhost:5001/api/v1
```

### Endpoints

| Method | Endpoint | Description |
|--------|----------|-------------|
| `POST` | `/ocr/process` | Process an image/PDF file with OCR |
| `POST` | `/ocr/process-batch` | Batch process multiple files |
| `GET` | `/ocr/result/{task_id}` | Get OCR result by task ID |
| `POST` | `/nlp/correct` | Spell-correct text (AR/EN/DE) |
| `POST` | `/nlp/translate` | Translate text between languages |
| `POST` | `/nlp/summarize` | Summarize text |
| `POST` | `/nlp/entities` | Extract named entities |
| `POST` | `/nlp/classify` | Classify text into categories |
| `POST` | `/export/{format}` | Export results to DOCX/HTML/PDF/JSON/TXT/Excel |
| `POST` | `/security/scan` | Scan file for PII and security issues |
| `POST` | `/security/encrypt` | Encrypt a file |
| `POST` | `/security/decrypt` | Decrypt a file |
| `GET` | `/health` | Health check endpoint |
| `GET` | `/metrics` | System performance metrics |

> 📖 For the complete API reference with request/response schemas, see [docs/API_DOCS.md](docs/API_DOCS.md) or access the Swagger UI at `/docs`.

---

## 📸 Screenshots

| Streamlit UI | Gradio UI | React Frontend |
|:---:|:---:|:---:|
| ![Streamlit](docs/screenshots/streamlit.png) | ![Gradio](docs/screenshots/gradio.png) | ![React](docs/screenshots/react.png) |
| *6-tab interface* | *7-tab interface* | *Dark/Light mode* |

| OCR Processing | Arabic RTL | Table Extraction |
|:---:|:---:|:---:|
| ![OCR](docs/screenshots/ocr.png) | ![RTL](docs/screenshots/rtl.png) | ![Tables](docs/screenshots/tables.png) |
| *Multi-engine OCR* | *Full RTL support* | *Structured data* |

> 📝 Screenshots will be added in a future update. For now, try the [live demo](https://huggingface.co/spaces/DrAbdulmalek/handwriting-ocr) to see the application in action.
>
> 📋 See [CHANGELOG.md](CHANGELOG.md) for the complete version history.

---

## 📊 Project Statistics

| Metric | Value |
|--------|-------|
| Python Files | 72+ |
| Lines of Code | ~28,000 |
| Total Files | 152+ |
| OCR Engines | 4 (TrOCR, EasyOCR, Tesseract, PaddleOCR) |
| Fusion Strategies | 4 |
| Supported Languages | 3 (EN, AR, DE) |
| Export Formats | 6 (DOCX, HTML, PDF, JSON, TXT, Excel) |
| Test Files | 13 |
| Merged Projects | 6 |
| Security Modules | 9 |
| NLP Capabilities | 10 |
| API Endpoints | 14+ |

---

## 🌍 Supported Languages

| Language | Code | RTL Support | OCR | Spell Check | Translation |
|----------|------|:-----------:|:---:|:-----------:|:-----------:|
| 🇸🇦 العربية (Arabic) | `ar` | ✅ | ✅ | ✅ | ✅ |
| 🇬🇧 English | `en` | ❌ | ✅ | ✅ | ✅ |
| 🇩🇪 Deutsch (German) | `de` | ❌ | ✅ | ✅ | ✅ |

---

## 🤝 Contributing

Contributions are welcome! Please follow these steps:

### How to Contribute

1. **Fork** the repository
2. **Clone** your fork locally
   ```bash
   git clone https://github.com/your-username/OmniFile_Processor.git
   cd OmniFile_Processor
   ```
3. **Create a feature branch**
   ```bash
   git checkout -b feature/your-feature-name
   ```
4. **Make your changes** and ensure tests pass
   ```bash
   pip install -r requirements-full.txt  # Everything (~6-8 GB, ~15-30 min)
   # Or install in layers:
   # pip install -r requirements-core.txt                           # Minimum (~1.5 GB, ~3 min)
   # pip install -r requirements-core.txt -r requirements-ocr.txt   # + OCR engines
   # pip install -r requirements-core.txt -r requirements-nlp.txt   # + NLP
   pytest tests/ -v
   ```
5. **Commit** with a descriptive message
   ```bash
   git commit -m "feat: add your feature description"
   ```
6. **Push** to your fork
   ```bash
   git push origin feature/your-feature-name
   ```
7. **Open a Pull Request** against the `main` branch

### Development Guidelines

- Follow PEP 8 style guidelines
- Add docstrings to all new functions and classes
- Write tests for new features (place in `tests/`)
- Update the relevant documentation in `docs/`
- Use type hints throughout your code
- Ensure RTL handling is tested for any text-related changes

---

## 📜 License

This project is licensed under the **MIT License**.

```
MIT License

Copyright (c) 2026 Dr Abdulmalek Tamer Al-husseini - Homs, Syria

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
```

See [LICENSE](LICENSE) for the full text.

---

## 🔗 Links

| Resource | Link |
|----------|------|
| 🐙 **GitHub Repository** | [https://github.com/DrAbdulmalek/OmniFile_Processor](https://github.com/DrAbdulmalek/OmniFile_Processor) |
| 🤗 **HuggingFace Spaces** | [https://huggingface.co/spaces/DrAbdulmalek/handwriting-ocr](https://huggingface.co/spaces/DrAbdulmalek/handwriting-ocr) |
| 📚 **User Guide** | [docs/USER_GUIDE.md](docs/USER_GUIDE.md) |
| 👨‍💻 **Developer Guide** | [docs/DEVELOPER_GUIDE.md](docs/DEVELOPER_GUIDE.md) |
| 🧪 **Testing Guide** | [docs/TESTING_GUIDE.md](docs/TESTING_GUIDE.md) |
| 📡 **API Documentation** | [docs/API_DOCS.md](docs/API_DOCS.md) |
| 💡 **Suggestions** | [SUGGESTIONS.md](SUGGESTIONS.md) |
| 📋 **License** | [LICENSE](LICENSE) |

---

**Built with ❤️ by Dr Abdulmalek Tamer Al-husseini**

*📍 Homs, Syria  |  📧 Abdulmalek.husseini@gmail.com*

⭐ If you find this project useful, please give it a star on [GitHub](https://github.com/DrAbdulmalek/OmniFile_Processor)!