---
title: TextLens - AI-Powered OCR
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 4.0.0
app_file: app.py
pinned: false
license: mit
---

# TextLens - AI-Powered OCR

[Live demo on Hugging Face Spaces](https://huggingface.co/spaces/GoConqurer/textlens-ocr) | [Source on GitHub](https://github.com/KumarAmrit30/textlens-ocr) | [Python downloads](https://www.python.org/downloads/)

TextLens is an OCR application built on a Vision-Language Model (VLM). It extracts text from images with Microsoft Florence-2, falls back to EasyOCR if the primary model fails, and is deployed with a blue-green, zero-downtime process.

## Live Demo

**Try it now:** [https://huggingface.co/spaces/GoConqurer/textlens-ocr](https://huggingface.co/spaces/GoConqurer/textlens-ocr)

## Key Features

### Advanced AI-Powered OCR

- **Microsoft Florence-2 VLM**: Vision-language model used as the primary text extractor
- **Intelligent Fallback System**: Automatic fallback to EasyOCR if the primary model fails
- **Multi-Model Support**: Florence-2-base and Florence-2-large variants
- **Real-time Processing**: Text extraction starts as soon as an image is uploaded

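
The fallback behaviour listed above can be pictured as an ordered chain of OCR engines. The sketch below is illustrative only: `extract_with_fallback` and the two stand-in engine functions are hypothetical names, not the project's actual API. In the real application, calls into Florence-2 and EasyOCR would take their place.

```python
from typing import Callable, List, Sequence, Tuple

# An extractor takes raw image bytes and returns the recognized text.
Extractor = Callable[[bytes], str]


def extract_with_fallback(
    image: bytes, extractors: Sequence[Tuple[str, Extractor]]
) -> Tuple[str, str]:
    """Try each (name, extractor) in order; return (name, text) of the first success."""
    errors: List[str] = []
    for name, extractor in extractors:
        try:
            return name, extractor(image)
        except Exception as exc:  # any failure moves on to the next engine
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all OCR engines failed: " + "; ".join(errors))


# Hypothetical stand-ins for the real engines (assumptions for illustration).
def primary_engine(image: bytes) -> str:
    raise RuntimeError("model failed to load")


def fallback_engine(image: bytes) -> str:
    return "hello world"


engine, text = extract_with_fallback(
    b"fake-image-bytes",
    [("florence-2", primary_engine), ("easyocr", fallback_engine)],
)
print(engine, text)
```

Keeping the engines in a plain ordered list means a third engine can be added without touching the retry logic.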
### Modern User Experience

- **Clean UI**: Professional Gradio interface with intuitive design
- **Multiple Input Methods**: Upload files, use webcam, or paste from clipboard
- **Copy-to-Clipboard**: One-click text copying functionality
- **Responsive Design**: Works seamlessly on desktop and mobile devices
- **Dark/Light Theme**: Automatic theme adaptation

### Performance & Reliability

- **GPU Acceleration**: Supports CUDA, MPS (Apple Silicon), and CPU inference
- **Smart Device Detection**: Automatically uses best available hardware
- **Error Resilience**: Robust error handling with graceful degradation
- **Memory Optimization**: Efficient model loading and cleanup

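
Device detection of the kind described above usually reduces to a priority order: CUDA first, then MPS, then CPU. The following is a minimal sketch under that assumption. The function names `pick_device` and `detect_device` are hypothetical and the project's own logic may differ. It uses PyTorch's `torch.cuda.is_available()` and `torch.backends.mps.is_available()` when PyTorch is installed, and falls back to CPU otherwise.

```python
def pick_device(cuda_available: bool, mps_available: bool) -> str:
    """Return the preferred device name given what the hardware offers."""
    if cuda_available:
        return "cuda"
    if mps_available:
        return "mps"
    return "cpu"


def detect_device() -> str:
    """Query PyTorch for available backends; fall back to CPU if it is missing."""
    try:
        import torch
    except ImportError:
        return "cpu"
    # Older PyTorch builds have no torch.backends.mps, so look it up defensively.
    mps = getattr(torch.backends, "mps", None)
    mps_available = bool(mps is not None and mps.is_available())
    return pick_device(torch.cuda.is_available(), mps_available)


print(detect_device())
```

Separating the pure decision (`pick_device`) from the hardware query makes the priority order easy to test on a machine without a GPU.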
### Enterprise Features

- **Zero Downtime Deployment**: Blue-green deployment with health checks
- **Health Monitoring**: Built-in `/health` and `/ready` endpoints
- **Graceful Shutdown**: Signal handling for clean application restarts
- **Production Ready**: Scalable architecture with automated deployment

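
To make the distinction between the `/health` and `/ready` endpoints concrete, here is a hedged sketch of how such probes are commonly answered. Only the two paths come from this README. `AppState`, `handle_probe`, `install_shutdown_handler`, and the JSON bodies are assumptions for illustration, not the application's actual responses.

```python
import json
import signal
import threading
from typing import Tuple


class AppState:
    """Flags shared between the model loader, the probes, and the signal handler."""

    def __init__(self) -> None:
        self.model_loaded = threading.Event()
        self.shutting_down = threading.Event()


def handle_probe(path: str, state: AppState) -> Tuple[int, str]:
    """Return (HTTP status, JSON body) for a probe path."""
    if path == "/health":
        # Liveness: the process is up and able to answer.
        return 200, json.dumps({"status": "ok"})
    if path == "/ready":
        # Readiness: only accept traffic once the model is loaded and
        # the process is not in the middle of shutting down.
        ready = state.model_loaded.is_set() and not state.shutting_down.is_set()
        return (200 if ready else 503), json.dumps({"ready": ready})
    return 404, json.dumps({"error": "not found"})


def install_shutdown_handler(state: AppState) -> None:
    """On SIGTERM, flip readiness off so traffic drains before exit.

    Must be called from the main thread, as required by signal.signal.
    """

    def _on_signal(signum, frame):
        state.shutting_down.set()

    signal.signal(signal.SIGTERM, _on_signal)


state = AppState()
print(handle_probe("/ready", state))   # not ready: model not loaded yet
state.model_loaded.set()
print(handle_probe("/ready", state))   # ready once the model is loaded
```

In a blue-green setup, a load balancer would typically route traffic only to instances whose readiness probe returns 200.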
## Quick Start

### Online (Recommended)

**Instant access** - No installation required:

[**Launch TextLens**](https://huggingface.co/spaces/GoConqurer/textlens-ocr)

### Local Development

1. **Clone Repository**

   ```bash
   git clone https://github.com/KumarAmrit30/textlens-ocr.git
   cd textlens-ocr
   ```

2. **Setup Environment**

   ```bash
   python -m venv textlens_env
   source textlens_env/bin/activate  # Windows: textlens_env\Scripts\activate
   pip install -r requirements.txt
   ```

3. **Launch Application**

   ```bash
   python app.py
   ```

   Open: `http://localhost:7860`

### Quick Test

```bash
# Verify installation
python -c "from models.ocr_processor import OCRProcessor; print('TextLens ready!')"
```

## Model Performance

| Model                | Size (parameters) | Speed  | Accuracy  | Best For               |
| -------------------- | ----------------- | ------ | --------- | ---------------------- |
| **Florence-2-base**  | 270M              | Fast   | High      | General OCR, Real-time |
| **Florence-2-large** | 770M              | Medium | Very High | High accuracy needs    |
| **EasyOCR**          | ~100M             | Medium | Good      | Fallback, Multilingual |

## Supported Use Cases

| Category         | Examples                        | Performance |
| ---------------- | ------------------------------- | ----------- |
| **Documents**    | PDFs, Scanned papers, Forms     | ⭐⭐⭐⭐⭐  |
| **Receipts**     | Shopping receipts, Invoices     | ⭐⭐⭐⭐    |
| **Screenshots**  | App interfaces, Error messages  | ⭐⭐⭐⭐⭐  |
| **Vehicle**      | License plates, VIN numbers     | ⭐⭐⭐⭐    |
| **Books**        | Printed text, Handwritten notes | ⭐⭐⭐⭐    |
| **Multilingual** | Multiple languages              | ⭐⭐⭐      |

## Configuration

### Model Selection

```python
from models.ocr_processor import OCRProcessor

# Fast inference (recommended)
ocr = OCRProcessor(model_name="microsoft/Florence-2-base")

# Maximum accuracy
ocr = OCRProcessor(model_name="microsoft/Florence-2-large")
```

### UI Customization

Modify `ui/styles.py` to customize appearance:

```python
# Change color scheme
PRIMARY_COLOR = "#1f77b4"
SECONDARY_COLOR = "#ff7f0e"

# Update layout
INTERFACE_WIDTH = "100%"
```

### Environment Variables

| Variable               | Description           | Default                |
| ---------------------- | --------------------- | ---------------------- |
| `SPACE_ID`             | Hugging Face Space ID | Auto-detected          |
| `DEPLOYMENT_STAGE`     | Deployment stage      | `production`           |
| `TRANSFORMERS_CACHE`   | Model cache path      | `~/.cache/huggingface` |
| `CUDA_VISIBLE_DEVICES` | GPU selection         | All available          |

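
The variables in the table above could be read into a single settings object at startup. The sketch below assumes the defaults listed in the table. The `Settings` dataclass and `load_settings` function are hypothetical helpers, not part of the project's code.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    space_id: Optional[str]              # None when not running inside a Space
    deployment_stage: str
    transformers_cache: str
    cuda_visible_devices: Optional[str]  # None means all GPUs are visible


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read configuration from the environment, applying the documented defaults."""
    return Settings(
        space_id=env.get("SPACE_ID"),
        deployment_stage=env.get("DEPLOYMENT_STAGE", "production"),
        transformers_cache=env.get(
            "TRANSFORMERS_CACHE", os.path.expanduser("~/.cache/huggingface")
        ),
        cuda_visible_devices=env.get("CUDA_VISIBLE_DEVICES"),
    )


# Passing a plain dict instead of os.environ keeps the example deterministic.
settings = load_settings({"DEPLOYMENT_STAGE": "staging"})
print(settings.deployment_stage)
```

Accepting any mapping rather than reading `os.environ` directly makes the loader straightforward to exercise in tests.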
**Deployment Flow:**

```mermaid
graph LR
    A[Code Push] --> B[Validate]
    B --> C[Deploy Staging]
    C --> D[Health Check]
    D --> E[Deploy Production]
    E --> F[Verify]
    F --> G[Complete]
```

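
The health-check gate between staging and production in the flow above can be sketched as a retry loop that only promotes a release once a probe succeeds. This is an illustration under assumptions, not the project's deployment script: `wait_until_healthy`, `promote_if_healthy`, and the fake probe are hypothetical, and a real probe would typically request the `/health` endpoint over HTTP.

```python
import time
from typing import Callable, List


def wait_until_healthy(
    probe: Callable[[], bool], attempts: int = 5, delay_seconds: float = 0.0
) -> bool:
    """Call probe up to `attempts` times; return True as soon as it succeeds."""
    for attempt in range(attempts):
        if probe():
            return True
        if attempt < attempts - 1:
            time.sleep(delay_seconds)  # back off before the next try
    return False


def promote_if_healthy(
    probe: Callable[[], bool],
    promote: Callable[[], None],
    attempts: int = 5,
    delay_seconds: float = 0.0,
) -> bool:
    """Run `promote` only if the staging probe passes; report whether it ran."""
    if not wait_until_healthy(probe, attempts, delay_seconds):
        return False
    promote()
    return True


# Fake probe for illustration: staging becomes healthy on the third check.
responses = iter([False, False, True])
promoted: List[str] = []
ok = promote_if_healthy(lambda: next(responses), lambda: promoted.append("production"))
print(ok, promoted)
```

If the probe never passes, nothing is promoted and the previously deployed version keeps serving traffic.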
## Contributing

We welcome contributions! Here's how to get started:

### Development Setup

1. **Fork & Clone**

   ```bash
   git clone https://github.com/YOUR_USERNAME/textlens-ocr.git
   cd textlens-ocr
   ```

2. **Create Branch**

   ```bash
   git checkout -b feature/your-feature-name
   ```

3. **Make Changes**

   - Add new features or fix bugs
   - Update tests and documentation
   - Follow code style guidelines

4. **Test Changes**

   ```bash
   python -m pytest tests/
   python -c "from models.ocr_processor import OCRProcessor; OCRProcessor()"
   ```

5. **Submit PR**

   ```bash
   git add .
   git commit -m "feat: add your feature description"
   git push origin feature/your-feature-name
   ```

### Contribution Guidelines

- **Code Style**: Follow PEP 8, use Black formatter
- **Documentation**: Update README and docstrings
- **Tests**: Add tests for new functionality
- **Commits**: Use conventional commit messages
- **Issues**: Link PRs to relevant issues

## License

This project is licensed under the **MIT License** - see the [LICENSE](LICENSE) file for details.

### Third-Party Licenses

- **Microsoft Florence-2**: [MIT License](https://github.com/microsoft/Florence)
- **HuggingFace Transformers**: [Apache License 2.0](https://github.com/huggingface/transformers)
- **Gradio**: [Apache License 2.0](https://github.com/gradio-app/gradio)
- **EasyOCR**: [Apache License 2.0](https://github.com/JaidedAI/EasyOCR)

## Acknowledgments

Special thanks to:

- **Microsoft Research** for the Florence-2 vision-language model
- **HuggingFace** for the transformers library and Spaces platform
- **Gradio Team** for the web interface framework
- **JaidedAI** for EasyOCR, which provides the fallback OCR engine
- **Open Source Community** for continuous support and contributions

## Project Status

| Component         | Status     | Version |
| ----------------- | ---------- | ------- |
| **Core OCR**      | Stable     | v1.0.0  |
| **Web UI**        | Stable     | v1.0.0  |
| **Deployment**    | Production | v1.0.0  |
| **API**           | Stable     | v1.0.0  |
| **Documentation** | Complete   | v1.0.0  |

---

<div align="center">

**Made with ❤️ for the AI community**

[Star this repo](https://github.com/KumarAmrit30/textlens-ocr) • [Try the demo](https://huggingface.co/spaces/GoConqurer/textlens-ocr) • [Read docs](DEPLOYMENT.md)

</div>