image-text-extractor / README.md

kusatmer

feat: Add Dockerfile for application containerization and update README to reflect Docker SDK.

96c41f7 19 days ago


title: Image Text Extractor
emoji: 📄
colorFrom: blue
colorTo: indigo
sdk: docker
sdk_version: 1.28.0
app_file: streamlit_app.py
pinned: false
license: mit

Image Text Extractor

This project is a Streamlit application that uses the olmOCR model (based on Qwen2.5-VL) to extract text from images. It provides a user-friendly interface to upload images and view the extracted text along with metadata.

Features

- Image Upload: Supports PNG, JPG, and JPEG files.
- Text Extraction: Uses the olmOCR vision-language model (based on Qwen2.5-VL) for OCR.
- Metadata Extraction: Reports additional information such as primary language, rotation, and content type (table, diagram).
- JSON Export: Download the extraction results as a JSON file.
- Configurable: Adjust the maximum number of generated tokens for longer documents.
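
As a rough illustration of the upload check and JSON export described above, here is a minimal sketch using only the standard library. The `ExtractionResult` class, its field names, and the helper functions are assumptions made for this example and may not match the application's actual schema.

```python
import json
from dataclasses import asdict, dataclass
from pathlib import Path
from typing import Optional

# Extensions taken from the feature list above.
SUPPORTED_EXTENSIONS = {".png", ".jpg", ".jpeg"}


def is_supported_image(filename: str) -> bool:
    """Return True if the file name ends in a supported image extension."""
    return Path(filename).suffix.lower() in SUPPORTED_EXTENSIONS


@dataclass
class ExtractionResult:
    # Hypothetical field names; the app's real schema may differ.
    text: str
    primary_language: Optional[str] = None
    rotation: int = 0
    is_table: bool = False
    is_diagram: bool = False


def to_json(result: ExtractionResult) -> str:
    """Serialize a result to a JSON string suitable for a download button."""
    return json.dumps(asdict(result), ensure_ascii=False, indent=2)


if __name__ == "__main__":
    example = ExtractionResult(text="Hello", primary_language="en")
    print(to_json(example))
```

Keeping the result in a dataclass means the same object can drive both the on-screen view and the exported file.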

Installation

Clone the repository:

git clone <repository-url>
cd image-text-extractor

Create a virtual environment (recommended):

python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

Install dependencies:
```
pip install -r requirements.txt
```

Usage

Run the Streamlit app:
```
streamlit run streamlit_app.py
```
Open your browser: Streamlit normally opens the app in your default browser. If it does not, go to http://localhost:8501.
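
If the page does not load, it can help to confirm that something is listening on the port first. The snippet below is a small sketch for that; the function name `is_app_reachable` is made up for this example, and it only tests the TCP connection, not whether the app itself is healthy.

```python
import socket


def is_app_reachable(host: str = "localhost", port: int = 8501, timeout: float = 1.0) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution errors.
        return False


if __name__ == "__main__":
    print("reachable" if is_app_reachable() else "not reachable")
```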

Testing

This project uses pytest for unit testing.

Run tests:
```
pytest tests/
```
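
For readers new to pytest, a test in `tests/` might look roughly like the sketch below. The `save_result` helper is hypothetical and defined inline so the example is self-contained; the project's real tests and helpers may differ. `tmp_path` is a built-in pytest fixture that supplies a temporary directory.

```python
import json
from pathlib import Path


def save_result(result: dict, path: Path) -> None:
    """Hypothetical helper under test: write an extraction result as JSON."""
    path.write_text(json.dumps(result, ensure_ascii=False), encoding="utf-8")


def test_save_result_round_trips(tmp_path: Path) -> None:
    # Non-ASCII text checks that the encoding survives the round trip.
    result = {"text": "Grüße", "primary_language": "de", "rotation": 0}
    target = tmp_path / "result.json"

    save_result(result, target)

    assert json.loads(target.read_text(encoding="utf-8")) == result
```

Tests of this kind avoid loading the model, so they stay fast enough to run on every change.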

Project Structure

- streamlit_app.py: The main entry point for the Streamlit application.
- service/: Contains the backend logic for text extraction.
  - text_extraction_service.py: The core service class handling model interaction.
- tests/: Unit tests for the application.
- requirements.txt: Python dependencies.
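
To make the split between the UI and the service layer concrete, here is one possible shape for a service class, sketched under assumptions. The class name, the `extract` method, the `max_new_tokens` parameter, and the injected backend are all illustrative and are not taken from `text_extraction_service.py`.

```python
from typing import Callable

# A backend takes raw image bytes and a token budget and returns text.
Backend = Callable[[bytes, int], str]


class TextExtractionService:
    """Hypothetical shape of a service class; not the project's actual code."""

    def __init__(self, backend: Backend, max_new_tokens: int = 1024) -> None:
        if max_new_tokens <= 0:
            raise ValueError("max_new_tokens must be positive")
        self._backend = backend
        self._max_new_tokens = max_new_tokens

    def extract(self, image_bytes: bytes) -> str:
        """Run the backend on the image and return the trimmed text."""
        if not image_bytes:
            raise ValueError("image_bytes is empty")
        return self._backend(image_bytes, self._max_new_tokens).strip()


def echo_backend(image_bytes: bytes, max_new_tokens: int) -> str:
    """Toy backend used only for illustration; a real one would call the model."""
    return f" {len(image_bytes)} bytes, budget {max_new_tokens} "


if __name__ == "__main__":
    service = TextExtractionService(echo_backend, max_new_tokens=256)
    print(service.extract(b"abc"))
```

Passing the backend in as a callable keeps the Streamlit code independent of the model and lets unit tests substitute a stub.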

License

This project is released under the MIT License, as declared in the Space metadata (license: mit).