	---
	title: Image Text Extractor
	emoji: 📄
	colorFrom: blue
	colorTo: indigo
	sdk: docker
	sdk_version: 1.28.0
	app_file: streamlit_app.py
	pinned: false
	license: mit
	---

	# Image Text Extractor

	This project is a Streamlit application that uses the `olmOCR` model (based on Qwen2.5-VL) to extract text from images. It provides a user-friendly interface to upload images and view the extracted text along with metadata.

	## Features

	- Image Upload: Support for PNG, JPG, and JPEG formats.
	- Text Extraction: Uses the `olmOCR` vision-language model to perform OCR on uploaded images.
	- Metadata Extraction: Extracts additional information like primary language, rotation, and content type (table, diagram).
	- JSON Export: Download extraction results as JSON files.
	- Configurable: Adjust the maximum number of generated tokens to handle longer documents.
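
As a rough illustration of the metadata and JSON export features, the sketch below serializes one extraction result to a JSON string. The `ExtractionResult` class and its field names are assumptions made for this example; the application's actual result format may differ.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class ExtractionResult:
    """Hypothetical result shape; the app's real field names may differ."""

    text: str
    primary_language: str
    rotation: int  # assumed to be degrees, e.g. 0, 90, 180, 270
    is_table: bool
    is_diagram: bool


def to_json(result: ExtractionResult) -> str:
    """Serialize a result to a JSON string that could be offered as a download."""
    # ensure_ascii=False keeps non-Latin text readable in the exported file.
    return json.dumps(asdict(result), ensure_ascii=False, indent=2)


result = ExtractionResult(
    text="Hello, world",
    primary_language="en",
    rotation=0,
    is_table=False,
    is_diagram=False,
)
payload = to_json(result)
print(payload)
```

A string like `payload` could then be passed to a download widget in the UI so the user can save it as a `.json` file.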

	## Installation

	1. Clone the repository:
	```bash
	git clone <repository-url>
	cd image-text-extractor
	```

	2. Create a virtual environment (recommended):
	```bash
	python -m venv venv
	source venv/bin/activate # On Windows: venv\Scripts\activate
	```

	3. Install dependencies:
	```bash
	pip install -r requirements.txt
	```

	## Usage

	1. Run the Streamlit app:
	```bash
	streamlit run streamlit_app.py
	```

	2. Open your browser:
	The app should automatically open in your default browser at `http://localhost:8501`.

	## Testing

	This project uses `pytest` for unit testing.

	1. Run tests:
	```bash
	pytest tests/
	```
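
A test under `tests/` might look like the sketch below. The `parse_model_output` helper is hypothetical and is defined inline so the example is self-contained; it assumes the model replies with a JSON object containing a `natural_text` field, which may not match the project's real code.

```python
import json


def parse_model_output(raw: str) -> dict:
    """Hypothetical helper: parse a model reply that is assumed to be JSON.

    Falls back to treating the reply as plain text when it is not a JSON object.
    """
    try:
        parsed = json.loads(raw)
    except json.JSONDecodeError:
        return {"text": raw, "parsed": False}
    if not isinstance(parsed, dict):
        return {"text": raw, "parsed": False}
    return {"text": parsed.get("natural_text", ""), "parsed": True}


def test_parses_valid_json():
    result = parse_model_output('{"natural_text": "Invoice 42"}')
    assert result == {"text": "Invoice 42", "parsed": True}


def test_falls_back_on_plain_text():
    result = parse_model_output("not json at all")
    assert result == {"text": "not json at all", "parsed": False}
```

Because the helper is a pure function, tests like these run without downloading model weights, which keeps the suite fast.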

	## Project Structure

	- `streamlit_app.py`: The main entry point for the Streamlit application.
	- `service/`: Contains the backend logic for text extraction.
	  - `text_extraction_service.py`: The core service class handling model interaction.
	- `tests/`: Unit tests for the application.
	- `requirements.txt`: Python dependencies.
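
The split between the UI entry point and the service layer could be sketched roughly as follows. This is not the project's actual class: the `TextExtractionService` name, the `extract` signature, and the injected `backend` callable are all assumptions chosen to show how model interaction can be kept out of the Streamlit script.

```python
from typing import Callable, Optional

# Hypothetical backend signature: (image_bytes, max_new_tokens) -> generated text.
Backend = Callable[[bytes, int], str]


class TextExtractionService:
    """Illustrative service layer; not the project's actual class definition."""

    def __init__(self, backend: Backend, default_max_new_tokens: int = 1024) -> None:
        self._backend = backend
        self._default_max_new_tokens = default_max_new_tokens

    def extract(self, image_bytes: bytes, max_new_tokens: Optional[int] = None) -> str:
        """Validate the input, then delegate generation to the backend."""
        if not image_bytes:
            raise ValueError("image_bytes must not be empty")
        limit = self._default_max_new_tokens if max_new_tokens is None else max_new_tokens
        if limit <= 0:
            raise ValueError("max_new_tokens must be positive")
        return self._backend(image_bytes, limit)


def echo_backend(image_bytes: bytes, max_new_tokens: int) -> str:
    """Stand-in backend that only reports its inputs instead of running a model."""
    return f"{len(image_bytes)} bytes, up to {max_new_tokens} tokens"


service = TextExtractionService(echo_backend)
summary = service.extract(b"\x89PNG\r\n", max_new_tokens=256)
print(summary)
```

Injecting the backend this way means the UI code only calls `extract`, and the real model can be swapped for a stub during testing.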

	## License

	This project is released under the MIT License, as declared by `license: mit` in the Space metadata.