Commit 057394f by gamingflexer (1 parent: 9b25f46)
Add catalog digitization feature with intuitive interfaces, multilingual support, and OCR integration
README.md CHANGED
@@ -1 +1,37 @@
-# Catalog
# Catalog Digitization - BUILD FOR BHARAT Hackathon 2024

## Introduction

This project aims to simplify how sellers digitize product catalogs. It is designed for catalogs with 1000+ SKUs and captures attributes such as SKU id, product name, description, price, image, inventory, color, size, and brand. Sellers enter data through text, voice, and image inputs in Indic languages.

## Architecture

Our architecture combines OCR with a large language model (LLM) to extract and structure data from product images, and pre-fills product information from an existing repository. The backend, built with Django, manages data operations and interfaces.

![Architecture](assets/arch.png)

## Features

- **Intuitive Interfaces**: Use of text, voice, and image inputs for catalog digitization.
- **Multilingual Support**: Incorporates Indic languages for text and voice inputs.
- **OCR Integration**: Extracts data from images to streamline the digitization process.
- **Database Management**: Efficiently stores and retrieves catalog data, with checks against a primary database to avoid duplicates.
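
As an illustration of the duplicate check mentioned under Database Management, here is a small sketch. It assumes that a duplicate means a repeated SKU id, which the README does not state. It also uses Python's built-in `sqlite3` module instead of the project's Django models so that it runs on its own. The table layout and the names `open_catalog` and `add_product` are hypothetical.

```python
import sqlite3


def open_catalog(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) a catalog database with a uniqueness rule on sku_id."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS product ("
        " sku_id TEXT PRIMARY KEY,"
        " product_name TEXT NOT NULL,"
        " price REAL)"
    )
    return conn


def add_product(conn: sqlite3.Connection, sku_id: str,
                product_name: str, price: float) -> bool:
    """Insert a product unless its SKU id already exists.

    Returns True if a new row was added, False if it was skipped as a duplicate.
    """
    cur = conn.execute(
        "INSERT OR IGNORE INTO product (sku_id, product_name, price) "
        "VALUES (?, ?, ?)",
        (sku_id, product_name, price),
    )
    conn.commit()
    return cur.rowcount == 1
```

In a Django backend, the closest equivalent would be a unique field on the model combined with `get_or_create`. Whether the project does this is not stated here.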

## Getting Started

Install the dependencies, apply the database migrations, and start the development server:

```
pip install -r requirements.txt
python manage.py makemigrations
python manage.py migrate
python manage.py runserver
```

## Pages

1. Catalog Page - All digitized catalogs are displayed here.
2. Upload Image - The user can upload an image of a product, and the details are extracted using OCR and an LLM.
3. Add Product - The user can add product details and save them to the database.

## Technologies Used

- **Django**: Backend framework for managing data operations and interfaces.
- **Tesseract OCR, EasyOCR, and Azure OCR**: Extract text from images for digitization.
- **LLaMA 7B and GPT-3.5**: Large language models for processing text and voice inputs.
- **Speech Recognition**: Converts voice inputs to text for digitization.