---
title: Menu to Excel Converter
emoji: 📊
colorFrom: blue
colorTo: green
sdk: gradio
sdk_version: 5.49.1
app_file: app.py
pinned: false
---
# Menu OCR → Excel (Batch + Validation) - Hugging Face Space

This package contains a ready-to-deploy Gradio app that processes menu images into Excel files.

Files included:

- `app.py`: Gradio application (batch processing + validation)
- `requirements.txt`: Python dependencies for the Space

Filename format (recommended for automatic metadata extraction)
--------------------------------------------------------------

Images should be named like:

`<Store Name>_<Store Code> <Branch Name>.<ext>`

Example:

`Fortis Hospital_60247010108 Rohini.jpg`

The app extracts:
- A1 = Store Name (e.g., "Fortis Hospital")
- B1 = Store Code (e.g., "60247010108")
- C1 = Branch Name (e.g., "Rohini")
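
The extraction described above can be sketched roughly as follows. This is a minimal illustration, not the code in `app.py`: the pattern `FILENAME_PATTERN` and the helper `parse_menu_filename` are hypothetical names, and the sketch assumes the store code contains no spaces.

```python
import re
from pathlib import Path

# Hypothetical pattern for "<Store Name>_<Store Code> <Branch Name>";
# the actual parsing in app.py may differ.
FILENAME_PATTERN = re.compile(r"^(?P<store>.+)_(?P<code>\S+)\s+(?P<branch>.+)$")


def parse_menu_filename(filename: str) -> dict:
    """Split an image filename into the three metadata cells (A1, B1, C1)."""
    stem = Path(filename).stem  # drop the extension
    match = FILENAME_PATTERN.match(stem)
    if match is None:
        # Filename does not follow the recommended format: leave the cells blank.
        return {"A1": "", "B1": "", "C1": ""}
    return {
        "A1": match.group("store").strip(),
        "B1": match.group("code").strip(),
        "C1": match.group("branch").strip(),
    }


print(parse_menu_filename("Fortis Hospital_60247010108 Rohini.jpg"))
# {'A1': 'Fortis Hospital', 'B1': '60247010108', 'C1': 'Rohini'}
```
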

How to use (UI steps)
---------------------

1. Create a Hugging Face Space: SDK = Gradio, Runtime = Python 3.10.
2. Upload `app.py` and `requirements.txt` to the Space files area.
3. Open the Space UI after the build completes.
4. In the UI:
   - Upload multiple menu images (left) and a single Excel template (.xlsx).
   - Click "Parse all images".
   - Select a parsed image from the dropdown to review.
   - Edit the extracted table if needed and click "Save current edits".
   - When finished, click "Download ZIP of all (use after saving/edits)" to get all generated Excel files.

Output format
-------------

Each generated .xlsx is a copy of your uploaded template with:

- Row 1: metadata (A1 Store Name, B1 Store Code, C1 Branch Name)
- Row 2: your existing headers (unchanged)
- Row 3 onward: parsed menu items mapped into columns A through S:
  - A: Parent Category
  - B: Category
  - C: Name
  - D: Item Code
  - E: Master Item Name
  - F: EAN Code
  - G: Price
  - H: Active
  - I: Priority
  - J: Image
  - K: Food type
  - L: NoOfMains
  - M: OnlineName
  - N: AlternateClassification
  - O: ItemTaxInclusive
  - P: TaxPct
  - Q: BrandName
  - R: ClassificationCode
  - S: HSN Code
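
To make the row layout concrete, here is a small sketch that assembles the rows in plain Python before anything is written to a workbook. The helpers `item_to_row` and `build_sheet_rows` and the sample item are made up for illustration; the real app writes into a copy of the uploaded template and may fill more columns than this.

```python
# Column headers in sheet order A through S, as listed above.
COLUMNS = [
    "Parent Category", "Category", "Name", "Item Code", "Master Item Name",
    "EAN Code", "Price", "Active", "Priority", "Image", "Food type",
    "NoOfMains", "OnlineName", "AlternateClassification", "ItemTaxInclusive",
    "TaxPct", "BrandName", "ClassificationCode", "HSN Code",
]


def item_to_row(item: dict) -> list:
    """Return a 19-cell row (A..S); fields the parser did not produce stay empty."""
    return [item.get(column, "") for column in COLUMNS]


def build_sheet_rows(metadata: tuple, headers: list, items: list) -> list:
    """Assemble rows: row 1 metadata, row 2 template headers, row 3 onward items."""
    rows = [list(metadata), list(headers)]
    rows.extend(item_to_row(item) for item in items)
    return rows


# Illustrative data only.
rows = build_sheet_rows(
    ("Fortis Hospital", "60247010108", "Rohini"),
    COLUMNS,
    [{"Category": "Beverages", "Name": "Masala Chai", "Price": 40}],
)
print(rows[2][:7])
# ['', 'Beverages', 'Masala Chai', '', '', '', 40]
```
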

Notes & troubleshooting
-----------------------

- Tesseract OCR must be installed on the host. If you get a Tesseract error, install the system Tesseract package (on a Hugging Face Space, for example, by listing `tesseract-ocr` in a `packages.txt` file).
- For better OCR accuracy, use high-resolution, well-lit images.
- To adjust price parsing, edit `PRICE_REGEX` inside `app.py`.
- To improve category detection, edit `CATEGORY_HINTS` inside `app.py`.
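
As an illustration of the two tuning points above, the sketch below shows how a price pattern and a list of category hints might be used to classify OCR lines. The values given here for `PRICE_REGEX` and `CATEGORY_HINTS` are hypothetical stand-ins (the ones in `app.py` may look quite different), and `parse_line` is an invented helper.

```python
import re

# Hypothetical stand-ins; the real PRICE_REGEX and CATEGORY_HINTS in app.py may differ.
PRICE_REGEX = re.compile(
    r"(?:₹|Rs\.?|INR)?\s*(\d+(?:\.\d{1,2})?)\s*(?:/-)?\s*$",
    re.IGNORECASE,
)
CATEGORY_HINTS = ("beverages", "starters", "main course", "desserts")


def parse_line(line: str):
    """Classify one OCR line as a category heading, a priced item, or noise."""
    text = line.strip()
    if not text:
        return None
    if text.lower() in CATEGORY_HINTS:
        return ("category", text)
    match = PRICE_REGEX.search(text)
    if match:
        # Everything before the price is the item name; trim dot leaders etc.
        name = text[: match.start()].strip(" .-:")
        if name:
            return ("item", name, float(match.group(1)))
    return None


for raw in ["Beverages", "Masala Chai ..... Rs. 40", "Paneer Tikka 180/-"]:
    print(parse_line(raw))
# ('category', 'Beverages')
# ('item', 'Masala Chai', 40.0)
# ('item', 'Paneer Tikka', 180.0)
```
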