---
title: Menu to Excel Converter
emoji: π
colorFrom: blue
colorTo: green
sdk: gradio
sdk_version: 5.49.1
app_file: app.py
pinned: false
---

# Menu OCR → Excel (Batch + Validation): Hugging Face Space

This package contains a ready-to-deploy Gradio app that runs OCR on menu images and writes the parsed items into Excel files.

Files included:

- `app.py`: Gradio application (batch processing + validation)
- `requirements.txt`: Python dependencies for the Space

Filename format (recommended for automatic metadata extraction)
--------------------------------------------------------------

Images should be named like:

`<Store Name>_<Store Code> <Branch Name>.<ext>`

Example:

`Fortis Hospital_60247010108 Rohini.jpg`

The app extracts:

- A1 = Store Name (e.g., "Fortis Hospital")
- B1 = Store Code (e.g., "60247010108")
- C1 = Branch Name (e.g., "Rohini")

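The naming convention above can be sketched in a few lines of Python. This is only an illustration, not the code in `app.py`: the names `MenuMeta`, `_FILENAME_RE`, and `parse_menu_filename` are hypothetical, and the sketch assumes the store code contains no spaces and is separated from the store name by the last underscore.

```python
import re
from pathlib import Path
from typing import NamedTuple, Optional


class MenuMeta(NamedTuple):
    """Metadata destined for cells A1, B1, and C1."""
    store_name: str
    store_code: str
    branch_name: str


# Assumed pattern: <Store Name>_<Store Code> <Branch Name>
# The store code is taken to be the run of non-space characters after the underscore.
_FILENAME_RE = re.compile(r"^(?P<store>.+)_(?P<code>\S+)\s+(?P<branch>.+)$")


def parse_menu_filename(filename: str) -> Optional[MenuMeta]:
    """Return the parsed metadata, or None if the name does not follow the format."""
    stem = Path(filename).stem  # drop any directory part and the extension
    match = _FILENAME_RE.match(stem)
    if match is None:
        return None
    return MenuMeta(
        store_name=match["store"].strip(),
        store_code=match["code"],
        branch_name=match["branch"].strip(),
    )


meta = parse_menu_filename("Fortis Hospital_60247010108 Rohini.jpg")
print(meta)
# MenuMeta(store_name='Fortis Hospital', store_code='60247010108', branch_name='Rohini')
```

Returning `None` for a non-matching name lets a caller fall back to manual entry instead of writing wrong metadata into row 1.
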
How to use (UI steps)
---------------------

1. Create a Hugging Face Space: SDK = Gradio, Runtime = Python 3.10.
2. Upload `app.py` and `requirements.txt` to the Space files area.
3. Open the Space UI after the build completes.
4. In the UI:
   - Upload multiple menu images (left) and a single Excel template (.xlsx).
   - Click "Parse all images".
   - Select a parsed image from the dropdown to review.
   - Edit the extracted table if needed and click "Save current edits".
   - When finished, click "Download ZIP of all (use after saving/edits)" to get all generated Excel files.

Output format
-------------

Each generated .xlsx is a copy of your uploaded template with:

- Row 1: metadata (A1 Store Name, B1 Store Code, C1 Branch Name)
- Row 2: your existing headers (unchanged)
- Row 3 onward: parsed menu items mapped into columns A..S:
  - A: Parent Category
  - B: Category
  - C: Name
  - D: Item Code
  - E: Master Item Name
  - F: EAN Code
  - G: Price
  - H: Active
  - I: Priority
  - J: Image
  - K: Food type
  - L: NoOfMains
  - M: OnlineName
  - N: AlternateClassification
  - O: ItemTaxInclusive
  - P: TaxPct
  - Q: BrandName
  - R: ClassificationCode
  - S: HSN Code

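The layout above could be produced roughly as follows. This is a minimal sketch, not the code in `app.py`: it assumes the workbook is written with `openpyxl` (this README does not say which library the app uses), and the names `COLUMNS`, `item_to_row`, and `fill_template` are hypothetical.

```python
from typing import Iterable, List, Mapping, Sequence

# Column order A..S, copied from the list above.
COLUMNS: List[str] = [
    "Parent Category", "Category", "Name", "Item Code", "Master Item Name",
    "EAN Code", "Price", "Active", "Priority", "Image", "Food type",
    "NoOfMains", "OnlineName", "AlternateClassification", "ItemTaxInclusive",
    "TaxPct", "BrandName", "ClassificationCode", "HSN Code",
]


def item_to_row(item: Mapping[str, object]) -> List[object]:
    """Return 19 cell values ordered A..S; fields missing from the item become None."""
    return [item.get(name) for name in COLUMNS]


def fill_template(
    template_path: str,
    output_path: str,
    meta: Sequence[str],
    items: Iterable[Mapping[str, object]],
) -> None:
    """Copy the template, write metadata to row 1 and items from row 3 onward."""
    # Assumption: openpyxl is installed; imported lazily so item_to_row works without it.
    from openpyxl import load_workbook

    workbook = load_workbook(template_path)
    sheet = workbook.active
    store_name, store_code, branch_name = meta
    sheet["A1"] = store_name
    sheet["B1"] = store_code
    sheet["C1"] = branch_name
    # Row 2 holds the template's own headers and is left untouched.
    for row_index, item in enumerate(items, start=3):
        for col_index, value in enumerate(item_to_row(item), start=1):
            sheet.cell(row=row_index, column=col_index, value=value)
    workbook.save(output_path)  # the uploaded template file itself is not modified
```

Keeping the column order in one list means a template change only needs an edit in one place.
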
Notes & troubleshooting
-----------------------

- Tesseract OCR must be installed on the host. If you get a Tesseract error, install the system Tesseract package (on a Space, this is typically done by adding a `packages.txt` file that lists `tesseract-ocr`).
- For better OCR accuracy, use high-resolution, well-lit images.
- To adjust price parsing, edit `PRICE_REGEX` inside `app.py`.
- To improve category detection, edit `CATEGORY_HINTS` inside `app.py`.

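To give a sense of what tuning these two settings might involve, here is a hedged sketch. The actual values of `PRICE_REGEX` and `CATEGORY_HINTS` in `app.py` are not shown in this README, so the pattern, the hint list, and the helper functions `split_item_and_price` and `looks_like_category` below are all assumptions made for illustration.

```python
import re
from typing import Optional, Tuple

# Hypothetical stand-in for PRICE_REGEX: an optional currency marker, a number with
# up to two decimals, and an optional "/-" suffix, all at the end of the line.
PRICE_REGEX = re.compile(
    r"(?:₹|\bRs\.?|\bINR)?\s*(\d+(?:\.\d{1,2})?)\s*(?:/-)?\s*$",
    re.IGNORECASE,
)

# Hypothetical stand-in for CATEGORY_HINTS: lowercase words that mark a section heading.
CATEGORY_HINTS = ("starters", "soups", "main course", "beverages", "desserts")


def split_item_and_price(line: str) -> Optional[Tuple[str, float]]:
    """Split an OCR line into (item name, price), or return None if no price is found."""
    match = PRICE_REGEX.search(line)
    if match is None:
        return None
    # Everything before the price is the name; trim dot leaders and dashes.
    name = line[: match.start()].strip(" .-\t")
    if not name:
        return None
    return name, float(match.group(1))


def looks_like_category(line: str) -> bool:
    """Treat a line as a category heading if it contains a hint and has no price."""
    text = line.strip().lower()
    return any(hint in text for hint in CATEGORY_HINTS) and PRICE_REGEX.search(line) is None


print(split_item_and_price("Masala Dosa ..... 120"))   # ('Masala Dosa', 120.0)
print(split_item_and_price("Paneer Tikka Rs. 250/-"))  # ('Paneer Tikka', 250.0)
print(looks_like_category("SOUPS"))                    # True
```

If your menus use a different currency marker or price layout, the regular expression is the part to change; new section names go into the hint list.
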