---
title: Menu to Excel Converter
emoji: 📊
colorFrom: blue
colorTo: green
sdk: gradio
sdk_version: 5.49.1
app_file: app.py
pinned: false
---

# Menu OCR → Excel (Batch + Validation) - Hugging Face Space

This package contains a ready-to-deploy Gradio app that processes menu images into Excel files.

Files included:
- `app.py`: Gradio application (batch processing + validation)
- `requirements.txt`: Python dependencies for the Space

Filename format (recommended for automatic metadata extraction)
--------------------------------------------------------------
Images should be named like:
    <Store Name>_<Store Code> <Branch Name>.<ext>
Example:
    Fortis Hospital_60247010108 Rohini.jpg

The app extracts:
- A1 = Store Name (e.g., "Fortis Hospital")
- B1 = Store Code (e.g., "60247010108")
- C1 = Branch Name (e.g., "Rohini")
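
As a rough illustration of the naming convention above, the following sketch splits a filename into the three metadata fields. It is not the code from `app.py`: the names `FILENAME_PATTERN`, `MenuMetadata`, and `parse_menu_filename` are hypothetical, and the sketch assumes the store code contains no spaces.

```python
import re
from pathlib import Path
from typing import NamedTuple

# Hypothetical pattern for "<Store Name>_<Store Code> <Branch Name>".
# Assumption: the store code is a single token without whitespace.
FILENAME_PATTERN = re.compile(r"(?P<store>.+)_(?P<code>\S+)\s+(?P<branch>.+)")


class MenuMetadata(NamedTuple):
    store_name: str   # written to A1
    store_code: str   # written to B1
    branch_name: str  # written to C1


def parse_menu_filename(filename: str) -> MenuMetadata:
    """Extract store name, store code, and branch name from an image filename."""
    stem = Path(filename).stem  # drop any directory part and the extension
    match = FILENAME_PATTERN.fullmatch(stem)
    if match is None:
        raise ValueError(f"filename does not follow the expected format: {filename!r}")
    return MenuMetadata(
        match["store"].strip(),
        match["code"],
        match["branch"].strip(),
    )


meta = parse_menu_filename("Fortis Hospital_60247010108 Rohini.jpg")
print(meta.store_name, "|", meta.store_code, "|", meta.branch_name)
```

Raising an error on a non-matching name is one possible design; an app could equally leave A1..C1 blank and let the user fill them in during review.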

How to use (UI steps)
---------------------
1. Create a Hugging Face Space: SDK = Gradio, Runtime = Python 3.10.
2. Upload `app.py` and `requirements.txt` to the Space files area.
3. Open the Space UI after build completes.
4. In the UI:
   - Upload multiple menu images (left) and a single Excel template (.xlsx).
   - Click "Parse all images".
   - Select a parsed image from the dropdown to review.
   - Edit the extracted table if needed and click "Save current edits".
   - When finished, click "Download ZIP of all (use after saving/edits)" to get all generated Excel files.

Output format
-------------
Each generated .xlsx is a copy of your uploaded template with:
- Row 1: metadata (A1 Store Name, B1 Store Code, C1 Branch Name)
- Row 2: your existing headers (unchanged)
- Row 3 onward: parsed menu items mapped into columns A..S:
  - A: Parent Category
  - B: Category
  - C: Name
  - D: Item Code
  - E: Master Item Name
  - F: EAN Code
  - G: Price
  - H: Active
  - I: Priority
  - J: Image
  - K: Food type
  - L: NoOfMains
  - M: OnlineName
  - N: AlternateClassification
  - O: ItemTaxInclusive
  - P: TaxPct
  - Q: BrandName
  - R: ClassificationCode
  - S: HSN Code
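
The layout above can be sketched as a mapping from cell references to values. This is only an illustration of the row and column arithmetic, using the standard library; the names `TEMPLATE_COLUMNS`, `item_to_cells`, and `sheet_cells` are hypothetical, the dictionary keys for a parsed item are assumed to match the column names, and leaving missing fields as empty strings is an assumption rather than documented behaviour of `app.py`.

```python
from string import ascii_uppercase

# Column order A..S as listed in the output format.
TEMPLATE_COLUMNS = [
    "Parent Category", "Category", "Name", "Item Code", "Master Item Name",
    "EAN Code", "Price", "Active", "Priority", "Image", "Food type",
    "NoOfMains", "OnlineName", "AlternateClassification", "ItemTaxInclusive",
    "TaxPct", "BrandName", "ClassificationCode", "HSN Code",
]
FIRST_DATA_ROW = 3  # row 1 holds metadata, row 2 holds the template headers


def item_to_cells(item: dict, row: int) -> dict:
    """Map one parsed item to {cell reference: value} for the given row."""
    return {
        f"{letter}{row}": item.get(column, "")  # assumption: blank when missing
        for letter, column in zip(ascii_uppercase, TEMPLATE_COLUMNS)
    }


def sheet_cells(store_name: str, store_code: str, branch_name: str, items: list) -> dict:
    """Build all cell values for one output sheet, leaving row 2 untouched."""
    cells = {"A1": store_name, "B1": store_code, "C1": branch_name}
    for offset, item in enumerate(items):
        cells.update(item_to_cells(item, FIRST_DATA_ROW + offset))
    return cells


cells = sheet_cells(
    "Fortis Hospital", "60247010108", "Rohini",
    [
        {"Category": "Beverages", "Name": "Cold Coffee", "Price": 85.5},
        {"Category": "Snacks", "Name": "Samosa", "Price": 20},
    ],
)
print(cells["C3"], cells["G3"], cells["C4"])
```

A dictionary keyed by cell reference can then be written into a copy of the template with whichever spreadsheet library the app uses.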

Notes & troubleshooting
-----------------------
- Tesseract OCR must be installed on the host. If you get a Tesseract error, install the system Tesseract package (on a Hugging Face Space, system packages such as `tesseract-ocr` are typically listed in a `packages.txt` file).
- For better OCR accuracy, use high-resolution, well-lit images.
- To adjust price parsing, edit `PRICE_REGEX` inside `app.py`.
- To improve category detection, edit `CATEGORY_HINTS` inside `app.py`.
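
To give a feel for the kind of tuning the last two notes describe, here is a minimal sketch of a price pattern and a category hint list. The actual values of `PRICE_REGEX` and `CATEGORY_HINTS` in `app.py` are not shown in this README, so the pattern, the hint words, and the helper `classify_line` below are all assumptions for illustration.

```python
import re

# Hypothetical price pattern: optional currency marker (rupee sign, "Rs", "INR"),
# then an amount with up to two decimals, optionally followed by "/-",
# anchored at the end of the line.
PRICE_REGEX = re.compile(
    r"(?:\u20b9|\bRs\.?|\bINR)?\s*(\d+(?:\.\d{1,2})?)\s*(?:/-)?\s*$",
    re.IGNORECASE,
)

# Hypothetical hint words: a line equal to one of these is treated as a category.
CATEGORY_HINTS = ("starters", "main course", "beverages", "desserts")


def classify_line(line: str) -> tuple:
    """Return (kind, text, price) where kind is 'category', 'item', or 'text'."""
    text = line.strip()
    if text.lower() in CATEGORY_HINTS:
        return ("category", text, None)
    match = PRICE_REGEX.search(text)
    if match:
        # Everything before the price is the item name; drop dot leaders etc.
        name = text[: match.start()].strip(" .-:\t")
        if name:
            return ("item", name, float(match.group(1)))
    return ("text", text, None)


for sample in ["Beverages", "Masala Dosa ........ 120", "Cold Coffee Rs. 85.50"]:
    print(classify_line(sample))
```

Menus that use other currencies or thousands separators would need a wider pattern, and new section headings can be covered by extending the hint list.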
