Commit 567c2bd by viswanani (verified), 1 parent: aa222f6

Update README.md

  1. README.md +64 -9
README.md CHANGED
@@ -9,14 +9,69 @@ app_file: app.py
  pinned: false
  ---
 
- # Menu to Excel Converter (Separate file per image)
 
- Upload multiple restaurant menu images, extract text via OCR, parse items and prices heuristically, then download a ZIP containing individual Excel files (one per image).
 
- ## How to use on Hugging Face
- 1. Create a new Space (Gradio + Python).
- 2. Upload `app.py`, `requirements.txt`, and this `README.md`.
- 3. In Space Settings → System Packages, add:
-    - `tesseract-ocr`
- 4. Restart the Space.
- 5. Use the web UI to upload menu images and download the ZIP of Excel files.
+ # Menu OCR Excel (Batch + Validation): Hugging Face Space
 
+ This package contains a ready-to-deploy Gradio app that processes menu images into Excel files.
 
+ Files included:
+ - `app.py`: Gradio application (batch processing + validation)
+ - `requirements.txt`: Python dependencies for the Space
+
+ Filename format (recommended for automatic metadata extraction)
+ --------------------------------------------------------------
+ Images should be named like:
+ `<Store Name>_<Store Code> <Branch Name>.<ext>`
+ Example:
+ `Fortis Hospital_60247010108 Rohini.jpg`
+
+ From the filename, the app extracts the metadata written to row 1:
+ - A1 = Store Name (e.g., "Fortis Hospital")
+ - B1 = Store Code (e.g., "60247010108")
+ - C1 = Branch Name (e.g., "Rohini")
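
The filename convention above could be parsed roughly as follows. This is a minimal sketch, not the code in `app.py`: the names `StoreMeta`, `_FILENAME_RE`, and `parse_menu_filename` are hypothetical, and the pattern assumes the store code is all digits, as in the example.

```python
import re
from pathlib import Path
from typing import NamedTuple, Optional


class StoreMeta(NamedTuple):
    store_name: str   # goes to cell A1
    store_code: str   # goes to cell B1
    branch_name: str  # goes to cell C1


# Assumed layout: "<Store Name>_<Store Code> <Branch Name>", with a numeric code.
_FILENAME_RE = re.compile(r"^(?P<name>.+)_(?P<code>\d+)\s+(?P<branch>.+)$")


def parse_menu_filename(filename: str) -> Optional[StoreMeta]:
    """Split an image filename into store metadata, or return None if it does not fit."""
    stem = Path(filename).stem  # drop the extension (.jpg, .png, ...)
    match = _FILENAME_RE.match(stem)
    if match is None:
        return None
    return StoreMeta(
        store_name=match["name"].strip(),
        store_code=match["code"],
        branch_name=match["branch"].strip(),
    )
```

Returning `None` for filenames that do not follow the convention lets the caller leave row 1 empty rather than writing guessed metadata.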
+
+ How to use (UI steps)
+ ---------------------
+ 1. Create a Hugging Face Space: SDK = Gradio, Runtime = Python 3.10.
+ 2. Upload `app.py` and `requirements.txt` to the Space files area.
+ 3. Open the Space UI after the build completes.
+ 4. In the UI:
+    - Upload multiple menu images (left) and a single Excel template (.xlsx).
+    - Click "Parse all images".
+    - Select a parsed image from the dropdown to review.
+    - Edit the extracted table if needed and click "Save current edits".
+    - When finished, click "Download ZIP of all (use after saving/edits)" to get all generated Excel files.
+
+ Output format
+ -------------
+ Each generated .xlsx is a copy of your uploaded template with:
+ - Row 1: metadata (A1 Store Name, B1 Store Code, C1 Branch Name)
+ - Row 2: your existing headers (unchanged)
+ - Row 3 onward: parsed menu items mapped into columns A..S:
+   - A: Parent Category
+   - B: Category
+   - C: Name
+   - D: Item Code
+   - E: Master Item Name
+   - F: EAN Code
+   - G: Price
+   - H: Active
+   - I: Priority
+   - J: Image
+   - K: Food type
+   - L: NoOfMains
+   - M: OnlineName
+   - N: AlternateClassification
+   - O: ItemTaxInclusive
+   - P: TaxPct
+   - Q: BrandName
+   - R: ClassificationCode
+   - S: HSN Code
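
To illustrate the column mapping above, here is a small sketch that turns one parsed item into a 19-cell row in A..S order. The helper names `COLUMNS`, `item_to_row`, and `column_letter` are hypothetical and the dictionary shape of a parsed item is an assumption; `app.py` may represent items differently.

```python
from string import ascii_uppercase

# Column headers in template order, A through S (taken from the list above).
COLUMNS = [
    "Parent Category", "Category", "Name", "Item Code", "Master Item Name",
    "EAN Code", "Price", "Active", "Priority", "Image", "Food type",
    "NoOfMains", "OnlineName", "AlternateClassification", "ItemTaxInclusive",
    "TaxPct", "BrandName", "ClassificationCode", "HSN Code",
]


def item_to_row(item: dict) -> list:
    """Return the item's values in A..S order; missing fields become empty strings."""
    return [item.get(header, "") for header in COLUMNS]


def column_letter(header: str) -> str:
    """Return the spreadsheet column letter for a header (works for A..Z only)."""
    return ascii_uppercase[COLUMNS.index(header)]


row = item_to_row({"Category": "Breakfast", "Name": "Masala Dosa", "Price": 120})
```

With openpyxl, such rows could then be written starting at row 3, for example with `ws.cell(row=3 + i, column=j + 1, value=value)` for the `j`-th value of the `i`-th row, leaving rows 1 and 2 of the template untouched.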
+
+ Notes & troubleshooting
+ -----------------------
+ - Tesseract OCR must be installed on the host. If you get a Tesseract error, install the system Tesseract package; on a Hugging Face Space, list `tesseract-ocr` in a `packages.txt` file at the repository root.
+ - For better OCR accuracy, use high-resolution, well-lit images.
+ - To adjust price parsing, edit `PRICE_REGEX` inside `app.py`.
+ - To improve category detection, edit `CATEGORY_HINTS` inside `app.py`.
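
As a rough idea of what adjusting price parsing involves, the sketch below shows one possible trailing-price pattern. The actual value of `PRICE_REGEX` in `app.py` is not shown in this README, so the pattern here, along with the helper `split_item_and_price`, is purely illustrative.

```python
import re
from typing import Optional, Tuple

# Illustrative pattern only: an optional currency marker, then a price at the
# end of the line, optionally followed by "/-". Not the pattern from app.py.
PRICE_REGEX = re.compile(
    r"(?:₹|Rs\.?|INR)?\s*(?<!\d)(\d{1,5}(?:\.\d{1,2})?)\s*(?:/-)?\s*$",
    re.IGNORECASE,
)


def split_item_and_price(line: str) -> Optional[Tuple[str, float]]:
    """Split an OCR line into (item name, price); return None if no trailing price."""
    match = PRICE_REGEX.search(line)
    if match is None:
        return None
    # Everything before the price is the name; trim dot leaders and separators.
    name = line[: match.start()].strip(" .-:\t")
    if not name:
        return None
    return name, float(match.group(1))
```

Lines without a trailing price, such as a section title like "Starters", return `None`; those are the kind of lines that category detection would need to handle instead.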