Shaikhsarib committed on
Commit d2f3059 · 1 Parent(s): 55714d6

docs: exhaustive repository architecture map and Grand Tour update

  1. README.md +80 -66
README.md CHANGED
@@ -4,85 +4,99 @@ Eatlytic is a high-performance, AI-native nutritional analysis engine designed t

  ---

- ## 🚀 The Eatlytic Intelligence Pipeline
-
- The system processes images through a sophisticated, multi-pass pipeline:
-
- ### **1. Visual Hardening & Detection**
- - **Deduplication**: Uses Perceptual Hashing (pHash) in `hash_service.py` to identify returning images and serve cached results instantly, skipping expensive OCR and AI calls.
- - **MSER ROI Targeting**: `label_detector.py` uses Maximally Stable Extremal Regions to detect text density heatmaps. This allows the system to find nutrition tables on any material (shiny, dark, or transparent) without relying on hardcoded color panels.
-
- ### **2. "Never Reject" Blur Repair (`image.py`)**
- Eatlytic is designed to read "real-world" photos, including blurry WhatsApp thumbnails:
- - **Repair Pipeline**: Triggers a 4-stage repair sequence for blurry images: **Lanczos4 Upscale (1800px)** → **Motion Deconvolution** → **Local Contrast (CLAHE)** → **Unsharp Masking**.
- - **Result**: Drastically improves success rates for compressed or low-light photos that standard OCR would reject.
-
- ### **3. Global script-aware OCR (`ocr.py`)**
- - **Auto-Language Detection**: Identifies the script (Arabic, CJK, Hindi, European) from the image center before running full OCR.
- - **Multilingual Support**: Expanded to support **18+ languages** including Thai, Korean, Japanese, Russian, and Arabic.
- - **Multi-Pass Retry**: If confidence is low, the system automatically retries with dedicated Denoise, Sharpen, and Binary processing passes.
-
- ### **4. Universal Analysis Brain (`llm.py`)**
- - **Global Extraction**: A universalized "Super-Prompt" instructs the AI to act as a global nutrition specialist, extracting 15+ rich data fields (Molecular Insight, ELI5, Age Warnings) regardless of regional naming conventions.
- - **Research Fallback**: If OCR is messy, `research_engine.py` performs a targeted web search for verified manufacturer data.
-
- ### **5. Physics & Validation**
- - **Universal Atwater Gate (`fake_detector.py`)**: Checks the physics of the label. If `(Protein * 4) + (Fat * 9) + (Carb * 4)` does not match the stated Calories (within a 20% universal tolerance), the system flags the data as unreliable.
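
To make the Atwater gate in the last bullet concrete, here is a minimal sketch of such a check. The function names, the decision to measure the 20% tolerance relative to the stated calories, and the rejection of non-positive calorie values are assumptions for illustration; the diff does not show how `fake_detector.py` implements it.

```python
def atwater_calories(protein_g: float, fat_g: float, carb_g: float) -> float:
    """Estimate energy in kcal using the 4/9/4 Atwater factors."""
    return protein_g * 4 + fat_g * 9 + carb_g * 4


def is_label_plausible(
    protein_g: float,
    fat_g: float,
    carb_g: float,
    stated_kcal: float,
    tolerance: float = 0.20,  # the "20% universal tolerance" from the README
) -> bool:
    """Return True when the stated calories agree with the Atwater estimate.

    Assumption: the tolerance is taken relative to the stated calories.
    """
    if stated_kcal <= 0:
        # A label with zero or negative calories cannot be validated.
        return False
    estimated = atwater_calories(protein_g, fat_g, carb_g)
    return abs(estimated - stated_kcal) <= tolerance * stated_kcal
```

For a label listing 10 g protein, 5 g fat, and 20 g carbohydrate, the estimate is 165 kcal, so a stated value of 165 or 200 kcal passes while 300 kcal is flagged.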
 
 
 
 
 
 
 
 
 
 
 
  ---

- ## 📂 Exact File Structure

  ```text
  Eatlytic-App-main/
  ├── app/
  │   ├── models/
- │   │   └── db.py                    # SQLite & Persistence core
  │   └── services/
- │       ├── ocr.py                   # Multi-Pass Global OCR Engine
- │       ├── llm.py                   # Universal AI Brain
- │       ├── fake_detector.py         # Universal Atwater Physics Validator
- │       ├── label_detector.py        # MSER Vision & ROI Targeting
- │       ├── image.py                 # "Never Reject" Blur Repair Pipeline
- │       ├── duel_service.py          # Persona-Weighted Comparison
- │       ├── alternatives.py          # Global Healthy Swap Matrix
- │       ├── hash_service.py          # Perceptual Hashing (pHash)
- │       ├── research_engine.py       # Live Web Research (DDG)
- │       ├── formatter.py             # Post-processing & WhatsApp Tiers
- │       ├── explanation_engine.py    # ICMR/Global RDA Benchmarking
- │       ├── auth.py                  # API Authentication
- │       └── payments.py              # Quota & Payment logic
- ├── main.py                          # FastAPI Production Endpoint
- ├── index.html                       # Web Front-end
- ├── test_critical.py                 # Core Stability Testing
- ├── test_phash.py                    # Deduplication Testing
- ├── test_poison_pill.py              # Security & Resilience Testing
- ├── flush_cache.py                   # Maintenance: Cache Clearing
- ├── inspect_db.py                    # Maintenance: DB Inspection
- ├── scrub_meat.py                    # Maintenance: Categorization Repair
- ├── Dockerfile                       # Production Containerization
- ├── docker-compose.yml               # Local Orchestration
- ├── requirements.txt                 # System Dependencies
- ├── .env                             # Environment Config
- └── Eatlytic-12Week-Roadmap.md       # Strategic Growth Plan
  ```

  ---

- ## ⚡ Key Specialty Features
-
- ### **The Duel Engine**
- Located in `duel_service.py`, this module allows head-to-head product comparisons. Unlike generic score comparison, it uses a **Persona-Weighted Matrix** (Muscle Mode, Diabetic Mode, Weight Loss Mode).
-
- ### **Universal Product Alternatives**
- The `alternatives.py` module uses an **Ingredient-Pivot** matrix that recommends healthy swaps locally and globally (e.g., roasted Makhana for chips, or Poha for instant noodles).
-
- ---
-
- ## 🛠️ Performance & Scalability
- - **Quota Logic**: Built-in scan tracking in `main.py` prevents API abuse.
- - **Cache Safety Valve**: Automatically discards results with 0 macronutrients.
- - **HuggingFace Ready**: Memory-optimized deployment for HF Spaces.

  ---
 
 

  ---

+ ## 🚀 The Eatlytic "Grand Tour" Architecture
+
+ The system is organized into a modular, horizontally scalable architecture:
+
+ ### **1. The Intelligence Services (`app/services/`)**
+ The "Brain" of Eatlytic, where raw pixels become data:
+ - **`ocr.py`**: The Multi-Pass Global OCR Engine. Features auto-script detection and 3-pass retry (Denoise/Sharpen/Binary).
+ - **`llm.py`**: The Universal AI Brain. Merges OCR text with global knowledge to produce high-fidelity JSON (Molecular Insight, ELI5, Age Warnings).
+ - **`label_detector.py`**: CV ROI targeting using **MSER (Maximally Stable Extremal Regions)** text-density heatmaps.
+ - **`image.py`**: The **"Never Reject"** repair pipeline (Upscale, Wiener deconvolution, CLAHE).
+ - **`fake_detector.py`**: The Atwater Physics Validator (Universal 20% tolerance floor).
+ - **`duel_service.py`**: Head-to-head persona-weighted product comparison logic.
+ - **`alternatives.py`**: Global healthy swap matrix (Ingredient-Pivot logic).
+ - **`hash_service.py`**: Perceptual Hashing (pHash) for instant deduplication.
+ - **`research_engine.py`**: Live Web Research (DDG) for messy-label fallback.
+ - **`explanation_engine.py`**: Global/ICMR RDA benchmarking and INS/E-number scanner.
+ - **`formatter.py`**: Result post-processing and text-tiering for WhatsApp/Web.
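
The pHash deduplication listed for `hash_service.py` usually comes down to comparing hashes by Hamming distance, so that a re-compressed copy of the same photo still hits the cache. The sketch below assumes 64-bit integer hashes computed elsewhere, an in-memory dictionary as the cache, and a distance threshold of 5 bits; none of these details are confirmed by the diff, and the function names are hypothetical.

```python
from typing import Dict, Optional


def hamming_distance(a: int, b: int) -> int:
    """Count the bits that differ between two non-negative integer hashes."""
    return bin(a ^ b).count("1")


def find_cached_result(
    image_hash: int,
    cache: Dict[int, dict],
    max_distance: int = 5,  # assumed threshold, not taken from the repository
) -> Optional[dict]:
    """Return the cached result whose hash is closest to image_hash.

    Returns None when no cached hash lies within max_distance bits,
    in which case the caller would run the full OCR and AI pipeline.
    """
    best_result = None
    best_distance = max_distance + 1
    for cached_hash, result in cache.items():
        distance = hamming_distance(image_hash, cached_hash)
        if distance < best_distance:
            best_result, best_distance = result, distance
    return best_result
```

A linear scan like this is fine for a small cache; a larger deployment would likely need an index suited to nearest-neighbour search over bit strings.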
+
+ ### **2. The API Layer (`app/routes/`)**
+ Handles user interaction, security, and business logic:
+ - **`auth.py`**: User authentication, session management, and Supabase security integration.
+ - **`benchmarks.py`**: Internal performance tracking (Latency, accuracy, and ROI stats).
+ - **`food_db.py`**: Analytics and Scan History (The backbone of the "History" tab).
+ - **`payments.py`**: Quota management and **Razorpay** integration for Pro activation.
+
+ ### **3. The Persistence Layer (`app/models/`)**
+ The source of truth for the platform:
+ - **`db.py`**: Hybrid persistence engine. Uses **Supabase** for production clusters and **SQLite (WAL mode)** for local development and offline caching.
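
For readers unfamiliar with the SQLite WAL mode mentioned for `db.py`, the following sketch shows how a local connection can switch to write-ahead logging with Python's standard `sqlite3` module. The helper name `open_local_db` and the demo file name are hypothetical; how `db.py` actually opens its connection, and how it selects between Supabase and SQLite, is not shown in the diff.

```python
import os
import sqlite3
import tempfile


def open_local_db(path: str) -> sqlite3.Connection:
    """Open a file-backed SQLite database and enable write-ahead logging."""
    conn = sqlite3.connect(path)
    # The PRAGMA returns the journal mode that is actually in effect.
    mode = conn.execute("PRAGMA journal_mode=WAL;").fetchone()[0]
    if mode.lower() != "wal":
        conn.close()
        raise RuntimeError(f"could not enable WAL mode, got {mode!r}")
    return conn


if __name__ == "__main__":
    # Demo against a throwaway database file.
    with tempfile.TemporaryDirectory() as tmp_dir:
        connection = open_local_db(os.path.join(tmp_dir, "eatlytic_demo.db"))
        print(connection.execute("PRAGMA journal_mode;").fetchone()[0])
        connection.close()
```

WAL mode lets readers proceed while a write is in progress, which suits a local cache that is read on every scan. It only applies to file-backed databases; an in-memory database reports `memory` instead.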
+
+ ### **4. Maintenance & CLI Tooling (Root)**
+ Scripts for system upkeep and data repair:
+ - **`flush_cache.py`**: Clears broken or 0-nutrient cache entries.
+ - **`scrub_meat.py`**: Repairs categorization errors across the database.
+ - **`inspect_db.py`**: Terminal-based dashboard for viewing live scans and quotas.
+ - **`deploy.sh`**: Manual deployment script for server environments.

  ---

+ ## 📂 Exhaustive File Structure

  ```text
  Eatlytic-App-main/
+ ├── .github/
+ │   └── workflows/
+ │       └── sync_to_huggingface.yml  # CI/CD: HF Spaces sync
  ├── app/
  │   ├── models/
+ │   │   └── db.py                    # Persistence: Supabase/SQLite hybrid
+ │   ├── routes/
+ │   │   ├── auth.py                  # API: User Auth & Tokens
+ │   │   ├── benchmarks.py            # API: Performance Monitoring
+ │   │   ├── food_db.py               # API: History & Analytics
+ │   │   └── payments.py              # API: Razorpay & Quotas
  │   └── services/
+ │       ├── ocr.py                   # Logic: Global OCR (18+ Scripts)
+ │       ├── llm.py                   # Logic: Universal AI Brain
+ │       ├── label_detector.py        # Vision: MSER ROI Targeting
+ │       ├── image.py                 # Vision: "Never Reject" Repair
+ │       ├── fake_detector.py         # Physics: Atwater Validation
+ │       ├── duel_service.py          # Feature: Persona-Weighted Duels
+ │       ├── alternatives.py          # Feature: Ingredient Swaps
+ │       ├── hash_service.py          # Performance: pHash Deduplication
+ │       ├── research_engine.py       # Fallback: Live Web Research
+ │       ├── formatter.py             # UX: Post-processing & Formatting
+ │       ├── explanation_engine.py    # Science: RDA & INS Scanning
+ │       ├── auth.py                  # Logic: Backend Security
+ │       └── payments.py              # Logic: Quota Verification
+ ├── data/
+ │   ├── eatlytic.db                  # Local Persistence (fallback)
+ │   ├── ai_cache.json                # Local AI result cache
+ │   └── ocr_cache.json               # Local OCR heatmap cache
+ ├── main.py                          # Application Core (FastAPI)
+ ├── index.html                       # Frontend Entry Point
+ ├── test_critical.py                 # Stability: Pipeline stress tests
+ ├── test_phash.py                    # Logic: Deduplication verification
+ ├── test_poison_pill.py              # Security: Input resilience tests
+ ├── conftest.py                      # Testing: Framework config
+ ├── flush_cache.py                   # Maintenance: Cache Repair
+ ├── inspect_db.py                    # Maintenance: DB Explorer
+ ├── scrub_meat.py                    # Maintenance: Data Repair
+ ├── Dockerfile                       # Infrastructure: Docker Image
+ ├── docker-compose.yml               # Infrastructure: Local orchestration
+ ├── requirements.txt                 # Dependencies: System packages
+ ├── .env                             # Configuration: API Keys/URLs
+ └── Eatlytic-12Week-Roadmap.md       # Strategy: Future Growth
  ```

  ---

+ ## ⚡ Performance & Compliance
+ - **DPDP Compliant**: Built-in data erasure (`/api/v1/user/delete`) and retention management in `db.py`.
+ - **HuggingFace Ready**: Auto-deploy via `.github/workflows` with tailored memory management for C-based vision libraries.
+ - **Cache Safety Valve**: Automatically discards suspect entries to ensure 100% data integrity.
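
The new wording of the Cache Safety Valve bullet does not say what counts as a suspect entry; the removed README text described it as discarding results with 0 macronutrients. A minimal sketch of that rule follows. The field names in `MACRO_KEYS`, the dictionary-shaped cache, and both function names are hypothetical, chosen only to illustrate the idea.

```python
# Hypothetical field names; the real result schema is not shown in the diff.
MACRO_KEYS = ("protein_g", "fat_g", "carbs_g")


def is_cacheable(result: dict) -> bool:
    """Accept a result only if every macro is numeric and at least one is positive."""
    values = []
    for key in MACRO_KEYS:
        value = result.get(key)
        # bool is a subclass of int, so it is excluded explicitly.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            return False
        values.append(value)
    return any(value > 0 for value in values)


def flush_suspect_entries(cache: dict) -> int:
    """Delete entries that fail is_cacheable and return how many were removed."""
    suspect_keys = [key for key, result in cache.items() if not is_cacheable(result)]
    for key in suspect_keys:
        del cache[key]
    return len(suspect_keys)
```

Running the same predicate before writing to the cache and in a maintenance script keeps the two paths consistent, which is presumably the role `flush_cache.py` plays for entries that slipped through earlier.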
 
 
 
 
 
 
 
 
 
 
  ---