File size: 5,452 Bytes
af7a6d5
 
 
 
 
c489928
af7a6d5
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
---
title: SmartDoc AI
colorFrom: blue
colorTo: indigo
sdk: gradio
sdk_version: "6.2.0"
app_file: main.py
pinned: false
---

# SmartDoc AI

SmartDoc AI is an advanced document analysis and question answering system, designed for source-grounded Q&A over complex business and scientific reports—especially where key evidence lives in tables and charts.

---

##  Personal Research Update

**SmartDoc AI – Document Q&A + Selective Chart Understanding**

I’ve been developing SmartDoc AI as a technical experiment to improve question answering over complex business/scientific reports—especially where key evidence lives in tables and charts.

### Technical highlights:

- **Multi-format ingestion:** PDF, DOCX, TXT, Markdown
- **LLM-assisted query decomposition:** breaks complex prompts into clearer sub-questions for retrieval + answering
- **Selective chart pipeline (cost-aware):**
  - Local OpenCV heuristics flag pages that likely contain charts
  - Gemini Vision is invoked only for chart pages to generate structured chart analysis (reduces unnecessary vision calls)
- **Table extraction + robust PDF parsing:** pdfplumber strategies for bordered and borderless tables
- **Parallelized processing:** concurrent PDF parsing + chart detection; batch chart analysis where enabled
- **Hybrid retrieval:** BM25 + vector search combined via an ensemble retriever
- **Multi-agent answering:** answer drafting + verification pass, with retrieved context available for inspection (page/source metadata)

**Runtime note:** Large PDFs (many pages/charts) can take minutes depending on DPI, chart volume, and available memory/CPU (HF Spaces limits can be a factor).

---

##  Demo Videos

- [SmartDoc AI technical demo #1](https://youtu.be/uVU_sLiJU4w)
- [SmartDoc AI technical demo #2](https://youtu.be/c8CF7-OaKmQ)
- [SmartDoc AI technical demo #3](https://youtu.be/P17SZSQJ6Wc)

---

## Repository
?? https://github.com/TilanTAB/Intelligent-Document-Analysis-SmartDoc-AI

---

## Use Cases

- Source-grounded Q&A for business/research documents
- Automated extraction and summarization from tables/charts

If you’re interested in architecture tradeoffs (cost, latency, memory limits, retrieval quality), feel free to connect.

---

## Features

- **Multi-format Document Support**: PDF, DOCX, TXT, and Markdown
- **Smart Chunking**: Configurable chunk size and overlap for optimal retrieval
- **Intelligent Caching**: Speeds up repeated queries
- **Chart Extraction**: Detects and analyzes charts using OpenCV and Gemini Vision
- **Hybrid Search**: Combines keyword and vector search for best results
- **Multi-Agent Workflow**: Relevance checking, research, and answer verification
- **Production Ready**: Structured logging, environment-based config, and test suite
- **Efficient**: Local chart detection saves up to 95% on API costs

---

## Quick Start

### Prerequisites
- Python 3.11 or higher
- Google API Key for Gemini models ([Get one here](https://ai.google.dev/))

### Installation

1. Clone the repository:
```bash
git clone https://github.com/TilanTAB/Intelligent-Document-Analysis-SmartDoc-AI.git
cd Intelligent-Document-Analysis-SmartDoc-AI
```

2. Activate the virtual environment:
```bash
# Windows PowerShell
.\activate_venv.ps1
# Windows Command Prompt
activate_venv.bat
# Or manually:
.\venv\Scripts\Activate.ps1
```

3. Install dependencies (if needed):
```bash
pip install -r requirements.txt
```

4. Configure environment variables:
```bash
cp .env.template .env
# Edit .env and set your API key
GOOGLE_API_KEY=your_api_key_here
```

5. (Optional) Verify installation:
```bash
python verify_environment.py
```

6. Run the application:
```bash
python main.py
```

7. Open your browser to [http://localhost:7860](http://localhost:7860)

---

## Configuration

All settings can be configured via environment variables or the `.env` file. Key options include:
- `GOOGLE_API_KEY`: Your Gemini API key (required)
- `CHUNK_SIZE`, `CHUNK_OVERLAP`: Document chunking
- `ENABLE_CHART_EXTRACTION`: Enable/disable chart detection
- `CHART_USE_LOCAL_DETECTION`: Use OpenCV for free chart detection
- `CHART_ENABLE_BATCH_ANALYSIS`: Batch process charts for speed
- `CHART_GEMINI_BATCH_SIZE`: Number of charts per Gemini API call
- `LOG_LEVEL`: Logging verbosity
- `GRADIO_SERVER_PORT`: Web interface port

---

## Project Structure

- `intelligence/` - Multi-agent system (relevance, research, verification)
- `configuration/` - App settings and logging
- `content_analyzer/` - Document and chart processing
- `search_engine/` - Hybrid retriever logic
- `core/` - Utilities and diagnostics
- `tests/` - Test suite
- `main.py` - Application entry point

---

## Troubleshooting

- **API Key Not Found**: Set `GOOGLE_API_KEY` in your `.env` file.
- **Python 3.13 Issues**: Use Python 3.11 or 3.12 for best compatibility.
- **Chart Detection Slow**: Lower `CHART_DPI` or `CHART_MAX_IMAGE_SIZE` in `.env`.
- **ChromaDB Lock Issues**: Stop all instances and remove lock files in `vector_store/`.

---

## Contributing

Contributions are welcome! Please fork the repository, create a feature branch, and submit a pull request with a clear description.

---

## License

This project is licensed under the MIT License.

---

SmartDoc AI is actively maintained and designed for real-world document analysis and Q&A. For updates and support, visit the [GitHub repository](https://github.com/TilanTAB/Intelligent-Document-Analysis-SmartDoc-AI).