# Digi-Biz 📄

**Agentic Business Digitization Framework**

Transform unstructured business documents into structured digital profiles using AI agents.

[![Tests](https://img.shields.io/badge/tests-66%20passed-green)]() [![Python 3.10+](https://img.shields.io/badge/python-3.10+-blue.svg)]() [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)]()

---

## 🚀 Quick Start

### 1. Install Dependencies

```bash
pip install -r requirements.txt
```

### 2. Configure Groq API

Get your free API key at https://console.groq.com

Create `.env` file:

```bash
GROQ_API_KEY=gsk_your_key_here
GROQ_VISION_MODEL=meta-llama/llama-4-scout-17b-16e-instruct
```

### 3. Run the App

```bash
streamlit run app.py
```

Open http://localhost:8501

---

## ✨ Features

- ✅ **Multi-Agent Pipeline** - 6 specialized agents
- ✅ **Groq Vision** - Image analysis with Llama-4-Scout (17B)
- ✅ **Vectorless RAG** - Fast document retrieval
- ✅ **Production-Ready** - Error handling, validation, logging
- ✅ **Interactive UI** - Streamlit web interface

---

## 📊 What It Does

1. **Upload ZIP** with business documents
2. **AI Agents Process**:
   - File Discovery → Classify files
   - Document Parsing → Extract text/tables
   - Table Extraction → Detect & classify
   - Media Extraction → Extract images
   - Vision Analysis → Describe images (Groq)
   - Indexing → Build search index (RAG)
3. **View Results** in interactive UI

---

## 🎯 Example Use Cases

### Restaurant Digitization

- Upload: Menu PDFs, food photos, price lists
- Output: Digital menu with prices, food descriptions, categories

### Travel Agency

- Upload: Tour brochures, itinerary PDFs, destination photos
- Output: Tour packages with itineraries, pricing, descriptions

### Retail Store

- Upload: Product catalogs, inventory spreadsheets, product photos
- Output: Product inventory with descriptions, prices, categories

---

## 📁 Project Structure

```
digi-biz/
├── backend/agents/     # 6 AI agents
├── backend/models/     # Data schemas
├── backend/utils/      # Utilities
├── tests/agents/       # Test suites
├── app.py              # Streamlit app
├── requirements.txt    # Dependencies
└── docs/               # Documentation
```

---

## 🧪 Testing

All agents are thoroughly tested:

```bash
# Run all tests
pytest tests/ -v

# Test coverage
pytest tests/ --cov=backend
```

**Test Results:** 66/66 tests passing ✅

---

## 📖 Documentation

- **[Full Documentation](docs/DOCUMENTATION.md)** - Complete guide
- **[Agent Details](docs/AGENT_PIPELINE.md)** - Agent specifications
- **[Streamlit App](docs/STREAMLIT_APP.md)** - App usage guide

---

## 🔧 Configuration

### Environment Variables (.env)

```bash
# Groq API (required)
GROQ_API_KEY=gsk_xxxxx
GROQ_MODEL=gpt-oss-120b
GROQ_VISION_MODEL=meta-llama/llama-4-scout-17b-16e-instruct

# Optional: Ollama fallback
OLLAMA_HOST=http://localhost:11434

# Processing limits
MAX_FILE_SIZE=524288000  # 500MB
MAX_FILES_PER_ZIP=100
```

---

## 🎓 Agents

| # | Agent | Purpose | Status |
|---|-------|---------|--------|
| 1 | File Discovery | Extract & classify ZIP files | ✅ |
| 2 | Document Parsing | Parse PDF/DOCX | ✅ |
| 3 | Table Extraction | Detect & classify tables | ✅ |
| 4 | Media Extraction | Extract images/videos | ✅ |
| 5 | Vision Agent | Analyze images (Groq) | ✅ |
| 6 | Indexing Agent | Build RAG index | ✅ |

---

## 📊 Performance

| Task | Time |
|------|------|
| File Discovery (10 files) | ~1-2s |
| Document Parsing (10 pages) | ~0.5s |
| Table Extraction (5 tables) | ~0.5s |
| Vision Analysis (1 image) | ~2s |
| **Total (typical folder)** | **<2 min** |

---

## 🛠️ Tech Stack

- **Backend:** Python 3.10+, Pydantic, asyncio
- **Document Parsing:** pdfplumber, python-docx, openpyxl
- **Vision AI:** Groq API (Llama-4-Scout-17B)
- **Frontend:** Streamlit
- **Testing:** pytest

---

## 📝 License

MIT License - See [LICENSE](LICENSE) for details

---

## 🤝 Contributing

1. Fork the repo
2. Create feature branch
3. Add tests
4. Submit PR

---

## 📞 Support

- **Issues:** GitHub Issues
- **Docs:** [docs/DOCUMENTATION.md](docs/DOCUMENTATION.md)

---

**Made with ❤️ using AI Agents**
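
---

For readers wiring up their own tooling around the environment variables listed under Configuration, the following is a minimal sketch of how those values might be read and validated with the standard library only. The `Settings` class and `load_settings` function are hypothetical names for illustration and are not taken from the Digi-Biz codebase; the fallback defaults simply mirror the example values shown in the `.env` snippet and are an assumption, not documented behavior.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    """Hypothetical settings container (illustration only, not Digi-Biz code)."""
    groq_api_key: str
    groq_model: str
    groq_vision_model: str
    ollama_host: str
    max_file_size: int      # bytes
    max_files_per_zip: int


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Build Settings from a mapping of variables (defaults to os.environ).

    Defaults below are assumptions copied from the README's example values.
    """
    env = os.environ if env is None else env

    api_key = env.get("GROQ_API_KEY", "")
    if not api_key:
        # The README marks the Groq API key as required.
        raise ValueError("GROQ_API_KEY is required")

    return Settings(
        groq_api_key=api_key,
        groq_model=env.get("GROQ_MODEL", "gpt-oss-120b"),
        groq_vision_model=env.get(
            "GROQ_VISION_MODEL", "meta-llama/llama-4-scout-17b-16e-instruct"
        ),
        ollama_host=env.get("OLLAMA_HOST", "http://localhost:11434"),
        max_file_size=int(env.get("MAX_FILE_SIZE", 500 * 1024 * 1024)),
        max_files_per_zip=int(env.get("MAX_FILES_PER_ZIP", 100)),
    )


if __name__ == "__main__":
    # Pass an explicit mapping so the example runs without a real .env file.
    demo = load_settings({"GROQ_API_KEY": "gsk_your_key_here"})
    print(demo.groq_vision_model, demo.max_file_size, demo.max_files_per_zip)
```

Accepting a mapping argument keeps the function easy to test without touching the real process environment; in the app itself the values would presumably come from `.env` by whatever loader the project uses.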