# ✅ RAG Setup Complete!
## What Was Set Up
### 1. Extended RAG System
- **File**: `src/modal-rag-product-design.py`
- **Purpose**: Query the TokyoDrive Insurance product design document
- **Features**:
  - Supports both Markdown and Word documents
  - Uses separate ChromaDB collection (`product_design`)
  - Leverages existing Modal infrastructure
  - GPU-accelerated with Phi-3 model
### 2. Simple CLI Query Interface
- **File**: `query_product_design.py`
- **Features**:
  - Interactive mode for continuous queries
  - Single query mode for quick questions
  - Index command to set up the vector database
  - Clean, user-friendly output
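
The three modes listed above could be wired up with `argparse` roughly as follows. This is a minimal sketch, not the actual contents of `query_product_design.py`: only the flag names `--index`, `--query`, and `--interactive` come from this document, while `build_parser`, `describe_mode`, and the choice to make the flags mutually exclusive are assumptions for illustration.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser exposing the three modes described in this document."""
    parser = argparse.ArgumentParser(
        prog="query_product_design.py",
        description="Query the product design RAG system (illustrative sketch).",
    )
    # Assumption: exactly one mode is chosen per invocation.
    mode = parser.add_mutually_exclusive_group(required=True)
    mode.add_argument("--index", action="store_true",
                      help="set up the vector database")
    mode.add_argument("--query", metavar="QUESTION",
                      help="ask a single question and exit")
    mode.add_argument("--interactive", action="store_true",
                      help="ask questions in a loop")
    return parser


def describe_mode(argv: list[str]) -> str:
    """Return a short label for the mode selected by argv (hypothetical helper)."""
    args = build_parser().parse_args(argv)
    if args.index:
        return "index"
    if args.interactive:
        return "interactive"
    return f"query: {args.query}"


print(describe_mode(["--query", "What are the three product tiers?"]))
```

A mutually exclusive group keeps the modes from being combined by accident, for example `--index` together with `--query`.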
### 3. Documentation
- `docs/QUICK_START_RAG.md` - Quick start guide
- `docs/setup_product_design_rag.md` - Detailed setup instructions
- `docs/next_steps_rag_recommendation.md` - Decision guide
## Files Created
```
src/
└── modal-rag-product-design.py          # Extended RAG system
query_product_design.py                  # CLI query interface
docs/
├── QUICK_START_RAG.md                   # Quick start guide
├── setup_product_design_rag.md          # Setup instructions
├── next_steps_rag_recommendation.md     # Decision guide
└── RAG_SETUP_COMPLETE.md                # This file
```
## Next Steps
### 1. Index the Documents (Required First Step)
```bash
python query_product_design.py --index
```
This will:
- Load `tokyo_auto_insurance_product_design_filled.md`
- Load `tokyo_auto_insurance_product_design.docx`
- Create embeddings for the document chunks
- Store them in the `product_design` ChromaDB collection
**Time**: 2-5 minutes
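
Before embedding, the indexing step has to split each document into chunks. This document does not say how the real indexer splits text, so the following is only a sketch of one common approach, fixed-size character windows with overlap. The function name `split_into_chunks` and the default `chunk_size` and `overlap` values are assumptions, not settings taken from `src/modal-rag-product-design.py`.

```python
def split_into_chunks(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into character windows that overlap by `overlap` characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in [0, chunk_size)")

    step = chunk_size - overlap  # how far each window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks


# Tiny demonstration with artificial input.
for chunk in split_into_chunks("abcdefghij", chunk_size=4, overlap=1):
    print(chunk)
```

The overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk, which tends to help retrieval.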
### 2. Test with a Query
```bash
# Single query
python query_product_design.py --query "What are the three product tiers?"
# Or interactive mode
python query_product_design.py --interactive
```
### 3. Use Cases
#### For Product Development
```bash
python query_product_design.py --query "What are the technical requirements for the digital platform?"
python query_product_design.py --query "What API integrations are needed?"
```
#### For Sales/Marketing
```bash
python query_product_design.py --query "What are the premium ranges for each tier?"
python query_product_design.py --query "What discounts are available?"
```
#### For Compliance
```bash
python query_product_design.py --query "What are the FSA licensing requirements?"
python query_product_design.py --query "What is the minimum capital requirement?"
```
#### For Financial Planning
```bash
python query_product_design.py --query "What are the Year 3 financial projections?"
python query_product_design.py --query "What is the break-even point?"
```
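
To run several of the queries above in one go, a small wrapper can build and execute the command lines. This is a minimal sketch: only the script name and the `--query` flag come from this document, and `build_command` and `run_queries` are hypothetical helper names.

```python
import subprocess
import sys

SCRIPT = "query_product_design.py"  # CLI script named in this document


def build_command(question: str) -> list[str]:
    """Return the argv list for one single-query invocation."""
    return [sys.executable, SCRIPT, "--query", question]


def run_queries(questions: list[str]) -> None:
    """Run each question in turn; raises if a command exits with an error."""
    for question in questions:
        subprocess.run(build_command(question), check=True)


# Inspect the command without executing it.
print(build_command("What discounts are available?"))
```

Passing the question as its own list element avoids shell quoting problems when a question contains quotes or other special characters.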
## Architecture
```
┌─────────────────────────────────────┐
│  Product Design Documents           │
│  - Markdown (.md)                   │
│  - Word (.docx)                     │
└──────────────┬──────────────────────┘
               │
               ▼
┌─────────────────────────────────────┐
│  Modal Volume                       │
│  mcp-hack-ins-products              │
└──────────────┬──────────────────────┘
               │
               ▼
┌─────────────────────────────────────┐
│  Indexing Function                  │
│  - Load documents                   │
│  - Split into chunks                │
│  - Generate embeddings              │
└──────────────┬──────────────────────┘
               │
               ▼
┌─────────────────────────────────────┐
│  ChromaDB (Remote)                  │
│  Collection: product_design         │
└──────────────┬──────────────────────┘
               │
               ▼
┌─────────────────────────────────────┐
│  Query Interface                    │
│  - CLI tool (query_product_design)  │
│  - Modal RAG class                  │
└──────────────┬──────────────────────┘
               │
               ▼
┌─────────────────────────────────────┐
│  LLM (Phi-3)                        │
│  - Retrieves relevant chunks        │
│  - Generates answers                │
└─────────────────────────────────────┘
```
## How It Works
1. **Indexing**: Documents are split into chunks, embedded, and stored in ChromaDB
2. **Query**: User asks a question
3. **Retrieval**: System finds relevant chunks using semantic search
4. **Generation**: LLM generates answer based on retrieved context
5. **Response**: Answer + sources returned to user
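
The retrieval and generation steps above can be illustrated with a toy example. This is a sketch under loud assumptions: word-count vectors stand in for the real embedding model, an in-memory list stands in for ChromaDB, and the prompt is only built, not sent to Phi-3. All function names and the sample chunks are placeholders, not text from the product design document.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Stand-in embedding: lowercase word counts (not a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * \
        math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(question: str, chunks: list[str], top_k: int = 2) -> list[str]:
    """Return the top_k chunks most similar to the question."""
    q = embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:top_k]


def build_prompt(question: str, context_chunks: list[str]) -> str:
    """Assemble the text an LLM would receive: retrieved context, then the question."""
    context = "\n".join(f"- {c}" for c in context_chunks)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


# Placeholder chunks for illustration only.
chunks = [
    "premium ranges for each tier",
    "licensing requirements and capital",
    "api integrations for the platform",
]
question = "What are the licensing requirements?"
print(build_prompt(question, retrieve(question, chunks, top_k=1)))
```

Returning the retrieved chunks alongside the answer is what makes the "sources" part of the response possible.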
## Tips
### Best Practices
- **Be specific**: Ask "What is the premium for the Standard tier?" rather than "What is the premium?"
- **Ask one thing**: Break complex questions into simpler ones
- **Use context**: Reference specific sections if you know them
### Performance
- First query: ~10-15 seconds (cold start)
- Subsequent queries: ~3-5 seconds (warm container)
- Indexing: 2-5 minutes (one-time)
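
To check these figures in your own environment, you can time individual calls. The helper below is a generic sketch: `timed` and `fake_query` are hypothetical names, and `fake_query` merely sleeps in place of a real query against the Modal app.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def timed(fn: Callable[[], T]) -> tuple[T, float]:
    """Call fn once and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = fn()
    return result, time.perf_counter() - start


def fake_query() -> str:
    """Placeholder for a real query; sleeps briefly to simulate work."""
    time.sleep(0.01)
    return "answer"


# Timing the first and second call separately shows cold versus warm latency.
for label in ("first", "second"):
    answer, seconds = timed(fake_query)
    print(f"{label} call returned {answer!r} in {seconds:.3f}s")
```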
### Troubleshooting
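- **"No documents found"**: Check that the Modal volume (`mcp-hack-ins-products`) contains the source files
- **"Collection not found"**: Run `python query_product_design.py --index` first
- **Slow queries**: Expected on the first query (cold start); later queries should take ~3-5 seconds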
## Integration Ideas
1. **Development Workflow**: Extract requirements for Jira tickets
2. **Stakeholder Q&A**: Answer investor/partner questions quickly
3. **Documentation**: Auto-generate summaries for different audiences
4. **Compliance**: Generate compliance checklists automatically
5. **Sales**: Quick access to pricing and feature details
## Support
- See `docs/QUICK_START_RAG.md` for quick reference
- See `docs/setup_product_design_rag.md` for detailed setup
- Check Modal logs: `modal app logs insurance-rag-product-design`
---
**Status**: ✅ Ready to use!
**Next Action**: Run `python query_product_design.py --index` to get started.