File size: 2,794 Bytes
fd06b5a 48a5851 fd06b5a 48a5851 fd06b5a 48a5851 fd06b5a 48a5851 fd06b5a 48a5851 fd06b5a 48a5851 fd06b5a 48a5851 fd06b5a 48a5851 fd06b5a 48a5851 fd06b5a 48a5851 fd06b5a 48a5851 fd06b5a 48a5851 fd06b5a 48a5851 fd06b5a 48a5851 fd06b5a 48a5851 fd06b5a 48a5851 fd06b5a 48a5851 fd06b5a 48a5851 fd06b5a 48a5851 fd06b5a 48a5851 fd06b5a 48a5851 fd06b5a | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 |
# π Storage Management Guide
## Overview
Your system uses three storage locations for organization and persistence:
```
Project Root
βββ uploads/ # Temporary files (auto-cleanup after 24h)
βββ persistent_docs/ # Permanent files (company policies, etc.)
βββ chroma_db/ # Vector embeddings (independent of files)
```
## Storage Types
### uploads/
- Temporary chat uploads, one-time document queries
- Auto-deleted after 24 hours
### persistent_docs/
- Permanent storage for company policies, reference docs
- Manual cleanup only
### chroma_db/
- Persistent semantic embeddings for fast search
- Vectors remain even if source files are deleted
## Key Features
- **Automatic Cleanup:** Temporary uploads deleted after 24h (on startup or via API)
- **Persistent Documents:** Upload with `persistent=true` to store forever
- **Vector Store:** ChromaDB vectors always persist, even if files are deleted
## API Usage
### Upload File (Temporary)
```bash
curl -X POST "http://localhost:8000/upload" -F "file=@file.pdf"
# File goes to uploads/ and will be deleted after 24h
```
### Upload File (Persistent)
```bash
curl -X POST "http://localhost:8000/upload" -F "file=@file.pdf" -F "persistent=true"
# File goes to persistent_docs/ and stays forever
```
### Get Storage Info
```bash
curl http://localhost:8000/storage/info
```
### Manual Cleanup
```bash
curl -X POST "http://localhost:8000/storage/cleanup?max_age_hours=12"
# Removes temporary files older than 12 hours
```
## Vector Store Behavior
- Upload file β Vectors created in chroma_db/
- Delete source file β Vectors remain in chroma_db/
- Search works even if original file is gone
- To remove vectors, clear chroma_db/ manually
## Best Practices
- Use temporary storage for one-time analysis, personal uploads, testing
- Use persistent storage for policies, handbooks, SOPs, knowledge base
- Periodically clean chroma_db/ to free disk space
## Troubleshooting
- **Why can I still search deleted files?**
- Vectors persist in ChromaDB by design
- **How do I free up disk space?**
- Temporary files auto-delete; clear chroma_db/ for vectors
- **Change cleanup time?**
- Edit `cleanup_old_uploads(max_age_hours=24)` in main.py
- **Duplicate uploads?**
- Each upload gets a unique UUID filename; vectors stored by document_id
## Monitoring
Check usage regularly:
```bash
curl http://localhost:8000/storage/info
ls -lh uploads/
ls -lh persistent_docs/
du -sh chroma_db/
```
## Summary
- uploads/: Temporary, auto-cleanup (24h)
- persistent_docs/: Permanent, manual cleanup
- chroma_db/: Persistent vectors, independent of files
- Automatic and manual cleanup supported
- Storage info API for monitoring
Your multi-agent system now has production-ready storage management! π
|