# Document Management System
## Overview
Universal document management system with polymorphic ownership, automatic storage routing, and version control.
## Features
✅ **Universal & Polymorphic** - One system for all entities (users, tickets, projects, etc.)
✅ **Automatic Storage Routing** - Images/videos → Cloudinary, Documents → Supabase
✅ **Version Control** - Track document versions with history
✅ **Metadata & Tagging** - Rich metadata, tags, and search capabilities
✅ **Access Control** - Public/private documents with uploader tracking
✅ **Audit Logging** - Complete audit trail of all document operations
## Architecture
```
User uploads file
↓
FastAPI receives file
↓
StorageService determines provider based on MIME type
↓
├─→ Images/Videos → Cloudinary (CDN-optimized)
└─→ Documents (PDF, DOCX) β†’ Supabase Storage
↓
Receive URL from storage provider
↓
Create document record in database
↓
Return document metadata to user
```
## Storage Routing Rules
The system automatically routes each file to the provider best optimized for its MIME type. Routing reflects optimization, not capability:
| File Type | Default Provider | Reason | Alternative |
|-----------|-----------------|--------|-------------|
| `image/*` | **Cloudinary** | CDN delivery, auto-optimization, transformations | Supabase can also store images |
| `video/*` | **Cloudinary** | Streaming, transcoding, adaptive bitrate | Supabase can also store videos |
| `application/pdf` | **Supabase** | Cost-effective, simple storage | N/A |
| `application/*` | **Supabase** | General documents (DOCX, XLSX, etc.) | N/A |
| Other | **Supabase** | Fallback for all other file types | N/A |
**Note**: Both providers can technically store any file type. The routing is based on optimization:
- **Cloudinary**: Best for media that needs CDN delivery and transformations
- **Supabase**: Best for documents and general file storage
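
The routing table above can be sketched as a small function. This is a minimal illustration, not the project's actual `StorageService`; the function name `choose_provider` and the handling of MIME parameters are assumptions.

```python
from typing import Optional


def choose_provider(mime_type: Optional[str]) -> str:
    """Return the default storage provider name for a MIME type.

    Hypothetical helper: mirrors the routing table, not the real StorageService.
    """
    # Normalize: drop parameters such as "; charset=utf-8" and ignore case.
    main_type = (mime_type or "").split(";")[0].strip().lower()
    if main_type.startswith(("image/", "video/")):
        return "cloudinary"
    # PDFs, other application/* types, and anything else fall back to Supabase.
    return "supabase"
```

A missing or unrecognized MIME type takes the fallback row of the table, so the function always returns a provider.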
## Database Schema
```sql
CREATE TABLE documents (
    id UUID PRIMARY KEY,

    -- Polymorphic ownership
    entity_type TEXT NOT NULL,  -- 'user', 'project', 'ticket', etc.
    entity_id UUID NOT NULL,

    -- File details
    file_name TEXT NOT NULL,
    file_type TEXT,  -- MIME type
    file_size BIGINT,
    file_url TEXT NOT NULL,
    storage_provider TEXT DEFAULT 'supabase',  -- 'cloudinary', 'supabase'

    -- Classification
    document_type TEXT,  -- 'profile_photo', 'identity_card', etc.
    document_category TEXT,  -- 'legal', 'financial', 'operational'

    -- Version control
    version INTEGER DEFAULT 1,
    is_latest_version BOOLEAN DEFAULT TRUE,
    previous_version_id UUID REFERENCES documents(id),

    -- Metadata
    description TEXT,
    tags JSONB DEFAULT '[]',
    additional_metadata JSONB DEFAULT '{}',

    -- Access control
    uploaded_by_user_id UUID REFERENCES users(id),
    is_public BOOLEAN DEFAULT FALSE,

    -- Timestamps
    created_at TIMESTAMP WITH TIME ZONE,
    updated_at TIMESTAMP WITH TIME ZONE,
    deleted_at TIMESTAMP WITH TIME ZONE
);
```
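
To illustrate how polymorphic ownership is queried, the sketch below looks up documents by the `(entity_type, entity_id)` pair. It uses an in-memory SQLite table as a stand-in for the PostgreSQL schema, with only a subset of columns. The helper `list_documents` is hypothetical, and treating a non-null `deleted_at` as a soft delete is an assumption.

```python
import sqlite3
import uuid

# SQLite stand-in for the PostgreSQL table above (subset of columns).
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE documents (
        id TEXT PRIMARY KEY,
        entity_type TEXT NOT NULL,
        entity_id TEXT NOT NULL,
        file_name TEXT NOT NULL,
        file_url TEXT NOT NULL,
        is_latest_version INTEGER DEFAULT 1,
        deleted_at TEXT
    )
    """
)


def list_documents(conn, entity_type, entity_id):
    """Return file names of current, non-deleted documents for one owner."""
    rows = conn.execute(
        "SELECT file_name FROM documents "
        "WHERE entity_type = ? AND entity_id = ? "
        "AND is_latest_version = 1 AND deleted_at IS NULL "  # assumed soft delete
        "ORDER BY file_name",
        (entity_type, entity_id),
    ).fetchall()
    return [row[0] for row in rows]


# Two different entity types share the same table.
user_id = str(uuid.uuid4())
ticket_id = str(uuid.uuid4())
conn.executemany(
    "INSERT INTO documents (id, entity_type, entity_id, file_name, file_url) "
    "VALUES (?, ?, ?, ?, ?)",
    [
        (str(uuid.uuid4()), "user", user_id, "profile.jpg", "https://example.com/profile.jpg"),
        (str(uuid.uuid4()), "ticket", ticket_id, "site_photo.jpg", "https://example.com/site_photo.jpg"),
    ],
)
```

Because every lookup filters on both columns, a composite index on `(entity_type, entity_id)` would likely be worthwhile in the real database.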
## API Endpoints
### Universal Endpoints
```http
POST /api/v1/documents/upload
GET /api/v1/documents/{entity_type}/{entity_id}
GET /api/v1/documents/id/{document_id}
PUT /api/v1/documents/id/{document_id}
DELETE /api/v1/documents/id/{document_id}
```
### Convenience Endpoints (Shortcuts)
```http
POST /api/v1/documents/users/{user_id}/upload
GET /api/v1/documents/users/{user_id}
```
## Usage Examples
### Upload User Profile Photo
```bash
curl -X POST "https://api.example.com/api/v1/documents/upload" \
-H "Authorization: Bearer {token}" \
-F "file=@profile.jpg" \
-F "entity_type=user" \
-F "entity_id=123e4567-e89b-12d3-a456-426614174000" \
-F "document_type=profile_photo" \
-F "document_category=personal" \
-F "description=User profile photo" \
-F "tags=[\"profile\", \"avatar\"]" \
-F "is_public=true"
```
**Response:**
```json
{
  "id": "doc-uuid",
  "entity_type": "user",
  "entity_id": "user-uuid",
  "file_name": "profile.jpg",
  "file_type": "image/jpeg",
  "file_size": 245678,
  "file_url": "https://res.cloudinary.com/...",
  "storage_provider": "cloudinary",
  "document_type": "profile_photo",
  "version": 1,
  "uploader": {
    "id": "uploader-uuid",
    "name": "John Doe",
    "email": "john@example.com"
  },
  "created_at": "2025-11-16T10:30:00Z"
}
```
### Upload Ticket Photo
```bash
curl -X POST "https://api.example.com/api/v1/documents/upload" \
-H "Authorization: Bearer {token}" \
-F "file=@site_photo.jpg" \
-F "entity_type=ticket" \
-F "entity_id=ticket-uuid" \
-F "document_type=ticket_image" \
-F "document_category=evidence" \
-F "description=Before installation photo"
```
### Upload User ID Document (PDF)
```bash
curl -X POST "https://api.example.com/api/v1/documents/users/{user_id}/upload" \
-H "Authorization: Bearer {token}" \
-F "file=@national_id.pdf" \
-F "document_type=identity_card" \
-F "document_category=legal" \
-F "description=National ID card"
```
### Get All User Documents
```bash
curl "https://api.example.com/api/v1/documents/user/{user_id}" \
-H "Authorization: Bearer {token}"
```
### Get Specific Document Type
```bash
curl "https://api.example.com/api/v1/documents/user/{user_id}?document_type=profile_photo" \
-H "Authorization: Bearer {token}"
```
### Update Document Metadata
```bash
curl -X PUT "https://api.example.com/api/v1/documents/id/{document_id}" \
-H "Authorization: Bearer {token}" \
-H "Content-Type: application/json" \
-d '{
  "description": "Updated description",
  "tags": ["updated", "new-tag"],
  "is_public": true
}'
```
### Delete Document
```bash
curl -X DELETE "https://api.example.com/api/v1/documents/id/{document_id}" \
-H "Authorization: Bearer {token}"
```
## Document Types
### User Documents
- `profile_photo` - User avatar/profile picture
- `identity_card` - National ID, passport, etc.
- `driver_license` - Driver's license
- `contract` - Employment contract
- `certification` - Professional certifications
### Ticket Documents
- `ticket_image` - Site photos, equipment photos
- `before_photo` - Before installation/repair
- `after_photo` - After installation/repair
- `work_report` - PDF work reports
### Expense Documents
- `receipt` - Payment receipts
- `invoice` - Invoices
- `expense_proof` - Expense documentation
### Project Documents
- `project_plan` - Project planning documents
- `blueprint` - Technical blueprints
- `contract` - Project contracts
## Document Categories
- `legal` - Legal documents (contracts, IDs, licenses)
- `financial` - Financial documents (receipts, invoices)
- `operational` - Operational documents (reports, plans)
- `evidence` - Evidence documents (photos, proofs)
- `personal` - Personal documents (profile photos)
## Storage Providers
### Cloudinary
- **Used for**: Images and videos
- **Benefits**: CDN delivery, automatic optimization, on-the-fly transformations
- **Folder structure**: `/swiftops/users/`, `/swiftops/tickets/`, `/swiftops/receipts/`
- **Metadata stored**: `public_id`, `format`, `width`, `height`, `bytes`, `resource_type`
### Supabase Storage
- **Used for**: Documents (PDF, DOCX, etc.)
- **Benefits**: Integrated with database, simple API, cost-effective
- **Bucket structure**: `documents-users`, `documents-tickets`, `documents-general`
- **Metadata stored**: `bucket`, `path`, `size`, `content_type`
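
As a rough illustration of the bucket structure, the sketch below picks a Supabase bucket from the owning entity type. The mapping itself is an assumption inferred from the bucket names; the name `bucket_for` is hypothetical and the real selection logic may differ.

```python
# Assumed mapping from entity type to the buckets listed above.
BUCKETS = {
    "user": "documents-users",
    "ticket": "documents-tickets",
}


def bucket_for(entity_type: str) -> str:
    """Return the bucket for an entity type, defaulting to the general bucket."""
    return BUCKETS.get(entity_type, "documents-general")
```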
## Version Control
Documents support versioning:
1. A client uploads a new version of an existing document
2. The system creates a new document record with an incremented `version`
3. The previous record is marked `is_latest_version = false`
4. The new record's `previous_version_id` links to the old version
5. Both versions remain accessible
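
The versioning steps above can be sketched with an in-memory store. This is a simplified model, not the service's code: the `Document` dataclass carries only the version-related columns, and `add_version` is a hypothetical helper name.

```python
import uuid
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Document:
    file_name: str
    file_url: str
    version: int = 1
    is_latest_version: bool = True
    previous_version_id: Optional[str] = None
    id: str = field(default_factory=lambda: str(uuid.uuid4()))


def add_version(store: Dict[str, Document], current_id: str, new_url: str) -> Document:
    """Create the next version of a document and demote the current one."""
    current = store[current_id]
    # The old record stays in the store but is no longer the latest.
    current.is_latest_version = False
    new_doc = Document(
        file_name=current.file_name,
        file_url=new_url,
        version=current.version + 1,
        previous_version_id=current.id,  # link back to the old version
    )
    store[new_doc.id] = new_doc
    return new_doc
```

In a real database, the demotion and the insert would presumably run in a single transaction so that exactly one record per document is marked as latest.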
## Security
### Authentication
- All endpoints require valid JWT token
- User must be authenticated to upload/view documents
### Authorization
- Users can upload documents for entities they have access to
- Public documents (`is_public=true`) can be viewed by anyone with the link
- Private documents require proper permissions
### File Validation
- File size limits enforced
- File type validation
- Malicious file detection (future enhancement)
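
A minimal validation sketch follows. The 10MB limit is the default mentioned under Troubleshooting; the allow-list of MIME prefixes and the names `validate_upload` and `ALLOWED_PREFIXES` are assumptions for illustration only.

```python
from typing import List, Optional, Tuple

MAX_FILE_SIZE_BYTES = 10 * 1024 * 1024  # 10MB default noted under Troubleshooting
ALLOWED_PREFIXES = ("image/", "video/", "application/")  # hypothetical allow-list


def validate_upload(
    file_type: Optional[str],
    file_size: int,
    max_bytes: int = MAX_FILE_SIZE_BYTES,
    allowed_prefixes: Tuple[str, ...] = ALLOWED_PREFIXES,
) -> List[str]:
    """Return validation error messages; an empty list means the file passes."""
    errors = []
    if file_size <= 0:
        errors.append("file is empty")
    elif file_size > max_bytes:
        errors.append(f"file exceeds the {max_bytes} byte limit")
    if not (file_type or "").lower().startswith(allowed_prefixes):
        errors.append(f"file type {file_type!r} is not allowed")
    return errors
```

Returning all errors at once, rather than raising on the first, lets the API report every problem in a single response.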
## Error Handling
### Upload Failures
- If storage provider fails, no database record is created
- User receives clear error message
- Failed uploads are logged for debugging
### Database Failures
- If database insert fails after successful upload, file is deleted from storage
- Prevents orphaned files
- Transaction-like behavior
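
The cleanup behavior described above is a compensating action: upload first, then delete the file if the database insert fails. The sketch below shows the pattern with injected callables; `upload_and_record` and its parameter names are hypothetical, not the service's actual interface.

```python
from typing import Callable


def upload_and_record(
    upload: Callable[[bytes], str],
    delete: Callable[[str], None],
    insert_record: Callable[[str], dict],
    data: bytes,
) -> dict:
    """Upload a file, then create its record; undo the upload if the insert fails."""
    # If the upload raises, nothing has been written to the database yet.
    file_url = upload(data)
    try:
        return insert_record(file_url)
    except Exception:
        # Compensate so the stored file is not orphaned, then re-raise.
        delete(file_url)
        raise
```

Note that this is not a true transaction: if the compensating delete itself fails, the file is still orphaned, so that case would be worth logging.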
## Monitoring
### Metrics to Track
- Upload success/failure rates
- Average upload times
- Storage usage by provider
- Most common document types
- Failed uploads requiring retry
### Logs
- All uploads logged with user, entity, file details
- Audit trail in `audit_logs` table
- Error logs for failed operations
## Future Enhancements
- [ ] Document preview/thumbnail generation
- [ ] OCR for text extraction from images
- [ ] Virus scanning for uploaded files
- [ ] Bulk upload support
- [ ] Document expiry/retention policies
- [ ] Advanced search with full-text search
- [ ] Document sharing with external users
- [ ] E-signature integration
## Testing
Run integration tests:
```bash
node tests/integration/test_document_upload.js
```
## Configuration
Required environment variables:
```env
# Cloudinary (for images/videos)
CLOUDINARY_CLOUD_NAME=your_cloud_name
CLOUDINARY_API_KEY=your_api_key
CLOUDINARY_API_SECRET=your_api_secret
# Supabase (for documents)
SUPABASE_URL=https://your-project.supabase.co
SUPABASE_SERVICE_KEY=your_service_key
```
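
A startup check along the following lines can surface missing settings early. This is a sketch only; the helper name `missing_settings` is an assumption, and the application may load its configuration differently.

```python
import os
from typing import List, Mapping, Optional

REQUIRED_VARS = (
    "CLOUDINARY_CLOUD_NAME",
    "CLOUDINARY_API_KEY",
    "CLOUDINARY_API_SECRET",
    "SUPABASE_URL",
    "SUPABASE_SERVICE_KEY",
)


def missing_settings(env: Optional[Mapping[str, str]] = None) -> List[str]:
    """Return the required variable names that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]
```

Calling `missing_settings()` at startup and failing fast on a non-empty result avoids discovering a missing key on the first upload.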
## Troubleshooting
### "Cloudinary not configured" error
- Check that all Cloudinary environment variables are set
- Verify credentials are correct in Cloudinary dashboard
### "Failed to upload to Supabase Storage" error
- Check Supabase service key is valid
- Verify storage buckets exist in Supabase dashboard
- Check bucket permissions
### Files not appearing in Cloudinary Media Library
- Check `asset_folder` parameter in upload
- Verify account is using dynamic folder mode (post-June 2024)
### Large file uploads failing
- Check file size limits (default 10MB for most endpoints)
- Files >100MB will need chunked upload, which is not yet supported (future enhancement)
## Support
For issues or questions:
1. Check logs in `docs/hflogs/runtimeerror.txt`
2. Review audit logs in database
3. Check Cloudinary/Supabase dashboards for storage issues