DocVault-app / PRODUCTION_TESTING_RUNBOOK.md
mohsin-devs's picture
Deploy HF-ready DocVault with HF storage backend
2fe2727
# πŸš€ PRODUCTION READINESS TESTING GUIDE
**Status**: Before deployment to HuggingFace Spaces
**Time Required**: 3 hours
**Tools Needed**: Browser, Python 3.7+, requests library
---
## πŸ“‹ WHAT'S CHANGED
### Added: Comprehensive Production Logging
- Every API endpoint now logs:
- Operation start/end
- Success/failure status
- Input parameters (paths, filenames)
- Execution time
- Exceptions with stack traces
**Example log output**:
```
[UPLOAD_FILE] START | user=default_user | file=document.pdf | folder=reports | size=1048576
[UPLOAD_FILE] SUCCESS | user=default_user | file=document.pdf | folder=reports
[RENAME] START | user=default_user | path=reports | new_name=archive_reports
[RENAME] SUCCESS | user=default_user | old_path=reports | new_name=archive_reports
```
### Added: Automated Stress Test Script
`stress_test.py` - Runs 5 comprehensive stress tests:
1. Bulk Upload (50 files)
2. Folder Rename with 30 files
3. Rapid Operations (upload→delete→rename)
4. Cache Behavior (TTL validation)
5. Error Handling (invalid operations)
---
## πŸ§ͺ TESTING EXECUTION PLAN
### PHASE 1: Setup (15 minutes)
#### Step 1: Verify Logging
```bash
# Check that logging is configured correctly
cd c:\Users\mohat\OneDrive\Desktop\Doc
python -c "from server.utils.logger import setup_logger; logger = setup_logger('test'); logger.info('Logging works')"
```
Expected output: Should see log message in console or logs/ directory
#### Step 2: Install Test Dependencies
```bash
pip install requests
```
---
### PHASE 2: Manual Testing (45 minutes)
#### Test A: Basic Rename (NEW FEATURE)
**Goal**: Verify rename works end-to-end
1. Open: http://localhost:5000 (or HF Spaces URL)
2. Upload test file: `rename_test.txt`
3. Right-click file β†’ "Rename" β†’ `renamed_file.txt`
4. Verify: File appears with new name
5. Download to verify content intact
6. Check logs: Should see `[RENAME] SUCCESS`
**Pass Criteria**:
- βœ“ File renamed successfully
- βœ“ UI updates immediately
- βœ“ Log shows [RENAME] SUCCESS
#### Test B: Folder Operations
**Goal**: Verify folders work with new rename
1. Create folder: `TestFolder`
2. Upload 5 files into it
3. Rename folder: `TestFolder` β†’ `RenamedFolder`
4. Verify: All 5 files still accessible in new folder
5. Check logs: Should see [RENAME] SUCCESS with all files preserved
**Pass Criteria**:
- βœ“ Folder renamed with contents
- βœ“ Files accessible after rename
- βœ“ No data loss
#### Test C: Delete Operations
1. Upload file: `delete_test.txt`
2. Delete it
3. Verify: File removed from list
4. Check logs: Should see `[DELETE_FILE] SUCCESS`
**Pass Criteria**:
- βœ“ File removed
- βœ“ Delete confirmation modal shown
- βœ“ Log shows [DELETE_FILE] SUCCESS
#### Test D: Cache Behavior
**Goal**: Verify 60-second cache works
1. Upload file: `cache_test.txt`
2. Open DevTools (F12) β†’ Network tab
3. Reload page within 30 seconds
4. Observe: List call should be instant (using cache)
5. Wait 65 seconds
6. Reload page
7. Observe: Slightly slower (fresh from server)
**Pass Criteria**:
- βœ“ First reload is instant (<100ms)
- βœ“ After 60s, fresh data fetched
- βœ“ No errors in console
---
### PHASE 3: Automated Stress Testing (60 minutes)
#### Step 1: Start Server
```bash
# Terminal 1: Start Flask backend
cd c:\Users\mohat\OneDrive\Desktop\Doc
python -m server.app
# Should see: Running on http://127.0.0.1:5000
```
#### Step 2: Run Stress Tests
```bash
# Terminal 2: Run stress test suite (LOCAL mode)
cd c:\Users\mohat\OneDrive\Desktop\Doc
python stress_test.py http://localhost:5000
```
**Expected Output**:
```
============================================================
DOCVAULT STRESS TEST SUITE
============================================================
Base URL: http://localhost:5000
User: stress_test_user
Timeout: 30s
Checking server connectivity...
βœ“ Server is healthy
πŸ”₯ TEST 1: BULK UPLOAD (50 files)
----------------------------------------
βœ“ bulk_upload_all_succeed
βœ“ bulk_upload_speed
Uploaded: 50/50
Total time: 25.34s
Avg per file: 0.51s
πŸ”₯ TEST 2: FOLDER RENAME STRESS (30 files in folder)
----------------------------------------
βœ“ folder_rename
βœ“ folder_rename_speed
βœ“ folder_rename_data_preserved
Renaming folder...
Time: 2.15s
πŸ”₯ TEST 3: RAPID OPERATIONS (Race Conditions)
----------------------------------------
βœ“ rapid_ops_20_cycles
βœ“ rapid_seq_operations
πŸ§ͺ TEST 4: CACHE VALIDATION (60s TTL)
----------------------------------------
βœ“ cache_first_list
βœ“ cache_second_list
πŸ§ͺ TEST 5: ERROR HANDLING
----------------------------------------
βœ“ error_handling_delete_nonexistent
βœ“ error_handling_rename_nonexistent
βœ“ error_handling_list_nonexistent
============================================================
STRESS TEST RESULTS
============================================================
βœ“ Passed: 16
βœ— Failed: 0
Total: 16
PERFORMANCE METRICS:
bulk_upload: avg=0.51s, max=1.23s, count=50
folder_rename: avg=2.15s, max=2.15s, count=1
list_fresh: avg=0.05s, max=0.05s, count=1
list_cached: avg=0.02s, max=0.02s, count=1
============================================================
```
#### Step 3: Check Logs
```bash
# Review logs for any warnings or errors
tail -f logs/docvault.log | grep -E "\[UPLOAD\]|\[DELETE\]|\[RENAME\]|\[ERROR\]"
```
**Look For**:
- βœ“ All operations have START and SUCCESS
- βœ“ No EXCEPTION entries
- ⚠️ Any FAIL entries (investigate)
#### Step 4: Interpret Results
**Perfect Scenario**:
- All 16 tests pass
- No race conditions detected
- Performance metrics reasonable
- No errors in logs
**Warning Scenarios**:
- Bulk upload > 30 seconds: May be slow
- Folder rename > 5 seconds: Performance issue
- Failed rapid operations: Race condition risk
- Exceptions in logs: Critical issue
**Failure Scenario**:
- Tests fail: Do NOT deploy
- Race conditions detected: Do NOT deploy
- Exceptions: Do NOT deploy
---
## πŸ“Š TEST RESULTS TEMPLATE
Record results here:
```
TEST EXECUTION REPORT
═════════════════════════════════════════════════════════
Date: ________________
Environment: [ ] LOCAL [ ] HF_SPACES
Tester: ________________
MANUAL TESTS:
─────────────────────────────────────────────────────────
Test A - Rename Feature: [ ] PASS [ ] FAIL
Test B - Folder Operations: [ ] PASS [ ] FAIL
Test C - Delete Operations: [ ] PASS [ ] FAIL
Test D - Cache Behavior: [ ] PASS [ ] FAIL
AUTOMATED STRESS TESTS:
─────────────────────────────────────────────────────────
Total Tests: ____
Passed: ____
Failed: ____
Race Conditions: [ ] YES [ ] NO
PERFORMANCE METRICS (from stress test):
─────────────────────────────────────────────────────────
Avg Upload Time: ______ seconds
Max Rename Time: ______ seconds
Cache Hit Time: ______ seconds
List Operation Time: ______ seconds
ISSUES FOUND:
─────────────────────────────────────────────────────────
[ ] None
[ ] Minor (document & monitor)
[ ] Major (fix before deploy)
[ ] Critical (DO NOT DEPLOY)
Issues:
1. ________________________________________
2. ________________________________________
3. ________________________________________
LOGS STATUS:
─────────────────────────────────────────────────────────
[ ] All operations logged correctly
[ ] No exceptions in logs
[ ] Errors are meaningful
[ ] Performance data captured
FINAL VERDICT:
─────────────────────────────────────────────────────────
[ ] READY FOR PRODUCTION
[ ] NEEDS MORE TESTING
[ ] DO NOT DEPLOY
Comments:
__________________________________________________________________
__________________________________________________________________
Signed: _________________________ Date: ________________
```
---
## πŸ”₯ CRITICAL ISSUES TO WATCH FOR
### Red Flag #1: Upload Failures
```
[UPLOAD_FILE] FAIL | reason=permission_denied
```
β†’ Check HF token permissions
### Red Flag #2: Rename Failures
```
[RENAME] FAIL | error=atomic_commit_failed
```
β†’ Check HF API availability
### Red Flag #3: Race Conditions
```
rapid_ops test: 3/20 operations failed
```
β†’ Investigate cache invalidation timing
### Red Flag #4: Slow Performance
```
bulk_upload total time: 180s (should be <60s)
```
β†’ Check network/HF API performance
### Red Flag #5: Memory Leaks
```
Processing time increases over 150+ operations
```
β†’ Check for unclosed file handles
---
## 🎯 SUCCESS CRITERIA FOR PRODUCTION DEPLOYMENT
**All Must Be True**:
- βœ… Manual tests: 4/4 passing
- βœ… Stress tests: 16/16 passing
- βœ… No race conditions detected
- βœ… No exceptions in logs
- βœ… Performance acceptable (bulk upload <60s)
- βœ… Cache working (list cached <100ms)
- βœ… All logs readable and structured
- βœ… Zero data loss observed
**If Any Fails**:
β†’ DO NOT DEPLOY
β†’ Investigate & fix
β†’ Re-test
---
## πŸš€ DEPLOYMENT STEPS (After Passing Tests)
1. **Backup Current HF Repo**
```bash
# Create a snapshot of current state
# Document in deployment log
```
2. **Deploy Code**
- Push fixes to HF Space git repo
- Or: Upload files directly to Space
3. **Restart Application**
- HF Spaces: App auto-restarts on code change
- Local: Restart Flask server
4. **Smoke Test**
- Upload test file
- Rename it
- Delete it
- Check logs are generated
5. **Monitor for 24 Hours**
- Check logs regularly
- Monitor error rates
- Gather user feedback
---
## πŸ“ LOGGING REFERENCE
### Log Format
```
[OPERATION] STATUS | user=X | details=Y | duration=Z
```
### Example Logs
```
[UPLOAD_FILE] START | user=default_user | file=report.pdf | size=2097152
[UPLOAD_FILE] SUCCESS | user=default_user | file=report.pdf
[DELETE_FILE] START | user=default_user | path=report.pdf
[DELETE_FILE] SUCCESS | user=default_user | path=report.pdf
[RENAME] START | user=default_user | path=old_name | new_name=new_name
[RENAME] SUCCESS | user=default_user | old_path=old_name | new_name=new_name
[LIST] START | user=default_user | path=root
[LIST] SUCCESS | user=default_user | path=root | files=42 | folders=5
```
### What Each Field Means
- `[OPERATION]` - What's happening (UPLOAD, DELETE, RENAME, etc.)
- `STATUS` - Result (START, SUCCESS, FAIL, EXCEPTION)
- `user=X` - Which user (always check: should be isolated)
- `details` - Operation context (files, paths, sizes)
- `duration` - How long it took (helps spot performance issues)
---
## πŸ“ž TROUBLESHOOTING
### Stress Test Won't Run
```bash
# Error: No module named 'requests'
pip install requests
# Error: Connection refused
# Make sure Flask app is running: python -m server.app
```
### Tests Fail With Timeouts
- Check: Server is responding normally
- Check: Network connectivity
- Increase: TIMEOUT variable in stress_test.py
### Logs Not Appearing
- Check: Logs directory exists
- Check: config.py LOG_LEVEL = "INFO"
- Check: Flask app has logging enabled
---
**NEXT STEP**: Execute this entire procedure and record results in the template above.
**THEN**: If all passes β†’ **YOU'RE PRODUCTION READY** βœ…
---
*Ready to test? Let's ensure this app is truly production-grade.*