---
title: Semantic Integrity Analysis
emoji: 🤖
colorFrom: blue
colorTo: green
sdk: streamlit
app_file: app.py
pinned: false
---
Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
# Semantic Integrity Analysis
A legal document analysis web app with authentication, document upload, line-level issue detection, and a final narrative summary.
## Current Architecture
- `backend/`: Flask API + SQLite auth + document analysis pipeline
- `frontend/`: Multi-page static UI
- `ui/`: Streamlit path (separate from current web flow)
- `analysis/`: Core analyzer logic
## Active User Flow
1. `index.html` -> Login / Sign up
2. `upload.html` -> Upload up to 2 reference files + final file, then run cross-verification analysis
3. `issues.html` -> Line-level issue analysis (duplication, inconsistency, contradiction)
4. `summary.html` ->
- Detailed document summary (Page 1, Page 2, ... style)
- Page-wise summary cards
- Top findings
5. `dashboard.html` ->
- Line error table (exact page/line)
- Reference vs Final mismatch explanation + rectify action
## Features
- Auth endpoints (`register`, `login`) with SQLite
- Upload support: `PDF`, `DOCX`, `TXT`
- Cross-verification: 1-2 optional reference documents plus one required final document
- Detection categories:
- Duplication
- Inconsistency
- Contradiction
- Vendor/Vendee extraction
- Narrative `detailedSummary` + page summaries + line-level dashboard
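The actual detection logic lives in `analysis/`. As an illustration only, a minimal line-level duplication check could look like the sketch below; this is not the project's analyzer, just a simple version of the "Duplication" category:

```python
# Minimal sketch of line-level duplication detection.
# This illustrates the category; it is NOT the logic in `analysis/`.
from collections import defaultdict

def find_duplicate_lines(text):
    """Return {normalized_line: [line_numbers]} for lines seen more than once."""
    seen = defaultdict(list)
    for lineno, line in enumerate(text.splitlines(), start=1):
        normalized = " ".join(line.split()).lower()  # collapse whitespace, ignore case
        if normalized:  # skip blank lines
            seen[normalized].append(lineno)
    return {line: nums for line, nums in seen.items() if len(nums) > 1}
```

Each duplicate is reported with its exact line numbers, which is the shape the line-level dashboard needs.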
## Backend Setup
```bash
cd backend
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
python3 app.py
```
Backend default: `http://127.0.0.1:5000`
## Frontend Setup
```bash
cd frontend
python3 -m http.server 8080
```
Open: `http://127.0.0.1:8080/index.html`
## API Endpoints
- `GET /api/health`
- `POST /api/register`
- `POST /api/login`
- `POST /api/analyze`
Alias routes are also available:
- `GET /health`
- `POST /register`
- `POST /login`
- `POST /analyze`
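A client calls `POST /api/analyze` with the required final document and up to two reference documents as multipart form data. The sketch below assembles those form fields; the field names `final` and `references` are assumptions, not confirmed by this README, so verify them against `backend/app.py`:

```python
# Sketch of assembling multipart form fields for POST /api/analyze.
# Field names "final" and "references" are ASSUMPTIONS; check backend/app.py.

def analyze_form_fields(final_path, reference_paths=()):
    """One required final document plus up to two optional references,
    mirroring the upload flow described above."""
    fields = [("final", final_path)]
    # Only the first two references are kept, matching the 1-2 reference limit.
    fields.extend(("references", p) for p in list(reference_paths)[:2])
    return fields
```

With `requests`, each path would be opened and passed through the `files=` argument of `requests.post`.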
## Analyze Response (important keys)
- `summary`
- `pageSummaries`
- `detailedSummary`
- `findings`
- `lineIssues`
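As a hedged illustration of consuming these keys, the snippet below groups `lineIssues` by page for a dashboard-style error table. The per-issue fields (`page`, `line`, `type`) are assumptions about the response shape; inspect a real `/api/analyze` response to confirm them:

```python
# Group lineIssues by page for a dashboard-style error table.
# Per-issue keys "page", "line", "type" are ASSUMPTIONS about the
# response shape, not confirmed by this README.
from collections import defaultdict

def issues_by_page(line_issues):
    pages = defaultdict(list)
    for issue in line_issues:
        pages[issue["page"]].append((issue["line"], issue["type"]))
    return dict(pages)
```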
## Deployment (GitHub + Render)
### 1) Push repository
```bash
git add .
git commit -m "Project setup and web flow"
git branch -M main
git remote add origin https://github.com/<your-username>/<your-repo>.git
git push -u origin main
```
### 2) Deploy backend on Render (Web Service)
- Root directory: `backend`
- Build command:
```bash
pip install -r requirements.txt
```
- Start command:
```bash
gunicorn app:app
```
### 3) Deploy frontend (static)
- Option A: Render Static Site (root `frontend`)
- Option B: GitHub Pages for `frontend/`
## Notes
- Current `frontend + backend` flow does **not** require `merged_tinyllama_instruction`.
- Streamlit path under `ui/` may use local TinyLlama model path.
- If analysis output changes are not visible, restart backend and re-run upload.