final_year / README.md
jayasrees's picture
Add HF metadata
dd487ef
---
title: Semantic Integrity Analysis
emoji: 🤖
colorFrom: blue
colorTo: green
sdk: streamlit
app_file: app.py
pinned: false
---
Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
=======
# Semantic Integrity Analysis
Legal document analysis web app with authentication, upload, line-level issue detection, and final narrative summary.
## Current Architecture
- `backend/`: Flask API + SQLite auth + document analysis pipeline
- `frontend/`: Multi-page static UI
- `ui/`: Streamlit path (separate from current web flow)
- `analysis/`: Core analyzer logic
## Active User Flow
1. `index.html` -> Login / Sign up
2. `upload.html` -> Upload up to 2 reference files + final file, then run cross-verification analysis
3. `issues.html` -> Line-level issue analysis (duplication, inconsistency, contradiction)
4. `summary.html` ->
- Detailed document summary (Page 1, Page 2, ... style)
- Page-wise summary cards
- Top findings
5. `dashboard.html` ->
- Line error table (exact page/line)
- Reference vs Final mismatch explanation + rectify action
## Features
- Auth endpoints (`register`, `login`) with SQLite
- Upload support: `PDF`, `DOCX`, `TXT`
- Cross verification: optional 1-2 reference documents + required final document
- Detection categories:
- Duplication
- Inconsistency
- Contradiction
- Vendor/Vendee extraction
- Narrative `detailedSummary` + page summaries + line-level dashboard
## Backend Setup
```bash
cd backend
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
python3 app.py
```
Backend default: `http://127.0.0.1:5000`
## Frontend Setup
```bash
cd frontend
python3 -m http.server 8080
```
Open: `http://127.0.0.1:8080/index.html`
## API Endpoints
- `GET /api/health`
- `POST /api/register`
- `POST /api/login`
- `POST /api/analyze`
Alias routes also available:
- `GET /health`
- `POST /register`
- `POST /login`
- `POST /analyze`
## Analyze Response (important keys)
- `summary`
- `pageSummaries`
- `detailedSummary`
- `findings`
- `lineIssues`
## Deployment (GitHub + Render)
### 1) Push repository
```bash
git add .
git commit -m "Project setup and web flow"
git branch -M main
git remote add origin https://github.com/<your-username>/<your-repo>.git
git push -u origin main
```
### 2) Deploy backend on Render (Web Service)
- Root directory: `backend`
- Build command:
```bash
pip install -r requirements.txt
```
- Start command:
```bash
gunicorn app:app
```
### 3) Deploy frontend (static)
- Option A: Render Static Site (root `frontend`)
- Option B: GitHub Pages for `frontend/`
## Notes
- Current `frontend + backend` flow does **not** require `merged_tinyllama_instruction`.
- Streamlit path under `ui/` may use local TinyLlama model path.
- If analysis output changes are not visible, restart backend and re-run upload.
(Create README.md)