Spaces:

JAYASREESS
/

final_year

Sleeping

App Files Files Community

final_year / README.md

jayasrees

Add HF metadata

dd487ef about 1 month ago

preview code

raw

history blame contribute delete

2.9 kB



	---
	title: Semantic Integrity Analysis
	emoji: 🤖
	colorFrom: blue
	colorTo: green
	sdk: streamlit
	app_file: app.py
	pinned: false
	---

	Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
	=======
	# Semantic Integrity Analysis

	Legal document analysis web app with authentication, upload, line-level issue detection, and final narrative summary.

	## Current Architecture

	- `backend/`: Flask API + SQLite auth + document analysis pipeline
	- `frontend/`: Multi-page static UI
	- `ui/`: Streamlit path (separate from current web flow)
	- `analysis/`: Core analyzer logic

	## Active User Flow

	1. `index.html` -> Login / Sign up
	2. `upload.html` -> Upload up to 2 reference files + final file, then run cross-verification analysis
	3. `issues.html` -> Line-level issue analysis (duplication, inconsistency, contradiction)
	4. `summary.html` ->
	- Detailed document summary (Page 1, Page 2, ... style)
	- Page-wise summary cards
	- Top findings
	5. `dashboard.html` ->
	- Line error table (exact page/line)
	- Reference vs Final mismatch explanation + rectify action

	## Features

	- Auth endpoints (`register`, `login`) with SQLite
	- Upload support: `PDF`, `DOCX`, `TXT`
	- Cross verification: optional 1-2 reference documents + required final document
	- Detection categories:
	- Duplication
	- Inconsistency
	- Contradiction
	- Vendor/Vendee extraction
	- Narrative `detailedSummary` + page summaries + line-level dashboard

	## Backend Setup

	```bash
	cd backend
	python3 -m venv .venv
	source .venv/bin/activate
	pip install -r requirements.txt
	python3 app.py
	```

	Backend default: `http://127.0.0.1:5000`

	## Frontend Setup

	```bash
	cd frontend
	python3 -m http.server 8080
	```

	Open: `http://127.0.0.1:8080/index.html`

	## API Endpoints

	- `GET /api/health`
	- `POST /api/register`
	- `POST /api/login`
	- `POST /api/analyze`

	Alias routes also available:

	- `GET /health`
	- `POST /register`
	- `POST /login`
	- `POST /analyze`

	## Analyze Response (important keys)

	- `summary`
	- `pageSummaries`
	- `detailedSummary`
	- `findings`
	- `lineIssues`

	## Deployment (GitHub + Render)

	### 1) Push repository

	```bash
	git add .
	git commit -m "Project setup and web flow"
	git branch -M main
	git remote add origin https://github.com/<your-username>/<your-repo>.git
	git push -u origin main
	```

	### 2) Deploy backend on Render (Web Service)

	- Root directory: `backend`
	- Build command:

	```bash
	pip install -r requirements.txt
	```

	- Start command:

	```bash
	gunicorn app:app
	```

	### 3) Deploy frontend (static)

	- Option A: Render Static Site (root `frontend`)
	- Option B: GitHub Pages for `frontend/`

	## Notes

	- Current `frontend + backend` flow does not require `merged_tinyllama_instruction`.
	- Streamlit path under `ui/` may use local TinyLlama model path.
	- If analysis output changes are not visible, restart backend and re-run upload.
	(Create README.md)