Fabio Antonini Claude Sonnet 4.6 committed on
Commit ·
bb6ffde
1
Parent(s): 6a85733
feat: extract AI logic into shared core/ package (Phase 0)
- Add core/ package with 8 pure-Python AI modules:
ner.py, ocr.py, graphology.py, writer.py, signature.py,
rag.py, dating.py, pipeline.py
- Refactor app/grapholab_demo.py into thin Gradio wrappers
that delegate all processing to core/
- Add GraphoLab Core Quick Start cells to notebooks 02-07
(markdown explanation + importable code cell per notebook)
- Update ROADMAP.md with architectural decisions and phase plan
core/ is now shared by both the Gradio demo and the upcoming
FastAPI backend (feature/backend).
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- ROADMAP.md +130 -69
- app/grapholab_demo.py +112 -1880
- core/__init__.py +8 -0
- core/dating.py +159 -0
- core/graphology.py +105 -0
- core/ner.py +95 -0
- core/ocr.py +119 -0
- core/pipeline.py +330 -0
- core/rag.py +619 -0
- core/signature.py +395 -0
- core/writer.py +353 -0
- notebooks/02_handwritten_ocr_trocr.ipynb +182 -13
- notebooks/03_signature_verification_siamese.ipynb +74 -3
- notebooks/04_signature_detection_yolo.ipynb +15 -1
- notebooks/05_writer_identification.ipynb +15 -1
- notebooks/06_graphological_feature_analysis.ipynb +15 -1
- notebooks/07_named_entity_recognition.ipynb +15 -1
ROADMAP.md
CHANGED
|
@@ -1,6 +1,6 @@
|
|
| 1 |
# GraphoLab — Commercial Roadmap
|
| 2 |
|
| 3 |
-
> This document outlines
|
| 4 |
|
| 5 |
---
|
| 6 |
|
|
@@ -17,9 +17,64 @@
|
|
| 17 |
|
| 18 |
---
|
| 19 |
|
| 20 |
## Priority Issue: Dependency Licenses
|
| 21 |
|
| 22 |
-
**Two dependencies carry AGPL-3.0 licenses**, which impose obligations on any commercial product
|
| 23 |
|
| 24 |
| Dependency | License | Commercial impact |
|
| 25 |
|---|---|---|
|
|
@@ -27,22 +82,10 @@
|
|
| 27 |
| `albumentations` | AGPL-3.0 / Commercial | Same constraint |
|
| 28 |
| All others | BSD / Apache 2.0 / MIT | No commercial restrictions |
|
| 29 |
|
| 30 |
-
**
|
| 31 |
-
|
| 32 |
-
**Option A — Replace AGPL dependencies** *(recommended)*
|
| 33 |
-
- Replace `ultralytics` with an Apache 2.0-licensed detector (e.g. RT-DETR via `transformers`)
|
| 34 |
-
- Remove `albumentations` from requirements (not used in production code)
|
| 35 |
-
- Product code stays fully proprietary at no ongoing cost
|
| 36 |
-
|
| 37 |
-
**Option B — Purchase commercial licenses**
|
| 38 |
-
- Ultralytics Enterprise License: ~$1,000–5,000/year
|
| 39 |
-
- Albumentations commercial license: separate pricing
|
| 40 |
-
- Keep AGPL dependencies as-is, proprietary code stays closed
|
| 41 |
|
| 42 |
-
|
| 43 |
-
-
|
| 44 |
-
- Revenue from services, support, and enterprise features
|
| 45 |
-
- Maximises community visibility
|
| 46 |
|
| 47 |
---
|
| 48 |
|
|
@@ -68,87 +111,117 @@ Forensic data (signatures, manuscripts, legal documents) is highly sensitive. In
|
|
| 68 |
- **Chain of custody** requirements mandate full traceability
|
| 69 |
- **GDPR** (EU) requires control over data residency
|
| 70 |
|
| 71 |
-
###
|
| 72 |
|
| 73 |
-
**Option 1 — On-premise Docker** *(primary,
|
| 74 |
- Extend the existing `docker-compose.yml`
|
| 75 |
- Client installs on a local server or company VM
|
| 76 |
- Multi-user, browser access from internal network
|
| 77 |
- Easy to update; no data leaves the network
|
| 78 |
-
- Requires minimal IT skills to install
|
| 79 |
|
| 80 |
-
**Option 2 — Standalone Desktop** *(for individual practitioners)*
|
| 81 |
- Windows/macOS installer (.exe / .dmg)
|
| 82 |
- FastAPI backend bundled with PyInstaller + Electron UI
|
| 83 |
- Fully local, works offline, no network dependencies
|
| 84 |
-
- Suitable for single-examiner licensing
|
| 85 |
|
| 86 |
**Option 3 — SaaS Cloud** *(optional, future)*
|
| 87 |
-
- Hosted on AWS / Azure / GCP
|
| 88 |
-
- No installation, automatic updates, recurring revenue
|
| 89 |
- More complex GDPR compliance; expect resistance from forensic clients
|
| 90 |
-
-
|
| 91 |
|
| 92 |
---
|
| 93 |
|
| 94 |
## Development Roadmap
|
| 95 |
|
| 96 |
-
### Phase 0 —
|
| 97 |
|
| 98 |
-
|
| 99 |
-
- [ ] Remove `albumentations` from requirements
|
| 100 |
-
- [ ] Choose web stack: **FastAPI** (backend) + **React or Vue** (frontend), or Django full-stack
|
| 101 |
-
- [ ] Choose database: **PostgreSQL** for cases and metadata, **MinIO** (S3-compatible) for image files
|
| 102 |
-
- [ ] Define commercial licensing model
|
| 103 |
|
| 104 |
-
|
|
|
|
| 105 |
|
| 106 |
-
|
| 107 |
|
| 108 |
-
-
|
| 109 |
-
-
|
| 110 |
-
-
|
| 111 |
-
-
|
| 112 |
-
-
|
| 113 |
-
-
|
| 114 |
|
| 115 |
-
### Phase
|
| 116 |
|
| 117 |
-
|
| 118 |
-
- Multi-sample writer identification with comparative interface
|
| 119 |
-
- Side-by-side sample comparison with annotations
|
| 120 |
-
- Data export (CSV, JSON) for integration with third-party systems
|
| 121 |
-
- AI model updates without reinstallation
|
| 122 |
-
- End-user documentation and examiner manual
|
| 123 |
|
| 124 |
-
|
| 125 |
|
| 126 |
-
-
|
| 127 |
-
-
|
| 128 |
-
-
|
| 129 |
-
-
|
| 130 |
-
-
|
| 131 |
-
-
|
| 132 |
|
| 133 |
---
|
| 134 |
|
| 135 |
-
##
|
| 136 |
|
| 137 |
| Layer | Technology | Notes |
|
| 138 |
|---|---|---|
|
| 139 |
| AI / ML | PyTorch, Transformers, OpenCV | Unchanged from labs |
|
|
|
|
| 140 |
| Backend API | FastAPI | Async, automatic OpenAPI docs |
|
| 141 |
-
| Frontend | React + Tailwind CSS
|
| 142 |
| Database | PostgreSQL | Cases, users, audit log |
|
| 143 |
| File storage | MinIO (S3-compatible) | Documents and images, on-premise |
|
| 144 |
-
| Auth | JWT + bcrypt |
|
| 145 |
| PDF reports | WeasyPrint or ReportLab | Forensic report generation |
|
|
|
|
|
|
|
| 146 |
| Container | Docker Compose | Extend existing setup |
|
| 147 |
| CI/CD | GitHub Actions | Build, test, automated releases |
|
| 148 |
|
| 149 |
---
|
| 150 |
|
| 151 |
-
## Licensing Model
|
| 152 |
|
| 153 |
### Recommended: Open-Core
|
| 154 |
|
|
@@ -158,8 +231,6 @@ Professional core that replaces the Jupyter notebooks.
|
|
| 158 |
| **Professional** | Commercial, closed source | Examiners, law firms |
|
| 159 |
| **Enterprise** | Commercial + SLA contract | Courts, banks, government |
|
| 160 |
|
| 161 |
-
**Rationale:** A Community Edition on GitHub maximises visibility and builds reputation in the forensic and AI communities. Professional and Enterprise tiers generate revenue through case management, reporting, audit logging, and support — features that matter to paying clients but are overkill for the open-source demo use case.
|
| 162 |
-
|
| 163 |
### Customer pricing (indicative)
|
| 164 |
|
| 165 |
| Plan | Target | Model |
|
|
@@ -168,13 +239,3 @@ Professional core that replaces the Jupyter notebooks.
|
|
| 168 |
| **Studio** | 2–10 users | Annual licence per user |
|
| 169 |
| **Enterprise** | Courts, banks | Contract + SLA + fine-tuning |
|
| 170 |
| **SaaS** *(future)* | Small firms | Monthly subscription / pay-per-use |
|
| 171 |
-
|
| 172 |
-
---
|
| 173 |
-
|
| 174 |
-
## Open Questions for Future Decisions
|
| 175 |
-
|
| 176 |
-
- Confirm that RT-DETR (or alternative) covers the Lab 04 signature detection use case adequately
|
| 177 |
-
- Verify Ultralytics Enterprise License pricing if Option B is preferred over Option A
|
| 178 |
-
- Decide frontend framework (React vs Vue) before starting Phase 1
|
| 179 |
-
- Assess whether a UX/UI designer is needed for the case management interface
|
| 180 |
-
- Evaluate whether to seek grant funding (EU Horizon, PNRR) for forensic AI tooling
|
|
|
|
| 1 |
# GraphoLab — Commercial Roadmap
|
| 2 |
|
| 3 |
+
> This document outlines the development roadmap for evolving GraphoLab from a demo laboratory into a commercial product.
|
| 4 |
|
| 5 |
---
|
| 6 |
|
|
|
|
| 17 |
|
| 18 |
---
|
| 19 |
|
| 20 |
+
## Architecture Decisions (confirmed 2026-03-31)
|
| 21 |
+
|
| 22 |
+
| Decision | Choice | Rationale |
|
| 23 |
+
|---|---|---|
|
| 24 |
+
| Frontend | **React + shadcn/ui** | Largest ecosystem, professional components ready-to-use |
|
| 25 |
+
| Backend API | **FastAPI** | Async, automatic OpenAPI docs, Python-native |
|
| 26 |
+
| Database | **PostgreSQL** | Cases, users, audit log |
|
| 27 |
+
| File storage | **MinIO** (S3-compatible) | Documents and images, on-premise |
|
| 28 |
+
| Auth | **JWT + bcrypt** | Keycloak/SSO deferred to enterprise phase |
|
| 29 |
+
| PDF reports | **WeasyPrint** or **ReportLab** | Forensic report generation |
|
| 30 |
+
| Deployment | **On-premise Docker** | Primary target; forensic data must not leave client network |
|
| 31 |
+
| AI logic | **`core/` shared package** | Reused by both Gradio demo and FastAPI backend |
|
| 32 |
+
|
| 33 |
+
### Key architectural principles
|
| 34 |
+
|
| 35 |
+
#### 1. Thin frontend (dumb client)
|
| 36 |
+
|
| 37 |
+
The React frontend does zero processing. It only renders UI and makes HTTP calls to the backend.
|
| 38 |
+
All logic — AI, validation, business rules, auth — lives in the backend or in `core/`.
|
| 39 |
+
|
| 40 |
+
#### 2. Shared `core/` package
|
| 41 |
+
|
| 42 |
+
All AI/ML logic lives in pure Python modules under `core/`, with no dependency on any web framework.
|
| 43 |
+
`core/` is called by the FastAPI backend (for the professional app) and directly by `grapholab_demo.py` (for the Gradio demo).
|
| 44 |
+
|
| 45 |
+
#### 3. Every feature follows the same three-layer pattern
|
| 46 |
+
|
| 47 |
+
- `core/<module>.py` — pure AI/business logic, no HTTP, fully testable in isolation
|
| 48 |
+
- `backend/routers/<module>.py` — FastAPI router: receives request, calls `core/`, returns JSON
|
| 49 |
+
- `frontend/src/...` — React component: calls the endpoint, renders the result
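The first two layers of the pattern above can be sketched without any web framework. This is a minimal illustration, not code from the repository: `AnalysisResult`, `analyze_text`, and `handle_analyze` are hypothetical names, and the handler is a plain function standing in for what would be a FastAPI route in `backend/routers/`.

```python
import json
from dataclasses import asdict, dataclass


# --- core layer: hypothetical stand-in for a function in core/<module>.py ---
@dataclass
class AnalysisResult:
    text: str
    word_count: int


def analyze_text(raw: str) -> AnalysisResult:
    """Pure logic: no HTTP, no UI, so it can be unit-tested in isolation."""
    cleaned = " ".join(raw.split())  # collapse runs of whitespace
    return AnalysisResult(text=cleaned, word_count=len(cleaned.split()))


# --- router layer: framework-free stand-in for a route handler ---
def handle_analyze(request_body: str) -> str:
    """Parse the request, delegate to the core function, return JSON."""
    payload = json.loads(request_body)
    result = analyze_text(payload["text"])
    return json.dumps(asdict(result))
```

Because the handler only parses input and serialises output, the same `analyze_text` call could also be made directly from a Gradio callback, which is the reuse the `core/` package is meant to enable.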
|
| 50 |
+
|
| 51 |
+
#### 4. Gradio demo preserved
|
| 52 |
+
|
| 53 |
+
`app/grapholab_demo.py` is preserved as-is for demos and HF Spaces.
|
| 54 |
+
It calls `core/` directly — acceptable because it is a local single-user demo, not a multi-user web app.
|
| 55 |
+
|
| 56 |
+
```text
|
| 57 |
+
grapholab/
|
| 58 |
+
├── core/ # shared AI logic (new)
|
| 59 |
+
│ ├── ocr.py
|
| 60 |
+
│ ├── signature.py
|
| 61 |
+
│ ├── graphology.py
|
| 62 |
+
│ ├── ner.py
|
| 63 |
+
│ ├── writer.py
|
| 64 |
+
│ ├── pipeline.py
|
| 65 |
+
│ ├── dating.py
|
| 66 |
+
│ └── rag.py
|
| 67 |
+
├── app/
|
| 68 |
+
│ └── grapholab_demo.py # Gradio demo (preserved, refactored to import from core/)
|
| 69 |
+
├── backend/ # FastAPI professional app (new)
|
| 70 |
+
└── frontend/ # React + shadcn/ui SPA (new)
|
| 71 |
+
```
|
| 72 |
+
|
| 73 |
+
---
|
| 74 |
+
|
| 75 |
## Priority Issue: Dependency Licenses
|
| 76 |
|
| 77 |
+
**Two dependencies carry AGPL-3.0 licenses**, which impose obligations on any commercial product.
|
| 78 |
|
| 79 |
| Dependency | License | Commercial impact |
|
| 80 |
|---|---|---|
|
|
|
|
| 82 |
| `albumentations` | AGPL-3.0 / Commercial | Same constraint |
|
| 83 |
| All others | BSD / Apache 2.0 / MIT | No commercial restrictions |
|
| 84 |
|
| 85 |
+
**Resolution (recommended before any commercial release):**
|
| 86 |
|
| 87 |
+
- Replace `ultralytics` with RT-DETR via `transformers` (Apache 2.0) — covers the Lab 04 signature detection use case
|
| 88 |
+
- Remove `albumentations` from requirements (not used in production code, only in notebooks)
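One way to keep the resolution above from regressing is a small check that scans a requirements file for the two flagged packages. The sketch below is an assumption about how such a check might look, not part of the commit: `FLAGGED` and `flagged_requirements` are hypothetical names, and the parsing covers only simple `name==version` style lines.

```python
import re

# Assumption: the two AGPL-licensed packages named in the resolution list.
FLAGGED = {"ultralytics", "albumentations"}


def flagged_requirements(requirements_text: str) -> list[str]:
    """Return flagged package names that appear in a requirements file."""
    found = []
    for line in requirements_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        if not line or line.startswith("-"):  # skip blanks and pip options
            continue
        # The name is everything before a version specifier, extra, or marker.
        name = re.split(r"[<>=!~\[;@\s]", line, maxsplit=1)[0].lower()
        if name in FLAGGED:
            found.append(name)
    return found
```

Run in CI against `requirements.txt`, a non-empty result could fail the build. It does not inspect transitive dependencies, so it complements a full license audit and does not replace one.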
|
|
|
|
|
|
|
| 89 |
|
| 90 |
---
|
| 91 |
|
|
|
|
| 111 |
- **Chain of custody** requirements mandate full traceability
|
| 112 |
- **GDPR** (EU) requires control over data residency
|
| 113 |
|
| 114 |
+
### Deployment options
|
| 115 |
|
| 116 |
+
**Option 1 — On-premise Docker** *(primary, confirmed)*
|
| 117 |
- Extend the existing `docker-compose.yml`
|
| 118 |
- Client installs on a local server or company VM
|
| 119 |
- Multi-user, browser access from internal network
|
| 120 |
- Easy to update; no data leaves the network
|
|
|
|
| 121 |
|
| 122 |
+
**Option 2 — Standalone Desktop** *(for individual practitioners, future)*
|
| 123 |
- Windows/macOS installer (.exe / .dmg)
|
| 124 |
- FastAPI backend bundled with PyInstaller + Electron UI
|
| 125 |
- Fully local, works offline, no network dependencies
|
|
|
|
| 126 |
|
| 127 |
**Option 3 — SaaS Cloud** *(optional, future)*
|
| 128 |
+
- Hosted on AWS / Azure / GCP
|
|
|
|
| 129 |
- More complex GDPR compliance; expect resistance from forensic clients
|
| 130 |
+
- Secondary channel once on-premise is established
|
| 131 |
|
| 132 |
---
|
| 133 |
|
| 134 |
## Development Roadmap
|
| 135 |
|
| 136 |
+
### Phase 0 — Core Module Extraction (1–2 weeks)
|
| 137 |
|
| 138 |
+
Branch: `feature/core-modules`
|
| 139 |
|
| 140 |
+
Extract AI logic from `grapholab_demo.py` into a shared `core/` package.
|
| 141 |
+
`grapholab_demo.py` remains fully functional throughout — it becomes a thin Gradio wrapper.
|
| 142 |
|
| 143 |
+
Migration order (most independent first):
|
| 144 |
|
| 145 |
+
- [ ] Create `core/__init__.py`
|
| 146 |
+
- [ ] `core/ner.py` — NER pipeline
|
| 147 |
+
- [ ] `core/ocr.py` — TrOCR + EasyOCR
|
| 148 |
+
- [ ] `core/graphology.py` — HOG, LBP, graphological analysis
|
| 149 |
+
- [ ] `core/writer.py` — writer identification
|
| 150 |
+
- [ ] `core/signature.py` — SigNet + YOLO (or RT-DETR)
|
| 151 |
+
- [ ] `core/rag.py` — RAG + Ollama
|
| 152 |
+
- [ ] `core/dating.py` — document dating (uses OCR internally)
|
| 153 |
+
- [ ] `core/pipeline.py` — full forensic pipeline (aggregates all others)
|
| 154 |
+
- [ ] Update `grapholab_demo.py` to import from `core/`
|
| 155 |
+
- [ ] Verify Gradio demo works identically
|
| 156 |
+
- [ ] Replace `ultralytics` with RT-DETR in `core/signature.py`
|
| 157 |
+
- [ ] Remove `albumentations` from `requirements.txt`
|
| 158 |
|
| 159 |
+
### Phase 1 — MVP (6–8 weeks)
|
| 160 |
|
| 161 |
+
Branch: `feature/backend` (from `feature/core-modules`)
|
| 162 |
|
| 163 |
+
Professional core that replaces the Jupyter notebooks.
|
| 164 |
|
| 165 |
+
- [ ] FastAPI skeleton + PostgreSQL schema (`users`, `organizations`, `projects`, `documents`, `analyses`, `audit_log`)
|
| 166 |
+
- [ ] JWT authentication (login, logout, refresh, password reset)
|
| 167 |
+
- [ ] Role-based access: admin, examiner, viewer
|
| 168 |
+
- [ ] MinIO integration for document/image storage
|
| 169 |
+
- [ ] CRUD operations on projects, scoped per user
|
| 170 |
+
- [ ] 4 AI engines via REST API (HTR, signature verification, signature detection, graphological analysis)
|
| 171 |
+
- [ ] Immutable audit log (forensic requirement — append-only, no UPDATE/DELETE)
|
| 172 |
+
- [ ] PDF report generation (WeasyPrint or ReportLab)
|
| 173 |
+
- [ ] Docker Compose updated with: `postgres`, `minio`, `backend`
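The append-only requirement for the audit log in the list above can be enforced at the database level, not only in application code. The sketch below illustrates the idea with SQLite triggers from the Python standard library. SQLite is a stand-in chosen so the example runs anywhere; the roadmap targets PostgreSQL, where the equivalent would need its own triggers or privilege setup. The column names and the helpers `open_audit_db` and `record` are hypothetical.

```python
import sqlite3

# Assumed minimal schema; the real audit_log columns are not given in the roadmap.
SCHEMA = """
CREATE TABLE audit_log (
    id         INTEGER PRIMARY KEY AUTOINCREMENT,
    actor      TEXT NOT NULL,
    action     TEXT NOT NULL,
    created_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
);
CREATE TRIGGER audit_log_no_update BEFORE UPDATE ON audit_log
BEGIN
    SELECT RAISE(ABORT, 'audit_log is append-only');
END;
CREATE TRIGGER audit_log_no_delete BEFORE DELETE ON audit_log
BEGIN
    SELECT RAISE(ABORT, 'audit_log is append-only');
END;
"""


def open_audit_db(path: str = ":memory:") -> sqlite3.Connection:
    """Create the audit table together with the triggers that reject mutations."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def record(conn: sqlite3.Connection, actor: str, action: str) -> None:
    """Append one entry; the connection context manager commits on success."""
    with conn:
        conn.execute(
            "INSERT INTO audit_log (actor, action) VALUES (?, ?)",
            (actor, action),
        )
```

With this schema, any `UPDATE` or `DELETE` on `audit_log` raises a `sqlite3.DatabaseError`. A database administrator could still drop the triggers, so a production setup would likely pair this with restricted database roles.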
|
| 174 |
+
|
| 175 |
+
Branch: `feature/frontend` (from `feature/backend`)
|
| 176 |
+
|
| 177 |
+
- [ ] React + Tailwind CSS + shadcn/ui scaffold
|
| 178 |
+
- [ ] Auth pages (login, logout, password reset)
|
| 179 |
+
- [ ] Case management dashboard (list, create, delete projects)
|
| 180 |
+
- [ ] Analysis UI for 4 core engines
|
| 181 |
+
- [ ] Report download
|
| 182 |
+
- [ ] Docker Compose updated with: `frontend`
|
| 183 |
+
|
| 184 |
+
### Phase 2 — Mature Product (6–8 weeks)
|
| 185 |
+
|
| 186 |
+
- [ ] Multilingual support (react-i18next): Italian + English
|
| 187 |
+
- [ ] Batch processing (ZIP archive → automatic pipeline → aggregate report)
|
| 188 |
+
- [ ] Side-by-side sample comparison with annotations (Fabric.js or Konva.js)
|
| 189 |
+
- [ ] Full-text search in archived OCR content (PostgreSQL FTS or Meilisearch)
|
| 190 |
+
- [ ] Data export (CSV, JSON) for third-party system integration
|
| 191 |
+
- [ ] AI model updates without reinstallation
|
| 192 |
+
- [ ] End-user documentation and examiner manual
|
| 193 |
+
|
| 194 |
+
### Phase 3 — Enterprise (variable)
|
| 195 |
+
|
| 196 |
+
- [ ] SSO / LDAP / Active Directory integration (Keycloak)
|
| 197 |
+
- [ ] Multi-tenancy (multiple organisations on the same instance)
|
| 198 |
+
- [ ] Public REST API with OpenAPI docs (FastAPI auto-generates this)
|
| 199 |
+
- [ ] Fine-tuning interface for client-proprietary datasets
|
| 200 |
+
- [ ] Usage analytics dashboard for admins
|
| 201 |
+
- [ ] SLA support agreements
|
| 202 |
|
| 203 |
---
|
| 204 |
|
| 205 |
+
## Technology Stack
|
| 206 |
|
| 207 |
| Layer | Technology | Notes |
|
| 208 |
|---|---|---|
|
| 209 |
| AI / ML | PyTorch, Transformers, OpenCV | Unchanged from labs |
|
| 210 |
+
| Shared AI package | `core/` (new) | Reused by Gradio demo and FastAPI |
|
| 211 |
| Backend API | FastAPI | Async, automatic OpenAPI docs |
|
| 212 |
+
| Frontend | React + Tailwind CSS + shadcn/ui | Professional components |
|
| 213 |
| Database | PostgreSQL | Cases, users, audit log |
|
| 214 |
| File storage | MinIO (S3-compatible) | Documents and images, on-premise |
|
| 215 |
+
| Auth | JWT + bcrypt | Keycloak for enterprise SSO (Phase 3) |
|
| 216 |
| PDF reports | WeasyPrint or ReportLab | Forensic report generation |
|
| 217 |
+
| i18n | react-i18next | Italian + English (Phase 2) |
|
| 218 |
+
| Annotations | Fabric.js or Konva.js | Interactive image annotation (Phase 2) |
|
| 219 |
| Container | Docker Compose | Extend existing setup |
|
| 220 |
| CI/CD | GitHub Actions | Build, test, automated releases |
|
| 221 |
|
| 222 |
---
|
| 223 |
|
| 224 |
+
## Licensing Model
|
| 225 |
|
| 226 |
### Recommended: Open-Core
|
| 227 |
|
|
|
|
| 231 |
| **Professional** | Commercial, closed source | Examiners, law firms |
|
| 232 |
| **Enterprise** | Commercial + SLA contract | Courts, banks, government |
|
| 233 |
|
|
|
|
|
|
|
| 234 |
### Customer pricing (indicative)
|
| 235 |
|
| 236 |
| Plan | Target | Model |
|
|
|
|
| 239 |
| **Studio** | 2–10 users | Annual licence per user |
|
| 240 |
| **Enterprise** | Courts, banks | Contract + SLA + fine-tuning |
|
| 241 |
| **SaaS** *(future)* | Small firms | Monthly subscription / pay-per-use |
|
app/grapholab_demo.py
CHANGED
|
@@ -27,274 +27,44 @@ ROOT = Path(__file__).parent.parent
|
|
| 27 |
sys.path.insert(0, str(ROOT))
|
| 28 |
|
| 29 |
import io
|
| 30 |
-
import requests as _requests
|
| 31 |
-
import matplotlib
|
| 32 |
-
matplotlib.use("Agg")
|
| 33 |
-
import matplotlib.pyplot as plt
|
| 34 |
-
import cv2
|
| 35 |
-
import numpy as np
|
| 36 |
-
import torch
|
| 37 |
-
import torch.nn as nn
|
| 38 |
-
import torch.nn.functional as F
|
| 39 |
-
import torchvision.transforms as T
|
| 40 |
-
import gradio as gr
|
| 41 |
-
from collections import OrderedDict
|
| 42 |
-
from PIL import Image, ImageDraw, ImageFont, ImageOps
|
| 43 |
-
from skimage import filters, transform as sk_transform
|
| 44 |
-
from skimage.feature import hog, local_binary_pattern
|
| 45 |
-
from sklearn.svm import SVC
|
| 46 |
-
from sklearn.preprocessing import LabelEncoder, StandardScaler
|
| 47 |
-
from sklearn.pipeline import Pipeline
|
| 48 |
-
from transformers import TrOCRProcessor, VisionEncoderDecoderModel, pipeline as hf_pipeline
|
| 49 |
-
from ultralytics import YOLO
|
| 50 |
-
from huggingface_hub import hf_hub_download
|
| 51 |
-
|
| 52 |
-
# ──────────────────────────────────────────────────────────────────────────────
|
| 53 |
-
# Configuration
|
| 54 |
-
# ──────────────────────────────────────────────────────────────────────────────
|
| 55 |
-
|
| 56 |
-
DEVICE = "cuda" if torch.cuda.is_available() else "cpu"
|
| 57 |
-
TROCR_MODEL = "microsoft/trocr-large-handwritten"
|
| 58 |
-
YOLO_REPO = "tech4humans/yolov8s-signature-detector"
|
| 59 |
-
YOLO_FILENAME = "yolov8s.pt"
|
| 60 |
-
SIG_THRESHOLD = 0.35 # cosine distance threshold for signature verification
|
| 61 |
-
NER_MODEL = "Babelscape/wikineural-multilingual-ner"
|
| 62 |
-
|
| 63 |
-
# ──────────────────────────────────────────────────────────────────────────────
|
| 64 |
-
# Lazy model loaders (loaded on first use to avoid memory duplication)
|
| 65 |
-
# ──────────────────────────────────────────────────────────────────────────────
|
| 66 |
-
|
| 67 |
-
_trocr_processor = None
|
| 68 |
-
_trocr_model = None
|
| 69 |
-
_easyocr_reader = None
|
| 70 |
-
_yolo_model = None
|
| 71 |
-
_ner_pipeline = None
|
| 72 |
-
_writer_clf = None
|
| 73 |
-
_writer_le = None
|
| 74 |
-
_writer_X_scaled = None # scaled training features for open-set distance check
|
| 75 |
-
_writer_dist_threshold = None # auto-calibrated rejection threshold
|
| 76 |
-
import threading as _threading
|
| 77 |
-
import hashlib as _hashlib
|
| 78 |
-
import json as _json
|
| 79 |
import tempfile as _tempfile
|
| 80 |
-
|
| 81 |
-
|
| 82 |
-
# ── RAG / Ollama ──────────────────────────────────────────────────────────────
|
| 83 |
-
OLLAMA_URL = "http://localhost:11434"
|
| 84 |
-
OLLAMA_MODEL = "llama3.2" # default / fallback
|
| 85 |
-
_embed_model = OLLAMA_MODEL # embedding model — fixed (changing it invalidates cache)
|
| 86 |
-
_rag_model = OLLAMA_MODEL # generation model — selectable via UI
|
| 87 |
-
_rag_chunks: list = [] # [{"text": str, "source": str, "emb": np.ndarray}]
|
| 88 |
-
_rag_indexed_files: set = set() # filenames already indexed via upload
|
| 89 |
-
_rag_ready = False
|
| 90 |
-
_rag_lock = _threading.Lock()
|
| 91 |
-
_RAG_CACHE_DIR = ROOT / "data" / "rag_cache"
|
| 92 |
-
|
| 93 |
-
# Check Ollama availability at startup (used to show/hide the unavailability banner)
|
| 94 |
-
def _check_ollama() -> bool:
|
| 95 |
-
try:
|
| 96 |
-
_requests.get(f"{OLLAMA_URL}/api/tags", timeout=3)
|
| 97 |
-
return True
|
| 98 |
-
except Exception:
|
| 99 |
-
return False
|
| 100 |
-
|
| 101 |
-
_OLLAMA_AVAILABLE = _check_ollama()
|
| 102 |
-
|
| 103 |
-
|
| 104 |
-
def get_trocr():
|
| 105 |
-
global _trocr_processor, _trocr_model
|
| 106 |
-
if _trocr_processor is None:
|
| 107 |
-
print("Loading TrOCR...")
|
| 108 |
-
_trocr_processor = TrOCRProcessor.from_pretrained(TROCR_MODEL)
|
| 109 |
-
_trocr_model = VisionEncoderDecoderModel.from_pretrained(TROCR_MODEL).to(DEVICE)
|
| 110 |
-
_trocr_model.eval()
|
| 111 |
-
return _trocr_processor, _trocr_model
|
| 112 |
-
|
| 113 |
-
|
| 114 |
-
def get_easyocr():
|
| 115 |
-
global _easyocr_reader
|
| 116 |
-
if _easyocr_reader is None:
|
| 117 |
-
import easyocr
|
| 118 |
-
print("Loading EasyOCR (Italian)...")
|
| 119 |
-
_easyocr_reader = easyocr.Reader(["it", "en"], gpu=DEVICE == "cuda")
|
| 120 |
-
return _easyocr_reader
|
| 121 |
-
|
| 122 |
-
|
| 123 |
-
def get_ner():
|
| 124 |
-
global _ner_pipeline
|
| 125 |
-
if _ner_pipeline is None:
|
| 126 |
-
print("Loading NER model...")
|
| 127 |
-
_ner_pipeline = hf_pipeline(
|
| 128 |
-
"ner",
|
| 129 |
-
model=NER_MODEL,
|
| 130 |
-
aggregation_strategy="simple",
|
| 131 |
-
device=0 if DEVICE == "cuda" else -1,
|
| 132 |
-
)
|
| 133 |
-
return _ner_pipeline
|
| 134 |
-
|
| 135 |
-
|
| 136 |
-
def get_yolo():
|
| 137 |
-
global _yolo_model
|
| 138 |
-
if _yolo_model is None:
|
| 139 |
-
print("Loading YOLOv8 signature detector...")
|
| 140 |
-
hf_token = os.environ.get("HF_TOKEN")
|
| 141 |
-
model_path = hf_hub_download(
|
| 142 |
-
repo_id=YOLO_REPO, filename=YOLO_FILENAME, token=hf_token
|
| 143 |
-
)
|
| 144 |
-
_yolo_model = YOLO(model_path)
|
| 145 |
-
return _yolo_model
|
| 146 |
|
| 147 |
|
| 148 |
# ──────────────────────────────────────────────────────────────────────────────
|
| 149 |
-
#
|
| 150 |
# ──────────────────────────────────────────────────────────────────────────────
|
| 151 |
|
| 152 |
SIGNET_WEIGHTS = ROOT / "models" / "signet.pth"
|
| 153 |
-
|
| 154 |
-
|
| 155 |
-
|
| 156 |
-
def _conv_bn_relu(in_ch, out_ch, kernel, stride=1, pad=0):
|
| 157 |
-
return nn.Sequential(OrderedDict([
|
| 158 |
-
("conv", nn.Conv2d(in_ch, out_ch, kernel, stride, pad, bias=False)),
|
| 159 |
-
("bn", nn.BatchNorm2d(out_ch)),
|
| 160 |
-
("relu", nn.ReLU()),
|
| 161 |
-
]))
|
| 162 |
-
|
| 163 |
-
|
| 164 |
-
def _linear_bn_relu(in_f, out_f):
|
| 165 |
-
return nn.Sequential(OrderedDict([
|
| 166 |
-
("fc", nn.Linear(in_f, out_f, bias=False)),
|
| 167 |
-
("bn", nn.BatchNorm1d(out_f)),
|
| 168 |
-
("relu", nn.ReLU()),
|
| 169 |
-
]))
|
| 170 |
-
|
| 171 |
-
|
| 172 |
-
class SigNet(nn.Module):
|
| 173 |
-
"""SigNet feature extractor (sigver re-implementation, output: 2048-d L2-normalised)."""
|
| 174 |
-
def __init__(self):
|
| 175 |
-
super().__init__()
|
| 176 |
-
self.conv_layers = nn.Sequential(OrderedDict([
|
| 177 |
-
("conv1", _conv_bn_relu(1, 96, 11, stride=4)),
|
| 178 |
-
("maxpool1", nn.MaxPool2d(3, 2)),
|
| 179 |
-
("conv2", _conv_bn_relu(96, 256, 5, pad=2)),
|
| 180 |
-
("maxpool2", nn.MaxPool2d(3, 2)),
|
| 181 |
-
("conv3", _conv_bn_relu(256, 384, 3, pad=1)),
|
| 182 |
-
("conv4", _conv_bn_relu(384, 384, 3, pad=1)),
|
| 183 |
-
("conv5", _conv_bn_relu(384, 256, 3, pad=1)),
|
| 184 |
-
("maxpool3", nn.MaxPool2d(3, 2)),
|
| 185 |
-
]))
|
| 186 |
-
self.fc_layers = nn.Sequential(OrderedDict([
|
| 187 |
-
("fc1", _linear_bn_relu(256 * 3 * 5, 2048)),
|
| 188 |
-
("fc2", _linear_bn_relu(2048, 2048)),
|
| 189 |
-
]))
|
| 190 |
-
|
| 191 |
-
def forward_once(self, x):
|
| 192 |
-
x = self.conv_layers(x)
|
| 193 |
-
x = x.view(x.size(0), 256 * 3 * 5)
|
| 194 |
-
x = self.fc_layers(x)
|
| 195 |
-
return F.normalize(x, p=2, dim=1)
|
| 196 |
-
|
| 197 |
-
|
| 198 |
-
_signet = None
|
| 199 |
-
_signet_pretrained = False
|
| 200 |
-
|
| 201 |
-
|
| 202 |
-
def get_signet():
|
| 203 |
-
global _signet, _signet_pretrained
|
| 204 |
-
if _signet is None:
|
| 205 |
-
model = SigNet().to(DEVICE).eval()
|
| 206 |
-
if SIGNET_WEIGHTS.exists():
|
| 207 |
-
state_dict, _, _ = torch.load(SIGNET_WEIGHTS, map_location=DEVICE)
|
| 208 |
-
model.load_state_dict(state_dict)
|
| 209 |
-
_signet_pretrained = True
|
| 210 |
-
print("SigNet: loaded pre-trained weights from", SIGNET_WEIGHTS)
|
| 211 |
-
else:
|
| 212 |
-
print("SigNet: no pre-trained weights found — using random initialisation.")
|
| 213 |
-
_signet = model
|
| 214 |
-
return _signet
|
| 215 |
-
|
| 216 |
-
|
| 217 |
-
def preprocess_signature(pil_img: Image.Image) -> torch.Tensor:
|
| 218 |
-
"""Sigver-compatible preprocessing: centre on canvas, invert, resize to 150×220."""
|
| 219 |
-
arr = np.array(pil_img.convert("L"), dtype=np.uint8)
|
| 220 |
-
|
| 221 |
-
# Centre on canvas
|
| 222 |
-
canvas = np.ones(SIGNET_CANVAS, dtype=np.uint8) * 255
|
| 223 |
-
try:
|
| 224 |
-
threshold = filters.threshold_otsu(arr)
|
| 225 |
-
blurred = filters.gaussian(arr, 2, preserve_range=True)
|
| 226 |
-
binary = blurred > threshold
|
| 227 |
-
rows, cols = np.where(binary == 0)
|
| 228 |
-
if len(rows) == 0:
|
| 229 |
-
raise ValueError("empty")
|
| 230 |
-
cropped = arr[rows.min():rows.max(), cols.min():cols.max()]
|
| 231 |
-
r_center = int(rows.mean() - rows.min())
|
| 232 |
-
c_center = int(cols.mean() - cols.min())
|
| 233 |
-
r_start = max(0, SIGNET_CANVAS[0] // 2 - r_center)
|
| 234 |
-
c_start = max(0, SIGNET_CANVAS[1] // 2 - c_center)
|
| 235 |
-
h = min(cropped.shape[0], SIGNET_CANVAS[0] - r_start)
|
| 236 |
-
w = min(cropped.shape[1], SIGNET_CANVAS[1] - c_start)
|
| 237 |
-
canvas[r_start:r_start + h, c_start:c_start + w] = cropped[:h, :w]
|
| 238 |
-
canvas[canvas > threshold] = 255
|
| 239 |
-
except Exception:
|
| 240 |
-
canvas = arr # fallback: use image as-is
|
| 241 |
-
|
| 242 |
-
# Invert and resize to 150×220
|
| 243 |
-
inverted = 255 - canvas
|
| 244 |
-
resized = sk_transform.resize(inverted, (150, 220), preserve_range=True,
|
| 245 |
-
anti_aliasing=True).astype(np.uint8)
|
| 246 |
-
tensor = torch.from_numpy(resized).float().div(255)
|
| 247 |
-
return tensor.unsqueeze(0).unsqueeze(0).to(DEVICE) # (1, 1, 150, 220)
|
| 248 |
|
|
|
|
| 249 |
|
| 250 |
# ──────────────────────────────────────────────────────────────────────────────
|
| 251 |
# Tab 1 — Handwritten OCR
|
| 252 |
# ──────────────────────────────────────────────────────────────────────────────
|
| 253 |
|
| 254 |
-
def _preprocess_for_htr(image: np.ndarray) -> np.ndarray:
|
| 255 |
-
"""Light preprocessing: deskew + contrast enhancement, keeping grayscale gradients
|
| 256 |
-
so EasyOCR's CRNN recogniser retains letter-shape information."""
|
| 257 |
-
import cv2
|
| 258 |
-
|
| 259 |
-
# 1. Grayscale
|
| 260 |
-
if image.ndim == 3:
|
| 261 |
-
gray = cv2.cvtColor(image, cv2.COLOR_RGB2GRAY)
|
| 262 |
-
else:
|
| 263 |
-
gray = image.copy()
|
| 264 |
-
|
| 265 |
-
# 2. Deskew via minAreaRect on ink pixels
|
| 266 |
-
_, bw = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
|
| 267 |
-
coords = np.column_stack(np.where(bw > 0))
|
| 268 |
-
if len(coords) > 100:
|
| 269 |
-
angle = cv2.minAreaRect(coords)[-1]
|
| 270 |
-
if angle < -45:
|
| 271 |
-
angle = 90 + angle
|
| 272 |
-
else:
|
| 273 |
-
angle = -angle
|
| 274 |
-
if abs(angle) > 0.3:
|
| 275 |
-
(h, w) = gray.shape
|
| 276 |
-
M = cv2.getRotationMatrix2D((w / 2, h / 2), angle, 1.0)
|
| 277 |
-
gray = cv2.warpAffine(gray, M, (w, h),
|
| 278 |
-
flags=cv2.INTER_CUBIC,
|
| 279 |
-
borderMode=cv2.BORDER_REPLICATE)
|
| 280 |
-
|
| 281 |
-
# 3. CLAHE contrast enhancement (adaptive, preserves gradients)
|
| 282 |
-
clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
|
| 283 |
-
enhanced = clahe.apply(gray)
|
| 284 |
-
|
| 285 |
-
# 4. Back to 3-channel for EasyOCR
|
| 286 |
-
return cv2.cvtColor(enhanced, cv2.COLOR_GRAY2BGR)
|
| 287 |
-
|
| 288 |
-
|
| 289 |
-
def htr_transcribe(image: np.ndarray) -> str:
|
| 290 |
-
if image is None:
|
| 291 |
-
return "Carica un'immagine di testo manoscritto."
|
| 292 |
-
reader = get_easyocr()
|
| 293 |
-
processed = _preprocess_for_htr(image)
|
| 294 |
-
results = reader.readtext(processed, detail=0, paragraph=True)
|
| 295 |
-
return "\n".join(results)
|
| 296 |
-
|
| 297 |
-
|
| 298 |
htr_tab = gr.Interface(
|
| 299 |
fn=htr_transcribe,
|
| 300 |
inputs=gr.Image(label="Immagine di testo manoscritto", type="numpy"),
|
|
@@ -337,108 +107,8 @@ def _sig_ex(name: str) -> str | None:
|
|
| 337 |
return str(p) if p.exists() else None
|
| 338 |
|
| 339 |
|
| 340 |
-
def
|
| 341 |
-
ref_image
|
| 342 |
-
ref_image2: np.ndarray | None,
|
| 343 |
-
query_image: np.ndarray,
|
| 344 |
-
) -> tuple[str, np.ndarray | None]:
|
| 345 |
-
if ref_image is None or query_image is None:
|
| 346 |
-
return "Carica la firma di riferimento e quella da verificare.", None
|
| 347 |
-
|
| 348 |
-
model = get_signet()
|
| 349 |
-
|
| 350 |
-
with torch.no_grad():
|
| 351 |
-
emb_ref1 = model.forward_once(preprocess_signature(Image.fromarray(ref_image)))
|
| 352 |
-
if ref_image2 is not None:
|
| 353 |
-
emb_ref2 = model.forward_once(preprocess_signature(Image.fromarray(ref_image2)))
|
| 354 |
-
mean_ref = F.normalize(emb_ref1 + emb_ref2, p=2, dim=1)
|
| 355 |
-
n_refs = 2
|
| 356 |
-
else:
|
| 357 |
-
mean_ref = emb_ref1
|
| 358 |
-
n_refs = 1
|
| 359 |
-
emb_query = model.forward_once(preprocess_signature(Image.fromarray(query_image)))
|
| 360 |
-
|
| 361 |
-
cosine_sim = F.cosine_similarity(mean_ref, emb_query).item()
|
| 362 |
-
cosine_dist = 1.0 - cosine_sim
|
| 363 |
-
verdict = "AUTENTICA ✓" if cosine_dist < SIG_THRESHOLD else "FALSA ✗"
|
| 364 |
-
color = "#2ca02c" if cosine_dist < SIG_THRESHOLD else "#d62728"
|
| 365 |
-
|
| 366 |
-
weights_note = (
|
| 367 |
-
"Modello: SigNet — pesi pre-addestrati GPDS (luizgh/sigver)."
|
| 368 |
-
if _signet_pretrained else
|
| 369 |
-
"⚠️ ATTENZIONE: pesi casuali — risultati non significativi.\n"
|
| 370 |
-
"Scarica signet.pth da luizgh/sigver e posizionalo in models/signet.pth."
|
| 371 |
-
)
|
| 372 |
-
report = (
|
| 373 |
-
f"Esito: {verdict}\n"
|
| 374 |
-
f"Similarità coseno: {cosine_sim:.4f}\n"
|
| 375 |
-
f"Distanza coseno: {cosine_dist:.4f} (soglia: {SIG_THRESHOLD})\n"
|
| 376 |
-
f"Riferimenti usati: {n_refs}"
|
| 377 |
-
+ (" (embedding mediato)" if n_refs > 1 else "") + "\n\n"
|
| 378 |
-
+ weights_note
|
| 379 |
-
)
|
| 380 |
-
|
| 381 |
-
# ── Matplotlib visualisation ──────────────────────────────────────────────
|
| 382 |
-
n_img_panels = 2 + (1 if ref_image2 is not None else 0)
|
| 383 |
-
width_ratios = ([1] * n_img_panels) + [1.4]
|
| 384 |
-
fig, axes = plt.subplots(
|
| 385 |
-
1, n_img_panels + 1,
|
| 386 |
-
figsize=(3.2 * (n_img_panels + 1), 3.2),
|
| 387 |
-
gridspec_kw={"width_ratios": width_ratios},
|
| 388 |
-
)
|
| 389 |
-
|
| 390 |
-
panels = [ref_image]
|
| 391 |
-
labels = ["Rif. 1"]
|
| 392 |
-
if ref_image2 is not None:
|
| 393 |
-
panels.append(ref_image2)
|
| 394 |
-
labels.append("Rif. 2")
|
| 395 |
-
panels.append(query_image)
|
| 396 |
-
labels.append("Da verificare")
|
| 397 |
-
|
| 398 |
-
for ax, img, lbl in zip(axes[:-1], panels, labels):
|
| 399 |
-
ax.imshow(img, cmap="gray" if img.ndim == 2 else None)
|
| 400 |
-
ax.set_title(lbl, fontsize=10)
|
| 401 |
-
ax.axis("off")
|
| 402 |
-
|
| 403 |
-
# Gauge panel
|
| 404 |
-
ax_g = axes[-1]
|
| 405 |
-
ax_g.set_xlim(0, 1)
|
| 406 |
-
ax_g.set_ylim(0, 1)
|
| 407 |
-
ax_g.axis("off")
|
| 408 |
-
|
| 409 |
-
# Verdict text
|
| 410 |
-
ax_g.text(0.5, 0.82, verdict, ha="center", va="center",
|
| 411 |
-
fontsize=14, fontweight="bold", color=color,
|
| 412 |
-
transform=ax_g.transAxes)
|
| 413 |
-
|
| 414 |
-
# Gauge bar (distance from 0 to 1)
|
| 415 |
-
bar_ax = fig.add_axes([
|
| 416 |
-
axes[-1].get_position().x0 + 0.01,
|
| 417 |
-
axes[-1].get_position().y0 + 0.12,
|
| 418 |
-
axes[-1].get_position().width - 0.02,
|
| 419 |
-
0.18,
|
| 420 |
-
])
|
| 421 |
-
bar_ax.barh([0], [cosine_dist], color=color, alpha=0.75, height=0.6)
|
| 422 |
-
bar_ax.barh([0], [1.0 - cosine_dist], left=cosine_dist,
|
| 423 |
-
color="#cccccc", alpha=0.4, height=0.6)
|
| 424 |
-
bar_ax.axvline(SIG_THRESHOLD, color="black", linestyle="--", linewidth=1.2)
|
| 425 |
-
bar_ax.set_xlim(0, 1)
|
| 426 |
-
bar_ax.set_ylim(-0.5, 0.5)
|
| 427 |
-
bar_ax.set_yticks([])
|
| 428 |
-
bar_ax.set_xticks([0, SIG_THRESHOLD, 1])
|
| 429 |
-
bar_ax.set_xticklabels(["0", f"soglia\n{SIG_THRESHOLD}", "1"], fontsize=7)
|
| 430 |
-
bar_ax.set_xlabel(f"Distanza coseno: {cosine_dist:.3f}", fontsize=8)
|
| 431 |
-
|
| 432 |
-
plt.suptitle("Verifica Autenticità Firma — SigNet", fontsize=11, fontweight="bold")
|
| 433 |
-
plt.tight_layout()
|
| 434 |
-
|
| 435 |
-
buf = io.BytesIO()
|
| 436 |
-
fig.savefig(buf, format="png", dpi=130, bbox_inches="tight")
|
| 437 |
-
plt.close(fig)
|
| 438 |
-
buf.seek(0)
|
| 439 |
-
chart = np.array(Image.open(buf))
|
| 440 |
-
|
| 441 |
-
return report, chart
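The decision rule in the removed verification code (sum the reference embeddings, L2-normalise, then compare cosine distance to a threshold) can be sketched without PyTorch. This is only an illustration on plain Python lists: the function names and the `threshold=0.35` default are assumptions made for the example, not the project's `SIG_THRESHOLD` or the API of `core/signature.py`.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def mean_reference(emb1, emb2=None):
    """Sum two reference embeddings and L2-normalise the result.

    With a single reference the embedding is passed through unchanged,
    mirroring the one-reference branch of the removed code.
    """
    if emb2 is None:
        return list(emb1)
    summed = [x + y for x, y in zip(emb1, emb2)]
    norm = math.sqrt(sum(v * v for v in summed))
    return [v / norm for v in summed]


def verdict(ref1, query, ref2=None, threshold=0.35):
    """Return (label, distance); the threshold is a hypothetical placeholder."""
    dist = cosine_distance(mean_reference(ref1, ref2), query)
    return ("genuine" if dist < threshold else "forged"), dist
```

Because cosine distance ignores vector length, normalising the summed references does not change the verdict; it only keeps the averaged embedding on the same scale as a single one.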
|
| 442 |
|
| 443 |
|
| 444 |
_sig_examples = []
|
|
@@ -447,13 +117,13 @@ for _n in ["1", "2", "3"]:
|
|
| 447 |
_r2 = _sig_ex(f"genuine_{_n}_2.png")
|
| 448 |
_forg = _sig_ex(f"forged_{_n}_1.png")
|
| 449 |
if _r1 and _r2 and _forg:
|
| 450 |
-
_sig_examples.append([_r1, _r2, _forg])
|
| 451 |
if _n == "1" and _r1 and _r2:
|
| 452 |
-
_sig_examples.append([_r1, None, _r2])
|
| 453 |
|
| 454 |
|
| 455 |
sig_verify_tab = gr.Interface(
|
| 456 |
-
fn=
|
| 457 |
inputs=[
|
| 458 |
gr.Image(label="Firma di riferimento 1 (autentica nota)", type="numpy"),
|
| 459 |
gr.Image(label="Firma di riferimento 2 — opzionale (migliora l'accuratezza)", type="numpy"),
|
|
@@ -496,57 +166,6 @@ sig_verify_tab = gr.Interface(
|
|
| 496 |
# Tab 3 — Signature Detection
|
| 497 |
# ──────────────────────────────────────────────────────────────────────────────
|
| 498 |
|
| 499 |
-
def sig_detect(image: np.ndarray, conf_threshold: float) -> tuple[np.ndarray, str]:
|
| 500 |
-
if image is None:
|
| 501 |
-
return image, "Carica un'immagine del documento."
|
| 502 |
-
try:
|
| 503 |
-
yolo = get_yolo()
|
| 504 |
-
except Exception as e:
|
| 505 |
-
msg = (
|
| 506 |
-
"⚠️ **Modello non disponibile.**\n\n"
|
| 507 |
-
"Il modello `tech4humans/yolov8s-signature-detector` è ad accesso limitato su Hugging Face.\n\n"
|
| 508 |
-
"**Per abilitare questa sezione:**\n"
|
| 509 |
-
"1. Crea un account su huggingface.co\n"
|
| 510 |
-
"2. Richiedi l'accesso su huggingface.co/tech4humans/yolov8s-signature-detector\n"
|
| 511 |
-
"3. Crea un token su huggingface.co/settings/tokens\n"
|
| 512 |
-
"4. Imposta la variabile d'ambiente `HF_TOKEN=<il_tuo_token>` prima di avviare l'app\n\n"
|
| 513 |
-
f"Errore: {e}"
|
| 514 |
-
)
|
| 515 |
-
return image, msg
|
| 516 |
-
pil_img = Image.fromarray(image).convert("RGB")
|
| 517 |
-
|
| 518 |
-
# Save to temp file for YOLO
|
| 519 |
-
import tempfile
|
| 520 |
-
with tempfile.NamedTemporaryFile(suffix=".png", delete=False) as tmp:
|
| 521 |
-
pil_img.save(tmp.name)
|
| 522 |
-
tmp_path = tmp.name
|
| 523 |
-
|
| 524 |
-
results = yolo.predict(tmp_path, conf=conf_threshold, verbose=False)
|
| 525 |
-
os.unlink(tmp_path)
|
| 526 |
-
|
| 527 |
-
result = results[0]
|
| 528 |
-
annotated = image.copy()
|
| 529 |
-
count = 0
|
| 530 |
-
|
| 531 |
-
if result.boxes is not None:
|
| 532 |
-
for box in result.boxes:
|
| 533 |
-
x1, y1, x2, y2 = box.xyxy[0].cpu().numpy().astype(int)
|
| 534 |
-
conf = float(box.conf[0].cpu())
|
| 535 |
-
cv2.rectangle(annotated, (x1, y1), (x2, y2), (255, 0, 0), 2)
|
| 536 |
-
cv2.putText(annotated, f"Sig #{count+1} {conf:.0%}",
|
| 537 |
-
(x1, max(y1 - 8, 0)),
|
| 538 |
-
cv2.FONT_HERSHEY_SIMPLEX, 0.6, (255, 0, 0), 2)
|
| 539 |
-
count += 1
|
| 540 |
-
|
| 541 |
-
summary = (
|
| 542 |
-
f"Rilevat{'a' if count == 1 else 'e'} {count} firm{'a' if count == 1 else 'e'} "
|
| 543 |
-
f"(confidenza ≥ {conf_threshold:.0%})\n\n"
|
| 544 |
-
f"**Modello:** `tech4humans/yolov8s-signature-detector`\n"
|
| 545 |
-
f"**Uso forense:** Estrazione automatica di firme da documenti legali."
|
| 546 |
-
)
|
| 547 |
-
return annotated, summary
|
| 548 |
-
|
| 549 |
-
|
| 550 |
sig_detect_tab = gr.Interface(
|
| 551 |
fn=sig_detect,
|
| 552 |
inputs=[
|
|
@@ -588,41 +207,6 @@ sig_detect_tab = gr.Interface(
|
|
| 588 |
# Tab 4 — Named Entity Recognition
|
| 589 |
# ──────────────────────────────────────────────────────────────────────────────
|
| 590 |
|
| 591 |
-
_NER_LABELS = {"PER": "Persona", "ORG": "Organizzazione", "LOC": "Luogo", "MISC": "Varie"}
|
| 592 |
-
|
| 593 |
-
|
| 594 |
-
def ner_extract(text: str):
|
| 595 |
-
if not text or not text.strip():
|
| 596 |
-
return [], "Inserisci del testo da analizzare."
|
| 597 |
-
nlp = get_ner()
|
| 598 |
-
entities = nlp(text)
|
| 599 |
-
|
| 600 |
-
# Build HighlightedText format: list of (span, label|None)
|
| 601 |
-
result = []
|
| 602 |
-
prev_end = 0
|
| 603 |
-
for ent in entities:
|
| 604 |
-
start, end = ent["start"], ent["end"]
|
| 605 |
-
if start > prev_end:
|
| 606 |
-
result.append((text[prev_end:start], None))
|
| 607 |
-
result.append((text[start:end], ent["entity_group"]))
|
| 608 |
-
prev_end = end
|
| 609 |
-
if prev_end < len(text):
|
| 610 |
-
result.append((text[prev_end:], None))
|
| 611 |
-
|
| 612 |
-
# Summary table
|
| 613 |
-
if entities:
|
| 614 |
-
rows = "\n".join(
|
| 615 |
-
f"| **{_NER_LABELS.get(e['entity_group'], e['entity_group'])}** "
|
| 616 |
-
f"(`{e['entity_group']}`) | {e['word']} | {e['score']:.0%} |"
|
| 617 |
-
for e in entities
|
| 618 |
-
)
|
| 619 |
-
summary = f"| Tipo | Entità | Confidenza |\n|------|--------|------------|\n{rows}"
|
| 620 |
-
else:
|
| 621 |
-
summary = "Nessuna entità trovata."
|
| 622 |
-
|
| 623 |
-
return result, summary
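The span-building loop above turns character offsets into the `(text, label)` pairs that a highlighted-text widget consumes. A standalone sketch, assuming entities arrive sorted and non-overlapping with `start`, `end` and `entity_group` keys as in the removed code; the function name and the sample entities are hypothetical:

```python
def to_highlighted_spans(text, entities):
    """Split text into (substring, label) pairs, with label None between entities."""
    spans = []
    prev_end = 0
    for ent in entities:
        start, end = ent["start"], ent["end"]
        if start > prev_end:
            # Unlabelled text between the previous entity and this one.
            spans.append((text[prev_end:start], None))
        spans.append((text[start:end], ent["entity_group"]))
        prev_end = end
    if prev_end < len(text):
        # Trailing text after the last entity.
        spans.append((text[prev_end:], None))
    return spans


# Hand-written offsets for illustration; a real NER pipeline would supply them.
sample_text = "Mario Rossi vive a Roma."
sample_entities = [
    {"start": 0, "end": 11, "entity_group": "PER"},
    {"start": 19, "end": 23, "entity_group": "LOC"},
]
sample_spans = to_highlighted_spans(sample_text, sample_entities)
```

Concatenating the first element of every pair reproduces the input text, which makes a convenient sanity check.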
|
| 624 |
-
|
| 625 |
-
|
| 626 |
ner_tab = gr.Interface(
|
| 627 |
fn=ner_extract,
|
| 628 |
inputs=gr.Textbox(
|
|
@@ -679,341 +263,19 @@ ner_tab = gr.Interface(
|
|
| 679 |
# Tab 5 — Writer Identification
|
| 680 |
# ──────────────────────────────────────────────────────────────────────────────
|
| 681 |
|
| 682 |
-
|
| 683 |
-
|
| 684 |
-
|
| 685 |
-
|
| 686 |
-
|
| 687 |
-
0: "Scrittore A",
|
| 688 |
-
1: "Scrittore B",
|
| 689 |
-
2: "Scrittore C",
|
| 690 |
-
3: "Scrittore D",
|
| 691 |
-
4: "Scrittore E",
|
| 692 |
-
}
|
| 693 |
-
|
| 694 |
-
|
| 695 |
-
def _make_synthetic_writer(writer_id: int, sample_id: int) -> Image.Image:
|
| 696 |
-
"""Generate a synthetic handwriting sample using system TTF fonts."""
|
| 697 |
-
rng = np.random.default_rng(writer_id * 1000 + sample_id)
|
| 698 |
-
|
| 699 |
-
_FONTS_DIR = Path("C:/Windows/Fonts")
|
| 700 |
-
# Each writer gets a distinct handwriting font + base size
|
| 701 |
-
_WRITER_FONTS = [
|
| 702 |
-
("Inkfree.ttf", 19), # Writer 0 — Ink Free (informal cursive)
|
| 703 |
-
("LHANDW.TTF", 17), # Writer 1 — Lucida Handwriting (elegant)
|
| 704 |
-
("segoepr.ttf", 18), # Writer 2 — Segoe Print (hand-printed block letters)
|
| 705 |
-
("segoesc.ttf", 16), # Writer 3 — Segoe Script (modern cursive)
|
| 706 |
-
("comic.ttf", 18), # Writer 4 — Comic Sans (rounded, informal)
|
| 707 |
-
]
|
| 708 |
-
font_name, base_size = _WRITER_FONTS[writer_id % len(_WRITER_FONTS)]
|
| 709 |
-
font_size = base_size + int(rng.integers(-1, 2))
|
| 710 |
-
try:
|
| 711 |
-
font = ImageFont.truetype(str(_FONTS_DIR / font_name), font_size)
|
| 712 |
-
except Exception:
|
| 713 |
-
font = ImageFont.load_default()
|
| 714 |
-
|
| 715 |
-
# Ink darkness: each writer has a characteristic pen pressure
|
| 716 |
-
ink_value = int([25, 15, 35, 20, 30][writer_id % 5] + rng.integers(-5, 6))
|
| 717 |
-
|
| 718 |
-
_SENTENCES = [
|
| 719 |
-
"il gatto dorme sul tetto",
|
| 720 |
-
"la casa è piccola e bella",
|
| 721 |
-
"oggi il cielo è molto blu",
|
| 722 |
-
"scrivere a mano è un'arte",
|
| 723 |
-
"ogni persona ha uno stile",
|
| 724 |
-
"il sole tramonta a ovest",
|
| 725 |
-
"leggo un libro ogni sera",
|
| 726 |
-
"la penna scorre sul foglio",
|
| 727 |
-
"le parole raccontano storie",
|
| 728 |
-
"questo è un campione scritto",
|
| 729 |
-
]
|
| 730 |
-
lines = [
|
| 731 |
-
_SENTENCES[(writer_id * 3 + sample_id + i) % len(_SENTENCES)]
|
| 732 |
-
for i in range(3)
|
| 733 |
-
]
|
| 734 |
|
| 735 |
-
w, h = 320, 140
|
| 736 |
-
img = Image.new("L", (w, h), 255)
|
| 737 |
-
draw = ImageDraw.Draw(img)
|
| 738 |
-
|
| 739 |
-
line_gap = font_size + 12 + int(rng.integers(-2, 3))
|
| 740 |
-
y = 10
|
| 741 |
-
for line in lines:
|
| 742 |
-
x = 8 + int(rng.integers(-3, 4))
|
| 743 |
-
draw.text((x, y), line, fill=ink_value, font=font)
|
| 744 |
-
y += line_gap
|
| 745 |
-
|
| 746 |
-
# Slight rotation simulates unaligned paper
|
| 747 |
-
angle = float(rng.uniform(-1.5, 1.5))
|
| 748 |
-
img = img.rotate(angle, fillcolor=255, expand=False)
|
| 749 |
-
|
| 750 |
-
return img
|
| 751 |
-
|
| 752 |
-
|
| 753 |
-
def _preprocess_writer_img(pil_img: Image.Image) -> np.ndarray:
|
| 754 |
-
"""Convert PIL image to normalised grayscale array of WRITER_IMG_SIZE.
|
| 755 |
-
|
| 756 |
-
For portrait documents (full page), extracts a representative landscape
|
| 757 |
-
crop from the text body before resizing, preserving stroke-level features
|
| 758 |
-
that the model was trained on (word-level 320×140 samples).
|
| 759 |
-
"""
|
| 760 |
-
gray = pil_img.convert("L")
|
| 761 |
-
w, h = gray.size
|
| 762 |
-
# If image is portrait, take a landscape crop from the upper text body
|
| 763 |
-
# (avoids distorting full-page documents when resizing to landscape target)
|
| 764 |
-
target_ratio = WRITER_IMG_SIZE[1] / WRITER_IMG_SIZE[0] # 256/128 = 2.0
|
| 765 |
-
if h > w:
|
| 766 |
-
crop_h = int(w / target_ratio)
|
| 767 |
-
top = h // 6 # skip top margin, start from first text lines
|
| 768 |
-
top = min(top, max(0, h - crop_h))
|
| 769 |
-
gray = gray.crop((0, top, w, top + crop_h))
|
| 770 |
-
arr = np.array(gray, dtype=np.float32)
|
| 771 |
-
# OTSU threshold → invert so ink=1, bg=0
|
| 772 |
-
thresh = filters.threshold_otsu(arr) if arr.std() > 1 else 128.0
|
| 773 |
-
binary = (arr < thresh).astype(np.float32)
|
| 774 |
-
# Resize
|
| 775 |
-
resized = sk_transform.resize(binary, WRITER_IMG_SIZE, anti_aliasing=True)
|
| 776 |
-
return resized.astype(np.float32)
|
| 777 |
-
|
| 778 |
-
|
| 779 |
-
def _extract_writer_features(pil_img: Image.Image) -> np.ndarray:
|
| 780 |
-
"""Extract HOG + LBP + run-length features for writer identification."""
|
| 781 |
-
arr = _preprocess_writer_img(pil_img)
|
| 782 |
-
arr8 = (arr * 255).astype(np.uint8)
|
| 783 |
-
|
| 784 |
-
# HOG features
|
| 785 |
-
hog_feats = hog(
|
| 786 |
-
arr,
|
| 787 |
-
orientations=9,
|
| 788 |
-
pixels_per_cell=(16, 16),
|
| 789 |
-
cells_per_block=(2, 2),
|
| 790 |
-
feature_vector=True,
|
| 791 |
-
)
|
| 792 |
|
| 793 |
-
|
| 794 |
-
|
| 795 |
-
lbp_hist, _ = np.histogram(lbp, bins=26, range=(0, 26), density=True)
|
| 796 |
-
|
| 797 |
-
# Run-length statistics (horizontal & vertical)
|
| 798 |
-
def _run_stats(binary_row):
|
| 799 |
-
runs = []
|
| 800 |
-
cnt = 0
|
| 801 |
-
for v in binary_row:
|
| 802 |
-
if v > 0.5:
|
| 803 |
-
cnt += 1
|
| 804 |
-
elif cnt > 0:
|
| 805 |
-
runs.append(cnt)
|
| 806 |
-
cnt = 0
|
| 807 |
-
if cnt > 0:
|
| 808 |
-
runs.append(cnt)
|
| 809 |
-
return runs
|
| 810 |
-
|
| 811 |
-
h_runs = []
|
| 812 |
-
for row in arr:
|
| 813 |
-
h_runs.extend(_run_stats(row))
|
| 814 |
-
v_runs = []
|
| 815 |
-
for col in arr.T:
|
| 816 |
-
v_runs.extend(_run_stats(col))
|
| 817 |
-
|
| 818 |
-
h_arr = np.array(h_runs, dtype=np.float32) if h_runs else np.array([0.0])
|
| 819 |
-
v_arr = np.array(v_runs, dtype=np.float32) if v_runs else np.array([0.0])
|
| 820 |
-
run_feats = np.array([
|
| 821 |
-
h_arr.mean(), h_arr.std(), h_arr.max(),
|
| 822 |
-
v_arr.mean(), v_arr.std(), v_arr.max(),
|
| 823 |
-
], dtype=np.float32)
|
| 824 |
-
|
| 825 |
-
return np.concatenate([hog_feats, lbp_hist, run_feats])
|
| 826 |
-
|
| 827 |
-
|
| 828 |
-
def _load_real_writer_samples() -> tuple[list[np.ndarray], list[str]] | None:
|
| 829 |
-
"""Load samples from data/samples/writer_XX/sample_YY.png directories."""
|
| 830 |
-
writer_dirs = sorted(_WRITER_SAMPLES_DIR.glob("writer_??"))
|
| 831 |
-
if len(writer_dirs) < 2:
|
| 832 |
-
return None
|
| 833 |
-
X, y = [], []
|
| 834 |
-
for wd in writer_dirs:
|
| 835 |
-
samples = sorted(wd.glob("sample_*.png"))
|
| 836 |
-
if len(samples) < 3:
|
| 837 |
-
continue
|
| 838 |
-
for sp in samples:
|
| 839 |
-
try:
|
| 840 |
-
img = Image.open(sp)
|
| 841 |
-
X.append(_extract_writer_features(img))
|
| 842 |
-
y.append(wd.name)
|
| 843 |
-
except Exception:
|
| 844 |
-
pass
|
| 845 |
-
if len(set(y)) < 2:
|
| 846 |
-
return None
|
| 847 |
-
return X, y
|
| 848 |
-
|
| 849 |
-
|
| 850 |
-
def _get_writer_model():
|
| 851 |
-
"""Return (Pipeline, LabelEncoder), training lazily on first call.
|
| 852 |
-
|
| 853 |
-
Thread-safe: if the background pre-warm thread is still running when the
|
| 854 |
-
pipeline reaches step 4, this call blocks until training finishes rather
|
| 855 |
-
than spawning a duplicate training job.
|
| 856 |
-
"""
|
| 857 |
-
global _writer_clf, _writer_le
|
| 858 |
-
if _writer_clf is not None: # fast path — no lock needed
|
| 859 |
-
return _writer_clf, _writer_le
|
| 860 |
-
with _writer_lock: # only one thread trains at a time
|
| 861 |
-
if _writer_clf is not None: # re-check after acquiring lock
|
| 862 |
-
return _writer_clf, _writer_le
|
| 863 |
-
print("Training writer identification model...")
|
| 864 |
-
|
| 865 |
-
real = _load_real_writer_samples()
|
| 866 |
-
if real is not None:
|
| 867 |
-
X_raw, y_raw = real
|
| 868 |
-
labels = y_raw
|
| 869 |
-
else:
|
| 870 |
-
# Synthetic fallback: 5 writers × 10 samples
|
| 871 |
-
X_raw, labels = [], []
|
| 872 |
-
for wid in range(5):
|
| 873 |
-
for sid in range(10):
|
| 874 |
-
img = _make_synthetic_writer(wid, sid)
|
| 875 |
-
X_raw.append(_extract_writer_features(img))
|
| 876 |
-
labels.append(_WRITER_NAMES[wid])
|
| 877 |
-
|
| 878 |
-
le = LabelEncoder()
|
| 879 |
-
y_enc = le.fit_transform(labels)
|
| 880 |
-
X = np.array(X_raw)
|
| 881 |
-
|
| 882 |
-
clf = Pipeline([
|
| 883 |
-
("scaler", StandardScaler()),
|
| 884 |
-
("svc", SVC(kernel="rbf", C=10, gamma="scale", probability=True)),
|
| 885 |
-
])
|
| 886 |
-
clf.fit(X, y_enc)
|
| 887 |
-
|
| 888 |
-
# Open-set calibration: compute max intra-class nearest-neighbour distance
|
| 889 |
-
# in the scaled feature space, then use 2× as the rejection threshold.
|
| 890 |
-
global _writer_X_scaled, _writer_dist_threshold
|
| 891 |
-
X_scaled = clf.named_steps["scaler"].transform(X)
|
| 892 |
-
max_intra = 0.0
|
| 893 |
-
for cls in np.unique(y_enc):
|
| 894 |
-
Xc = X_scaled[y_enc == cls]
|
| 895 |
-
if len(Xc) < 2:
|
| 896 |
-
continue
|
| 897 |
-
diff = Xc[:, np.newaxis, :] - Xc[np.newaxis, :, :]
|
| 898 |
-
dists = np.sqrt((diff ** 2).sum(axis=2))
|
| 899 |
-
np.fill_diagonal(dists, np.inf)
|
| 900 |
-
max_intra = max(max_intra, dists.min(axis=1).max())
|
| 901 |
-
_writer_X_scaled = X_scaled
|
| 902 |
-
_writer_dist_threshold = max_intra * 2.0
|
| 903 |
-
|
| 904 |
-
_writer_clf = clf
|
| 905 |
-
_writer_le = le
|
| 906 |
-
print(f"Writer model ready — {len(le.classes_)} writers, {len(X)} samples. "
|
| 907 |
-
f"Rejection threshold: {_writer_dist_threshold:.3f}")
|
| 908 |
-
return _writer_clf, _writer_le
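The open-set calibration described in the comments above (largest intra-class nearest-neighbour distance, doubled, used as a rejection threshold) can be illustrated on toy 2-D points. This is a sketch only: the helper names are hypothetical and it works on plain tuples rather than the scaled HOG/LBP feature matrix used in the removed code.

```python
import math


def nn_rejection_threshold(points, labels, factor=2.0):
    """Return factor * (largest nearest-neighbour distance within any class)."""
    max_intra = 0.0
    for cls in set(labels):
        members = [p for p, lab in zip(points, labels) if lab == cls]
        if len(members) < 2:
            continue  # a single sample has no intra-class neighbour
        for i, p in enumerate(members):
            nearest = min(
                math.dist(p, q) for j, q in enumerate(members) if j != i
            )
            max_intra = max(max_intra, nearest)
    return factor * max_intra


def is_unknown(sample, points, threshold):
    """Flag a sample whose nearest training point is farther than the threshold."""
    return min(math.dist(sample, p) for p in points) > threshold


# Toy data: class "a" is tight (spacing 1), class "b" is looser (spacing 3).
toy_points = [(0, 0), (1, 0), (10, 0), (10, 3)]
toy_labels = ["a", "a", "b", "b"]
toy_threshold = nn_rejection_threshold(toy_points, toy_labels)
```

A single global threshold is driven by the loosest class, so tightly clustered writers are rejected less aggressively than a per-class threshold would allow; that is a property of this scheme worth keeping in mind when reading the probabilities.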
|
| 909 |
-
|
| 910 |
-
|
| 911 |
-
def _ensure_writer_examples() -> list[str]:
|
| 912 |
-
"""Pre-generate example images for the Gradio examples list."""
|
| 913 |
-
_WRITER_EXAMPLES_DIR.mkdir(parents=True, exist_ok=True)
|
| 914 |
-
paths = []
|
| 915 |
-
for wid in range(5):
|
| 916 |
-
p = _WRITER_EXAMPLES_DIR / f"writer_{wid}_example.png"
|
| 917 |
-
if not p.exists():
|
| 918 |
-
img = _make_synthetic_writer(wid, sample_id=99)
|
| 919 |
-
img.save(str(p))
|
| 920 |
-
paths.append(str(p))
|
| 921 |
-
return paths
|
| 922 |
-
|
| 923 |
-
|
| 924 |
-
_writer_example_paths = _ensure_writer_examples()
|
| 925 |
-
|
| 926 |
-
# Pre-warm writer model in background so step 4 of pipeline is instant
|
| 927 |
-
_threading.Thread(target=_get_writer_model, daemon=True).start()
|
| 928 |
-
|
| 929 |
-
|
| 930 |
-
def writer_identify(image: np.ndarray) -> tuple[str, np.ndarray]:
|
| 931 |
-
if image is None:
|
| 932 |
-
return "Carica un'immagine di testo manoscritto.", None
|
| 933 |
-
try:
|
| 934 |
-
clf, le = _get_writer_model()
|
| 935 |
-
except Exception as e:
|
| 936 |
-
return f"Errore nel caricamento del modello: {e}", None
|
| 937 |
-
|
| 938 |
-
pil_img = Image.fromarray(image)
|
| 939 |
-
try:
|
| 940 |
-
feat = _extract_writer_features(pil_img)
|
| 941 |
-
except Exception as e:
|
| 942 |
-
return f"Errore nell'estrazione delle caratteristiche: {e}", None
|
| 943 |
-
|
| 944 |
-
proba = clf.predict_proba([feat])[0]
|
| 945 |
-
order = np.argsort(proba)[::-1]
|
| 946 |
-
names = le.inverse_transform(order)
|
| 947 |
-
scores = proba[order]
|
| 948 |
-
|
| 949 |
-
# Open-set check: nearest-neighbour distance in scaled feature space
|
| 950 |
-
is_unknown = False
|
| 951 |
-
if _writer_X_scaled is not None and _writer_dist_threshold is not None:
|
| 952 |
-
feat_scaled = clf.named_steps["scaler"].transform([feat])[0]
|
| 953 |
-
min_dist = np.linalg.norm(_writer_X_scaled - feat_scaled, axis=1).min()
|
| 954 |
-
is_unknown = min_dist > _writer_dist_threshold
|
| 955 |
-
|
| 956 |
-
# Markdown report
|
| 957 |
-
rows = "\n".join(
|
| 958 |
-
f"| {'🥇' if i == 0 else '🥈' if i == 1 else '🥉' if i == 2 else ' '} "
|
| 959 |
-
f"**{name}** | {score:.1%} |"
|
| 960 |
-
for i, (name, score) in enumerate(zip(names, scores))
|
| 961 |
-
)
|
| 962 |
-
if is_unknown:
|
| 963 |
-
report = (
|
| 964 |
-
"**⚠️ Scrittore non identificato nel database**\n\n"
|
| 965 |
-
"La scrittura analizzata non corrisponde a nessuno degli scrittori noti. "
|
| 966 |
-
"Le probabilità di seguito hanno valore puramente indicativo "
|
| 967 |
-
"e **non devono essere usate per un'attribuzione**.\n\n"
|
| 968 |
-
"| Candidato | Probabilità (riferimento) |\n"
|
| 969 |
-
"|-----------|---------------------------|\n"
|
| 970 |
-
+ rows
|
| 971 |
-
+ "\n\n*La distanza dal campione più simile nel database supera la soglia "
|
| 972 |
-
"di affidabilità. Aggiungere campioni dello scrittore al database per "
|
| 973 |
-
"un confronto diretto.*"
|
| 974 |
-
)
|
| 975 |
-
else:
|
| 976 |
-
report = (
|
| 977 |
-
"**Identificazione Scrittore — Risultati**\n\n"
|
| 978 |
-
"| Candidato | Probabilità |\n"
|
| 979 |
-
"|-----------|-------------|\n"
|
| 980 |
-
+ rows
|
| 981 |
-
+ "\n\n*I risultati si basano su caratteristiche HOG + LBP + statistiche dei tratti.*"
|
| 982 |
-
)
|
| 983 |
-
if _load_real_writer_samples() is None:
|
| 984 |
-
report += (
|
| 985 |
-
"\n\n⚠️ *Dati sintetici: il modello è addestrato su scritture generate "
|
| 986 |
-
"artificialmente. Per risultati forensi reali, popola `data/samples/writer_XX/`.*"
|
| 987 |
-
)
|
| 988 |
-
|
| 989 |
-
# Bar chart
|
| 990 |
-
fig, ax = plt.subplots(figsize=(5, max(2.5, len(names) * 0.55)))
|
| 991 |
-
if is_unknown:
|
| 992 |
-
colors = ["#aaaaaa"] * len(names)
|
| 993 |
-
chart_title = "Scrittore non nel database — solo riferimento"
|
| 994 |
-
else:
|
| 995 |
-
colors = ["#1B3A6B" if i == 0 else "#C8973A" if i == 1 else "#9eb8e0"
|
| 996 |
-
for i in range(len(names))]
|
| 997 |
-
chart_title = "Probabilità per scrittore"
|
| 998 |
-
ax.barh(names[::-1], scores[::-1] * 100, color=colors[::-1])
|
| 999 |
-
ax.set_xlabel("Probabilità (%)")
|
| 1000 |
-
ax.set_xlim(0, 105)
|
| 1001 |
-
ax.set_title(chart_title)
|
| 1002 |
-
for i, (name, score) in enumerate(zip(names[::-1], scores[::-1])):
|
| 1003 |
-
ax.text(score * 100 + 1, i, f"{score:.1%}", va="center", fontsize=9)
|
| 1004 |
-
plt.tight_layout()
|
| 1005 |
-
|
| 1006 |
-
buf = io.BytesIO()
|
| 1007 |
-
fig.savefig(buf, format="png", dpi=120)
|
| 1008 |
-
plt.close(fig)
|
| 1009 |
-
buf.seek(0)
|
| 1010 |
-
chart_arr = np.array(Image.open(buf))
|
| 1011 |
-
|
| 1012 |
-
return report, chart_arr
|
| 1013 |
|
| 1014 |
|
| 1015 |
writer_tab = gr.Interface(
|
| 1016 |
-
fn=
|
| 1017 |
inputs=gr.Image(label="Campione di scrittura a mano da attribuire", type="numpy"),
|
| 1018 |
outputs=[
|
| 1019 |
gr.Markdown(label="Candidati ordinati per probabilità"),
|
|
@@ -1052,89 +314,6 @@ writer_tab = gr.Interface(
|
|
| 1052 |
# Tab 6 — Graphological Feature Analysis
|
| 1053 |
# ──────────────────────────────────────────────────────────────────────────────
|
| 1054 |
|
| 1055 |
-
def grapho_analyse(image: np.ndarray) -> tuple[str, np.ndarray]:
|
| 1056 |
-
if image is None:
|
| 1057 |
-
return "Carica un'immagine di scrittura a mano.", image
|
| 1058 |
-
|
| 1059 |
-
# Cap to 800 px: adaptive threshold is O(pixels × blockSize), so keeping
|
| 1060 |
-
# the image small is critical. Graphological metrics are scale-invariant.
|
| 1061 |
-
h0, w0 = image.shape[:2]
|
| 1062 |
-
if max(h0, w0) > 800:
|
| 1063 |
-
sc = 800 / max(h0, w0)
|
| 1064 |
-
image = cv2.resize(image, (int(w0 * sc), int(h0 * sc)), interpolation=cv2.INTER_AREA)
|
| 1065 |
-
|
| 1066 |
-
gray = cv2.cvtColor(image, cv2.COLOR_RGB2GRAY) if len(image.shape) == 3 else image
|
| 1067 |
-
# Adaptive threshold works locally: ignores the global dark background of
|
| 1068 |
-
# phone photos that fools global Otsu into treating borders as ink.
|
| 1069 |
-
binary = cv2.adaptiveThreshold(
|
| 1070 |
-
cv2.GaussianBlur(gray, (5, 5), 0), 255,
|
| 1071 |
-
cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY_INV, 31, 10,
|
| 1072 |
-
)
|
| 1073 |
-
|
| 1074 |
-
# Slant
|
| 1075 |
-
contours, _ = cv2.findContours(binary, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
|
| 1076 |
-
angles = []
|
| 1077 |
-
for cnt in contours:
|
| 1078 |
-
if cv2.contourArea(cnt) >= 20 and len(cnt) >= 5:
|
| 1079 |
-
_, _, angle = cv2.fitEllipse(cnt)
|
| 1080 |
-
slant = angle - 90.0
|
| 1081 |
-
if -60 < slant < 60:
|
| 1082 |
-
angles.append(slant)
|
| 1083 |
-
slant_mean = float(np.mean(angles)) if angles else 0.0
|
| 1084 |
-
slant_std = float(np.std(angles)) if angles else 0.0
|
| 1085 |
-
|
| 1086 |
-
# Pressure
|
| 1087 |
-
ink_mask = binary > 0
|
| 1088 |
-
pressure = (255 - gray)[ink_mask]
|
| 1089 |
-
pressure_mean = float(pressure.mean()) if len(pressure) else 0.0
|
| 1090 |
-
|
| 1091 |
-
# Connected components
|
| 1092 |
-
num, _, stats, _ = cv2.connectedComponentsWithStats(binary, 8)
|
| 1093 |
-
valid = stats[1:][stats[1:, cv2.CC_STAT_AREA] > 15] if num > 1 else np.zeros((0, 5))
|
| 1094 |
-
h_mean = float(valid[:, cv2.CC_STAT_HEIGHT].mean()) if len(valid) else 0.0
|
| 1095 |
-
w_mean = float(valid[:, cv2.CC_STAT_WIDTH].mean()) if len(valid) else 0.0
|
| 1096 |
-
|
| 1097 |
-
# Word spacing
|
| 1098 |
-
h_proj = binary.sum(axis=0)
|
| 1099 |
-
gaps = []
|
| 1100 |
-
in_gap, gap_w = False, 0
|
| 1101 |
-
for v in h_proj:
|
| 1102 |
-
if v == 0:
|
| 1103 |
-
in_gap = True
|
| 1104 |
-
gap_w += 1
|
| 1105 |
-
elif in_gap:
|
| 1106 |
-
if gap_w > 5:
|
| 1107 |
-
gaps.append(gap_w)
|
| 1108 |
-
in_gap = False
|
| 1109 |
-
gap_w = 0
|
| 1110 |
-
word_spacing = float(np.mean(gaps)) if gaps else 0.0
|
| 1111 |
-
|
| 1112 |
-
ink_density = ink_mask.mean() * 100
|
| 1113 |
-
|
| 1114 |
-
# Build annotated visualisation
|
| 1115 |
-
vis = cv2.cvtColor(binary, cv2.COLOR_GRAY2RGB)
|
| 1116 |
-
for cnt in contours:
|
| 1117 |
-
if cv2.contourArea(cnt) >= 20:
|
| 1118 |
-
x, y, w, h = cv2.boundingRect(cnt)
|
| 1119 |
-
cv2.rectangle(vis, (x, y), (x + w, y + h), (0, 180, 255), 1)
|
| 1120 |
-
|
| 1121 |
-
report = (
|
| 1122 |
-
f"**Analisi delle Caratteristiche Grafologiche**\n\n"
|
| 1123 |
-
f"| Caratteristica | Valore |\n"
|
| 1124 |
-
f"|----------------|--------|\n"
|
| 1125 |
-
f"| Inclinazione media lettere | {slant_mean:+.1f}° ({'destra' if slant_mean > 0 else 'sinistra' if slant_mean < 0 else 'verticale'}) |\n"
|
| 1126 |
-
f"| Variazione inclinazione (σ) | {slant_std:.1f}° |\n"
|
| 1127 |
-
f"| Pressione del tratto | {pressure_mean:.1f} / 255 |\n"
|
| 1128 |
-
f"| Altezza media lettere | {h_mean:.1f} px |\n"
|
| 1129 |
-
f"| Larghezza media lettere | {w_mean:.1f} px |\n"
|
| 1130 |
-
f"| Spaziatura media parole | {word_spacing:.1f} px |\n"
|
| 1131 |
-
f"| Densità inchiostro | {ink_density:.2f}% |\n"
|
| 1132 |
-
f"| Componenti connesse | {len(valid)} |\n\n"
|
| 1133 |
-
f"*I bounding box delle lettere sono visibili nell'immagine annotata.*"
|
| 1134 |
-
)
|
| 1135 |
-
return report, vis
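The word-spacing measurement above scans the column sums of the binarised image and records runs of empty columns wider than five pixels. A minimal sketch of that scan on a plain list, with a hypothetical function name and a `min_gap` parameter standing in for the hard-coded value:

```python
def word_gaps(column_sums, min_gap=5):
    """Return widths of zero-runs longer than min_gap that are closed by ink.

    As in the removed code, a run of empty columns at the very end of the
    line is never closed by an inked column and is therefore not counted,
    while a leading margin is.
    """
    gaps = []
    width = 0
    for value in column_sums:
        if value == 0:
            width += 1
        else:
            if width > min_gap:
                gaps.append(width)
            width = 0
    return gaps


# One 6-column gap, one 2-column gap (too narrow), and an unclosed trailing run.
toy_projection = [3] + [0] * 6 + [2] + [0] * 2 + [5] + [0] * 7
toy_gaps = word_gaps(toy_projection)
```

Counting the leading margin as a gap can inflate the mean spacing on images with a wide left border, so cropping to the text body first may give a more representative figure.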
|
| 1136 |
-
|
| 1137 |
-
|
| 1138 |
grapho_tab = gr.Interface(
|
| 1139 |
fn=grapho_analyse,
|
| 1140 |
inputs=gr.Image(label="Immagine di testo manoscritto", type="numpy"),
|
|
@@ -1175,69 +354,13 @@ grapho_tab = gr.Interface(
|
|
| 1175 |
# Tab 7 — Forensic Pipeline
|
| 1176 |
# ──────────────────────────────────────────────────────────────────────────────
|
| 1177 |
|
| 1178 |
-
def _detect_and_crop(
|
| 1179 |
-
image: np.ndarray,
|
| 1180 |
-
conf_threshold: float = 0.3,
|
| 1181 |
-
) -> tuple[np.ndarray, np.ndarray | None, str]:
|
| 1182 |
-
"""Run YOLO signature detection and return (annotated, first_crop, summary).
|
| 1183 |
-
|
| 1184 |
-
Gracefully degrades when YOLO is not available (missing HF_TOKEN).
|
| 1185 |
-
"""
|
| 1186 |
-
annotated = image.copy()
|
| 1187 |
-
try:
|
| 1188 |
-
yolo = get_yolo()
|
| 1189 |
-
except Exception:
|
| 1190 |
-
return annotated, None, "⚠️ Rilevamento firma non disponibile (HF_TOKEN mancante)."
|
| 1191 |
-
|
| 1192 |
-
pil_img = Image.fromarray(image).convert("RGB")
|
| 1193 |
-
import tempfile
|
| 1194 |
-
with tempfile.NamedTemporaryFile(suffix=".png", delete=False) as tmp:
|
| 1195 |
-
pil_img.save(tmp.name)
|
| 1196 |
-
tmp_path = tmp.name
|
| 1197 |
-
|
| 1198 |
-
results = yolo.predict(tmp_path, conf=conf_threshold, verbose=False)
|
| 1199 |
-
os.unlink(tmp_path)
|
| 1200 |
-
|
| 1201 |
-
result = results[0]
|
| 1202 |
-
first_crop: np.ndarray | None = None
|
| 1203 |
-
count = 0
|
| 1204 |
-
|
| 1205 |
-
if result.boxes is not None:
|
| 1206 |
-
for box in result.boxes:
|
| 1207 |
-
x1, y1, x2, y2 = box.xyxy[0].cpu().numpy().astype(int)
|
| 1208 |
-
conf = float(box.conf[0].cpu())
|
| 1209 |
-
cv2.rectangle(annotated, (x1, y1), (x2, y2), (255, 0, 0), 2)
|
| 1210 |
-
cv2.putText(annotated, f"Sig #{count+1} {conf:.0%}",
|
| 1211 |
-
(x1, max(y1 - 8, 0)),
|
| 1212 |
-
cv2.FONT_HERSHEY_SIMPLEX, 0.6, (255, 0, 0), 2)
|
| 1213 |
-
if count == 0:
|
| 1214 |
-
x1c = max(0, x1); y1c = max(0, y1)
|
| 1215 |
-
x2c = min(image.shape[1], x2); y2c = min(image.shape[0], y2)
|
| 1216 |
-
if x2c > x1c and y2c > y1c:
|
| 1217 |
-
first_crop = image[y1c:y2c, x1c:x2c]
|
| 1218 |
-
count += 1
|
| 1219 |
-
|
| 1220 |
-
summary = (
|
| 1221 |
-
f"Rilevat{'a' if count == 1 else 'e'} {count} firm{'a' if count == 1 else 'e'}."
|
| 1222 |
-
if count > 0
|
| 1223 |
-
else "Nessuna firma rilevata nel documento."
|
| 1224 |
-
)
|
| 1225 |
-
return annotated, first_crop, summary
|
| 1226 |
-
|
| 1227 |
-
|
| 1228 |
def run_pipeline(
|
| 1229 |
doc_image: np.ndarray,
|
| 1230 |
ref_sig: np.ndarray | None,
|
| 1231 |
progress: gr.Progress = gr.Progress(track_tqdm=False),
|
| 1232 |
):
|
| 1233 |
-
"""
|
| 1234 |
-
|
| 1235 |
-
Generator: yields partial results after each step so the UI updates live.
|
| 1236 |
-
Output order: s1_img, s1_txt, s2_txt, s3_hl, s3_md,
|
| 1237 |
-
s4_md, s4_img, s5_md, s5_img,
|
| 1238 |
-
s6_txt, s6_img, final_md, pipe_results, llm_md
|
| 1239 |
-
"""
|
| 1240 |
-
_ = gr.update() # no-op: leave output unchanged
|
| 1241 |
|
| 1242 |
if doc_image is None:
|
| 1243 |
msg = "Carica il documento da analizzare."
|
|
@@ -1245,255 +368,42 @@ def run_pipeline(
|
|
| 1245 |
gr.update(visible=False), _)
|
| 1246 |
return
|
| 1247 |
|
| 1248 |
-
|
| 1249 |
-
|
| 1250 |
-
|
| 1251 |
-
|
| 1252 |
-
|
| 1253 |
-
|
| 1254 |
-
|
| 1255 |
-
|
| 1256 |
-
|
| 1257 |
-
|
| 1258 |
-
|
| 1259 |
-
|
| 1260 |
-
|
| 1261 |
-
|
| 1262 |
-
|
| 1263 |
-
|
| 1264 |
-
|
| 1265 |
-
|
| 1266 |
-
|
| 1267 |
-
# ── Step 4: Writer Identification ─────────────────────────────────────────
|
| 1268 |
-
progress(0.60, desc="Step 4/7 — Identificazione scrittore…")
|
| 1269 |
-
step4_report, step4_chart = writer_identify(doc_image)
|
| 1270 |
-
yield (_, _, _, _, _, step4_report, step4_chart, _, _, _, _, _, _, _)
|
| 1271 |
-
|
| 1272 |
-
# ── Step 5: Graphological Analysis ────────────────────────────────────────
|
| 1273 |
-
progress(0.75, desc="Step 5/7 — Analisi grafologica…")
|
| 1274 |
-
step5_report, step5_vis = grapho_analyse(doc_image)
|
| 1275 |
-
yield (_, _, _, _, _, _, _, step5_report, step5_vis, _, _, _, _, _)
|
| 1276 |
-
|
| 1277 |
-
# ── Step 6: Signature Verification ────────────────────────────────────────
|
| 1278 |
-
progress(0.88, desc="Step 6/7 — Verifica firma…")
|
| 1279 |
-
if ref_sig is not None:
|
| 1280 |
-
query_for_verify = sig_crop if sig_crop is not None else doc_image
|
| 1281 |
-
step6_report, step6_chart = sig_verify(ref_sig, None, query_for_verify)
|
| 1282 |
-
if sig_crop is None:
|
| 1283 |
-
step6_report += "\n\n⚠️ Nessuna firma estratta — confronto eseguito sull'immagine intera."
|
| 1284 |
-
else:
|
| 1285 |
-
step6_report = (
|
| 1286 |
-
"Firma di riferimento non fornita.\n\n"
|
| 1287 |
-
"Per abilitare questo step carica una firma autentica nota "
|
| 1288 |
-
"nel campo 'Firma di riferimento' sopra."
|
| 1289 |
)
|
| 1290 |
-
step6_chart = None
|
| 1291 |
-
yield (_, _, _, _, _, _, _, _, _, step6_report, step6_chart, _, _, _)
|
| 1292 |
-
|
| 1293 |
-
# ── Final report ──────────────────────────────────────────────────────────
|
| 1294 |
-
final_report = (
|
| 1295 |
-
"## Referto Forense Integrato\n\n"
|
| 1296 |
-
"---\n\n"
|
| 1297 |
-
f"### Step 1 — Rilevamento Firma\n{step1_summary}\n\n"
|
| 1298 |
-
f"### Step 2 — Trascrizione HTR\n```\n{step2_text}\n```\n\n"
|
| 1299 |
-
f"### Step 3 — Entità Nominate\n{step3_summary}\n\n"
|
| 1300 |
-
f"### Step 4 — Identificazione Scrittore\n{step4_report}\n\n"
|
| 1301 |
-
f"### Step 5 — Caratteristiche Grafologiche\n{step5_report}\n\n"
|
| 1302 |
-
f"### Step 6 — Verifica Firma\n{step6_report}\n\n"
|
| 1303 |
-
"---\n\n"
|
| 1304 |
-
"*Referto generato automaticamente da GraphoLab. "
|
| 1305 |
-
"Tutti i risultati hanno carattere indicativo e devono essere valutati "
|
| 1306 |
-
"da un perito calligrafo qualificato.*"
|
| 1307 |
-
)
|
| 1308 |
-
yield (_, _, _, _, _, _, _, _, _, _, _, final_report, _, _)
|
| 1309 |
-
|
| 1310 |
-
# ── Step 7: LLM synthesis ─────────────────────────────────────────────────
|
| 1311 |
-
progress(0.92, desc="Step 7/7 — Sintesi LLM…")
|
| 1312 |
-
llm_report = _pipeline_llm_synthesis(
|
| 1313 |
-
step1_summary, step2_text, step3_summary,
|
| 1314 |
-
step4_report, step5_report, step6_report,
|
| 1315 |
-
)
|
| 1316 |
-
yield (_, _, _, _, _, _, _, _, _, _, _, _, _, llm_report)
|
| 1317 |
|
| 1318 |
|
| 1319 |
-
def
|
| 1320 |
-
s1_img, s1_txt,
|
| 1321 |
-
|
| 1322 |
-
|
| 1323 |
-
s4_md, s4_img,
|
| 1324 |
-
s5_md, s5_img,
|
| 1325 |
-
s6_txt, s6_img,
|
| 1326 |
-
llm_text,
|
| 1327 |
) -> str:
|
| 1328 |
-
|
| 1329 |
-
|
| 1330 |
-
|
| 1331 |
-
|
| 1332 |
-
|
| 1333 |
-
|
| 1334 |
-
|
| 1335 |
-
|
| 1336 |
-
"""Normalize Unicode to latin-1 for fpdf2 core fonts."""
|
| 1337 |
-
if not text:
|
| 1338 |
-
return ""
|
| 1339 |
-
replacements = {
|
| 1340 |
-
"\u2014": "-", "\u2013": "-",
|
| 1341 |
-
"\u2018": "'", "\u2019": "'",
|
| 1342 |
-
"\u201c": '"', "\u201d": '"',
|
| 1343 |
-
"\u2026": "...",
|
| 1344 |
-
"\u2022": "*",
|
| 1345 |
-
"\u2713": "v", "\u2714": "v", # checkmark
|
| 1346 |
-
"\u2718": "x", "\u2716": "x", # cross mark
|
| 1347 |
-
"\U0001f947": "1.", "\U0001f948": "2.", "\U0001f949": "3.", # medals
|
| 1348 |
-
"\u26a0\ufe0f": "(!)", "\u26a0": "(!)", # warning
|
| 1349 |
-
"\U0001f50d": "", # magnifying glass
|
| 1350 |
-
"\U0001f5d1": "", # wastebasket
|
| 1351 |
-
}
|
| 1352 |
-
for src, dst in replacements.items():
|
| 1353 |
-
text = text.replace(src, dst)
|
| 1354 |
-
return text.encode("latin-1", errors="replace").decode("latin-1")
|
| 1355 |
-
|
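The helper removed in the rows above swaps common typographic symbols for ASCII stand-ins and then forces the result into latin-1, since the fpdf2 core fonts only cover that range. A minimal standalone sketch of the same idea follows; the name `to_latin1` and the shortened `REPLACEMENTS` table are illustrative assumptions, not the project's API.

```python
# Illustrative sketch: function name and table contents are assumptions.
REPLACEMENTS = {
    "\u2014": "-", "\u2013": "-",            # long and short dashes
    "\u2018": "'", "\u2019": "'",            # curly single quotes
    "\u201c": '"', "\u201d": '"',            # curly double quotes
    "\u2026": "...",                         # ellipsis
    "\u26a0\ufe0f": "(!)", "\u26a0": "(!)",  # warning sign, with and without variation selector
}


def to_latin1(text: str) -> str:
    """Replace known symbols, then coerce whatever is left into latin-1."""
    if not text:
        return ""
    # Dicts keep insertion order, so the two-code-point warning sign is
    # handled before its one-code-point form.
    for src, dst in REPLACEMENTS.items():
        text = text.replace(src, dst)
    # Characters still outside latin-1 become "?" instead of raising.
    return text.encode("latin-1", errors="replace").decode("latin-1")
```

Accented Italian letters such as "é" are part of latin-1 and pass through unchanged; only symbols outside that range are substituted or replaced with "?".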
| 1356 |
-
def _md_to_plain(text: str) -> str:
|
| 1357 |
-
"""Strip markdown syntax to plain text for PDF rendering."""
|
| 1358 |
-
if not text:
|
| 1359 |
-
return ""
|
| 1360 |
-
# convert markdown table rows to pipe-separated plain text
|
| 1361 |
-
def _table_row_to_plain(m):
|
| 1362 |
-
cells = [c.strip() for c in m.group(0).strip("|").split("|")]
|
| 1363 |
-
return " | ".join(c for c in cells if c)
|
| 1364 |
-
text = re.sub(r"^[-| ]+$", "", text, flags=re.MULTILINE) # separator rows
|
| 1365 |
-
text = re.sub(r"^\|.*\|$", _table_row_to_plain, text, flags=re.MULTILINE)
|
| 1366 |
-
text = re.sub(r"^#{1,6}\s+", "", text, flags=re.MULTILINE)
|
| 1367 |
-
text = re.sub(r"\*{1,2}(.+?)\*{1,2}", r"\1", text)
|
| 1368 |
-
text = re.sub(r"`{1,3}[^`]*`{1,3}", "", text)
|
| 1369 |
-
text = re.sub(r"\n{3,}", "\n\n", text)
|
| 1370 |
-
return _to_latin1(text.strip())
|
| 1371 |
-
|
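To see what the Markdown-stripping step above does to typical report text, here is a small self-contained sketch that applies the same kind of regular expressions to a heading, bold text, and a table. The name `md_to_plain` is an assumption for illustration, and the sketch leaves out the inline-code and latin-1 steps.

```python
import re


def md_to_plain(text: str) -> str:
    """Reduce simple Markdown (headings, emphasis, tables) to plain text."""
    if not text:
        return ""

    def table_row(match: re.Match) -> str:
        # "| a | b |" -> "a | b"
        cells = [c.strip() for c in match.group(0).strip("|").split("|")]
        return " | ".join(c for c in cells if c)

    text = re.sub(r"^[-| ]+$", "", text, flags=re.MULTILINE)        # table separator rows
    text = re.sub(r"^\|.*\|$", table_row, text, flags=re.MULTILINE)  # table content rows
    text = re.sub(r"^#{1,6}\s+", "", text, flags=re.MULTILINE)       # heading markers
    text = re.sub(r"\*{1,2}(.+?)\*{1,2}", r"\1", text)               # bold / italic
    text = re.sub(r"\n{3,}", "\n\n", text)                           # collapse blank runs
    return text.strip()
```

The separator-row pattern also blanks any line made only of dashes, pipes, or spaces, which removes horizontal rules such as `---` as a side effect.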
| 1372 |
-
def _numpy_to_jpeg_bytes(arr) -> bytes | None:
|
| 1373 |
-
"""Convert numpy array to JPEG bytes for embedding in PDF."""
|
| 1374 |
-
if arr is None:
|
| 1375 |
-
return None
|
| 1376 |
-
try:
|
| 1377 |
-
img = _PILImage.fromarray(arr.astype("uint8"))
|
| 1378 |
-
buf = _io.BytesIO()
|
| 1379 |
-
img.save(buf, format="JPEG", quality=85)
|
| 1380 |
-
return buf.getvalue()
|
| 1381 |
-
except Exception:
|
| 1382 |
-
return None
|
| 1383 |
-
|
| 1384 |
-
class ForensicPDF(FPDF):
|
| 1385 |
-
def header(self):
|
| 1386 |
-
self.set_font("Helvetica", "B", 10)
|
| 1387 |
-
self.set_text_color(80, 80, 80)
|
| 1388 |
-
self.cell(0, 8, "GraphoLab - Referto Forense Integrato", align="C")
|
| 1389 |
-
self.ln(2)
|
| 1390 |
-
self.set_draw_color(180, 180, 180)
|
| 1391 |
-
self.line(10, self.get_y(), 200, self.get_y())
|
| 1392 |
-
self.ln(4)
|
| 1393 |
-
|
| 1394 |
-
def footer(self):
|
| 1395 |
-
self.set_y(-15)
|
| 1396 |
-
self.set_font("Helvetica", "I", 8)
|
| 1397 |
-
self.set_text_color(130, 130, 130)
|
| 1398 |
-
self.cell(0, 10, f"Pagina {self.page_no()} - Generato da GraphoLab", align="C")
|
| 1399 |
-
|
| 1400 |
-
pdf = ForensicPDF()
|
| 1401 |
-
pdf.set_auto_page_break(auto=True, margin=18)
|
| 1402 |
-
pdf.add_page()
|
| 1403 |
-
|
| 1404 |
-
# ── Title ─────────────────────────────────────────────────────────────────
|
| 1405 |
-
pdf.set_font("Helvetica", "B", 18)
|
| 1406 |
-
pdf.set_text_color(30, 30, 30)
|
| 1407 |
-
pdf.cell(0, 12, "Referto Forense Integrato", align="C")
|
| 1408 |
-
pdf.ln(4)
|
| 1409 |
-
pdf.set_font("Helvetica", "", 10)
|
| 1410 |
-
pdf.set_text_color(100, 100, 100)
|
| 1411 |
-
now = datetime.datetime.now().strftime("%d/%m/%Y %H:%M")
|
| 1412 |
-
pdf.cell(0, 8, f"Data generazione: {now}", align="C")
|
| 1413 |
-
pdf.ln(10)
|
| 1414 |
-
|
| 1415 |
-
def _section_title(title: str):
|
| 1416 |
-
pdf.set_font("Helvetica", "B", 12)
|
| 1417 |
-
pdf.set_text_color(255, 255, 255)
|
| 1418 |
-
pdf.set_fill_color(50, 80, 120)
|
| 1419 |
-
pdf.cell(0, 8, _to_latin1(f" {title}"), fill=True)
|
| 1420 |
-
pdf.ln(12)
|
| 1421 |
-
pdf.set_text_color(30, 30, 30)
|
| 1422 |
-
|
| 1423 |
-
def _body_text(text: str):
|
| 1424 |
-
if not text:
|
| 1425 |
-
return
|
| 1426 |
-
pdf.set_font("Helvetica", "", 10)
|
| 1427 |
-
pdf.set_text_color(40, 40, 40)
|
| 1428 |
-
pdf.multi_cell(0, 5, _md_to_plain(text))
|
| 1429 |
-
pdf.ln(3)
|
| 1430 |
-
|
| 1431 |
-
def _embed_image(arr, max_w: int = 170):
|
| 1432 |
-
data = _numpy_to_jpeg_bytes(arr)
|
| 1433 |
-
if data is None:
|
| 1434 |
-
return
|
| 1435 |
-
buf = _io.BytesIO(data)
|
| 1436 |
-
img = _PILImage.open(buf)
|
| 1437 |
-
w, h = img.size
|
| 1438 |
-
ratio = min(max_w / w, 100 / h)
|
| 1439 |
-
disp_w, disp_h = w * ratio, h * ratio
|
| 1440 |
-
buf.seek(0)
|
| 1441 |
-
x = (210 - disp_w) / 2
|
| 1442 |
-
pdf.image(buf, x=x, w=disp_w, h=disp_h)
|
| 1443 |
-
pdf.ln(4)
|
| 1444 |
-
|
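The image helper above scales each picture by `min(max_w / w, 100 / h)` and centres it on a 210 mm wide page. The arithmetic can be sketched on its own as follows; `fit_within` and `centered_x` are hypothetical names, and the 170 x 100 box and 210 mm page width are the values visible in the removed code.

```python
def fit_within(w: float, h: float, max_w: float, max_h: float) -> tuple[float, float]:
    """Scale (w, h) to fit inside (max_w, max_h) while keeping the aspect ratio."""
    if w <= 0 or h <= 0:
        raise ValueError("image dimensions must be positive")
    # The tighter of the two constraints decides the scale factor.
    ratio = min(max_w / w, max_h / h)
    return w * ratio, h * ratio


def centered_x(page_w: float, disp_w: float) -> float:
    """Left offset that centres a box of width disp_w on a page of width page_w."""
    return (page_w - disp_w) / 2
```

Because the ratio is not capped at 1, small images are enlarged to fill the box, just as large ones are shrunk. The input size is in pixels while the box is in page units, so the ratio also acts as the unit conversion.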
| 1445 |
-
# ── Step 1 ────────────────────────────────────────────────────────────────
|
| 1446 |
-
_section_title("Step 1 — Rilevamento Firma (YOLOv8)")
|
| 1447 |
-
_body_text(s1_txt)
|
| 1448 |
-
_embed_image(s1_img)
|
| 1449 |
-
|
| 1450 |
-
# ── Step 2 ────────────────────────────────────────────────────────────────
|
| 1451 |
-
_section_title("Step 2 — Trascrizione HTR (EasyOCR)")
|
| 1452 |
-
_body_text(s2_txt)
|
| 1453 |
-
|
| 1454 |
-
# ── Step 3 ────────────────────────────────────────────────────────────────
|
| 1455 |
-
_section_title("Step 3 — Riconoscimento Entita' (NER)")
|
| 1456 |
-
_body_text(s3_md or "Nessuna entita' rilevata nel testo trascritto.")
|
| 1457 |
-
|
| 1458 |
-
# ── Step 4 ────────────────────────────────────────────────────────────────
|
| 1459 |
-
_section_title("Step 4 — Identificazione Scrittore")
|
| 1460 |
-
_body_text(s4_md)
|
| 1461 |
-
_embed_image(s4_img)
|
| 1462 |
-
|
| 1463 |
-
# ── Step 5 ────────────────────────────────────────────────────────────────
|
| 1464 |
-
_section_title("Step 5 — Analisi Grafologica")
|
| 1465 |
-
_body_text(s5_md)
|
| 1466 |
-
_embed_image(s5_img)
|
| 1467 |
-
|
| 1468 |
-
# ── Step 6 ────────────────────────────────────────────────────────────────
|
| 1469 |
-
_section_title("Step 6 — Verifica Firma (SigNet)")
|
| 1470 |
-
_body_text(s6_txt)
|
| 1471 |
-
_embed_image(s6_img)
|
| 1472 |
-
|
| 1473 |
-
# ── Step 7: LLM ───────────────────────────────────────────────────────────
|
| 1474 |
-
_section_title("Step 7 — Valutazione LLM (Ollama)")
|
| 1475 |
-
_body_text(llm_text)
|
| 1476 |
-
|
| 1477 |
-
# ── Disclaimer ────────────────────────────────────────────────────────────
|
| 1478 |
-
pdf.ln(6)
|
| 1479 |
-
pdf.set_draw_color(180, 180, 180)
|
| 1480 |
-
pdf.line(10, pdf.get_y(), 200, pdf.get_y())
|
| 1481 |
-
pdf.ln(4)
|
| 1482 |
-
pdf.set_font("Helvetica", "I", 8)
|
| 1483 |
-
pdf.set_text_color(120, 120, 120)
|
| 1484 |
-
pdf.multi_cell(
|
| 1485 |
-
0, 4,
|
| 1486 |
-
"Referto generato automaticamente da GraphoLab. "
|
| 1487 |
-
"Tutti i risultati hanno carattere indicativo e devono essere valutati "
|
| 1488 |
-
"da un perito calligrafo qualificato.",
|
| 1489 |
-
)
|
| 1490 |
-
|
| 1491 |
-
# ── Save to disk ──────────────────────────────────────────────────────────
|
| 1492 |
-
tmp = tempfile.NamedTemporaryFile(
|
| 1493 |
-
suffix=".pdf", prefix="grapholab_referto_", delete=False
|
| 1494 |
)
|
| 1495 |
-
|
| 1496 |
-
return tmp.name
|
| 1497 |
|
| 1498 |
|
| 1499 |
with gr.Blocks() as pipeline_tab:
|
|
@@ -1514,14 +424,8 @@ with gr.Blocks() as pipeline_tab:
|
|
| 1514 |
)
|
| 1515 |
|
| 1516 |
with gr.Row():
|
| 1517 |
-
pipe_doc = gr.Image(
|
| 1518 |
-
|
| 1519 |
-
type="numpy",
|
| 1520 |
-
)
|
| 1521 |
-
pipe_ref = gr.Image(
|
| 1522 |
-
label="Firma di riferimento nota — opzionale (per Step 6)",
|
| 1523 |
-
type="numpy",
|
| 1524 |
-
)
|
| 1525 |
|
| 1526 |
pipe_btn = gr.Button("▶ Avvia Analisi Forense", variant="primary", size="lg")
|
| 1527 |
|
|
@@ -1588,7 +492,7 @@ with gr.Blocks() as pipeline_tab:
|
|
| 1588 |
)
|
| 1589 |
|
| 1590 |
pdf_btn.click(
|
| 1591 |
-
fn=
|
| 1592 |
inputs=[
|
| 1593 |
out_s1_img, out_s1_txt,
|
| 1594 |
out_s2_txt,
|
|
@@ -1601,146 +505,20 @@ with gr.Blocks() as pipeline_tab:
|
|
| 1601 |
outputs=pdf_out,
|
| 1602 |
)
|
| 1603 |
|
| 1604 |
-
|
| 1605 |
# ──────────────────────────────────────────────────────────────────────────────
|
| 1606 |
-
# Tab 8 —
|
| 1607 |
# ──────────────────────────────────────────────────────────────────────────────
|
| 1608 |
|
| 1609 |
-
|
| 1610 |
-
|
| 1611 |
-
|
| 1612 |
-
# Regex L1: Italian and numeric dates
|
| 1613 |
-
_DATE_PATTERNS = [
|
| 1614 |
-
# "10 gennaio 2024" / "10 gennaio del 2024"
|
| 1615 |
-
r"\b(\d{1,2})\s+(gennaio|febbraio|marzo|aprile|maggio|giugno|"
|
| 1616 |
-
r"luglio|agosto|settembre|ottobre|novembre|dicembre)\s+(?:del\s+)?(\d{4})\b",
|
| 1617 |
-
# "10 gen. 2024" / abbreviated month names
|
| 1618 |
-
r"\b(\d{1,2})\s+(gen|feb|mar|apr|mag|giu|lug|ago|set|ott|nov|dic)\.?\s+(\d{4})\b",
|
| 1619 |
-
# "10/01/2024" or "10-01-2024" or "10.01.2024"
|
| 1620 |
-
r"\b(\d{1,2})[\/\-\.](\d{1,2})[\/\-\.](\d{2,4})\b",
|
| 1621 |
-
# "gennaio 2024" (no day)
|
| 1622 |
-
r"\b(gennaio|febbraio|marzo|aprile|maggio|giugno|"
|
| 1623 |
-
r"luglio|agosto|settembre|ottobre|novembre|dicembre)\s+(\d{4})\b",
|
| 1624 |
-
]
|
| 1625 |
-
_DATE_RE = _re.compile("|".join(_DATE_PATTERNS), _re.IGNORECASE)
|
| 1626 |
-
|
| 1627 |
-
|
| 1628 |
-
def _try_dateparser(raw: str) -> _datetime | None:
|
| 1629 |
-
"""Parse a raw date string to datetime using dateparser (Italian-aware)."""
|
| 1630 |
-
try:
|
| 1631 |
-
import dateparser
|
| 1632 |
-
dt = dateparser.parse(
|
| 1633 |
-
raw,
|
| 1634 |
-
languages=["it", "en"],
|
| 1635 |
-
settings={"PREFER_DAY_OF_MONTH": "first", "RETURN_AS_TIMEZONE_AWARE": False},
|
| 1636 |
-
)
|
| 1637 |
-
if dt and 1800 < dt.year < 2200:
|
| 1638 |
-
return dt
|
| 1639 |
-
except Exception:
|
| 1640 |
-
pass
|
| 1641 |
-
return None
|
| 1642 |
-
|
| 1643 |
-
|
| 1644 |
-
def extract_dates(text: str) -> list[tuple[str, _datetime]]:
|
| 1645 |
-
"""Extract and normalize dates from OCR text.
|
| 1646 |
-
|
| 1647 |
-
Returns a list of (raw_string, datetime) pairs, sorted chronologically.
|
| 1648 |
-
Uses regex L1 first; falls back to scanning NER DATE entities if nothing found.
|
| 1649 |
-
"""
|
| 1650 |
-
found: list[tuple[str, _datetime]] = []
|
| 1651 |
-
|
| 1652 |
-
# L1 — regex
|
| 1653 |
-
_BIRTH_KW = ("nata", "nato", "nascita", "nasc.", "nata il", "nato il")
|
| 1654 |
-
for m in _DATE_RE.finditer(text):
|
| 1655 |
-
raw = m.group(0).strip()
|
| 1656 |
-
context_before = text[max(0, m.start() - 35) : m.start()].lower()
|
| 1657 |
-
if any(kw in context_before for kw in _BIRTH_KW):
|
| 1658 |
-
continue # birth date, not a drafting date: skip it
|
| 1659 |
-
dt = _try_dateparser(raw)
|
| 1660 |
-
if dt:
|
| 1661 |
-
found.append((raw, dt))
|
| 1662 |
-
|
| 1663 |
-
# L2 — NER fallback (filters DATE entities from wikineural NER)
|
| 1664 |
-
if not found:
|
| 1665 |
-
try:
|
| 1666 |
-
ner_html, ner_md = ner_extract(text)
|
| 1667 |
-
# Extract DATE spans from NER Markdown (pattern: **WORD** `DATE`)
|
| 1668 |
-
for raw in _re.findall(r"\*\*([^*]+)\*\*\s*`DATE`", ner_md or ""):
|
| 1669 |
-
dt = _try_dateparser(raw)
|
| 1670 |
-
if dt:
|
| 1671 |
-
found.append((raw, dt))
|
| 1672 |
-
except Exception:
|
| 1673 |
-
pass
|
| 1674 |
-
|
| 1675 |
-
# De-duplicate by normalized date
|
| 1676 |
-
seen: set[str] = set()
|
| 1677 |
-
unique: list[tuple[str, _datetime]] = []
|
| 1678 |
-
for raw, dt in found:
|
| 1679 |
-
key = dt.strftime("%Y-%m-%d")
|
| 1680 |
-
if key not in seen:
|
| 1681 |
-
seen.add(key)
|
| 1682 |
-
unique.append((raw, dt))
|
| 1683 |
-
|
| 1684 |
-
return sorted(unique, key=lambda x: x[1])
|
| 1685 |
-
|
| 1686 |
-
|
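The extraction flow above (regex match, parse, de-duplicate on the normalized date, sort chronologically) can be tried without `dateparser`. The sketch below uses only the standard library and covers numeric day/month/year dates; `extract_numeric_dates` is a hypothetical name, and the Italian month names and the birth-date keyword filter from the removed code are left out.

```python
import re
from datetime import datetime

# Day, month and four-digit year separated by "/", "-" or ".".
NUMERIC_DATE = re.compile(r"\b(\d{1,2})[/\-.](\d{1,2})[/\-.](\d{4})\b")


def extract_numeric_dates(text: str) -> list[tuple[str, datetime]]:
    """Return (raw_string, datetime) pairs, unique by calendar day, oldest first."""
    found: list[tuple[str, datetime]] = []
    seen: set[str] = set()
    for match in NUMERIC_DATE.finditer(text):
        day, month, year = (int(g) for g in match.groups())
        try:
            dt = datetime(year, month, day)
        except ValueError:
            continue  # impossible calendar date, e.g. 31/02/2024
        key = dt.strftime("%Y-%m-%d")
        if key not in seen:  # keep the first spelling of each date
            seen.add(key)
            found.append((match.group(0), dt))
    return sorted(found, key=lambda pair: pair[1])
```

De-duplicating on the normalized key rather than on the raw string means "10/01/2024" and "10.01.2024" count as one date.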
| 1687 |
-
def dating_rank(files: list) -> str:
|
| 1688 |
-
"""Main function for the Datazione Documenti tab.
|
| 1689 |
-
|
| 1690 |
-
Accepts a list of uploaded files (gr.File objects), runs OCR on each,
|
| 1691 |
-
extracts dates, and returns a Markdown table sorted chronologically.
|
| 1692 |
-
"""
|
| 1693 |
if not files:
|
| 1694 |
return "Carica almeno un'immagine di documento."
|
| 1695 |
-
|
| 1696 |
-
|
| 1697 |
-
rows: list[tuple[str, str, _datetime | None]] = []
|
| 1698 |
-
|
| 1699 |
-
for f in files:
|
| 1700 |
-
path = f.name if hasattr(f, "name") else str(f)
|
| 1701 |
-
name = path.split("\\")[-1].split("/")[-1]
|
| 1702 |
-
try:
|
| 1703 |
-
img = Image.open(path).convert("RGB")
|
| 1704 |
-
img_np = np.array(img)
|
| 1705 |
-
ocr_lines = reader.readtext(img_np, detail=0, paragraph=False)
|
| 1706 |
-
text = "\n".join(ocr_lines)
|
| 1707 |
-
dates = extract_dates(text)
|
| 1708 |
-
if dates:
|
| 1709 |
-
raw, dt = dates[-1] # most recent date = assumed drafting date
|
| 1710 |
-
rows.append((name, raw, dt))
|
| 1711 |
-
else:
|
| 1712 |
-
rows.append((name, "— data non trovata", None))
|
| 1713 |
-
except Exception as e:
|
| 1714 |
-
rows.append((name, f"Errore: {e}", None))
|
| 1715 |
-
|
| 1716 |
-
# Sort: dated docs first (chronologically), undated last
|
| 1717 |
-
dated = [(n, r, dt) for n, r, dt in rows if dt is not None]
|
| 1718 |
-
undated = [(n, r, dt) for n, r, dt in rows if dt is None]
|
| 1719 |
-
dated.sort(key=lambda x: x[2])
|
| 1720 |
-
sorted_rows = dated + undated
|
| 1721 |
-
|
| 1722 |
-
lines = [
|
| 1723 |
-
"## Datazione Documenti — Risultati\n",
|
| 1724 |
-
"| # | Documento | Data estratta | Data normalizzata |",
|
| 1725 |
-
"|---|-----------|--------------|-------------------|",
|
| 1726 |
-
]
|
| 1727 |
-
for i, (name, raw, dt) in enumerate(sorted_rows, 1):
|
| 1728 |
-
norm = dt.strftime("%Y-%m-%d") if dt else "—"
|
| 1729 |
-
lines.append(f"| {i} | `{name}` | {raw} | {norm} |")
|
| 1730 |
-
|
| 1731 |
-
if not dated:
|
| 1732 |
-
lines.append("\n> Nessuna data rilevata nei documenti caricati.")
|
| 1733 |
-
else:
|
| 1734 |
-
lines.append(
|
| 1735 |
-
f"\n*{len(dated)} document{'o' if len(dated)==1 else 'i'} datato/i, "
|
| 1736 |
-
f"{len(undated)} senza data.*"
|
| 1737 |
-
)
|
| 1738 |
-
|
| 1739 |
-
return "\n".join(lines)
|
| 1740 |
|
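The ranking step above puts dated documents first in chronological order, appends undated ones, and renders a Markdown table. A compact sketch of that ordering and rendering, with hypothetical names (`order_rows`, `to_markdown`) and English column headers chosen for illustration:

```python
from datetime import datetime
from typing import Optional

Row = tuple[str, str, Optional[datetime]]  # (file name, raw date text, parsed date)


def order_rows(rows: list[Row]) -> list[Row]:
    """Dated rows first, oldest to newest; undated rows keep their input order."""
    dated = sorted((r for r in rows if r[2] is not None), key=lambda r: r[2])
    undated = [r for r in rows if r[2] is None]
    return dated + undated


def to_markdown(rows: list[Row]) -> str:
    """Render the ordered rows as a Markdown table with a running index."""
    lines = ["| # | Document | Raw date | ISO date |", "|---|---|---|---|"]
    for i, (name, raw, dt) in enumerate(rows, 1):
        iso = dt.strftime("%Y-%m-%d") if dt else "n/a"
        lines.append(f"| {i} | `{name}` | {raw} | {iso} |")
    return "\n".join(lines)
```

Splitting the list before sorting avoids comparing `None` with `datetime`, which would raise a `TypeError` in a single `sort` call.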
| 1741 |
|
| 1742 |
dating_tab = gr.Interface(
|
| 1743 |
-
fn=
|
| 1744 |
inputs=gr.File(
|
| 1745 |
label="Immagini documenti (carica 2 o più)",
|
| 1746 |
file_count="multiple",
|
|
@@ -1757,505 +535,10 @@ dating_tab = gr.Interface(
|
|
| 1757 |
),
|
| 1758 |
)
|
| 1759 |
|
| 1760 |
-
|
| 1761 |
# ──────────────────────────────────────────────────────────────────────────────
|
| 1762 |
# Tab 9 — Consulente Forense IA (RAG + Ollama)
|
| 1763 |
# ──────────────────────────────────────────────────────────────────────────────
|
| 1764 |
|
| 1765 |
-
_RAG_SYNTHETIC_DOCS = [
|
| 1766 |
-
(
|
| 1767 |
-
"Analisi della pressione",
|
| 1768 |
-
"La pressione grafica indica la forza con cui la penna o la matita viene premuta sul foglio. "
|
| 1769 |
-
"Una pressione forte (tratti profondi, rilevabili anche sul retro del foglio) è associata a "
|
| 1770 |
-
"carattere deciso, vitalità e a volte aggressività. Una pressione leggera (tratti quasi "
|
| 1771 |
-
"impercettibili) può indicare sensibilità, adattabilità o, in contesti patologici, stanchezza "
|
| 1772 |
-
"e astenia. La pressione irregolare — alternanza di tratti forti e deboli nello stesso scritto — "
|
| 1773 |
-
"può segnalare instabilità emotiva, stati di ansia o condizioni neurologiche. In grafologia "
|
| 1774 |
-
"forense la pressione è fondamentale per distinguere scritture apposte in condizioni normali "
|
| 1775 |
-
"da quelle prodotte sotto costrizione fisica o psicologica.",
|
| 1776 |
-
),
|
| 1777 |
-
(
|
| 1778 |
-
"Inclinazione del tratto",
|
| 1779 |
-
"L'inclinazione della scrittura descrive l'angolo dei tratti verticali delle lettere rispetto "
|
| 1780 |
-
"alla riga di base. Una scrittura verticale (0°) indica equilibrio e obiettività. "
|
| 1781 |
-
"L'inclinazione a destra (>15°) è associata a estroversia, impulsività e orientamento verso "
|
| 1782 |
-
"il futuro. L'inclinazione a sinistra (<−10°) può indicare introversione, tendenza al ripiegamento "
|
| 1783 |
-
"su se stessi o, in contesti forensi, un tentativo di camuffare la propria calligrafia. "
|
| 1784 |
-
"L'inclinazione variabile (misto destra/sinistra nello stesso testo) è indicatore di "
|
| 1785 |
-
"instabilità emotiva. La misurazione forense dell'inclinazione avviene tramite analisi "
|
| 1786 |
-
"angolare dei tratti ascendenti (h, l, b, f) e discendenti (g, p, q).",
|
| 1787 |
-
),
|
| 1788 |
-
(
|
| 1789 |
-
"Spaziatura grafica",
|
| 1790 |
-
"La spaziatura riguarda la distanza tra lettere, parole e righe. Spaziatura ampia tra le parole "
|
| 1791 |
-
"indica bisogno di spazio personale, pensiero indipendente e, talvolta, solitudine. "
|
| 1792 |
-
"Spaziatura ridotta (parole quasi attaccate) è correlata a socievolezza eccessiva, difficoltà "
|
| 1793 |
-
"nei confini relazionali e, in casi estremi, pensiero confusionario. La spaziatura irregolare — "
|
| 1794 |
-
"alternanza di parole distanti e ravvicinate — è un indicatore di disorganizzazione cognitiva "
|
| 1795 |
-
"o di scrittura non spontanea (es. copiatura o dettatura lenta). In perizie forensi, "
|
| 1796 |
-
"la spaziatura viene misurata in millimetri su campioni standardizzati.",
|
| 1797 |
-
),
|
| 1798 |
-
(
|
| 1799 |
-
"Margini e layout",
|
| 1800 |
-
"I margini del foglio riflettono il rapporto dello scrittore con l'ambiente e il contesto "
|
| 1801 |
-
"sociale. Un margine sinistro ampio e costante indica rispetto delle regole e pianificazione. "
|
| 1802 |
-
"Un margine sinistro che si allarga progressivamente (testo che 'scivola' verso destra) "
|
| 1803 |
-
"suggerisce entusiasmo crescente o impulsività. Margine destro ampio è associato a prudenza, "
|
| 1804 |
-
"timore del futuro e riservatezza. L'assenza di margini (testo che occupa tutto il foglio) "
|
| 1805 |
-
"indica esuberanza comunicativa o senso di urgenza. In perizia, il margine aiuta a "
|
| 1806 |
-
"distinguere scritti autentici da trascrizioni o copie, poiché l'autore mantiene "
|
| 1807 |
-
"inconsciamente le proprie abitudini spaziali.",
|
| 1808 |
-
),
|
| 1809 |
-
(
|
| 1810 |
-
"Firme autentiche",
|
| 1811 |
-
"Una firma autentica possiede caratteristiche di naturalezza e fluidità del movimento. "
|
| 1812 |
-
"I tratti sono continui, con accelerazione e decelerazione tipiche del gesto automatizzato. "
|
| 1813 |
-
"La pressione varia in modo coerente con il ritmo del tratto. I legamenti tra le lettere "
|
| 1814 |
-
"sono coerenti con il corpus grafico dello scrittore. La firma autentica presenta micro-tremori "
|
| 1815 |
-
"naturali (diversi dai tremori patologici) e piccole variazioni tra esecuzioni successive, "
|
| 1816 |
-
"mai perfettamente identiche. In perizia calligrafica, si confrontano almeno 10-15 firme "
|
| 1817 |
-
"autentiche per stabilire la 'gamma di variazione naturale' prima di esaminare la firma contestata.",
|
| 1818 |
-
),
|
| 1819 |
-
(
|
| 1820 |
-
"Firme false",
|
| 1821 |
-
"Le firme contraffatte si distinguono per diversi indicatori: velocità di esecuzione "
|
| 1822 |
-
"innaturalmente lenta (visibile nei 'tocchi' del pennino e nelle esitazioni), tremori "
|
| 1823 |
-
"artificiali (regolari, non spontanei), ritocchi e correzioni del tratto, interruzioni "
|
| 1824 |
-
"anomale del gesto. La falsificazione per imitazione diretta (calco o copia visiva) produce "
|
| 1825 |
-
"una firma con aspetto simile all'originale ma con movimenti invertiti rispetto alla direzione "
|
| 1826 |
-
"naturale. Il falsario tende a concentrarsi sulla forma complessiva trascurando i dettagli "
|
| 1827 |
-
"minuti (proporzioni tra lettere, angolo di attacco del tratto, pressione). "
|
| 1828 |
-
"L'analisi forense utilizza ingrandimenti 10x-40x e, nei casi dubbi, grafometria digitale.",
|
| 1829 |
-
),
|
| 1830 |
-
(
|
| 1831 |
-
"Velocità e ritmo",
|
| 1832 |
-
"La velocità di scrittura si manifesta nella forma delle lettere (semplificazione dei tratti "
|
| 1833 |
-
"in scrittura rapida), nell'inclinazione (più marcata ad alta velocità), nelle legature "
|
| 1834 |
-
"(frequenti in scrittura veloce, assenti in quella lenta). Il ritmo è la regolarità con cui "
|
| 1835 |
-
"si alternano tensione e distensione nel movimento grafico. Un ritmo regolare indica "
|
| 1836 |
-
"equilibrio psicofisico. Un ritmo aritmico (alternanza caotica di tratti tesi e distesi) "
|
| 1837 |
-
"può segnalare stati emotivi alterati, patologie neurologiche o scrittura non spontanea. "
|
| 1838 |
-
"In perizia forense la velocità è cruciale: una firma depositata 'lentamente' da una persona "
|
| 1839 |
-
"abitualmente veloce è un forte indicatore di contraffazione.",
|
| 1840 |
-
),
|
| 1841 |
-
(
|
| 1842 |
-
"Datazione documenti",
|
| 1843 |
-
"La datazione grafica di un documento si basa su elementi intrinseci ed estrinseci. "
|
| 1844 |
-
"Elementi intrinseci: evoluzione dello stile grafico dell'autore nel tempo (campioni noti "
|
| 1845 |
-
"datati permettono di costruire una 'curva di evoluzione'), deterioramento della calligrafia "
|
| 1846 |
-
"legato all'età, variazioni nelle abitudini punteggiatura e abbreviazioni. "
|
| 1847 |
-
"Elementi estrinseci: tipo di inchiostro (analisi spettroscopica), supporto cartaceo "
|
| 1848 |
-
"(filigrana, composizione chimica), strumento di scrittura (biro, stilografica, matita). "
|
| 1849 |
-
"L'analisi dell'inchiostro mediante cromatografia liquida può stabilire se l'inchiostro "
|
| 1850 |
-
"è compatibile con la data dichiarata. In perizia, la datazione grafica va sempre "
|
| 1851 |
-
"abbinata ad analisi chimiche per raggiungere un grado di certezza forense.",
|
| 1852 |
-
),
|
| 1853 |
-
]
|
| 1854 |
-
|
| 1855 |
-
_RAG_KNOWLEDGE_DIR = ROOT / "data" / "knowledge"
|
| 1856 |
-
|
| 1857 |
-
|
| 1858 |
-
def _chunk_text(text: str, source: str, size: int = 500, overlap: int = 50) -> list:
|
| 1859 |
-
chunks = []
|
| 1860 |
-
start = 0
|
| 1861 |
-
while start < len(text):
|
| 1862 |
-
end = min(start + size, len(text))
|
| 1863 |
-
chunk = text[start:end].strip()
|
| 1864 |
-
if chunk:
|
| 1865 |
-
chunks.append({"text": chunk, "source": source, "emb": None})
|
| 1866 |
-
start += size - overlap
|
| 1867 |
-
return chunks
|
| 1868 |
-
|
| 1869 |
-
|
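The chunker above slides a fixed-size window over the text and advances by `size - overlap` characters each step. The sketch below shows the same windowing on plain strings, plus a guard that the removed code does not have: if `overlap` were equal to or larger than `size`, the original `while` loop would never advance. The name `chunk_text` and the guard are assumptions for illustration.

```python
def chunk_text(text: str, size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into windows of `size` characters that share `overlap` characters."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    step = size - overlap  # how far the window moves each time
    chunks = []
    for start in range(0, len(text), step):
        piece = text[start:start + size].strip()
        if piece:  # skip windows that are only whitespace
            chunks.append(piece)
    return chunks
```

With `size=4, overlap=1` the string `"abcdefghij"` yields `abcd`, `defg`, `ghij`, and a short final `j`, so each chunk starts with the last character of the previous one.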
| 1870 |
-
def _ollama_embed(text: str):
|
| 1871 |
-
try:
|
| 1872 |
-
r = _requests.post(
|
| 1873 |
-
f"{OLLAMA_URL}/api/embeddings",
|
| 1874 |
-
json={"model": _embed_model, "prompt": text},
|
| 1875 |
-
timeout=30,
|
| 1876 |
-
)
|
| 1877 |
-
return np.array(r.json()["embedding"], dtype=np.float32)
|
| 1878 |
-
except Exception:
|
| 1879 |
-
return None
|
| 1880 |
-
|
| 1881 |
-
|
| 1882 |
-
def _ollama_embed_batch(texts: list) -> list:
|
| 1883 |
-
"""Embed a list of texts in a single Ollama call (/api/embed, Ollama >= 0.1.26).
|
| 1884 |
-
Returns a list of np.ndarray (or None on error). Falls back to sequential
|
| 1885 |
-
_ollama_embed calls if the batch endpoint is unavailable.
|
| 1886 |
-
"""
|
| 1887 |
-
try:
|
| 1888 |
-
r = _requests.post(
|
| 1889 |
-
f"{OLLAMA_URL}/api/embed",
|
| 1890 |
-
json={"model": _embed_model, "input": texts},
|
| 1891 |
-
timeout=max(30, len(texts) * 3),
|
| 1892 |
-
)
|
| 1893 |
-
r.raise_for_status()
|
| 1894 |
-
data = r.json()
|
| 1895 |
-
embeddings = data.get("embeddings") or data.get("embedding")
|
| 1896 |
-
if embeddings and len(embeddings) == len(texts):
|
| 1897 |
-
return [np.array(e, dtype=np.float32) for e in embeddings]
|
| 1898 |
-
except Exception:
|
| 1899 |
-
pass
|
| 1900 |
-
# Fallback: sequential calls
|
| 1901 |
-
return [_ollama_embed(t) for t in texts]
|
| 1902 |
-
|
| 1903 |
-
|
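The batch embedding function above tries one bulk request and falls back to per-text requests when the bulk call fails or returns the wrong number of vectors. That control flow can be separated from the HTTP details by passing the two calls in as functions; `embed_with_fallback`, `embed_batch` and `embed_one` are hypothetical names in this sketch, not Ollama or project APIs.

```python
from typing import Callable, Optional, Sequence

Vector = list[float]


def embed_with_fallback(
    texts: Sequence[str],
    embed_batch: Callable[[Sequence[str]], list[Vector]],
    embed_one: Callable[[str], Optional[Vector]],
) -> list[Optional[Vector]]:
    """Try one bulk call; on failure or a length mismatch, embed text by text."""
    try:
        vectors = embed_batch(texts)
        if vectors and len(vectors) == len(texts):
            return list(vectors)
    except Exception:
        pass  # bulk endpoint missing or failed: fall through to the slow path
    return [embed_one(t) for t in texts]
```

Checking the length before accepting the bulk result keeps texts and vectors aligned, which matters because callers pair them up by position.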
| 1904 |
-
def _ollama_list_models() -> list:
|
| 1905 |
-
"""Return sorted list of model names available in Ollama."""
|
| 1906 |
-
try:
|
| 1907 |
-
r = _requests.get(f"{OLLAMA_URL}/api/tags", timeout=3)
|
| 1908 |
-
models = [m["name"] for m in r.json().get("models", [])]
|
| 1909 |
-
return sorted(models) if models else [OLLAMA_MODEL]
|
| 1910 |
-
except Exception:
|
| 1911 |
-
return [OLLAMA_MODEL]
|
| 1912 |
-
|
| 1913 |
-
|
| 1914 |
-
def set_rag_model(model_name: str) -> str:
|
| 1915 |
-
global _rag_model
|
| 1916 |
-
if model_name:
|
| 1917 |
-
_rag_model = model_name
|
| 1918 |
-
return f"✅ Modello attivo: **{_rag_model}**"
|
| 1919 |
-
|
| 1920 |
-
|
| 1921 |
-
def _cosine_top_k(query_emb: np.ndarray, k: int = 3) -> list:
|
| 1922 |
-
if not _rag_chunks:
|
| 1923 |
-
return []
|
| 1924 |
-
valid = [c for c in _rag_chunks if c["emb"] is not None]
|
| 1925 |
-
if not valid:
|
| 1926 |
-
return []
|
| 1927 |
-
embs = np.stack([c["emb"] for c in valid]) # stack only after the empty check: np.stack raises on an empty list
|
| 1928 |
-
q = query_emb / (np.linalg.norm(query_emb) + 1e-9)
|
| 1929 |
-
norms = np.linalg.norm(embs, axis=1, keepdims=True) + 1e-9
|
| 1930 |
-
scores = (embs / norms) @ q
|
| 1931 |
-
idxs = np.argsort(scores)[::-1][:k]
|
| 1932 |
-
return [(float(scores[i]), valid[i]) for i in idxs]
|
| 1933 |
-
|
| 1934 |
-
|
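The retrieval function above ranks stored chunks by cosine similarity to the query embedding and keeps the best `k`. The removed code does this with NumPy; the sketch below is a dependency-free version of the same ranking, written as an assumption-laden illustration (`cosine_top_k` and the `(label, vector)` item shape are hypothetical).

```python
import math


def cosine_top_k(query, items, k=3):
    """Rank (label, vector) items by cosine similarity to query.

    Items whose vector is None are ignored. Returns [(score, label)], best first.
    """
    valid = [(label, vec) for label, vec in items if vec is not None]
    if not valid:
        return []

    def unit(v):
        # Small epsilon avoids division by zero for all-zero vectors.
        norm = math.sqrt(sum(x * x for x in v)) + 1e-9
        return [x / norm for x in v]

    q = unit(query)
    scored = [
        (sum(a * b for a, b in zip(unit(vec), q)), label)
        for label, vec in valid
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]
```

Filtering out missing embeddings before any vector math means a knowledge base whose chunks have not been embedded yet returns an empty result instead of failing.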
| 1935 |
-
def _rag_cache_path(filename: str, file_bytes: bytes) -> Path:
|
| 1936 |
-
h = _hashlib.sha256(file_bytes).hexdigest()[:8]
|
| 1937 |
-
stem = Path(filename).stem[:40]
|
| 1938 |
-
safe = "".join(c if c.isalnum() or c in "-_" else "_" for c in stem)
|
| 1939 |
-
return _RAG_CACHE_DIR / f"{safe}_{h}.npz"
|
| 1940 |
-
|
| 1941 |
-
|
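The cache path above combines a sanitized file stem with the first 8 hex digits of the file's SHA-256, so a re-uploaded file with the same name but different bytes gets a fresh cache entry. A standalone sketch of that naming scheme, where `cache_path` and the `cache_dir` parameter are hypothetical stand-ins for the module-level names in the removed code:

```python
import hashlib
from pathlib import Path


def cache_path(cache_dir: Path, filename: str, file_bytes: bytes) -> Path:
    """Build '<safe-stem>_<8 hex digits of sha256>.npz' inside cache_dir."""
    digest = hashlib.sha256(file_bytes).hexdigest()[:8]
    stem = Path(filename).stem[:40]  # cap the length of the readable part
    # Keep letters, digits, "-" and "_"; everything else becomes "_".
    safe = "".join(c if c.isalnum() or c in "-_" else "_" for c in stem)
    return cache_dir / f"{safe}_{digest}.npz"
```

Eight hex digits are 32 bits of the hash, which is enough to tell versions of one document apart but is not a collision-proof identifier across a large corpus.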
| 1942 |
-
def _rag_cache_save(cache_path: Path, chunks: list, filename: str) -> None:
|
| 1943 |
-
_RAG_CACHE_DIR.mkdir(parents=True, exist_ok=True)
|
| 1944 |
-
good = [c for c in chunks if c["emb"] is not None]
|
| 1945 |
-
if not good:
|
| 1946 |
-
return
|
| 1947 |
-
texts = np.array([c["text"] for c in good], dtype=object)
|
| 1948 |
-
sources = np.array([c["source"] for c in good], dtype=object)
|
| 1949 |
-
embs = np.stack([c["emb"] for c in good])
|
| 1950 |
-
np.savez_compressed(
|
| 1951 |
-
str(cache_path),
|
| 1952 |
-
texts=texts,
|
| 1953 |
-
sources=sources,
|
| 1954 |
-
embs=embs,
|
| 1955 |
-
filename=np.array(filename, dtype=object),
|
| 1956 |
-
)
|
| 1957 |
-
|
| 1958 |
-
|
| 1959 |
-
def _rag_cache_load(cache_path: Path) -> tuple:
|
| 1960 |
-
"""Returns (chunks, original_filename)."""
|
| 1961 |
-
data = np.load(str(cache_path), allow_pickle=True)
|
| 1962 |
-
filename = str(data["filename"])
|
| 1963 |
-
chunks = [
|
| 1964 |
-
{"text": str(t), "source": str(s), "emb": e}
|
| 1965 |
-
for t, s, e in zip(data["texts"], data["sources"], data["embs"])
|
| 1966 |
-
]
|
| 1967 |
-
return chunks, filename
|
| 1968 |
-
|
| 1969 |
-
|
| 1970 |
-
def _rag_doc_list() -> list:
|
| 1971 |
-
"""Return rows [[filename, chunk_count]] for gr.Dataframe (user docs only)."""
|
| 1972 |
-
synthetic_sources = {s for s, _ in _RAG_SYNTHETIC_DOCS}
|
| 1973 |
-
counts: dict = {}
|
| 1974 |
-
for c in _rag_chunks:
|
| 1975 |
-
src = c["source"]
|
| 1976 |
-
if src not in synthetic_sources:
|
| 1977 |
-
counts[src] = counts.get(src, 0) + 1
|
| 1978 |
-
return [[name, cnt] for name, cnt in sorted(counts.items())]
|
| 1979 |
-
|
| 1980 |
-
|
| 1981 |
-
def _rag_doc_choices() -> list:
|
| 1982 |
-
return [row[0] for row in _rag_doc_list()]
|
| 1983 |
-
|
| 1984 |
-
|
| 1985 |
-
def _extract_pdf_text(path: Path) -> str:
|
| 1986 |
-
"""Extract text from a PDF, falling back to EasyOCR for scanned pages."""
|
| 1987 |
-
full_text = []
|
| 1988 |
-
try:
|
| 1989 |
-
import pypdf
|
| 1990 |
-
except ImportError:
|
| 1991 |
-
print(f"[RAG] pypdf not installed — skipping {path.name}")
|
| 1992 |
-
return ""
|
| 1993 |
-
try:
|
| 1994 |
-
reader = pypdf.PdfReader(str(path))
|
| 1995 |
-
for page_num, page in enumerate(reader.pages):
|
| 1996 |
-
page_text = page.extract_text() or ""
|
| 1997 |
-
if len(page_text.strip()) >= 50:
|
| 1998 |
-
full_text.append(page_text)
|
| 1999 |
-
else:
|
| 2000 |
-
# Scanned page — render to image and OCR
|
| 2001 |
-
try:
|
| 2002 |
-
import fitz # pymupdf
|
| 2003 |
-
doc = fitz.open(str(path))
|
| 2004 |
-
fitz_page = doc[page_num]
|
| 2005 |
-
mat = fitz.Matrix(150 / 72, 150 / 72) # 150 DPI
|
| 2006 |
-
pix = fitz_page.get_pixmap(matrix=mat)
|
| 2007 |
-
img_arr = np.frombuffer(pix.samples, dtype=np.uint8).reshape(
|
| 2008 |
-
pix.height, pix.width, pix.n
|
| 2009 |
-
)
|
| 2010 |
-
if pix.n == 4:
|
| 2011 |
-
img_arr = img_arr[:, :, :3]
|
| 2012 |
-
ocr_result = get_easyocr().readtext(img_arr, detail=0, paragraph=True)
|
| 2013 |
-
full_text.append(" ".join(ocr_result))
|
| 2014 |
-
doc.close()
|
| 2015 |
-
except ImportError:
|
| 2016 |
-
print(f"[RAG] pymupdf not installed — cannot OCR scanned page {page_num+1} of {path.name}")
|
| 2017 |
-
except Exception as e:
|
| 2018 |
-
print(f"[RAG] OCR error on page {page_num+1} of {path.name}: {e}")
|
| 2019 |
-
except Exception as e:
|
| 2020 |
-
print(f"[RAG] Error reading PDF {path.name}: {e}")
|
| 2021 |
-
return "\n".join(full_text)
|
| 2022 |
-
|
| 2023 |
-
|
| 2024 |
-
def _rag_load_docs():
|
| 2025 |
-
global _rag_chunks, _rag_indexed_files, _rag_ready
|
| 2026 |
-
with _rag_lock:
|
| 2027 |
-
chunks: list = []
|
| 2028 |
-
|
| 2029 |
-
# Synthetic built-in knowledge (always re-embedded at startup)
|
| 2030 |
-
for source, text in _RAG_SYNTHETIC_DOCS:
|
| 2031 |
-
chunks.extend(_chunk_text(text, source))
|
| 2032 |
-
|
| 2033 |
-
# Load cached user documents (pre-embedded — no Ollama calls needed)
|
| 2034 |
-
_RAG_CACHE_DIR.mkdir(parents=True, exist_ok=True)
|
| 2035 |
-
for cache_file in sorted(_RAG_CACHE_DIR.glob("*.npz")):
|
| 2036 |
-
try:
|
| 2037 |
-
cached_chunks, orig_filename = _rag_cache_load(cache_file)
|
| 2038 |
-
chunks.extend(cached_chunks)
|
| 2039 |
-
_rag_indexed_files.add(orig_filename)
|
| 2040 |
-
print(f"[RAG] Loaded from cache: {orig_filename} ({len(cached_chunks)} chunks)")
|
| 2041 |
-
except Exception as e:
|
| 2042 |
-
print(f"[RAG] Corrupt cache file {cache_file.name}: {e} — skipping")
|
| 2043 |
-
|
| 2044 |
-
_rag_chunks = chunks
|
| 2045 |
-
_rag_ready = True
|
| 2046 |
-
print(f"[RAG] Chunks loaded: {len(chunks)} (synthetic + cached)")
|
| 2047 |
-
|
| 2048 |
-
# Embed only synthetic chunks (emb is None); cached chunks already have embeddings
|
| 2049 |
-
to_embed = [c for c in _rag_chunks if c["emb"] is None]
|
| 2050 |
-
if to_embed:
|
| 2051 |
-
embeddings = _ollama_embed_batch([c["text"] for c in to_embed])
|
| 2052 |
-
embedded = 0
|
| 2053 |
-
for chunk, emb in zip(to_embed, embeddings):
|
| 2054 |
-
if emb is not None:
|
| 2055 |
-
chunk["emb"] = emb
|
| 2056 |
-
embedded += 1
|
| 2057 |
-
print(f"[RAG] Synthetic embedding done: {embedded} chunks")
|
| 2058 |
-
else:
|
| 2059 |
-
print("[RAG] Synthetic embedding done: 0 chunks (all cached)")
|
| 2060 |
-
|
| 2061 |
-
|
| 2062 |
-
def rag_add_docs(files) -> tuple:
|
| 2063 |
-
"""Index uploaded PDF/DOCX files and add them to the live knowledge base."""
|
| 2064 |
-
global _rag_indexed_files
|
| 2065 |
-
if not files:
|
| 2066 |
-
return "Nessun file caricato.", _rag_doc_list()
|
| 2067 |
-
try:
|
| 2068 |
-
_requests.get(f"{OLLAMA_URL}/api/tags", timeout=3)
|
| 2069 |
-
except Exception:
|
| 2070 |
-
return (
|
| 2071 |
-
"❌ Ollama non raggiungibile — i documenti non possono essere indicizzati.\n"
|
| 2072 |
-
"Avvia `ollama serve` e ricarica.",
|
| 2073 |
-
_rag_doc_list(),
|
| 2074 |
-
)
|
| 2075 |
-
lines = []
|
| 2076 |
-
for f in files:
|
| 2077 |
-
path = Path(f.name)
|
| 2078 |
-
suffix = path.suffix.lower()
|
| 2079 |
-
if path.name in _rag_indexed_files:
|
| 2080 |
-
lines.append(f"ℹ️ `{path.name}` — già indicizzato, saltato.")
|
| 2081 |
-
continue
|
| 2082 |
-
|
| 2083 |
-
file_bytes = path.read_bytes()
|
| 2084 |
-
cache_path = _rag_cache_path(path.name, file_bytes)
|
| 2085 |
-
|
| 2086 |
-
# Load from cache if available (avoids re-embedding)
|
| 2087 |
-
if cache_path.exists():
|
| 2088 |
-
try:
|
| 2089 |
-
cached_chunks, _ = _rag_cache_load(cache_path)
|
| 2090 |
-
with _rag_lock:
|
| 2091 |
-
_rag_chunks.extend(cached_chunks)
|
| 2092 |
-
_rag_indexed_files.add(path.name)
|
| 2093 |
-
lines.append(f"✅ `{path.name}` — {len(cached_chunks)} chunk caricati dalla cache.")
|
| 2094 |
-
continue
|
| 2095 |
-
except Exception:
|
| 2096 |
-
pass # fall through to re-embed
|
| 2097 |
-
|
| 2098 |
-
try:
|
| 2099 |
-
if suffix == ".pdf":
|
| 2100 |
-
text = _extract_pdf_text(path)
|
| 2101 |
-
elif suffix in (".docx", ".doc"):
|
| 2102 |
-
import docx as _docx
|
| 2103 |
-
doc_obj = _docx.Document(str(path))
|
| 2104 |
-
text = "\n".join(p.text for p in doc_obj.paragraphs)
|
| 2105 |
-
else:
|
| 2106 |
-
lines.append(f"⚠️ `{path.name}` — formato non supportato (solo PDF/DOCX).")
|
| 2107 |
-
continue
|
| 2108 |
-
except Exception as e:
|
| 2109 |
-
lines.append(f"❌ `{path.name}` — errore: {e}")
|
| 2110 |
-
continue
|
| 2111 |
-
|
| 2112 |
-
if not text.strip():
|
| 2113 |
-
lines.append(f"⚠️ `{path.name}` — nessun testo estratto.")
|
| 2114 |
-
continue
|
| 2115 |
-
|
| 2116 |
-
chunks = _chunk_text(text, path.name)
|
| 2117 |
-
embeddings = _ollama_embed_batch([c["text"] for c in chunks])
|
| 2118 |
-
embedded = 0
|
| 2119 |
-
for chunk, emb in zip(chunks, embeddings):
|
| 2120 |
-
if emb is not None:
|
| 2121 |
-
chunk["emb"] = emb
|
| 2122 |
-
embedded += 1
|
| 2123 |
-
|
| 2124 |
-
try:
|
| 2125 |
-
_rag_cache_save(cache_path, chunks, path.name)
|
| 2126 |
-
except Exception as e:
|
| 2127 |
-
print(f"[RAG] Cache write failed for {path.name}: {e}")
|
| 2128 |
-
|
| 2129 |
-
with _rag_lock:
|
| 2130 |
-
_rag_chunks.extend(chunks)
|
| 2131 |
-
_rag_indexed_files.add(path.name)
|
| 2132 |
-
lines.append(f"✅ `{path.name}` — {len(chunks)} chunk, {embedded} indicizzati.")
|
| 2133 |
-
|
| 2134 |
-
return "\n".join(lines), _rag_doc_list()
|
| 2135 |
-
|
| 2136 |
-
|
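The cache lookup in `rag_add_docs` passes both the file name and the raw bytes to `_rag_cache_path`, which suggests the cache key depends on content, so an edited document is re-embedded rather than served from a stale cache. The body of `_rag_cache_path` is not part of this hunk. The following is only a minimal sketch of how such a helper could work, assuming a SHA-256 digest over name plus content; the name `cache_path_for`, the 16-character stem, and the hashing scheme are assumptions, while the `.npz` suffix matches the `glob("*.npz")` used in `rag_remove_doc`.

```python
import hashlib
from pathlib import Path


def cache_path_for(cache_dir: Path, filename: str, file_bytes: bytes) -> Path:
    """Hypothetical helper: derive a cache path from file name and content."""
    digest = hashlib.sha256()
    digest.update(filename.encode("utf-8"))
    digest.update(b"\0")  # separator so name and content cannot run together
    digest.update(file_bytes)
    # A short hex prefix keeps file names readable (assumed length).
    return cache_dir / f"{digest.hexdigest()[:16]}.npz"
```

With a key like this, re-uploading an unchanged file resolves to the same cache file, while any change to the bytes produces a new path and forces a fresh embedding pass.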
| 2137 |
-
def rag_remove_doc(filename: str) -> tuple:
|
| 2138 |
-
"""Remove all chunks for a document from memory and delete its cache file."""
|
| 2139 |
-
global _rag_chunks, _rag_indexed_files
|
| 2140 |
-
if not filename or not filename.strip():
|
| 2141 |
-
return "Nessun documento selezionato.", _rag_doc_list()
|
| 2142 |
-
|
| 2143 |
-
with _rag_lock:
|
| 2144 |
-
before = len(_rag_chunks)
|
| 2145 |
-
_rag_chunks = [c for c in _rag_chunks if c["source"] != filename]
|
| 2146 |
-
removed_chunks = before - len(_rag_chunks)
|
| 2147 |
-
_rag_indexed_files.discard(filename)
|
| 2148 |
-
|
| 2149 |
-
deleted_files = 0
|
| 2150 |
-
if _RAG_CACHE_DIR.exists():
|
| 2151 |
-
for cache_file in _RAG_CACHE_DIR.glob("*.npz"):
|
| 2152 |
-
try:
|
| 2153 |
-
with np.load(str(cache_file), allow_pickle=True) as data:
|
| 2154 |
-
match = str(data["filename"]) == filename
|
| 2155 |
-
if match:
|
| 2156 |
-
cache_file.unlink()
|
| 2157 |
-
deleted_files += 1
|
| 2158 |
-
except Exception:
|
| 2159 |
-
pass
|
| 2160 |
-
|
| 2161 |
-
if removed_chunks == 0:
|
| 2162 |
-
return f"⚠️ `{filename}` non trovato nell'indice.", _rag_doc_list()
|
| 2163 |
-
|
| 2164 |
-
msg = f"🗑️ `{filename}` rimosso ({removed_chunks} chunk eliminati"
|
| 2165 |
-
if deleted_files:
|
| 2166 |
-
msg += ", cache eliminata"
|
| 2167 |
-
msg += ")."
|
| 2168 |
-
return msg, _rag_doc_list()
|
| 2169 |
-
|
| 2170 |
-
|
| 2171 |
-
def _stream_ollama(prompt: str):
|
| 2172 |
-
"""Yield response tokens from Ollama one at a time (streaming)."""
|
| 2173 |
-
with _requests.post(
|
| 2174 |
-
f"{OLLAMA_URL}/api/generate",
|
| 2175 |
-
json={"model": _rag_model, "prompt": prompt, "stream": True},
|
| 2176 |
-
stream=True,
|
| 2177 |
-
timeout=120,
|
| 2178 |
-
) as r:
|
| 2179 |
-
for line in r.iter_lines():
|
| 2180 |
-
if line:
|
| 2181 |
-
data = _json.loads(line)
|
| 2182 |
-
if not data.get("done"):
|
| 2183 |
-
yield data.get("response", "")
|
| 2184 |
-
|
| 2185 |
-
|
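`_stream_ollama` reads the response body line by line, decodes each non-empty line as JSON, and yields the `response` field until a line with `done` set arrives. The parsing step can be exercised without a running server. The sketch below assumes the same two field names as the code above; the function name `iter_response_tokens` and the sample payloads are hypothetical.

```python
import json
from typing import Iterable, Iterator


def iter_response_tokens(lines: Iterable[bytes]) -> Iterator[str]:
    """Yield the "response" field of each non-final JSON line."""
    for line in lines:
        if not line:  # skip blank keep-alive lines
            continue
        data = json.loads(line)
        if data.get("done"):  # the final line carries no new token
            continue
        yield data.get("response", "")


# Hypothetical payloads shaped like a newline-delimited JSON stream.
sample = [
    b'{"response": "Hel", "done": false}',
    b"",
    b'{"response": "lo", "done": false}',
    b'{"response": "", "done": true}',
]
text = "".join(iter_response_tokens(sample))
```

Keeping the parser separate from the HTTP call makes it possible to unit-test the token handling on canned lines.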
| 2186 |
-
def _pipeline_llm_synthesis(
|
| 2187 |
-
step1_summary: str,
|
| 2188 |
-
step2_text: str,
|
| 2189 |
-
step3_summary: str,
|
| 2190 |
-
step4_report: str,
|
| 2191 |
-
step5_report: str,
|
| 2192 |
-
step6_report: str,
|
| 2193 |
-
) -> str:
|
| 2194 |
-
"""Chiama Ollama per sintetizzare i risultati dei 6 step in un referto forense narrativo."""
|
| 2195 |
-
try:
|
| 2196 |
-
_requests.get(f"{OLLAMA_URL}/api/tags", timeout=3)
|
| 2197 |
-
except Exception:
|
| 2198 |
-
return (
|
| 2199 |
-
"❌ **Ollama non raggiungibile.** Avvia il server con:\n"
|
| 2200 |
-
"```\nollama serve\n```"
|
| 2201 |
-
)
|
| 2202 |
-
prompt = (
|
| 2203 |
-
"Sei un perito calligrafo forense esperto. "
|
| 2204 |
-
"Sulla base delle seguenti analisi tecniche su un documento, "
|
| 2205 |
-
"fornisci in italiano una valutazione complessiva professionale: "
|
| 2206 |
-
"evidenzia elementi di interesse forense, coerenze e incoerenze tra i risultati, "
|
| 2207 |
-
"e suggerisci eventuali ulteriori verifiche.\n\n"
|
| 2208 |
-
f"=== RILEVAMENTO FIRMA ===\n{step1_summary}\n\n"
|
| 2209 |
-
f"=== TRASCRIZIONE HTR ===\n{step2_text}\n\n"
|
| 2210 |
-
f"=== ENTITÀ RICONOSCIUTE (NER) ===\n{step3_summary}\n\n"
|
| 2211 |
-
f"=== IDENTIFICAZIONE AUTORE ===\n{step4_report}\n\n"
|
| 2212 |
-
f"=== ANALISI GRAFOLOGICA ===\n{step5_report}\n\n"
|
| 2213 |
-
f"=== VERIFICA FIRMA ===\n{step6_report}\n\n"
|
| 2214 |
-
"Valutazione forense integrata:"
|
| 2215 |
-
)
|
| 2216 |
-
result = ""
|
| 2217 |
-
try:
|
| 2218 |
-
for token in _stream_ollama(prompt):
|
| 2219 |
-
result += token
|
| 2220 |
-
except Exception as e:
|
| 2221 |
-
return f"❌ Errore nella generazione LLM: {e}"
|
| 2222 |
-
return result if result else "*(Nessuna risposta dal modello)*"
|
| 2223 |
-
|
| 2224 |
-
|
| 2225 |
-
def _rag_retrieve(question: str):
|
| 2226 |
-
"""Return (results, error_str). results is list of (score, chunk)."""
|
| 2227 |
-
embedded_chunks = [c for c in _rag_chunks if c["emb"] is not None]
|
| 2228 |
-
if not embedded_chunks:
|
| 2229 |
-
total = len(_rag_chunks)
|
| 2230 |
-
return None, (
|
| 2231 |
-
f"⏳ Embedding in corso (0/{total} chunk pronti). "
|
| 2232 |
-
"Riprovare tra qualche secondo — l'indicizzazione procede in background."
|
| 2233 |
-
)
|
| 2234 |
-
q_emb = _ollama_embed(question)
|
| 2235 |
-
if q_emb is None:
|
| 2236 |
-
return None, "❌ Impossibile generare l'embedding della domanda. Ollama è in esecuzione?"
|
| 2237 |
-
|
| 2238 |
-
synthetic_sources = {s for s, _ in _RAG_SYNTHETIC_DOCS}
|
| 2239 |
-
user_chunks = [c for c in _rag_chunks if c["emb"] is not None and c["source"] not in synthetic_sources]
|
| 2240 |
-
synth_chunks = [c for c in _rag_chunks if c["emb"] is not None and c["source"] in synthetic_sources]
|
| 2241 |
-
|
| 2242 |
-
def _top_k_from(pool, q, k):
|
| 2243 |
-
if not pool:
|
| 2244 |
-
return []
|
| 2245 |
-
embs = np.stack([c["emb"] for c in pool])
|
| 2246 |
-
q_n = q / (np.linalg.norm(q) + 1e-9)
|
| 2247 |
-
norms = np.linalg.norm(embs, axis=1, keepdims=True) + 1e-9
|
| 2248 |
-
scores = (embs / norms) @ q_n
|
| 2249 |
-
idxs = np.argsort(scores)[::-1][:k]
|
| 2250 |
-
return [(float(scores[i]), pool[i]) for i in idxs]
|
| 2251 |
-
|
| 2252 |
-
user_results = _top_k_from(user_chunks, q_emb, 2)
|
| 2253 |
-
synth_results = _top_k_from(synth_chunks, q_emb, 2)
|
| 2254 |
-
if not user_results:
|
| 2255 |
-
synth_results = _top_k_from(synth_chunks, q_emb, 4)
|
| 2256 |
-
return user_results + synth_results, None
|
| 2257 |
-
|
| 2258 |
-
|
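The inner helper `_top_k_from` ranks chunks by cosine similarity: both the query and the chunk embeddings are normalised (with a small epsilon to avoid division by zero) and the dot products are sorted in descending order. The same ranking can be sketched without NumPy to show the arithmetic; the function name `top_k_cosine` and the toy vectors are illustrative assumptions, not part of the commit.

```python
import math
from typing import Sequence


def top_k_cosine(pool: Sequence[Sequence[float]], query: Sequence[float], k: int):
    """Return [(score, index)] for the k vectors most similar to query."""
    eps = 1e-9  # same guard against zero-length vectors as in the code above

    def norm(v):
        return math.sqrt(sum(x * x for x in v)) + eps

    q_norm = norm(query)
    scored = []
    for i, vec in enumerate(pool):
        dot = sum(a * b for a, b in zip(vec, query))
        scored.append((dot / (norm(vec) * q_norm), i))
    # Highest similarity first, then keep the first k entries.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


pool = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
ranked = top_k_cosine(pool, [1.0, 0.1], k=2)
```

In `_rag_retrieve` this ranking is applied twice, once to user-uploaded chunks and once to the synthetic ones, taking two results from each pool and falling back to four synthetic results when no user chunk is available.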
| 2259 |
def _content_str(content) -> str:
|
| 2260 |
"""Normalize Gradio 6.x content field (str or list of parts) to plain str."""
|
| 2261 |
if isinstance(content, str):
|
|
@@ -2272,82 +555,38 @@ def _content_str(content) -> str:
|
|
| 2272 |
|
| 2273 |
|
| 2274 |
def rag_chat(message: str, history: list):
|
| 2275 |
-
"""
|
| 2276 |
-
|
| 2277 |
-
history is a flat list of {"role": "user"|"assistant", "content": str}
|
| 2278 |
-
as required by Gradio 6.x Chatbot.
|
| 2279 |
-
"""
|
| 2280 |
if not message or not message.strip():
|
| 2281 |
yield history
|
| 2282 |
return
|
| 2283 |
|
| 2284 |
-
#
|
| 2285 |
-
|
| 2286 |
-
|
| 2287 |
-
|
| 2288 |
-
|
| 2289 |
-
"❌ **Ollama non raggiungibile.**\n\n"
|
| 2290 |
-
"Avvia il server con:\n```\nollama serve\n```\n"
|
| 2291 |
-
"e assicurati che il modello sia scaricato:\n"
|
| 2292 |
-
"```\nollama pull llama3.2\n```"
|
| 2293 |
-
)
|
| 2294 |
-
yield history + [{"role": "user", "content": message}, {"role": "assistant", "content": err}]
|
| 2295 |
-
return
|
| 2296 |
-
|
| 2297 |
-
if not _rag_ready:
|
| 2298 |
-
msg = "⏳ Indice della knowledge base in costruzione, riprovare tra qualche secondo…"
|
| 2299 |
-
yield history + [{"role": "user", "content": message}, {"role": "assistant", "content": msg}]
|
| 2300 |
-
return
|
| 2301 |
-
|
| 2302 |
-
results, err = _rag_retrieve(message)
|
| 2303 |
-
if err:
|
| 2304 |
-
yield history + [{"role": "user", "content": message}, {"role": "assistant", "content": err}]
|
| 2305 |
-
return
|
| 2306 |
-
|
| 2307 |
-
context = "\n\n".join(f"[{c['source']}]\n{c['text']}" for _, c in results)
|
| 2308 |
-
|
| 2309 |
-
# Build conversation context from last 6 exchanges (12 messages in flat list)
|
| 2310 |
-
recent = history[-12:] if len(history) > 12 else history
|
| 2311 |
-
conv_text = ""
|
| 2312 |
-
i = 0
|
| 2313 |
-
while i < len(recent) - 1:
|
| 2314 |
-
if recent[i]["role"] == "user" and recent[i + 1]["role"] == "assistant":
|
| 2315 |
-
u = _content_str(recent[i]["content"])
|
| 2316 |
-
a = _content_str(recent[i + 1]["content"]).split("\n\n---\n")[0]
|
| 2317 |
-
conv_text += f"Utente: {u}\nAssistente: {a}\n\n"
|
| 2318 |
-
i += 2
|
| 2319 |
-
else:
|
| 2320 |
-
i += 1
|
| 2321 |
-
|
| 2322 |
-
prompt = (
|
| 2323 |
-
"Sei un esperto di grafologia forense. Rispondi in italiano, in modo preciso e "
|
| 2324 |
-
"conciso, basandoti ESCLUSIVAMENTE sui seguenti estratti.\n\n"
|
| 2325 |
-
f"{context}\n\n"
|
| 2326 |
-
)
|
| 2327 |
-
if conv_text:
|
| 2328 |
-
prompt += f"Conversazione precedente:\n{conv_text}\n"
|
| 2329 |
-
prompt += f"Domanda: {message}\n\nRisposta:"
|
| 2330 |
-
|
| 2331 |
-
sources = list(dict.fromkeys(c["source"] for _, c in results))
|
| 2332 |
-
sources_footer = f"\n\n---\n*Fonti: {', '.join(sources)}*"
|
| 2333 |
|
| 2334 |
-
partial = ""
|
| 2335 |
new_history = history + [
|
| 2336 |
{"role": "user", "content": message},
|
| 2337 |
{"role": "assistant", "content": ""},
|
| 2338 |
]
|
|
|
|
| 2339 |
try:
|
| 2340 |
-
for
|
| 2341 |
-
partial +
|
| 2342 |
-
new_history[-1]["content"] =
|
| 2343 |
yield new_history
|
| 2344 |
except Exception as e:
|
| 2345 |
-
new_history[-1]["content"] = f"❌ Errore
|
| 2346 |
yield new_history
|
| 2347 |
-
return
|
| 2348 |
|
| 2349 |
-
|
| 2350 |
-
|
| 2351 |
|
| 2352 |
|
| 2353 |
def save_conversation_md(history: list):
|
|
@@ -2424,7 +663,7 @@ with gr.Blocks() as rag_tab:
|
|
| 2424 |
rag_remove_status = gr.Markdown(label="Esito rimozione")
|
| 2425 |
|
| 2426 |
rag_upload_btn.click(
|
| 2427 |
-
fn=
|
| 2428 |
inputs=rag_upload,
|
| 2429 |
outputs=[rag_upload_status, rag_doc_table],
|
| 2430 |
).then(
|
|
@@ -2433,7 +672,7 @@ with gr.Blocks() as rag_tab:
|
|
| 2433 |
outputs=rag_remove_dd,
|
| 2434 |
)
|
| 2435 |
rag_remove_btn.click(
|
| 2436 |
-
fn=
|
| 2437 |
inputs=rag_remove_dd,
|
| 2438 |
outputs=[rag_remove_status, rag_doc_table],
|
| 2439 |
).then(
|
|
@@ -2445,8 +684,8 @@ with gr.Blocks() as rag_tab:
|
|
| 2445 |
with gr.Row():
|
| 2446 |
rag_model_dd = gr.Dropdown(
|
| 2447 |
label="Modello di generazione (Ollama)",
|
| 2448 |
-
choices=
|
| 2449 |
-
value=
|
| 2450 |
interactive=True,
|
| 2451 |
scale=3,
|
| 2452 |
)
|
|
@@ -2458,14 +697,13 @@ with gr.Blocks() as rag_tab:
|
|
| 2458 |
"Re-indicizzare i documenti per risultati ottimali.*"
|
| 2459 |
)
|
| 2460 |
|
| 2461 |
-
rag_chatbot = gr.Chatbot(
|
| 2462 |
-
label="Consulente Forense IA",
|
| 2463 |
-
height=500,
|
| 2464 |
-
)
|
| 2465 |
rag_in = gr.Textbox(
|
| 2466 |
-
placeholder=
|
| 2467 |
-
|
| 2468 |
-
|
| 2469 |
lines=1,
|
| 2470 |
show_label=False,
|
| 2471 |
interactive=_OLLAMA_AVAILABLE,
|
|
@@ -2496,7 +734,7 @@ with gr.Blocks() as rag_tab:
|
|
| 2496 |
outputs=rag_model_status,
|
| 2497 |
)
|
| 2498 |
rag_model_refresh.click(
|
| 2499 |
-
fn=lambda: gr.update(choices=
|
| 2500 |
outputs=rag_model_dd,
|
| 2501 |
)
|
| 2502 |
|
|
@@ -2506,23 +744,17 @@ with gr.Blocks() as rag_tab:
|
|
| 2506 |
fn=lambda: ([], "", gr.update(visible=False)),
|
| 2507 |
outputs=[rag_chatbot, rag_in, rag_download],
|
| 2508 |
)
|
| 2509 |
-
rag_save_btn.click(
|
| 2510 |
-
fn=save_conversation_md,
|
| 2511 |
-
inputs=rag_chatbot,
|
| 2512 |
-
outputs=rag_download,
|
| 2513 |
-
)
|
| 2514 |
|
| 2515 |
-
# Refresh table, dropdowns and model list when tab loads
|
| 2516 |
rag_tab.load(
|
| 2517 |
fn=lambda: (
|
| 2518 |
gr.update(value=_rag_doc_list()),
|
| 2519 |
gr.update(choices=_rag_doc_choices()),
|
| 2520 |
-
gr.update(choices=
|
| 2521 |
),
|
| 2522 |
outputs=[rag_doc_table, rag_remove_dd, rag_model_dd],
|
| 2523 |
)
|
| 2524 |
|
| 2525 |
-
|
| 2526 |
# ──────────────────────────────────────────────────────────────────────────────
|
| 2527 |
# Main App
|
| 2528 |
# ──────────────────────────────────────────────────────────────────────────────
|
|
@@ -2550,7 +782,7 @@ demo = gr.TabbedInterface(
|
|
| 2550 |
),
|
| 2551 |
)
|
| 2552 |
|
| 2553 |
-
_threading.Thread(target=
|
| 2554 |
|
| 2555 |
if __name__ == "__main__":
|
| 2556 |
demo.launch(
|
|
|
|
| 27 |
sys.path.insert(0, str(ROOT))
|
| 28 |
|
| 29 |
import io
|
| 30 |
import tempfile as _tempfile
|
| 31 |
+
import threading as _threading
|
| 32 |
+
from datetime import datetime as _datetime
|
| 33 |
|
| 34 |
+
import gradio as gr
|
| 35 |
+
import numpy as np
|
| 36 |
+
from PIL import Image
|
| 37 |
+
|
| 38 |
+
# ── core imports ──────────────────────────────────────────────────────────────
|
| 39 |
+
from core.ocr import htr_transcribe, get_easyocr
|
| 40 |
+
from core.ner import ner_extract
|
| 41 |
+
from core.graphology import grapho_analyse
|
| 42 |
+
from core.writer import writer_identify, ensure_writer_examples
|
| 43 |
+
from core.signature import sig_verify, sig_detect, detect_and_crop
|
| 44 |
+
from core.dating import dating_rank as _dating_rank_core, extract_dates
|
| 45 |
+
from core.pipeline import run_pipeline_steps, generate_forensic_pdf, PipelineResults
|
| 46 |
+
from core.rag import (
|
| 47 |
+
check_ollama, ollama_list_models, set_rag_model,
|
| 48 |
+
rag_load_docs, rag_add_docs, rag_remove_doc,
|
| 49 |
+
rag_doc_list as _rag_doc_list, rag_doc_choices as _rag_doc_choices,
|
| 50 |
+
rag_chat_stream, _rag_ready,
|
| 51 |
+
)
|
| 52 |
|
| 53 |
# ──────────────────────────────────────────────────────────────────────────────
|
| 54 |
+
# Configuration
|
| 55 |
# ──────────────────────────────────────────────────────────────────────────────
|
| 56 |
|
| 57 |
SIGNET_WEIGHTS = ROOT / "models" / "signet.pth"
|
| 58 |
+
WRITER_SAMPLES_DIR = ROOT / "data" / "samples"
|
| 59 |
+
WRITER_EXAMPLES_DIR = WRITER_SAMPLES_DIR / "writer_examples"
|
| 60 |
+
RAG_CACHE_DIR = ROOT / "data" / "rag_cache"
|
| 61 |
|
| 62 |
+
_OLLAMA_AVAILABLE = check_ollama()
|
| 63 |
|
| 64 |
# ──────────────────────────────────────────────────────────────────────────────
|
| 65 |
# Tab 1 — Handwritten OCR
|
| 66 |
# ──────────────────────────────────────────────────────────────────────────────
|
| 67 |
|
| 68 |
htr_tab = gr.Interface(
|
| 69 |
fn=htr_transcribe,
|
| 70 |
inputs=gr.Image(label="Immagine di testo manoscritto", type="numpy"),
|
|
|
|
| 107 |
return str(p) if p.exists() else None
|
| 108 |
|
| 109 |
|
| 110 |
+
def _sig_verify_wrapper(ref_image, ref_image2, query_image):
|
| 111 |
+
return sig_verify(ref_image, ref_image2, query_image, SIGNET_WEIGHTS)
|
| 112 |
|
| 113 |
|
| 114 |
_sig_examples = []
|
|
|
|
| 117 |
_r2 = _sig_ex(f"genuine_{_n}_2.png")
|
| 118 |
_forg = _sig_ex(f"forged_{_n}_1.png")
|
| 119 |
if _r1 and _r2 and _forg:
|
| 120 |
+
_sig_examples.append([_r1, _r2, _forg])
|
| 121 |
if _n == "1" and _r1 and _r2:
|
| 122 |
+
_sig_examples.append([_r1, None, _r2])
|
| 123 |
|
| 124 |
|
| 125 |
sig_verify_tab = gr.Interface(
|
| 126 |
+
fn=_sig_verify_wrapper,
|
| 127 |
inputs=[
|
| 128 |
gr.Image(label="Firma di riferimento 1 (autentica nota)", type="numpy"),
|
| 129 |
gr.Image(label="Firma di riferimento 2 — opzionale (migliora l'accuratezza)", type="numpy"),
|
|
|
|
| 166 |
# Tab 3 — Signature Detection
|
| 167 |
# ──────────────────────────────────────────────────────────────────────────────
|
| 168 |
|
| 169 |
sig_detect_tab = gr.Interface(
|
| 170 |
fn=sig_detect,
|
| 171 |
inputs=[
|
|
|
|
| 207 |
# Tab 4 — Named Entity Recognition
|
| 208 |
# ──────────────────────────────────────────────────────────────────────────────
|
| 209 |
|
| 210 |
ner_tab = gr.Interface(
|
| 211 |
fn=ner_extract,
|
| 212 |
inputs=gr.Textbox(
|
|
|
|
| 263 |
# Tab 5 — Writer Identification
|
| 264 |
# ──────────────────────────────────────────────────────────────────────────────
|
| 265 |
|
| 266 |
+
_writer_example_paths = ensure_writer_examples(WRITER_EXAMPLES_DIR)
|
| 267 |
+
_threading.Thread(
|
| 268 |
+
target=lambda: writer_identify(None, WRITER_SAMPLES_DIR), # pre-warm (returns early on None)
|
| 269 |
+
daemon=True,
|
| 270 |
+
).start()
|
| 271 |
|
| 272 |
|
| 273 |
+
def _writer_identify_wrapper(image):
|
| 274 |
+
return writer_identify(image, WRITER_SAMPLES_DIR)
|
| 275 |
|
| 276 |
|
| 277 |
writer_tab = gr.Interface(
|
| 278 |
+
fn=_writer_identify_wrapper,
|
| 279 |
inputs=gr.Image(label="Campione di scrittura a mano da attribuire", type="numpy"),
|
| 280 |
outputs=[
|
| 281 |
gr.Markdown(label="Candidati ordinati per probabilità"),
|
|
|
|
| 314 |
# Tab 6 — Graphological Feature Analysis
|
| 315 |
# ──────────────────────────────────────────────────────────────────────────────
|
| 316 |
|
| 317 |
grapho_tab = gr.Interface(
|
| 318 |
fn=grapho_analyse,
|
| 319 |
inputs=gr.Image(label="Immagine di testo manoscritto", type="numpy"),
|
|
|
|
| 354 |
# Tab 7 — Forensic Pipeline
|
| 355 |
# ──────────────────────────────────────────────────────────────────────────────
|
| 356 |
|
| 357 |
def run_pipeline(
|
| 358 |
doc_image: np.ndarray,
|
| 359 |
ref_sig: np.ndarray | None,
|
| 360 |
progress: gr.Progress = gr.Progress(track_tqdm=False),
|
| 361 |
):
|
| 362 |
+
"""Gradio generator: yields partial UI outputs after each pipeline step."""
|
| 363 |
+
_ = gr.update()
|
| 364 |
|
| 365 |
if doc_image is None:
|
| 366 |
msg = "Carica il documento da analizzare."
|
|
|
|
| 368 |
gr.update(visible=False), _)
|
| 369 |
return
|
| 370 |
|
| 371 |
+
def _on_progress(step, total, desc):
|
| 372 |
+
progress(step / total, desc=f"Step {step}/{total} — {desc}")
|
| 373 |
+
|
| 374 |
+
results = PipelineResults()
|
| 375 |
+
|
| 376 |
+
for results in run_pipeline_steps(
|
| 377 |
+
doc_image, ref_sig, SIGNET_WEIGHTS, WRITER_SAMPLES_DIR, _on_progress
|
| 378 |
+
):
|
| 379 |
+
yield (
|
| 380 |
+
results.sig_detect_image, results.sig_detect_summary,
|
| 381 |
+
results.htr_text,
|
| 382 |
+
results.ner_highlighted, results.ner_summary,
|
| 383 |
+
results.writer_report, results.writer_chart,
|
| 384 |
+
results.grapho_report, results.grapho_image,
|
| 385 |
+
results.sig_verify_report, results.sig_verify_chart,
|
| 386 |
+
results.final_report,
|
| 387 |
+
gr.update(visible=bool(results.sig_detect_summary)),
|
| 388 |
+
results.llm_report,
|
| 389 |
)
|
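The refactored `run_pipeline` is a thin generator: it iterates over `run_pipeline_steps`, which yields a progressively filled `PipelineResults` after each step, and forwards a progress callback with the signature `(step, total, desc)`. The internals of `core/pipeline.py` are not shown in this hunk, so the following is only a sketch of that pattern under stated assumptions; `Results`, `run_steps`, and the two toy steps are hypothetical stand-ins.

```python
from dataclasses import dataclass, replace
from typing import Callable, Iterator


@dataclass
class Results:  # hypothetical stand-in for PipelineResults
    htr_text: str = ""
    ner_summary: str = ""


def run_steps(text: str, on_progress: Callable[[int, int, str], None]) -> Iterator[Results]:
    """Yield a cumulative snapshot after each step."""
    results = Results()
    on_progress(1, 2, "transcription")
    results = replace(results, htr_text=text.strip())
    yield results
    on_progress(2, 2, "entities")
    word_count = len(results.htr_text.split())
    results = replace(results, ner_summary=f"{word_count} tokens")
    yield results


progress_log = []
snapshots = list(
    run_steps("  Mario Rossi  ", lambda s, t, d: progress_log.append((s, t, d)))
)
```

Because each snapshot is a fresh object, a UI layer can render intermediate states without later steps mutating what has already been displayed. The same generator could presumably be consumed by a non-Gradio caller, in line with the commit's stated goal of sharing `core/` with a backend.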
| 390 |
|
| 391 |
|
| 392 |
+
def _generate_pipeline_pdf_wrapper(
|
| 393 |
+
s1_img, s1_txt, s2_txt, s3_md,
|
| 394 |
+
s4_md, s4_img, s5_md, s5_img,
|
| 395 |
+
s6_txt, s6_img, llm_text,
|
| 396 |
) -> str:
|
| 397 |
+
results = PipelineResults(
|
| 398 |
+
sig_detect_image=s1_img, sig_detect_summary=s1_txt or "",
|
| 399 |
+
htr_text=s2_txt or "",
|
| 400 |
+
ner_summary=s3_md or "",
|
| 401 |
+
writer_report=s4_md or "", writer_chart=s4_img,
|
| 402 |
+
grapho_report=s5_md or "", grapho_image=s5_img,
|
| 403 |
+
sig_verify_report=s6_txt or "", sig_verify_chart=s6_img,
|
| 404 |
+
llm_report=llm_text or "",
|
| 405 |
)
|
| 406 |
+
return generate_forensic_pdf(results)
|
|
|
|
| 407 |
|
| 408 |
|
| 409 |
with gr.Blocks() as pipeline_tab:
|
|
|
|
| 424 |
)
|
| 425 |
|
| 426 |
with gr.Row():
|
| 427 |
+
pipe_doc = gr.Image(label="Documento da analizzare (testamento, lettera, atto)", type="numpy")
|
| 428 |
+
pipe_ref = gr.Image(label="Firma di riferimento nota — opzionale (per Step 6)", type="numpy")
|
| 429 |
|
| 430 |
pipe_btn = gr.Button("▶ Avvia Analisi Forense", variant="primary", size="lg")
|
| 431 |
|
|
|
|
| 492 |
)
|
| 493 |
|
| 494 |
pdf_btn.click(
|
| 495 |
+
fn=_generate_pipeline_pdf_wrapper,
|
| 496 |
inputs=[
|
| 497 |
out_s1_img, out_s1_txt,
|
| 498 |
out_s2_txt,
|
|
|
|
| 505 |
outputs=pdf_out,
|
| 506 |
)
|
| 507 |
|
|
|
|
| 508 |
# ──────────────────────────────────────────────────────────────────────────────
|
| 509 |
+
# Tab 8 — Document Dating
|
| 510 |
# ──────────────────────────────────────────────────────────────────────────────
|
| 511 |
|
| 512 |
+
def _dating_rank_gradio(files: list) -> str:
|
| 513 |
+
"""Gradio wrapper: converts gr.File objects to file paths for core function."""
|
| 514 |
if not files:
|
| 515 |
return "Carica almeno un'immagine di documento."
|
| 516 |
+
paths = [f.name if hasattr(f, "name") else str(f) for f in files]
|
| 517 |
+
return _dating_rank_core(paths)
|
| 518 |
|
| 519 |
|
| 520 |
dating_tab = gr.Interface(
|
| 521 |
+
fn=_dating_rank_gradio,
|
| 522 |
inputs=gr.File(
|
| 523 |
label="Immagini documenti (carica 2 o più)",
|
| 524 |
file_count="multiple",
|
|
|
|
| 535 |
),
|
| 536 |
)
|
| 537 |
|
|
|
|
| 538 |
# ──────────────────────────────────────────────────────────────────────────────
|
| 539 |
# Tab 9 — Consulente Forense IA (RAG + Ollama)
|
| 540 |
# ──────────────────────────────────────────────────────────────────────────────
|
| 541 |
|
| 542 |
def _content_str(content) -> str:
|
| 543 |
"""Normalize Gradio 6.x content field (str or list of parts) to plain str."""
|
| 544 |
if isinstance(content, str):
|
|
|
|
| 555 |
|
| 556 |
|
| 557 |
def rag_chat(message: str, history: list):
|
| 558 |
+
"""Gradio streaming wrapper for rag_chat_stream."""
|
| 559 |
if not message or not message.strip():
|
| 560 |
yield history
|
| 561 |
return
|
| 562 |
|
| 563 |
+
# Normalise history content to plain strings for core function
|
| 564 |
+
normalised_history = [
|
| 565 |
+
{"role": msg["role"], "content": _content_str(msg["content"])}
|
| 566 |
+
for msg in history
|
| 567 |
+
]
|
| 568 |
|
|
|
|
| 569 |
new_history = history + [
|
| 570 |
{"role": "user", "content": message},
|
| 571 |
{"role": "assistant", "content": ""},
|
| 572 |
]
|
| 573 |
+
|
| 574 |
try:
|
| 575 |
+
for partial, sources_footer in rag_chat_stream(message, normalised_history):
|
| 576 |
+
content = partial + (sources_footer or "")
|
| 577 |
+
new_history[-1]["content"] = content
|
| 578 |
yield new_history
|
| 579 |
except Exception as e:
|
| 580 |
+
new_history[-1]["content"] = f"❌ Errore: {e}"
|
| 581 |
yield new_history
|
|
|
|
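The streaming wrapper above can be sketched in isolation. This is a minimal illustration, not the demo's actual code: `stream_into_history` and its `chunks` argument are hypothetical names, and `chunks` stands in for whatever `rag_chat_stream` yields, assumed here to be `(partial_text, sources_footer)` pairs.

```python
from typing import Iterable, Iterator, Optional


def stream_into_history(
    history: list[dict],
    message: str,
    chunks: Iterable[tuple[str, Optional[str]]],
) -> Iterator[list[dict]]:
    """Yield the chat history repeatedly as the assistant reply grows.

    `chunks` is a hypothetical stand-in for a streaming backend that yields
    (partial_text, sources_footer) pairs.
    """
    # Append the user turn plus an empty assistant turn to fill in place.
    new_history = history + [
        {"role": "user", "content": message},
        {"role": "assistant", "content": ""},
    ]
    try:
        for partial, footer in chunks:
            # Overwrite (not append): each chunk carries the full text so far.
            new_history[-1]["content"] = partial + (footer or "")
            yield new_history
    except Exception as exc:
        # Surface backend failures inside the chat instead of raising.
        new_history[-1]["content"] = f"Error: {exc}"
        yield new_history
```

Note that the same list object is yielded each time and mutated between yields, so a caller that wants snapshots has to copy the content at each step.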
| 582 |
|
| 583 |
+
|
| 584 |
+
def _rag_add_docs_wrapper(files):
|
| 585 |
+
return rag_add_docs(files, RAG_CACHE_DIR)
|
| 586 |
+
|
| 587 |
+
|
| 588 |
+
def _rag_remove_doc_wrapper(filename):
|
| 589 |
+
return rag_remove_doc(filename, RAG_CACHE_DIR)
|
| 590 |
|
| 591 |
|
| 592 |
def save_conversation_md(history: list):
|
|
|
|
| 663 |
rag_remove_status = gr.Markdown(label="Esito rimozione")
|
| 664 |
|
| 665 |
rag_upload_btn.click(
|
| 666 |
+
fn=_rag_add_docs_wrapper,
|
| 667 |
inputs=rag_upload,
|
| 668 |
outputs=[rag_upload_status, rag_doc_table],
|
| 669 |
).then(
|
|
|
|
| 672 |
outputs=rag_remove_dd,
|
| 673 |
)
|
| 674 |
rag_remove_btn.click(
|
| 675 |
+
fn=_rag_remove_doc_wrapper,
|
| 676 |
inputs=rag_remove_dd,
|
| 677 |
outputs=[rag_remove_status, rag_doc_table],
|
| 678 |
).then(
|
|
|
|
| 684 |
with gr.Row():
|
| 685 |
rag_model_dd = gr.Dropdown(
|
| 686 |
label="Modello di generazione (Ollama)",
|
| 687 |
+
choices=ollama_list_models(),
|
| 688 |
+
value="llama3.2",
|
| 689 |
interactive=True,
|
| 690 |
scale=3,
|
| 691 |
)
|
|
|
|
| 697 |
"Re-indicizzare i documenti per risultati ottimali.*"
|
| 698 |
)
|
| 699 |
|
| 700 |
+
rag_chatbot = gr.Chatbot(label="Consulente Forense IA", height=500)
|
| 701 |
rag_in = gr.Textbox(
|
| 702 |
+
placeholder=(
|
| 703 |
+
"Es: Come si valuta l'inclinazione della scrittura? (Invio per inviare)"
|
| 704 |
+
if _OLLAMA_AVAILABLE
|
| 705 |
+
else "⚠️ Non disponibile su HF Spaces — esegui localmente con Ollama"
|
| 706 |
+
),
|
| 707 |
lines=1,
|
| 708 |
show_label=False,
|
| 709 |
interactive=_OLLAMA_AVAILABLE,
|
|
|
|
| 734 |
outputs=rag_model_status,
|
| 735 |
)
|
| 736 |
rag_model_refresh.click(
|
| 737 |
+
fn=lambda: gr.update(choices=ollama_list_models(), value="llama3.2"),
|
| 738 |
outputs=rag_model_dd,
|
| 739 |
)
|
| 740 |
|
|
|
|
| 744 |
fn=lambda: ([], "", gr.update(visible=False)),
|
| 745 |
outputs=[rag_chatbot, rag_in, rag_download],
|
| 746 |
)
|
| 747 |
+
rag_save_btn.click(fn=save_conversation_md, inputs=rag_chatbot, outputs=rag_download)
|
| 748 |
|
|
|
|
| 749 |
rag_tab.load(
|
| 750 |
fn=lambda: (
|
| 751 |
gr.update(value=_rag_doc_list()),
|
| 752 |
gr.update(choices=_rag_doc_choices()),
|
| 753 |
+
gr.update(choices=ollama_list_models(), value="llama3.2"),
|
| 754 |
),
|
| 755 |
outputs=[rag_doc_table, rag_remove_dd, rag_model_dd],
|
| 756 |
)
|
| 757 |
|
|
|
|
| 758 |
# ──────────────────────────────────────────────────────────────────────────────
|
| 759 |
# Main App
|
| 760 |
# ──────────────────────────────────────────────────────────────────────────────
|
|
|
|
| 782 |
),
|
| 783 |
)
|
| 784 |
|
| 785 |
+
_threading.Thread(target=lambda: rag_load_docs(RAG_CACHE_DIR), daemon=True).start()
|
| 786 |
|
| 787 |
if __name__ == "__main__":
|
| 788 |
demo.launch(
|
core/__init__.py
ADDED
|
@@ -0,0 +1,8 @@
|
| 1 |
+
"""
|
| 2 |
+
GraphoLab core — shared AI logic.
|
| 3 |
+
|
| 4 |
+
This package contains all AI/ML logic, independent of any web framework or UI.
|
| 5 |
+
It is used by:
|
| 6 |
+
- app/grapholab_demo.py (Gradio demo)
|
| 7 |
+
- backend/ (FastAPI professional app, future)
|
| 8 |
+
"""
|
core/dating.py
ADDED
|
@@ -0,0 +1,159 @@
|
| 1 |
+
"""
|
| 2 |
+
GraphoLab core — Document Dating.
|
| 3 |
+
|
| 4 |
+
Provides:
|
| 5 |
+
- extract_dates() extract and normalize dates from OCR text
|
| 6 |
+
- dating_rank() rank a list of document file paths by date
|
| 7 |
+
"""
|
| 8 |
+
|
| 9 |
+
from __future__ import annotations
|
| 10 |
+
|
| 11 |
+
import re
|
| 12 |
+
from datetime import datetime
|
| 13 |
+
from pathlib import Path
|
| 14 |
+
|
| 15 |
+
import numpy as np
|
| 16 |
+
from PIL import Image
|
| 17 |
+
|
| 18 |
+
# ──────────────────────────────────────────────────────────────────────────────
|
| 19 |
+
# Date extraction patterns (Italian)
|
| 20 |
+
# ──────────────────────────────────────────────────────────────────────────────
|
| 21 |
+
|
| 22 |
+
_DATE_PATTERNS = [
|
| 23 |
+
# "10 gennaio 2024" / "10 gennaio del 2024"
|
| 24 |
+
r"\b(\d{1,2})\s+(gennaio|febbraio|marzo|aprile|maggio|giugno|"
|
| 25 |
+
r"luglio|agosto|settembre|ottobre|novembre|dicembre)\s+(?:del\s+)?(\d{4})\b",
|
| 26 |
+
# "10 gen. 2024" abbreviations
|
| 27 |
+
r"\b(\d{1,2})\s+(gen|feb|mar|apr|mag|giu|lug|ago|set|ott|nov|dic)\.?\s+(\d{4})\b",
|
| 28 |
+
# "10/01/2024" or "10-01-2024" or "10.01.2024"
|
| 29 |
+
r"\b(\d{1,2})[\/\-\.](\d{1,2})[\/\-\.](\d{2,4})\b",
|
| 30 |
+
# "gennaio 2024" (no day)
|
| 31 |
+
r"\b(gennaio|febbraio|marzo|aprile|maggio|giugno|"
|
| 32 |
+
r"luglio|agosto|settembre|ottobre|novembre|dicembre)\s+(\d{4})\b",
|
| 33 |
+
]
|
| 34 |
+
_DATE_RE = re.compile("|".join(_DATE_PATTERNS), re.IGNORECASE)
|
| 35 |
+
|
| 36 |
+
_BIRTH_KW = ("nata", "nato", "nascita", "nasc.", "nata il", "nato il")
|
| 37 |
+
|
| 38 |
+
|
| 39 |
+
# ──────────────────────────────────────────────────────────────────────────────
|
| 40 |
+
# Internal helpers
|
| 41 |
+
# ──────────────────────────────────────────────────────────────────────────────
|
| 42 |
+
|
| 43 |
+
def _try_dateparser(raw: str) -> datetime | None:
|
| 44 |
+
"""Parse a raw date string to datetime using dateparser (Italian-aware)."""
|
| 45 |
+
try:
|
| 46 |
+
import dateparser
|
| 47 |
+
dt = dateparser.parse(
|
| 48 |
+
raw,
|
| 49 |
+
languages=["it", "en"],
|
| 50 |
+
settings={"PREFER_DAY_OF_MONTH": "first", "RETURN_AS_TIMEZONE_AWARE": False},
|
| 51 |
+
)
|
| 52 |
+
if dt and 1800 < dt.year < 2200:
|
| 53 |
+
return dt
|
| 54 |
+
except Exception:
|
| 55 |
+
pass
|
| 56 |
+
return None
|
| 57 |
+
|
| 58 |
+
|
| 59 |
+
# ──────────────────────────────────────────────────────────────────────────────
|
| 60 |
+
# Core functions
|
| 61 |
+
# ──────────────────────────────────────────────────────────────────────────────
|
| 62 |
+
|
| 63 |
+
def extract_dates(text: str) -> list[tuple[str, datetime]]:
|
| 64 |
+
"""Extract and normalize dates from OCR text.
|
| 65 |
+
|
| 66 |
+
Returns a list of (raw_string, datetime) pairs sorted chronologically.
|
| 67 |
+
Uses regex first; falls back to scanning NER DATE entities if nothing found.
|
| 68 |
+
"""
|
| 69 |
+
found: list[tuple[str, datetime]] = []
|
| 70 |
+
|
| 71 |
+
for m in _DATE_RE.finditer(text):
|
| 72 |
+
raw = m.group(0).strip()
|
| 73 |
+
context_before = text[max(0, m.start() - 35): m.start()].lower()
|
| 74 |
+
if any(kw in context_before for kw in _BIRTH_KW):
|
| 75 |
+
continue
|
| 76 |
+
dt = _try_dateparser(raw)
|
| 77 |
+
if dt:
|
| 78 |
+
found.append((raw, dt))
|
| 79 |
+
|
| 80 |
+
# NER fallback
|
| 81 |
+
if not found:
|
| 82 |
+
try:
|
| 83 |
+
from core.ner import ner_extract
|
| 84 |
+
_, ner_md = ner_extract(text)
|
| 85 |
+
for raw in re.findall(r"\*\*([^*]+)\*\*\s*`DATE`", ner_md or ""):
|
| 86 |
+
dt = _try_dateparser(raw)
|
| 87 |
+
if dt:
|
| 88 |
+
found.append((raw, dt))
|
| 89 |
+
except Exception:
|
| 90 |
+
pass
|
| 91 |
+
|
| 92 |
+
# De-duplicate by normalized date
|
| 93 |
+
seen: set[str] = set()
|
| 94 |
+
unique: list[tuple[str, datetime]] = []
|
| 95 |
+
for raw, dt in found:
|
| 96 |
+
key = dt.strftime("%Y-%m-%d")
|
| 97 |
+
if key not in seen:
|
| 98 |
+
seen.add(key)
|
| 99 |
+
unique.append((raw, dt))
|
| 100 |
+
|
| 101 |
+
return sorted(unique, key=lambda x: x[1])
|
| 102 |
+
|
| 103 |
+
|
| 104 |
+
def dating_rank(file_paths: list[str | Path]) -> str:
|
| 105 |
+
"""Rank documents by extracted date.
|
| 106 |
+
|
| 107 |
+
Args:
|
| 108 |
+
file_paths: List of image file paths (strings or Path objects).
|
| 109 |
+
|
| 110 |
+
Returns:
|
| 111 |
+
Markdown table with documents sorted chronologically.
|
| 112 |
+
"""
|
| 113 |
+
if not file_paths:
|
| 114 |
+
return "Carica almeno un'immagine di documento."
|
| 115 |
+
|
| 116 |
+
from core.ocr import get_easyocr
|
| 117 |
+
reader = get_easyocr()
|
| 118 |
+
rows: list[tuple[str, str, datetime | None]] = []
|
| 119 |
+
|
| 120 |
+
for fp in file_paths:
|
| 121 |
+
path = Path(fp)
|
| 122 |
+
name = path.name
|
| 123 |
+
try:
|
| 124 |
+
img = Image.open(path).convert("RGB")
|
| 125 |
+
img_np = np.array(img)
|
| 126 |
+
ocr_lines = reader.readtext(img_np, detail=0, paragraph=False)
|
| 127 |
+
text = "\n".join(ocr_lines)
|
| 128 |
+
dates = extract_dates(text)
|
| 129 |
+
if dates:
|
| 130 |
+
raw, dt = dates[-1] # most recent date = document date
|
| 131 |
+
rows.append((name, raw, dt))
|
| 132 |
+
else:
|
| 133 |
+
rows.append((name, "— data non trovata", None))
|
| 134 |
+
except Exception as e:
|
| 135 |
+
rows.append((name, f"Errore: {e}", None))
|
| 136 |
+
|
| 137 |
+
dated = [(n, r, dt) for n, r, dt in rows if dt is not None]
|
| 138 |
+
undated = [(n, r, dt) for n, r, dt in rows if dt is None]
|
| 139 |
+
dated.sort(key=lambda x: x[2])
|
| 140 |
+
sorted_rows = dated + undated
|
| 141 |
+
|
| 142 |
+
lines = [
|
| 143 |
+
"## Datazione Documenti — Risultati\n",
|
| 144 |
+
"| # | Documento | Data estratta | Data normalizzata |",
|
| 145 |
+
"|---|-----------|--------------|-------------------|",
|
| 146 |
+
]
|
| 147 |
+
for i, (name, raw, dt) in enumerate(sorted_rows, 1):
|
| 148 |
+
norm = dt.strftime("%Y-%m-%d") if dt else "—"
|
| 149 |
+
lines.append(f"| {i} | `{name}` | {raw} | {norm} |")
|
| 150 |
+
|
| 151 |
+
if not dated:
|
| 152 |
+
lines.append("\n> Nessuna data rilevata nei documenti caricati.")
|
| 153 |
+
else:
|
| 154 |
+
lines.append(
|
| 155 |
+
f"\n*{len(dated)} document{'o' if len(dated)==1 else 'i'} datato/i, "
|
| 156 |
+
f"{len(undated)} senza data.*"
|
| 157 |
+
)
|
| 158 |
+
|
| 159 |
+
return "\n".join(lines)
|
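The regex-then-filter flow of `extract_dates()` can be sketched with the standard library alone. This is a simplified illustration under stated assumptions: it covers only the long "10 gennaio 2024" form, and it swaps the module's `dateparser` call for a hand-written month table (`MONTHS`, `LONG_DATE`, and `extract_long_dates` are hypothetical names, not part of `core/dating.py`).

```python
import re
from datetime import datetime

# Hypothetical month lookup; the real module delegates parsing to dateparser.
MONTHS = {
    "gennaio": 1, "febbraio": 2, "marzo": 3, "aprile": 4,
    "maggio": 5, "giugno": 6, "luglio": 7, "agosto": 8,
    "settembre": 9, "ottobre": 10, "novembre": 11, "dicembre": 12,
}
LONG_DATE = re.compile(
    r"\b(\d{1,2})\s+(" + "|".join(MONTHS) + r")\s+(?:del\s+)?(\d{4})\b",
    re.IGNORECASE,
)
BIRTH_KW = ("nata", "nato", "nascita", "nasc.")


def extract_long_dates(text: str) -> list[tuple[str, datetime]]:
    """Return (raw, datetime) pairs sorted chronologically, skipping birth dates."""
    found: list[tuple[str, datetime]] = []
    seen: set[datetime] = set()
    for m in LONG_DATE.finditer(text):
        # Look at the 35 characters before the match for birth-date wording.
        context = text[max(0, m.start() - 35): m.start()].lower()
        if any(kw in context for kw in BIRTH_KW):
            continue
        day, month, year = int(m.group(1)), MONTHS[m.group(2).lower()], int(m.group(3))
        try:
            dt = datetime(year, month, day)
        except ValueError:
            continue  # e.g. "31 febbraio 2024"
        if dt not in seen:  # de-duplicate by normalized date
            seen.add(dt)
            found.append((m.group(0), dt))
    return sorted(found, key=lambda pair: pair[1])
```

The 35-character look-behind window is a heuristic: a birth keyword anywhere in that window suppresses the match, so two dates that sit close together in the text can interfere with each other.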
core/graphology.py
ADDED
|
@@ -0,0 +1,105 @@
|
| 1 |
+
"""
|
| 2 |
+
GraphoLab core — Graphological Feature Analysis.
|
| 3 |
+
|
| 4 |
+
Provides:
|
| 5 |
+
- grapho_analyse() extract graphological metrics from a handwriting image
|
| 6 |
+
"""
|
| 7 |
+
|
| 8 |
+
from __future__ import annotations
|
| 9 |
+
|
| 10 |
+
import cv2
|
| 11 |
+
import numpy as np
|
| 12 |
+
|
| 13 |
+
|
| 14 |
+
# ──────────────────────────────────────────────────────────────────────────────
|
| 15 |
+
# Core function
|
| 16 |
+
# ──────────────────────────────────────────────────────────────────────────────
|
| 17 |
+
|
| 18 |
+
def grapho_analyse(image: np.ndarray) -> tuple[str, np.ndarray]:
|
| 19 |
+
"""Analyse graphological features of a handwritten image.
|
| 20 |
+
|
| 21 |
+
Args:
|
| 22 |
+
image: RGB numpy array (H, W, 3).
|
| 23 |
+
|
| 24 |
+
Returns:
|
| 25 |
+
report_md: Markdown table with extracted metrics.
|
| 26 |
+
annotated: Annotated image (bounding boxes on letters) as numpy array.
|
| 27 |
+
"""
|
| 28 |
+
if image is None:
|
| 29 |
+
return "Carica un'immagine di scrittura a mano.", image
|
| 30 |
+
|
| 31 |
+
# Cap to 800 px: adaptive threshold is O(pixels × blockSize)
|
| 32 |
+
h0, w0 = image.shape[:2]
|
| 33 |
+
if max(h0, w0) > 800:
|
| 34 |
+
sc = 800 / max(h0, w0)
|
| 35 |
+
image = cv2.resize(image, (int(w0 * sc), int(h0 * sc)), interpolation=cv2.INTER_AREA)
|
| 36 |
+
|
| 37 |
+
gray = cv2.cvtColor(image, cv2.COLOR_RGB2GRAY) if len(image.shape) == 3 else image
|
| 38 |
+
binary = cv2.adaptiveThreshold(
|
| 39 |
+
cv2.GaussianBlur(gray, (5, 5), 0), 255,
|
| 40 |
+
cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY_INV, 31, 10,
|
| 41 |
+
)
|
| 42 |
+
|
| 43 |
+
# Slant
|
| 44 |
+
contours, _ = cv2.findContours(binary, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
|
| 45 |
+
angles = []
|
| 46 |
+
for cnt in contours:
|
| 47 |
+
if cv2.contourArea(cnt) >= 20 and len(cnt) >= 5:
|
| 48 |
+
_, _, angle = cv2.fitEllipse(cnt)
|
| 49 |
+
slant = angle - 90.0
|
| 50 |
+
if -60 < slant < 60:
|
| 51 |
+
angles.append(slant)
|
| 52 |
+
slant_mean = float(np.mean(angles)) if angles else 0.0
|
| 53 |
+
slant_std = float(np.std(angles)) if angles else 0.0
|
| 54 |
+
|
| 55 |
+
# Pressure
|
| 56 |
+
ink_mask = binary > 0
|
| 57 |
+
pressure = (255 - gray)[ink_mask]
|
| 58 |
+
pressure_mean = float(pressure.mean()) if len(pressure) else 0.0
|
| 59 |
+
|
| 60 |
+
# Connected components
|
| 61 |
+
num, _, stats, _ = cv2.connectedComponentsWithStats(binary, 8)
|
| 62 |
+
valid = stats[1:][stats[1:, cv2.CC_STAT_AREA] > 15] if num > 1 else np.zeros((0, 5))
|
| 63 |
+
h_mean = float(valid[:, cv2.CC_STAT_HEIGHT].mean()) if len(valid) else 0.0
|
| 64 |
+
w_mean = float(valid[:, cv2.CC_STAT_WIDTH].mean()) if len(valid) else 0.0
|
| 65 |
+
|
| 66 |
+
# Word spacing
|
| 67 |
+
h_proj = binary.sum(axis=0)
|
| 68 |
+
gaps = []
|
| 69 |
+
in_gap, gap_w = False, 0
|
| 70 |
+
for v in h_proj:
|
| 71 |
+
if v == 0:
|
| 72 |
+
in_gap = True
|
| 73 |
+
gap_w += 1
|
| 74 |
+
elif in_gap:
|
| 75 |
+
if gap_w > 5:
|
| 76 |
+
gaps.append(gap_w)
|
| 77 |
+
in_gap = False
|
| 78 |
+
gap_w = 0
|
| 79 |
+
word_spacing = float(np.mean(gaps)) if gaps else 0.0
|
| 80 |
+
|
| 81 |
+
ink_density = ink_mask.mean() * 100
|
| 82 |
+
|
| 83 |
+
# Annotated visualisation
|
| 84 |
+
vis = cv2.cvtColor(binary, cv2.COLOR_GRAY2RGB)
|
| 85 |
+
for cnt in contours:
|
| 86 |
+
if cv2.contourArea(cnt) >= 20:
|
| 87 |
+
x, y, w, h = cv2.boundingRect(cnt)
|
| 88 |
+
cv2.rectangle(vis, (x, y), (x + w, y + h), (0, 180, 255), 1)
|
| 89 |
+
|
| 90 |
+
slant_dir = "destra" if slant_mean > 0 else ("sinistra" if slant_mean < 0 else "verticale")
|
| 91 |
+
report_md = (
|
| 92 |
+
f"**Analisi delle Caratteristiche Grafologiche**\n\n"
|
| 93 |
+
f"| Caratteristica | Valore |\n"
|
| 94 |
+
f"|----------------|--------|\n"
|
| 95 |
+
f"| Inclinazione media lettere | {slant_mean:+.1f}° ({slant_dir}) |\n"
|
| 96 |
+
f"| Variazione inclinazione (σ) | {slant_std:.1f}° |\n"
|
| 97 |
+
f"| Pressione del tratto | {pressure_mean:.1f} / 255 |\n"
|
| 98 |
+
f"| Altezza media lettere | {h_mean:.1f} px |\n"
|
| 99 |
+
f"| Larghezza media lettere | {w_mean:.1f} px |\n"
|
| 100 |
+
f"| Spaziatura media parole | {word_spacing:.1f} px |\n"
|
| 101 |
+
f"| Densità inchiostro | {ink_density:.2f}% |\n"
|
| 102 |
+
f"| Componenti connesse | {len(valid)} |\n\n"
|
| 103 |
+
f"*I bounding box delle lettere sono visibili nell'immagine annotata.*"
|
| 104 |
+
)
|
| 105 |
+
return report_md, vis
|
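The word-spacing metric in `grapho_analyse()` is a run-length scan over the column-wise ink projection. The sketch below reproduces that scan on a plain list so it can be read without OpenCV; `gap_widths` and its `min_width` parameter are hypothetical names chosen for illustration.

```python
def gap_widths(column_ink: list[int], min_width: int = 5) -> list[int]:
    """Return the widths of empty-column runs wider than `min_width`.

    A run is only recorded when an inked column closes it, so a trailing
    blank margin is ignored while a leading one is counted if wide enough.
    """
    gaps: list[int] = []
    run = 0
    for value in column_ink:
        if value == 0:
            run += 1          # still inside a blank stretch
        else:
            if run > min_width:
                gaps.append(run)
            run = 0           # ink resets the counter
    return gaps
```

Because the projection spans the whole image, this treats the page as a single text line: a column counts as blank only if no line of text has ink there, which likely underestimates spacing on multi-line samples.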
core/ner.py
ADDED
|
@@ -0,0 +1,95 @@
|
| 1 |
+
"""
|
| 2 |
+
GraphoLab core — Named Entity Recognition (NER).
|
| 3 |
+
|
| 4 |
+
Provides:
|
| 5 |
+
- get_ner() lazy loader for the NER pipeline
|
| 6 |
+
- ner_extract() extract named entities from text, returns structured result
|
| 7 |
+
"""
|
| 8 |
+
|
| 9 |
+
from __future__ import annotations
|
| 10 |
+
|
| 11 |
+
import os
|
| 12 |
+
import threading
|
| 13 |
+
|
| 14 |
+
from transformers import pipeline as hf_pipeline
|
| 15 |
+
|
| 16 |
+
# ──────────────────────────────────────────────────────────────────────────────
|
| 17 |
+
# Configuration
|
| 18 |
+
# ──────────────────────────────────────────────────────────────────────────────
|
| 19 |
+
|
| 20 |
+
NER_MODEL = "Babelscape/wikineural-multilingual-ner"
|
| 21 |
+
|
| 22 |
+
_NER_LABELS = {
|
| 23 |
+
"PER": "Persona",
|
| 24 |
+
"ORG": "Organizzazione",
|
| 25 |
+
"LOC": "Luogo",
|
| 26 |
+
"MISC": "Varie",
|
| 27 |
+
}
|
| 28 |
+
|
| 29 |
+
# ──────────────────────────────────────────────────────────────────────────────
|
| 30 |
+
# Lazy model loader
|
| 31 |
+
# ──────────────────────────────────────────────────────────────────────────────
|
| 32 |
+
|
| 33 |
+
_ner_pipeline = None
|
| 34 |
+
_ner_lock = threading.Lock()
|
| 35 |
+
|
| 36 |
+
|
| 37 |
+
def get_ner():
|
| 38 |
+
"""Return the NER pipeline, loading it on first call (thread-safe)."""
|
| 39 |
+
global _ner_pipeline
|
| 40 |
+
if _ner_pipeline is None:
|
| 41 |
+
with _ner_lock:
|
| 42 |
+
if _ner_pipeline is None:
|
| 43 |
+
import torch
|
| 44 |
+
device = 0 if torch.cuda.is_available() else -1
|
| 45 |
+
print("Loading NER model...")
|
| 46 |
+
_ner_pipeline = hf_pipeline(
|
| 47 |
+
"ner",
|
| 48 |
+
model=NER_MODEL,
|
| 49 |
+
aggregation_strategy="simple",
|
| 50 |
+
device=device,
|
| 51 |
+
)
|
| 52 |
+
return _ner_pipeline
|
| 53 |
+
|
| 54 |
+
|
| 55 |
+
# ──────────────────────────────────────────────────────────────────────────────
|
| 56 |
+
# Core function
|
| 57 |
+
# ──────────────────────────────────────────────────────────────────────────────
|
| 58 |
+
|
| 59 |
+
def ner_extract(text: str) -> tuple[list[tuple[str, str | None]], str]:
|
| 60 |
+
"""Extract named entities from *text*.
|
| 61 |
+
|
| 62 |
+
Returns:
|
| 63 |
+
highlighted: list of (span, label|None) suitable for Gradio HighlightedText
|
| 64 |
+
summary_md: Markdown table of detected entities
|
| 65 |
+
"""
|
| 66 |
+
if not text or not text.strip():
|
| 67 |
+
return [], "Inserisci del testo da analizzare."
|
| 68 |
+
|
| 69 |
+
nlp = get_ner()
|
| 70 |
+
entities = nlp(text)
|
| 71 |
+
|
| 72 |
+
# Build HighlightedText format: list of (span, label|None)
|
| 73 |
+
result: list[tuple[str, str | None]] = []
|
| 74 |
+
prev_end = 0
|
| 75 |
+
for ent in entities:
|
| 76 |
+
start, end = ent["start"], ent["end"]
|
| 77 |
+
if start > prev_end:
|
| 78 |
+
result.append((text[prev_end:start], None))
|
| 79 |
+
result.append((text[start:end], ent["entity_group"]))
|
| 80 |
+
prev_end = end
|
| 81 |
+
if prev_end < len(text):
|
| 82 |
+
result.append((text[prev_end:], None))
|
| 83 |
+
|
| 84 |
+
# Summary Markdown table
|
| 85 |
+
if entities:
|
| 86 |
+
rows = "\n".join(
|
| 87 |
+
f"| **{_NER_LABELS.get(e['entity_group'], e['entity_group'])}** "
|
| 88 |
+
f"(`{e['entity_group']}`) | {e['word']} | {e['score']:.0%} |"
|
| 89 |
+
for e in entities
|
| 90 |
+
)
|
| 91 |
+
summary_md = f"| Tipo | Entità | Confidenza |\n|------|--------|------------|\n{rows}"
|
| 92 |
+
else:
|
| 93 |
+
summary_md = "Nessuna entità trovata."
|
| 94 |
+
|
| 95 |
+
return result, summary_md
|
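The span-building loop in `ner_extract()` can be exercised without loading a model. In this sketch the entity dictionaries are hand-written stand-ins for what a Hugging Face token-classification pipeline with an aggregation strategy returns (`start`, `end`, `entity_group`); `to_highlighted` is a hypothetical helper name.

```python
from typing import Optional


def to_highlighted(text: str, entities: list[dict]) -> list[tuple[str, Optional[str]]]:
    """Split `text` into (span, label) pairs; unlabelled stretches get None.

    Assumes `entities` are sorted by position and do not overlap.
    """
    spans: list[tuple[str, Optional[str]]] = []
    prev_end = 0
    for ent in entities:
        start, end = ent["start"], ent["end"]
        if start > prev_end:
            spans.append((text[prev_end:start], None))  # plain text before the entity
        spans.append((text[start:end], ent["entity_group"]))
        prev_end = end
    if prev_end < len(text):
        spans.append((text[prev_end:], None))  # trailing plain text
    return spans
```

Concatenating the spans in order reproduces the input text, which is a convenient invariant to check when the entity offsets come from an upstream model.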
core/ocr.py
ADDED
|
@@ -0,0 +1,119 @@
|
| 1 |
+
"""
|
| 2 |
+
GraphoLab core — Optical Character Recognition (OCR).
|
| 3 |
+
|
| 4 |
+
Provides:
|
| 5 |
+
- get_trocr() lazy loader for TrOCR processor + model
|
| 6 |
+
- get_easyocr() lazy loader for EasyOCR reader (Italian + English)
|
| 7 |
+
- htr_transcribe() transcribe a handwritten image to text
|
| 8 |
+
"""
|
| 9 |
+
|
| 10 |
+
from __future__ import annotations
|
| 11 |
+
|
| 12 |
+
import threading
|
| 13 |
+
|
| 14 |
+
import cv2
|
| 15 |
+
import numpy as np
|
| 16 |
+
|
| 17 |
+
# ──────────────────────────────────────────────────────────────────────────────
|
| 18 |
+
# Configuration
|
| 19 |
+
# ──────────────────────────────────────────────────────────────────────────────
|
| 20 |
+
|
| 21 |
+
TROCR_MODEL = "microsoft/trocr-large-handwritten"
|
| 22 |
+
|
| 23 |
+
# ──────────────────────────────────────────────────────────────────────────────
|
| 24 |
+
# Lazy model loaders
|
| 25 |
+
# ──────────────────────────────────────────────────────────────────────────────
|
| 26 |
+
|
| 27 |
+
_trocr_processor = None
|
| 28 |
+
_trocr_model = None
|
| 29 |
+
_trocr_lock = threading.Lock()
|
| 30 |
+
|
| 31 |
+
_easyocr_reader = None
|
| 32 |
+
_easyocr_lock = threading.Lock()
|
| 33 |
+
|
| 34 |
+
|
| 35 |
+
def get_trocr():
|
| 36 |
+
"""Return (processor, model) for TrOCR, loading on first call (thread-safe)."""
|
| 37 |
+
global _trocr_processor, _trocr_model
|
| 38 |
+
if _trocr_model is None:  # the model is assigned last, so a half-finished load is not mistaken for ready
|
| 39 |
+
with _trocr_lock:
|
| 40 |
+
if _trocr_processor is None:
|
| 41 |
+
import torch
|
| 42 |
+
from transformers import TrOCRProcessor, VisionEncoderDecoderModel
|
| 43 |
+
device = "cuda" if torch.cuda.is_available() else "cpu"
|
| 44 |
+
print("Loading TrOCR...")
|
| 45 |
+
_trocr_processor = TrOCRProcessor.from_pretrained(TROCR_MODEL)
|
| 46 |
+
_trocr_model = VisionEncoderDecoderModel.from_pretrained(TROCR_MODEL).to(device)
|
| 47 |
+
_trocr_model.eval()
|
| 48 |
+
return _trocr_processor, _trocr_model
|
| 49 |
+
|
| 50 |
+
|
| 51 |
+
def get_easyocr():
|
| 52 |
+
"""Return the EasyOCR reader (Italian + English), loading on first call (thread-safe)."""
|
| 53 |
+
global _easyocr_reader
|
| 54 |
+
if _easyocr_reader is None:
|
| 55 |
+
with _easyocr_lock:
|
| 56 |
+
if _easyocr_reader is None:
|
| 57 |
+
import torch
|
| 58 |
+
import easyocr
|
| 59 |
+
gpu = torch.cuda.is_available()
|
| 60 |
+
print("Loading EasyOCR (Italian)...")
|
| 61 |
+
_easyocr_reader = easyocr.Reader(["it", "en"], gpu=gpu)
|
| 62 |
+
return _easyocr_reader
|
| 63 |
+
|
| 64 |
+
|
| 65 |
+
# ──────────────────────────────────────────────────────────────────────────────
|
| 66 |
+
# Internal helpers
|
| 67 |
+
# ──────────────────────────────────────────────────────────────────────────────
|
| 68 |
+
|
| 69 |
+
def _preprocess_for_htr(image: np.ndarray) -> np.ndarray:
|
| 70 |
+
"""Deskew + CLAHE contrast enhancement, keeping grayscale gradients for EasyOCR."""
|
| 71 |
+
if image.ndim == 3:
|
| 72 |
+
gray = cv2.cvtColor(image, cv2.COLOR_RGB2GRAY)
|
| 73 |
+
else:
|
| 74 |
+
gray = image.copy()
|
| 75 |
+
|
| 76 |
+
# Deskew via minAreaRect on ink pixels
|
| 77 |
+
_, bw = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
|
| 78 |
+
coords = np.column_stack(np.where(bw > 0))
|
| 79 |
+
if len(coords) > 100:
|
| 80 |
+
angle = cv2.minAreaRect(coords)[-1]
|
| 81 |
+
if angle < -45:
|
| 82 |
+
angle = 90 + angle
|
| 83 |
+
else:
|
| 84 |
+
angle = -angle
|
| 85 |
+
if abs(angle) > 0.3:
|
| 86 |
+
h, w = gray.shape
|
| 87 |
+
M = cv2.getRotationMatrix2D((w / 2, h / 2), angle, 1.0)
|
| 88 |
+
gray = cv2.warpAffine(
|
| 89 |
+
gray, M, (w, h),
|
| 90 |
+
flags=cv2.INTER_CUBIC,
|
| 91 |
+
borderMode=cv2.BORDER_REPLICATE,
|
| 92 |
+
)
|
| 93 |
+
|
| 94 |
+
# CLAHE contrast enhancement
|
| 95 |
+
clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
|
| 96 |
+
enhanced = clahe.apply(gray)
|
| 97 |
+
|
| 98 |
+
return cv2.cvtColor(enhanced, cv2.COLOR_GRAY2BGR)
|
| 99 |
+
|
| 100 |
+
|
| 101 |
+
# ──────────────────────────────────────────────────────────────────────────────
|
| 102 |
+
# Core function
|
| 103 |
+
# ──────────────────────────────────────────────────────────────────────────────
|
| 104 |
+
|
| 105 |
+
def htr_transcribe(image: np.ndarray) -> str:
|
| 106 |
+
"""Transcribe a handwritten image to text using EasyOCR.
|
| 107 |
+
|
| 108 |
+
Args:
|
| 109 |
+
image: RGB numpy array (H, W, 3) or grayscale (H, W).
|
| 110 |
+
|
| 111 |
+
Returns:
|
| 112 |
+
Transcribed text as a string. Returns an error message if image is None.
|
| 113 |
+
"""
|
| 114 |
+
if image is None:
|
| 115 |
+
return "Carica un'immagine di testo manoscritto."
|
| 116 |
+
reader = get_easyocr()
|
| 117 |
+
processed = _preprocess_for_htr(image)
|
| 118 |
+
results = reader.readtext(processed, detail=0, paragraph=True)
|
| 119 |
+
return "\n".join(results)
|
core/pipeline.py
ADDED
|
@@ -0,0 +1,330 @@
|
| 1 |
+
"""
|
| 2 |
+
GraphoLab core — Forensic Pipeline and PDF Report Generation.
|
| 3 |
+
|
| 4 |
+
Provides:
|
| 5 |
+
- run_pipeline_steps() run all 6 AI steps and return structured results
|
| 6 |
+
- generate_forensic_pdf() generate a PDF forensic report from pipeline results
|
| 7 |
+
"""
|
| 8 |
+
|
| 9 |
+
from __future__ import annotations
|
| 10 |
+
|
| 11 |
+
import io
|
| 12 |
+
import re
|
| 13 |
+
import tempfile
|
| 14 |
+
import unicodedata
|
| 15 |
+
from dataclasses import dataclass, field
|
| 16 |
+
from datetime import datetime
|
| 17 |
+
from pathlib import Path
|
| 18 |
+
from typing import Generator
|
| 19 |
+
|
| 20 |
+
import numpy as np
|
| 21 |
+
|
| 22 |
+
|
| 23 |
+
# ──────────────────────────────────────────────────────────────────────────────
|
| 24 |
+
# Data structures
|
| 25 |
+
# ──────────────────────────────────────────────────────────────────────────────
|
| 26 |
+
|
| 27 |
+
@dataclass
|
| 28 |
+
class PipelineResults:
|
| 29 |
+
"""Structured results from the forensic pipeline."""
|
| 30 |
+
# Step 1 — Signature Detection
|
| 31 |
+
sig_detect_image: np.ndarray | None = None
|
| 32 |
+
sig_detect_summary: str = ""
|
| 33 |
+
sig_crop: np.ndarray | None = None
|
| 34 |
+
|
| 35 |
+
# Step 2 — HTR
|
| 36 |
+
htr_text: str = ""
|
| 37 |
+
|
| 38 |
+
# Step 3 — NER
|
| 39 |
+
ner_highlighted: list = field(default_factory=list)
|
| 40 |
+
ner_summary: str = ""
|
| 41 |
+
|
| 42 |
+
# Step 4 — Writer Identification
|
| 43 |
+
writer_report: str = ""
|
| 44 |
+
writer_chart: np.ndarray | None = None
|
| 45 |
+
|
| 46 |
+
# Step 5 — Graphological Analysis
|
| 47 |
+
grapho_report: str = ""
|
| 48 |
+
grapho_image: np.ndarray | None = None
|
| 49 |
+
|
| 50 |
+
# Step 6 — Signature Verification
|
| 51 |
+
sig_verify_report: str = ""
|
| 52 |
+
sig_verify_chart: np.ndarray | None = None
|
| 53 |
+
|
| 54 |
+
# Final integrated report
|
| 55 |
+
final_report: str = ""
|
| 56 |
+
|
| 57 |
+
# Step 7 — LLM Synthesis
|
| 58 |
+
llm_report: str = ""
|
| 59 |
+
|
| 60 |
+
|
| 61 |
+
# ──────────────────────────────────────────────────────────────────────────────
|
| 62 |
+
# Pipeline runner
|
| 63 |
+
# ──────────────────────────────────────────────────────────────────────────────
|
| 64 |
+
|
| 65 |
+
def run_pipeline_steps(
|
| 66 |
+
doc_image: np.ndarray,
|
| 67 |
+
ref_sig: np.ndarray | None,
|
| 68 |
+
signet_weights: Path,
|
| 69 |
+
writer_samples_dir: Path,
|
| 70 |
+
on_progress: callable | None = None,
|
| 71 |
+
) -> Generator[PipelineResults, None, None]:
|
| 72 |
+
"""Run all forensic pipeline steps, yielding partial results after each step.
|
| 73 |
+
|
| 74 |
+
Args:
|
| 75 |
+
doc_image: Document image (RGB numpy array).
|
| 76 |
+
ref_sig: Optional reference signature for step 6.
|
| 77 |
+
signet_weights: Path to signet.pth weights.
|
| 78 |
+
writer_samples_dir: Path to data/samples/ for writer model.
|
| 79 |
+
on_progress: Optional callback(step: int, total: int, desc: str).
|
| 80 |
+
|
| 81 |
+
Yields:
|
| 82 |
+
PipelineResults after each step (progressively filled).
|
| 83 |
+
"""
|
| 84 |
+
from core.signature import detect_and_crop, sig_verify
|
| 85 |
+
from core.ocr import htr_transcribe
|
| 86 |
+
from core.ner import ner_extract
|
| 87 |
+
from core.writer import writer_identify
|
| 88 |
+
from core.graphology import grapho_analyse
|
| 89 |
+
from core.rag import pipeline_llm_synthesis
|
| 90 |
+
|
| 91 |
+
results = PipelineResults()
|
| 92 |
+
total = 7
|
| 93 |
+
|
| 94 |
+
def _progress(step: int, desc: str):
|
| 95 |
+
if on_progress:
|
| 96 |
+
on_progress(step, total, desc)
|
| 97 |
+
|
| 98 |
+
# Step 1 — Signature Detection
|
| 99 |
+
_progress(1, "Rilevamento firma…")
|
| 100 |
+
results.sig_detect_image, results.sig_crop, results.sig_detect_summary = detect_and_crop(doc_image)
|
| 101 |
+
yield results
|
| 102 |
+
|
| 103 |
+
# Step 2 — HTR
|
| 104 |
+
_progress(2, "Trascrizione HTR…")
|
| 105 |
+
results.htr_text = htr_transcribe(doc_image)
|
| 106 |
+
yield results
|
| 107 |
+
|
| 108 |
+
# Step 3 — NER
|
| 109 |
+
_progress(3, "Riconoscimento entità…")
|
| 110 |
+
text_for_ner = results.htr_text if results.htr_text and results.htr_text.strip() else ""
|
| 111 |
+
if text_for_ner:
|
| 112 |
+
results.ner_highlighted, results.ner_summary = ner_extract(text_for_ner)
|
| 113 |
+
else:
|
| 114 |
+
results.ner_highlighted = []
|
| 115 |
+
results.ner_summary = "Nessun testo trascritto disponibile per il NER."
|
| 116 |
+
yield results
|
| 117 |
+
|
| 118 |
+
# Step 4 — Writer Identification
|
| 119 |
+
_progress(4, "Identificazione scrittore…")
|
| 120 |
+
results.writer_report, results.writer_chart = writer_identify(doc_image, writer_samples_dir)
|
| 121 |
+
yield results
|
| 122 |
+
|
| 123 |
+
# Step 5 — Graphological Analysis
|
| 124 |
+
_progress(5, "Analisi grafologica…")
|
| 125 |
+
results.grapho_report, results.grapho_image = grapho_analyse(doc_image)
|
| 126 |
+
yield results
|
| 127 |
+
|
| 128 |
+
# Step 6 — Signature Verification
|
| 129 |
+
_progress(6, "Verifica firma…")
|
| 130 |
+
if ref_sig is not None:
|
| 131 |
+
query_for_verify = results.sig_crop if results.sig_crop is not None else doc_image
|
| 132 |
+
results.sig_verify_report, results.sig_verify_chart = sig_verify(
|
| 133 |
+
ref_sig, None, query_for_verify, signet_weights
|
| 134 |
+
)
|
| 135 |
+
if results.sig_crop is None:
|
| 136 |
+
results.sig_verify_report += "\n\n⚠️ Nessuna firma estratta — confronto eseguito sull'immagine intera."
|
| 137 |
+
else:
|
| 138 |
+
results.sig_verify_report = (
|
| 139 |
+
"Firma di riferimento non fornita.\n\n"
|
| 140 |
+
"Per abilitare questo step carica una firma autentica nota "
|
| 141 |
+
"nel campo 'Firma di riferimento' sopra."
|
| 142 |
+
)
|
| 143 |
+
results.sig_verify_chart = None
|
| 144 |
+
yield results
|
| 145 |
+
|
| 146 |
+
# Integrated report
|
| 147 |
+
results.final_report = (
|
| 148 |
+
"## Referto Forense Integrato\n\n"
|
| 149 |
+
"---\n\n"
|
| 150 |
+
f"### Step 1 — Rilevamento Firma\n{results.sig_detect_summary}\n\n"
|
| 151 |
+
f"### Step 2 — Trascrizione HTR\n```\n{results.htr_text}\n```\n\n"
|
| 152 |
+
f"### Step 3 — Entità Nominate\n{results.ner_summary}\n\n"
|
| 153 |
+
f"### Step 4 — Identificazione Scrittore\n{results.writer_report}\n\n"
|
| 154 |
+
f"### Step 5 — Caratteristiche Grafologiche\n{results.grapho_report}\n\n"
|
| 155 |
+
f"### Step 6 — Verifica Firma\n{results.sig_verify_report}\n\n"
|
| 156 |
+
"---\n\n"
|
| 157 |
+
"*Referto generato automaticamente da GraphoLab. "
|
| 158 |
+
"Tutti i risultati hanno carattere indicativo e devono essere valutati "
|
| 159 |
+
"da un perito calligrafo qualificato.*"
|
| 160 |
+
)
|
| 161 |
+
yield results
|
| 162 |
+
|
| 163 |
+
# Step 7 — LLM Synthesis
|
| 164 |
+
_progress(7, "Sintesi LLM…")
|
| 165 |
+
results.llm_report = pipeline_llm_synthesis(
|
| 166 |
+
results.sig_detect_summary,
|
| 167 |
+
results.htr_text,
|
| 168 |
+
results.ner_summary,
|
| 169 |
+
results.writer_report,
|
| 170 |
+
results.grapho_report,
|
| 171 |
+
results.sig_verify_report,
|
| 172 |
+
)
|
| 173 |
+
yield results
|
| 174 |
+
|
| 175 |
+
|
| 176 |
+
# ──────────────────────────────────────────────────────────────────────────────
|
| 177 |
+
# PDF report generation
|
| 178 |
+
# ──────────────────────────────────────────────────────────────────────────────
|
| 179 |
+
|
| 180 |
+
def generate_forensic_pdf(results: PipelineResults) -> str:
|
| 181 |
+
"""Generate a PDF forensic report from pipeline results. Returns the file path."""
|
| 182 |
+
from fpdf import FPDF
|
| 183 |
+
from PIL import Image as _PILImage
|
| 184 |
+
|
| 185 |
+
def _to_latin1(text: str) -> str:
|
| 186 |
+
if not text:
|
| 187 |
+
return ""
|
| 188 |
+
replacements = {
|
| 189 |
+
"\u2014": "-", "\u2013": "-",
|
| 190 |
+
"\u2018": "'", "\u2019": "'",
|
| 191 |
+
"\u201c": '"', "\u201d": '"',
|
| 192 |
+
"\u2026": "...",
|
| 193 |
+
"\u2022": "*",
|
| 194 |
+
"\u2713": "v", "\u2714": "v",
|
| 195 |
+
"\u2718": "x", "\u2716": "x",
|
| 196 |
+
"\U0001f947": "1.", "\U0001f948": "2.", "\U0001f949": "3.",
|
| 197 |
+
"\u26a0\ufe0f": "(!)", "\u26a0": "(!)",
|
| 198 |
+
"\U0001f50d": "",
|
| 199 |
+
"\U0001f5d1": "",
|
| 200 |
+
}
|
| 201 |
+
for src, dst in replacements.items():
|
| 202 |
+
text = text.replace(src, dst)
|
| 203 |
+
return text.encode("latin-1", errors="replace").decode("latin-1")
|
| 204 |
+
|
| 205 |
+
def _md_to_plain(text: str) -> str:
|
| 206 |
+
if not text:
|
| 207 |
+
return ""
|
| 208 |
+
def _table_row_to_plain(m):
|
| 209 |
+
cells = [c.strip() for c in m.group(0).strip("|").split("|")]
|
| 210 |
+
return " | ".join(c for c in cells if c)
|
| 211 |
+
text = re.sub(r"^[-| ]+$", "", text, flags=re.MULTILINE)
|
| 212 |
+
text = re.sub(r"^\|.*\|$", _table_row_to_plain, text, flags=re.MULTILINE)
|
| 213 |
+
text = re.sub(r"^#{1,6}\s+", "", text, flags=re.MULTILINE)
|
| 214 |
+
text = re.sub(r"\*{1,2}(.+?)\*{1,2}", r"\1", text)
|
| 215 |
+
text = re.sub(r"`{1,3}[^`]*`{1,3}", "", text)
|
| 216 |
+
text = re.sub(r"\n{3,}", "\n\n", text)
|
| 217 |
+
return _to_latin1(text.strip())
|
| 218 |
+
|
| 219 |
+
def _numpy_to_jpeg_bytes(arr) -> bytes | None:
|
| 220 |
+
if arr is None:
|
| 221 |
+
return None
|
| 222 |
+
try:
|
| 223 |
+
img = _PILImage.fromarray(arr.astype("uint8")).convert("RGB")  # JPEG cannot store alpha; RGBA charts would otherwise be dropped silently
|
| 224 |
+
buf = io.BytesIO()
|
| 225 |
+
img.save(buf, format="JPEG", quality=85)
|
| 226 |
+
return buf.getvalue()
|
| 227 |
+
except Exception:
|
| 228 |
+
return None
|
| 229 |
+
|
| 230 |
+
class ForensicPDF(FPDF):
|
| 231 |
+
def header(self):
|
| 232 |
+
self.set_font("Helvetica", "B", 10)
|
| 233 |
+
self.set_text_color(80, 80, 80)
|
| 234 |
+
self.cell(0, 8, "GraphoLab - Referto Forense Integrato", align="C")
|
| 235 |
+
self.ln(2)
|
| 236 |
+
self.set_draw_color(180, 180, 180)
|
| 237 |
+
self.line(10, self.get_y(), 200, self.get_y())
|
| 238 |
+
self.ln(4)
|
| 239 |
+
|
| 240 |
+
def footer(self):
|
| 241 |
+
self.set_y(-15)
|
| 242 |
+
self.set_font("Helvetica", "I", 8)
|
| 243 |
+
self.set_text_color(130, 130, 130)
|
| 244 |
+
self.cell(0, 10, f"Pagina {self.page_no()} - Generato da GraphoLab", align="C")
|
| 245 |
+
|
| 246 |
+
pdf = ForensicPDF()
|
| 247 |
+
pdf.set_auto_page_break(auto=True, margin=18)
|
| 248 |
+
pdf.add_page()
|
| 249 |
+
|
| 250 |
+
pdf.set_font("Helvetica", "B", 18)
|
| 251 |
+
pdf.set_text_color(30, 30, 30)
|
| 252 |
+
pdf.cell(0, 12, "Referto Forense Integrato", align="C")
|
| 253 |
+
pdf.ln(4)
|
| 254 |
+
pdf.set_font("Helvetica", "", 10)
|
| 255 |
+
pdf.set_text_color(100, 100, 100)
|
| 256 |
+
now = datetime.now().strftime("%d/%m/%Y %H:%M")
|
| 257 |
+
pdf.cell(0, 8, f"Data generazione: {now}", align="C")
|
| 258 |
+
pdf.ln(10)
|
| 259 |
+
|
| 260 |
+
def _section_title(title: str):
|
| 261 |
+
pdf.set_font("Helvetica", "B", 12)
|
| 262 |
+
pdf.set_text_color(255, 255, 255)
|
| 263 |
+
pdf.set_fill_color(50, 80, 120)
|
| 264 |
+
pdf.cell(0, 8, _to_latin1(f" {title}"), fill=True)
|
| 265 |
+
pdf.ln(12)
|
| 266 |
+
pdf.set_text_color(30, 30, 30)
|
| 267 |
+
|
| 268 |
+
def _body_text(text: str):
|
| 269 |
+
if not text:
|
| 270 |
+
return
|
| 271 |
+
pdf.set_font("Helvetica", "", 10)
|
| 272 |
+
pdf.set_text_color(40, 40, 40)
|
| 273 |
+
pdf.multi_cell(0, 5, _md_to_plain(text))
|
| 274 |
+
pdf.ln(3)
|
| 275 |
+
|
| 276 |
+
def _embed_image(arr, max_w: int = 170):
|
| 277 |
+
data = _numpy_to_jpeg_bytes(arr)
|
| 278 |
+
if data is None:
|
| 279 |
+
return
|
| 280 |
+
buf = io.BytesIO(data)
|
| 281 |
+
img = _PILImage.open(buf)
|
| 282 |
+
w, h = img.size
|
| 283 |
+
ratio = min(max_w / w, 100 / h)
|
| 284 |
+
disp_w, disp_h = w * ratio, h * ratio
|
| 285 |
+
buf.seek(0)
|
| 286 |
+
x = (210 - disp_w) / 2
|
| 287 |
+
pdf.image(buf, x=x, w=disp_w, h=disp_h)
|
| 288 |
+
pdf.ln(4)
|
| 289 |
+
|
| 290 |
+
_section_title("Step 1 — Rilevamento Firma (YOLOv8)")
|
| 291 |
+
_body_text(results.sig_detect_summary)
|
| 292 |
+
_embed_image(results.sig_detect_image)
|
| 293 |
+
|
| 294 |
+
_section_title("Step 2 — Trascrizione HTR (EasyOCR)")
|
| 295 |
+
_body_text(results.htr_text)
|
| 296 |
+
|
| 297 |
+
_section_title("Step 3 — Riconoscimento Entita' (NER)")
|
| 298 |
+
_body_text(results.ner_summary or "Nessuna entita' rilevata nel testo trascritto.")
|
| 299 |
+
|
| 300 |
+
_section_title("Step 4 — Identificazione Scrittore")
|
| 301 |
+
_body_text(results.writer_report)
|
| 302 |
+
_embed_image(results.writer_chart)
|
| 303 |
+
|
| 304 |
+
_section_title("Step 5 — Analisi Grafologica")
|
| 305 |
+
_body_text(results.grapho_report)
|
| 306 |
+
_embed_image(results.grapho_image)
|
| 307 |
+
|
| 308 |
+
_section_title("Step 6 — Verifica Firma (SigNet)")
|
| 309 |
+
_body_text(results.sig_verify_report)
|
| 310 |
+
_embed_image(results.sig_verify_chart)
|
| 311 |
+
|
| 312 |
+
_section_title("Step 7 — Valutazione LLM (Ollama)")
|
| 313 |
+
_body_text(results.llm_report)
|
| 314 |
+
|
| 315 |
+
pdf.ln(6)
|
| 316 |
+
pdf.set_draw_color(180, 180, 180)
|
| 317 |
+
pdf.line(10, pdf.get_y(), 200, pdf.get_y())
|
| 318 |
+
pdf.ln(4)
|
| 319 |
+
pdf.set_font("Helvetica", "I", 8)
|
| 320 |
+
pdf.set_text_color(120, 120, 120)
|
| 321 |
+
pdf.multi_cell(
|
| 322 |
+
0, 4,
|
| 323 |
+
"Referto generato automaticamente da GraphoLab. "
|
| 324 |
+
"Tutti i risultati hanno carattere indicativo e devono essere valutati "
|
| 325 |
+
"da un perito calligrafo qualificato.",
|
| 326 |
+
)
|
| 327 |
+
|
| 328 |
+
tmp = tempfile.NamedTemporaryFile(suffix=".pdf", prefix="grapholab_referto_", delete=False)
tmp.close()  # release the handle so the path can be reopened for writing on every platform
|
| 329 |
+
pdf.output(tmp.name)
|
| 330 |
+
return tmp.name
|
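The generator contract used by `run_pipeline_steps` (one shared results object, yielded after every step, with an optional progress callback) can be sketched in isolation. This is only a minimal illustration under stated assumptions: `DemoResults` and `run_demo_steps` are hypothetical stand-ins, not part of `core/`, and the two "steps" are trivial string operations rather than model calls.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable, Generator, Optional


@dataclass
class DemoResults:
    """Hypothetical stand-in for PipelineResults with two text fields."""
    first: str = ""
    second: str = ""


def run_demo_steps(
    text: str,
    on_progress: Optional[Callable[[int, int, str], None]] = None,
) -> Generator[DemoResults, None, None]:
    """Yield the same results object after each step, progressively filled."""
    results = DemoResults()
    total = 2

    def _progress(step: int, desc: str) -> None:
        # The callback is optional, mirroring the on_progress parameter above.
        if on_progress:
            on_progress(step, total, desc)

    _progress(1, "uppercase")
    results.first = text.upper()
    yield results

    _progress(2, "reverse")
    results.second = text[::-1]
    yield results


progress_log = []
snapshots = []
for partial in run_demo_steps(
    "abc", on_progress=lambda s, t, d: progress_log.append((s, t, d))
):
    # The generator yields the same mutable object every time, so copy
    # whatever must be kept from each intermediate state.
    snapshots.append((partial.first, partial.second))
```

Because every `yield` hands back the same mutable instance, a caller that stores the yielded object itself (rather than copying fields) will only ever see the final state. A UI wrapper that re-renders on each iteration is unaffected by this.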
core/rag.py
ADDED
|
@@ -0,0 +1,619 @@
|
| 1 |
+
"""
|
| 2 |
+
GraphoLab core — RAG (Retrieval-Augmented Generation) + Ollama integration.
|
| 3 |
+
|
| 4 |
+
Provides:
|
| 5 |
+
- check_ollama() check whether Ollama server is reachable
|
| 6 |
+
- ollama_list_models() list available models
|
| 7 |
+
- set_rag_model() change the active generation model
|
| 8 |
+
- rag_load_docs() load synthetic + cached documents at startup
|
| 9 |
+
- rag_add_docs() index new uploaded PDF/DOCX files
|
| 10 |
+
- rag_remove_doc() remove a document from the index
|
| 11 |
+
- rag_doc_list() list indexed documents
|
| 12 |
+
- rag_doc_choices() list indexed document names
|
| 13 |
+
- rag_retrieve() retrieve top-k chunks for a query
|
| 14 |
+
- stream_ollama() stream tokens from Ollama /api/generate
|
| 15 |
+
- rag_chat_stream() full RAG chat: retrieve + build prompt + stream tokens
|
| 16 |
+
- pipeline_llm_synthesis() LLM synthesis of forensic pipeline results
|
| 17 |
+
"""
|
| 18 |
+
|
| 19 |
+
from __future__ import annotations
|
| 20 |
+
|
| 21 |
+
import hashlib
|
| 22 |
+
import json
|
| 23 |
+
import threading
|
| 24 |
+
from pathlib import Path
|
| 25 |
+
from typing import Generator
|
| 26 |
+
|
| 27 |
+
import numpy as np
|
| 28 |
+
import requests
|
| 29 |
+
|
| 30 |
+
# ──────────────────────────────────────────────────────────────────────────────
|
| 31 |
+
# Configuration
|
| 32 |
+
# ──────────────────────────────────────────────────────────────────────────────
|
| 33 |
+
|
| 34 |
+
OLLAMA_URL = "http://localhost:11434"
|
| 35 |
+
OLLAMA_MODEL = "llama3.2"
|
| 36 |
+
|
| 37 |
+
_embed_model = OLLAMA_MODEL # embedding model — changing it invalidates cache
|
| 38 |
+
_rag_model = OLLAMA_MODEL # generation model — selectable via UI
|
| 39 |
+
|
| 40 |
+
# ──────────────────────────────────────────────────────────────────────────────
|
| 41 |
+
# In-memory state
|
| 42 |
+
# ──────────────────────────────────────────────────────────────────────────────
|
| 43 |
+
|
| 44 |
+
_rag_chunks: list = []
|
| 45 |
+
_rag_indexed_files: set = set()
|
| 46 |
+
_rag_ready = False
|
| 47 |
+
_rag_lock = threading.Lock()
|
| 48 |
+
|
| 49 |
+
# Built-in synthetic knowledge base
|
| 50 |
+
_RAG_SYNTHETIC_DOCS = [
|
| 51 |
+
(
|
| 52 |
+
"Analisi della pressione",
|
| 53 |
+
"La pressione grafica indica la forza con cui la penna o la matita viene premuta sul foglio. "
|
| 54 |
+
"Una pressione forte (tratti profondi, rilevabili anche sul retro del foglio) è associata a "
|
| 55 |
+
"carattere deciso, vitalità e a volte aggressività. Una pressione leggera (tratti quasi "
|
| 56 |
+
"impercettibili) può indicare sensibilità, adattabilità o, in contesti patologici, stanchezza "
|
| 57 |
+
"e astenia. La pressione irregolare — alternanza di tratti forti e deboli nello stesso scritto — "
|
| 58 |
+
"può segnalare instabilità emotiva, stati di ansia o condizioni neurologiche. In grafologia "
|
| 59 |
+
"forense la pressione è fondamentale per distinguere scritture apposte in condizioni normali "
|
| 60 |
+
"da quelle prodotte sotto costrizione fisica o psicologica.",
|
| 61 |
+
),
|
| 62 |
+
(
|
| 63 |
+
"Inclinazione del tratto",
|
| 64 |
+
"L'inclinazione della scrittura descrive l'angolo dei tratti verticali delle lettere rispetto "
|
| 65 |
+
"alla riga di base. Una scrittura verticale (0°) indica equilibrio e obiettività. "
|
| 66 |
+
"L'inclinazione a destra (>15°) è associata a estroversia, impulsività e orientamento verso "
|
| 67 |
+
"il futuro. L'inclinazione a sinistra (<−10°) può indicare introversione, tendenza al ripiegamento "
|
| 68 |
+
"su se stessi o, in contesti forensi, un tentativo di camuffare la propria calligrafia. "
|
| 69 |
+
"L'inclinazione variabile (misto destra/sinistra nello stesso testo) è indicatore di "
|
| 70 |
+
"instabilità emotiva. La misurazione forense dell'inclinazione avviene tramite analisi "
|
| 71 |
+
"angolare dei tratti ascendenti (h, l, b, f) e discendenti (g, p, q).",
|
| 72 |
+
),
|
| 73 |
+
(
|
| 74 |
+
"Spaziatura grafica",
|
| 75 |
+
"La spaziatura riguarda la distanza tra lettere, parole e righe. Spaziatura ampia tra le parole "
|
| 76 |
+
"indica bisogno di spazio personale, pensiero indipendente e, talvolta, solitudine. "
|
| 77 |
+
"Spaziatura ridotta (parole quasi attaccate) è correlata a socievolezza eccessiva, difficoltà "
|
| 78 |
+
"nei confini relazionali e, in casi estremi, pensiero confusionario. La spaziatura irregolare — "
|
| 79 |
+
"alternanza di parole distanti e ravvicinate — è un indicatore di disorganizzazione cognitiva "
|
| 80 |
+
"o di scrittura non spontanea (es. copiatura o dettatura lenta). In perizie forensi, "
|
| 81 |
+
"la spaziatura viene misurata in millimetri su campioni standardizzati.",
|
| 82 |
+
),
|
| 83 |
+
(
|
| 84 |
+
"Margini e layout",
|
| 85 |
+
"I margini del foglio riflettono il rapporto dello scrittore con l'ambiente e il contesto "
|
| 86 |
+
"sociale. Un margine sinistro ampio e costante indica rispetto delle regole e pianificazione. "
|
| 87 |
+
"Un margine sinistro che si allarga progressivamente (testo che 'scivola' verso destra) "
|
| 88 |
+
"suggerisce entusiasmo crescente o impulsività. Margine destro ampio è associato a prudenza, "
|
| 89 |
+
"timore del futuro e riservatezza. L'assenza di margini (testo che occupa tutto il foglio) "
|
| 90 |
+
"indica esuberanza comunicativa o senso di urgenza. In perizia, il margine aiuta a "
|
| 91 |
+
"distinguere scritti autentici da trascrizioni o copie, poiché l'autore mantiene "
|
| 92 |
+
"inconsciamente le proprie abitudini spaziali.",
|
| 93 |
+
),
|
| 94 |
+
(
|
| 95 |
+
"Firme autentiche",
|
| 96 |
+
"Una firma autentica possiede caratteristiche di naturalezza e fluidità del movimento. "
|
| 97 |
+
"I tratti sono continui, con accelerazione e decelerazione tipiche del gesto automatizzato. "
|
| 98 |
+
"La pressione varia in modo coerente con il ritmo del tratto. I legamenti tra le lettere "
|
| 99 |
+
"sono coerenti con il corpus grafico dello scrittore. La firma autentica presenta micro-tremori "
|
| 100 |
+
"naturali (diversi dai tremori patologici) e piccole variazioni tra esecuzioni successive, "
|
| 101 |
+
"mai perfettamente identiche. In perizia calligrafica, si confrontano almeno 10-15 firme "
|
| 102 |
+
"autentiche per stabilire la 'gamma di variazione naturale' prima di esaminare la firma contestata.",
|
| 103 |
+
),
|
| 104 |
+
(
|
| 105 |
+
"Firme false",
|
| 106 |
+
"Le firme contraffatte si distinguono per diversi indicatori: velocità di esecuzione "
|
| 107 |
+
"innaturalmente lenta (visibile nei 'tocchi' del pennino e nelle esitazioni), tremori "
|
| 108 |
+
"artificiali (regolari, non spontanei), ritocchi e correzioni del tratto, interruzioni "
|
| 109 |
+
"anomale del gesto. La falsificazione per imitazione diretta (calco o copia visiva) produce "
|
| 110 |
+
"una firma con aspetto simile all'originale ma con movimenti invertiti rispetto alla direzione "
|
| 111 |
+
"naturale. Il falsario tende a concentrarsi sulla forma complessiva trascurando i dettagli "
|
| 112 |
+
"minuti (proporzioni tra lettere, angolo di attacco del tratto, pressione). "
|
| 113 |
+
"L'analisi forense utilizza ingrandimenti 10x-40x e, nei casi dubbi, grafometria digitale.",
|
| 114 |
+
),
|
| 115 |
+
(
|
| 116 |
+
"Velocità e ritmo",
|
| 117 |
+
"La velocità di scrittura si manifesta nella forma delle lettere (semplificazione dei tratti "
|
| 118 |
+
"in scrittura rapida), nell'inclinazione (più marcata ad alta velocità), nelle legature "
|
| 119 |
+
"(frequenti in scrittura veloce, assenti in quella lenta). Il ritmo è la regolarità con cui "
|
| 120 |
+
"si alternano tensione e distensione nel movimento grafico. Un ritmo regolare indica "
|
| 121 |
+
"equilibrio psicofisico. Un ritmo aritmico (alternanza caotica di tratti tesi e distesi) "
|
| 122 |
+
"può segnalare stati emotivi alterati, patologie neurologiche o scrittura non spontanea. "
|
| 123 |
+
"In perizia forense la velocità è cruciale: una firma depositata 'lentamente' da una persona "
|
| 124 |
+
"abitualmente veloce è un forte indicatore di contraffazione.",
|
| 125 |
+
),
|
| 126 |
+
(
|
| 127 |
+
"Datazione documenti",
|
| 128 |
+
"La datazione grafica di un documento si basa su elementi intrinseci ed estrinseci. "
|
| 129 |
+
"Elementi intrinseci: evoluzione dello stile grafico dell'autore nel tempo (campioni noti "
|
| 130 |
+
"datati permettono di costruire una 'curva di evoluzione'), deterioramento della calligrafia "
|
| 131 |
+
"legato all'età, variazioni nelle abitudini di punteggiatura e abbreviazioni. "
|
| 132 |
+
"Elementi estrinseci: tipo di inchiostro (analisi spettroscopica), supporto cartaceo "
|
| 133 |
+
"(filigrana, composizione chimica), strumento di scrittura (biro, stilografica, matita). "
|
| 134 |
+
"L'analisi dell'inchiostro mediante cromatografia liquida può stabilire se l'inchiostro "
|
| 135 |
+
"è compatibile con la data dichiarata. In perizia, la datazione grafica va sempre "
|
| 136 |
+
"abbinata ad analisi chimiche per raggiungere un grado di certezza forense.",
|
| 137 |
+
),
|
| 138 |
+
]
|
| 139 |
+
|
| 140 |
+
|
| 141 |
+
# ──────────────────────────────────────────────────────────────────────────────
|
| 142 |
+
# Ollama helpers
|
| 143 |
+
# ──────────────────────────────────────────────────────────────────────────────
|
| 144 |
+
|
| 145 |
+
def check_ollama() -> bool:
|
| 146 |
+
"""Return True if Ollama server is reachable."""
|
| 147 |
+
try:
|
| 148 |
+
requests.get(f"{OLLAMA_URL}/api/tags", timeout=3)
|
| 149 |
+
return True
|
| 150 |
+
except Exception:
|
| 151 |
+
return False
|
| 152 |
+
|
| 153 |
+
|
| 154 |
+
def ollama_list_models() -> list[str]:
|
| 155 |
+
"""Return sorted list of model names available in Ollama."""
|
| 156 |
+
try:
|
| 157 |
+
r = requests.get(f"{OLLAMA_URL}/api/tags", timeout=3)
|
| 158 |
+
models = [m["name"] for m in r.json().get("models", [])]
|
| 159 |
+
return sorted(models) if models else [OLLAMA_MODEL]
|
| 160 |
+
except Exception:
|
| 161 |
+
return [OLLAMA_MODEL]
|
| 162 |
+
|
| 163 |
+
|
| 164 |
+
def set_rag_model(model_name: str) -> str:
|
| 165 |
+
"""Set the active Ollama generation model. Returns a status message."""
|
| 166 |
+
global _rag_model
|
| 167 |
+
if model_name:
|
| 168 |
+
_rag_model = model_name
|
| 169 |
+
return f"✅ Modello attivo: **{_rag_model}**"
|
| 170 |
+
|
| 171 |
+
|
| 172 |
+
def stream_ollama(prompt: str) -> Generator[str, None, None]:
|
| 173 |
+
"""Yield response tokens from Ollama one at a time (streaming)."""
|
| 174 |
+
with requests.post(
|
| 175 |
+
f"{OLLAMA_URL}/api/generate",
|
| 176 |
+
json={"model": _rag_model, "prompt": prompt, "stream": True},
|
| 177 |
+
stream=True,
|
| 178 |
+
timeout=120,
|
| 179 |
+
) as r:
|
| 180 |
+
for line in r.iter_lines():
|
| 181 |
+
if line:
|
| 182 |
+
data = json.loads(line)
|
| 183 |
+
if not data.get("done"):
|
| 184 |
+
yield data.get("response", "")
|
| 185 |
+
|
| 186 |
+
|
| 187 |
+
def _ollama_embed(text: str) -> np.ndarray | None:
|
| 188 |
+
try:
|
| 189 |
+
r = requests.post(
|
| 190 |
+
f"{OLLAMA_URL}/api/embeddings",
|
| 191 |
+
json={"model": _embed_model, "prompt": text},
|
| 192 |
+
timeout=30,
|
| 193 |
+
)
|
| 194 |
+
return np.array(r.json()["embedding"], dtype=np.float32)
|
| 195 |
+
except Exception:
|
| 196 |
+
return None
|
| 197 |
+
|
| 198 |
+
|
| 199 |
+
def _ollama_embed_batch(texts: list[str]) -> list[np.ndarray | None]:
|
| 200 |
+
"""Embed a list of texts. Falls back to sequential calls if batch endpoint unavailable."""
|
| 201 |
+
try:
|
| 202 |
+
r = requests.post(
|
| 203 |
+
f"{OLLAMA_URL}/api/embed",
|
| 204 |
+
json={"model": _embed_model, "input": texts},
|
| 205 |
+
timeout=max(30, len(texts) * 3),
|
| 206 |
+
)
|
| 207 |
+
r.raise_for_status()
|
| 208 |
+
data = r.json()
|
| 209 |
+
embeddings = data.get("embeddings") or data.get("embedding")
|
| 210 |
+
if embeddings and len(embeddings) == len(texts):
|
| 211 |
+
return [np.array(e, dtype=np.float32) for e in embeddings]
|
| 212 |
+
except Exception:
|
| 213 |
+
pass
|
| 214 |
+
return [_ollama_embed(t) for t in texts]
|
| 215 |
+
|
| 216 |
+
|
| 217 |
+
# ──────────────────────────────────────────────────────────────────────────────
|
| 218 |
+
# Cache helpers
|
| 219 |
+
# ──────────────────────────────────────────────────────────────────────────────
|
| 220 |
+
|
| 221 |
+
def _rag_cache_path(cache_dir: Path, filename: str, file_bytes: bytes) -> Path:
|
| 222 |
+
h = hashlib.sha256(file_bytes).hexdigest()[:8]
|
| 223 |
+
stem = Path(filename).stem[:40]
|
| 224 |
+
safe = "".join(c if c.isalnum() or c in "-_" else "_" for c in stem)
|
| 225 |
+
return cache_dir / f"{safe}_{h}.npz"
|
| 226 |
+
|
| 227 |
+
|
| 228 |
+
def _rag_cache_save(cache_path: Path, chunks: list, filename: str) -> None:
|
| 229 |
+
cache_path.parent.mkdir(parents=True, exist_ok=True)
|
| 230 |
+
good = [c for c in chunks if c["emb"] is not None]
|
| 231 |
+
if not good:
|
| 232 |
+
return
|
| 233 |
+
np.savez_compressed(
|
| 234 |
+
str(cache_path),
|
| 235 |
+
texts=np.array([c["text"] for c in good], dtype=object),
|
| 236 |
+
sources=np.array([c["source"] for c in good], dtype=object),
|
| 237 |
+
embs=np.stack([c["emb"] for c in good]),
|
| 238 |
+
filename=np.array(filename, dtype=object),
|
| 239 |
+
)
|
| 240 |
+
|
| 241 |
+
|
| 242 |
+
def _rag_cache_load(cache_path: Path) -> tuple[list, str]:
|
| 243 |
+
"""Returns (chunks, original_filename)."""
|
| 244 |
+
data = np.load(str(cache_path), allow_pickle=True)
|
| 245 |
+
filename = str(data["filename"])
|
| 246 |
+
chunks = [
|
| 247 |
+
{"text": str(t), "source": str(s), "emb": e}
|
| 248 |
+
for t, s, e in zip(data["texts"], data["sources"], data["embs"])
|
| 249 |
+
]
|
| 250 |
+
return chunks, filename
|
| 251 |
+
|
| 252 |
+
|
| 253 |
+
# ──────────────────────────────────────────────────────────────────────────────
|
| 254 |
+
# Chunking
|
| 255 |
+
# ──────────────────────────────────────────────────────────────────────────────
|
| 256 |
+
|
| 257 |
+
def _chunk_text(text: str, source: str, size: int = 500, overlap: int = 50) -> list:
|
| 258 |
+
chunks = []
|
| 259 |
+
start = 0
|
| 260 |
+
while start < len(text):
|
| 261 |
+
end = min(start + size, len(text))
|
| 262 |
+
chunk = text[start:end].strip()
|
| 263 |
+
if chunk:
|
| 264 |
+
chunks.append({"text": chunk, "source": source, "emb": None})
|
| 265 |
+
start += max(1, size - overlap)  # always advance, even if overlap >= size
|
| 266 |
+
return chunks
|
| 267 |
+
|
| 268 |
+
|
| 269 |
+
# ──────────────────────────────────────────────────────────────────────────────
|
| 270 |
+
# Document index queries
|
| 271 |
+
# ──────────────────────────────────────────────────────────────────────────────
|
| 272 |
+
|
| 273 |
+
def rag_doc_list() -> list[list]:
|
| 274 |
+
"""Return rows [[filename, chunk_count]] (user docs only, not synthetic)."""
|
| 275 |
+
synthetic_sources = {s for s, _ in _RAG_SYNTHETIC_DOCS}
|
| 276 |
+
counts: dict = {}
|
| 277 |
+
for c in _rag_chunks:
|
| 278 |
+
src = c["source"]
|
| 279 |
+
if src not in synthetic_sources:
|
| 280 |
+
counts[src] = counts.get(src, 0) + 1
|
| 281 |
+
return [[name, cnt] for name, cnt in sorted(counts.items())]
|
| 282 |
+
|
| 283 |
+
|
| 284 |
+
def rag_doc_choices() -> list[str]:
|
| 285 |
+
return [row[0] for row in rag_doc_list()]
|
| 286 |
+
|
| 287 |
+
|
| 288 |
+
# ──────────────────────────────────────────────────────────────────────────────
|
| 289 |
+
# Document loading and indexing
|
| 290 |
+
# ──────────────────────────────────────────────────────────────────────────────
|
| 291 |
+
|
| 292 |
+
def _extract_pdf_text(path: Path) -> str:
|
| 293 |
+
"""Extract text from a PDF, falling back to EasyOCR for scanned pages."""
|
| 294 |
+
full_text = []
|
| 295 |
+
try:
|
| 296 |
+
import pypdf
|
| 297 |
+
except ImportError:
|
| 298 |
+
print(f"[RAG] pypdf not installed — skipping {path.name}")
|
| 299 |
+
return ""
|
| 300 |
+
try:
|
| 301 |
+
reader = pypdf.PdfReader(str(path))
|
| 302 |
+
for page_num, page in enumerate(reader.pages):
|
| 303 |
+
page_text = page.extract_text() or ""
|
| 304 |
+
if len(page_text.strip()) >= 50:
|
| 305 |
+
full_text.append(page_text)
|
| 306 |
+
else:
|
| 307 |
+
try:
|
| 308 |
+
import fitz
|
| 309 |
+
import numpy as np
|
| 310 |
+
from core.ocr import get_easyocr
|
| 311 |
+
doc = fitz.open(str(path))
|
| 312 |
+
fitz_page = doc[page_num]
|
| 313 |
+
mat = fitz.Matrix(150 / 72, 150 / 72)
|
| 314 |
+
pix = fitz_page.get_pixmap(matrix=mat)
|
| 315 |
+
img_arr = np.frombuffer(pix.samples, dtype=np.uint8).reshape(
|
| 316 |
+
pix.height, pix.width, pix.n
|
| 317 |
+
)
|
| 318 |
+
if pix.n == 4:
|
| 319 |
+
img_arr = img_arr[:, :, :3]
|
| 320 |
+
ocr_result = get_easyocr().readtext(img_arr, detail=0, paragraph=True)
|
| 321 |
+
full_text.append(" ".join(ocr_result))
|
| 322 |
+
doc.close()
|
| 323 |
+
except ImportError:
|
| 324 |
+
print(f"[RAG] pymupdf not installed — cannot OCR scanned page {page_num+1}")
|
| 325 |
+
except Exception as e:
|
| 326 |
+
print(f"[RAG] OCR error on page {page_num+1} of {path.name}: {e}")
|
| 327 |
+
except Exception as e:
|
| 328 |
+
print(f"[RAG] Error reading PDF {path.name}: {e}")
|
| 329 |
+
return "\n".join(full_text)
|
| 330 |
+
|
| 331 |
+
|
| 332 |
+
def rag_load_docs(cache_dir: Path) -> None:
|
| 333 |
+
"""Load synthetic knowledge + cached user documents at startup (call once in background)."""
|
| 334 |
+
global _rag_chunks, _rag_indexed_files, _rag_ready
|
| 335 |
+
with _rag_lock:
|
| 336 |
+
chunks: list = []
|
| 337 |
+
for source, text in _RAG_SYNTHETIC_DOCS:
|
| 338 |
+
chunks.extend(_chunk_text(text, source))
|
| 339 |
+
|
| 340 |
+
cache_dir.mkdir(parents=True, exist_ok=True)
|
| 341 |
+
for cache_file in sorted(cache_dir.glob("*.npz")):
|
| 342 |
+
try:
|
| 343 |
+
cached_chunks, orig_filename = _rag_cache_load(cache_file)
|
| 344 |
+
chunks.extend(cached_chunks)
|
| 345 |
+
_rag_indexed_files.add(orig_filename)
|
| 346 |
+
print(f"[RAG] Loaded from cache: {orig_filename} ({len(cached_chunks)} chunks)")
|
| 347 |
+
except Exception as e:
|
| 348 |
+
print(f"[RAG] Corrupt cache file {cache_file.name}: {e} — skipping")
|
| 349 |
+
|
| 350 |
+
_rag_chunks = chunks
|
| 351 |
+
_rag_ready = True
|
| 352 |
+
print(f"[RAG] Chunks loaded: {len(chunks)} (synthetic + cached)")
|
| 353 |
+
|
| 354 |
+
to_embed = [c for c in _rag_chunks if c["emb"] is None]
|
| 355 |
+
if to_embed:
|
| 356 |
+
embeddings = _ollama_embed_batch([c["text"] for c in to_embed])
|
| 357 |
+
embedded = 0
|
| 358 |
+
for chunk, emb in zip(to_embed, embeddings):
|
| 359 |
+
if emb is not None:
|
| 360 |
+
chunk["emb"] = emb
|
| 361 |
+
embedded += 1
|
| 362 |
+
print(f"[RAG] Synthetic embedding done: {embedded} chunks")
|
| 363 |
+
|
| 364 |
+
|
| 365 |
+
def rag_add_docs(files: list, cache_dir: Path) -> tuple[str, list]:
|
| 366 |
+
"""Index uploaded PDF/DOCX files. Returns (status_message, doc_list)."""
|
| 367 |
+
global _rag_indexed_files
|
| 368 |
+
if not files:
|
| 369 |
+
return "Nessun file caricato.", rag_doc_list()
|
| 370 |
+
try:
|
| 371 |
+
requests.get(f"{OLLAMA_URL}/api/tags", timeout=3)
|
| 372 |
+
except Exception:
|
| 373 |
+
return (
|
| 374 |
+
"❌ Ollama non raggiungibile — i documenti non possono essere indicizzati.\n"
|
| 375 |
+
"Avvia `ollama serve` e ricarica.",
|
| 376 |
+
rag_doc_list(),
|
| 377 |
+
)
|
| 378 |
+
lines = []
|
| 379 |
+
for f in files:
|
| 380 |
+
path = Path(f.name)
|
| 381 |
+
suffix = path.suffix.lower()
|
| 382 |
+
if path.name in _rag_indexed_files:
|
| 383 |
+
lines.append(f"ℹ️ `{path.name}` — già indicizzato, saltato.")
|
| 384 |
+
continue
|
| 385 |
+
|
| 386 |
+
file_bytes = path.read_bytes()
|
| 387 |
+
cache_path = _rag_cache_path(cache_dir, path.name, file_bytes)
|
| 388 |
+
|
| 389 |
+
if cache_path.exists():
|
| 390 |
+
try:
|
| 391 |
+
cached_chunks, _ = _rag_cache_load(cache_path)
|
| 392 |
+
with _rag_lock:
|
| 393 |
+
_rag_chunks.extend(cached_chunks)
|
| 394 |
+
_rag_indexed_files.add(path.name)
|
| 395 |
+
lines.append(f"✅ `{path.name}` — {len(cached_chunks)} chunk caricati dalla cache.")
|
| 396 |
+
continue
|
| 397 |
+
except Exception:
|
| 398 |
+
pass
|
| 399 |
+
|
| 400 |
+
try:
|
| 401 |
+
if suffix == ".pdf":
|
| 402 |
+
text = _extract_pdf_text(path)
|
| 403 |
+
elif suffix in (".docx", ".doc"):
|
| 404 |
+
import docx as _docx
|
| 405 |
+
doc_obj = _docx.Document(str(path))
|
| 406 |
+
text = "\n".join(p.text for p in doc_obj.paragraphs)
|
| 407 |
+
else:
|
| 408 |
+
lines.append(f"⚠️ `{path.name}` — formato non supportato (solo PDF/DOCX).")
|
| 409 |
+
continue
|
| 410 |
+
except Exception as e:
|
| 411 |
+
lines.append(f"❌ `{path.name}` — errore: {e}")
|
| 412 |
+
continue
|
| 413 |
+
|
| 414 |
+
if not text.strip():
|
| 415 |
+
lines.append(f"⚠️ `{path.name}` — nessun testo estratto.")
|
| 416 |
+
continue
|
| 417 |
+
|
| 418 |
+
chunks = _chunk_text(text, path.name)
|
| 419 |
+
embeddings = _ollama_embed_batch([c["text"] for c in chunks])
|
| 420 |
+
embedded = 0
|
| 421 |
+
for chunk, emb in zip(chunks, embeddings):
|
| 422 |
+
if emb is not None:
|
| 423 |
+
chunk["emb"] = emb
|
| 424 |
+
embedded += 1
|
| 425 |
+
|
| 426 |
+
try:
|
| 427 |
+
_rag_cache_save(cache_path, chunks, path.name)
|
| 428 |
+
except Exception as e:
|
| 429 |
+
print(f"[RAG] Cache write failed for {path.name}: {e}")
|
| 430 |
+
|
| 431 |
+
with _rag_lock:
|
| 432 |
+
_rag_chunks.extend(chunks)
|
| 433 |
+
_rag_indexed_files.add(path.name)
|
| 434 |
+
lines.append(f"✅ `{path.name}` — {len(chunks)} chunk, {embedded} indicizzati.")
|
| 435 |
+
|
| 436 |
+
return "\n".join(lines), rag_doc_list()
|
| 437 |
+
|
| 438 |
+
|
| 439 |
+
def rag_remove_doc(filename: str, cache_dir: Path) -> tuple[str, list]:
|
| 440 |
+
"""Remove all chunks for a document from memory and delete its cache file."""
|
| 441 |
+
global _rag_chunks, _rag_indexed_files
|
| 442 |
+
if not filename or not filename.strip():
|
| 443 |
+
return "Nessun documento selezionato.", rag_doc_list()
|
| 444 |
+
|
| 445 |
+
with _rag_lock:
|
| 446 |
+
before = len(_rag_chunks)
|
| 447 |
+
_rag_chunks = [c for c in _rag_chunks if c["source"] != filename]
|
| 448 |
+
removed_chunks = before - len(_rag_chunks)
|
| 449 |
+
_rag_indexed_files.discard(filename)
|
| 450 |
+
|
| 451 |
+
deleted_files = 0
|
| 452 |
+
if cache_dir.exists():
|
| 453 |
+
for cache_file in cache_dir.glob("*.npz"):
|
| 454 |
+
try:
|
| 455 |
+
with np.load(str(cache_file), allow_pickle=True) as data:
|
| 456 |
+
match = str(data["filename"]) == filename
|
| 457 |
+
if match:
|
| 458 |
+
cache_file.unlink()
|
| 459 |
+
deleted_files += 1
|
| 460 |
+
except Exception:
|
| 461 |
+
pass
|
| 462 |
+
|
| 463 |
+
if removed_chunks == 0:
|
| 464 |
+
return f"⚠️ `{filename}` non trovato nell'indice.", rag_doc_list()
|
| 465 |
+
|
| 466 |
+
msg = f"🗑️ `{filename}` rimosso ({removed_chunks} chunk eliminati"
|
| 467 |
+
if deleted_files:
|
| 468 |
+
msg += ", cache eliminata"
|
| 469 |
+
msg += ")."
|
| 470 |
+
return msg, rag_doc_list()
|
| 471 |
+
|
| 472 |
+
|
| 473 |
+
# ──────────────────────────────────────────────────────────────────────────────
|
| 474 |
+
# Retrieval
|
| 475 |
+
# ──────────────────────────────────────────────────────────────────────────────
|
| 476 |
+
|
| 477 |
+
def rag_retrieve(question: str) -> tuple[list | None, str | None]:
|
| 478 |
+
"""Return (results, error_str). results is list of (score, chunk)."""
|
| 479 |
+
embedded_chunks = [c for c in _rag_chunks if c["emb"] is not None]
|
| 480 |
+
if not embedded_chunks:
|
| 481 |
+
total = len(_rag_chunks)
|
| 482 |
+
return None, (
|
| 483 |
+
f"⏳ Embedding in corso (0/{total} chunk pronti). "
|
| 484 |
+
"Riprovare tra qualche secondo — l'indicizzazione procede in background."
|
| 485 |
+
)
|
| 486 |
+
q_emb = _ollama_embed(question)
|
| 487 |
+
if q_emb is None:
|
| 488 |
+
return None, "❌ Impossibile generare l'embedding della domanda. Ollama è in esecuzione?"
|
| 489 |
+
|
| 490 |
+
synthetic_sources = {s for s, _ in _RAG_SYNTHETIC_DOCS}
|
| 491 |
+
user_chunks = [c for c in _rag_chunks if c["emb"] is not None and c["source"] not in synthetic_sources]
|
| 492 |
+
synth_chunks = [c for c in _rag_chunks if c["emb"] is not None and c["source"] in synthetic_sources]
|
| 493 |
+
|
| 494 |
+
def _top_k_from(pool, q, k):
|
| 495 |
+
if not pool:
|
| 496 |
+
return []
|
| 497 |
+
embs = np.stack([c["emb"] for c in pool])
|
| 498 |
+
q_n = q / (np.linalg.norm(q) + 1e-9)
|
| 499 |
+
norms = np.linalg.norm(embs, axis=1, keepdims=True) + 1e-9
|
| 500 |
+
scores = (embs / norms) @ q_n
|
| 501 |
+
idxs = np.argsort(scores)[::-1][:k]
|
| 502 |
+
return [(float(scores[i]), pool[i]) for i in idxs]
|
| 503 |
+
|
| 504 |
+
user_results = _top_k_from(user_chunks, q_emb, 2)
|
| 505 |
+
synth_results = _top_k_from(synth_chunks, q_emb, 2 if user_results else 4)
|
| 506 |
+
return user_results + synth_results, None
|
| 507 |
+
|
| 508 |
+
|
| 509 |
+
# ──────────────────────────────────────────────────────────────────────────────
|
| 510 |
+
# Chat stream (framework-agnostic)
|
| 511 |
+
# ──────────────────────────────────────────────────────────────────────────────
|
| 512 |
+
|
| 513 |
+
def rag_chat_stream(
|
| 514 |
+
message: str,
|
| 515 |
+
history: list[dict],
|
| 516 |
+
) -> Generator[tuple[str, str | None], None, None]:
|
| 517 |
+
"""Core RAG chat logic. Yields (partial_response, sources_footer|None).
|
| 518 |
+
|
| 519 |
+
The caller (e.g. Gradio wrapper) is responsible for formatting history.
|
| 520 |
+
history is a list of {"role": "user"|"assistant", "content": str}.
|
| 521 |
+
"""
|
| 522 |
+
if not check_ollama():
|
| 523 |
+
yield (
|
| 524 |
+
"❌ **Ollama non raggiungibile.**\n\n"
|
| 525 |
+
"Avvia il server con:\n```\nollama serve\n```\n"
|
| 526 |
+
"e assicurati che il modello sia scaricato:\n"
|
| 527 |
+
"```\nollama pull llama3.2\n```",
|
| 528 |
+
None,
|
| 529 |
+
)
|
| 530 |
+
return
|
| 531 |
+
|
| 532 |
+
if not _rag_ready:
|
| 533 |
+
yield "⏳ Indice della knowledge base in costruzione, riprovare tra qualche secondo…", None
|
| 534 |
+
return
|
| 535 |
+
|
| 536 |
+
results, err = rag_retrieve(message)
|
| 537 |
+
if err:
|
| 538 |
+
yield err, None
|
| 539 |
+
return
|
| 540 |
+
|
| 541 |
+
context = "\n\n".join(f"[{c['source']}]\n{c['text']}" for _, c in results)
|
| 542 |
+
|
| 543 |
+
recent = history[-12:]
|
| 544 |
+
conv_text = ""
|
| 545 |
+
i = 0
|
| 546 |
+
while i < len(recent) - 1:
|
| 547 |
+
if recent[i]["role"] == "user" and recent[i + 1]["role"] == "assistant":
|
| 548 |
+
u = recent[i]["content"]
|
| 549 |
+
a = recent[i + 1]["content"].split("\n\n---\n")[0]
|
| 550 |
+
conv_text += f"Utente: {u}\nAssistente: {a}\n\n"
|
| 551 |
+
i += 2
|
| 552 |
+
else:
|
| 553 |
+
i += 1
|
| 554 |
+
|
| 555 |
+
prompt = (
|
| 556 |
+
"Sei un esperto di grafologia forense. Rispondi in italiano, in modo preciso e "
|
| 557 |
+
"conciso, basandoti ESCLUSIVAMENTE sui seguenti estratti.\n\n"
|
| 558 |
+
f"{context}\n\n"
|
| 559 |
+
)
|
| 560 |
+
if conv_text:
|
| 561 |
+
prompt += f"Conversazione precedente:\n{conv_text}\n"
|
| 562 |
+
prompt += f"Domanda: {message}\n\nRisposta:"
|
| 563 |
+
|
| 564 |
+
sources = list(dict.fromkeys(c["source"] for _, c in results))
|
| 565 |
+
sources_footer = f"\n\n---\n*Fonti: {', '.join(sources)}*"
|
| 566 |
+
|
| 567 |
+
partial = ""
|
| 568 |
+
try:
|
| 569 |
+
for token in stream_ollama(prompt):
|
| 570 |
+
partial += token
|
| 571 |
+
yield partial, None
|
| 572 |
+
except Exception as e:
|
| 573 |
+
yield f"❌ Errore nella generazione: {e}", None
|
| 574 |
+
return
|
| 575 |
+
|
| 576 |
+
yield partial, sources_footer
|
| 577 |
+
|
| 578 |
+
|
| 579 |
+
# ──────────────────────────────────────────────────────────────────────────────
|
| 580 |
+
# Pipeline LLM synthesis
|
| 581 |
+
# ──────────────────────────────────────────────────────────────────────────────
|
| 582 |
+
|
| 583 |
+
def pipeline_llm_synthesis(
|
| 584 |
+
step1_summary: str,
|
| 585 |
+
step2_text: str,
|
| 586 |
+
step3_summary: str,
|
| 587 |
+
step4_report: str,
|
| 588 |
+
step5_report: str,
|
| 589 |
+
step6_report: str,
|
| 590 |
+
) -> str:
|
| 591 |
+
"""Call Ollama to synthesise forensic pipeline results into a narrative report."""
|
| 592 |
+
try:
|
| 593 |
+
requests.get(f"{OLLAMA_URL}/api/tags", timeout=3)
|
| 594 |
+
except Exception:
|
| 595 |
+
return (
|
| 596 |
+
"❌ **Ollama non raggiungibile.** Avvia il server con:\n"
|
| 597 |
+
"```\nollama serve\n```"
|
| 598 |
+
)
|
| 599 |
+
prompt = (
|
| 600 |
+
"Sei un perito calligrafo forense esperto. "
|
| 601 |
+
"Sulla base delle seguenti analisi tecniche su un documento, "
|
| 602 |
+
"fornisci in italiano una valutazione complessiva professionale: "
|
| 603 |
+
"evidenzia elementi di interesse forense, coerenze e incoerenze tra i risultati, "
|
| 604 |
+
"e suggerisci eventuali ulteriori verifiche.\n\n"
|
| 605 |
+
f"=== RILEVAMENTO FIRMA ===\n{step1_summary}\n\n"
|
| 606 |
+
f"=== TRASCRIZIONE HTR ===\n{step2_text}\n\n"
|
| 607 |
+
f"=== ENTITÀ RICONOSCIUTE (NER) ===\n{step3_summary}\n\n"
|
| 608 |
+
f"=== IDENTIFICAZIONE AUTORE ===\n{step4_report}\n\n"
|
| 609 |
+
f"=== ANALISI GRAFOLOGICA ===\n{step5_report}\n\n"
|
| 610 |
+
f"=== VERIFICA FIRMA ===\n{step6_report}\n\n"
|
| 611 |
+
"Valutazione forense integrata:"
|
| 612 |
+
)
|
| 613 |
+
result = ""
|
| 614 |
+
try:
|
| 615 |
+
for token in stream_ollama(prompt):
|
| 616 |
+
result += token
|
| 617 |
+
except Exception as e:
|
| 618 |
+
return f"❌ Errore nella generazione LLM: {e}"
|
| 619 |
+
return result if result else "*(Nessuna risposta dal modello)*"
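The retrieval step in `rag_retrieve` above ranks chunks by cosine similarity and reserves two slots for user documents, widening the synthetic pool to four when no user document matches. The sketch below restates that ranking in dependency-free Python. It assumes chunks are dicts with `source`, `text`, and `emb` keys as in the diff; the helper names `cosine`, `top_k`, and `retrieve`, the plain-list embeddings, and the toy data are hypothetical and used only for illustration (the original relies on NumPy arrays).

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    # Small epsilon avoids division by zero for all-zero vectors.
    return dot / (norm_a * norm_b + 1e-9)


def top_k(pool, query, k):
    """Return up to k (score, chunk) pairs, best score first."""
    scored = [(cosine(c["emb"], query), c) for c in pool]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


def retrieve(chunks, query, synthetic_sources):
    """Two user hits first, then two synthetic hits (four if no user hits)."""
    user = [c for c in chunks if c["source"] not in synthetic_sources]
    synth = [c for c in chunks if c["source"] in synthetic_sources]
    user_hits = top_k(user, query, 2)
    synth_hits = top_k(synth, query, 2 if user_hits else 4)
    return user_hits + synth_hits


# Toy 2-d embeddings stand in for real model output.
kb = [
    {"source": "kb", "text": "slant", "emb": [1.0, 0.0]},
    {"source": "kb", "text": "pressure", "emb": [0.0, 1.0]},
    {"source": "kb", "text": "baseline", "emb": [1.0, 1.0]},
]
uploaded = [{"source": "report.pdf", "text": "case notes", "emb": [0.9, 0.2]}]
query = [1.0, 0.1]

with_user = [c["text"] for _, c in retrieve(kb + uploaded, query, {"kb"})]
without_user = [c["text"] for _, c in retrieve(kb, query, {"kb"})]
print(with_user)
print(without_user)
```

Keeping the two pools separate means a large built-in knowledge base cannot crowd uploaded documents out of the prompt context, which seems to be the intent of the split in the original function.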
|
core/signature.py
ADDED
|
@@ -0,0 +1,395 @@
|
| 1 |
+
"""
|
| 2 |
+
GraphoLab core — Signature Verification and Detection.
|
| 3 |
+
|
| 4 |
+
Provides:
|
| 5 |
+
- get_signet() lazy loader for the SigNet model
|
| 6 |
+
- get_yolo() lazy loader for the YOLOv8 signature detector
|
| 7 |
+
- preprocess_signature() sigver-compatible preprocessing
|
| 8 |
+
- sig_verify() verify signature authenticity (SigNet)
|
| 9 |
+
- sig_detect() detect signature locations in a document (YOLO)
|
| 10 |
+
- detect_and_crop() detect + return annotated image and first crop
|
| 11 |
+
"""
|
| 12 |
+
|
| 13 |
+
from __future__ import annotations
|
| 14 |
+
|
| 15 |
+
import io
|
| 16 |
+
import os
|
| 17 |
+
import tempfile
|
| 18 |
+
import threading
|
| 19 |
+
from collections import OrderedDict
|
| 20 |
+
from pathlib import Path
|
| 21 |
+
|
| 22 |
+
import cv2
|
| 23 |
+
import matplotlib
|
| 24 |
+
matplotlib.use("Agg")
|
| 25 |
+
import matplotlib.pyplot as plt
|
| 26 |
+
import numpy as np
|
| 27 |
+
import torch
|
| 28 |
+
import torch.nn as nn
|
| 29 |
+
import torch.nn.functional as F
|
| 30 |
+
from PIL import Image
|
| 31 |
+
from skimage import filters, transform as sk_transform
|
| 32 |
+
|
| 33 |
+
# ──────────────────────────────────────────────────────────────────────────────
|
| 34 |
+
# Configuration
|
| 35 |
+
# ──────────────────────────────────────────────────────────────────────────────
|
| 36 |
+
|
| 37 |
+
DEVICE = "cuda" if torch.cuda.is_available() else "cpu"
|
| 38 |
+
SIGNET_CANVAS = (952, 1360)
|
| 39 |
+
SIG_THRESHOLD = 0.35
|
| 40 |
+
|
| 41 |
+
YOLO_REPO = "tech4humans/yolov8s-signature-detector"
|
| 42 |
+
YOLO_FILENAME = "yolov8s.pt"
|
| 43 |
+
|
| 44 |
+
# ──────────────────────────────────────────────────────────────────────────────
|
| 45 |
+
# SigNet architecture
|
| 46 |
+
# ──────────────────────────────────────────────────────────────────────────────
|
| 47 |
+
|
| 48 |
+
def _conv_bn_relu(in_ch, out_ch, kernel, stride=1, pad=0):
|
| 49 |
+
return nn.Sequential(OrderedDict([
|
| 50 |
+
("conv", nn.Conv2d(in_ch, out_ch, kernel, stride, pad, bias=False)),
|
| 51 |
+
("bn", nn.BatchNorm2d(out_ch)),
|
| 52 |
+
("relu", nn.ReLU()),
|
| 53 |
+
]))
|
| 54 |
+
|
| 55 |
+
|
| 56 |
+
def _linear_bn_relu(in_f, out_f):
|
| 57 |
+
return nn.Sequential(OrderedDict([
|
| 58 |
+
("fc", nn.Linear(in_f, out_f, bias=False)),
|
| 59 |
+
("bn", nn.BatchNorm1d(out_f)),
|
| 60 |
+
("relu", nn.ReLU()),
|
| 61 |
+
]))
|
| 62 |
+
|
| 63 |
+
|
| 64 |
+
class SigNet(nn.Module):
|
| 65 |
+
"""SigNet feature extractor (sigver re-implementation, output: 2048-d L2-normalised)."""
|
| 66 |
+
def __init__(self):
|
| 67 |
+
super().__init__()
|
| 68 |
+
self.conv_layers = nn.Sequential(OrderedDict([
|
| 69 |
+
("conv1", _conv_bn_relu(1, 96, 11, stride=4)),
|
| 70 |
+
("maxpool1", nn.MaxPool2d(3, 2)),
|
| 71 |
+
("conv2", _conv_bn_relu(96, 256, 5, pad=2)),
|
| 72 |
+
("maxpool2", nn.MaxPool2d(3, 2)),
|
| 73 |
+
("conv3", _conv_bn_relu(256, 384, 3, pad=1)),
|
| 74 |
+
("conv4", _conv_bn_relu(384, 384, 3, pad=1)),
|
| 75 |
+
("conv5", _conv_bn_relu(384, 256, 3, pad=1)),
|
| 76 |
+
("maxpool3", nn.MaxPool2d(3, 2)),
|
| 77 |
+
]))
|
| 78 |
+
self.fc_layers = nn.Sequential(OrderedDict([
|
| 79 |
+
("fc1", _linear_bn_relu(256 * 3 * 5, 2048)),
|
| 80 |
+
("fc2", _linear_bn_relu(2048, 2048)),
|
| 81 |
+
]))
|
| 82 |
+
|
| 83 |
+
def forward_once(self, x):
|
| 84 |
+
x = self.conv_layers(x)
|
| 85 |
+
x = x.view(x.size(0), 256 * 3 * 5)
|
| 86 |
+
x = self.fc_layers(x)
|
| 87 |
+
return F.normalize(x, p=2, dim=1)
|
| 88 |
+
|
| 89 |
+
|
| 90 |
+
# ──────────────────────────────────────────────────────────────────────────────
|
| 91 |
+
# Lazy model loaders
|
| 92 |
+
# ──────────────────────────────────────────────────────────────────────────────
|
| 93 |
+
|
| 94 |
+
_signet = None
|
| 95 |
+
_signet_pretrained = False
|
| 96 |
+
_signet_lock = threading.Lock()
|
| 97 |
+
|
| 98 |
+
_yolo_model = None
|
| 99 |
+
_yolo_lock = threading.Lock()
|
| 100 |
+
|
| 101 |
+
|
| 102 |
+
def get_signet(weights_path: Path):
|
| 103 |
+
"""Return the SigNet model, loading weights on first call (thread-safe)."""
|
| 104 |
+
global _signet, _signet_pretrained
|
| 105 |
+
if _signet is None:
|
| 106 |
+
with _signet_lock:
|
| 107 |
+
if _signet is None:
|
| 108 |
+
model = SigNet().to(DEVICE).eval()
|
| 109 |
+
if weights_path.exists():
|
| 110 |
+
state_dict, _, _ = torch.load(weights_path, map_location=DEVICE)
|
| 111 |
+
model.load_state_dict(state_dict)
|
| 112 |
+
_signet_pretrained = True
|
| 113 |
+
print("SigNet: loaded pre-trained weights from", weights_path)
|
| 114 |
+
else:
|
| 115 |
+
print("SigNet: no pre-trained weights found — using random initialisation.")
|
| 116 |
+
_signet = model
|
| 117 |
+
return _signet
|
| 118 |
+
|
| 119 |
+
|
| 120 |
+
def get_yolo():
|
| 121 |
+
"""Return the YOLO signature detector, downloading on first call (thread-safe)."""
|
| 122 |
+
global _yolo_model
|
| 123 |
+
if _yolo_model is None:
|
| 124 |
+
with _yolo_lock:
|
| 125 |
+
if _yolo_model is None:
|
| 126 |
+
from huggingface_hub import hf_hub_download
|
| 127 |
+
from ultralytics import YOLO
|
| 128 |
+
print("Loading YOLOv8 signature detector...")
|
| 129 |
+
hf_token = os.environ.get("HF_TOKEN")
|
| 130 |
+
model_path = hf_hub_download(
|
| 131 |
+
repo_id=YOLO_REPO, filename=YOLO_FILENAME, token=hf_token
|
| 132 |
+
)
|
| 133 |
+
_yolo_model = YOLO(model_path)
|
| 134 |
+
return _yolo_model
|
| 135 |
+
|
| 136 |
+
|
| 137 |
+
# ──────────────────────────────────────────────────────────────────────────────
|
| 138 |
+
# Internal helpers
|
| 139 |
+
# ──────────────────────────────────────────────────────────────────────────────
|
| 140 |
+
|
| 141 |
+
def preprocess_signature(pil_img: Image.Image) -> torch.Tensor:
|
| 142 |
+
"""Sigver-compatible preprocessing: centre on canvas, invert, resize to 150×220."""
|
| 143 |
+
arr = np.array(pil_img.convert("L"), dtype=np.uint8)
|
| 144 |
+
canvas = np.ones(SIGNET_CANVAS, dtype=np.uint8) * 255
|
| 145 |
+
try:
|
| 146 |
+
threshold = filters.threshold_otsu(arr)
|
| 147 |
+
blurred = filters.gaussian(arr, 2, preserve_range=True)
|
| 148 |
+
binary = blurred > threshold
|
| 149 |
+
rows, cols = np.where(binary == 0)
|
| 150 |
+
if len(rows) == 0:
|
| 151 |
+
raise ValueError("empty")
|
| 152 |
+
cropped = arr[rows.min():rows.max(), cols.min():cols.max()]
|
| 153 |
+
r_center = int(rows.mean() - rows.min())
|
| 154 |
+
c_center = int(cols.mean() - cols.min())
|
| 155 |
+
r_start = max(0, SIGNET_CANVAS[0] // 2 - r_center)
|
| 156 |
+
c_start = max(0, SIGNET_CANVAS[1] // 2 - c_center)
|
| 157 |
+
h = min(cropped.shape[0], SIGNET_CANVAS[0] - r_start)
|
| 158 |
+
w = min(cropped.shape[1], SIGNET_CANVAS[1] - c_start)
|
| 159 |
+
canvas[r_start:r_start + h, c_start:c_start + w] = cropped[:h, :w]
|
| 160 |
+
canvas[canvas > threshold] = 255
|
| 161 |
+
except Exception:
|
| 162 |
+
canvas = arr
|
| 163 |
+
|
| 164 |
+
inverted = 255 - canvas
|
| 165 |
+
resized = sk_transform.resize(inverted, (150, 220), preserve_range=True,
|
| 166 |
+
anti_aliasing=True).astype(np.uint8)
|
| 167 |
+
tensor = torch.from_numpy(resized).float().div(255)
|
| 168 |
+
return tensor.unsqueeze(0).unsqueeze(0).to(DEVICE)
|
| 169 |
+
|
| 170 |
+
|
| 171 |
+
# ──────────────────────────────────────────────────────────────────────────────
|
| 172 |
+
# Core functions
|
| 173 |
+
# ──────────────────────────────────────────────────────────────────────────────
|
| 174 |
+
|
| 175 |
+
def sig_verify(
|
| 176 |
+
ref_image: np.ndarray,
|
| 177 |
+
ref_image2: np.ndarray | None,
|
| 178 |
+
query_image: np.ndarray,
|
| 179 |
+
weights_path: Path,
|
| 180 |
+
) -> tuple[str, np.ndarray | None]:
|
| 181 |
+
"""Verify signature authenticity using SigNet.
|
| 182 |
+
|
| 183 |
+
Args:
|
| 184 |
+
ref_image: Known authentic signature (numpy RGB array).
|
| 185 |
+
ref_image2: Optional second reference (improves accuracy).
|
| 186 |
+
query_image: Signature to verify (numpy RGB array).
|
| 187 |
+
weights_path: Path to signet.pth weights file.
|
| 188 |
+
|
| 189 |
+
Returns:
|
| 190 |
+
report: Text report with verdict and distances.
|
| 191 |
+
chart: Matplotlib visualisation as numpy array.
|
| 192 |
+
"""
|
| 193 |
+
if ref_image is None or query_image is None:
|
| 194 |
+
return "Carica la firma di riferimento e quella da verificare.", None
|
| 195 |
+
|
| 196 |
+
model = get_signet(weights_path)
|
| 197 |
+
|
| 198 |
+
with torch.no_grad():
|
| 199 |
+
emb_ref1 = model.forward_once(preprocess_signature(Image.fromarray(ref_image)))
|
| 200 |
+
if ref_image2 is not None:
|
| 201 |
+
emb_ref2 = model.forward_once(preprocess_signature(Image.fromarray(ref_image2)))
|
| 202 |
+
mean_ref = F.normalize(emb_ref1 + emb_ref2, p=2, dim=1)
|
| 203 |
+
n_refs = 2
|
| 204 |
+
else:
|
| 205 |
+
mean_ref = emb_ref1
|
| 206 |
+
n_refs = 1
|
| 207 |
+
emb_query = model.forward_once(preprocess_signature(Image.fromarray(query_image)))
|
| 208 |
+
|
| 209 |
+
cosine_sim = F.cosine_similarity(mean_ref, emb_query).item()
|
| 210 |
+
cosine_dist = 1.0 - cosine_sim
|
| 211 |
+
verdict = "AUTENTICA ✓" if cosine_dist < SIG_THRESHOLD else "FALSA ✗"
|
| 212 |
+
color = "#2ca02c" if cosine_dist < SIG_THRESHOLD else "#d62728"
|
| 213 |
+
|
| 214 |
+
weights_note = (
|
| 215 |
+
"Modello: SigNet — pesi pre-addestrati GPDS (luizgh/sigver)."
|
| 216 |
+
if _signet_pretrained else
|
| 217 |
+
"⚠️ ATTENZIONE: pesi casuali — risultati non significativi.\n"
|
| 218 |
+
"Scarica signet.pth da luizgh/sigver e posizionalo in models/signet.pth."
|
| 219 |
+
)
|
| 220 |
+
report = (
|
| 221 |
+
f"Esito: {verdict}\n"
|
| 222 |
+
f"Similarità coseno: {cosine_sim:.4f}\n"
|
| 223 |
+
f"Distanza coseno: {cosine_dist:.4f} (soglia: {SIG_THRESHOLD})\n"
|
| 224 |
+
f"Riferimenti usati: {n_refs}"
|
| 225 |
+
+ (" (embedding mediato)" if n_refs > 1 else "") + "\n\n"
|
| 226 |
+
+ weights_note
|
| 227 |
+
)
|
| 228 |
+
|
| 229 |
+
# Matplotlib visualisation
|
| 230 |
+
n_img_panels = 2 + (1 if ref_image2 is not None else 0)
|
| 231 |
+
width_ratios = ([1] * n_img_panels) + [1.4]
|
| 232 |
+
fig, axes = plt.subplots(
|
| 233 |
+
1, n_img_panels + 1,
|
| 234 |
+
figsize=(3.2 * (n_img_panels + 1), 3.2),
|
| 235 |
+
gridspec_kw={"width_ratios": width_ratios},
|
| 236 |
+
)
|
| 237 |
+
panels = [ref_image]
|
| 238 |
+
labels = ["Rif. 1"]
|
| 239 |
+
if ref_image2 is not None:
|
| 240 |
+
panels.append(ref_image2)
|
| 241 |
+
labels.append("Rif. 2")
|
| 242 |
+
panels.append(query_image)
|
| 243 |
+
labels.append("Da verificare")
|
| 244 |
+
|
| 245 |
+
for ax, img, lbl in zip(axes[:-1], panels, labels):
|
| 246 |
+
ax.imshow(img, cmap="gray" if img.ndim == 2 else None)
|
| 247 |
+
ax.set_title(lbl, fontsize=10)
|
| 248 |
+
ax.axis("off")
|
| 249 |
+
|
| 250 |
+
ax_g = axes[-1]
|
| 251 |
+
ax_g.set_xlim(0, 1)
|
| 252 |
+
ax_g.set_ylim(0, 1)
|
| 253 |
+
ax_g.axis("off")
|
| 254 |
+
ax_g.text(0.5, 0.82, verdict, ha="center", va="center",
|
| 255 |
+
fontsize=14, fontweight="bold", color=color,
|
| 256 |
+
transform=ax_g.transAxes)
|
| 257 |
+
|
| 258 |
+
bar_ax = fig.add_axes([
|
| 259 |
+
axes[-1].get_position().x0 + 0.01,
|
| 260 |
+
axes[-1].get_position().y0 + 0.12,
|
| 261 |
+
axes[-1].get_position().width - 0.02,
|
| 262 |
+
0.18,
|
| 263 |
+
])
|
| 264 |
+
bar_ax.barh([0], [cosine_dist], color=color, alpha=0.75, height=0.6)
|
| 265 |
+
bar_ax.barh([0], [1.0 - cosine_dist], left=cosine_dist,
|
| 266 |
+
color="#cccccc", alpha=0.4, height=0.6)
|
| 267 |
+
bar_ax.axvline(SIG_THRESHOLD, color="black", linestyle="--", linewidth=1.2)
|
| 268 |
+
bar_ax.set_xlim(0, 1)
|
| 269 |
+
bar_ax.set_ylim(-0.5, 0.5)
|
| 270 |
+
bar_ax.set_yticks([])
|
| 271 |
+
bar_ax.set_xticks([0, SIG_THRESHOLD, 1])
|
| 272 |
+
bar_ax.set_xticklabels(["0", f"soglia\n{SIG_THRESHOLD}", "1"], fontsize=7)
|
| 273 |
+
bar_ax.set_xlabel(f"Distanza coseno: {cosine_dist:.3f}", fontsize=8)
|
| 274 |
+
|
| 275 |
+
plt.suptitle("Verifica Autenticità Firma — SigNet", fontsize=11, fontweight="bold")
|
| 276 |
+
plt.tight_layout()
|
| 277 |
+
|
| 278 |
+
buf = io.BytesIO()
|
| 279 |
+
fig.savefig(buf, format="png", dpi=130, bbox_inches="tight")
|
| 280 |
+
plt.close(fig)
|
| 281 |
+
buf.seek(0)
|
| 282 |
+
chart = np.array(Image.open(buf))
|
| 283 |
+
|
| 284 |
+
return report, chart
|
| 285 |
+
|
| 286 |
+
|
| 287 |
+
def sig_detect(
|
| 288 |
+
image: np.ndarray,
|
| 289 |
+
conf_threshold: float,
|
| 290 |
+
) -> tuple[np.ndarray, str]:
|
| 291 |
+
"""Detect signature locations in a document image using YOLO.
|
| 292 |
+
|
| 293 |
+
Args:
|
| 294 |
+
image: RGB numpy array of the document.
|
| 295 |
+
conf_threshold: Confidence threshold (0.1–0.9).
|
| 296 |
+
|
| 297 |
+
Returns:
|
| 298 |
+
annotated: Image with bounding boxes drawn.
|
| 299 |
+
summary: Markdown summary string.
|
| 300 |
+
"""
|
| 301 |
+
if image is None:
|
| 302 |
+
return image, "Carica un'immagine del documento."
|
| 303 |
+
try:
|
| 304 |
+
yolo = get_yolo()
|
| 305 |
+
except Exception as e:
|
| 306 |
+
msg = (
|
| 307 |
+
"⚠️ **Modello non disponibile.**\n\n"
|
| 308 |
+
"Il modello `tech4humans/yolov8s-signature-detector` è ad accesso limitato su Hugging Face.\n\n"
|
| 309 |
+
"**Per abilitare questa sezione:**\n"
|
| 310 |
+
"1. Crea un account su huggingface.co\n"
|
| 311 |
+
"2. Richiedi l'accesso su huggingface.co/tech4humans/yolov8s-signature-detector\n"
|
| 312 |
+
"3. Crea un token su huggingface.co/settings/tokens\n"
|
| 313 |
+
"4. Imposta la variabile d'ambiente `HF_TOKEN=<il_tuo_token>` prima di avviare l'app\n\n"
|
| 314 |
+
f"Errore: {e}"
|
| 315 |
+
)
|
| 316 |
+
return image, msg
|
| 317 |
+
|
| 318 |
+
pil_img = Image.fromarray(image).convert("RGB")
|
| 319 |
+
with tempfile.NamedTemporaryFile(suffix=".png", delete=False) as tmp:
|
| 320 |
+
pil_img.save(tmp.name)
|
| 321 |
+
tmp_path = tmp.name
|
| 322 |
+
|
| 323 |
+
results = yolo.predict(tmp_path, conf=conf_threshold, verbose=False)
|
| 324 |
+
os.unlink(tmp_path)
|
| 325 |
+
|
| 326 |
+
result = results[0]
|
| 327 |
+
annotated = image.copy()
|
| 328 |
+
count = 0
|
| 329 |
+
|
| 330 |
+
if result.boxes is not None:
|
| 331 |
+
for box in result.boxes:
|
| 332 |
+
x1, y1, x2, y2 = box.xyxy[0].cpu().numpy().astype(int)
|
| 333 |
+
conf = float(box.conf[0].cpu())
|
| 334 |
+
cv2.rectangle(annotated, (x1, y1), (x2, y2), (255, 0, 0), 2)
|
| 335 |
+
cv2.putText(annotated, f"Sig #{count+1} {conf:.0%}",
|
| 336 |
+
(x1, max(y1 - 8, 0)),
|
| 337 |
+
cv2.FONT_HERSHEY_SIMPLEX, 0.6, (255, 0, 0), 2)
|
| 338 |
+
count += 1
|
| 339 |
+
|
| 340 |
+
summary = (
|
| 341 |
+
f"Rilevat{'a' if count == 1 else 'e'} {count} firm{'a' if count == 1 else 'e'} "
|
| 342 |
+
f"(confidenza ≥ {conf_threshold:.0%})\n\n"
|
| 343 |
+
f"**Modello:** `tech4humans/yolov8s-signature-detector`\n"
|
| 344 |
+
f"**Uso forense:** Estrazione automatica di firme da documenti legali."
|
| 345 |
+
)
|
| 346 |
+
return annotated, summary
|
| 347 |
+
|
| 348 |
+
|
| 349 |
+
def detect_and_crop(
|
| 350 |
+
image: np.ndarray,
|
| 351 |
+
conf_threshold: float = 0.3,
|
| 352 |
+
) -> tuple[np.ndarray, np.ndarray | None, str]:
|
| 353 |
+
"""Run YOLO detection and return (annotated, first_crop, summary).
|
| 354 |
+
|
| 355 |
+
Gracefully degrades when YOLO is not available (missing HF_TOKEN).
|
| 356 |
+
"""
|
| 357 |
+
annotated = image.copy()
|
| 358 |
+
try:
|
| 359 |
+
yolo = get_yolo()
|
| 360 |
+
except Exception:
|
| 361 |
+
return annotated, None, "⚠️ Rilevamento firma non disponibile (HF_TOKEN mancante)."
|
| 362 |
+
|
| 363 |
+
pil_img = Image.fromarray(image).convert("RGB")
|
| 364 |
+
with tempfile.NamedTemporaryFile(suffix=".png", delete=False) as tmp:
|
| 365 |
+
pil_img.save(tmp.name)
|
| 366 |
+
tmp_path = tmp.name
|
| 367 |
+
|
| 368 |
+
results = yolo.predict(tmp_path, conf=conf_threshold, verbose=False)
|
| 369 |
+
os.unlink(tmp_path)
|
| 370 |
+
|
| 371 |
+
result = results[0]
|
| 372 |
+
first_crop: np.ndarray | None = None
|
| 373 |
+
count = 0
|
| 374 |
+
|
| 375 |
+
if result.boxes is not None:
|
| 376 |
+
for box in result.boxes:
|
| 377 |
+
x1, y1, x2, y2 = box.xyxy[0].cpu().numpy().astype(int)
|
| 378 |
+
conf = float(box.conf[0].cpu())
|
| 379 |
+
cv2.rectangle(annotated, (x1, y1), (x2, y2), (255, 0, 0), 2)
|
| 380 |
+
cv2.putText(annotated, f"Sig #{count+1} {conf:.0%}",
|
| 381 |
+
(x1, max(y1 - 8, 0)),
|
| 382 |
+
cv2.FONT_HERSHEY_SIMPLEX, 0.6, (255, 0, 0), 2)
|
| 383 |
+
if count == 0:
|
| 384 |
+
x1c = max(0, x1); y1c = max(0, y1)
|
| 385 |
+
x2c = min(image.shape[1], x2); y2c = min(image.shape[0], y2)
|
| 386 |
+
if x2c > x1c and y2c > y1c:
|
| 387 |
+
first_crop = image[y1c:y2c, x1c:x2c]
|
| 388 |
+
count += 1
|
| 389 |
+
|
| 390 |
+
summary = (
|
| 391 |
+
f"Rilevat{'a' if count == 1 else 'e'} {count} firm{'a' if count == 1 else 'e'}."
|
| 392 |
+
if count > 0
|
| 393 |
+
else "Nessuna firma rilevata nel documento."
|
| 394 |
+
)
|
| 395 |
+
return annotated, first_crop, summary
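The decision rule in `sig_verify` above can be read in isolation: reference embeddings are L2-normalised, summed, re-normalised, and compared to the query by cosine distance against `SIG_THRESHOLD = 0.35`. The following is a minimal sketch of that arithmetic in plain Python, assuming embeddings are simple lists of floats. The helper names `l2_normalise`, `cosine_distance`, and `verdict` are hypothetical, and the 2-d vectors are toy values, not SigNet output.

```python
import math

SIG_THRESHOLD = 0.35  # same value as the constant in the diff


def l2_normalise(v):
    """Scale a vector to unit length (left unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]


def cosine_distance(a, b):
    """1 - cosine similarity of two vectors."""
    a_n, b_n = l2_normalise(a), l2_normalise(b)
    return 1.0 - sum(x * y for x, y in zip(a_n, b_n))


def verdict(refs, query, threshold=SIG_THRESHOLD):
    """Average one or more reference embeddings, then threshold the distance."""
    unit_refs = [l2_normalise(r) for r in refs]
    summed = [sum(col) for col in zip(*unit_refs)]
    mean_ref = l2_normalise(summed)
    dist = cosine_distance(mean_ref, query)
    return dist, dist < threshold


# Toy embeddings: two references and two candidate queries.
refs = [[1.0, 0.0], [0.8, 0.6]]
close_dist, close_ok = verdict(refs, [0.9, 0.3])
far_dist, far_ok = verdict(refs, [0.0, 1.0])
print(round(close_dist, 4), close_ok)
print(round(far_dist, 4), far_ok)
```

Re-normalising the summed references keeps the averaged embedding on the unit sphere, so the threshold means the same thing with one reference or two.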
|
core/writer.py
ADDED
|
@@ -0,0 +1,353 @@
|
| 1 |
+
"""
|
| 2 |
+
GraphoLab core — Writer Identification.
|
| 3 |
+
|
| 4 |
+
Provides:
|
| 5 |
+
- writer_identify() identify the writer of a handwriting sample
|
| 6 |
+
"""
|
| 7 |
+
|
| 8 |
+
from __future__ import annotations
|
| 9 |
+
|
| 10 |
+
import io
|
| 11 |
+
import threading
|
| 12 |
+
from pathlib import Path
|
| 13 |
+
|
| 14 |
+
import matplotlib
|
| 15 |
+
matplotlib.use("Agg")
|
| 16 |
+
import matplotlib.pyplot as plt
|
| 17 |
+
import numpy as np
|
| 18 |
+
from PIL import Image, ImageDraw, ImageFont
|
| 19 |
+
from skimage import filters, transform as sk_transform
|
| 20 |
+
from skimage.feature import hog, local_binary_pattern
|
| 21 |
+
from sklearn.pipeline import Pipeline
|
| 22 |
+
from sklearn.preprocessing import LabelEncoder, StandardScaler
|
| 23 |
+
from sklearn.svm import SVC
|
| 24 |
+
|
| 25 |
+
# ──────────────────────────────────────────────────────────────────────────────
|
| 26 |
+
# Configuration
|
| 27 |
+
# ──────────────────────────────────────────────────────────────────────────────
|
| 28 |
+
|
| 29 |
+
WRITER_IMG_SIZE = (128, 256) # (H, W) for feature extraction
|
| 30 |
+
|
| 31 |
+
_WRITER_NAMES = {
|
| 32 |
+
0: "Scrittore A",
|
| 33 |
+
1: "Scrittore B",
|
| 34 |
+
2: "Scrittore C",
|
| 35 |
+
3: "Scrittore D",
|
| 36 |
+
4: "Scrittore E",
|
| 37 |
+
}
|
| 38 |
+
|
| 39 |
+
_FONTS_DIR = Path("C:/Windows/Fonts")
|
| 40 |
+
_WRITER_FONTS = [
|
| 41 |
+
("Inkfree.ttf", 19),
|
| 42 |
+
("LHANDW.TTF", 17),
|
| 43 |
+
("segoepr.ttf", 18),
|
| 44 |
+
("segoesc.ttf", 16),
|
| 45 |
+
("comic.ttf", 18),
|
| 46 |
+
]
|
| 47 |
+
|
| 48 |
+
_SENTENCES = [
|
| 49 |
+
"il gatto dorme sul tetto",
|
| 50 |
+
"la casa è piccola e bella",
|
| 51 |
+
"oggi il cielo è molto blu",
|
| 52 |
+
"scrivere a mano è un'arte",
|
| 53 |
+
"ogni persona ha uno stile",
|
| 54 |
+
"il sole tramonta a ovest",
|
| 55 |
+
"leggo un libro ogni sera",
|
| 56 |
+
"la penna scorre sul foglio",
|
| 57 |
+
"le parole raccontano storie",
|
| 58 |
+
"questo è un campione scritto",
|
| 59 |
+
]
|
| 60 |
+
|
| 61 |
+
# ──────────────────────────────────────────────────────────────────────────────
|
| 62 |
+
# Lazy model state
|
| 63 |
+
# ──────────────────────────────────────────────────────────────────────────────
|
| 64 |
+
|
| 65 |
+
_writer_clf: Pipeline | None = None
|
| 66 |
+
_writer_le: LabelEncoder | None = None
|
| 67 |
+
_writer_X_scaled: np.ndarray | None = None
|
| 68 |
+
_writer_dist_threshold: float | None = None
|
| 69 |
+
_writer_lock = threading.Lock()
|
| 70 |
+
|
| 71 |
+
|
| 72 |
+
# ──────────────────────────────────────────────────────────────────────────────
|
| 73 |
+
# Internal helpers
|
| 74 |
+
# ──────────────────────────────────────────────────────────────────────────────
|
| 75 |
+
|
| 76 |
+
def _make_synthetic_writer(writer_id: int, sample_id: int) -> Image.Image:
|
| 77 |
+
"""Generate a synthetic handwriting sample using system TTF fonts."""
|
| 78 |
+
rng = np.random.default_rng(writer_id * 1000 + sample_id)
|
| 79 |
+
font_name, base_size = _WRITER_FONTS[writer_id % len(_WRITER_FONTS)]
|
| 80 |
+
font_size = base_size + int(rng.integers(-1, 2))
|
| 81 |
+
try:
|
| 82 |
+
font = ImageFont.truetype(str(_FONTS_DIR / font_name), font_size)
|
| 83 |
+
except Exception:
|
| 84 |
+
font = ImageFont.load_default()
|
| 85 |
+
|
| 86 |
+
ink_value = int([25, 15, 35, 20, 30][writer_id % 5] + rng.integers(-5, 6))
|
| 87 |
+
lines = [
|
| 88 |
+
_SENTENCES[(writer_id * 3 + sample_id + i) % len(_SENTENCES)]
|
| 89 |
+
for i in range(3)
|
| 90 |
+
]
|
| 91 |
+
|
| 92 |
+
w, h = 320, 140
|
| 93 |
+
img = Image.new("L", (w, h), 255)
|
| 94 |
+
draw = ImageDraw.Draw(img)
|
| 95 |
+
line_gap = font_size + 12 + int(rng.integers(-2, 3))
|
| 96 |
+
y = 10
|
| 97 |
+
for line in lines:
|
| 98 |
+
x = 8 + int(rng.integers(-3, 4))
|
| 99 |
+
draw.text((x, y), line, fill=ink_value, font=font)
|
| 100 |
+
y += line_gap
|
| 101 |
+
|
| 102 |
+
angle = float(rng.uniform(-1.5, 1.5))
|
| 103 |
+
img = img.rotate(angle, fillcolor=255, expand=False)
|
| 104 |
+
return img
|
| 105 |
+
|
| 106 |
+
|
| 107 |
+
def _preprocess_writer_img(pil_img: Image.Image) -> np.ndarray:
|
| 108 |
+
"""Convert a PIL image to a binarised (ink = 1) float32 array resized to WRITER_IMG_SIZE."""
|
| 109 |
+
gray = pil_img.convert("L")
|
| 110 |
+
w, h = gray.size
|
| 111 |
+
target_ratio = WRITER_IMG_SIZE[1] / WRITER_IMG_SIZE[0] # 2.0
|
| 112 |
+
if h > w:
|
| 113 |
+
crop_h = int(w / target_ratio)
|
| 114 |
+
top = h // 6
|
| 115 |
+
top = min(top, max(0, h - crop_h))
|
| 116 |
+
gray = gray.crop((0, top, w, top + crop_h))
|
| 117 |
+
arr = np.array(gray, dtype=np.float32)
|
| 118 |
+
thresh = filters.threshold_otsu(arr) if arr.std() > 1 else 128.0
|
| 119 |
+
binary = (arr < thresh).astype(np.float32)
|
| 120 |
+
resized = sk_transform.resize(binary, WRITER_IMG_SIZE, anti_aliasing=True)
|
| 121 |
+
return resized.astype(np.float32)
|
| 122 |
+
|
| 123 |
+
|
| 124 |
+
def _extract_writer_features(pil_img: Image.Image) -> np.ndarray:
|
| 125 |
+
"""Extract HOG + LBP + run-length features for writer identification."""
|
| 126 |
+
arr = _preprocess_writer_img(pil_img)
|
| 127 |
+
arr8 = (arr * 255).astype(np.uint8)
|
| 128 |
+
|
| 129 |
+
hog_feats = hog(
|
| 130 |
+
arr,
|
| 131 |
+
orientations=9,
|
| 132 |
+
pixels_per_cell=(16, 16),
|
| 133 |
+
cells_per_block=(2, 2),
|
| 134 |
+
feature_vector=True,
|
| 135 |
+
)
|
| 136 |
+
|
| 137 |
+
lbp = local_binary_pattern(arr8, P=24, R=3, method="uniform")
|
| 138 |
+
lbp_hist, _ = np.histogram(lbp, bins=26, range=(0, 26), density=True)
|
| 139 |
+
|
| 140 |
+
def _run_stats(binary_row):
|
| 141 |
+
runs = []
|
| 142 |
+
cnt = 0
|
| 143 |
+
for v in binary_row:
|
| 144 |
+
if v > 0.5:
|
| 145 |
+
cnt += 1
|
| 146 |
+
elif cnt > 0:
|
| 147 |
+
runs.append(cnt)
|
| 148 |
+
cnt = 0
|
| 149 |
+
if cnt > 0:
|
| 150 |
+
runs.append(cnt)
|
| 151 |
+
return runs
|
| 152 |
+
|
| 153 |
+
h_runs, v_runs = [], []
|
| 154 |
+
for row in arr:
|
| 155 |
+
h_runs.extend(_run_stats(row))
|
| 156 |
+
for col in arr.T:
|
| 157 |
+
v_runs.extend(_run_stats(col))
|
| 158 |
+
|
| 159 |
+
h_arr = np.array(h_runs, dtype=np.float32) if h_runs else np.array([0.0])
|
| 160 |
+
v_arr = np.array(v_runs, dtype=np.float32) if v_runs else np.array([0.0])
|
| 161 |
+
run_feats = np.array([
|
| 162 |
+
h_arr.mean(), h_arr.std(), h_arr.max(),
|
| 163 |
+
v_arr.mean(), v_arr.std(), v_arr.max(),
|
| 164 |
+
], dtype=np.float32)
|
| 165 |
+
|
| 166 |
+
return np.concatenate([hog_feats, lbp_hist, run_feats])
|
| 167 |
+
|
| 168 |
+
|
| 169 |
+
def _load_real_writer_samples(samples_dir: Path) -> tuple[list, list] | None:
|
| 170 |
+
"""Load features and labels from data/samples/writer_XX/sample_YY.png; return None if fewer than two usable writers."""
|
| 171 |
+
writer_dirs = sorted(samples_dir.glob("writer_??"))
|
| 172 |
+
if len(writer_dirs) < 2:
|
| 173 |
+
return None
|
| 174 |
+
X, y = [], []
|
| 175 |
+
for wd in writer_dirs:
|
| 176 |
+
samples = sorted(wd.glob("sample_*.png"))
|
| 177 |
+
if len(samples) < 3:
|
| 178 |
+
continue
|
| 179 |
+
for sp in samples:
|
| 180 |
+
try:
|
| 181 |
+
img = Image.open(sp)
|
| 182 |
+
X.append(_extract_writer_features(img))
|
| 183 |
+
y.append(wd.name)
|
| 184 |
+
except Exception:
|
| 185 |
+
pass
|
| 186 |
+
if len(set(y)) < 2:
|
| 187 |
+
return None
|
| 188 |
+
return X, y
|
| 189 |
+
|
| 190 |
+
|
| 191 |
+
def _get_writer_model(samples_dir: Path):
|
| 192 |
+
"""Return (Pipeline, LabelEncoder), training lazily on first call (thread-safe)."""
|
| 193 |
+
global _writer_clf, _writer_le, _writer_X_scaled, _writer_dist_threshold
|
| 194 |
+
if _writer_clf is not None:
|
| 195 |
+
return _writer_clf, _writer_le
|
| 196 |
+
with _writer_lock:
|
| 197 |
+
if _writer_clf is not None:
|
| 198 |
+
return _writer_clf, _writer_le
|
| 199 |
+
print("Training writer identification model...")
|
| 200 |
+
|
| 201 |
+
real = _load_real_writer_samples(samples_dir)
|
| 202 |
+
if real is not None:
|
| 203 |
+
X_raw, labels = real
|
| 204 |
+
else:
|
| 205 |
+
X_raw, labels = [], []
|
| 206 |
+
for wid in range(5):
|
| 207 |
+
for sid in range(10):
|
| 208 |
+
img = _make_synthetic_writer(wid, sid)
|
| 209 |
+
X_raw.append(_extract_writer_features(img))
|
| 210 |
+
labels.append(_WRITER_NAMES[wid])
|
| 211 |
+
|
| 212 |
+
le = LabelEncoder()
|
| 213 |
+
y_enc = le.fit_transform(labels)
|
| 214 |
+
X = np.array(X_raw)
|
| 215 |
+
|
| 216 |
+
clf = Pipeline([
|
| 217 |
+
("scaler", StandardScaler()),
|
| 218 |
+
("svc", SVC(kernel="rbf", C=10, gamma="scale", probability=True)),
|
| 219 |
+
])
|
| 220 |
+
clf.fit(X, y_enc)
|
| 221 |
+
|
| 222 |
+
X_scaled = clf.named_steps["scaler"].transform(X)
|
| 223 |
+
max_intra = 0.0
|
| 224 |
+
for cls in np.unique(y_enc):
|
| 225 |
+
Xc = X_scaled[y_enc == cls]
|
| 226 |
+
if len(Xc) < 2:
|
| 227 |
+
continue
|
| 228 |
+
diff = Xc[:, np.newaxis, :] - Xc[np.newaxis, :, :]
|
| 229 |
+
dists = np.sqrt((diff ** 2).sum(axis=2))
|
| 230 |
+
np.fill_diagonal(dists, np.inf)
|
| 231 |
+
max_intra = max(max_intra, dists.min(axis=1).max())
|
| 232 |
+
|
| 233 |
+
_writer_X_scaled = X_scaled
|
| 234 |
+
_writer_dist_threshold = max_intra * 2.0
|
| 235 |
+
_writer_clf = clf
|
| 236 |
+
_writer_le = le
|
| 237 |
+
print(
|
| 238 |
+
f"Writer model ready — {len(le.classes_)} writers, {len(X)} samples. "
|
| 239 |
+
f"Rejection threshold: {_writer_dist_threshold:.3f}"
|
| 240 |
+
)
|
| 241 |
+
return _writer_clf, _writer_le
|
| 242 |
+
|
| 243 |
+
|
| 244 |
+
def ensure_writer_examples(examples_dir: Path) -> list[str]:
|
| 245 |
+
"""Generate synthetic example images for the UI (skipping files that already exist) and return their paths."""
|
| 246 |
+
examples_dir.mkdir(parents=True, exist_ok=True)
|
| 247 |
+
paths = []
|
| 248 |
+
for wid in range(5):
|
| 249 |
+
p = examples_dir / f"writer_{wid}_example.png"
|
| 250 |
+
if not p.exists():
|
| 251 |
+
img = _make_synthetic_writer(wid, sample_id=99)
|
| 252 |
+
img.save(str(p))
|
| 253 |
+
paths.append(str(p))
|
| 254 |
+
return paths
|
| 255 |
+
|
| 256 |
+
|
| 257 |
+
# ──────────────────────────────────────────────────────────────────────────────
|
| 258 |
+
# Core function
|
| 259 |
+
# ──────────────────────────────────────────────────────────────────────────────
|
| 260 |
+
|
| 261 |
+
def writer_identify(image: np.ndarray, samples_dir: Path) -> tuple[str, np.ndarray | None]:
|
| 262 |
+
"""Identify the most likely writer of a handwriting sample.
|
| 263 |
+
|
| 264 |
+
Args:
|
| 265 |
+
image: RGB numpy array of the handwriting sample.
|
| 266 |
+
samples_dir: Path to data/samples/ directory (for real writer samples).
|
| 267 |
+
|
| 268 |
+
Returns:
|
| 269 |
+
report_md: Markdown with ranked candidates.
|
| 270 |
+
chart: Bar chart as numpy array (or None on error).
|
| 271 |
+
"""
|
| 272 |
+
if image is None:
|
| 273 |
+
return "Carica un'immagine di testo manoscritto.", None
|
| 274 |
+
try:
|
| 275 |
+
clf, le = _get_writer_model(samples_dir)
|
| 276 |
+
except Exception as e:
|
| 277 |
+
return f"Errore nel caricamento del modello: {e}", None
|
| 278 |
+
|
| 279 |
+
pil_img = Image.fromarray(image)
|
| 280 |
+
try:
|
| 281 |
+
feat = _extract_writer_features(pil_img)
|
| 282 |
+
except Exception as e:
|
| 283 |
+
return f"Errore nell'estrazione delle caratteristiche: {e}", None
|
| 284 |
+
|
| 285 |
+
proba = clf.predict_proba([feat])[0]
|
| 286 |
+
order = np.argsort(proba)[::-1]
|
| 287 |
+
names = le.inverse_transform(order)
|
| 288 |
+
scores = proba[order]
|
| 289 |
+
|
| 290 |
+
is_unknown = False
|
| 291 |
+
if _writer_X_scaled is not None and _writer_dist_threshold is not None:
|
| 292 |
+
feat_scaled = clf.named_steps["scaler"].transform([feat])[0]
|
| 293 |
+
min_dist = np.linalg.norm(_writer_X_scaled - feat_scaled, axis=1).min()
|
| 294 |
+
is_unknown = min_dist > _writer_dist_threshold
|
| 295 |
+
|
| 296 |
+
rows = "\n".join(
|
| 297 |
+
f"| {'🥇' if i == 0 else '🥈' if i == 1 else '🥉' if i == 2 else ' '} "
|
| 298 |
+
f"**{name}** | {score:.1%} |"
|
| 299 |
+
for i, (name, score) in enumerate(zip(names, scores))
|
| 300 |
+
)
|
| 301 |
+
if is_unknown:
|
| 302 |
+
report_md = (
|
| 303 |
+
"**⚠️ Scrittore non identificato nel database**\n\n"
|
| 304 |
+
"La scrittura analizzata non corrisponde a nessuno degli scrittori noti. "
|
| 305 |
+
"Le probabilità di seguito hanno valore puramente indicativo "
|
| 306 |
+
"e **non devono essere usate per un'attribuzione**.\n\n"
|
| 307 |
+
"| Candidato | Probabilità (riferimento) |\n"
|
| 308 |
+
"|-----------|---------------------------|\n"
|
| 309 |
+
+ rows
|
| 310 |
+
+ "\n\n*La distanza dal campione più simile nel database supera la soglia "
|
| 311 |
+
"di affidabilità. Aggiungere campioni dello scrittore al database per "
|
| 312 |
+
"un confronto diretto.*"
|
| 313 |
+
)
|
| 314 |
+
else:
|
| 315 |
+
report_md = (
|
| 316 |
+
"**Identificazione Scrittore — Risultati**\n\n"
|
| 317 |
+
"| Candidato | Probabilità |\n"
|
| 318 |
+
"|-----------|-------------|\n"
|
| 319 |
+
+ rows
|
| 320 |
+
+ "\n\n*I risultati si basano su caratteristiche HOG + LBP + statistiche dei tratti.*"
|
| 321 |
+
)
|
| 322 |
+
if _load_real_writer_samples(samples_dir) is None:
|
| 323 |
+
report_md += (
|
| 324 |
+
"\n\n⚠️ *Dati sintetici: il modello è addestrato su scritture generate "
|
| 325 |
+
"artificialmente. Per risultati forensi reali, popola `data/samples/writer_XX/`.*"
|
| 326 |
+
)
|
| 327 |
+
|
| 328 |
+
# Bar chart
|
| 329 |
+
fig, ax = plt.subplots(figsize=(5, max(2.5, len(names) * 0.55)))
|
| 330 |
+
if is_unknown:
|
| 331 |
+
colors = ["#aaaaaa"] * len(names)
|
| 332 |
+
chart_title = "Scrittore non nel database — solo riferimento"
|
| 333 |
+
else:
|
| 334 |
+
colors = [
|
| 335 |
+
"#1B3A6B" if i == 0 else "#C8973A" if i == 1 else "#9eb8e0"
|
| 336 |
+
for i in range(len(names))
|
| 337 |
+
]
|
| 338 |
+
chart_title = "Probabilità per scrittore"
|
| 339 |
+
ax.barh(names[::-1], scores[::-1] * 100, color=colors[::-1])
|
| 340 |
+
ax.set_xlabel("Probabilità (%)")
|
| 341 |
+
ax.set_xlim(0, 105)
|
| 342 |
+
ax.set_title(chart_title)
|
| 343 |
+
for i, (name, score) in enumerate(zip(names[::-1], scores[::-1])):
|
| 344 |
+
ax.text(score * 100 + 1, i, f"{score:.1%}", va="center", fontsize=9)
|
| 345 |
+
plt.tight_layout()
|
| 346 |
+
|
| 347 |
+
buf = io.BytesIO()
|
| 348 |
+
fig.savefig(buf, format="png", dpi=120)
|
| 349 |
+
plt.close(fig)
|
| 350 |
+
buf.seek(0)
|
| 351 |
+
chart_arr = np.array(Image.open(buf))
|
| 352 |
+
|
| 353 |
+
return report_md, chart_arr
|
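The open-set check in `core/writer.py` above derives a rejection threshold from the training set: for each writer it takes the largest nearest-neighbour distance among that writer's own samples, keeps the maximum over all writers, and doubles it. A query whose distance to its closest training sample exceeds that value is reported as unknown. A minimal sketch of the same idea, assuming the feature matrix has already been standardised; the function names `rejection_threshold` and `is_unknown` are hypothetical and do not appear in the module:

```python
from __future__ import annotations

import numpy as np


def rejection_threshold(X: np.ndarray, y: np.ndarray, factor: float = 2.0) -> float:
    """Largest within-class nearest-neighbour distance, scaled by `factor`."""
    max_intra = 0.0
    for cls in np.unique(y):
        Xc = X[y == cls]
        if len(Xc) < 2:
            continue  # a single sample has no neighbour to compare against
        # Pairwise Euclidean distances between samples of the same class.
        diff = Xc[:, np.newaxis, :] - Xc[np.newaxis, :, :]
        dists = np.sqrt((diff ** 2).sum(axis=2))
        np.fill_diagonal(dists, np.inf)  # ignore self-distances
        max_intra = max(max_intra, float(dists.min(axis=1).max()))
    return max_intra * factor


def is_unknown(query: np.ndarray, X: np.ndarray, threshold: float) -> bool:
    """True when the query is farther than `threshold` from every training sample."""
    return bool(np.linalg.norm(X - query, axis=1).min() > threshold)
```

The factor of 2.0 is the value used in the diff. It is a heuristic margin, so a different dataset may call for a different setting.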
notebooks/02_handwritten_ocr_trocr.ipynb
CHANGED
|
@@ -27,6 +27,52 @@
|
|
| 27 |
"> **Note on line segmentation:** TrOCR is optimised for *single-line* images. For multi-line documents, we first segment the image into individual line strips using a horizontal projection profile, then transcribe each line separately. This avoids the hallucinations that occur when the model is shown multiple lines at once."
|
| 28 |
]
|
| 29 |
},
|
| 30 |
{
|
| 31 |
"cell_type": "markdown",
|
| 32 |
"id": "237c8d70",
|
|
@@ -48,10 +94,19 @@
|
|
| 48 |
},
|
| 49 |
{
|
| 50 |
"cell_type": "code",
|
| 51 |
-
"execution_count":
|
| 52 |
"id": "82f9ca02",
|
| 53 |
"metadata": {},
|
| 54 |
-
"outputs": [
|
| 55 |
"source": [
|
| 56 |
"import warnings\n",
|
| 57 |
"warnings.filterwarnings('ignore')\n",
|
|
@@ -82,10 +137,62 @@
|
|
| 82 |
},
|
| 83 |
{
|
| 84 |
"cell_type": "code",
|
| 85 |
-
"execution_count":
|
| 86 |
"id": "6e5b55c8",
|
| 87 |
"metadata": {},
|
| 88 |
-
"outputs": [
|
| 89 |
"source": [
|
| 90 |
"MODEL_NAME = \"microsoft/trocr-large-handwritten\"\n",
|
| 91 |
"DEVICE = \"cuda\" if torch.cuda.is_available() else \"cpu\"\n",
|
|
@@ -107,7 +214,7 @@
|
|
| 107 |
},
|
| 108 |
{
|
| 109 |
"cell_type": "code",
|
| 110 |
-
"execution_count":
|
| 111 |
"id": "677aabb0",
|
| 112 |
"metadata": {},
|
| 113 |
"outputs": [],
|
|
@@ -237,10 +344,30 @@
|
|
| 237 |
},
|
| 238 |
{
|
| 239 |
"cell_type": "code",
|
| 240 |
-
"execution_count":
|
| 241 |
"id": "7c007b96",
|
| 242 |
"metadata": {},
|
| 243 |
-
"outputs": [
|
| 244 |
"source": [
|
| 245 |
"# Public sample from Hugging Face (IAM handwriting)\n",
|
| 246 |
"SAMPLE_URL = \"https://fki.tic.heia-fr.ch/static/img/a01-122-02.jpg\"\n",
|
|
@@ -277,10 +404,28 @@
|
|
| 277 |
},
|
| 278 |
{
|
| 279 |
"cell_type": "code",
|
| 280 |
-
"execution_count":
|
| 281 |
"id": "06391d26",
|
| 282 |
"metadata": {},
|
| 283 |
-
"outputs": [
|
| 284 |
"source": [
|
| 285 |
"# ─── Change this path to your own image ───────────────────────────────────────\n",
|
| 286 |
"USER_IMAGE_PATH = \"../data/samples/handwritten_text_01.png\"\n",
|
|
@@ -309,10 +454,24 @@
|
|
| 309 |
},
|
| 310 |
{
|
| 311 |
"cell_type": "code",
|
| 312 |
-
"execution_count":
|
| 313 |
"id": "533344f7",
|
| 314 |
"metadata": {},
|
| 315 |
-
"outputs": [
|
| 316 |
"source": [
|
| 317 |
"samples_dir = Path(\"../data/samples\")\n",
|
| 318 |
"image_files = sorted(samples_dir.glob(\"handwritten_text_*.png\"))\n",
|
|
@@ -344,10 +503,20 @@
|
|
| 344 |
},
|
| 345 |
{
|
| 346 |
"cell_type": "code",
|
| 347 |
-
"execution_count":
|
| 348 |
"id": "a238b64c",
|
| 349 |
"metadata": {},
|
| 350 |
-
"outputs": [
|
| 351 |
"source": [
|
| 352 |
"def cer(reference: str, hypothesis: str) -> float:\n",
|
| 353 |
" \"\"\"Compute Character Error Rate (CER) using edit distance.\"\"\"\n",
|
|
|
|
| 27 |
"> **Note on line segmentation:** TrOCR is optimised for *single-line* images. For multi-line documents, we first segment the image into individual line strips using a horizontal projection profile, then transcribe each line separately. This avoids the hallucinations that occur when the model is shown multiple lines at once."
|
| 28 |
]
|
| 29 |
},
|
| 30 |
+
{
|
| 31 |
+
"cell_type": "markdown",
|
| 32 |
+
"id": "9ivizqpes6v",
|
| 33 |
+
"metadata": {},
|
| 34 |
+
"source": [
|
| 35 |
+
"## GraphoLab Core — Quick Start\n",
|
| 36 |
+
"\n",
|
| 37 |
+
"> The production implementation of the HTR functionality is available in [`core/ocr.py`](../core/ocr.py).\n",
|
| 38 |
+
"> It uses **EasyOCR** (Italian + English) with deskew and CLAHE preprocessing, and is shared by both\n",
|
| 39 |
+
"> the Gradio demo and the FastAPI backend.\n",
|
| 40 |
+
">\n",
|
| 41 |
+
"> Run the cell below to import it directly. The remaining cells in this notebook implement\n",
|
| 42 |
+
"> **TrOCR** (Transformer-based OCR) from scratch for educational purposes."
|
| 43 |
+
]
|
| 44 |
+
},
|
| 45 |
+
{
|
| 46 |
+
"cell_type": "code",
|
| 47 |
+
"execution_count": 1,
|
| 48 |
+
"id": "7ntnij9l6r2",
|
| 49 |
+
"metadata": {},
|
| 50 |
+
"outputs": [
|
| 51 |
+
{
|
| 52 |
+
"name": "stdout",
|
| 53 |
+
"output_type": "stream",
|
| 54 |
+
"text": [
|
| 55 |
+
"core.ocr imported — htr_transcribe() ready.\n"
|
| 56 |
+
]
|
| 57 |
+
}
|
| 58 |
+
],
|
| 59 |
+
"source": [
|
| 60 |
+
"# GraphoLab Core — production usage\n",
|
| 61 |
+
"# Run this cell to use the shared core module instead of the notebook implementation below.\n",
|
| 62 |
+
"import sys, pathlib\n",
|
| 63 |
+
"sys.path.insert(0, str(pathlib.Path(\"..\").resolve()))\n",
|
| 64 |
+
"\n",
|
| 65 |
+
"from core.ocr import htr_transcribe, get_easyocr\n",
|
| 66 |
+
"from PIL import Image\n",
|
| 67 |
+
"import numpy as np\n",
|
| 68 |
+
"\n",
|
| 69 |
+
"# Example: transcribe a local image\n",
|
| 70 |
+
"# img = np.array(Image.open(\"../data/samples/handwritten_text_01.png\").convert(\"RGB\"))\n",
|
| 71 |
+
"# text = htr_transcribe(img)\n",
|
| 72 |
+
"# print(text)\n",
|
| 73 |
+
"print(\"core.ocr imported — htr_transcribe() ready.\")"
|
| 74 |
+
]
|
| 75 |
+
},
|
| 76 |
{
|
| 77 |
"cell_type": "markdown",
|
| 78 |
"id": "237c8d70",
|
|
|
|
| 94 |
},
|
| 95 |
{
|
| 96 |
"cell_type": "code",
|
| 97 |
+
"execution_count": 2,
|
| 98 |
"id": "82f9ca02",
|
| 99 |
"metadata": {},
|
| 100 |
+
"outputs": [
|
| 101 |
+
{
|
| 102 |
+
"name": "stdout",
|
| 103 |
+
"output_type": "stream",
|
| 104 |
+
"text": [
|
| 105 |
+
"PyTorch version: 2.6.0+cu124\n",
|
| 106 |
+
"Device: cuda\n"
|
| 107 |
+
]
|
| 108 |
+
}
|
| 109 |
+
],
|
| 110 |
"source": [
|
| 111 |
"import warnings\n",
|
| 112 |
"warnings.filterwarnings('ignore')\n",
|
|
|
|
| 137 |
},
|
| 138 |
{
|
| 139 |
"cell_type": "code",
|
| 140 |
+
"execution_count": 3,
|
| 141 |
"id": "6e5b55c8",
|
| 142 |
"metadata": {},
|
| 143 |
+
"outputs": [
|
| 144 |
+
{
|
| 145 |
+
"name": "stdout",
|
| 146 |
+
"output_type": "stream",
|
| 147 |
+
"text": [
|
| 148 |
+
"Loading microsoft/trocr-large-handwritten ...\n"
|
| 149 |
+
]
|
| 150 |
+
},
|
| 151 |
+
{
|
| 152 |
+
"name": "stderr",
|
| 153 |
+
"output_type": "stream",
|
| 154 |
+
"text": [
|
| 155 |
+
"Warning: You are sending unauthenticated requests to the HF Hub. Please set a HF_TOKEN to enable higher rate limits and faster downloads.\n",
|
| 156 |
+
"The image processor of type `ViTImageProcessor` is now loaded as a fast processor by default, even if the model checkpoint was saved with a slow processor. This is a breaking change and may produce slightly different outputs. To continue using the slow processor, instantiate this class with `use_fast=False`. \n"
|
| 157 |
+
]
|
| 158 |
+
},
|
| 159 |
+
{
|
| 160 |
+
"data": {
|
| 161 |
+
"application/vnd.jupyter.widget-view+json": {
|
| 162 |
+
"model_id": "25177219f7db437eba687822aa6fbe28",
|
| 163 |
+
"version_major": 2,
|
| 164 |
+
"version_minor": 0
|
| 165 |
+
},
|
| 166 |
+
"text/plain": [
|
| 167 |
+
"Loading weights: 0%| | 0/635 [00:00<?, ?it/s]"
|
| 168 |
+
]
|
| 169 |
+
},
|
| 170 |
+
"metadata": {},
|
| 171 |
+
"output_type": "display_data"
|
| 172 |
+
},
|
| 173 |
+
{
|
| 174 |
+
"name": "stderr",
|
| 175 |
+
"output_type": "stream",
|
| 176 |
+
"text": [
|
| 177 |
+
"The tied weights mapping and config for this model specifies to tie decoder.model.decoder.embed_tokens.weight to decoder.output_projection.weight, but both are present in the checkpoints, so we will NOT tie them. You should update the config with `tie_word_embeddings=False` to silence this warning\n",
|
| 178 |
+
"\u001b[1mVisionEncoderDecoderModel LOAD REPORT\u001b[0m from: microsoft/trocr-large-handwritten\n",
|
| 179 |
+
"Key | Status | \n",
|
| 180 |
+
"----------------------------+---------+-\n",
|
| 181 |
+
"encoder.pooler.dense.bias | MISSING | \n",
|
| 182 |
+
"encoder.pooler.dense.weight | MISSING | \n",
|
| 183 |
+
"\n",
|
| 184 |
+
"\u001b[3mNotes:\n",
|
| 185 |
+
"- MISSING\u001b[3m\t:those params were newly initialized because missing from the checkpoint. Consider training on your downstream task.\u001b[0m\n"
|
| 186 |
+
]
|
| 187 |
+
},
|
| 188 |
+
{
|
| 189 |
+
"name": "stdout",
|
| 190 |
+
"output_type": "stream",
|
| 191 |
+
"text": [
|
| 192 |
+
"Model ready.\n"
|
| 193 |
+
]
|
| 194 |
+
}
|
| 195 |
+
],
|
| 196 |
"source": [
|
| 197 |
"MODEL_NAME = \"microsoft/trocr-large-handwritten\"\n",
|
| 198 |
"DEVICE = \"cuda\" if torch.cuda.is_available() else \"cpu\"\n",
|
|
|
|
| 214 |
},
|
| 215 |
{
|
| 216 |
"cell_type": "code",
|
| 217 |
+
"execution_count": 4,
|
| 218 |
"id": "677aabb0",
|
| 219 |
"metadata": {},
|
| 220 |
"outputs": [],
|
|
|
|
| 344 |
},
|
| 345 |
{
|
| 346 |
"cell_type": "code",
|
| 347 |
+
"execution_count": 5,
|
| 348 |
"id": "7c007b96",
|
| 349 |
"metadata": {},
|
| 350 |
+
"outputs": [
|
| 351 |
+
{
|
| 352 |
+
"name": "stdout",
|
| 353 |
+
"output_type": "stream",
|
| 354 |
+
"text": [
|
| 355 |
+
"Loaded local sample: ..\\data\\samples\\handwritten_text_01.png\n",
|
| 356 |
+
"\n",
|
| 357 |
+
"Transcription: Io sottoscritto Mario Bianchi , vinto a Firenze il\n"
|
| 358 |
+
]
|
| 359 |
+
},
|
| 360 |
+
{
|
| 361 |
+
"data": {
|
| 362 |
+
"image/png": "iVBORw0KGgoAAAANSUhEUgAABSIAAAGNCAYAAAAFLQ8PAAAAOnRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjEwLjgsIGh0dHBzOi8vbWF0cGxvdGxpYi5vcmcvwVt1zgAAAAlwSFlzAAAPYQAAD2EBqD+naQAAeIhJREFUeJzt3QmcVXMfx/Ff+75oV6hUSknZUkJFqbRQWmwp+xJZs5MtRAqhEEqSJQlJ5SEhlCJEqWjf931T87y+v3HGnTt31uZMU33ez+s+mpm7nPM/59x7z/f8/v9/jri4uDgDAAAAAAAAgBDlDPPJAQAAAAAAAEAIIgEAAAAAAACEjiASAAAAAAAAQOgIIgEAAAAAAACEjiASAAAAAAAAQOgIIgEAAAAAAACEjiASAAAAAAAAQOgIIgEAAAAAAACEjiASAAAAAAAAQOgIIgEAAID9bMGCBZYjR46E21dffZUlr6vXiXxdLQcAAEBYCCIBAACSUalSpUQhTVpu6Q2QvvnmG7viiiusevXqVqRIEcuXL5+VL1/ezj33XHv55Zdtx44dKT4+Li7Oxo4da126dLFjjjnGihYtanny5LGyZcva2WefbX369LHly5cn3H/IkCExlzt37txWsmRJq1+/vj322GO2cePGDAVZabmpXRE+QkYAAJDd5N7fCwAAAHAo2rJli1155ZX23nvvJfmbgkPdPvvsM3vyySdt5MiRdtJJJyW53+LFi+3iiy+2b7/9NsnfVq1aZV9++aXfZs2a5QFkSvbs2WPr1q2zKVOm+G348OE2depUD0cRvhIlStjTTz+d8HOVKlWypNn1OpGvq+UAAAAIC0EkAABAMu67775ElYHr16+3xx9/POHnZs2a2TnnnJPoMSkFSJs2bfKKxb1791rnzp29kjFQrVo1a9eunQd/33//fcLf1FVWr6NwUPcJrFy50ho1amTz589P+F3lypWtbdu2Xg2pZf3hhx9ihpSRrrvuOl/mtWvX2jvvvJPQNXf27Nn2xhtvWI8ePdIVZMmECRPs888/T/j53nvvtcMOOyzh52LFiqX4nJs3bz5kAtBdu3Z5Vav2izvuuCPLX//II4/cL68LAAAOUXEAAABIk/nz58fp61Nw69WrV4p/nzhxYtzgwYPjTjjhhLj8+fPH1alTx+83fPjwRPdr2bJl3M6dOxM915AhQxLdp0WLFon+fuGFFyb6+/XXXx+3e/fuJMs8Z86cuLfeeivh5zfeeCPJMgZmzZqV6G/XXntthvYMtUvk86hdkvt7xYoV49asWRN3ww03xFWoUCEuZ86ccf379/f7jRo1Ku7SSy+Nq127dlyZMmXi8uTJE1eoUKG4Y489Nq579+5JnlcaNWqU8Nxdu3b19VdblSxZMi5fvny+LUaPHp3kcQsWLIi75ppr4qpWrerbSvctX7583GmnnRZ36623xv3xxx9JHjN16tS4bt26xVWpUiWuQIECvmzVqlXz382bNy/ZZfrtt9/izjvvvLgSJUr4737++eeY+05y7bV+/fq4Hj16eHvlzZvX22PAgAFxe/fuTXhM5HPFumk5RK+T0rb6559/4l577bW4s846y9swd+7cvtyNGzeOe+WVV5Lsc7HWY8SIEXH16tXzNipevHhchw4d4hYtWpSmfQkAABxcqIgEAAAIyYMPPuhjQEZ75ZVXEv6dM2dOe/bZZy1v3ryJ7tO1a1cfI1LVkTJu3DhbuHChVaxY0bttv/vuuwn3rVu3rr3wwgv+XNFURRlZSZmSChUqJPq5VKlSFratW7fa6aef7hWY0dQ9/IMPPkj0u927d3tXc92GDRvmFZ+1a9eO+dw///yzd2lXhWXk71R5qopNjaEZdGM/5ZRTbPXq1Ykev2zZMr999913Pv7msccem/C3Rx55xB566CGvZow0d+5cv5133nkxq2N//fVXH4dT650R27ZtszPOOMNmzpyZ8Du1xU033WRz5syx559/3jKLll
FjlX799deJfq8u/Bp/Urc333zThxAoXLhwzOd44IEHElXlbt++3Yca+OWXX7wt8ufPn2nLCwAAsj+CSAAAgJAohFRweMEFF1jBggU98NJYjEG4KHXq1PGQKxZ13468b/B8EydOTBSAKbSMFUKmh8IlTWwT0KQyHTt2tLCtWbPGb02bNrWGDRt6GKiu5VK8eHHv+q4AUF27FdaqS/qHH35oixYt8q7ud911V6Iu7pEUdOlxt956qwdgr776qre/2k7dyYMgUmFnEELq/pdffrlP3KMQUgFpdJj8/vvvW69evRJ+1ra98MILfduoq/wnn3yS7PoqCNXEQJpcSAGxnj89YZyWU+utLvVqn7feesuWLFnifxswYIDva+qyr/X766+/bNCgQTG7yB933HGpvpa65UeGkNoWDRo08C7/48eP998pZNT9Xn/99ZjPob8r5G3evLnvt5MnT/bfK6wdPXq0txsAADh0EEQCAACERGM2/vTTTx4YBRRGalzAgMKr5ET/LZj9eunSpYl+X6NGjQwvY5MmTZL8TmGVQi2FpFnhlltusf79+yf5/eDBg70CUsGXgisFcEcccYQHiBq/UjQZj+6jmcKjKUz94osv7IQTTvCfFfip+lR+/PHHhPtFzkzeqVMne+aZZ5JUBmpyoYAmEAoUKlTIt3FkmKz7p1TxqIpAVUxGCsbmTAuFfpqkSK699lp/bbWBKGxVEKlxH1WxGBlEXn311WmesVxjhg4dOjRRu0RW4SokDyZaUlWkgk+Ft9Hq1avnYaS2j5ZR20/HQLANCCIBADi0EEQCAACEpHv37olCyAPFFVdc4cFTVrn//vtj/l5dsxVSqmIyOTt37vS/H3744Un+puq9IISU6tWrJ/xbk/kEVImp0FKVkuoOr4CsZs2afv+TTz7Zw9qgSlNdo1XVGLjsssuSVLQqnNQtFlUiRoeQ6aFATyFgQMGiurar2lCmT59umUEzpqt6NLLqNpJ+DoJI3U/3b9myZZLnueqqqxJCYv1X4XwQREZuAwAAcGjYtz48AAAASFasSkVVjUWOB6lxH5MT/bcgbIseyzHW+IpppS6+jz76qI87GFBF4DXXXGNZQeNQxqqkU5WhQr6UQsjIMDKW6Oq/fPnyJfw7smu7qvb69euXMM6hXltdnjW+ocI1VfGpujAIzyIfq2AtPfalelXUVrly5Ur0uyAklQ0bNlhmUFf95F4j1s/JhYopbQPNHg8AAA4tBJEAAAAhiVUVpxBJlXqR4xjOmzcv5uODirNAEBaqQk8VfAF1jc1oqKPqOlUkKmiLrGgbMmRIzIl2MltylYMahzFYJ63riBEjvHu0QsBPP/00Tc8d3V07ss2iqfJS40+qK7cmfNHkL8EkPwpDg4pAdVuPfB6NCZkZ65tW6jIdWakoWu5AZlXglihRItnXiPVzMPbkvmwDAABw8COIBAAAyGKR1YYKlTSZSjDGX0AzQmu25kCLFi0SxoxUZWRk12l1Fb755puTBFSisRXVxTk1muxGAVxktZ1m/d5fFLgFihUr5usbhHjRAe2+0qQ0CtY06cxZZ53lIaTaInJMRE2Oo2XSfSK7e2s7RQfJmhgn6H6c2bSfRC6XxpaMnJVas4QnFwKqW3laqUo0cl+IHC8y+mfdT/cHAABIDWNEAgAAZDFN0KGuv5999pn/PGbMGB87sF27dt49eMqUKf67yGqz5557LtFzaHIXTeISdN9+4YUX/PnatGnj3WbVtVbPo6pGdXG+5JJLUl2uqlWreoXk22+/7T+rSlJh6GmnnWZZLXI8R3U3btWqlS+HQrcJEyZk6mtpZmi1j8Za1Azd5cuX91B31KhRCfdRd3qFkHL33XcnBMGq0qxbt27CrNmLFy/2bffSSy/Z+eefb2GN4antGsyaHRlia0zGQHQXfo1ZqtmrNWt327Ztk52tPegC3q1bN3vttdcSwl9th+hZs0X7V6zu9QAAANEIIgEAALKYqg8V7ChQUhdkmTNnjv
Xp0yfmGHuaZTk6NFJV5KRJk+yiiy6y77//3n/3119/JcwKnVH33HOPd4MOxkF87LHHbOzYsZbVLr/8ch+3UdWKMm7cOL+JuklHV+jtK3UDVyCpWyw33nijFShQwP/dsWNHe+ihh+zhhx/2dtIM2UFgFzaFzBqzMnI27MANN9xgjRs3TrTvqHozmFxHwXIw1qX+llIQKQq/VVEbtIkC4OgQWBP9qHoUAAAgLeiaDQAAsB+o8lFhpIIhVZ5pPEJ1PVZ32nLlynlX7IEDB9off/yRqLttJFXgTZ482T755BOv6FNFo55DFW9lypSxpk2b2osvvmhPPfVUmpdLlZmqqgyoylKTt2Q1jVGo6sf27dtb0aJFPQQ85ZRTvEpR7ZWZVAnZu3dvr7qsUqWKFSlSxNuwdOnSdvbZZ/t4mZrAJ1KvXr28MlCh6NFHH2358+f3ikn9u0uXLt6OYdDraIZsdedXIKlKTVWPKjRUVWw0tZcqbdWe6R2fUfuSxswcPHiwj0uq51C7qEK3UaNGPsO49t9gkh8AAIDU5IiLnPYPAAAAQLYSVF8G4bPGhQQAADgQUREJAAAAAAAAIHQEkQAAAAAAAABCRxAJAAAAAAAAIHSMEQkAAAAAAAAgdFREAgAAAAAAAAgdQSQAAAAAAACA0BFEAgAAAAAAAAgdQSQAAAAAAACA0BFEAgAAAAAAAAgdQSQAAAAAAACA0BFEAgAAAAAAAAgdQSQAAAAAAACA0BFEAgAAAAAAAAgdQSQAAAAAAACA0BFEAgAAAAAAAAgdQSQAAAAAAACA0BFEAgAAAAAAAAgdQSQAAAAAAACA0BFEAgAAAAAAAAgdQSQAAAAAAACA0BFEAgAAAAAAAAgdQSQAAAAAAACA0BFEAgAAAAAAAAgdQSQAAAAAAACA0BFEAgAAAAAAAAgdQSQAAAAAAFlgyJAhliNHDluwYEGmPm/jxo39BgDZHUEkAAAAACDLKZBLy+2rr75i65jZH3/8YQ899FCmh5gAkJVyxMXFxWXpKwIAAAAADnlvvfVWojZ488037fPPP7dhw4Yl+n2zZs2sbNmyB0V77dmzx3bv3m358uXzkDU9Ro4caR07drSJEycmqX7ctWuX/zdv3ryZurwAkNlyZ/ozAgAAAACQiksvvTTRzz/88IMHkdG/j7Zt2zYrWLDgAdW+W7dutUKFClmuXLn8ltkIIAEcKOiaDQAAAADIllT5d9xxx9n06dPtzDPP9ADy3nvv9b999NFH1qpVKytfvrxXGFapUsUeffRRrzqM9Rzq2tykSRN/jgoVKthTTz2V5PUGDBhgtWrV8vscdthhdvLJJ9vbb7+d6D5Lly61K6+8MuF1K1eubNdff31CVWIwDuSkSZPshhtusDJlytgRRxyR6G+R3asrVapkrVu3tgkTJljdunUtf/78VrNmTRs1alTCffQ4VUOK1iG623qsMSJXrVrly6lqUj1nnTp1bOjQoYnuo+XQ8/Tt29deeeUVb0Ot0ymnnGI//vhjBrcaACSPikgAAAAAQLa1du1aa9mypV144YVeLRl001Y4V7hwYbvtttv8v19++aU9+OCDtmnTJnv66acTPcf69eutRYsW1r59e+vUqZN3c77rrrusdu3a/tzy6quvWo8ePaxDhw528803244dO+zXX3+1KVOm2MUXX+z3WbZsmdWrV882bNhg11xzjdWoUcODST2fKjUjKxMVQpYuXdqXSRWRKZk7d6517tzZrrvuOuvatau98cYbHjyOGzfOu6YrhNWyPf/88x7EHnvssf644L/Rtm/f7sHkvHnz7MYbb/Sw9P3337du3br5smv9Iils3bx5s1177bUeTCqkVVv9/ffflidPngxtNwCIhSASAAAAAJBtrVixwgYNGuQhWXR4VqBAgYSfFeLp9tJLL9ljjz3mlX0BBYgag7JLly7+syoFK1asaK+99lpCEPnpp596NaQCu+Tcc889vjwKJ1UtGXjkkUcsevqFEiVK2B
dffJGmrthz5syxDz74wMO/YPkUciosVRB59NFH2xlnnOFBpH5ObYZsVTfOmjXLx+G85JJLEtqnUaNGdv/999sVV1xhRYoUSbj/okWLPAxVFahUr17dzjvvPBs/frxXawJAZqFrNgAAAAAg21KgePnllyf5fWQIqWq+NWvWeFinysTZs2cnuq8qJiPHnlTloiobVfEXKF68uC1ZsiTZLsl79+610aNHW5s2bRKFkIHoyWeuvvrqNI8HqW7e7dq1S/i5aNGidtlll9nPP//swWd6jR071sqVK2cXXXRRwu9U2aiqyi1btni38UiqxgxCSFE7SmT7AEBmIIgEAAAAAGRbGs8x1mQsv//+u4d3xYoV8+BO3aCDsHHjxo2J7qsxGqODQgVv6rIdUPWhAksFlNWqVbPu3bvb5MmTE/6+evVq7/at8SbTQt2h06pq1apJlu+YY47x/0aOJ5lWCxcu9HXImTPxKX/QlVt/j3TUUUcl+jkIJSPbBwAyA0EkAAAAACDbiqx8DGicQ3Uz/uWXX7xb9CeffOIzbvfp0yehejFScpWJkd2pFdL9+eef9s4779jpp5/uXaX13169emXacmdXaWkfAMgMjBEJAAAAADigaLZoTWKjmaU1kUtg/vz5+/S8hQoV8m7KumkWbI3Z2Lt3bx8bUhWXqrycOXOmZTZNKqPQL7IqUuNGBrNqS3TFZEo0/qUm2lEgG1kVGXRZ198BYH+gIhIAAAAAcEAJKvgiK/YUHGqimoxSsBlJ3cFr1qzpr7F7924P9M4//3yvvpw2bVqmVg9qMp0PP/ww4Wd1AdfkOnXr1vWxHoOQNKgGTc25557rY0u+++67Cb/7559/bMCAAd79XNWkALA/UBEJAAAAADignHbaaT6OYdeuXX0CFlULDhs2bJ/CwHPOOcdDv4YNG1rZsmV91ukXXnjBWrVqlTDD9OOPP24TJkzwIO+aa67x7tzLly/3mba//fZbn/AmIzQepGbK1kQ5eu3XX3/dVq5caW+88UbCfRRKKoBV93ONgalJfM466ywrU6ZMkufTsr388svWrVs3mz59uldVjhw50se8fPbZZxPNmA0AWYkgEgAAAABwQClZsqSNGTPGbr/9drv//vs9lNRENWeffbY1b948Q8957bXX2vDhw61fv34+s7QmuFHIqeePnDhnypQp9sADD/h9Vbmo37Vs2dIKFiyY4fXRxDKqVuzZs6ePU6mJblTNGLkuCkkHDRpkTzzxhIeWe/bssYkTJ8YMIjU+pbqv33333TZ06FBfzurVq3uwqXASAPaXHHGMPgsAAAAAwH6hakXNxK1gFQAOdowRCQAAAAAAACB0BJEAAAAAAAAAQkcQCQAAAAAAACB0jBEJAAAAAAAAIHRURAIAAAAAAAAIHUEkAAAAAAAAgNARRAIAAAAAAAAIHUEkAAAAAAAAgNARRAIAAAAAAAAIHUEkAAAAAAAAgNARRAIAAAAAAAAIHUEkAAAAAAAAgNARRAIAAAAAAAAIHUEkAAAAAAAAgNARRAIAAAAAAAAIHUEkAAAAAAAAgNARRAIAAAAAAAAIHUEkAAAAAAAAgNARRAIAAAAAAAAIHUEkAAAAAAAAgNARRAIAAAAAAAAIHUEkAAAAAAAAgNARRAIAAAAAAAAIHUEkAAAAAAAAgNARRAIAAAAAAAAIHUEkAAAAAAAAgNARRAIAAAAAAAAIHUEkAAAAAAAAgNARRAIAAAAAAAAIHUEkAAAAAAAAgNARRAIAAAAAAAAIHUEkAAAAAAAAgNARRAIAAAAAAAAIHUEkAAAAAAAAgNARRAIAAAAAAAAIHUEkAAAAAAAAgNARRAIAAAAAAAAIHUEkAAAAAAAAgNARRAIAAAAAAAAIHUEkAAAAAAAAgNARRAIAAAAAAAAIHUEkAAAAAAAAgNARRAIAAAAAAAAIHUEkAAAAAAAAgNARRAIAAAAAAAAIHUEkAAAAAAAAgNARRAIAAAAAAAAIHUEkAAAAAAAAgN
ARRAIAAAAAAAAIHUEkAAAAAAAAgNARRAIAAAAAAAAIHUEkAAAAAAAAgNARRAIAAAAAAAAIHUEkAAAAAAAAgNARRAIAAAAAAAAIHUEkAAAAAAAAgNARRAIAAAAAAAAIHUEkAAAAAAAAgNARRAIAAAAAAAAIHUEkAAAAAAAAgNARRAIAAAAAAAAIHUEkAAAAAAAAgNARRAIAAAAAAAAIHUEkAAAAAAAAgNARRAIAAAAAAAAIHUEkAAAAAAAAgNARRAIAAAAAAAAIHUEkAAAAAAAAgNARRAIAAAAAAAAIHUEkAAAAAAAAgNARRAIAAAAAAAAIHUEkAAAAAAAAgNARRAIAAAAAAAAIHUEkAAAAAAAAgNARRAIAAAAAAAAIHUEkAAAAAAAAgNARRAIAAAAAAAAIHUEkAAAAAAAAgNARRAIAAAAAAAAIHUEkAAAAAAAAgNARRAIAAAAAAAAIHUEkAAAAAAAAgNARRAIAAAAAAAAIHUEkAAAAAAAAgNARRAIAAAAAAAAIHUEkAAAAAAAAgNARRAIAAAAAAAAIHUEkAAAAAAAAgNARRAIAAAAAAAAIHUEkAAAAAAAAgNARRAIAAAAAAAAIHUEkAAAAAAAAgNARRAIAAAAAAAAIHUEkAAAAAAAAgNARRAIAAAAAAAAIHUEkAAAAAAAAgNARRAIAAAAAAAAIHUEkAAAAAAAAgNARRAIAAAAAAAAIHUEkAAAAAAAAgNARRAIAAAAAAAAIHUEkAAAAAAAAgNARRAIAAAAAAAAIHUEkAAAAAAAAgNARRAIAAAAAAAAIHUEkAAAAAAAAgNARRAIAAAAAAAAIHUEkAAAAAAAAgNARRAIAAAAAAAAIHUEkAAAAAAAAgNARRAII1ZAhQyxHjhw2bdq0bNHS27Zts4ceesi++uqrNN1f99Pyjxw5MvRlAwAAAADgYEYQCeCQoiDy4YcfTnMQCQAAAAAAMgdBJAAAAAAAAIDQEUQCyHLdunWzwoUL29KlS+3888/3f5cuXdruuOMO27NnT8L9FixY4N2i+/bta/3797eKFStagQIFrFGjRjZz5sxEz9m4cWO/xXqtSpUqJTyfXkdUFann1k1dtdND99fj5syZY5deeqkVK1bMn/eBBx6wuLg4W7x4sZ133nlWtGhRK1eunD3zzDOJHr9r1y578MEH7aSTTvLHFipUyM444wybOHFiktdau3atdenSxZ+rePHi1rVrV/vll1/89dXtPdLs2bOtQ4cOVqJECcufP7+dfPLJ9vHHH6dr3QAAAAAACAtBJID9QoFj8+bNrWTJkh40KlxUYPfKK68kue+bb75pzz//vHXv3t3uueceDyHPOussW7lyZbpeU2HhwIED/d/t2rWzYcOG+a19+/YZWofOnTvb3r177cknn7RTTz3VHnvsMXv22WetWbNmVqFCBevTp49VrVrVA9avv/464XGbNm2ywYMHe3Cq+yjYXL16tbfHjBkzEu6n527Tpo2NGDHCA8jevXvb8uXL/d/Rfv/9d6tfv77NmjXL7r77bm9LBZwKej/88MMMrR8AAAAAAJkpd6Y+GwCk0Y4dOzzIUxWhXHfddXbiiSfaa6+9Ztdff32i+86bN8/mzp3r4Z60aNHCgz+FeP369UtzmyuYU8Wgnv/444/3asZ9Ua9ePXv55Zf939dcc41XXt5+++32xBNP2F133eW/v+iii6x8+fL2+uuv25lnnum/O+yww7w6M2/evAnPdfXVV1uNGjVswIAB3gYyevRo+/777z3cvPnmm/13WnYFndH096OOOsp+/PFHy5cvn//uhhtusNNPP92XRcErAAAAAAD7ExWRAPYbhY+R1D3577//TnI/VfUFIWQQACqIHDt2rO1PV111VcK/c+XK5V2h1TX7yiuvTPi9ulNXr1490XrpvkEIqarHdevW2T///OOP/+mnnxLuN27cOMuTJ4+HlIGcOXN6ZWgkPf7LL7+0Tp062ebNm23NmjV+U7duVVkqxFU3eA
AAAAAA9ieCSAD7hcYwDMZrDKhScP369UnuW61atSS/O+aYY7yqcH9SBWIkjfeo9SpVqlSS30ev19ChQ70qU/dX93S1xaeffmobN25MuM/ChQvt8MMPt4IFCyZ6rLp7R1eMKgBVdameJ/LWq1cvv8+qVasybb0BAAAAAMgIumYD2C9UFZiZNHmLwrhokZPfZMU6JLdekcv21ltv+SQ6qvTs2bOnlSlTxh+nLt1//fVXupdDVZWisShVARlLdHgJAAAAAEBWI4gEkO2pa3E0zVgdzIYdVFPG6tatqsLowHJ/GzlypB199NE2atSoRMsTVC8GNEu4ZtLetm1boqpIVUBG0nOJunE3bdo09OUHAAAAACAj6JoNINvTpC2RYxxOnTrVpkyZYi1btkz4XZUqVWz27Nk++3Tgl19+scmTJyd6riDQ27Bhg+0vQdVkZJWk1kcT00RSdePu3bvt1VdfTVT9+OKLLya6nyoqNQO3Js7RrNrRItsEAAAAAID9hYpIANmeuhVr9mfNGL1z506fRVrjKt55550J97niiit8Bm2Fd5osRmMiDho0yGrVqmWbNm1KuF+BAgWsZs2a9u677/o4kyVKlLDjjjvOb1mldevWXg2pmaxbtWpl8+fP92XVcm3ZsiXhfuq6rYl5NBO3qiA1q/bHH3/sk9NIZDWlwkm1Ue3atX1yG1VJrly50sPNJUuWeCgLAAAAAMD+REUkgGzvsssus5tuusleeOEF6927t4eLmiVaE7kEjj32WHvzzTd9spfbbrvNA7thw4bZiSeemOT5Bg8e7LNw33rrrXbRRRd5V+mspPEhH3/8cQ8He/ToYePHj/dxIzVrdnTlpCaw6dy5s09uc99991n58uUTKiI10U1AIea0adM82BwyZIjPrK1wU7NsP/jgg1m6fgAAAAAAxJIjLtbsDgCQDWhW7MqVK9vTTz/tE7Hgv67qqqb89ttvrWHDhjQLAAAAAOCAQEUkAGRj27dvTzIL+IABA6xo0aIxqz0BAAAAAMiuGCMSALIxdUlXGNmgQQMfH1NjS3733XfetVvjXQIAAAAAcKAgiASAbOyss86yZ555xsaMGWM7duzwiXtUEXnjjTfu70UDAAAAACBdGCMSAAAAAAAAQOgYIxIAAAAAAABA6AgiAQAAAAAAAISOIBIAAAAAAABA9pmsZs/ubeEuCQAAALKVXHkK7u9FAAAAwEGEikgAAAAAAAAAoSOIBAAAAAAAABA6gkgAAAAAAAAAoSOIBAAAAAAAABA6gkgAAAAAAAAAoSOIBAAAAAAAABA6gkgAAAAAAAAAoSOIBAAAAAAAABA6gkgAAAAAAAAAoSOIBAAAAAAAABA6gkgAAAAAAAAAoSOIBAAAAAAAABA6gkgAAAAAAAAAoSOIBAAAAAAAABA6gkgAAAAAAAAAoSOIBAAAAAAAABA6gkgAAAAAAAAAocsd/ksAAAAA4ZozZ45NmDDBvv56kq1du8Z2795NkwMAgCyTI0cOK1SosFWvXt2aNm1mTZo0sYIFC7IFouSIi4uLszTYs3tbWu4GAACAg0SuPAfGl+dRo0bZ448/ZsWKFbZGjU61I48sb3nz5t3fiwUAAA4he/futS1bttq0ab/aL7/MsqpVj7GBAwfZYYcdtr8XLVshiES2FWTk+v8cEVcYcGCJvtZxoG/Dg3W/zKztdLBtb4Qv1vVQ9pusaevIdtbffvrlL5vx6192fusGVrJE0QMmiPzqq6/sjjtus86dW9ltt11ruXLl2t+LBAAADnHz5s23G26410qVOtzeemu45czJyIgBWgLZ1vYdu+z5QR/bOW3vs6eeHWlbt+7Y34uEdNqzZ69NmvybnX/Ro3bJlU/bkqVrDvg2PFj3y3l/L7eOlz1uL7/xme3ZuzdDz3Ewbm+ES+HX/IUr7da7X7GW7R+0sROmxQwmse9SOz4XLl5lDz0+3Pq/MNpmz1lyQDX5yJHvW506NeyOO64nhAQAANlC1aqVrXfvu2
zOnD/tt99+29+Lc+gGkTq5iLwBKfn1t/n27Iujbeashfb15Jm2afPBOTxA9HGR3mMjOx9Xq1ZvsH4DRtuUaX/a5B/+sEVLVme7tk7udijtl1rft9750ib/MMteevVT27lj90GxvZG9jqlYx5fCsXdGTrJ3R31tM377236aMc/27s1e72MHi5SOT22TX2fO91uu3DktT54Dp6Jw06ZNNnXqFGvZsgmVtAAAIFs56aTjrVSp4vbFF1/s70U5dCer2bFjlz370kc2c9YCe/bJa610qWJ2KNAXfFUeaP2Prny45cqVev67Y+cuW7R4tRUpUsDKlTnskPxyXaBA/NhOWvc6x1W2MqWL73M7Zcd21Un38Hcn2lvvfmktmp1sN1zVyvLnT/u4VgrC+r842v6cs8Ruu6mdnXxCtWyxXpInT27Lkzt+f69QvqQdd2zF/bIc/+zZYxs2brU1azbZoiWrPEScP3+FLV2+1tZv2OLHpk68a9ao6F0SGzU8zgoUyJfh/fJAo31IFVB6r9q+fecBv733hw0bt9jyFevsiAqlrXCh/NnmGMyKasbfZs63EiWKWp68uW3D+i22bOVaW7Fygy1fvtZWrdloa9ZutK1bd1refHns9Po17b6ena1Qwfz+HGqnAnq/izMrXLiA1Tv5mDR9RiJzj08FwtN+mms7d+22smWKW9nSB844RsuXL/fxmGrWPGZ/LwoAAEAi6o5ds2Y1W7x4ES2zv4LIP2YvtpdfH+vBy8+//GXnnH2iHQo2b9lu1/QYYKvXbrQRr91p1apWSPXk7ocf/7Rb7nrZGpxSw558+HIrVqyQHWrUTrVrVbJ8efPY5V2aWe7cufapnbJru+7e/Y+99+E39uvvC/yk/YzTaqUpTNT6bNu+0wYO/tReHTLOTyQVsJ5Yp6rlypU9QpCSJYpY/Xo1bM3aTXbzDedZ4SIFsuy11T5qE4WO4/833X7+9S/7fdYiX5bkKh7nzFtmn0/82W6/sZ1dcdk5lj9f3nTvlwcihbQKZKXEYUUsZ84cB9z23p8UdA8cPNaGDv+f3XN7J+ty0VmHRBC5cdM2e7TPCPvs82lpfsymjVvt5uvbJgSR2tfqnVzdjq5czlqec4qdenL1Q6Lt9oeUjk/tw7/MnO//PrJCaStb9sC5wLJjh4bHiLP8+WNfPMKh46GH+trDD/ez+fOnWKVKR+7vxTmoHGhtmyNHeevatZMNGfKsZUfdut1iQ4e+Z3Fxy7LkcdndV199Z02adLA33uhv3bp13t+Lc1ChbdNnwYLFVrnyqdar12320EN3JPl7ixYX2/jxXyX8nNz9ouk7yrp129O5NAe3LA0i/5y3xP7Zs9dy5cxpxYvv/wAoq0z/eZ79Pnuh5c2T23KnobuTwpMvJv5sK1aut81bt1vOQ7Q6RFUyr790q1fHFI8RGKa3nbJruyoUUxAkK1ett4lf/2on1qmS6jhX//yzx4aN+NJeHTre103n7ocVL5zhECkMChSuv7KVdb24qR12WBE/9rPK9u27rHffd+zjsVNs3frN3kaqNK1SuZzVPf5oq12rsh17zJHeZnnz5rY/5y6x5wd+7MGlxkk8oU5Vq39K0lAktf3yQKT9b+Om+H2wUsWyGR5IeX9u7/1p+fJ1Nunb32zj5m0ZHl/zQKRjQBV0gTKli9lRR5SxOrUrW/VqR1iVSuWsVKlivh/oQtziJWus6tGHJ+oNoX3mlBOr2ajh91vRIgWTrUTGvkvp+NyxfZfNmbfUt+mJdav6hZYDTXIBdnASlpyD4YT+2WdfteLFi6bpBF6BUt26tez881vaoWrIkHdtw4ZNdsstV2dq22aWyH123Li3rXnzxgl/u/vu3tanz4v+7+XLZ1i5cmXsYBV97Oq7ida3Tp2adscd19lZZ52+X5cP2dvo0Z/ZjBm/pykkOlAoYE9O585t7Z13BmXp8hyK7rzzBrv00gtszZp1duutvdL8OC6y78cgUmHLgkUrvfuMTvyPSaUq8G
DhVXjTZntopG5Qh5crkepjdv+zx36Y9qd/4J5wfBUrUvjQqCqKdcCWKhk/a2dmtFN2bVdVCK9bv8n/rUK9sRN+tO7XtLZCBXOluF9pfMLnBn7kk6UofFR7qVtodnujU3dL3bLa2vWb7P3R39qmTfFjOGq7X9yxkd14bVs7vGzSbvl6T9LvevQcZCtXbbD5C1d4EJne/fJAo31p06b/gkiFs/vSNXZ/be/92X4aa0837Rc1axx1yMyIpy7ofiwphCxT3F585gav5s6XL0+S46tqlfLWoF7s58mbN4+VLXPgdAU+kCV3fKqLvSrsVeFd/5Qa2e5zZF8ce2w1GzZsgP97zJjP7d13P7Z77+3hvz9YKCxTlVpawjJVtalS7GAMIu+//xa7++4bLV++lC9oDBnynle9pDWITGvbZraCBQvYO++MThREav/V77dt254t2zYMClhat27m55ArVqyyN95415o27WwfffSGtWlzTszHbN/+90E5cdWrrz5tgwY9aQebM8+s79ssT57Muwg2evR4rx49mIJIOf74mtaz5/VJfl+p0hFZ1rYHs4oVj/D2yp07dkwWXADRZ0h6gkjsxyBSQdzf81d46KJqr4LZoOph1+5/bPXqDbZpy3avDNAJlb6gZ+YXcHXL1uDvWu8mZx5vedLQjXPpsjX21/zlXqXV+PTalhUn0gqzVq/d5OPDqeLriCNKJ7us3iV4205bsWq9h3tqN1WyZPWJS3rbKavbNa1UKbRt+y7/twIMnRCqivbMhscl2/6//DbfHu/7rm3Zut3at21oP/8yz5YsW2MVjyydxUuffSn0H//ho/bamxPs9WETfH9ue259K5/CxQDtzzpWCxbMZ6VKHPhho760q8v16jWbfAxMdbnU/h9N99F7lQJIdTtnjL600/H4x+xFtn7DZjv15Bp2dKVydqhQlbECbA10cMZpx3m3ao1DmJ3s3LnbJ2nZsnWHHwN6X9D3j+wctO3Zs8fWrd/ildx74+KsdMli3q06zGVWRbi2Z7myh1n1g+xCcdmypb16QZYsWeZBTrNmZ1rjxqft70VDJtOJY3InjweiVq2aepjy8su7LG/evPbDD9Nt2bKVPjHTRx+NP2TaVhW8wTEsHTu2saOPru+BZHJBZP788cN/HGwUJh2MgZLOfw7WbZbZDj+8TKLjITW0bfrouxb7YtbIsrKNtes2e7dTOeVEDUafcrWXTozHTphmz7402ifiUJWYn/Ds46zAwYydc/9aZv0GjLLLrn3Gzr3gQWvb+WF7afCn/kU8uceoQkoTivR9/gMb9NpY++HH2X6in9wy6fcLF62y+QtWevXdSSdU8zeDlGYQ1b+/+2GW7dr1jw8mX+OYI1OddXRflk9h7Dff/2539XrDOnV9wlq0f8A6d3vSvp8yK9nHqPLnwd5v2QWX9rbWHXrZA48Ns63bNEZT0rEPVZH27gdfexAd+Ry6v8bi6/fChx4Sad8IljMtsxent50y0q7B47Ts30+d5d111bZvjvjCFi5elWmzVHtAv2evd/NVSK9l1Jhrye2LGpetT//37Y8/F1nD+jXt1u7n2fYdu3zfOvKI0mnsCr7FX0PH1+Ch4/2EN9a66HfTfprjY1AuW742UXsqsNMstwNf+9QGvPyx/fRL4tlu07odw2rf3LlyWeWK5axWjYreNhoWQZNFJd8mW33maHU1rXfSMd59O/LEP72zk6e3nTN71nMd+xO/+dVu6jnIWnXsZe0vecxGjv420bEYvPayFeviQ4gyh1nlimUzFHhk5eztYbxO8Hyq3Nf76DMDRtm4/033ICul11F4/e33v+uri4e4pUoWyxbvK/u6Xmmhz+S16+PHFj2prsamTd9XitTaSJNJ6biZOn1Oks8HfWYMeyd+fXSMbdu2I8nnoj4Tn3pupF185VPWsv0D1u7ix3z4BQ0Rk9KyZERynycp3ZK7yDf07S/s2psHWOtOD/n3kzsfeM0W/btvZFRqbT3v72X+nl
G39tFeoXooW7ZshY/DVrbs8ZY3b0WrVq2hPfpof/vnn3/26Xm//voHa9q0k5UurcnQjrYqVRrYVVfdbps2bU5y388++9IaNmxrhQpVscKFq1qjRu3tyy+/TdLFWt30dFu4cIlNmvR9ws+6qftxQOsT/F5UJRR5X3WBjbRr1y574IGnfBnz5atkhx9e15d11ao1SZZ1zZq11r37PVapUj3Ln7+yt1vz5hf58sTy++9/2kUXXe/PqftXrXqa3XbbQ7Z27bpE91O1iZZN66l1P+20Nlaw4NFWqlQt69r15oT7bdmyNdG66KbHRlN7BH/XsqnNIh+j18lI2wbHUP/+r1jNmo28vbSMnTtfa3//vdD2hQJzfX/R/iDvvPORNW/eyIoXL5Zkn7399oetTp2mVqxYdW+nE05oZgMHDo35vJndtunZbzNDkSKxh8W5446HEy2r9vtYNm/eYvff38fq1TvXSpas5fvhsceeab17P2e7d/833EjkfjN27Bd24433+vFbokRN34c3bozvzRTtu+9+tLZtu3p76ljXftGr19P/jmebmNqzVasuVqRINTvqqJN9m+i9ONLMmbOTbIfsSG2n9mzWLGn1sI4RrV+DBm0S/f7kk1ukeGxFH5PTp/9ql1zS3YoXr2HlytXx956dO/+bZFHvZcFz6X1OUtsnwthvX3ppiLdD+fIn+HvCkUeeZNddd5etXLnaskpa2zY97wkyf/4i69LlJn+v17pp/471XtO48QX+uTBjxkw788x23r56v1fbRIv8jIq+Rb43B0aNGuvLqOfUsaNxG/U6+6JDh6tTfV1kriy7tLV23SYPIzVAfY3qRyY7jp1CGJ2cPf7Me/b3/OUJAZcepxOdPo9eYUcdkbHupx6Abd1hoz/9wZ4f9JEtW77OA8Ly5Up68Dl9xjw/GY+efEKVQpqE4K33JtrKlett565/EsaH63LhWXbD1a29i5ps2bLdlixb61UMcXv32sdjf7DFS+PfdO64d7Dli5gNWU2gUPbKy5r78+nUIH75vveTVM20fXbbexOta948uazLhWfb5Zc2S5iQJK3LFx2qLF+53vq/8KF9Ov5HnzW3ZImidkT5Uj6QvSp8oivy9Jjffl/gwaPG0dNswXrt0WO+9wlSul58dqLX0MnhfQ8P9SqU5k1P8uXRc2i9Hn7ybfv6u5neXmpvja/25COX2+FlS/hJok4uf5w+xx648yIPhPQ4je2o7ZTWdjq/VQNrdtYJliNnzgy1a/zJ2XJ7ftDH9sWkGbZ58zb755+9XlFW6aiyductF1jLZifvc5WK2lrjyp1Qp4q1an6K/fr7fJv281xbvGS1j9cXacfO3fbiK594eKyx2HrdfYlXIG3cuNWDN22/lCh8mDzlD3v6uQ/8xFPtovbX9rzm8hZ2TJUKifZ/rb+2tyZ50UyqbVqe6u2i7fbmiC/tlSGf+XGt3ynEeql/d6+KEoVguojQvs1pdtlFZyeZQCer2leVanotVUIV+fc4jV4OrYPC3SnT/rTyh5e0njd3SDSOncTaLzOjnT3s/Xmut8PtN7X3IGBf6QLDJ59NsYceH+7vaaruXLZirb3wyid2XM2KHpgF7aq/z1+wwv9d8agydtSRZZKMtbli5TrfR1VRqW63saS2vTNLGO0lCqIVbL39/le2YsU6fx9VFdptN7azFk1PsnJlS/jnltrr7wUrvI1lyZLV9t2UWb5cCnr/99WMhOdUCxQrWsgnxtK4iVm536d3vTJCVXvr12/2z9GMBNi6qPXoUyOs5GFFfFtGzkCv92pdLHz/w2/9feekuvHj5upix8Svf7Gnnx9lf85Z7Ouj17/+ynPtpuva+nGlbaTJqe584HU//nQsH354SVuwcKW9Mfxzu7hTY6sYsZ8rTB7+7kTvvaDJhqKP/bTQd5db7nrFtm7bbue1buDbUh9WGo9YY4iqIv/PeUtt4eKV/ln74N0X+zAI0RcE9D405rOpfpyVLV3cZ2If/7+frP
EZx9ulnZtkeL9I6fhUeykED3qs5Pp3Zu1DkULB008/3xYvXmbdu3ezY4452saNm2gPPvi0/fXXwgxPfPHnn/P8RKly5aPsnntusiJFCntI9f77Y2zdug1WtGiRhPt+/PF4a9fuSqtSpZIPgq9Q4uWX37JzzrnIxo0bbk2bnun3a9/+XKtaNf59Rd3DSpUqYffd998J42mnnZzw72uvvdSaNj3D/60TyDPOONWuuebShL9Hd1O/+OLu9sEHn/pr3H77tfbHH3Ns0KBh9t1302zatM+sYMGCCfft1Ok6+/776dajxxXeXqtXr7Wvv57iwWujRg0SPe8330zxdlBF19VXX2zVq1fxCr833xxpbdueE7NK9aeffrO+fQdZ164d7bLLOnrX3G+/nZpoAoCg671OTj/88LNkuycG91PgpPG9+vd/OOHvxx9/bMK/09O20rPnI/bMMy97l70bbuhqS5Yst+eff80mTfrBZsz4PMPjOOr7Xfv2Lb2KV5V/2l+eeup++/zzrxPd79dfZ9nrr79jF154nr++QvMJEybZDTfc4+Fxr163x3z+zGrb9Oy3GaFu6Aq89R6l0LV37+f9vfDii9slup9+rlv3uIT9PDlLly734LhDh9a+7qr0nDz5Rw/f582bb2+8kfQ4V1iu4+SRR3ra1Kk/e/d+GTFiYKL7vfvuR3bJJTdamTKl7LrrLrPKlY/0945XXhlul19+YZKJflq3vsyPzdatm3rQrKETDj+8rF17bZeE+xx5ZPmE7fDKK2/5cZQd6bi+4IJzfV/Ufqc2iAxn9b4a3aX40UfvtLVr19usWXPt8cefT/U1Lr30Rqtf/0R78sl7/b35pZeG+vvnE0/cm2QojqCtgp+lSpWKWbLfPvbYcz6Wqd4/Ncbsn3/+ZQMHvmkTJ062mTMn7lNVqwp9dDxEK1HisERDA6W3bdPynqDPstNOO8+/N15//WVWocLh/l6v9xq9loZwiLR581Y7//wr7JJL2vsQCzoOune/19s7csiJyM+owKef/s+Picj9SPr2HWg9ez5qp59ez5544h7/vvjaayPsjDPa2Y8/jrUaNTI27MrNN19l55/fIt1jP+IACCI1TpvCrtKlivqX61hfpnXwf/b5j9br8eG2du0mHzBd4aPGLVJYNmnyb9b76Xf8pE4zu6bX7t17rN+Lo23I8M+9K/Z559a3dm1P8/ECZ/6xwL/4KzRLtNybt9lzL31kL7/+meUvkNdPENVtec7cJT5G36DXP/OTCgVYOij1Zb/77QP9TSKaTjIi5c6V06pXO9Iee/odD+JihRk6cYqkE67I507P8kUGbKqOue+RN31ilCMrlLLLL2lqZzc5wY6oUMomf/+7z2IayYPAVevt/seG+Yznt914vl3UsbG98PInXrHy5aQZ1qndGd6lNbj/jz/N8aBSJ3z58ub23+lk5+5eQzxsVjVfmVLF7LPPp9vkH/6w2XOWeBCp7uFvvv2FV9t8+8PvHvjoREnbfuRHk9PUTupir0qjPs+OzFC7Bm102z2veOhRqWI5a928nge6Ch1+mfm3B3SquNOYcBmlN8/fFUTu2WvH16pkJ59wjB1d6XCb99cym/nHQg+GgmNFx4dmfx7+3kRfjp63XGDVq1Xwbtz6claqZJGEQDzma+3Z41VXGldS4VKj04+zCoeXsk/HT7V3Rk7yCtWP33nQqkRUDaq7t0KTPHlzJ4wrppP25wZ+7JWsWr42LU61L7/+xbuUK/xSEKnQ6oOPJntopy7RF3dsnKhaKqvaNwjc1T6VjyqbpGIrCLgfe+od+/izKR6mPnzfpXZCVDWkxNovM6OdtQ8Mf+8r++KrGVatSvlEwZqWTxNI6P1BQWjTxnVTHV9X6/rVN7/6bMa6gHP/nRfasdWPtO63veTtqirW42oqiPzv/gsWrfL11XMH761B2+j4/uDjyR60PHTPJdax3elJ2ia17Z2ZMru9RO9tTz070qu3NeGK3tt0EUzrdN8jQ/
39qf+T1/gQFJu3bLN2Fz/qF2yi6UJJMPu4xI/bWsri4vZm6X6fkfXKiPj13epjY6oSND0hmdpCwd+4z6d78Nex3RlJgsjvp8729xtNbhf0JtC2vf2+wf4+3uH80/2Covb39z78xq678lzf7/R53vP+1+yvv5db65b17Obrz/OJqhq3vMv34+iq4IWLVtrQt//nz3V+mwYJQaRXru7Z65+lf8xeaKVLFbdzzjoh+e8fOeLsx5/nehgdWR0e7YjyJX1ymEjaLjpm9R6qiwW3dm9np558jN14xyDf17W+GZXa8akqe12M0d8UoObMxt3Ww6aqDlV7DBz4pIcIcsMN3axNm8u8ukYnlbVr/xdYpZVm2Ny+fYcNG/a8nXji8Qm/f/zxe5JUP911V28fA3Dy5I+sdOmS/rsuXTpYlSqn+d+mTz8zYZww3UTVXZFd0KM1aKBKpJMTApqjj66Y7H2nTPnJQ8joiQ8Uot5xxyN+wnfTTVf671QRphPrm266wvr0uT/hvnfffVOSClKt55VXxgdi06eP8xPR/9a5u1ffxfLJJ5/bpEmjPEiMHL4goBApWBeFSMmFZVpn3WTw4Ld9eyTXBulpWwVjzz472MPJCRNGJPT4OuWUul5h8/TTA+2ZZzJ+UqtwUSfy2ocUWiuwjQ4i69Wra0uWTLdChf57L+/e/XKv7NKyPfDArTHHLs6stk3PfpsRjz76rN8Cev7333/FLrigVaL76dgKjq+Ugsgjj6zg7XXYYf995ij402fYsGEf2FNPPZCwDgEF0x9++Lr/+/rru9r8+Yu927z286DLuiotr7vubg+eZ8yYYKVKlUy0j8fq2t6tWye74474cE5tXaHCiTZy5JhEQWSxYkUTtsP//vdNtg0igzD41VeH+3uI2imgMF3HRqdObRPdv2XLsxIqGdMSlp199un2wguP+791MUUV6yNHfpoQREYeq0FbpdSNOaz9dsqUMb6fRVJIesUVt/lxp4sdGaWKxdKlkw4xtnjxNDviiPIZbtu0vCfcdNP9XvmqCyzB+6n21cKFC/lr3Hjj5YkqttetW28vv9zHQ38599yzvdJe+3hkEBn5GSW6+KVwU1XhkceCKtTvuecJH7ZizJg3E35/1VUXe7XlI4/0t7fffskyQhfodGPsx4MsiNSXeQVSqqJSRY0mq4l1HwUvDz3xtm3YsMUrG7pd0tS/8OsAOOO0WnbPQ0Psi0m/+MnJueekr2JEz//JuCk2dPjnftL05EOXW72Tj/ETdT1PoxhjBioc0gnvG2997t15H7j7ImvcsLbP6qmT0RdeHuPduV8dOs7atz3N10snj1o2BV05cubwEy31glLIpK7ZGj9MYzDpREcVggoJp07/03Zs32m7/tnj665AR4Fe/ZNr2PHHVfbXVjVXieKF/TU0KUBQmZOe5QtOnhS49X9htJ9caVlViaIT+mBsr3Ztkl6RVkVXvwEfegh54QVn2jWXt/RxtlSloZNcdQ3etWt3oiBy5aqN/l91c1WVisboeuKZ97yrb/erW9uVl51jBfLns7rHV/GujUdViO9WvMbHTNzpJ0rHHRv/RVX/btW8nlcJpqWdihcvbL/89rcNeevzdLdrUCH32FPv2k+//GUN69fy8Fsnj1oPdVG+7tYXvS2GDP+f9X7wsgyPi6axyxTO6nlr1qjoMzqfUOdo+3PuYg+smjc90Z9byzT372X2ZP/3PXzWzKeqntQXy+Ur19neuL1WvlziK2GRdPL97geTvEJPz/fEw928Gip/vrxe/fLKG5/ZNlXtRYWFuoCgbX9MtQoeWGufe++Dr23oiP/5hYKnHr3C26VVi1Ns4OCxdlq9+BM0dS9UFbRokqboEDKr2ldhgMI0vWbkbNA6IVKlmN5LXnp1jIdzel/o98TVdtqpNWO2Y6z9MjPaec/eOO92qWVduiy++7vekxRoqtpYwxcsXrzafz92/I/Wt/eVPiNxcu9/S5au9kBD77eP3t/FWjQ7yYOFk0
6oar/MnG+bN2+3OK8r/rcicu9eD8e0TKrQDo4BVWLdfu9gPzYVBonCnrMa1UkyWU9K2zuzZXZ7qfv8w08Mt0/H/egTrSi4Pb5WZd/WqgCf9efihKEPRO97F3Vo5O+f6u6vsEv7kiqG9X6oEFHHit7jS5QoYsWKFvRxSbNyv8/IeqWX1kfVxnqd8uWOTDKDvPdA2LbTKz6D8RlVsR1ZCaxjU6Gg2k636AuHq9dstNy54/dLmfHr3/bYUyM8VHy8VzffF7Wf63NF7azn1oWC3n3f9eFX9FmlCsfixQr78C4KFWvXrOQXvBK11Yatvg313qrq8qDrsn6nCwr6jNN7Yd58uf3z+vFeXf2YjqSKVm1PPdfuf/7xz/3ItlA4etu9g/0iqCbMUtj437r+45/jWsaqR5e35566zo6pUt7XQceh9qVax1bMcDVkasdnEKBrJm0d29l5/MywKeBRWKBqkEhXX32JjRnzP/97RoLIoOJRJ8aq2AqOO7V15FBFGsNy9ux5Xj0SGYSo6uTcc8/yIEhdmEuWTH3iw4wKQi6d2EXSzwoi9fcgiFTFnMYunDbtFz/hVEVOIDp0+fnnmTZ37t9eaRoZQoomQEluEhRVGUaeFEt2moTkyy8n+3nK5Zd3TrRc7dq19ErK6NAwvVQlqoBRJ+XaB1RNGy2y3dVNVaGu3nc0tqKqlVSlqoAmrLYNe7+98sqLPJDVdzdVm6riTt1FixYtbM2aNUr386k9g9BWwxCovfTctWvX8G2pauXoIFKvH+mUU+p4l31V/pUvHz82tKpQN2zYaPfd1yNRCBmEibFEPq+2raqEFy5cagcq7U/a7goegyBSbauwsEmT02Luh+kR2V56Hz3xxNoZHi81zP02CCG1PymgVmBdo0ZV/53eB/fFySfX8UrAaNH7bEak9J6gttBnmKoGdexFVmU2bHiKV6BOmfJzooBRnw+RoasuaJUseZgHisnRBS5dfNH9Rox4KdF7kgJutaU+o6OrQk866XibODHxMCPI3rKsInLdhs3erVSVDdEnLJFf+DXu0wVtG1qP69smfNHXycXZjepag3rH2kef/uAnxuecfWKaJn6JrDb4aMwPvgzdr2ljjc+oneoJ2Ny/ltrAf8eNvOvWDnZuMwU/8V/QdWJzVdfmNmLkV97t6qcZ86xpkxP8JOLl5+KvwimEbNP5YT/dV8ChE4lYzj3nFL/pddRlvE//kXbi8VXshWeuT7Hyc+5fS9K9fKKqj8/+N81PqNU9LLKLWiw64VeFx0djf/CTo2uviA8h9QW6RrUjrE7to/1Et0iR/2bi1IlnMAuv/q6TshEjJ3lFX6f2Z1qP69r6yaQocNYtoBBXYyYedljhhO6Meq2W55zst7S2k0JfharpbVfdX4HG/7762aupHnugiweEwcmZKk67XXy2/fb7fA+w1J6qDMyIxUvXeJc9VYYqkFDw0OLsk2zk6MleXauTQ1UJrVy9wXo/9Y5X7pzeoJZ3t1eIKzqZ1zYqG2Mm6GD7qZq474BRvn/cc1tHP8aCk1Gd5OpxxYoVslIRXRKDoEDDDCiwVRdOjQM54OVPfD9/4qGuCd0xdWzqFlB3zfXrt1i+vHnslJM0JmzORMuTVe2rCyAKSsQrfXLm8J/1PjLhi5/sm+9+T+hiq/cbBXVapnIx2jLWfpkZ7ayDQwFm0OY6VvTSU6fN8a7VClguvfAsPwanz5jrQx5Uq1IhZtdnhUwvvvqpT8rUuf2ZHoKqy740P/sk++qb36x2rYqJKp4UvKxercAnV0KVp0KLR/qMsG9/+MManHqsFSqU3z7/8mdbtGSVLV2+JkkQmdL2znSZ2F47duzyavKPP51iVascbn0eucJqHPPfrIN6n9CjVClb6N+LLBo/776eF9p9PePb6eqbnvdqRg2xoQssya17Vr6vZGS90kvtrtnlVamnLkZ6v1+1ZoNXiipA1EWWeX8tt1lzFnvXf4Vcg1/o4cN4eHuoEvffynS1R3Q4qO
dSQKcvoKp61bGsY2vR0jX27JPXePW/jjNNrqbP16CNdXFOlZ66KKghFrSuqgTs+/woX9eruzVP+OwJ6HHxV/z/Gz9Rv9PYuKqU1CzSmtX7g9Hf2oeffO9dm6OHBPBju2ghvyVupzjvyq9xH9VoXS9p6l3Dg5BZf1cwPeSt/3ll/t23dfT20PGkoSBU4Xtp57P8PSejAWFqx+fGjVv8faDEYYV9Ox3K1HVQJ8oFCiSeWVwnT7Jo0dIMnzwPHfq+V9g89dRLduqpJ/qJ22WXdUhUwaLXl+jum5HLoPuEGUQmtwwKUhR4RbaBwsO+fR+w22572MqWreOhgKrzdOLZpEnDRI9XRZ0cd1yNdC1PrVrxFyKyq+TaS+cYRx1Vwbvl7gu9B3bo0MpefHGI9elzX8z76MT8ySdf8MlbYo1LuWPHf2PohdG2Ye+3VatWStRFVl1G69ZtZldf3dPmzfsuQ5PoaJw6detVt9XoquRY7aUJQiL9F2Tu3qd9XN2wo59XwfGBSvu9qqk127yqhRXSKrBdvnylPfbYnfv8/LHaK3pcz+yw36pq8aGHnvFgTmF3Wo7HtFJAty9DHaQkpfeEefMW+HcWBbTJVUdHjyNcpkzJJHmLtlnkcRNJz6+hDRQSf/fdx0nafu7c+GOsU6drYz4+oxfXcRAHkTpB17h8osClYFT30b3/drfSSbKqSa7q2sK/MEfSiYPGU9LYSRpXKv5DI+1BpL6E68TIT1rTcP8dO3fZK0PG+QmVqgZbNI0/6YmkQFVhlypYtEzR/pwXPwulJhDx8aJSofv+OXepn2xUP+YIDxPDWL75i1b6yaOClxwWX/2U0gmOTjAVPqmq4pJOTXwbBfcvf3gJe6HvdVawYP5EVyxUdaVKGHXJrlC+hAcCOqk79ZTqHjInN86czNbsnXv3erf8WCfJ6WmnjNxflW+qXPNuRF3O8RAjsn0U4lQ8qqyVKF7EQ60Nm7Za7Ig59eNCgZG2ocaH1Bhmoq7NCghU6fj5xBnW9txTfYKFid/85gHgXbd0TBTmq/ultqGCrlhjva1Ytc5n2FZAcP1VreyC809POBHVOioI1bKoq3AQWgXLp7EBRcGc7qtKP83U3f/Jq61K5fLJ7jdaJ80GrqpT7YOR98uq9g32Xd100q+qNL2OuhoPfG2sV2KpYk2huj4QFfb2fe4D+3LSL75+OmYjlyu1/TKj7Rz37/Gs19LsuHpJdYnv98Io/1lVlRo+QkG1xs6d9O1Mu+GqpIFXMCGTunZqHF2NDxs53meDejXs3SF3JwSiAXW9VzWhqvgUBinke3XIeBv/xXTrcF5Du/OWDl7JN3defHXW2rXxY25GPkdK2zuzZVZ7aVtM+PJnrz7UfnDP7Z0TdeNWQK0qxZy5clrVow+PuU6+3ms2etfqWjWOSnGsxax8X9nX9Urb6+z1CnXRuLaa+E3B4c6d//iETzqmtCxqd1XpFy1SwI+1hMfv2evHlPbRU0+qnmRs5uAiyGHFCnn4r2ERfpg62665ooV3j45u6yDwUxdkfX/QBTNdHFOA/mDvYbZuwxbrcd151uLfADOSKn41TmfBAvkThqDQZG9qw2ZNTvTqVFU8Ll26xr6bOsu++W5mmscm1fvBg48N8273CiB7XNcm0UVUVX7qIp3e79U9/czTjrNZsxf5BEb6TtTkzON9KBT13sio1I5PVYLqpmEEMtpNHynTDJxffvm+z3qsCjp1Z9Z4dAol1YWvevX4gP5ApOpIVf999tlEH0vs7bdH2wsvvOFjuN111437/PwlSiSulj4U9ex5g9Wvf5K1adMs5t81UY3GpFRXZY1hWLq0xv7N6cHk229/mOxEVwdq2yp4bNGisY/L+ddfC9J9/Gh8SI35qOovDbegcEufQWPHful/i9VeYVXhZqfq3szsnt2v38s+pqnG3VN1pC5a7Et35AOpvX78cYaPMamuyxrTVeNSqnp86dL4idDCntBxX6TlPUEVpBpSIJZatarv0/
bSpE4aG/Ktt15IGO81lqFDn7Py5VPPVpC9ZUkQqZN3dYuNn9W3TJLxh3bv+sfeUffe7Tvt7MZ1rdaxR8U8OfITlRxmuXIqPkufQoUKePdbnTypu7KqG1QVGR14BtRtTV3v9KVc3fAKFEhcQeFy/Nv1xLvWJA0F1P1NIZhOWDTGXmpUkfHbHwu8i6SCqZROavdl+TRGp9ZbFSa9nnjLeva4wMeQS+6EVN0b/5y7xE9gNH5W5HNpm6pSLtr2HTu9259O6hQiv/bmBK+4uO3G9ilOqKITV40VqXbTOJXRJ6fpbaeM3P/jsVN8tladpLc599SYFU7qXqd1176cwXkefLmmTvvT11lVqcG4ZIUKF7CmTep6V71h73zhlbUaF1Inwj1vvsC7RAfbSvuZqtn0HOrmH70NFQioMkrbr85xR/uEDqogCihYUndHPY+qGyPbJr7r5Rb/nYJRjdP6409z7YarWvm2Sa4dvVv+6vjKKFVJ6uR2f7SvqEu+ukoW/zfMUPto3EDtV6qUUuirtteJ+vgvfvJuvVOmzbbH+77n3c6DYSRS2y/3tZ03bd7uv9NEMXqtD0ZPtp9/+du7FWsZtdxNGtW1p577wF9DFxKiw3yFQJqQROusoRgiJ6QRhbHR2yKo/FYopHEjtV6aZOe1YRO8evi+np197D8dw+o6q4k/FKilZ3tntsxqr3XrN/ls5hof78ZrW3vgE7ldZvz6l++n2g8rVyoXczkUlulCgCY40nqnFOpl1X6/r+uVVmp3VXQHFZjqJq9qVF200GeCqmb1nqZwWMH4CXWrWIV/L7YEgaiq/bTPaZiUWBcP1cb6rNYFG4V1ev/uelHTmLM6exA/5Q8fH7nO8Uf759XTz470xykAvee2TknC+cDOXbts585dPqTHYcUK+7784qtjfJ+//aZ2vi5aXw3hoonC9Pmr94LUKn91rKgr+dff/W7ntTrV7r6tU5Kgb/mKtTbu82mWJ3dua92inlfwavstXLzaulzYxMPTjEyek9bjU3/XvqILjfq8Tmmc4UOBJoXQBCPbt29PVBWpcSNFFW4Z5b0H/h0HSxOf/O9/X1uzZhf6mIsaky54fYk1M3GwDMF9op87s0QuQzBhS9BdTt2vGzZMPFGLqKpT3dd1UzfEU09t5SFRZBAZPJdmAN7f0tNeqd03uW2m8w1Vj+7LPhOoWPEIvyVHk/1o4oaRI19N9PtgQpWwZXS/3RdBFeT69Ym/k6TFm2++7+0ZOaanfPVV7Jne0ypyH2/RookdqtRFVhNXaeIeDcWg7rQtWzZJMtt72FI7zMPabxX+6xzvs8/eSjQMhd7zD2QKVH0opH/+CaUi86OPxvkkP7fccrWHnclVR4smsAmrKjStgu+i0eOOI+2ypH5VJ7kLF63ykyFVJUVTpZDGDVTQ0qZlvZgnCjoJ8LHw9sb5SZ9mQk6Pw4oXsh7Xn+fVSKriuPXul+32e1/1Sg4FoJFXJ/TlQV2ZVTWi7my1a8XuEqUKCp2I6qRRyxRJXZNVaaTnrXNc5URjwiVHk2qoG7XGHdNYXsnZ1+WrWqW83XhtG++Gpe6p3a7vZy+9+qmPVRg5IK2oMufTCT/6a7Zr0yBJt7PkqKpKNwUYf85Z4l1WO19wpp9IpkRVlCtWrPeTIY1bGWu90tpOGbm/KjlVnauTTHWXj3VSpm2qaiUFuTqpzGgFiWZ7nTp9jp/MKjQKqmS0r2jsMz2vQkiFkfqdwi11r40MFRSAaVkSultGzYyuMFwBYuFCBeyGa1ol6pKuCwQK3zRZjugENbKk3WfH3rrDn1NBjmYE1kUCbUcFJslRFZMmP1GlT5WjD0/UPlnZvv48G7f6cymIDKpINSbpLTec790rNXag9lFVLWvW9+7XqHIulwcaChvSsl/uazvr2Nq8aZvvB+qqq+rxoSO+8EluNARF8FqqWNTrK+APKlUjqRpMlVq6KNHhvNPTNL5g/AzOy7ybrLqy6j12wKCPvXL9zls7eLgt6j
IatN+iJfGVnWnZ3mHIjPby8Yo+muyhlQIrdauOvCilKlp1pdVFMh1vqiSL2Xbzl/sYhfFDF/w3Rle0rNrvM2O90vxacaryje8+pnGA3xh4q415/2Eb/tqd9lK/7t4d/P6eF9oNV7fy4FUXoCKPG31+6bhSu0V/fsqWbTu8B4P2PY1zqTBNF0Gix5KM/Lz79vs/fBxIjed4z0NDfbZwHeMvP3+TXdK5SczvFmp3TSqlYVu0vyuMVBWlPmMvv6SZV6FrudVeGmdWz6FQT9s9JWrrp/qPtE8+m2ptW57qbaF1ifb9j7P9fVafT7pgp0mEdHxpfFeFp7GGiUiP1I5Prb/WR/dT1WXMi5qHEA2Mr5MsdaOOpMlNgr9nRKwZTqtVi6+qjeyipkBP44hpzLPI7pma5VfVWur6HKuboMaWU9fHtEjtvsE6KiBNrQ22bVOIvS3J86t7e3TXuxNOOM7XWeFYcJIfUNdFhZxZpUiRQj4zavSEOhlpr7PO0hAsubz6MLKLr7ou6jUyus+khz4Po7snK1xJaXKZzJTR/Taj1M7BeK4aUzG9tL302Mj3Vo3tqG24L845p5GHbc89NzjJ/qyAPvpYyS6GDHnXcuQob5Uq1cu057zoovPthx9+8jbVcaCfs1ownmpyx29Y+21wkTLymNRnrSaOOpBp3FO9340a9Zn99tusJH9PadzH1MyePde6dOlhjRo1sKefjr8wF4uqvtWuTzwxIGaX/EWLMr4M6aVJqTQGpmZER3bumh0X513nFIipq1i0aTPm+UmAvnAnN1OoKgs0mY0Grj/5xGppCvYi6cOm/inV7Y2XbrXXho337lsah+/LSb/aea3q2/VXnWtHHVEm4YTmx+lz/U1DJzGabTQWVdroZKpIkYIeNkbSWJeqgtLkKMkFatG0fhrkXt0jU6oq2tfly5snt5/QabB8TWij2a0ffWqEV6Vq/Kx2rRv4Y0RjeM2dt9SrojSJR1rHftPVAd3i9sZ5OyuY6XB+Q3/tlKxZt8lvun+pGBV+6WmnjNxf498pkNXJmMYyizXWhAfBv8zzLso6sSsXI1xPi7/nL/fKJHXTVfVZQOusrqUn1q3iXfPU5l0vbmpXd2uZpKpLE8noJgqKIptLoYcCMAU1zZqcYGc0OC5ReyqsGPTa2ITZWKO7dsd3gY3/m8ZOVDtq/DJVN6W0P+vCgybwUOVylUqJu35mZfvKqjUbPQSqXrVCzLFpIym4UzWd2kHHrk7Ogy7IKe2X+9rOqtjcvnOXh7uqUFMlrF73yq7NE3VlVQCi6jAFh6pEO6Za4uoIVWppjMgT61SxGtWTr5yIpABGlVd6P1XIorBZQzk8fO+lPkFIsB7aB0v6JBZms+ckrgZLaXuHITPaS6HWp+Om+j5+2UVnWemIMS+1bvp80Bit+ru2lWZLjqb7KYBW6fnRFculGBxm1X6fGeuVVj5Z0PI13s4dzz/dxyNNz7bXhToF4ApMYwWEqrJUEqkLhe+P/sZnkG7YoGayrxFUvosmeVHl8R09LvDuzrroltzjFHauXbvJ20ftpUl1NMGbPltVWRzs53q83kMUJgYXODTeayz6u6oaR3zwtXcjf+T+S5NcKAoEPSf0Pq5jT0G1Zv9Wd/rMOJZSOz6DynfR95WDcaKalStXJ0wYMmPG7/5f/axJLyRyRlVNrvDyy2/5zKA6uahWrbKNGzfRZxLVuHQZmahGVOWhiSzUtVYVMps2bfZgTydU6sYYSV2a27W70ho2PM+uvvpif28YNGjYv+MAxs8MG03VcJrxW110Gzdu4OO2auINTboQ675alt69n/PJTBTKaFzHYMITjV+pLpTvvPORh4lNm57hM5gOHPimn7Rr4pDAnDl/W6NGF9gFF5zrbaOxv7744hufpVUzaUfS+97gwX2tRYuL7cQTm/u6qWpq5co1NmzYSBs0qI9PzJJen38+yZ9Dfv01/uR49OhxPlGMKAiMniBDbaBtqlm827fXd6t8XmkTWQ
Ga1rbVGHjqfqquqKpwbdeuhXfBVBilqp2ePeNnRA6TZqR9+eVhPoGLlleVmFpm7WuxAoMw2jYj+21a6bh9660P4iu8V662UaPG2k8//eYzUQczX0ce5wGNl6nHSdmypRImttGYm5p59/zzL7fWrZv5uHY6HrVeGtdwX4KvQYOe9DHujj++qV1xRWcff1AzbCvs++ab0VapUvovMo4e/Zlt2RIfYgZjgAbrJRoaIXLG9PQKAnTt25lFwePDD/fz40YzKscaVuDXX/9I2K80Vqd8//30hADv+OOPTZi9PiN0LAwY8LpdddUd/r6l2bErVCiX6H08jP1W26N//1etdevL7JprLvHzYVWFbt4cXziSFcJq2xdffNzbqkGDNl4Br5nAdfxMmfKTjRv3le3enfgiU1p17XqLbdu23dq2Pcc/eyJFLquOJ03U07Pno3bSSS18/GWNmanxWRUen3JKXRsy5Nl0v76Oq+++m+b/VnAuar/gOFM1aOSs3sFnWqdObWzEiNF2771PWM2ax3iVZPTM8NjPQaROJDT+kMT6krtocfyYjz6jZoyxBHTCo5Ps32cttOpVj/Cuzhn5sqzHaPbfB++62L/sD35zvH0/ZZZXiMz+c7H17tXVuyh7d4olq/7tSh47dNE6DX93oo8tdc7ZJySZCTwYO0zd0qLHZEuOgkOdkOiEOaXALzOWT6HLWWfW8Rk0NamDxi77e8FKe+iJ4X5i9PB9l3rXUlUTakKC2sdV8gqjtIof/H+vnxgrDOpw/ulW45ikgwFH8xO8jVu88lITZOxLO2Xk/sHsqXp9VcbEalu1h7qC5s+Xx/ej5Lr3p+aHH//0Dye9VnSoVLJEEbvp2ra2ZcsOO7txHbvi0nNinvQqeFKXOm1Pba/I5VVX3f9N/Dl+sqfGdb3bpOiL3JJla6330+/69tXftRxJujvGxT+H2k5htfbjNi1PTXVf1vG6cLFmjc+ZJCTPyvbVcaIgXeur1yuaSjVvfJffbX7sqC0jQ62U9st9bedlK9Z6YK/JhhSAamxGBZon1fmvC77oQo4u1ijY1Div0cs+8/eF8V3HNd5eGsdk0YQgGjtX3dYV2L454ks7s2Fta9f2tERhmZZD3Wpz5shp02fMs81btiVUS6a0vcOwr+3lM9DPW+ZhlS5MaCKS4HFqP82K/OxLH/m//117H+s29tizS/w9RZ8rKR0XWbHfZ9Z6pZXed7ROamdNvpTez2QFszpGNRlX9HAtwXGlwHTmrIUelChQTOlClkLNdes2+3K0bn6K3d6jvVczRoecOr7ffn+Sd7dWDwxdKF22cp0/7ogKpW3c59O9J8Gt3c9Pcqyr67KCyGCW6VgTvamitf+LH9pb7060ls1O8gnhkgshfSb1f8dc1QWevr2v8jFoowNBjeGqkFKf1Tdd0yZhHMu0SO34VBCr7Sjalw/GIFInYV26xE8iGHj88ecT/h0ZRGp262+++dDuvfdJGz58lG3YsMm71j788B127709MrwM553X3INPddlbtWqtHXZYMe+++Nprz1i9eidE3beFjRnzpj322LPWq1df3ya6r0K8s88+I+bzaxIIVXSp2jAY4+6NN/pbt26dk9x34MAnrXv3e318SgWiMnHiyEQh4Ntvv2iPPvqsL68COy2vZil9/PF7rGDB/wIPdVvUWGEa91LjwYnGRXv22Ue8S2Y0zcY6depYXzd1J1a32iOOONxDCoV7GdG79/M+GUakW2/tlfBvrVt0EKngcMGCJR5oKQRVe/XqdZs99NAdGWrbvn0f9IDj1VeH2+23P+KhUKtWTT3IUNVM2Pr162WFCxe09977xMfjU4Der99DHkjuSxCZnrbNyH6bVlon3YKwT/uKjp3LL78wxeP8m2+m+E1UaRUEkRpzU++NqvqdMOFr348VnCs4vuyyjB/n0rnzeb5Pa/KgF18calu3brNKlY7w/aVcuYzNGH3LLb2SVJtFruv8+VP2KYhUqCtXXfXfRYZ9pX
E7VVGo51Y32+gJwETHn8LKSJp5WTfRMbkvQWTHjm3st99mewjcseM1/p2ja9dOiYKqMPbbhg3jh0l49NH+PkGZPld0cefGGy+3WrX+m1E6TGG1rbbr9Onj7ZFH+vl7vkJIBfg6Jp9//tEML68uJOiCssZujRa9rHfccb1XQvfr94r16fOiV0aquvXMM0/1cDQjvv76B7v88lsT/S5yUh7tN9FBpAwY8Jj/V+26du16n9SNIDKbBZE6udcbvk4qYs2SpC5O+jKswfNjWbBopU/WoQqJzh3OtKMrJb3Cmx76kn9mw+P8RG3Ux9/5SYO6R7334TfefUpfMnQyIbHGwdMbmbpbjZ3wo3dtix53Kqgw0Jd7dUUPgomAuqPphEKBQdA1To9Z5CdmcT7JQnQXWZ1cqqtoq+an7PPyBfRmq5PAizs29lDy9WETfJy7d0d97ROkaHZmnUBrLK9Ys4EG1AX7i0kzPKgIxqVTd1S/7dnrIVnzs05MtRpSNI6kxqsqUyp+HMto6WknrXN676+uzqpc0wlorDHatC/rZH7x0tXW+PTjfSzHjNBJtqqf1D7qXh8Zeona8LRTj7VP3vvvC19yFW3qeqp1UZfEyElEtP/p2NHJpWZDDn6viUceefJt++rb33y95y9Y6UMUbNsa3xUy8jxUyxls4yZnHJ9qVWHCJBZL1/jzRHdBzar2FbXtnHlLEoLdyDEbY1H15/sffuvtdvxxlX224aDNUtov97Wd9XsdK5ohWTP+6tj2LrVR1a96f1RYofv6GHV79yZ0kdd2CqqaKkR1/Q5o31DoOGnyTA/udHFCP+umUGrYiC88BLvpujYxuw7XOrair6MusGi8Q1Xp6v0npe0di2Y0VrXpGafV8mVIb/Cxr+2lzyFdyNF204UnTbilZVD7KHC/u9cb3p7dLmlqr785wZMaVatFdrcXPZ8uoilEi6601uO/nzrLu/3rfT4r9vvMWq+00n6jYyyYLTq9VEGs92XtM7H2AXUlDt57ah1bIfVhOHwG9fjJcXRBQBWN0UMo6GKCuj/r9sRD3fzve//Z6/uUKgY1s7rGWa1fr4YH8tE00ZkuCOlzQ2Fl3HGJJ20KJnbTZ2mTM+pY7we7euCZYgX5v8PCaFZsfSfREDWR1Mbqmt6n//tesa0u3OmR2vGpdtm2Lb6qPpiN/mCjgC0uLn521LRQpZsGws9MmkE6ehbplJx77tl+SytVM7799ktpuq/GxtOJd0pUIfjYY3f5LSXqtvjcc+k7+dSMwu+8MyjV+6nyJS3b7auv/qsMSyut30svPeG3zGhbHeO33Xat37Jin1WQEhmmKBzu27eX36Ldf/8tWda26d1vM/PYTc99dXFLFxZiXVzo0qVDop8VIMYK9BVaxwqugyDqk09SPsait2FKbb5gwVQLkypJFZ726HFlpj6vAquUpNSGab1vcu0YHJePPnqn37Jyvw2qInWLlp7PoljS+vj0tG163hOCz5DXXksccqbn/SPW/pzefbxNm3P8llmSO85To6EYhg0bkGnLcajJkjEi9/yzN+EkQV/Uo5UoUdhPaBcuWukVXgHdX4PZa8bJv/5eZs2bnmSd2p2R6mQjqdEbk24+JuW59eysxnX8OXVS5kFOzvgu5PoCr+6zqioMlkdf1NW17Yl+7/nJpbo4166ZeFII0Umnqp90khGMp6bHq6v0w0+8bU/2e9/Hxoxc12DdvWLl35MTPceM3/627re9ZEPe+twrIzJj+aLbQpVDXS9p6t0zdUarMavi/5448Iqk5VTFmaoo73loiE8oEIsqSDQZQWqBQ/yYhNu90kbbI9Z2Tk87ZeT+CgQVVGvCDz/R/Pf+QbXcS6+O8e6rqrS59cbzfebxjFCVaDCTeRDwJLdtglssHnDsit/X1P6RY/cpOFPlWLC+CiE04Yq21ecTf7ZmTep68B6MufbXguWJxjf6dyn8/xXSnla/ZsxAO5oCUVUmBY+LlFXtGyyHZkuP74KafMWWB/
ubt9nLr431iSIUHl3csZEdWaF0mvbLfW3n+QtX+M9/zF7k1X2d25/hYx9GL6+CEoUr2iaa5Oirr3/1WX0V7AVdbUVtGznmrf6tWaan/TTXrrl5gPV/4UO/uBBMtqLAVENfzJqz2C7q2MiHcYjVVuryrcBN66jQUtWAQQCe3PaOpquduuBzx32DbcT7kxLtr2m1z+31/e+2cWN8hb5e37vFbtvh75u33fOqLVuxzm654TwfQzRX7lweOP61YEXMzzXvPhw1ULXeb1QNd+vdr3rFfVbt98GkI/u6XmmlWZh9PXJYzFnkU6LlW/Pv4yNnkE4kYnMeU6W8V+Sn9BmifU/dsbWPaTzOteviu1tr2+h7h8Z8vPGOQfbqkHHWtlV9v7DiQaRPprfCKyo1zqvGvbzi0mZ+ATH69RQqarI3bfdRH0+2n375yx7v+27CrNPPDPjQXhs63nt3aMiXKT/OtoGDP7W7HnzDul3Xz9p0ethatn/QLr++n7/XaP3VNV3H7vSf5/kFDR2bWibtR3pPVwB5/6NvWoGC+XwM2/TuG6kdn3ERF5z0vSUjxyQA4MCkMfU0xIJmW49VtQgAB3RFZMF/Kx68UnB9fNVOpIb1a1nevKNt5cr13p34sovP9i/6U6b9aS+8/In9+vsCO71BTbuv54UZGsBfr6sufDopURigSUL0s8ZN0iQqH4/9wbsZaiIQfVHPuTeHVwPq5FVdwr+e/Js1PLWmh4ijP/3Bhgz/3Gdt1eQWl3RqHHNSCJ14Kuhctnydj69XunQx+3Xmgvjqy6mz7You51jNY/8bD1PtE6ybumDppLFIoQL2ybipNnDwGA8GVV2hqkKtT0aXTyca6tqo11L3LnXH1PidGv/v8y9/8nG7FERofDhRVY2qXWb9uchPvNQ9TkGNJvz59vvffTIAPeaKLs2s8RnHJ7yOTi6DE8xqVQ73STzSQmGPB6q7//FbtPS0U0buryqryhXLeTWTqq1U4VasWGGb+fsCG/HBJA8LtK/cf+eFdvIJ1TLcjW3Vqg1+08NrVj8yzWNvJhFxzqjtoBPY4Lk0LlqBAvm8QkqzQSuU0eysqo7RTOu339TejwdVzWo5vvhqhl3drUVCACcF8udJqChMabbzSAqsPZiJM9u67b8LC1nZvqKqO1VdKdHQPhuL9rX5C1bY84M+9mNH7xEXdjjTLuzQKNE2SWm/3Jd29ordJRqaIn7CCI2Re1HHxskG05rZulyZ4l6RdclVT3vAOrB/d99/PdDIkcNGj/neGtavabVrVvSLFLqY88nYKTbqk+/8/e2+Ozr7OJd6Te0zQYimIS9ULZfcJDeqhtWsv0/2e89WrNqQMFlWSts7mio1Tzmxmo9VqKrwq7o2T9e+nxntpYlU9H6stlClZN/nR/nnzYQvf/aqyrtu6WDdLmnm21ljBmo7jvlsip1yQrVEXXX1eL2HKlD7bsosn5F49ZoNHrBq1nGN8deubcMs2++17TXJ0L6uV1oFs1rrdaOr+FKj/SUYf1bvwZHBeSAI3LRta1Q/MtWQW/vtea0b2Kw5S3yCNR2PmhxNgauGG9HEYBoLVT0ENFlVsM66CKXPMx97bNUGP/4ju7VHUuCqvyncVVWwbgro9ZwaV/K1oRP89eYvXGndb0+5gmruX8v9M1PV0m+O+MI/m67p8bxXxBYpUsCrbfX5ru9FGl7lzls6+mR76ZXq8RkX361dFFyr63pG9gcAwIHnqKOO2OcqPQDItkGkKgsUaC1bsd4nPIhWq8ZR1vbc+n5y+tzAj+yDjyf7F2d1/VJoqDHyHrv/Mj+Zy8gJmk7+7n14qFcLqiolqBTRzJr6t8Z50qD26ioYPztmTmvdvJ7978ufPQS9/Z7BvtyqmNSssuri2eO6tnblZefEPGnXc2hZNYHB7LlLrFfvt+z5QR/ZmrWbbevW7XbhBWfazdeflyicC7riKiz448/FdvEVT/mJl072FaI8cNdF1rzpib5sOmHKyPKpTV98dY
[base64-encoded PNG data omitted]",
|
| 363 |
+
"text/plain": [
|
| 364 |
+
"<Figure size 1400x400 with 2 Axes>"
|
| 365 |
+
]
|
| 366 |
+
},
|
| 367 |
+
"metadata": {},
|
| 368 |
+
"output_type": "display_data"
|
| 369 |
+
}
|
| 370 |
+
],
|
| 371 |
"source": [
|
| 372 |
"# Public sample from Hugging Face (IAM handwriting)\n",
|
| 373 |
"SAMPLE_URL = \"https://fki.tic.heia-fr.ch/static/img/a01-122-02.jpg\"\n",
|
|
|
|
| 404 |
},
|
| 405 |
{
|
| 406 |
"cell_type": "code",
|
| 407 |
+
"execution_count": 6,
|
| 408 |
"id": "06391d26",
|
| 409 |
"metadata": {},
|
| 410 |
+
"outputs": [
|
| 411 |
+
{
|
| 412 |
+
"name": "stdout",
|
| 413 |
+
"output_type": "stream",
|
| 414 |
+
"text": [
|
| 415 |
+
"Transcription: Io sottoscritto Mario Bianchi , vinto a Firenze il\n"
|
| 416 |
+
]
|
| 417 |
+
},
|
| 418 |
+
{
|
| 419 |
+
"data": {
|
| 420 |
+
"image/png": "iVBORw0KGgoAAAANSUhEUgAABSIAAAGNCAYAAAAFLQ8PAAAAOnRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjEwLjgsIGh0dHBzOi8vbWF0cGxvdGxpYi5vcmcvwVt1zgAAAAlwSFlzAAAPYQAAD2EBqD+naQAAflZJREFUeJzt3Qd4FNUaxvEPkA4iHUEFpEoRREUQpChIR0AFK2Av2AW7Iir23kDRK0XFgogNEQuClaaoKAgovffehNzn/eLEzWbTMyHA/3efvZJksztzpmzmne+ckysuLi7OAAAAAAAAACBEucN8cQAAAAAAAAAQgkgAAAAAAAAAoSOIBAAAAAAAABA6gkgAAAAAAAAAoSOIBAAAAAAAABA6gkgAAAAAAAAAoSOIBAAAAAAAABA6gkgAAAAAAAAAoSOIBAAAAAAAABA6gkgAAAAA+6WhQ4darly5Eh7Z5d577014z0qVKmXb+wIAsL8jiAQA4AC3e/duq1+/fsJFc968ee2XX35J8rxly5bZYYcdlvC8ChUq2IYNG2x/sGDBgkRhhEIC5IxtkdZHevzzzz/2xhtvWNeuXe2oo46yggULWuHChe3oo4+2c8891z755JNUX2PHjh02ZMgQ69y5sx155JH+GgUKFPBQqVu3bvbaa6/Ztm3bEp7fu3fvmMudP39+K1++vLVp08Z/Z+/evekOstL60DIgfISMAACE55AQXxsAAOQACh5VNdSwYUMPJRXiXHrppfbjjz9anjx5Ep539dVX28aNGxO+fvnllz2YBHKSefPm2Zlnnmm//vprkp/Nnz/fH2+99ZaddtppNnLkSCtdunSS502aNMnOP/98W7JkSZKfLVy40B/vv/9+msK/Xbt22fLly/0xfvx4++KLLzwkRfY48cQT7bHHHsv25j799NOtSJEi/u9ixYpl+/sDALC/IogEAOAgoIrIO+64wwYMGOBfT5s2zZ566inr27evf/3uu+/aBx98kPB8hS8dOnSwnGDPnj22c+dOK1So0L5eFKRBiRIlkgRD2t/efvvthK+vvPJKq1KlSprbc9OmTXbooYfaqlWrPGBctGhRws9OOeUU/55CdlVCzpgxw7//5ZdfWvv27e2bb77xSseAvlaIpH0q0KhRI2vZsqUHS6oM/uqrr2zWrFkpLpPWUdWPCi1HjBhhmzdv9u+/+eabduutt9qxxx6b5iArMGjQIPv777/938WLF/djNlKdOnVSDETj4uK8QvNgEOwTtWvX9kd2O/nkk/0BAADSKQ4AABwUdu3aFVevXr04ffzrUbBgwbh58+bFrV27Nq5s2bIJ369QoULc+vXr/XeWLFkS17dv37g6derEFS5cOC5//vxxFStWjDv//PPjJk+enOQ9evXqlfA6zZs3T/SzCRMmJPxMj/nz5yf7ewsXLoy74IIL4sqUKROXK1euuPfffz/FddNrRb52//79k/3Zl19+Gff000/HVa9ePa5AgQJxtWvXjhsxYoQ/d8uWLXE33nhjXPny5X1d69evH/O9R48e7ctXt25dX8a8efN6+xxzzDFxffr0SbRukX799de4jh07xhUtWtQfbdu2jfv55599eYPlU/tG27hxY9yDDz4Y17Bhw7hDDz3U3+/II4/0dps5c2ZcTvfaa68l2gbaF1L6+datW+PuuOOOuMqVK8cdcsghcddff70/77LLLkv0vPvvvz/R6+zZsyfu4osvTvSchx9+OOHnO3bsiKtUqVLCz3Lnzh03fPjwmMv8xRdfxE2aNCnmPhr9J/SgQYMS/WzkyJEZaift+yntB5E/1/L89ttvcWeccUZciRIl/Hval+TRRx/171erVi2uePHi3obFihWLO/HEE+MeeOAB38+jRS6/tsf48ePjWrRo4ft1kSJFfF+Nta+pjbp06eLHTHAcaNn1fO3XGzZsSPT8vXv3xr377rtxnTp18t/Jly+fL6OONR17O3fuTHaZxowZE9e4cW
N/D61PrH0npfaaNWtWXLdu3fz9dP5r0qRJ3Oeff57sOSrWQ+8nqR2z69atixswYEDc8ccfn3DMan27du3qbRstej20r2pbaRuqjXRevvnmm/37AADszwgiAQA4iCio0AVxcLHbsmXLuJ49eya6AP7kk0/8uRMnTvQL9uQuyBXiPPHEE1keROrCu1y5comem5VBpIKBWOvz4osvetAX/X0FoQqlIp155pkphhUKHhQ6Rpo6daoHOtHPVRjaunXrZEONOXPmJArPoh8KTN955524AymIPOWUUxJ9rSBy+/bt3lbB9xRS7t69O8l7rVmzJlE7q+0Cb731VqLXvfbaa9O8DikFkR9++GGin0WGW2EFkccdd5wHcpHvGwSRJUuWTHH/VIC+efPmRK8d+XMFdNrvo39Pr7tq1aqE39FxkSdPnhTfS+FfQNuwQ4cOKT4/uAkSvUzR+0R6g8ggEIx1HguOn6wKIv/444+4I444IsXXCcL1QPR6NG3aNObvXXjhhRnYswAAyDnomg0AwEHWRfvOO+9MmMxlwoQJiX5+0UUXeXdWTVKjCTvWr1/v39dEHvqZukJq3D11R1W3VHXtPv7446158+ZZtoxz5871/+r969Wr5++VlWOwTZ8+3dq2betjy73yyis+tl8wRqZo8hJ19Xzuuedsy5Yt3t1V3XDV/TegsTPVtfaYY47xLrT58uWzlStX+riC6jasbqPqnjt27Fh/vl7j4osv9tcLaFIVTa7yzjvv2Oeff55st3RNyKIJYETjHZ533nne/fmzzz6z77//3rsY9+zZ07eDXu9AoO7TJ510krVu3dq2bt3qE9JMnTrVJ5gJdOnSxQ45JOmfsiVLlrRWrVrZmDFj/Gu1ncaCPOKII7y7diRtk8zQMaDt/fzzzyd8TxPXNG3a1ML2888/+/pfeOGFVq1aNZs9e3ZCF3Stq7qaV6xY0fdP7X8aO1Pd49Wev/32m7344ot2yy23xHzt7777zmrWrOnHoLq6B/vx2rVr7dVXX7XbbrstYRxZ7aOi55999tm+TGoT/d5PP/2U6HVvvvnmRBMJaZIg7d86vn///Xf7+OOPU9wnSpUqZeecc45vYz0/vce9ts1VV13l3ei1Hjp2tA0vv/xyP541XICOdY31GRyT0V3kdd5Iicbg1ToF449qHF5tI20T7ZMzZ8707z/zzDPWoEEDP3Zj+fbbb/11atWq5WOOBucA/fvhhx/2dQEAYH9EEAkAwEFGF9UaD1JBRiRdKGvcSNHkNgodAu+99561a9fO/33jjTf6BXsQ0ul3sjKIlKefftquv/56C4MCBwUrmohE63zFFVck/EzjYgZjZWrddMEvCsEiKcDUmISa8EfBqYJHvZbCSs2cLBpnUM/RZEGTJ0/28CegkDJ4bYUzas8g9I2k0CYIXBRoKCBS6CQKlI877jh/XQV0CsOefPJJOxAoANO4pblz5074ngLbSArZkhP9M4XN2j5Lly5N9H2FZxkVa6bv6tWr+3JGjkkZplGjRtkZZ5yR5PsKATXxlIJqhYIKHxWaK6zWRD2iIDu5IFIB4ZQpU6xo0aL+tQKz4HwReSxEBsP9+/f3kDDSihUr/OaFaP9WcBnQvqtliRwnc/HixT77eSx6HYWJCqUzQsehjh/Nii5NmjTxCYtEN160v2kSL91c0bktCCL1vsFYummhMPXPP/9M+Fo3NBR+BudebQfdXBEdr8kFkTfccEPC+VgBr24iiYLTIFQFAGB/RBAJAMBBOov2CSec4EFZYMiQIQmVhz/88EPC91WFF4SQUqZMGf9aF+7Rz80KqkDq06ePhUUVhUGIFIQSge7duyf8O3IyleiQUFVJCgrWrFmT7Puo2ko/P/zww32ylkiR4YPWV2GStkk0BScBVZ4p6EqOQqfUKNT89NNPLbPSE8xkhAKbyBByf6AA7a677vIq3uygiWtihZAKqlSxqIo7TWCTnFgzhgdUwReEkKL9LggiI48FTRT04YcfJk
xw9dJLL/lza9So4UFfw4YNE441hfaqFgxoGaMn61EAmhwdMxkNIYNljTzee/To4cscnAMV7imIzKzo82Hksa7Kcp1jgsmcNPP7tm3bYk7EFVRoi9ozUqybFgAA7C8IIgEAOAhpRt/GjRsnVEepgkzdlQPr1q1L+HfZsmWT/H7k95K7KI4f4u0/kbMUp0QBYKwut1klspJIXaqT+1nkMkSui7qbKlxQ4JOaYJ1VcRWpXLlyKX4dazukZvXq1ak+R9Vs/fr1s5weRMaqVFSgGymoKosl+mfB71aoUCHR99WdOag0Sy+FSao61CzZmulaVYfBftGrVy8LW3LVnM8++2ySWctjSel4jA7oI2fijtzvFcYrTFMb6PW+/vprf0SGpermrPaP3pcrV65s6ZGZ6tXgBkokVRiri7eqNmMdoxkVuZ4KWqMrPCPPnTqv6H1jBZGR2yB6JvS0nHsAAMip9q9bzQAAIMvE6loa0BiEAY19GC3ye6roC0RWsW3fvj3m2I+pSa5rZlZWhCYnLQGoKkGDIEBtqDEzg27qkePfRdKYkpFWrVqV6OsgDElpO6i7rwKm5B7BuH0Hglj7gCp4I7s8a7y9YHzC6CAocixIBTrqli2R43xKrCrU9ISx999/v3e7jww41dVeAWXYkjtONA5kZLCu5VNIqP0zrSF09DGS3LlCx8vw4cO967u2xyOPPOLjbgbnBI2HGOyXkfuyaMzK9MjseSH6mNO+Ezn8RPQxmlGR66nzggLq5M6datfk3jdyG6R0rgYAYH9DEAkAAJI4+eSTE1XaRXbn1QV95NeRz428qNY4aUGVkYKZF1544YBo6cjwQl3Z1dUyCEmixzGMDNEiKbyMrCgNxqWMFtm2Go9Pk+goAIt+qNtpapNoiLqiKpDK7GNfULdWdRmODLIeffTRRM/Rsmn8TU1GErjyyisTTXATOX6kxtVUNV8sCjM1QUpqNIHKAw88kGj/0BinOWH/1H6n7tGq/NX+89FHH2Xpe+kYV9diDd+gbuIac1KTwNx9990JzwkmrGnUqFGioF+hpX430rJlyxINF5GVtC2DCV+CwDbyvTR+ZqwQMHoZUxN5zIqC2sibM5HnCHXjj1UNCQDAgYyu2QAAIAl1LVW1VxBqnHnmmV7ppIkbFNwEsz+rUkfdMwORYZgmcNGEFApCNNZh9EQh+6vI8doUtGqCG4UPmuVW3VBjUQhTt27dhAlr1LYK0jTmnYKJ5Lq367U1ucWsWbMSgjRN5KKZdFWV+ddff3n3enVF1iQ5Ge1mvL9Q4KdJVjQBSzCWpL4+9dRTPVRSRWrkJEwK4q677rpEXVxVBdmmTRsfP1FVcZqwRIGkZplWV1rtp5poSG2uNlXIm5oLLrjAZ6IPuoSre7QqI6PHQMyu/TOoPtbEKZqMSV3/NbGNuqJnJU2mMmLECK80VVdrdTtWRWpk+BbcnFCVpGan1mzdQUCp/Vj7tJ4zZ84cn3Ve1ZVZVZ0YSfuHxq1UmB3Mmh15Q0ETwgQiK1x1I+aiiy7yZdX5TuPXKhRPjo5ZbYNgwpprr73Wh0TQa6pqNHLYAE38BQDAwYYgEgAAJKEgYPTo0V7lpLBNlTzRFY3qhq2KtMgZs7t27eqzOgdBiCqQgiqk9u3b+2zV+zuFEprtVtVbMm7cOH8EAe6wYcNi/t7//vc/D7uCbtwKcIJwTEGawq/o7u2qIFN4oeBM7ajw7K233rKDlcb5U6WignGNTSgTJ070RzS1qdoqOjRq0aKFby+Fh8E21AQjmZl0SdtJ3Z6vueYa/1phnAK35GalDpMqQrV+mhhGYXUwU7VCUYXYOq6zkioGk6u01L6sQDbwxBNP+H4cnAcUymlSneygmwEKO1WJGb2MgwcPTpioSzRerioVg2rIyC78qipOKYjUvqBA9fTTT/cJgRR2K9COpoA8uRmzAQA4kNE1GwAAxNSsWTMf401BgroE68JcXTxVxacqMs3SHBkyiMbwU1Ck7soKM/X1SS
ed5BfmWTFJSk6gMeBU/ahQRxWiCiVUCaqARyFFclSdpzZTxZRCIT1USaaKRoW3gehqMM1CrNBNoa8qL1VZpok2NKuxJh3STL9qX80GfjCoWrWqz3CsIFdBuSrNFOZqO2g8SM2GrGDsiy++8C7DsSgQVliuAErbQ6+hfVX7t7puqzpOY4HqtdLqkksuSTQRicLq6HFSs0PTpk29SlT7itpFAZtuAmjfU1VuVtI6K/jUuUIzXgdtqH+rDRUQq+IxoJ+rSlNVwB07dvRKTXWD1nGkZbv++utD66qsKsUpU6bYWWed5ceQ9he1kULRc845J9FztVzah1RBmZGxKVXF/Msvv3iVbIMGDfxYV0CpSXt0s0bbJ7sCWAAAcppccftqoB8AAICDiKoZFUZEVjyKKiQ1u3DQZfOyyy5LqGIDkHGqfg2qZVWtnJnJiQAAQNagazYAAEA2+OOPP6xz585eTarx5lSVpW6qqsoLQkiFlBqDDgAAADgQEUQCAABkk8WLF9vDDz8c82fq0jpo0CCfSRcAAAA4EBFEAgAAZAONm6dZcr/++muf9Xnjxo0+Zp5mG1YX0quvvtpq1qzJtgAAAMABizEiAQAAAAAAAISOWbMBAAAAAAAAhI4gEgAAAAAAAEDoCCIBAAAAAAAAhI4gEgAAAAAAAEDoCCIBAAAAAAAAhI4gEgAAAAAAAEDoCCIBAAAAAAAAhI4gEgAAAAAAAEDoCCIBAAAAAAAAhI4gEgAAAAAAAEDoCCIBAAAAAAAAhI4gEgAAAAAAAEDoCCIBAAAAAAAAhI4gEgAAAAAAAEDoCCIBAAAAAAAAhI4gEgAAAAAAAEDoCCIBAAAAAAAAhI4gEgAAAAAAAEDoCCIBAAAAAAAAhI4gEgAAAAAAAEDoCCIBAAAAAAAAhI4gEgAAAACAbDB06FDLlSuXLViwIEtft0WLFv4AgJyOIBIAAAAAkO0UyKXl8fXXX7N1zOyPP/6we++9N8tDTADITrni4uLisvUdAQAAAAAHvddffz1RGwwfPtw+//xzGzFiRKLvt27d2sqWLXtAtNeePXts9+7dlj9/fg9Z02PUqFF29tln24QJE5JUP+7atcv/my9fvixdXgDIaodk+SsCAAAAAJCKCy64INHXP/74oweR0d+Ptm3bNitUqNB+1b5bt261woULW548efyR1QggAewv6JoNAAAAAMiRVPlXp04dmz59ujVr1swDyDvuuMN/9sEHH1iHDh2sfPnyXmFYpUoVu//++73qMNZrqGtzy5Yt/TUqVKhgjz76aJL3e+6556x27dr+nOLFi9sJJ5xgb775ZqLnLF261C655JKE961cubJdddVVCVWJwTiQEydOtKuvvtrKlCljRxxxRKKfRXavrlSpknXs2NHGjx9v9evXtwIFClitWrVs9OjRCc/R76kaUrQO0d3WY40RuWrVKl9OVZPqNevVq2fDhg1L9Bwth17n8ccft5dfftnbUOt04okn2tSpUzO41QAgeVREAgAAAAByrLVr11q7du3snHPO8WrJoJu2wrkiRYrYTTfd5P/96quv7J577rFNmzbZY489lug11q9fb23btrVu3bpZ9+7dvZvzrbfeanXr1vXXliFDhth1111nZ511ll1//fW2Y8cO+/XXX23y5Ml23nnn+XOWLVtmDRs2tA0bNtjll19uNWvW9GBSr6dKzcjKRIWQpUuX9mVSRWRK5s6daz169LArr7zSevXqZa+99poHj+PGjfOu6QphtWzPPvusB7HHHHOM/17w32jbt2/3YHLevHl2zTXXeFj67rvvWu/evX3ZtX6RFLZu3rzZrrjiCg8mFdKqrf7++2/LmzdvhrYbAMRCEAkAAAAAyLFWrFhhgwcP9pAsOjwrWLBgwtcK8fR48cUX7YEHHvDKvoACRI1BeeGFF/rXqhSsWLGivfrqqwlB5CeffOLVkArsknP77bf78iicVLVk4L777rPo6RdKlChhX375ZZq6Ys+ZM8fee+
89D/+C5VPIqbBUQeTRRx9tp5xyigeR+jq1GbJV3Thr1iwfh/P8889PaJ/mzZvbXXfdZRdffLEVLVo04fmLFi3yMFRVoFKjRg0744wz7LPPPvNqTQDIKnTNBgAAAADkWAoUL7rooiTfjwwhVc23Zs0aD+tUmTh79uxEz1XFZOTYk6pcVGWjKv4Chx12mC1ZsiTZLsl79+61MWPGWKdOnRKFkIHoyWcuu+yyNI8HqW7eXbt2Tfj60EMPtZ49e9rPP//swWd6jR071sqVK2fnnntuwvdU2aiqyi1btni38UiqxgxCSFE7SmT7AEBWIIgEAAAAAORYGs8x1mQsv//+u4d3xYoV8+BO3aCDsHHjxo2JnqsxGqODQgVv6rIdUPWhAksFlNWqVbM+ffrYd999l/Dz1atXe7dvjTeZFuoOnVZVq1ZNsnzVq1f3/0aOJ5lWCxcu9HXInTvxJX/QlVs/j3TUUUcl+joIJSPbBwCyAkEkAAAAACDHiqx8DGicQ3Uz/uWXX7xb9EcffeQzbj/yyCMJ1YuRkqtMjOxOrZDuzz//tLfeesuaNm3qXaX13/79+2fZcudUaWkfAMgKjBEJAAAAANivaLZoTWKjmaU1kUtg/vz5mXrdwoULezdlPTQLtsZsHDhwoI8NqYpLVV7OnDnTspomlVHoF1kVqXEjg1m1JbpiMiUa/1IT7SiQjayKDLqs6+cAsC9QEQkAAAAA2K8EFXyRFXsKDjVRTUYp2Iyk7uC1atXy99i9e7cHel26dPHqy2nTpmVp9aAm03n//fcTvlYXcE2uU79+fR/rMQhJg2rQ1LRv397Hlnz77bcTvvfPP//Yc889593PVU0KAPsCFZEAAAAAgP3KySef7OMY9urVyydgUbXgiBEjMhUGnn766R76NWnSxMqWLeuzTj///PPWoUOHhBmmH3zwQRs/frwHeZdffrl3516+fLnPtP3tt9/6hDcZofEgNVO2JsrRe//vf/+zlStX2muvvZbwHIWSCmDV/VxjYGoSn1NPPdXKlCmT5PW0bC+99JL17t3bpk+f7lWVo0aN8jEvn3766UQzZgNAdiKIBAAAAADsV0qWLGkff/yx3XzzzXbXXXd5KKmJak477TRr06ZNhl7ziiuusDfeeMOefPJJn1laE9wo5NTrR06cM3nyZLv77rv9uapc1PfatWtnhQoVyvD6aGIZVSv269fPx6nURDeqZoxcF4WkgwcPtoceeshDyz179tiECRNiBpEan1Ld12+77TYbNmyYL2eNGjU82FQ4CQD7Sq44Rp8FAAAAAGCfULWiZuJWsAoABzrGiAQAAAAAAAAQOoJIAAAAAAAAAKEjiAQAAAAAAAAQOsaIBAAAAAAAABA6KiIBAAAAAAAAhI4gEgAAAAAAAEDoCCIBAAAAAAAAhI4gEgAAAAAAAEDoCCIBAAAAAAAAhI4gEgAAAAAAAEDoCCIBAAAAAAAAhI4gEgAAAAAAAEDoCCIBAAAAAAAAhI4gEgAAAAAAAEDoCCIBAAAAAAAAhI4gEgAAAAAAAEDoCCIBAAAAAAAAhI4gEgAAAAAAAEDoCCIBAAAAAAAAhI4gEgAAAAAAAEDoCCIBAAAAAAAAhI4gEgAAAAAAAEDoCCIBAAAAAAAAhI4gEgAAAAAAAEDoCCIBAAAAAAAAhI4gEgAAAAAAAEDoCCIBAAAAAAAAhI4gEgAAAAAAAEDoCCIBAAAAAAAAhI4gEgAAAAAAAEDoCCIBAAAAAAAAhI4gEgAAAAAAAEDoCCIBAAAAAAAAhI4gEgAAAAAAAEDoCCIBAAAAAAAAhI4gEgAAAAAAAEDoCCIBAAAAAAAAhI4gEgAAAAAAAEDoCCIBAAAAAAAAhI4gEgAAAAAAAEDoCCIBAAAAAAAAhI4gEgAAAAAAAEDoCCIBAAAAAAAAhI4gEgAAAAAAAEDoCCIBAAAAAAAAhI4gEgAAAAAAAEDoCCIBAAAAAAAAhI4gEgAAAAAAAEDoCCIBAAAAAAAAhI
4gEgAAAAAAAEDoCCIBAAAAAAAAhI4gEgAAAAAAAEDoCCIBAAAAAAAAhI4gEgAAAAAAAEDoCCIBAAAAAAAAhI4gEgAAAAAAAEDoCCIBAAAAAAAAhI4gEgAAAAAAAEDoCCIBAAAAAAAAhI4gEgAAAAAAAEDoCCIBAAAAAAAAhI4gEgAAAAAAAEDoCCIBAAAAAAAAhI4gEgAAAAAAAEDoCCIBAAAAAAAAhI4gEgAAAAAAAEDoCCIBAAAAAAAAhI4gEgAAAAAAAEDoCCIBAAAAAAAAhI4gEgAAAAAAAEDoCCIBAAAAAAAAhI4gEgAAAAAAAEDoCCIBAAAAAAAAhI4gEgAAAAAAAEDoCCIBAAAAAAAAhI4gEgAAAAAAAEDoCCIBAAAAAAAAhI4gEgAAAAAAAEDoCCIBAAAAAAAAhI4gEgAAAAAAAEDoCCIBAAAAAAAAhI4gEgAAAAAAAEDoCCIBAAAAAAAAhI4gEgAAAAAAAEDoCCIBAAAAAAAAhI4gEgAAAAAAAEDoCCIBAAAAAAAAhI4gEgAAAAAAAEDoCCIBAAAAAAAAhI4gEgAAAAAAAEDoCCIBAAAAAAAAhI4gEgAAAAAAAEDoCCIBAAAAAAAAhI4gEgAAAAAAAEDoCCIBAAAAAAAAhI4gEgAAAAAAAEDoCCIBAAAAAAAAhI4gEgAAAAAAAEDoCCIBAAAAAAAAhI4gEgAAAAAAAEDoCCIBAAAAAAAAhI4gEgAAAAAAAEDoCCIBAAAAAAAAhI4gEgAAAAAAAEDoCCIBAAAAAAAAhI4gEgAAAAAAAEDoCCIBAAAAAAAAhI4gEgAAAAAAAEDoCCIBAAAAAAAAhI4gEgAAAAAAAEDoCCIBhGro0KGWK1cumzZtWo5o6W3bttm9995rX3/9dZqer+dp+UeNGhX6sgEAAAAAcCAjiARwUFEQOWDAgDQHkQAAAAAAIGsQRAIAAAAAAAAIHUEkgGzXu3dvK1KkiC1dutS6dOni/y5durT17dvX9uzZk/C8BQsWeLfoxx9/3J566imrWLGiFSxY0Jo3b24zZ85M9JotWrTwR6z3qlSpUsLr6X1EVZF6bT3UVTs99Hz93pw5c+yCCy6wYsWK+evefffdFhcXZ4sXL7YzzjjDDj30UCtXrpw98cQTiX5/165dds8999jxxx/vv1u4cGE75ZRTbMKECUnea+3atXbhhRf6ax122GHWq1cv++WXX/z91e090uzZs+2ss86yEiVKWIECBeyEE06wDz/8MF3rBgAAAABAWAgiAewTChzbtGljJUuW9KBR4aICu5dffjnJc4cPH27PPvus9enTx26//XYPIU899VRbuXJlut5TYeGgQYP83127drURI0b4o1u3bhlahx49etjevXvt4YcftpNOOskeeOABe/rpp61169ZWoUIFe+SRR6xq1aoesE6aNCnh9zZt2mSvvPKKB6d6joLN1atXe3vMmDEj4Xl67U6dOtnIkSM9gBw4cKAtX77c/x3t999/t0aNGtmsWbPstttu87ZUwKmg9/3338/Q+gEAAAAAkJUOydJXA4A02rFjhwd5qiKUK6+80ho0aGCvvvqqXXXVVYmeO2/ePJs7d66He9K2bVsP/hTiPfnkk2lucwVzqhjU6x977LFezZgZDRs2tJdeesn/ffnll3vl5c0332wPPfSQ3Xrrrf79c88918qXL2//+9//rFmzZv694sWLe3Vmvnz5El7rsssus5o1a9pzzz3nbSBjxoyxH374wcPN66+/3r+nZVfQGU0/P+qoo2zq1KmWP39+/97VV19tTZs29WVR8AoAAAAAwL5ERSSAfUbhYyR1T/7777+TPE9VfUEIGQSACiLHjh1r+9Kll16a8O88efJ4V2h1zb7kkksSvq/u1DVq1Ei0XnpuEEKq6nHdunX2zz//+O//9NNPCc8bN26c5c2b10PKQO7cub0yNJJ+/6uvvrLu3bvb5s2bbc2aNf5Qt25VWSrEVTd4AAAAAAD2JYJIAP
uExjAMxmsMqFJw/fr1SZ5brVq1JN+rXr26VxXuS6pAjKTxHrVepUqVSvL96PUaNmyYV2Xq+eqerrb45JNPbOPGjQnPWbhwoR1++OFWqFChRL+r7t7RFaMKQFVdqteJfPTv39+fs2rVqixbbwAAAAAAMoKu2QD2CVUFZiVN3qIwLlrk5DfZsQ7JrVfksr3++us+iY4qPfv162dlypTx31OX7r/++ivdy6GqStFYlKqAjCU6vAQAAAAAILsRRALI8dS1OJpmrA5mww6qKWN161ZVYXRgua+NGjXKjj76aBs9enSi5QmqFwOaJVwzaW/bti1RVaQqICPptUTduFu1ahX68gMAAAAAkBF0zQaQ42nSlsgxDqdMmWKTJ0+2du3aJXyvSpUqNnv2bJ99OvDLL7/Yd999l+i1gkBvw4YNtq8EVZORVZJaH01ME0nVjbt377YhQ4Ykqn584YUXEj1PFZWagVsT52hW7WiRbQIAAAAAwL5CRSSAHE/dijX7s2aM3rlzp88irXEVb7nlloTnXHzxxT6DtsI7TRajMREHDx5stWvXtk2bNiU8r2DBglarVi17++23fZzJEiVKWJ06dfyRXTp27OjVkJrJukOHDjZ//nxfVi3Xli1bEp6nrtuamEczcasKUrNqf/jhhz45jURWUyqcVBvVrVvXJ7dRleTKlSs93FyyZImHsgAAAAAA7EtURALI8Xr27GnXXnutPf/88zZw4EAPFzVLtCZyCRxzzDE2fPhwn+zlpptu8sBuxIgR1qBBgySv98orr/gs3DfeeKOde+653lU6O2l8yAcffNDDweuuu84+++wzHzdSs2ZHV05qApsePXr45DZ33nmnlS9fPqEiUhPdBBRiTps2zYPNoUOH+szaCjc1y/Y999yTresHAAAAAEAsueJize4AADmAZsWuXLmyPfbYYz4RC/7rqq5qym+//daaNGlCswAAAAAA9gtURAJADrZ9+/Yks4A/99xzduihh8as9gQAAAAAIKdijEgAyMHUJV1hZOPGjX18TI0t+f3333vXbo13CQAAAADA/oIgEgBysFNPPdWeeOIJ+/jjj23Hjh0+cY8qIq+55pp9vWgAAAAAAKQLY0QCAAAAAAAACB1jRAIAAAAAAAAIHUEkAAAAAAAAgNARRAIAAAAAAADIOZPV7Nm9LdwlAQAAQI6SJ2+hfb0IAAAAOIBQEQkAAAAAAAAgdASRAAAAAAAAAEJHEAkAAAAAAAAgdASRAAAAAAAAAEJHEAkAAAAAAAAgdASRAAAAAAAAAEJHEAkAAAAAAAAgdASRAAAAAAAAAEJHEAkAAAAAAAAgdASRAAAAAAAAAEJHEAkAAAAAAAAgdASRAAAAAAAAAEJHEAkAAAAAAAAgdASRAAAAAAAAAEJHEAkAAAAAAAAgdASRAAAAAAAAAEJHEAkAAAAAAAAgdIeE/xYAAABAuObMmWPjx4+3SZMm2tq1a2z37t00OQAAyDa5cuWywoWLWI0aNaxVq9bWsmVLK1SoEFsgSq64uLg4S4M9u7el5WkAAAA4QOTJu3/88Tx69Gh78MEHrFixIta8+Ul25JHlLV++fPt6sQAAwEFk7969tmXLVps27Vf75ZdZVrVqdRs0aLAVL158Xy9ajkIQiRwryMj1/7ki7jBg/xJ9r2N/34YH6n6ZVdvpQNveCF+s+6HsN9nT1pHtrJ/99MtfNuPXv6xLx8ZWssSh+00Q+fXXX1vfvjdZjx4d7KabrrA8efLs60UCAAAHuXnz5tvVV99hpUodbq+//oblzs3IiAFaAjnW9h277NnBH9rpne+0R58eZVu37tjXi4R02rNnr0387jfrcu79dv4lj9mSpWv2+zY8UPfLeX8vt7N7Pmgvvfap7dm7N0OvcSBub4RL4df8hSvtxttetnbd7rGx46fFDCaReakdnwsXr7J7H3zDnnp+jM2es2S/avJRo961evVqWt++VxFCAgCAHKFq1co2cOCtNmfOn/bbb7/t68
U5eINIXVxEPoCU/PrbfHv6hTE2c9ZCm/TdTNu0+cAcHiD6uEjvsZGTj6tVqzfYk8+NscnT/rTvfvzDFi1ZnePaOrnHwbRfan1ff+sr++7HWfbikE9s547dB8T2Rs46pmIdXwrH3ho10d4ePclm/Pa3/TRjnu3dm7POYweKlI5PbZNfZ873R55DclvevPtPReGmTZtsypTJ1q5dSyppAQBAjnL88cdaqVKH2ZdffrmvF+Xgnaxmx45d9vSLH9jMWQvs6YevsNKlitnBQH/gq/JA63905cMtT57U898dO3fZosWrrWjRglauTPGD8o/rggXjx3bSuterU9nKlD4s0+2UE9tVF91vvD3BXn/7K2vb+gS7+tIOVqBA2se1UhD21Atj7M85S+yma7vaCcdVyxHrJXnzHmJ5D4nf3yuUL2l1jqm4T5bjnz17bMPGrbZmzSZbtGSVh4jz56+wpcvX2voNW/zY1IV3rZoVvUti8yZ1rGDB/BneL/c32odUAaVz1fbtO/f77b0vbNi4xZavWGdHVChtRQoXyDHHYHZUM/42c76VKHGo5c13iG1Yv8WWrVxrK1ZusOXL19qqNRttzdqNtnXrTsuXP681bVTL7uzXwwoXKuCvoXYqqPNdnFmRIgWt4QnV0/QZiaw9PhUIT/tpru3ctdvKljnMypbef8YxWr58uY/HVKtW9X29KAAAAImoO3atWtVs8eJFtMy+CiL/mL3YXvrfWA9efv7lLzv9tAZ2MNi8Zbtdft1ztnrtRhv56i1WrWqFVC/ufpz6p91w60vW+MSa9vCAi6xYscJ2sFE71a1dyfLny2sXXdjaDjkkT6baKae26+7d/9g7739jv/6+wC/aTzm5dprCRK3Ptu07bdArn9iQoeP8QlIBa4N6VS1PnpwRgpQsUdQaNaxpa9ZusuuvPsOKFC2Ybe+t9lGbKHT87Ivp9vOvf9nvsxb5siRX8Thn3jL7fMLPdvM1Xe3inqdbgfz50r1f7o8U0iqQlRLFi1ru3Ln2u+29LynoHvTKWBv2xhd2+83d7cJzTz0ogsiNm7bZ/Y+MtE8/n5bm39m0catdf1XnhCBS+1rDE2rY0ZXLWbvTT7STTqhxULTdvpDS8al9+JeZ8/3fR1YobWXL7j83WHbs0PAYcVagQOybRzh43Hvv4zZgwJM2f/5kq1TpyH29OAeU/a1tc+Uqb716dbehQ5+2nKh37xts2LB3LC5uWbb8Xk739dffW8uWZ9lrrz1lvXv32NeLc0ChbdNnwYLFVrnySda//0127719k/y8bdvz7LPPvk74OrnnRdPfKOvWbU/n0hzYsjWI/HPeEvtnz17Lkzu3HXbYvg+Assv0n+fZ77MXWr68h9ghaejupPDkywk/24qV623z1u2W+yCtDlGVzP9evNGrYw6LERimt51yarsqFFMQJCtXrbcJk361BvWqpDrO1T//7LERI7+yIcM+83XTtXvxw4pkOEQKgwKFqy7pYL3Oa2XFixf1Yz+7bN++ywY+/pZ9OHayrVu/2dtIlaZVKpez+scebXVrV7Zjqh/pbZYv3yH259wl9uygDz241DiJx9Wrao1OTBqKpLZf7o+0/23cFL8PVqpYNsMDKe/L7b0vLV++ziZ++5tt3Lwtw+Nr7o90DKiCLlCmdDE76ogyVq9uZatR7QirUqmclSpVzPcD3YhbvGSNVT368ES9IbTPnNigmo1+4y47tGihZCuRkXkpHZ87tu+yOfOW+jZtUL+q32jZ3yQXYAcXYck5EC7on356iB122KFpuoBXoFS/fm3r0qWdHayGDn3bNmzYZDfccFmWtm1Widxnx41709q0aZHws9tuG2iPPPKC/3v58hlWrlwZO1BFH7v620TrW69eLevb90o79dSm+3T5kLONGfOpzZjxe5pCov2FAvbk9OjR2d56a3C2Ls/B6JZbrrYLLjjT1qxZZzfe2D/Nv8dN9n0YRCpsWbBopXef0YV/9VSqAg8UXoU3bbaHRu
oGdXi5Eqn+zu5/9tiP0/70D9zjjq1iRYscHFVFsQ7YUiXjZ+3MinbKqe2qCuF16zf5v1WoN3b8VOtzeUcrXChPivuVxid8ZtAHPlmKwke1l7qF5rQTnbpb6pHd1q7fZO+O+dY2bYofw1Hb/byzm9s1V3S2w8sm7Zavc5K+d12/wbZy1Qabv3CFB5Hp3S/3N9qXNm36L4hUOJuZrrH7anvvy/bTWHt6aL+oVfOog2ZGPHVB92NJIWSZw+yFJ672au78+fMmOb6qVilvjRvGfp18+fJa2TL7T1fg/Vlyx6e62KvCXhXejU6smeM+RzLjmGOq2YgRz/m/P/74c3v77Q/tjjuu8+8fKBSWqUotLWGZqtpUKXYgBpF33XWD3XbbNZY/f8o3NIYOfcerXtIaRKa1bbNaoUIF7a23xiQKIrX/6vvbtm3PkW0bBgUsHTu29mvIFStW2WuvvW2tWvWwDz54zTp1Oj3m72zf/vcBOXHVkCGP2eDBD9uBplmzRr7N8ubNuptgY8Z85tWjB1IQKcceW8v69bsqyfcrVToi29r2QFax4hHeXoccEjsmC26A6DMkPUEk9mEQqSDu7/krPHRRtVehHFD1sGv3P7Z69QbbtGW7Vwbogkp/oGflH+Dqlq3B37XeLZsda3nT0I1z6bI19tf85V6l1aJpXcuOC2mFWavXbvLx4VTxdcQRpZNdVu8SvG2nrVi13sM9tZsqWbL7wiW97ZTd7ZpWqhTatn2X/1sBhi4IVUXbrEmdZNv/l9/m24OPv21btm63bp2b2M+/zLMly9ZYxSNLZ/PS51wK/T97/357dfh4+9+I8b4/d27fyMqncDNA+7OO1UKF8lupEvt/2Kg/2tXlevWaTT4Gprpcav+PpufoXKUAUt3OGaMv7XQ8/jF7ka3fsNlOOqGmHV2pnB0sVGWsAFsDHZxych3vVq1xCHOSnTt3+yQtW7bu8GNA5wX9/ZGTg7Y9e/bYuvVbvJJ7b1yclS5ZzLtVh7nMqgjX9ixXtrjVOMBuFJctW9qrF2TJkmUe5LRu3cxatDh5Xy8aspguHJO7eNwfdejQysOUl17aZfny5bMff5xuy5at9ImZPvjgs4OmbVXBGxzDcvbZnezooxt5IJlcEFmgQPzwHwcahUkHYqCk658DdZtltcMPL5PoeEgNbZs++luLfTF7ZFvZxtp1m73bqZzYQIPRp1ztpQvjseOn2dMvjvGJOFQl5hc8mZwVOJixc+5fy+zJ50ZbzyuesPZn3mOdewywF1/5xP8QT+53VCGlCUUef/Y9G/zqWPtx6my/0E9umfT9hYtW2fwFK7367vjjqvnJIKUZRPXv73+cZbt2/eODydesfmSqs45mZvkUxn7zw+92a//XrHuvh6xtt7utR++H7YfJs5L9HVX+3DPwdTvzgoHW8az+dvcDI2zrNo3RlHTsQ1Wkvf3eJA+iI19Dz9dYfE8+/76HRNo3guVMy+zF6W2njLRr8Hta9h+mzPLuumrb4SO/tIWLV2XZLNUe0O/Z6918FdJrGTXmWnL7osZle+Spd+2PPxdZk0a17MY+Z9j2Hbt83zryiNJp7Aq+xd9Dx9crwz7zC95Y66LvTftpjo9BuWz52kTtqcBOs9wOevUTe+6lD+2nXxLPdpvW7RhW+x6SJ49VrljOates6G2jYRE0WVTybbLVZ45WV9OGx1f37tuRF/7pnZ08ve2c1bOe69if8M2vdm2/wdbh7P7W7fwHbNSYbxMdi8F7L1uxLj6EKFPcKlcsm6HAIztnbw/jfYLXU+W+zqNPPDfaxn0x3YOslN5H4fW3P/yuP108xC1VsliOOK9kdr3SQp/Ja9fHjy16fH2NTZu+PylSayNNJqXjZsr0OUk+H/SZMeKt+PXRMbZt244kn4v6THz0mVF23iWPWrtud1vX8x7w4Rc0RExKy5IRyX2epPRI7ibfsDe/tCuuf846dr/X/z655e5XbdG/+0ZGpdbW8/5e5ueM+nWP9grVg9
myZSt8HLayZY+1fPkqWrVqTez++5+yf/75J1OvO2nSj9aqVXcrXVqToR1tVao0tksvvdk2bdqc5LmffvqVNWnS2QoXrmJFilS15s272VdffZuki7W66emxcOESmzjxh4Sv9VD344DWJ/i+qEoo8rnqAhtp165ddvfdj/oy5s9fyQ4/vL4v66pVa5Is65o1a61Pn9utUqWGVqBAZW+3Nm3O9eWJ5fff/7Rzz73KX1PPr1r1ZLvppntt7dp1iZ6nahMtm9ZT637yyZ2sUKGjrVSp2tar1/UJz9uyZWuiddFDvxtN7RH8XMumNov8Hb1PRto2OIaeeuplq1WrubeXlrFHjyvs778XWmYoMNffL9of5K23PrA2bZrbYYcVS7LP3nzzAKtXr5UVK1bD2+m441rboEHDYr5uVrdtevbbrFC0aOxhcfr2HZBoWbXfx7J58xa7665HrGHD9layZG3fD485ppkNHPiM7d7933AjkfvN2LFf2jXX3OHHb4kStXwf3rgxvjdTtO+/n2qdO/fy9tSxrv2if//H/h3PNjG1Z4cOF1rRotXsqKNO8G2ic3GkmTNnJ9kOOZHaTu3ZunXS6mEdI1q/xo07Jfr+CSe0TfHYij4mp0//1c4/v48ddlhNK1eunp97du78b5JFncuC19J5TlLbJ8LYb198cai3Q/nyx/k54cgjj7crr7zVVq5cbdklrW2bnnOCzJ+/yC688Fo/12vdtH/HOte0aHGmfy7MmDHTmjXr6u2r873aJlrkZ1T0I/LcHBg9eqwvo15Tx47GbdT7ZMZZZ12W6vsia2Xbra216zZ5GKkB6mvWODLZcewUwuji7MEn3rG/5y9PCLj0e7rQeeT+i+2oIzLW/dQDsK07bMwnP9qzgz+wZcvXeUBYvlxJDz6nz5jnF+PRk0+oUkiTELz+zgRbuXK97dz1T8L4cBeec6pdfVlH76ImW7ZstyXL1noVQ9zevfbh2B9t8dL4k07fO16x/BGzIasJFMpe0rONv54uDeKX7we/SNVM26d1viPRuubLm8cuPOc0u+iC1gkTkqR1+aJDleUr19tTz79vn3w21WfNLVniUDuifCkfyF4VPtEVefqd335f4MGjxtHTbMF67zEf/+ATpPQ677RE76GLwzsHDPMqlDatjvfl0WtovQY8/KZN+n6mt5faW+OrPXzfRXZ42RJ+kaiLy6nT59jdt5zrgZB+T2M7ajultZ26dGhsrU89znLlzp2hdo2/OFtuzw7+0L6cOMM2b95m//yz1yvKKh1V1m654Uxr1/qETFepqK01rtxx9apYhzYn2q+/z7dpP8+1xUtW+3h9kXbs3G0vvPyRh8cai63/bed7BdLGjVs9eNP2S4nCh+8m/2GPPfOeX3iqXdT+2p6XX9TWqlepkGj/1/pre2uSF82k2qndSd4u2m7DR35lLw/91I9rfU8h1otP9fGqKFEIppsI3TqdbD3PPS3JBDrZ1b6qVNN7qRKq6L/HafRyaB0U7k6e9qeVP7yk9bv+rETj2Ems/TIr2tnD3p/nejvcfG03DwIySzcYPvp0st374Bt+TlN157IVa+35lz+yOrUqemAWtKt+Pn/BCv93xaPK2FFHlkky1uaKlet8H1VFpbrdxpLa9s4qYbSXKIhWsPXmu1/bihXr/DyqKrSbrulqbVsdb+XKlvDPLbXX3wtWeBvLkiWr7fvJs3y5FPR+8fWMhNdUCxQ7tLBPjKVxE7Nzv0/vemWEqvbWr9/sn6MZCbB1U+v+R0dayeJFfVtGzkCvc7VuFr77/rd+3jm+fvy4ubrZMWHSL/bYs6PtzzmLfX30/ldd0t6uvbKzH1faRpqc6pa7/+fHn47lww8vaQsWrrTX3vjczuvewipG7OcKk994e4L3XtBkQ9HHflrob5cbbn3Ztm7bbmd0bOzbUh9WGo9YY4iqIv/PeUtt4eKV/ll7z23n+TAI0TcEdB76+NMpfpyVLX2Yz8T+2Rc/WYtTjrULerTM8H
6R0vGp9lIIHvRYyfPvzNoHI4WCTZt2scWLl1mfPr2tevWjbdy4CXbPPY/ZX38tzPDEF3/+Oc8vlCpXPspuv/1aK1q0iIdU7777sa1bt8EOPbRownM//PAz69r1EqtSpZIPgq9Q4qWXXrfTTz/Xxo17w1q1aubP69atvVWtGn9eUfewUqVK2J13/nfBePLJJyT8+4orLrBWrU7xf+sC8pRTTrLLL78g4efR3dTPO6+PvffeJ/4eN998hf3xxxwbPHiEff/9NJs27VMrVKhQwnO7d7/Sfvhhul133cXeXqtXr7VJkyZ78Nq8eeNEr/vNN5O9HVTRddll51mNGlW8wm/48FHWufPpMatUf/rpN3v88cHWq9fZ1rPn2d4199tvpySaACDoeq+L0/ff/zTZ7onB8xQ4aXyvp54akPDzY489JuHf6Wlb6dfvPnviiZe8y97VV/eyJUuW27PPvmoTJ/5oM2Z8nuFxHPX3Xbdu7byKV5V/2l8effQu+/zzSYme9+uvs+x//3vLzjnnDH9/hebjx0+0q6++3cPj/v1vjvn6WdW26dlvM0Ld0BV46xyl0HXgwGf9XHjeeV0TPU9f169fJ2E/T87Spcs9OD7rrI6+7qr0/O67qR6+z5s33157LelxrrBcx8l99/WzKVN+9u79MnLkoETPe/vtD+z886+xMmVK2ZVX9rTKlY/0c8fLL79hF110TpKJfjp27OnHZseOrTxo1tAJhx9e1q644sKE5xx5ZPmE7fDyy6/7cZQT6bg+88z2vi9qv1MbRIazOq9Gdym+//5bbO3a9TZr1lx78MFnU32PCy64xho1amAPP3yHn5tffHGYnz8feuiOJENxBG0VfC1VqlTMlv32gQee8bFMdf7UGLN//vmXDRo03CZM+M5mzpyQqapWFfroeIhWokTxREMDpbdt03JO0GfZySef4X83XnVVT6tQ4XA/1+tco/fSEA6RNm/eal26XGznn9/Nh1jQcdCnzx3e3pFDTkR+RgU++eQLPyYi9yN5/PFB1q/f/da0aUN76KHb/e/FV18daaec0tWmTh1rNWtmbNiV66+/1Lp0aZvusR+xHwSRGqdNYVfpUof6H9ex/pjWwf/p51Ot/4Nv2Nq1m3zAdIWPGrdIYdnE736zgY+95Rd1mtk1vXbv3mNPvjDGhr7xuXfFPqN9I+va+WQfL3DmHwv8D3+FZomWe/M2e+bFD+yl/31qBQrm8wtEdVueM3eJj9E3+H+f+kWFAiwdlPpjv8/Ng/wkEU0XGZEOyZPbalQ70h547C0P4mKFGbpwiqQLrsjXTs/yRQZsqo65877hPjHKkRVK2UXnt7LTWh5nR1QoZd/98LvPYhrJg8BV6+2uB0b4jOc3XdPFzj27hT3/0kdesfLVxBnWvesp3qU1eP7Un+Z4UKkLvvz5DvHv6WLntv5DPWxWNV+ZUsXs08+n23c//mGz5yzxIFLdw4e/+aVX23z74+8e+OhCSdt+1Affpamd1MVelUaPPD0qQ+0atNFNt7/soUeliuWsY5uGHugqdPhl5t8e0KniTmPCZZROnr8riNyz146tXclOOK66HV3pcJv31zKb+cdCD4aCY0XHh2Z/fuOdCb4c/W4402pUq+DduPXHWamSRRMC8ZjvtWePV11pXEmFS82b1rEKh5eyTz6bYm+NmugVqh++dY9ViagaVHdvhSZ58x2SMK6YLtqfGfShV7Jq+Tq1Pcm+mvSLdylX+KUgUqHVex9856GdukSfd3aLRNVS2dW+QeCu9ql8VNkkFVtBwP3Ao2/Zh59O9jB1wJ0X2HFR1ZASa7/MinbWPvDGO1/bl1/PsGpVyicK1rR8mkBC5wcFoa1a1E91fF2t69ff/OqzGesGzl23nGPH1DjS+tz0orerqljr1FIQ+d/zFyxa5eur1w7OrUHb6Ph+78PvPGi59/bz7eyuTZO0TWrbOytldXuJzm2PPj3Kq7c14YrObboJpnW6875hfn566uHLfQiKzV
u2Wdfz7vcbNtF0oySYfVzix20tZXFxe7N1v8/IemVE/Ppu9bExVQmanpBMbaHgb9zn0z34O7vrKUmCyB+mzPbzjSa3C3oTaNvefOcrfh4/q0tTv6Go/f2d97+xKy9p7/udPs/73fWq/fX3cuvYrqFdf9UZPlFVi3a3+n4cXRW8cNFKG/bmF/5aXTo1TggivXJ1z17/LP1j9kIrXeowO/3U45L/+yNXnE39ea6H0ZHV4dGOKF/SJ4eJpO2iY1bnUN0suLFPVzvphOp2Td/Bvq9rfTMqteNTVfa6GaOfKUDNnYO7rYdNVR2q9hg06GEPEeTqq3tbp049vbpGF5V16/4XWKWVZtjcvn2HjRjxrDVocGzC9x988PYk1U+33jrQxwD87rsPrHTpkv69Cy88y6pUOdl/Nn16s4RxwvQQVXdFdkGP1rixKpFOSAhojj66YrLPnTz5Jw8hoyc+UIjat+99fsF37bWX+PdUEaYL62uvvdgeeeSuhOfedtu1SSpItZ6XXBIfiE2fPs4vRP9b5z5efRfLRx99bhMnjvYgMXL4goBCpGBdFCIlF5ZpnfWQV15507dHcm2QnrZVMPb00694ODl+/MiEHl8nnljfK2wee2yQPfFExi9qFS7qQl77kEJrBbbRQWTDhvVtyZLpVrjwf+fyPn0u8souLdvdd98Yc+zirGrb9Oy3GXH//U/7I6DXf/fdl+3MMzskep6OreD4SimIPPLICt5exYv/95mj4E+fYSNGvGePPnp3wjoEFEy///7//N9XXdXL5s9f7N3mtZ8HXdZVaXnllbd58DxjxngrVapkon08Vtf23r27W9++8eGc2rpChQY2atTHiYLIYsUOTdgOX3zxTY4NIoMweMiQN/wconYKKEzXsdG9e+dEz2/X7tSESsa0hGWnndbUnn/+Qf+3bqaoYn3UqE8SgsjIYzVoq5S6MYe1306e/LHvZ5EUkl588U1+3OlmR0apYrF06aRDjC1ePM2OOKJ8hts2LeeEa6+9yytfdYMlOJ9qXy1SpLC/xzXXXJSoYnvduvX20kuPeOgv7duf5pX22scjg8jIzyjRzS+Fm6oKjzwWVKF+++0P+bAVH388POH7l156nldb3nffU/bmmy9aRugGnR6M/XiABZH6Y16BlKqoVFGjyWpiPUfBy70PvWkbNmzxyobe57fyP/h1AJxycm27/d6h9uXEX/zipP3p6asY0et/NG6yDXvjc79oevjei6zhCdX9Ql2v0zzGmIEKh3TB+9rrn3t33rtvO9daNKnrs3rqYvT5lz727txDho2zbp1P9vXSxaOWTUFXrty5/EJLvaAUMqlrtsYP0xhMutBRhaBCwinT/7Qd23farn/2+Lor0FGg1+iEmnZsncr+3qrmKnFYEX8PTQoQVOakZ/mCiycFbk89P8YvrrSsqkTRBX0wtlfXTknvSKui68nn3vcQ8pwzm9nlF7XzcbZUpaGLXHUN3rVrd6IgcuWqjf5fdXNVlYrG6HroiXe8q2+fyzraJT1Pt4IF8lv9Y6t418ajKsR3K17jYybu9AulOsfE/6Gqf3do09CrBNPSTocdVsR++e1vG/r65+lu16BC7oFH37affvnLmjSq7eG3Lh61HuqifOWNL3hbDH3jCxt4T88Mj4umscsUzup1a9Ws6DM6H1fvaPtz7mIPrNq0auCvrWWa+/cye/ipdz181synqp7UH5bLV66zvXF7rXy5xHfCIuni++33JnqFnl7voQG9vRqqQP58Xv3y8muf2jZV7UWFhbqBoG1fvVoFD6y1z73z3iQbNvILv1Hw6P0Xe7t0aHuiDXplrJ3cMP4CTd0LVQUtmqQpOoTMrvZVGKAwTe8ZORu0LohUKaZzyYtDPvZwTueFJx+6zE4+qVbMdoy1X2ZFO+/ZG+fdLrWsS5fFd3/XOUmBpqqNNXzB4sWr/ftjP5tqjw+8xGckTu78t2Tpag80dL69/64LrW3r4z1YOP64qvbLzPm2efN2i/
O64n8rIvfu9XBMy6QK7eAYUCXWzXe84semwiBR2HNq83pJJutJaXtntaxuL3WfH/DQG/bJuKk+0YqC22NrV/ZtrQrwWX8uThj6QHTeO/es5n7+VHd/hV3al1QxrPOhQkQdKzrHlyhR1IodWsjHJc3O/T4j65VeWh9VG+t9ypc7MskM8t4DYdtOr/gMxmdUxXZkJbCOTYWCajs9om8crl6z0Q45JH6/lBm//m0PPDrSQ8UH+/f2fVH7uT5X1M56bd0oGPj42z78ij6rVOF4WLEiPryLQsW6tSr5Da9EbbVhq29DnVtVXR50Xdb3dENBn3E6F+bLf4h/Xj/Yv5cf05FU0artqdfa/c8//rkf2RYKR2+64xW/CaoJsxQ2/reu//jnuJax6tHl7ZlHr7TqVcr7Oug41L5U+5iKGa6GTO34DAJ0zaStYzsnj58ZNgU8CgtUDRLpssvOt48//sJ/npEgMqh41IWxKraC405tHTlUkcawnD17nlePRAYhqjpp3/5UD4LUhblkydQnPsyoIOTShV0kfa0gUj8PgkhVzGnswmnTfvELTlXkBKJDl59/nmlz5/7tlaaRIaRoApTkJkFRlWHkRbHkpElIvvrqO79OueiiHomWq2vXdl5JGR0appeqRBUw6qJc+4CqaaNFtru6qSrU1XlHYyuqWklVqgpowmrbsPfbSy451wNZ/e2malNV3Km76KGHFrHWrZun+/XUnkFoq2EI1F567bp1a/q2VLVydBCp94904on1vMu+Kv/Kl48fG1pVqBs2bLQ777wuUQgZhImxRL6utq2qhBcuXGr7K+1P2u4KHoMgUm2rsLBly5Nj7ofpEdleOo82aFA3w+OlhrnfBiGk9icF1Aqsa9as6t/TeTAzTjihnlcCRoveZzMipXOC2kKfYaoa1LEXWZXZpMmJXoE6efLPiQJGfT5Ehq66oVWyZHEPFJOjG1y6+aLnjRz5YqJzkgJutaU+o6OrQo8//libMCHxMCPI2bKtInLdhs3erVSVDdEXLJF/8GvcpzM7N7Hrruqc8Ie+Li5Oa17fGjc8xj745Ee/MD79tAZpmvglstrgg49/9GXoc3kna3FK3VQvwOb+tdQG/Ttu5K03nmXtWyv4if8DXRc2l/ZqYyNHfe3drn6aMc9atTzOLyJeeib+LpxCyE49BvjlvgIOXUjE0v70E/2h91GX8UeeGmUNjq1izz9xVYqVn3P/WpLu5RNVfXz6xTS/oFb3sMguarHogl8VHh+M/dEvjq64OD6E1B/QNasdYfXqHu0XukWL/jcTpy48g1l49XNdlI0cNdEr+rp3a2bXXdnZLyZFgbMeAYW4GjOxePEiCd0Z9V7tTj/BH2ltJ4W+ClXT2656vgKNL77+2aupHrj7Qg8Ig4szVZz2Pu80++33+R5gqT1VGZgRi5eu8S57qgxVIKHgoe1px9uoMd95da0uDlUltHL1Bhv46FteudO0cW3vbq8QV3Qxr21UNsZM0MH2UzXx48+N9v3j9pvO9mMsuBjVRa5+r1ixwlYqoktiEBRomAEFturCqXEgn3vpI9/PH7q3V0J3TB2begTUXXP9+i2WP19eO/F4jQmbO9HyZFf76gaIghLxSp/cufxrnUfGf/mTffP97wldbHW+UVCnZSoXoy1j7ZdZ0c46OBRgBm2uY0VvPWXaHO9arYDlgnNO9WNw+oy5PuRBtSoVYnZ9Vsj0wpBPfFKmHt2aeQiqLvvS5rTj7etvfrO6tSsmqnhS8LJ6tQKfPAlVngot7ntkpH374x/W+KRjrHDhAvb5Vz/boiWrbOnyNUmCyJS2d5bLwvbasWOXV5N/+Mlkq1rlcHvkvoutZvX/Zh3UeUK/pUrZwv/eZNH4eXf2O8fu7BffTpdd+6xXM2qIDd1gSW7ds/O8kpH1Si+1u2aXV6WeuhjpfL9qzQavFFWAqJss8/5abrPmLPau/wq5Xnn+Oh/Gw9tDlbj/VqarPaLDQb2WAjr9AaqqVx3LOr
YWLV1jTz98uVf/6zjT5Gr6fA3aWDfnVOmpm4IaYkHrqkrAx58d7et6We82CZ89Af1e/B3//8ZP1Pc0Nq4qJTWLtGb1fm/Mt/b+Rz941+boIQH82D60sD8St1Ocd+XXuI9qtF7nt/Ku4UHIrJ8rmB76+hdemX/bTWd7e+h40lAQqvC9oMepfs7JaECY2vG5ceMWPw+UKF7Et9PBTF0HdaFcsGDimcV18SSLFi3N8MXzsGHveoXNo4++aCed1MAv3Hr2PCtRBYveX6K7b0Yug54TZhCZ3DIoSFHgFdkGCg8ff/xuu+mmAVa2bD0PBVSdpwvPli2bJPp9VdRJnTo107U8tWvH34jIqZJrL11jHHVUBe+Wmxk6B551Vgd74YWh9sgjd8Z8ji7MH374eZ+8Jda4lDt2/DeGXhhtG/Z+W7VqpURdZNVltH791nbZZf1s3rzvMzSJjsapU7dedVuNrkqO1V6aICTSf0Hm7kzt4+qGHf26Co73V9rvVU2t2eZVLayQVoHt8uUr7YEHbsn068dqr+hxPXPCfquqxXvvfcKDOYXdaTke00oBXWaGOkhJSueEefMW+N8sCmiTq46OHke4TJmSSfIWbbPI4yaSXl9DGygk/v77D5O0/dy58cdY9+5XxPz9jN5cxwEcROoCXePyiQKXQlHdR/f+291KF8mqJrm0V1v/gzmSLhw0npLGTtK4UvEfGmkPIvVHuC6M/KI1Dc/fsXOXvTx0nF9QqWqwbav4i55IClQVdqmCRcsU7c958bNQagIRHy8qFXrun3OX+sVGjepHeJgYxvLNX7TSLx4VvOSy+OqnlC5wdIGp8ElVFed3b+nbKHh++cNL2POPX2mFChVIdMdCVVeqhFGX7ArlS3ggoIu6k06s4SFzcuPMyWzN3rl3r3fLj3WRnJ52ysjzVfmmyjXvRnTh6R5iRLaPQpyKR5W1EocV9VBrw6atFjtiTv24UGCkbajxITWGmahrswICVTp+PmGGdW5/kk+wMOGb3zwAvPWGsxOF+ep+qW2ooCvWWG8rVq3zGbYVEFx1aQc7s0vThAtRraOCUC2LugoHoVWwfBobUBTM6bmq9NNM3U89fJlVqVw+2f1G66TZwFV1qn0w8nnZ1b7BvquHLvpVlab3UVfjQa+O9UosVawpVNcHosLex595z76a+Iuvn47ZyOVKbb/MaDvH/Xs86700O67eUl3in3x+tH+tqkoNH6GgWmPnTvx2pl19adLAK5iQSV07NY6uxoeNHO+zccOa9vbQ2xIC0YC63quaUFV8CoMU8g0Z+pl99uV0O+uMJnbLDWd5Jd/cefHVWWvXxo+5GfkaKW3vrJZV7aVtMf6rn736UPvB7Tf3SNSNWwG1qhRz58ltVY8+POY6+Xqv2ehdq2vXPCrFsRaz87yS2fVK2/vs9Qp10bi2mvhNweHOnf/4hE86prQsandV6R9atKAfawm/v2evH1PaR086vkaSsZmDmyDFixX28F/DIvw4ZbZdfnFb7x4d3dZB4KcuyPr7QTfMdHNMAfo9A0fYug1b7Lorz7C2/waYkVTxq3E6CxUskDAEhSZ7Uxu2btnAq1NV8bh06Rr7fsos++b7mWkem1Tng3seGOHd7hVAXndlp0Q3UVX5qZt0Ot+re3qzk+vYrNmLfAIj/U3UstmxPhSKem9kVGrHpypB9dAwAhntpo+UaQbOr75612c9VgWdujNrPDqFkurCV6NGfEC/P1J1pKr/Pv10go8l9uabY+z551/zMdxuvfWaTL9+iRKJq6UPRv36XW2NGh1vnTq1jvlzTVSjMSnVVVljGJYurbF/c3sw+eab7yc70dX+2rYKHtu2beHjcv7114J0Hz8aH1JjPqr6S8MtKNzSZ9DYsV/5z2K1V1hVuDmpujcru2c/+eRLPqapxt1TdaRuWmSmO/L+1F5Tp87wMSbVdVljumpcSlWPL10aPxFa2BM6ZkZazgmqINWQArHUrl0jU9tLkzppbMjXX38+Yb
zXWIYNe8bKl089W0HOli1BpC7e1S02flbfMknGH9q96x97S917t++001rUt9rHHBXz4sgvVHKZ5cmt+Cx9Chcu6N1vdfGk7sqqblBVZHTgGVC3NXW90x/l6oZXsGDiCgqX69+uJ961JmkooO5vCsF0waIx9lKjiozf/ljgXSQVTKV0UZuZ5dMYnVpvVZj0f+h163fdmT6GXHIXpOre+OfcJX4Bo/GzIl9L21SVctG279jp3f50UacQ+dXh473i4qZruqU4oYouXDVWpNpN41RGX5ymt50y8vwPx0722Vp1kd6p/UkxK5zUvU7rrn05g/M8+HJNmfanr7OqUoNxyQoXKWitWtb3rnoj3vrSK2s1LqQuhPtdf6Z3iQ62lfYzVbPpNdTNP3obKhBQZZS2X706R/uEDqogCihYUndHvY6qGyPbJr7r5Rb/noJRjdM69ae5dvWlHXzbJNeO3i1/dXxllKokdXG7L9pX1CVfXSUP+zfMUPto3EDtV6qUUuirtteF+mdf/uTdeidPm20PPv6OdzsPhpFIbb/MbDtv2rzdv6eJYvRe7435zn7+5W/vVqxl1HK3bF7fHn3mPX8P3UiIDvMVAmlCEq2zhmKInJBGFMZGb4ug8luhkMaN1Hppkp1XR4z36uE7+/Xwsf90DKvrrCb+UKCWnu2d1bKqvdat3+SzmWt8vGuu6OiBT+R2mfHrX76faj+sXKlczOVQWKYbAZrgSOudUqiXXft9ZtcrrdTuqugOKjDVTV7VqLppoc8EVc3qnKZwWMH4cfWrWIV/b7YEgaiq/bTPaZiUWDcP1cb6rNYNG4V1On/3OrdVzFmdPYif/IePj1zv2KP98+qxp0f57ykAvf2m7knC+cDOXbts585dPqRH8WJFfF9+YcjHvs/ffG1XXxetr4Zw0URh+vzVuSC1yl8dK+pKPun73+2MDifZbTd1TxL0LV+x1sZ9Ps3yHnKIdWzb0Ct4tf0WLl5tF57T0sPTjEyek9bjUz/XvqIbjfq8Tmmc4YOBJoXQBCPbt29PVBWpcSNFFW4Z5b0H/h0HSxOffPHFJGvd+hwfc1Fj0gXvL7FmJg6WIXhO9GtnlchlCCZsCbrLqft1kyaJJ2oRVXWq+7oe6oZ40kkdPCSKDCKD19IMwPtaetortecmt810vaHq0czsM4GKFY/wR3I02Y8mbhg1akii7wcTqoQto/ttZgRVkOvXJ/6bJC2GD3/X2zNyTE/5+uvYM72nVeQ+3rZtSztYqYusJq7SxD0aikHdadu1a5lktvewpXaYh7XfKvzXNd6nn76eaBgKnfP3ZwpUfSikf/4JpSLzgw/G+SQ/N9xwmYedyVVHiyawCasqNK2Cv0Wjxx1H2mVL/aouchcuWuUXQ6pKiqZKIY0bqKClU7uGMS8UdBHgY+HtjfOLPs2EnB7FDyts1111hlcjqYrjxttespvvGOKVHApAI+9O6I8HdWVW1Yi6s9WtHbtLlCoodCGqi0YtUyR1TValkV63Xp3KicaES44m1VA3ao07prG8kpPZ5atapbxdc0Un74al7qm9r3rSXhzyiY9VGDkgragy55PxU/09u3ZqnKTbWXJUVaWHAow/5yzxLqs9zmzmF5IpURXlihXr/WJI41bGWq+0tlNGnq9KTlXn6iJT3eVjXZRpm6paSUGuLiozWkGi2V6nTJ/jF7MKjYIqGe0rGvtMr6sQUmGkvqdwS91rI0MFBWBaloTullEzoysMV4BYpHBBu/ryDom6pOsGgcI3TZYjukCNLGn32bG37vDXVJCjGYF1k0DbUYFJclTFpMlPVOlT5ejDE7VPdravv87Grf5aCiKDKlKNSXrD1V28e6XGDtQ+qqplzfre53JVzuXxQENhQ1r2y8y2s46tzZu2+X6grrqqHh828kuf5EZDUATvpYpFvb8C/qBSNZKqwVSppZsSZ53RNE3jC8bP4LzMu8mqK6vOsc8N/tAr12+58SwPt0VdRoP2W7QkvrIzLds7DF
nRXj5e0QffeWilwErdqiNvSqmKVl1pdZNMx5sqyWK23fzlPkZh/NAF/43RFS279vusWK80v1ecqnzju49pHODXBt1oH787wN549RZ78ck+3h38rn7n2NWXdfDgVTegIo8bfX7puFK7RX9+ypZtO7wHg/Y9jXOpME03QaLHkoz8vPv2hz98HEiN53j7vcN8tnAd4y89e62d36NlzL8t1O6aVErDtmh/VxipKkp9xl50fmuvQtdyq700zqxeQ6GetntK1NaPPjXKPvp0inVud5K3hdYl2g9TZ/t5Vp9PumGnSYR0fGl8V4WnsYaJSI/Ujk+tv9ZHz1PVZcybmgcRDYyviyx1o46kyU2Cn2dErBlOq1WLr6qN7KKmQE/jiGnMs8jumZrlV9Va6vocq5ugxpZT18e0SO25wToqIE2tDbZtU4i9Lcnrq3t7dNe7446r4+uscCy4yA+o66JCzuxStGhhnxk1ekKdjLTXqadqCJY8Xn0Y2cVXXRf1HhndZ9JDn4fR3ZMVrqQ0uUxWyuh+m1Fq52A8V42pmF7aXvrdyHOrxnbUNsyM009v7mHbM8+8kmR/VkAffazkFEOHvm25cpW3SpUaZtlrnntuF/vxx5+8TXUc6OvsFoynmtzxG9Z+G9ykjDwm9VmriaP2Zxr3VOe70aM/td9+m5Xk5ymN+5ia2bPn2oUXXmfNmze2xx6LvzEXi6q+1a4PPfRczC75ixZlfBnSS5NSaQxMzYiOnNw1Oy7Ou84pEFNXsWjTZszziwD9wZ3cTKGqLNBkNhq4/oQG1dIU7EXSh02jE2vYay/eaK+O+My7b2kcvq8m/mpndGhkV13a3o46okzCBc3U6XP9pKGLGM02GosqbXQxVbRoIQ8bI2msS1VBaXKU5AK1aFo/DXKv7pEpVRVldvny5T3EL+g0WL4mtNHs1vc/OtKrUjV+VteOjf13RGN4zZ231KuiNIlHWsd+090BPeL2xnk7K5g5q0sTf++UrFm3yR96fqkYFX7paaeMPF/j3ymQ1cWYxjKLNdaEB8G/zPMuyrqwKxcjXE+Lv+cv98okddNV9VlA66yupQ3qV/GueWrzXue1sst6t0tS1aWJZPQQBUWRzaXQQwGYgprWLY+zUxrXSdSeCisGvzo2YTbW6K7d8V1g43+msRPVjhq/TNVNKe3PuvGgCTxUuVylUuKun9nZvrJqzUYPgWpUrRBzbNpICu5UTad20LGri/OgC3JK+2Vm21kVm9t37vJwVxVqqoTV+17Sq02irqwKQFQdpuBQlWjVqyWujlCllsaIbFCvitWskXzlRCQFMKq80vlUIYvCZg3lMOCOC3yCkGA9tA+W9EkszGbPSVwNltL2DkNWtJdCrU/GTfF9vOe5p1rpiDEvtW76fNAYrfq5tpVmS46m5ymAVun50RXLpRgcZtd+nxXrlVY+WdDyNd7OZ3dp6uORpmfb60adAnAFprECQlVZKonUjcJ3x3zjM0g3aVwr2fcIKt9Fk7yo8rjvdWd6d2fddEvu9xR2rl27ydtH7aVJdTTBmz5bVVkc7Of6fZ1DFCYGNzg03mss+rmqGke+N8m7kd931wVJbhQFgp4TOo/r2FNQrdm/1Z0+K46l1I7PoPJd9PfKgThRzcqVqxMmDJkx43f/r77WpBcSOaOqJld46aXXfWZQXVxUq1bZxo2b4DOJaly6jExUI6ry0EQW6lqrCplNmzZ7sKcLKnVjjKQuzV27XmJNmpxhl112np8bBg8e8e84gPEzw0ZTNZxm/FYX3RYtGvu4rZp4Q5MuxHqulmXgwGd8MhOFMhrXMZjwRONXqgvlW2994GFiq1an+AymgwYN94t2TRwSmDPnb2ve/Ew788z23jYa++vLL7/xWVo1k3YknfdeeeVxa9v2PGvQoI2vm6qmVq5cYyNGjLLBgx/xiVnS6/PPJ/pryK+/xl8cjxkzzieKEQWB0RNkqA20TTWLd7du+tsqv1faRFaAprVtNQaeup+qK6oqXL
t2betdMBVGqWqnX7/4GZHDpBlpX3pphE/gouVVJaaWWftarMAgjLbNyH6bVjpuX3/9vfgK75WrbfTosfbTT7/5TNTBzNeRx3lA42Xq96Rs2VIJE9tozE3NvNuly0XWsWNrH9dOx6PWS+MaZib4Gjz4YR/j7thjW9nFF/fw8Qc1w7bCvm++GWOVKqX/JuOYMZ/ali3xIWYwBmiwXqKhESJnTE+vIEDXvp1VFDwOGPCkHzeaUTnWsAK//vpHwn6lsTrlhx+mJwR4xx57TMLs9RmhY+G55/5nl17a189bmh27QoVyic7jYey32h5PPTXEOnbsaZdffr5fD6sqdPPm+MKR7BBW277wwoPeVo0bd/IKeM0EruNn8uSfbNy4r2337sQ3mdKqV68bbNu27da58+n+2RMpcll1PGminn797rfjj2/r4y9rzEyNz6rw+MQT69vQoU+n+/11XH3//TT/t4JzUfsFx5mqQSNn9Q4+07p372QjR46xO+54yGrVqu5VktEzw2MfB5G6kND4QxLrj9xFi+PHfPQZNWOMJaALHl1k/z5rodWoeoR3dc7IH8v6Hc3+e8+t5/kf+68M/8x+mDzLK0Rm/7nYBvbv5V2UvTvFklX/diWPHbpond54e4KPLXX6acclmQk8GDtM3dKix2RLjoJDXZDogjmlwC8rlk+hy6nN6vkMmprUQWOX/b1gpd370Bt+YTTgzgu8a6mqCTUhQd06lbzCKK3iB//f6xfGCoPO6tLUalZPOhhwNL/A27jFKy81QUZm2ikjzw9mT9X7qzImVtuqPdQVtED+vL4fJde9PzU/Tv3TP5z0XtGhUskSRe3aKzrbli077LQW9eziC06PedGr4Eld6rQ9tb0il1dddb+Y8HP8ZE8t6nu3SdEfckuWrbWBj73t21c/13Ik6e4YF/8aajuF1dqPO7U7KdV9WcfrwsWaNT53kpA8O9tXx4mCdK2v3u/QVKp547v8bvNjR20ZGWqltF9mtp2XrVjrgb0mG1IAqrEZFWgeX++/LviiGzm6WaNgU+O8Ri/7zN8Xxncd13h7aRyTRROCaOxcdVtXYDt85FfWrEld69r55ERhmZZD3Wpz58pt02fMs81btiVUS6a0vcOQ2fbyGejnLfOwSjcmNBFJ8HtqP82K/PSLH/i//117H+s29tizS/ycos+VlI6L7Njvs2q90krnHa2T2lmTL6X3M1nBrI5RTcYVPVxLcFwpMJ05a6EHJQoUU7qRpVBz3brNvhwd25xoN1/XzasZo0NOHd9vvjvRu1urB4ZulC5buc5/74gKpW3c59O9J8GNfbokOdbVdVlBZDDLdKyJ3lTR+tQL79vrb0+wdq2P9wnhkgshfSb1f8dc1Q2exwde6mPQRgeCGsNVIaU+q6+9vFPCOJZpkdrxqSBW21G0Lx+IQaQuwi68MH4SwcCDDz6b8O/IIFKzW3/zzft2xx0P2xtvjLYNGzZ519oBA/raHXdcl+FlOOOMNh58qsveqlVrrXjxYt598dVXn7CGDY+Lem5b+/jj4fbAA09b//6P+zbRcxXinXbaKTFfX5NAqKJL1YbBGHevvfaU9e7dI8lzBw162Pr0ucPHp1QgKhMmjEoUAr755gt2//1P+/IqsNPyapbSBx+83QoV+i/wULdFjRWmcS81HpxoXLSnn77Pu2RG02ysU6aM9XVTd2J1qz3iiMM9pFC4lxEDBz7rk2FEuvHG/gn/1rpFB5EKDhcsWOKBlkJQtVf//jfZvff2zVDbPv74PR5wDBnyht18830eCnXo0MqDDFXNhO3JJ/tbkSKF7J13PvLx+BSgP/nkvR5IZiaITE/bZmS/TSutkx5B2Kd9RcfORRedk+Jx/s03k/0hqrQKgkiNualzo6p+x4+f5PuxgnMFxz17Zvw4lx49zvB9WpMHvfDCMNu6dZtVqnSE7y/lymVsxugbbuifpNoscl3nz5+cqSBSoa5ceul/NxkyS+N2qqJQr61uttETgImOP4
WVkTTzsh6iYzIzQeTZZ3ey336b7SHw2Wdf7n9z9OrVPVFQFcZ+26RJ/DAJ99//lE9Qps8V3dy55pqLrHbt/2aUDlNYbavtOn36Z3bffU/6OV8hpAJ8HZPPPnt/hpdXNxJ0Q1ljt0aLXta+fa/ySugnn3zZHnnkBa+MVHVrs2YneTiaEZMm/WgXXXRjou9FTsqj/SY6iJTnnnvA/6t2Xbt2vU/qRhCZw4JIXdzrhK+LilizJKmLk/4Y1uD5sSxYtNIn61CFRI+zmtnRlZLe4U0P/ZHfrEkdv1Ab/eH3ftGg7lHvvP+Nd5/SHxm6mJBY4+DpRKbuVmPHT/WubdHjTgUVBvrjXl3Rg2AioO5ouqBQYBB0jdPvLPILszifZCG6i6wuLtVVtEObEzO9fAGdbHUReN7ZLTyU/N+I8T7O3dujJ/kEKZqdWRfQGssr1mygAXXB/nLiDA8qgnHp1B3VH3v2ekjW5tQGqVZDisaR1HhVZUrFj2MZLT3tpHVO7/PV1VmVa7oAjTVGm/ZlXcwvXrraWjQ91sdyzAhdZKv6Se2j7vWRoZeoDU8+6Rj76J3//uBLrqJNXU+1LuqSGDmJiPY/HTu6uNRsyMH3NfHIfQ+/aV9/+5uv9/wFK32Igm1b47tCRl6HajmDbdzylGNTrSpMmMRi6Rp/neguqNnVvqK2nTNvSUKwGzlmYyyq/nz3/W+93Y6tU9lnGw7aLKX9MrPtrO/rWNEMyZrxV8e2d6mNqn7V+VFhhZ7rY9Tt3ZvQRV7bKahqqhDV9TugfUOh48TvZnpwp5sT+loPhVIjRn7pIdi1V3aK2XW49jEVfR11g0XjHapKV+eflLZ3LJrRWNWmp5xc25chvcFHZttLn0O6kaPtphtPmnBLy6D2UeB+W//XvD17n9/K/jd8vCc1qlaL7G4vej3dRFOIFl1prd//Ycos7/av83x27PdZtV5ppf1Gx1gwW3R6qYJY52XtM7H2AXUlDs49tY+pkPowHD6DevzkOLohoIrG6CEUdDNB3Z/1eOje3v7zvf/s9X1KFYOaWV3jrDZqWNMD+Wia6Ew3hPS5obAyrk7iSZuCid30WdrylHo28J5eHnimWEH+77AwmhVbf5NoiJpIamN1TX/kqXe9YltduNMjteNT7bJtW3xVfTAb/YFGAVtcXPzsqGmhSjcNhJ+VNIN09CzSKWnf/jR/pJWqGd9888U0PVdj4+nCOyWqEHzggVv9kRJ1W3zmmfRdfGpG4bfeGpzq81T5kpbt9vXX/1WGpZXW78UXH/JHVrStjvGbbrrCH9mxzypIiQxTFA4//nh/f0S7664bsq1t07vfZuWxm57n6uaWbizEurlw4YVnJfpaAWKsQF+hdazgOgiiPvoo5WMsehum1OYLFkyxMKmSVOHpddddkqWvq8AqJSm1YVqfm1w7Bsfl/fff4o/s3G+Dqkg9oqXnsyiWtP5+eto2PeeE4DPk1VcTh5zpOX/E2p/Tu4936nS6P7JKcsd5ajQUw4gRz2XZchxssmWMyD3/7E24SNAf6tFKlCjiF7QLF630Cq+Anq/B7DXj5F9/L7M2rY637l1PSXWykdToxKSHj0nZvqGd2qKev6YuyjzIyR3fhVx/wKv7rKoKg+XRH+rq2vbQk+/4xaW6ONetlXhSCNFFp6qfdJERjKem31dX6QEPvWkPP/muj40Zua7BunvFyr8XJ3qNGb/9bX1uetGGvv65V0ZkxfJFt4Uqh3qd38q7Z+qKVmNWxf88ceAVScupijNVUd5+71CfUCAWVZBoMoLUAof4MQm3e6WNtkes7ZyedsrI8xUIKqjWhB9+ofnv84NquReHfOzdV1Vpc+M1XXzm8YxQlWgwk3kQ8CS3bYJHLB5w7Irf19T+kWP3KThT5ViwvgohNOGKttXnE3621i3re/AejLn214LlicY3+ncp/P8V0p7cqFbMQDuaAlFVJgW/Fym72jdYDs2WHt8FNfmKLQ/2N2+zl14d6xNFKDw67+
zmdmSF0mnaLzPbzvMXrvCv/5i9yKv7enQ7xcc+jF5eBSUKV7RNNMnR15N+9Vl9FewFXW1FbRs55q3+rVmmp/001y6//jl76vn3/eZCMNmKAlMNfTFrzmI79+zmPoxDrLZSl28FblpHhZaqBgwC8OS2dzTd7dQNn753vmIj352YaH9Nq0y31w+/28aN8RX6en/vFrtth583b7p9iC1bsc5uuPoMH0M0zyF5PHD8a8GKmJ9r3n04aqBqnW9UDXfjbUO84j679vtg0pHMrldaaRZmX49cFnMW+ZRo+db8+/uRM0gnErE5q1cp7xX5KX2GaN9Td2ztYxqPc+26+O7W2jb6u0NjPl7Td7ANGTrOOndo5DdWPIj0yfRWeEWlxnnVuJcXX9DabyBGv59CRU32pu0++sPv7Kdf/rIHH387YdbpJ557314d9pn37tCQL5OnzrZBr3xit97zmvW+8knr1H2Atet2j1101ZN+rtH6q2u6jt3pP8/zGxo6NrVM2o90TlcAedf9w61gofw+hm16943Ujs+4iBtO+rslI8ckAGD/pDH1NMSCZluPVbUIAPt1RWShfysevFJwfXzVTqQmjWpbvnxjbOXK9d6duOd5p/kf+pOn/WnPv/SR/fr7AmvauJbd2e+cDA3gr/dVFz5dlCgM0CQh+lrjJmkSlQ/H/ujdDDURiP5Qz703l1cD6uJVXcInffebNTmploeIYz750Ya+8bnP2qrJLc7v3iLmpBC68FTQuWz5Oh9fr3TpYvbrzAXx1ZdTZtvFF55utY75bzxMtU+wbuqCpYvGooUL2kfjptigVz72YFDVFaoq1PpkdPl0oaGujXovde9Sd0yN36nx/z7/6icft0tBhMaHE1XVqNpl1p+L/MJL3eMU1GjCn29/+N0nA9DvXHxha2txyrEJ76OLy+ACs1qVw30Sj7RQ2OOB6u5//BEtPe2UkeeryqpyxXJezaRqK1W4FStWxGb+vsBGvjfRwwLtK3fdco6dcFy1DHdjW7Vqgz/067VqHJnmsTeTiLhm1HbQBWzwWhoXrWDB/F4hpdmgFcpodlZVx2im9Zuv7ebHg6pmtRxffj3DLuvdNiGAk4IF8iZUFKY023kkBdYezMSZbd32342F7GxfUdWdqq6UaGifjUX72vwFK+zZwR/6saNzxDlnNbNzzmqeaJuktF9mpp29YneJhqaInzBCY+See3aLZINpzWxdrsxhXpF1/qWPecA66Kk+vv96oJErl435+Adr0qiW1a1V0W9S6GbOR2Mn2+iPvvfz2519e/g4l3pP7TNBiKYhL1Qtl9wkN6qG1ay/Dz/5jq1YtSFhsqyUtnc0VWqe2KCaj1WoqvBLe7VJ176fFe2liVR0PlZbqFLy8WdH++fN+K9+9qrKW284y3qf39q3s8YM1Hb8+NPJduJx1RJ11dXv6xyqQO37ybN8RuLVazZ4wKpZxzXGX9fOTbJtv9e21yRDmV2vtApmtdb7RlfxpUb7SzD+rM7BkcF5IAjctG1r1jgy1ZBb++0ZHRvbrDlLfII1HY+aHE2Bq4Yb0cRgGgtVPQQ0WVWwzroJpc8zH3ts1QY//iO7tUdS4KqfKdxVVbAeCuj1mhpX8tVh4/395i9caX1uTrmCau5fy/0zU9XSw0d+6Z9Nl1/3rFfEFi1a0Ktt9fmuv4s0vMotN5ztk+2lV6rHZ1x8t3ZRcK2u6xnZHwAA+5+jjjoi01V6AJBjg0hVFijQWrZivU94EK12zaOsc/tGfnH6zKAP7L0Pv/M/nNX1S6Ghxsh74K6efjGXkQs0XfzdMWCYVwuqKiWoFNHMmvq3xnnSoPbqKhg/O2Zu69imoX3x1c8egt58+yu+3KqY1Kyy6uJ53ZWd7ZKep8e8aNdraFk1gcHsuUus/8DX7dnBH9iatZtt69btds6Zzez6q85IFM4FXXEVFvzx52I77+JH/cJLF/sKUe6+9Vxr06qBL5sumDKyfGrTF4Z8bO++/41fOOpnahNVe6
kt9HWntg3tuqs6e/CkZapVs6JP8jP+q5/ssWfe84toLZfGiFMQoIvnO/p2t/PObpmoC7ouZIKLmRrVjky22300XTDrfePH5EtaPZuedsrI8zUpzwU9WtrMPxbEz/Td+2Ef00+TGazbsNknHtIF/WnN62UqJFObb9223fcBzbibUepWWa5Mcb+QVggRuUzq9tq92yl+kTth0q8elqj7/D23nuvfVzdD0X4/ctREHyNU1boD7+npk3yoSkdd90VdktM61mn+vId4xZUCDy1XpOxqX1EAFx+yxVeHBUMmqNJIFYyqHlNV0sfjpnhQlT//IT6z7s3XdE3S/Tql/TIz7az2DY5RdZtUMKfAKta663t1a1e0G6/p6pV9CsEu7902YVKZLh0b27tjvvXJpa64/jnfN7Se6zdu8XXVhD133XKuNTy+moc7ahuFDjoe1IVZ58CUutjqGGnb+ngffmHnzl121L/j46W0vWOtQ/lyGmsyl59DYgVQqclse2mGZ50n1WV6yrTZ9sY7E/xcVavGUdb3+jOtYYPqCcM6nNKkjr35zgQPd3VOv+rSDgnjb+o5Cnz1s5Hvfu0zlqtyXOPiKtTV2IDBuLrZtd+feHz1TK9XWsWH8xG9HtJB+18wPqW6owfdkyNpPEOdg/Q/Bd6ptYueq0lzNDnR2M+m2qTvZ9rX3/7q66V20N8Z117Z2U5ueIxPGhS8nv4b7FO1jznKx+NNLvTUc3WT4u+FK3zCOHWn1s0Gdcd+Z/SkhPOBbjxoPGsN76AbODrHFy1SyCdN09AGCkX1WmoHdaPve103/2z+a/4KrzTWsaYgVr93e9/u1qNbs1S7eCcnteNTLxncqPHJz7btIIgEAADA/h9E6o/ce++4wMfq69zupCQ/10X+7Td1t8MOLWwffjrZL6Q16PIxNY6yrp0aW5cOje3QGN2k0vz++fP5TNOqpNEYYYcdVtjDx+ZN6lrzpnW92lIX4JEXJrrwebB/L3v8udHepUsXBwpi2rU+wc7u2tSrFvwiKZllqlalgl/0aObMhYtW+Syvx1Q/0sdj7NiuYZIxAaVD24be1UvdwzSQv8LAVi2P88pGXTxHhmsZWT79s2CB/D6mmqpZ8hfI693MdMGkyhGNm6lqnMgB67UMmtxH20iTq6iLfIGC+a1q5cP9vc49q4XVqFYhybpohmpNhLNq9UYPYIKx7FKjig91Tzz00IJWuFDsqoy0tlNGnq82U5c6jVv6+jsT4i8Kc2l23rL+/YsuaB3ffT2TKpQv5QGtuhGqMi6j+7YCfl3ETvt5rnVs2zBRZZgupm+6ppvVrlnR/l6wwsOmls2O9W74kVo2q+fVwF9NnOEX5EEmoNc66cQaPtaoZqqP/r3kFCiYz5o2quXLpIvnSNnVvuLjoeX6ryp2zrxl3iVYM+vu3vWP7Y1TOBn/XIUFl13U1nqf18pD+ujtkdJ+mdl2VjdQVVSqMlZBSkoVggpIVX2lRyBYVnXx17h3rwz7zCvAdKNAYaSOTx2rutkT3GAQvY+qCXXu0/ZVgJXafqjKS3V/Tev2jqZAWGOjKnjSOJXpHe8uq9pL5/+nHrrMPhj7o3dLrVf3aL/hom0btIG2q37v15nzvbJd+2kkza58zeUdbdOmrT7GsMYZ1L6rffiKi9p5GBW8Vnbs91m1XmmlsRK9p0Piwuw0UXuoJ8Q33/+eEExHU5irmyyFChWwenWOTvU1tSz6DHxkwEV+Q+23PxZ6sKaJmOrXreKhYfSEXqJxUVWJqs+IMzufnDDOcXJ0DD0x8NKEr9UN/JPPpljRooV8W+rvFY0xq/0jOTpXR1Ilsm6YTfr+d1uzZqPfMNVY2I0b1kw4pjL8908qx6c+/zSW9X/vsf9NVhM/C2guH7AeAAAgp9HfKFk5M/2BIFdcGktS9uyOH3sqTBowXbPM6o68qkVKFi/6b1VE5oay1MWvxp9Ud+Xg7r+CTVUu6EI1pbHj1qzb5F1oNXC+Lg50YaRuU2m5KF
DFkao9VI2kbsoa11HBX3Lro/fThZPaYMeOnX6xXa5sCTs0mfdL7/L562/b6VUR6rKnC1L9joIItUVyF/TB2GMa22z7jp2+Lpr5VRd9CiaSey89X9VjqriJFbzGootnhYUKatSVNVaVTkbaKT3PD5Zjxar18WNr5crl3VIVLKS0v6R3n9QYoQpkfLb4NIy9GBafuXXdZq8S0/4TVN7Ej424w28gaN1V1ZXW14vf9lu8Wi1Wt+iw21cU+tx533BbtHS1PXb/JV4Zfc/AEbZj+y6v0NW+qzBFAfxlvdp6l+rkupimZb/MTDvr+M1zSO40B/Yp7VequNJDIXeBAvmtZIkiVrJEfHVZGNKyve3ficq+nzLbJ01RSPzso1d6NVh6t3dWtldq1G136dK1XkV71JGl/TwWubxBF/GVq9b7sAjan3RuUbVf7AlYwt/vs2K90kLbW9X+mhH+5WevS9NEVtFV4bpJp+rryIA8oPZUpbKOuUpHlQ1t/w3eS5/Xeq/0toP+XtFYyfr8VBCsz5ecJLXjUz+fMm2OPfjE214RfmbnJkn+FsiTN+OzsWaH5cuXW6dOHezxx+9MNPMzAABATnD++ddY1ap1bMCAAft6UXKMHBVEAkBWiT61KQT84JMffHzYwoUL+sQrJ/07c31K1c3IHIVvqnzTpDrvffCtHVGhtN3Zr0ey4/Bh/zy+2JYHxraMtR1zehAp3bufbTVrHmH33Zfy7KgAAADZadmyFda588X24IOP2OmnZ91s3/u7bOmaDQDZLdaMt5f0bMOGyGaqTH1l+Gc+buYj91/iQzZoSAGCq/0b2+/AcSBsy9atT7fXXhti7dtPt0aNjt/XiwMAAGC7du2yRx55wQoUKGRNmzalRSJQEQkACLXaSgVXQdZxIIQewMFkf6iI1B/6t9zSz6ZM+cHat29pp57axI48srzlz5+zusoDAIADm4b82bx5i02dOsM+/vhLW7BgmT355NPWqFGjfb1oOQpBJAAAAPbbIDIII1955RX77LNxtnTpkgxMowQAAJAVcvnkNI0aNbYLL+xpDRo0oFmjEEQCAABgvw4iI6uwFyxYYGvXrmUmbQAAkK3U+6tQoUJWuXJlK1q0KK2fDIJIAAAAHBBBJAAAAHK23Pt6AQAAAAAAAAAc+DI9a/aW7bts5bqtWbM0AAAAyFa5c+WyiuWKWe7cTCYFAACAHN41O42/DgAAgBws1qz2dM0GAABAjqqIjPVHKwAAAAAAAABEYoxIAAAAAAAAAKEjiAQAAAAAAAAQOoJIAAAAAAAAAKEjiAQAAAAAAAAQOoJIAAAAAAAAAKEjiAQAAAAAAAAQOoJIAAAAAAAAAKEjiAQAAAAAAAAQOoJIAAAAAAAAAKEjiAQAAAAAAAAQOoJIAAAAAAAAAKEjiAQAAAAAAAAQulxxcXFx4b8NAAAAAAAAgIMZFZEAAAAAAAAAQkcQCQAAAAAAACB0BJEAAAAAAAAAQkcQCQAAAAAAACB0BJEAAAAAAAAAQkcQCQAAAAAAACB0BJEAAAAAAAAAQkcQCQAAAAAAACB0BJEAAAAAAAAAQkcQCQAAAAAAACB0BJEAAAAAAAAAQkcQCQAAAAAAACB0BJEAAAAAAAAAQkcQCQAAAAAAACB0BJEAAAAAAAAAQkcQCQAAAAAAACB0BJEAAAAAAAAAQkcQCQAAAAAAACB0BJEAAAAAAAAAQkcQCQAAAAAAACB0BJEAAAAAAAAAQkcQCQAAAAAAACB0BJEAAAAAAAAAQkcQCQAAAAAAACB0BJEAAAAAAAAAQkcQCQAAAAAAACB0BJEAAAAAAAAAQkcQCQAAAAAAACB0BJEAAAAAAAAAQkcQCQAAAAAAACB0BJEAAAAAAAAAQkcQCQAAAAAAACB0BJEAAAAAAAAAQkcQCQAAAAAAACB0BJEAAAAAAAAAQkcQCQAAAAAAACB0BJEAAAAAAAAAQkcQCQAAAAAAACB0BJEAAAAAAAAAQkcQCQAAAAAAAC
B0BJEAAAAAAAAAQkcQCQAAAAAAACB0BJEAAAAAAAAAQkcQCQAAAAAAACB0BJEAAAAAAAAAQkcQCQAAAAAAACB0BJEAAAAAAAAAQkcQCQAAAAAAACB0BJEAAAAAAAAAQkcQCQAAAAAAACB0BJEAAAAAAAAAQkcQCQAAAAAAACB0BJEAAAAAAAAAQkcQCQAAAAAAACB0BJEAAAAAAAAAQkcQCQAAAAAAACB0BJEAAAAAAAAAQkcQCQAAAAAAACB0BJEAAAAAAAAAQkcQCQAAAAAAACB0BJEAAAAAAAAAQkcQCQAAAAAAACB0BJEAAAAAAAAAQkcQCQAAAAAAACB0BJEAAAAAAAAAQkcQCQAAAAAAACB0BJEAAAAAAAAAQkcQCQAAAAAAACB0BJEAAAAAAAAAQkcQCQAAAAAAACB0BJEAAAAAAAAAQkcQCQAAAAAAACB0BJEAAAAAAAAAQkcQCQAAAAAAACB0BJEAAAAAAAAAQkcQCQAAAAAAACB0BJEAAAAAAAAAQkcQCQAAAAAAACB0BJEAAAAAAAAAQkcQCQAAAAAAACB0BJEAAAAAAAAAQkcQCQAAAAAAACB0BJEAAAAAAAAAQkcQCQAAAAAAACB0BJEAAAAAAAAAQkcQCQAAAAAAACB0BJEAAAAAAAAAQkcQCQAAAAAAACB0BJEAAAAAAAAAQkcQCQAAAAAAACB0BJEAAAAAAAAAQkcQCQAAAAAAACB0BJEAAAAAAAAAQkcQCQAAAAAAACB0BJEAAAAAAAAAQkcQCQAAAAAAACB0BJEAAAAAAAAAQkcQCQAAAAAAACB0BJEAAAAAAAAAQkcQCQAAAAAAACB0BJEAAAAAAAAAQkcQCQAAAAAAACB0BJEAAAAAAAAAQkcQCQAAAAAAACB0BJEAAAAAAAAAQkcQCQAAAAAAACB0BJEAAAAAAAAAQkcQCQAAAAAAACB0BJEAAAAAAAAAQkcQCQAAAAAAACB0BJEAAAAAAAAAQkcQCQAAAAAAACB0BJEAAAAAAAAAQkcQCQAAAAAAACB0BJEAAAAAAAAAQkcQCQAAAAAAACB0BJEAAAAAAAAAQkcQCQAAAAAAACB0BJEAAAAAAAAAQkcQCQAAAAAAACB0BJEAAAAAAAAAQkcQCQAAAAAAACB0BJEAAAAAAAAAQkcQCQAAAAAAACB0BJEAAAAAAAAAQkcQCQAAAAAAACB0BJEAAAAAAAAAQkcQCQAAAAAAACB0BJEAAAAAAAAAQkcQCQAAAAAAACB0BJEAAAAAAAAAQkcQCQAAAAAAACB0BJEAAAAAAAAALGz/B+iu5+bT6+h6AAAAAElFTkSuQmCC",
|
| 421 |
+
"text/plain": [
|
| 422 |
+
"<Figure size 1400x400 with 2 Axes>"
|
| 423 |
+
]
|
| 424 |
+
},
|
| 425 |
+
"metadata": {},
|
| 426 |
+
"output_type": "display_data"
|
| 427 |
+
}
|
| 428 |
+
],
|
| 429 |
"source": [
|
| 430 |
"# ─── Change this path to your own image ───────────────────────────────────────\n",
|
| 431 |
"USER_IMAGE_PATH = \"../data/samples/handwritten_text_01.png\"\n",
|
|
|
|
| 454 |
},
|
| 455 |
{
|
| 456 |
"cell_type": "code",
|
| 457 |
+
"execution_count": 7,
|
| 458 |
"id": "533344f7",
|
| 459 |
"metadata": {},
|
| 460 |
+
"outputs": [
|
| 461 |
+
{
|
| 462 |
+
"name": "stdout",
|
| 463 |
+
"output_type": "stream",
|
| 464 |
+
"text": [
|
| 465 |
+
"Found 3 image(s)\n",
|
| 466 |
+
"\n",
|
| 467 |
+
" handwritten_text_01.png: Io sottoscritto Mario Bianchi , vinto a Firenze il\n",
|
| 468 |
+
" handwritten_text_02.png: Ho ricevuto la somma di tremila ero dal Sig .\n",
|
| 469 |
+
" handwritten_text_03.png: Firenze , 15 giugno 2024 - a titolo di rimborso\n",
|
| 470 |
+
"\n",
|
| 471 |
+
"Batch transcription complete.\n"
|
| 472 |
+
]
|
| 473 |
+
}
|
| 474 |
+
],
|
| 475 |
"source": [
|
| 476 |
"samples_dir = Path(\"../data/samples\")\n",
|
| 477 |
"image_files = sorted(samples_dir.glob(\"handwritten_text_*.png\"))\n",
|
|
|
|
| 503 |
},
|
| 504 |
{
|
| 505 |
"cell_type": "code",
|
| 506 |
+
"execution_count": 8,
|
| 507 |
"id": "a238b64c",
|
| 508 |
"metadata": {},
|
| 509 |
+
"outputs": [
|
| 510 |
+
{
|
| 511 |
+
"name": "stdout",
|
| 512 |
+
"output_type": "stream",
|
| 513 |
+
"text": [
|
| 514 |
+
"Ground truth : The quick brown fox jumps over the lazy dog\n",
|
| 515 |
+
"Predicted : Io sottoscritto Mario Bianchi , vinto a Firenze il\n",
|
| 516 |
+
"CER : 1.000 (100.0% character errors)\n"
|
| 517 |
+
]
|
| 518 |
+
}
|
| 519 |
+
],
|
| 520 |
"source": [
|
| 521 |
"def cer(reference: str, hypothesis: str) -> float:\n",
|
| 522 |
" \"\"\"Compute Character Error Rate (CER) using edit distance.\"\"\"\n",
|
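The diff view cuts the `cer` helper off after its docstring, so its body is not visible here. As a minimal sketch of what "Character Error Rate using edit distance" usually means, assuming the standard definition (Levenshtein distance divided by reference length), something like the following would work. The names `levenshtein` and `cer_sketch` are hypothetical and are not taken from the notebook.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance between a and b (insertions, deletions, substitutions)."""
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty string
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer_sketch(reference: str, hypothesis: str) -> float:
    """Character Error Rate: edit distance normalised by reference length."""
    if not reference:
        # Assumption: an empty reference scores 0.0 only for an empty hypothesis.
        return 0.0 if not hypothesis else 1.0
    return levenshtein(reference, hypothesis) / len(reference)
```

This sketch does not cap the result, so a hypothesis much longer than the reference can score above 1.0. Whether the notebook's own `cer` caps the value is not visible in this diff.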
notebooks/03_signature_verification_siamese.ipynb
CHANGED
|
@@ -2,6 +2,7 @@
|
|
| 2 |
"cells": [
|
| 3 |
{
|
| 4 |
"cell_type": "markdown",
|
|
|
|
| 5 |
"metadata": {},
|
| 6 |
"source": [
|
| 7 |
"# Lab 03 — Signature Authenticity Verification\n",
|
|
@@ -41,6 +42,48 @@
|
|
| 41 |
},
|
| 42 |
{
|
| 43 |
"cell_type": "markdown",
|
| 44 |
"metadata": {},
|
| 45 |
"source": [
|
| 46 |
"## Setup\n"
|
|
@@ -49,6 +92,7 @@
|
|
| 49 |
{
|
| 50 |
"cell_type": "code",
|
| 51 |
"execution_count": null,
|
|
|
|
| 52 |
"metadata": {},
|
| 53 |
"outputs": [],
|
| 54 |
"source": [
|
|
@@ -58,6 +102,7 @@
|
|
| 58 |
{
|
| 59 |
"cell_type": "code",
|
| 60 |
"execution_count": null,
|
|
|
|
| 61 |
"metadata": {},
|
| 62 |
"outputs": [],
|
| 63 |
"source": [
|
|
@@ -82,6 +127,7 @@
|
|
| 82 |
},
|
| 83 |
{
|
| 84 |
"cell_type": "markdown",
|
|
|
|
| 85 |
"metadata": {},
|
| 86 |
"source": [
|
| 87 |
"## Load the Feature Extractor\n",
|
|
@@ -94,6 +140,7 @@
|
|
| 94 |
{
|
| 95 |
"cell_type": "code",
|
| 96 |
"execution_count": null,
|
|
|
|
| 97 |
"metadata": {},
|
| 98 |
"outputs": [],
|
| 99 |
"source": [
|
|
@@ -121,6 +168,7 @@
|
|
| 121 |
},
|
| 122 |
{
|
| 123 |
"cell_type": "markdown",
|
|
|
|
| 124 |
"metadata": {},
|
| 125 |
"source": [
|
| 126 |
"## Image Preprocessing\n",
|
|
@@ -131,6 +179,7 @@
|
|
| 131 |
{
|
| 132 |
"cell_type": "code",
|
| 133 |
"execution_count": null,
|
|
|
|
| 134 |
"metadata": {},
|
| 135 |
"outputs": [],
|
| 136 |
"source": [
|
|
@@ -158,6 +207,7 @@
|
|
| 158 |
},
|
| 159 |
{
|
| 160 |
"cell_type": "markdown",
|
|
|
|
| 161 |
"metadata": {},
|
| 162 |
"source": [
|
| 163 |
"## Verification Function\n"
|
|
@@ -166,6 +216,7 @@
|
|
| 166 |
{
|
| 167 |
"cell_type": "code",
|
| 168 |
"execution_count": null,
|
|
|
|
| 169 |
"metadata": {},
|
| 170 |
"outputs": [],
|
| 171 |
"source": [
|
|
@@ -241,6 +292,7 @@
|
|
| 241 |
},
|
| 242 |
{
|
| 243 |
"cell_type": "markdown",
|
|
|
|
| 244 |
"metadata": {},
|
| 245 |
"source": [
|
| 246 |
"## Demo 1 — Load Real Signature Images\n",
|
|
@@ -254,6 +306,7 @@
|
|
| 254 |
{
|
| 255 |
"cell_type": "code",
|
| 256 |
"execution_count": null,
|
|
|
|
| 257 |
"metadata": {},
|
| 258 |
"outputs": [],
|
| 259 |
"source": [
|
|
@@ -306,6 +359,7 @@
|
|
| 306 |
},
|
| 307 |
{
|
| 308 |
"cell_type": "markdown",
|
|
|
|
| 309 |
"metadata": {},
|
| 310 |
"source": [
|
| 311 |
"## Demo 2 — Reference vs. Genuine Copy\n"
|
|
@@ -314,6 +368,7 @@
|
|
| 314 |
{
|
| 315 |
"cell_type": "code",
|
| 316 |
"execution_count": null,
|
|
|
|
| 317 |
"metadata": {},
|
| 318 |
"outputs": [],
|
| 319 |
"source": [
|
|
@@ -330,6 +385,7 @@
|
|
| 330 |
},
|
| 331 |
{
|
| 332 |
"cell_type": "markdown",
|
|
|
|
| 333 |
"metadata": {},
|
| 334 |
"source": [
|
| 335 |
"## Demo 3 — Reference vs. Forged\n"
|
|
@@ -338,6 +394,7 @@
|
|
| 338 |
{
|
| 339 |
"cell_type": "code",
|
| 340 |
"execution_count": null,
|
|
|
|
| 341 |
"metadata": {},
|
| 342 |
"outputs": [],
|
| 343 |
"source": [
|
|
@@ -354,6 +411,7 @@
|
|
| 354 |
},
|
| 355 |
{
|
| 356 |
"cell_type": "markdown",
|
|
|
|
| 357 |
"metadata": {},
|
| 358 |
"source": [
|
| 359 |
"## Demo 4 — Use Your Own Images\n"
|
|
@@ -362,6 +420,7 @@
|
|
| 362 |
{
|
| 363 |
"cell_type": "code",
|
| 364 |
"execution_count": null,
|
|
|
|
| 365 |
"metadata": {},
|
| 366 |
"outputs": [],
|
| 367 |
"source": [
|
|
@@ -385,6 +444,7 @@
|
|
| 385 |
},
|
| 386 |
{
|
| 387 |
"cell_type": "markdown",
|
|
|
|
| 388 |
"metadata": {},
|
| 389 |
"source": [
|
| 390 |
"## Similarity Distribution Across Multiple Pairs\n",
|
|
@@ -395,6 +455,7 @@
|
|
| 395 |
{
|
| 396 |
"cell_type": "code",
|
| 397 |
"execution_count": null,
|
|
|
|
| 398 |
"metadata": {},
|
| 399 |
"outputs": [],
|
| 400 |
"source": [
|
|
@@ -424,12 +485,14 @@
|
|
| 424 |
" ax.set_title(\"Similarity Scores — All Pairs vs. Reference\", fontweight='bold')\n",
|
| 425 |
" ax.legend()\n",
|
| 426 |
" plt.tight_layout()\n",
|
| 427 |
-
" plt.show()\
|
|
|
|
| 428 |
" print(\"Add multiple genuine_*.png and forged_*.png files to data/samples/ to see the distribution plot.\")"
|
| 429 |
]
|
| 430 |
},
|
| 431 |
{
|
| 432 |
"cell_type": "markdown",
|
|
|
|
| 433 |
"metadata": {},
|
| 434 |
"source": [
|
| 435 |
"## Forensic Notes\n",
|
|
@@ -448,13 +511,21 @@
|
|
| 448 |
],
|
| 449 |
"metadata": {
|
| 450 |
"kernelspec": {
|
| 451 |
-
"display_name": "Python 3",
|
| 452 |
"language": "python",
|
| 453 |
"name": "python3"
|
| 454 |
},
|
| 455 |
"language_info": {
|
| 456 |
"name": "python",
|
| 457 |
-
"
|
|
|
|
|
|
|
| 458 |
}
|
| 459 |
},
|
| 460 |
"nbformat": 4,
|
|
|
|
| 2 |
"cells": [
|
| 3 |
{
|
| 4 |
"cell_type": "markdown",
|
| 5 |
+
"id": "908cb1dd",
|
| 6 |
"metadata": {},
|
| 7 |
"source": [
|
| 8 |
"# Lab 03 — Signature Authenticity Verification\n",
|
|
|
|
| 42 |
},
|
| 43 |
{
|
| 44 |
"cell_type": "markdown",
|
| 45 |
+
"id": "e2lciijg43",
|
| 46 |
+
"metadata": {},
|
| 47 |
+
"source": [
|
| 48 |
+
"## GraphoLab Core — Quick Start\n",
|
| 49 |
+
"\n",
|
| 50 |
+
"> The production implementation of signature verification is available in [`core/signature.py`](../core/signature.py).\n",
|
| 51 |
+
"> It uses **SigNet** (Hafemann et al. 2017) — a Siamese CNN trained on GPDS-300 — which produces\n",
|
| 52 |
+
"> 2048-dimensional L2-normalised embeddings and significantly outperforms the ResNet-18 baseline\n",
|
| 53 |
+
"> used in this notebook.\n",
|
| 54 |
+
">\n",
|
| 55 |
+
"> Run the cell below to import it directly. The remaining cells implement the **ResNet-18 baseline**\n",
|
| 56 |
+
"> from scratch for educational purposes."
|
| 57 |
+
]
|
| 58 |
+
},
|
| 59 |
+
{
|
| 60 |
+
"cell_type": "code",
|
| 61 |
+
"execution_count": null,
|
| 62 |
+
"id": "rco6ot0z81m",
|
| 63 |
+
"metadata": {},
|
| 64 |
+
"outputs": [],
|
| 65 |
+
"source": [
|
| 66 |
+
"# GraphoLab Core — production usage\n",
|
| 67 |
+
"# Run this cell to use the shared core module instead of the notebook implementation below.\n",
|
| 68 |
+
"import sys, pathlib\n",
|
| 69 |
+
"sys.path.insert(0, str(pathlib.Path(\"..\").resolve()))\n",
|
| 70 |
+
"\n",
|
| 71 |
+
"from core.signature import sig_verify, get_signet, SIG_THRESHOLD\n",
|
| 72 |
+
"from PIL import Image\n",
|
| 73 |
+
"import numpy as np\n",
|
| 74 |
+
"\n",
|
| 75 |
+
"# Example: verify two signatures\n",
|
| 76 |
+
"# ref = np.array(Image.open(\"../data/samples/signature_genuine_01.png\").convert(\"RGB\"))\n",
|
| 77 |
+
"# query = np.array(Image.open(\"../data/samples/signature_forged_01.png\").convert(\"RGB\"))\n",
|
| 78 |
+
"# weights = pathlib.Path(\"../data/signet.pth\")\n",
|
| 79 |
+
"# report, chart = sig_verify(ref, None, query, weights)\n",
|
| 80 |
+
"# print(report)\n",
|
| 81 |
+
"print(f\"core.signature imported — SIG_THRESHOLD={SIG_THRESHOLD:.2f}, sig_verify() ready.\")"
|
| 82 |
+
]
|
| 83 |
+
},
|
| 84 |
+
{
|
| 85 |
+
"cell_type": "markdown",
|
| 86 |
+
"id": "88e9efee",
|
| 87 |
"metadata": {},
|
| 88 |
"source": [
|
| 89 |
"## Setup\n"
|
|
|
|
| 92 |
{
|
| 93 |
"cell_type": "code",
|
| 94 |
"execution_count": null,
|
| 95 |
+
"id": "e17aa82c",
|
| 96 |
"metadata": {},
|
| 97 |
"outputs": [],
|
| 98 |
"source": [
|
|
|
|
| 102 |
{
|
| 103 |
"cell_type": "code",
|
| 104 |
"execution_count": null,
|
| 105 |
+
"id": "bd858242",
|
| 106 |
"metadata": {},
|
| 107 |
"outputs": [],
|
| 108 |
"source": [
|
|
|
|
| 127 |
},
|
| 128 |
{
|
| 129 |
"cell_type": "markdown",
|
| 130 |
+
"id": "d386c7a1",
|
| 131 |
"metadata": {},
|
| 132 |
"source": [
|
| 133 |
"## Load the Feature Extractor\n",
|
|
|
|
| 140 |
{
|
| 141 |
"cell_type": "code",
|
| 142 |
"execution_count": null,
|
| 143 |
+
"id": "50d2f8b4",
|
| 144 |
"metadata": {},
|
| 145 |
"outputs": [],
|
| 146 |
"source": [
|
|
|
|
| 168 |
},
|
| 169 |
{
|
| 170 |
"cell_type": "markdown",
|
| 171 |
+
"id": "0383ce0a",
|
| 172 |
"metadata": {},
|
| 173 |
"source": [
|
| 174 |
"## Image Preprocessing\n",
|
|
|
|
| 179 |
{
|
| 180 |
"cell_type": "code",
|
| 181 |
"execution_count": null,
|
| 182 |
+
"id": "5e7e0965",
|
| 183 |
"metadata": {},
|
| 184 |
"outputs": [],
|
| 185 |
"source": [
|
|
|
|
| 207 |
},
|
| 208 |
{
|
| 209 |
"cell_type": "markdown",
|
| 210 |
+
"id": "f14109eb",
|
| 211 |
"metadata": {},
|
| 212 |
"source": [
|
| 213 |
"## Verification Function\n"
|
|
|
|
| 216 |
{
|
| 217 |
"cell_type": "code",
|
| 218 |
"execution_count": null,
|
| 219 |
+
"id": "04541467",
|
| 220 |
"metadata": {},
|
| 221 |
"outputs": [],
|
| 222 |
"source": [
|
|
|
|
| 292 |
},
|
| 293 |
{
|
| 294 |
"cell_type": "markdown",
|
| 295 |
+
"id": "179b949c",
|
| 296 |
"metadata": {},
|
| 297 |
"source": [
|
| 298 |
"## Demo 1 — Load Real Signature Images\n",
|
|
|
|
| 306 |
{
|
| 307 |
"cell_type": "code",
|
| 308 |
"execution_count": null,
|
| 309 |
+
"id": "be92ed83",
|
| 310 |
"metadata": {},
|
| 311 |
"outputs": [],
|
| 312 |
"source": [
|
|
|
|
| 359 |
},
|
| 360 |
{
|
| 361 |
"cell_type": "markdown",
|
| 362 |
+
"id": "955defce",
|
| 363 |
"metadata": {},
|
| 364 |
"source": [
|
| 365 |
"## Demo 2 — Reference vs. Genuine Copy\n"
|
|
|
|
| 368 |
{
|
| 369 |
"cell_type": "code",
|
| 370 |
"execution_count": null,
|
| 371 |
+
"id": "1be070f7",
|
| 372 |
"metadata": {},
|
| 373 |
"outputs": [],
|
| 374 |
"source": [
|
|
|
|
| 385 |
},
|
| 386 |
{
|
| 387 |
"cell_type": "markdown",
|
| 388 |
+
"id": "d9530a45",
|
| 389 |
"metadata": {},
|
| 390 |
"source": [
|
| 391 |
"## Demo 3 — Reference vs. Forged\n"
|
|
|
|
| 394 |
{
|
| 395 |
"cell_type": "code",
|
| 396 |
"execution_count": null,
|
| 397 |
+
"id": "7daff644",
|
| 398 |
"metadata": {},
|
| 399 |
"outputs": [],
|
| 400 |
"source": [
|
|
|
|
| 411 |
},
|
| 412 |
{
|
| 413 |
"cell_type": "markdown",
|
| 414 |
+
"id": "5a651d94",
|
| 415 |
"metadata": {},
|
| 416 |
"source": [
|
| 417 |
"## Demo 4 — Use Your Own Images\n"
|
|
|
|
| 420 |
{
|
| 421 |
"cell_type": "code",
|
| 422 |
"execution_count": null,
|
| 423 |
+
"id": "387371ed",
|
| 424 |
"metadata": {},
|
| 425 |
"outputs": [],
|
| 426 |
"source": [
|
|
|
|
| 444 |
},
|
| 445 |
{
|
| 446 |
"cell_type": "markdown",
|
| 447 |
+
"id": "c741e3a2",
|
| 448 |
"metadata": {},
|
| 449 |
"source": [
|
| 450 |
"## Similarity Distribution Across Multiple Pairs\n",
|
|
|
|
| 455 |
{
|
| 456 |
"cell_type": "code",
|
| 457 |
"execution_count": null,
|
| 458 |
+
"id": "4c762fb5",
|
| 459 |
"metadata": {},
|
| 460 |
"outputs": [],
|
| 461 |
"source": [
|
|
|
|
| 485 |
" ax.set_title(\"Similarity Scores — All Pairs vs. Reference\", fontweight='bold')\n",
|
| 486 |
" ax.legend()\n",
|
| 487 |
" plt.tight_layout()\n",
|
| 488 |
+
" plt.show()\n",
|
| 489 |
+
"else:\n",
|
| 490 |
" print(\"Add multiple genuine_*.png and forged_*.png files to data/samples/ to see the distribution plot.\")"
|
| 491 |
]
|
| 492 |
},
|
| 493 |
{
|
| 494 |
"cell_type": "markdown",
|
| 495 |
+
"id": "fe74b3d8",
|
| 496 |
"metadata": {},
|
| 497 |
"source": [
|
| 498 |
"## Forensic Notes\n",
|
|
|
|
| 511 |
],
|
| 512 |
"metadata": {
|
| 513 |
"kernelspec": {
|
| 514 |
+
"display_name": "Python 3 (ipykernel)",
|
| 515 |
"language": "python",
|
| 516 |
"name": "python3"
|
| 517 |
},
|
| 518 |
"language_info": {
|
| 519 |
+
"codemirror_mode": {
|
| 520 |
+
"name": "ipython",
|
| 521 |
+
"version": 3
|
| 522 |
+
},
|
| 523 |
+
"file_extension": ".py",
|
| 524 |
+
"mimetype": "text/x-python",
|
| 525 |
"name": "python",
|
| 526 |
+
"nbconvert_exporter": "python",
|
| 527 |
+
"pygments_lexer": "ipython3",
|
| 528 |
+
"version": "3.11.9"
|
| 529 |
}
|
| 530 |
},
|
| 531 |
"nbformat": 4,
|
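The Quick Start cell above says the production module compares L2-normalised embeddings against `SIG_THRESHOLD`. Since `core/signature.py` is not shown in this part of the diff, the following is only a sketch of how such a comparison is commonly done, assuming cosine similarity is the score. The function names and the threshold value `0.8` are hypothetical and do not come from the repository.

```python
import numpy as np


def l2_normalise(v, eps: float = 1e-12) -> np.ndarray:
    """Scale a vector to unit L2 norm (eps guards against division by zero)."""
    v = np.asarray(v, dtype=np.float64)
    return v / max(float(np.linalg.norm(v)), eps)


def cosine_similarity(a, b) -> float:
    """Cosine similarity; for unit-norm vectors this is just the dot product."""
    return float(np.dot(l2_normalise(a), l2_normalise(b)))


def same_signer(a, b, threshold: float = 0.8) -> bool:
    """Hypothetical decision rule: accept when similarity reaches the threshold."""
    return cosine_similarity(a, b) >= threshold
```

With unit-norm embeddings the score lies in [-1, 1], so a single scalar threshold is enough to separate "likely genuine" from "likely forged" pairs. The right value depends on the model and data.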
notebooks/04_signature_detection_yolo.ipynb
CHANGED
|
@@ -24,6 +24,20 @@
|
|
| 24 |
"YOLOv8 was fine-tuned on annotated signature images to detect signatures specifically in document scans.\n"
|
| 25 |
]
|
| 26 |
},
|
| 27 |
{
|
| 28 |
"cell_type": "markdown",
|
| 29 |
"metadata": {},
|
|
@@ -314,4 +328,4 @@
|
|
| 314 |
},
|
| 315 |
"nbformat": 4,
|
| 316 |
"nbformat_minor": 5
|
| 317 |
-
}
|
|
|
|
| 24 |
"YOLOv8 was fine-tuned on annotated signature images to detect signatures specifically in document scans.\n"
|
| 25 |
]
|
| 26 |
},
|
| 27 |
+
{
|
| 28 |
+
"cell_type": "markdown",
|
| 29 |
+
"id": "48jzghnekwn",
|
| 30 |
+
"source": "## GraphoLab Core — Quick Start\n\n> The production implementation of signature detection is available in [`core/signature.py`](../core/signature.py).\n> It wraps **YOLOv8** (`tech4humans/yolov8s-signature-detector`) with lazy thread-safe model loading,\n> bounding-box annotation, and automatic cropping of detected signatures.\n>\n> Run the cell below to import it directly. The remaining cells implement the same detection\n> pipeline from scratch for educational purposes.",
|
| 31 |
+
"metadata": {}
|
| 32 |
+
},
|
| 33 |
+
{
|
| 34 |
+
"cell_type": "code",
|
| 35 |
+
"id": "yjec2b865cc",
|
| 36 |
+
"source": "# GraphoLab Core — production usage\n# Run this cell to use the shared core module instead of the notebook implementation below.\nimport sys, pathlib\nsys.path.insert(0, str(pathlib.Path(\"..\").resolve()))\n\nfrom core.signature import detect_and_crop, sig_detect, get_yolo\nfrom PIL import Image\nimport numpy as np\n\n# Example: detect and crop signature from a document\n# doc = np.array(Image.open(\"../data/samples/document_with_signature_01.png\").convert(\"RGB\"))\n# annotated, crop, summary = detect_and_crop(doc)\n# print(summary)\nprint(\"core.signature imported — detect_and_crop(), sig_detect() ready.\")",
|
| 37 |
+
"metadata": {},
|
| 38 |
+
"execution_count": null,
|
| 39 |
+
"outputs": []
|
| 40 |
+
},
|
| 41 |
{
|
| 42 |
"cell_type": "markdown",
|
| 43 |
"metadata": {},
|
|
|
|
| 328 |
},
|
| 329 |
"nbformat": 4,
|
| 330 |
"nbformat_minor": 5
|
| 331 |
+
}
|
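The Quick Start text for notebooks 04 and 05 mentions "lazy thread-safe model loading". The actual loader in `core/` is not visible in this diff. A common way to get that behaviour is double-checked locking, sketched below. `LazyModel` and `fake_loader` are hypothetical names used only for illustration.

```python
import threading


class LazyModel:
    """Load an expensive object on first use, at most once across threads."""

    def __init__(self, loader):
        self._loader = loader          # zero-argument callable that builds the model
        self._model = None
        self._lock = threading.Lock()

    def get(self):
        if self._model is None:            # fast path: skip the lock once loaded
            with self._lock:
                if self._model is None:    # re-check: another thread may have loaded it
                    self._model = self._loader()
        return self._model


# Usage: eight threads request the model, the loader runs once.
calls = []

def fake_loader():
    calls.append(1)        # record each invocation
    return object()        # stand-in for a real model

lazy = LazyModel(fake_loader)
threads = [threading.Thread(target=lazy.get) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

The second `None` check inside the lock is what prevents two threads that both passed the first check from loading the model twice.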
notebooks/05_writer_identification.ipynb
CHANGED
|
@@ -28,6 +28,20 @@
|
|
| 28 |
"We extract these features from each sample, then train a **Support Vector Machine (SVM)** or **k-Nearest Neighbours (kNN)** classifier.\n"
|
| 29 |
]
|
| 30 |
},
|
| 31 |
{
|
| 32 |
"cell_type": "markdown",
|
| 33 |
"metadata": {},
|
|
@@ -394,4 +408,4 @@
|
|
| 394 |
},
|
| 395 |
"nbformat": 4,
|
| 396 |
"nbformat_minor": 5
|
| 397 |
-
}
|
|
|
|
| 28 |
"We extract these features from each sample, then train a **Support Vector Machine (SVM)** or **k-Nearest Neighbours (kNN)** classifier.\n"
|
| 29 |
]
|
| 30 |
},
|
| 31 |
+
{
|
| 32 |
+
"cell_type": "markdown",
|
| 33 |
+
"id": "x2hyjkwyykt",
|
| 34 |
+
"source": "## GraphoLab Core — Quick Start\n\n> The production implementation of writer identification is available in [`core/writer.py`](../core/writer.py).\n> It uses a **HOG + LBP + SVM pipeline** with open-set calibration and lazy thread-safe model loading,\n> and accepts a `samples_dir` parameter for flexible deployment.\n>\n> Run the cell below to import it directly. The remaining cells implement the same pipeline\n> from scratch for educational purposes.",
|
| 35 |
+
"metadata": {}
|
| 36 |
+
},
|
| 37 |
+
{
|
| 38 |
+
"cell_type": "code",
|
| 39 |
+
"id": "hlvfzn4pgpr",
|
| 40 |
+
"source": "# GraphoLab Core — production usage\n# Run this cell to use the shared core module instead of the notebook implementation below.\nimport sys, pathlib\nsys.path.insert(0, str(pathlib.Path(\"..\").resolve()))\n\nfrom core.writer import writer_identify, ensure_writer_examples\nfrom PIL import Image\nimport numpy as np\n\n# Example: identify writer of an anonymous sample\n# samples_dir = pathlib.Path(\"../data/samples\")\n# doc = np.array(Image.open(\"../data/samples/handwritten_text_01.png\").convert(\"RGB\"))\n# report, chart = writer_identify(doc, samples_dir)\n# print(report)\nprint(\"core.writer imported — writer_identify() ready.\")",
|
| 41 |
+
"metadata": {},
|
| 42 |
+
"execution_count": null,
|
| 43 |
+
"outputs": []
|
| 44 |
+
},
|
| 45 |
{
|
| 46 |
"cell_type": "markdown",
|
| 47 |
"metadata": {},
|
|
|
|
| 408 |
},
|
| 409 |
"nbformat": 4,
|
| 410 |
"nbformat_minor": 5
|
| 411 |
+
}
|
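The Quick Start cell added to notebook 05 mentions "open-set calibration" for the HOG + LBP + SVM pipeline. How `core/writer.py` implements this is not visible in this diff. As a hedged illustration of the general idea, the sketch below rejects a prediction as `"unknown"` when the best score is below a threshold. The function name `open_set_decision` and the 0.6 threshold are assumptions made for the example, not project values.

```python
def open_set_decision(scores, threshold=0.6):
    """Pick the best-scoring writer, or "unknown" if confidence is too low.

    `scores` maps a writer label to a probability-like score in [0, 1].
    The default threshold is an illustrative assumption.
    """
    if not scores:
        return "unknown", 0.0
    best_label = max(scores, key=scores.get)
    best_score = scores[best_label]
    if best_score < threshold:
        return "unknown", best_score
    return best_label, best_score


# Usage with hand-written scores (not real classifier output).
confident = open_set_decision({"writer_a": 0.82, "writer_b": 0.18})
ambiguous = open_set_decision({"writer_a": 0.41, "writer_b": 0.38, "writer_c": 0.21})
```

A closed-set classifier always returns one of the known writers; the threshold is what lets an anonymous sample from an unseen writer be reported as unmatched.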
notebooks/06_graphological_feature_analysis.ipynb
CHANGED
|
@@ -28,6 +28,20 @@
|
|
| 28 |
"| **Ink density** | Distribution of dark vs. light pixels | Pen pressure |\n"
|
| 29 |
]
|
| 30 |
},
|
| 31 |
{
|
| 32 |
"cell_type": "markdown",
|
| 33 |
"metadata": {},
|
|
@@ -467,4 +481,4 @@
|
|
| 467 |
},
|
| 468 |
"nbformat": 4,
|
| 469 |
"nbformat_minor": 5
|
| 470 |
-
}
|
|
|
|
| 28 |
"| **Ink density** | Distribution of dark vs. light pixels | Pen pressure |\n"
|
| 29 |
]
|
| 30 |
},
|
| 31 |
+
{
|
| 32 |
+
"cell_type": "markdown",
|
| 33 |
+
"id": "3zdljilbgdy",
|
| 34 |
+
"source": "## GraphoLab Core — Quick Start\n\n> The production implementation of graphological feature analysis is available in [`core/graphology.py`](../core/graphology.py).\n> It measures slant, pressure, letter height/width, word spacing, and ink density using\n> OpenCV + classical image processing, and returns both a Markdown report and an annotated image.\n>\n> Run the cell below to import it directly. The remaining cells implement the same analysis\n> from scratch for educational purposes.",
|
| 35 |
+
"metadata": {}
|
| 36 |
+
},
|
| 37 |
+
{
|
| 38 |
+
"cell_type": "code",
|
| 39 |
+
"id": "r8ltseprer",
|
| 40 |
+
"source": "# GraphoLab Core — production usage\n# Run this cell to use the shared core module instead of the notebook implementation below.\nimport sys, pathlib\nsys.path.insert(0, str(pathlib.Path(\"..\").resolve()))\n\nfrom core.graphology import grapho_analyse\nfrom PIL import Image\nimport numpy as np\n\n# Example: analyse graphological features of a handwriting sample\n# doc = np.array(Image.open(\"../data/samples/handwritten_text_01.png\").convert(\"RGB\"))\n# report, annotated_image = grapho_analyse(doc)\n# print(report)\nprint(\"core.graphology imported — grapho_analyse() ready.\")",
|
| 41 |
+
"metadata": {},
|
| 42 |
+
"execution_count": null,
|
| 43 |
+
"outputs": []
|
| 44 |
+
},
|
| 45 |
{
|
| 46 |
"cell_type": "markdown",
|
| 47 |
"metadata": {},
|
|
|
|
| 481 |
},
|
| 482 |
"nbformat": 4,
|
| 483 |
"nbformat_minor": 5
|
| 484 |
+
}
|
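Notebook 06 lists ink density as the "distribution of dark vs. light pixels", used as a proxy for pen pressure. The sketch below shows that measurement in its simplest form, as the fraction of pixels darker than a threshold. It uses plain Python lists so it runs without dependencies, whereas the Quick Start cell states that `core/graphology.py` uses OpenCV. The name `ink_density` and the threshold of 128 are assumptions for illustration.

```python
def ink_density(gray_rows, dark_threshold=128):
    """Return the fraction of pixels darker than `dark_threshold`.

    `gray_rows` is a grayscale image given as rows of 0-255 intensities.
    """
    total = 0
    dark = 0
    for row in gray_rows:
        for value in row:
            total += 1
            if value < dark_threshold:
                dark += 1
    return dark / total if total else 0.0


# Usage: a tiny 3x4 "image" with four dark (ink) pixels on a white page.
sample = [
    [255, 255, 30, 255],
    [255, 20, 25, 255],
    [255, 255, 40, 255],
]
density = ink_density(sample)
```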
notebooks/07_named_entity_recognition.ipynb
CHANGED
|
@@ -32,6 +32,20 @@
|
|
| 32 |
"**WikiNEural** is a BERT-based model fine-tuned on automatically annotated Wikipedia corpora in 9 languages (including Italian), making it well-suited for multilingual forensic documents."
|
| 33 |
]
|
| 34 |
},
|
| 35 |
{
|
| 36 |
"cell_type": "markdown",
|
| 37 |
"id": "cell-1",
|
|
@@ -421,4 +435,4 @@
|
|
| 421 |
},
|
| 422 |
"nbformat": 4,
|
| 423 |
"nbformat_minor": 5
|
| 424 |
-
}
|
|
|
|
| 32 |
"**WikiNEural** is a BERT-based model fine-tuned on automatically annotated Wikipedia corpora in 9 languages (including Italian), making it well-suited for multilingual forensic documents."
|
| 33 |
]
|
| 34 |
},
|
| 35 |
+
{
|
| 36 |
+
"cell_type": "markdown",
|
| 37 |
+
"id": "wq008pk1yq",
|
| 38 |
+
"source": "## GraphoLab Core — Quick Start\n\n> The production implementation of NER is available in [`core/ner.py`](../core/ner.py).\n> It uses **WikiNEural** (`Babelscape/wikineural-multilingual-ner`) with lazy thread-safe model loading\n> and returns both highlighted spans and a Markdown summary table.\n>\n> Run the cell below to import it directly. The remaining cells implement the same pipeline\n> from scratch for educational purposes.",
|
| 39 |
+
"metadata": {}
|
| 40 |
+
},
|
| 41 |
+
{
|
| 42 |
+
"cell_type": "code",
|
| 43 |
+
"id": "fwncffbdtjd",
|
| 44 |
+
"source": "# GraphoLab Core — production usage\n# Run this cell to use the shared core module instead of the notebook implementation below.\nimport sys, pathlib\nsys.path.insert(0, str(pathlib.Path(\"..\").resolve()))\n\nfrom core.ner import ner_extract, get_ner\n\n# Example: extract named entities from OCR text\n# text = \"Mario Rossi ha firmato il contratto a Roma il 12 marzo 2024.\"\n# highlighted_spans, markdown_table = ner_extract(text)\n# print(markdown_table)\nprint(\"core.ner imported — ner_extract() ready.\")",
|
| 45 |
+
"metadata": {},
|
| 46 |
+
"execution_count": null,
|
| 47 |
+
"outputs": []
|
| 48 |
+
},
|
| 49 |
{
|
| 50 |
"cell_type": "markdown",
|
| 51 |
"id": "cell-1",
|
|
|
|
| 435 |
},
|
| 436 |
"nbformat": 4,
|
| 437 |
"nbformat_minor": 5
|
| 438 |
+
}
|
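The Quick Start cell added to notebook 07 says `ner_extract` returns highlighted spans together with a Markdown summary table. The exact table layout produced by `core/ner.py` is not shown in this diff. A minimal sketch of turning `(text, label)` pairs into such a table might look like the following, where the helper name `entities_to_markdown`, the column headers, and the hand-written entity list are all assumptions rather than model output.

```python
def entities_to_markdown(entities):
    """Render (text, label) pairs as a two-column Markdown table."""
    lines = ["| Entity | Type |", "| --- | --- |"]
    for text, label in entities:
        safe_text = text.replace("|", "\\|")  # keep pipes from breaking cells
        lines.append(f"| {safe_text} | {label} |")
    return "\n".join(lines)


# Usage: entities written by hand for the sentence used in the notebook cell.
example_entities = [("Mario Rossi", "PER"), ("Roma", "LOC")]
table = entities_to_markdown(example_entities)
```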