Spaces:

fabioantonini
/

grapholab

Running

App Files Files Community

grapholab / docs /NOTEBOOKS_GUIDE.md

Fabio Antonini

docs: update all YOLOv8 references to Conditional DETR

e3ef569 about 2 months ago

preview code

raw

history blame contribute delete

16.5 kB

A newer version of the Gradio SDK is available: 6.14.0

Upgrade

GraphoLab — Notebooks Guide

A practical guide to the GraphoLab demo labs on AI-assisted forensic graphology (8 labs).

Overview

Forensic graphology is the scientific examination of handwriting and signatures to support legal investigations. AI and machine learning can automate and scale several tasks that experts traditionally perform manually:

Task	AI Approach	Forensic Application
Handwritten text transcription	Transformer OCR (TrOCR / EasyOCR)	Anonymous letters, historical documents
Signature authenticity	Siamese Neural Network (SigNet)	Checks, contracts, wills
Signature location in documents	Object Detection (Conditional DETR)	Document processing pipelines
Writer identification	Feature extraction + classifier	Disputed authorship
Graphological feature analysis	OpenCV + ML	Profiling, comparative analysis
Named entity recognition	Token classification (BERT-NER)	People, places, orgs in documents
Deep OCR (Italian cursive)	Vision-Language Model (dots.ocr 1.7B)	Wills, cursive handwriting, complex layouts

Lab 01 — Introduction: AI and Forensic Graphology

File: notebooks/01_intro_forensic_graphology.ipynb

A presentation-style notebook with no executable code. Covers:

What is forensic graphology and why it matters
The limits of purely manual analysis
How AI/ML can augment expert analysis
Concept map of the full pipeline: acquisition → preprocessing → AI analysis → forensic report
Overview of all subsequent labs

Prerequisites: None. Audience: All levels, including non-technical stakeholders.

Lab 02 — Handwritten Text Recognition (HTR/OCR)

File: notebooks/02_handwritten_ocr_trocr.ipynb

Uses TrOCR (microsoft/trocr-base-handwritten on Hugging Face) to automatically transcribe handwritten text from images.

What you will learn:

How Transformer-based OCR works (BEiT vision encoder + RoBERTa text decoder)
How to load and run a pre-trained HTR model via the transformers library
How to preprocess handwritten images for the model
How to interpret and visualise the transcription output

Demo flow:

Load a sample handwritten image from data/samples/
Run TrOCR inference
Display the original image alongside the transcribed text
(Optional) Compare against a ground-truth transcript and compute CER (Character Error Rate)

Forensic use cases:

Automatic transcription of anonymous threatening letters
Digitisation of historical handwritten court documents
Pre-processing step for author identification pipelines

Prerequisites: transformers, torch, Pillow

EasyOCR vs TrOCR — Architecture Comparison

The Gradio app uses EasyOCR instead of TrOCR for interactive inference. EasyOCR is not a Transformer — it uses a classic CNN + RNN pipeline:

	EasyOCR	TrOCR
Architecture	CNN + BiLSTM + CTC	Transformer encoder-decoder
Text detector	CRAFT (CNN heatmap)	— (line-level input)
Visual encoder	VGG-based CNN	BEiT (Vision Transformer)
Decoder	Bidirectional LSTM + CTC	RoBERTa
Linguistic context	Limited (local LSTM)	Broad (global self-attention)
CPU speed	~1–3 s/image	~10–20 s/image
Italian cursive quality	Adequate	Better
RAM footprint	~200 MB	~400 MB

When to use which: EasyOCR is the right choice for interactive demos (fast, low memory). TrOCR — and especially dots.ocr (Lab 08) — are preferable when transcription accuracy on complex Italian cursive is the priority.

Lab 03 — Signature Authenticity Verification

File: notebooks/03_signature_verification_siamese.ipynb

Uses a Siamese Neural Network (SigNet architecture) to compare two signature images and determine whether the questioned signature is genuine or forged.

What you will learn:

The Siamese network paradigm for one-shot similarity learning
How SigNet encodes signature images into feature vectors
How to compute a similarity (or distance) score between two signatures
Setting a decision threshold for genuine / forged classification

Demo flow:

Load a reference signature and a questioned signature from data/samples/
Extract feature embeddings with the SigNet encoder
Compute cosine similarity score
Display: genuine / forged verdict + confidence score

Forensic use cases:

Verification of signatures on bank cheques
Authentication of signatures on contracts, wills, and legal deeds
Detection of traced or digitally reproduced signatures

Prerequisites: torch, scikit-image, Pillow

Note: Pre-trained SigNet weights (models/signet.pth) were trained on the GPDS signature dataset. Demo sample pairs are sourced from the CEDAR signature database (data/samples/genuine_N_M.png / forged_N_M.png) and pre-selected so the model correctly detects the forgery.

Reference: luizgh/sigver — SigNet implementation and pre-trained weights.

Generalisation to signers not in the training set

SigNet uses metric learning (Siamese network with contrastive loss), not a standard classifier. This is a key architectural distinction:

A classifier learns "who is person X" — it cannot generalise to unseen identities.
A Siamese/metric model learns "do these two signatures come from the same hand?" — the question is identity-agnostic and generalises to any signer, including those never seen during training.

In practice this means SigNet can be applied to any pair of signatures, regardless of whether the signer appears in GPDS or CEDAR.

Known limitations:

Limitation	Detail
Cultural/stylistic bias	GPDS contains primarily Brazilian/Portuguese signatures; Italian or non-Latin signatures may fall outside the training distribution
Threshold calibration	The 0.35 cosine-distance threshold was optimised on CEDAR; for signers with unusually compact or elaborate styles it may need adjustment
Forgery type	Trained on skilled forgeries (forger practised the target signature); performance on naive forgeries is generally better, on professional forgeries comparable to a human expert
Image quality	Performance degrades with photos of documents on coloured, folded, or stamp-overlaid paper

Practical guidance: Use SigNet as a first-level screening tool. It is not certified for standalone forensic legal testimony; results should always be reviewed by a qualified examiner.

Lab 04 — Signature Detection in Documents (Conditional DETR)

File: notebooks/04_signature_detection_detr.ipynb

Uses a Conditional DETR model fine-tuned for signature detection (tech4humans/conditional-detr-50-signature-detector on Hugging Face) to automatically locate signatures within scanned documents.

What you will learn:

How Conditional DETR object detection works in the context of document analysis
How to load a Hugging Face model with the transformers library
How to run inference on document images and parse bounding box results
How to visualise detected regions and crop them for downstream processing

Demo flow:

Load a scanned document image from data/samples/
Run Conditional DETR inference
Draw bounding boxes around detected signatures
Crop and save each detected signature for use in Lab 03

Forensic use cases:

Automated extraction of signatures from multi-page legal documents
First step in a pipeline: detect → extract → verify
Screening large document archives for signature presence

Prerequisites: transformers, timm, opencv-python, Pillow

Lab 05 — Writer Identification

File: notebooks/05_writer_identification.ipynb

Compares the stylistic characteristics of an anonymous handwritten sample against a set of known reference samples to attribute authorship.

What you will learn:

How to extract handwriting style features: HOG (Histogram of Oriented Gradients), LBP (Local Binary Patterns), and horizontal/vertical run-length statistics
How to build an SVM-based writer identification pipeline with scikit-learn (StandardScaler + SVC with RBF kernel)
How to evaluate identification accuracy using cross-validation
How to present results as a ranked list of candidate authors with probability scores

Demo flow:

Load the reference sample database from data/samples/writer_XX/ (five writers, 41 samples each)
Extract HOG + LBP + run-length features from each sample
Train an SVM classifier (C=10, gamma="scale", probability=True)
Upload an anonymous sample → ranked candidate list with probability scores

Forensic use cases:

Attribution of anonymous threatening letters
Verification of disputed document authorship
Historical document provenance research

Prerequisites: scikit-learn, scikit-image, Pillow, numpy

Dataset note: The demo uses a synthetic handwriting database in data/samples/writer_XX/ (five writers, 41 samples each) generated with system TTF fonts (Ink Free, Lucida Handwriting, Segoe Print, Segoe Script, Comic Sans) to ensure distinct, reproducible styles. For production use, replace these with real handwriting scans. The IAM Handwriting Database is the standard forensic benchmark.

Lab 06 — Graphological Feature Analysis

File: notebooks/06_graphological_feature_analysis.ipynb

Automatically extracts and visualises graphological features from a handwritten text sample using classical computer vision and signal processing.

What you will learn:

How to segment handwriting into words and characters using OpenCV
How to measure: letter slant angle, word/character spacing, letter height and width, stroke pressure (pixel intensity distribution)
How to build a visual dashboard of graphological metrics
How to compare two samples and highlight differences

Demo flow:

Load a handwritten text image from data/samples/
Preprocess (binarise, denoise, deskew)
Segment into lines, words, and characters
Compute and display metrics with annotated visualisations
(Optional) Compare two samples side-by-side

Forensic use cases:

Supporting expert witness testimony with objective, reproducible measurements
Detecting stress indicators in handwriting (pen pressure variation)
Comparative analysis between a reference sample and a disputed document

Prerequisites: opencv-python, numpy, matplotlib, scipy

Lab 07 — Named Entity Recognition (NER)

File: notebooks/07_named_entity_recognition.ipynb

Uses a multilingual BERT-NER model (Babelscape/wikineural-multilingual-ner) to automatically extract named entities — persons, organisations, and locations — from text. Ideal as a second step after HTR transcription.

What you will learn:

How BERT-based token classification works (BIO tagging scheme)
How to load and run a multilingual NER pipeline via transformers
How to visualise entity spans with colour-coded inline highlighting
How to build a complete HTR → NER pipeline for handwritten document analysis

Demo flow:

Run NER on an Italian legal text (e.g. a will or declaration)
Run NER on an English text (multilingual support)
Full pipeline: load a handwritten image → TrOCR transcription → NER entity extraction
Analyse entity distribution and confidence scores

Forensic use cases:

Automatically identify persons, places, and organisations mentioned in handwritten documents
Screen anonymous letters for proper nouns (names, addresses)
Build a relationship graph between entities across a document corpus

Prerequisites: transformers, torch, opencv-python, Pillow, matplotlib

Lab 08 — dots.ocr: OCR with Vision-Language Model

File: notebooks/08_dots_ocr_vlm.ipynb

Uses dots.ocr (rednote-hilab/dots.ocr) — a 1.7B Vision-Language Model — to transcribe handwritten text from document images. Unlike CNN-based OCR, the LLM component uses linguistic context to correct visual ambiguities, making it particularly effective on Italian cursive.

What you will learn:

How VLM-based OCR differs from TrOCR and EasyOCR (architecture comparison table)
How to perform a hardware check and select the right inference configuration (CPU / GPU)
How to load and run dots.ocr via transformers with adaptive dtype and attention
How to measure and compare transcription quality (CER) between EasyOCR and dots.ocr

Demo flow:

Hardware check — detects GPU/RAM and selects CPU fp32 or CUDA bf16 automatically
Load dots.ocr from Hugging Face (~3.5 GB bf16, ~7 GB fp32)
Transcribe a writer_00 sample (single 320×140 image)
Transcribe the full testamento_writer00.png document
Transcribe Lorella real-world handwriting samples
Side-by-side EasyOCR vs dots.ocr comparison + CER measurement

Forensic use cases:

High-quality OCR on Italian cursive handwriting
Documents with tables, formulas, or complex layouts
When EasyOCR results are insufficient for forensic accuracy requirements

Prerequisites: transformers>=4.49, qwen_vl_utils, torch, Pillow, psutil, accelerate

Hardware note: On CPU (~7 GB free RAM), inference takes 2–5 min per image. On GPU ≥8 GB VRAM, 5–10 seconds per image. Not suitable for interactive real-time demos — use EasyOCR in the Gradio app.

Installation (one-time — see notebook cell):

git clone https://github.com/rednote-hilab/dots.ocr.git DotsOCR
pip install -e DotsOCR
pip install qwen_vl_utils accelerate

Reference: arxiv 2512.02498 — RedNote / Xiaohongshu, Dec 2024.

Interactive Demo (Gradio)

File: app/grapholab_demo.py

A browser-based multi-tab Gradio application (fully in Italian) aggregating nine AI capabilities:

Tab	Functionality
OCR Manoscritto	Upload an image (single or multi-line) → transcribed text (EasyOCR)
Verifica Firma	Upload two signatures → genuine / forged verdict
Rilevamento Firma	Upload a document → annotated image with detected signatures
Riconoscimento Entità	Enter text → colour-coded named entities + summary table
Identificazione Scrittore	Upload a handwriting sample → ranked candidate authors with probability scores
Analisi Grafologica	Upload handwritten text → visual metrics dashboard
Perizia Forense Automatica	Upload a document (+optional reference signature) → full forensic report (7 steps: detection, HTR, NER, writer ID, graphology, signature verification, LLM synthesis via Ollama)
Datazione Documenti	Upload multiple document images → chronological ordering by extracted date (EasyOCR + dateparser)
Consulente Forense IA	RAG chatbot: upload PDF/DOCX to enrich the knowledge base, then ask questions answered by a local Ollama LLM

Running locally:

# Create and activate a virtual environment (recommended)
python -m venv .venv

# Windows
.venv\Scripts\activate
# macOS / Linux
source .venv/bin/activate

pip install -r requirements.txt
python app/grapholab_demo.py
# Open http://localhost:7860

Running with Docker

# JupyterLab at http://localhost:8888  (token: grapholab)
docker compose up jupyter

# Gradio demo at http://localhost:7860
docker compose up gradio

# Both services together
docker compose up

The Hugging Face model cache is stored in a named Docker volume (grapholab-hf-cache) and shared between both services. Models are downloaded only once.

Sample Data

Demo images are provided in data/samples/:

File	Used in
`handwritten_text_01.png`	Labs 02, 06, 07 (single-line HTR demo)
`handwritten_multiline_01.png`	Labs 02, 07 (multi-line HTR + NER pipeline)
`genuine_N_M.png`	Lab 03 (CEDAR reference signatures, N=writer, M=sample)
`forged_N_M.png`	Lab 03 (CEDAR forgeries, pre-selected for model detectability)
`document_with_signature_*.png`	Lab 04

Signature samples (genuine_*, forged_*) are sourced from the CEDAR signature database and pre-selected so that SigNet correctly classifies the forgery. You can replace or supplement these with your own images to experiment with real-world cases.