grapholab / docs /NOTEBOOKS_GUIDE.md
Fabio Antonini
docs: update all YOLOv8 references to Conditional DETR
e3ef569

A newer version of the Gradio SDK is available: 6.14.0

Upgrade

GraphoLab — Notebooks Guide

A practical guide to the GraphoLab demo labs on AI-assisted forensic graphology (8 labs).


Overview

Forensic graphology is the scientific examination of handwriting and signatures to support legal investigations. AI and machine learning can automate and scale several tasks that experts traditionally perform manually:

Task AI Approach Forensic Application
Handwritten text transcription Transformer OCR (TrOCR / EasyOCR) Anonymous letters, historical documents
Signature authenticity Siamese Neural Network (SigNet) Checks, contracts, wills
Signature location in documents Object Detection (Conditional DETR) Document processing pipelines
Writer identification Feature extraction + classifier Disputed authorship
Graphological feature analysis OpenCV + ML Profiling, comparative analysis
Named entity recognition Token classification (BERT-NER) People, places, orgs in documents
Deep OCR (Italian cursive) Vision-Language Model (dots.ocr 1.7B) Wills, cursive handwriting, complex layouts

Lab 01 — Introduction: AI and Forensic Graphology

File: notebooks/01_intro_forensic_graphology.ipynb

A presentation-style notebook with no executable code. Covers:

  • What is forensic graphology and why it matters
  • The limits of purely manual analysis
  • How AI/ML can augment expert analysis
  • Concept map of the full pipeline: acquisition → preprocessing → AI analysis → forensic report
  • Overview of all subsequent labs

Prerequisites: None. Audience: All levels, including non-technical stakeholders.


Lab 02 — Handwritten Text Recognition (HTR/OCR)

File: notebooks/02_handwritten_ocr_trocr.ipynb

Uses TrOCR (microsoft/trocr-base-handwritten on Hugging Face) to automatically transcribe handwritten text from images.

What you will learn:

  • How Transformer-based OCR works (BEiT vision encoder + RoBERTa text decoder)
  • How to load and run a pre-trained HTR model via the transformers library
  • How to preprocess handwritten images for the model
  • How to interpret and visualise the transcription output

Demo flow:

  1. Load a sample handwritten image from data/samples/
  2. Run TrOCR inference
  3. Display the original image alongside the transcribed text
  4. (Optional) Compare against a ground-truth transcript and compute CER (Character Error Rate)

Forensic use cases:

  • Automatic transcription of anonymous threatening letters
  • Digitisation of historical handwritten court documents
  • Pre-processing step for author identification pipelines

Prerequisites: transformers, torch, Pillow

EasyOCR vs TrOCR — Architecture Comparison

The Gradio app uses EasyOCR instead of TrOCR for interactive inference. EasyOCR is not a Transformer — it uses a classic CNN + RNN pipeline:

EasyOCR TrOCR
Architecture CNN + BiLSTM + CTC Transformer encoder-decoder
Text detector CRAFT (CNN heatmap) — (line-level input)
Visual encoder VGG-based CNN BEiT (Vision Transformer)
Decoder Bidirectional LSTM + CTC RoBERTa
Linguistic context Limited (local LSTM) Broad (global self-attention)
CPU speed ~1–3 s/image ~10–20 s/image
Italian cursive quality Adequate Better
RAM footprint ~200 MB ~400 MB

When to use which: EasyOCR is the right choice for interactive demos (fast, low memory). TrOCR — and especially dots.ocr (Lab 08) — are preferable when transcription accuracy on complex Italian cursive is the priority.


Lab 03 — Signature Authenticity Verification

File: notebooks/03_signature_verification_siamese.ipynb

Uses a Siamese Neural Network (SigNet architecture) to compare two signature images and determine whether the questioned signature is genuine or forged.

What you will learn:

  • The Siamese network paradigm for one-shot similarity learning
  • How SigNet encodes signature images into feature vectors
  • How to compute a similarity (or distance) score between two signatures
  • Setting a decision threshold for genuine / forged classification

Demo flow:

  1. Load a reference signature and a questioned signature from data/samples/
  2. Extract feature embeddings with the SigNet encoder
  3. Compute cosine similarity score
  4. Display: genuine / forged verdict + confidence score

Forensic use cases:

  • Verification of signatures on bank cheques
  • Authentication of signatures on contracts, wills, and legal deeds
  • Detection of traced or digitally reproduced signatures

Prerequisites: torch, scikit-image, Pillow

Note: Pre-trained SigNet weights (models/signet.pth) were trained on the GPDS signature dataset. Demo sample pairs are sourced from the CEDAR signature database (data/samples/genuine_N_M.png / forged_N_M.png) and pre-selected so the model correctly detects the forgery.

Reference: luizgh/sigver — SigNet implementation and pre-trained weights.

Generalisation to signers not in the training set

SigNet uses metric learning (Siamese network with contrastive loss), not a standard classifier. This is a key architectural distinction:

  • A classifier learns "who is person X" — it cannot generalise to unseen identities.
  • A Siamese/metric model learns "do these two signatures come from the same hand?" — the question is identity-agnostic and generalises to any signer, including those never seen during training.

In practice this means SigNet can be applied to any pair of signatures, regardless of whether the signer appears in GPDS or CEDAR.

Known limitations:

Limitation Detail
Cultural/stylistic bias GPDS contains primarily Brazilian/Portuguese signatures; Italian or non-Latin signatures may fall outside the training distribution
Threshold calibration The 0.35 cosine-distance threshold was optimised on CEDAR; for signers with unusually compact or elaborate styles it may need adjustment
Forgery type Trained on skilled forgeries (forger practised the target signature); performance on naive forgeries is generally better, on professional forgeries comparable to a human expert
Image quality Performance degrades with photos of documents on coloured, folded, or stamp-overlaid paper

Practical guidance: Use SigNet as a first-level screening tool. It is not certified for standalone forensic legal testimony; results should always be reviewed by a qualified examiner.


Lab 04 — Signature Detection in Documents (Conditional DETR)

File: notebooks/04_signature_detection_detr.ipynb

Uses a Conditional DETR model fine-tuned for signature detection (tech4humans/conditional-detr-50-signature-detector on Hugging Face) to automatically locate signatures within scanned documents.

What you will learn:

  • How Conditional DETR object detection works in the context of document analysis
  • How to load a Hugging Face model with the transformers library
  • How to run inference on document images and parse bounding box results
  • How to visualise detected regions and crop them for downstream processing

Demo flow:

  1. Load a scanned document image from data/samples/
  2. Run Conditional DETR inference
  3. Draw bounding boxes around detected signatures
  4. Crop and save each detected signature for use in Lab 03

Forensic use cases:

  • Automated extraction of signatures from multi-page legal documents
  • First step in a pipeline: detect → extract → verify
  • Screening large document archives for signature presence

Prerequisites: transformers, timm, opencv-python, Pillow


Lab 05 — Writer Identification

File: notebooks/05_writer_identification.ipynb

Compares the stylistic characteristics of an anonymous handwritten sample against a set of known reference samples to attribute authorship.

What you will learn:

  • How to extract handwriting style features: HOG (Histogram of Oriented Gradients), LBP (Local Binary Patterns), and horizontal/vertical run-length statistics
  • How to build an SVM-based writer identification pipeline with scikit-learn (StandardScaler + SVC with RBF kernel)
  • How to evaluate identification accuracy using cross-validation
  • How to present results as a ranked list of candidate authors with probability scores

Demo flow:

  1. Load the reference sample database from data/samples/writer_XX/ (five writers, 41 samples each)
  2. Extract HOG + LBP + run-length features from each sample
  3. Train an SVM classifier (C=10, gamma="scale", probability=True)
  4. Upload an anonymous sample → ranked candidate list with probability scores

Forensic use cases:

  • Attribution of anonymous threatening letters
  • Verification of disputed document authorship
  • Historical document provenance research

Prerequisites: scikit-learn, scikit-image, Pillow, numpy

Dataset note: The demo uses a synthetic handwriting database in data/samples/writer_XX/ (five writers, 41 samples each) generated with system TTF fonts (Ink Free, Lucida Handwriting, Segoe Print, Segoe Script, Comic Sans) to ensure distinct, reproducible styles. For production use, replace these with real handwriting scans. The IAM Handwriting Database is the standard forensic benchmark.


Lab 06 — Graphological Feature Analysis

File: notebooks/06_graphological_feature_analysis.ipynb

Automatically extracts and visualises graphological features from a handwritten text sample using classical computer vision and signal processing.

What you will learn:

  • How to segment handwriting into words and characters using OpenCV
  • How to measure: letter slant angle, word/character spacing, letter height and width, stroke pressure (pixel intensity distribution)
  • How to build a visual dashboard of graphological metrics
  • How to compare two samples and highlight differences

Demo flow:

  1. Load a handwritten text image from data/samples/
  2. Preprocess (binarise, denoise, deskew)
  3. Segment into lines, words, and characters
  4. Compute and display metrics with annotated visualisations
  5. (Optional) Compare two samples side-by-side

Forensic use cases:

  • Supporting expert witness testimony with objective, reproducible measurements
  • Detecting stress indicators in handwriting (pen pressure variation)
  • Comparative analysis between a reference sample and a disputed document

Prerequisites: opencv-python, numpy, matplotlib, scipy


Lab 07 — Named Entity Recognition (NER)

File: notebooks/07_named_entity_recognition.ipynb

Uses a multilingual BERT-NER model (Babelscape/wikineural-multilingual-ner) to automatically extract named entities — persons, organisations, and locations — from text. Ideal as a second step after HTR transcription.

What you will learn:

  • How BERT-based token classification works (BIO tagging scheme)
  • How to load and run a multilingual NER pipeline via transformers
  • How to visualise entity spans with colour-coded inline highlighting
  • How to build a complete HTR → NER pipeline for handwritten document analysis

Demo flow:

  1. Run NER on an Italian legal text (e.g. a will or declaration)
  2. Run NER on an English text (multilingual support)
  3. Full pipeline: load a handwritten image → TrOCR transcription → NER entity extraction
  4. Analyse entity distribution and confidence scores

Forensic use cases:

  • Automatically identify persons, places, and organisations mentioned in handwritten documents
  • Screen anonymous letters for proper nouns (names, addresses)
  • Build a relationship graph between entities across a document corpus

Prerequisites: transformers, torch, opencv-python, Pillow, matplotlib


Lab 08 — dots.ocr: OCR with Vision-Language Model

File: notebooks/08_dots_ocr_vlm.ipynb

Uses dots.ocr (rednote-hilab/dots.ocr) — a 1.7B Vision-Language Model — to transcribe handwritten text from document images. Unlike CNN-based OCR, the LLM component uses linguistic context to correct visual ambiguities, making it particularly effective on Italian cursive.

What you will learn:

  • How VLM-based OCR differs from TrOCR and EasyOCR (architecture comparison table)
  • How to perform a hardware check and select the right inference configuration (CPU / GPU)
  • How to load and run dots.ocr via transformers with adaptive dtype and attention
  • How to measure and compare transcription quality (CER) between EasyOCR and dots.ocr

Demo flow:

  1. Hardware check — detects GPU/RAM and selects CPU fp32 or CUDA bf16 automatically
  2. Load dots.ocr from Hugging Face (~3.5 GB bf16, ~7 GB fp32)
  3. Transcribe a writer_00 sample (single 320×140 image)
  4. Transcribe the full testamento_writer00.png document
  5. Transcribe Lorella real-world handwriting samples
  6. Side-by-side EasyOCR vs dots.ocr comparison + CER measurement

Forensic use cases:

  • High-quality OCR on Italian cursive handwriting
  • Documents with tables, formulas, or complex layouts
  • When EasyOCR results are insufficient for forensic accuracy requirements

Prerequisites: transformers>=4.49, qwen_vl_utils, torch, Pillow, psutil, accelerate

Hardware note: On CPU (~7 GB free RAM), inference takes 2–5 min per image. On GPU ≥8 GB VRAM, 5–10 seconds per image. Not suitable for interactive real-time demos — use EasyOCR in the Gradio app.

Installation (one-time — see notebook cell):

git clone https://github.com/rednote-hilab/dots.ocr.git DotsOCR
pip install -e DotsOCR
pip install qwen_vl_utils accelerate

Reference: arxiv 2512.02498 — RedNote / Xiaohongshu, Dec 2024.


Interactive Demo (Gradio)

File: app/grapholab_demo.py

A browser-based multi-tab Gradio application (fully in Italian) aggregating nine AI capabilities:

Tab Functionality
OCR Manoscritto Upload an image (single or multi-line) → transcribed text (EasyOCR)
Verifica Firma Upload two signatures → genuine / forged verdict
Rilevamento Firma Upload a document → annotated image with detected signatures
Riconoscimento Entità Enter text → colour-coded named entities + summary table
Identificazione Scrittore Upload a handwriting sample → ranked candidate authors with probability scores
Analisi Grafologica Upload handwritten text → visual metrics dashboard
Perizia Forense Automatica Upload a document (+optional reference signature) → full forensic report (7 steps: detection, HTR, NER, writer ID, graphology, signature verification, LLM synthesis via Ollama)
Datazione Documenti Upload multiple document images → chronological ordering by extracted date (EasyOCR + dateparser)
Consulente Forense IA RAG chatbot: upload PDF/DOCX to enrich the knowledge base, then ask questions answered by a local Ollama LLM

Running locally:

# Create and activate a virtual environment (recommended)
python -m venv .venv

# Windows
.venv\Scripts\activate
# macOS / Linux
source .venv/bin/activate

pip install -r requirements.txt
python app/grapholab_demo.py
# Open http://localhost:7860

Running with Docker

# JupyterLab at http://localhost:8888  (token: grapholab)
docker compose up jupyter

# Gradio demo at http://localhost:7860
docker compose up gradio

# Both services together
docker compose up

The Hugging Face model cache is stored in a named Docker volume (grapholab-hf-cache) and shared between both services. Models are downloaded only once.


Sample Data

Demo images are provided in data/samples/:

File Used in
handwritten_text_01.png Labs 02, 06, 07 (single-line HTR demo)
handwritten_multiline_01.png Labs 02, 07 (multi-line HTR + NER pipeline)
genuine_N_M.png Lab 03 (CEDAR reference signatures, N=writer, M=sample)
forged_N_M.png Lab 03 (CEDAR forgeries, pre-selected for model detectability)
document_with_signature_*.png Lab 04

Signature samples (genuine_*, forged_*) are sourced from the CEDAR signature database and pre-selected so that SigNet correctly classifies the forgery. You can replace or supplement these with your own images to experiment with real-world cases.