<!-- translation: machine + human review pending -->
<!-- canonical: docs/tutorials/reading-a-report.md (FR) -->
# Reading a Picarones report
> 🇫🇷 [Version française](reading-a-report.md)
This guide explains how to read a Picarones HTML report: the
self-contained file produced by `picarones report --output report.html`.
It is the primary deliverable of a benchmark and is intended to be
read both by engineers and by domain experts (archivists,
paleographers, project managers).
## Anatomy
A report is structured as **5 main views** (tabs in the navigation):
1. **Ranking**: sortable table of all engines with CER, WER, MER,
WIL, ligature/diacritic scores, anchor score, etc.
2. **Gallery**: grid view of all documents with color-coded CER
badges per engine.
3. **Document**: per-document detail with synchronized N-way diff
between ground truth and each engine output.
4. **Analyses**: statistical charts such as CER histogram, radar chart,
correlation plots, calibration diagrams, Pareto front, etc.
5. **Characters**: Unicode confusion matrix and ligature analysis.
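
CER, the headline metric of the Ranking view, is conventionally the character-level edit distance between the ground truth and the engine output, divided by the length of the ground truth. The sketch below illustrates that textbook definition only; the function names are illustrative, no text normalization is applied, and Picarones' own computation may differ in those details.

```python
def levenshtein(ref: str, hyp: str) -> int:
    """Minimum number of character insertions, deletions and substitutions."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # delete a reference character
                curr[j - 1] + 1,     # insert a hypothesis character
                prev[j - 1] + cost,  # substitute, or keep on a match
            ))
        prev = curr
    return prev[-1]


def cer(ref: str, hyp: str) -> float:
    """Character Error Rate: edit distance over reference length."""
    if not ref:
        raise ValueError("reference must not be empty")
    return levenshtein(ref, hyp) / len(ref)


# One missing character out of seven in the ground truth.
print(round(cer("maistre", "maitre"), 3))  # 0.143
```

Because the denominator is the reference length, CER can exceed 1.0 when an engine outputs far more text than the ground truth contains.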
Above the tabs, you'll find:
- The **factual narrative synthesis** (Sprint 19): 3–5 sentences
summarizing the salient facts (global leader, statistical ties,
outliers, regression flags). Every number cited in the synthesis
is traceable to the underlying JSON data: no LLM, no
hallucination risk.
- The **Critical Difference Diagram** (Sprint 18, Demšar 2006):
visual representation of which engines are statistically
indistinguishable.
- The **Pareto front** (Sprint 20): cost vs CER trade-off analysis.
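
To make the Pareto front concrete: an engine belongs to the front when no other engine is at least as cheap and at least as accurate while being strictly better on one of the two axes. The sketch below applies that standard definition to made-up numbers; the engine names, the cost unit and the `EnginePoint` type are assumptions for illustration, not Picarones data structures.

```python
from typing import NamedTuple


class EnginePoint(NamedTuple):
    name: str
    cost: float  # hypothetical unit, e.g. price per 1,000 pages
    cer: float   # lower is better


def pareto_front(points: list[EnginePoint]) -> list[EnginePoint]:
    """Return the non-dominated engines, ordered by increasing cost."""
    front = []
    for p in points:
        dominated = any(
            q.cost <= p.cost and q.cer <= p.cer
            and (q.cost < p.cost or q.cer < p.cer)
            for q in points
        )
        if not dominated:
            front.append(p)
    return sorted(front, key=lambda p: p.cost)


# Illustrative values only.
engines = [
    EnginePoint("engine_a", cost=0.0, cer=0.08),
    EnginePoint("engine_b", cost=2.0, cer=0.05),
    EnginePoint("engine_c", cost=3.0, cer=0.06),   # dominated by engine_b
    EnginePoint("engine_d", cost=10.0, cer=0.02),
]

for point in pareto_front(engines):
    print(point.name, point.cost, point.cer)
```

Here `engine_c` drops out because `engine_b` is both cheaper and more accurate; the three remaining engines each represent a defensible trade-off.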
## Suggested reading order
1. **Read the synthesis at the top** (3–5 sentences): it points
to the salient facts.
2. **Look at the Critical Difference Diagram (CDD)**: if all engines are
connected by a single horizontal bar, your corpus does not discriminate
them sufficiently; enlarge the corpus or make it more homogeneous.
3. **Open the ranking** sorted by median CER (default since
Sprint 44). Identify the leader and the gap to second place.
4. **Switch to Gallery** and click on the "Worst cases" filter to
see what specifically goes wrong.
5. **For an OCR+LLM pipeline**: open Document view and toggle the
triple diff (GT / raw OCR / post-correction).
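
Step 3 can be reproduced by hand if you have per-document CER values. The sketch below ranks engines by median CER and computes the gap between the leader and the runner-up; the engine names and values are invented for illustration, and the data layout is an assumption rather than the structure of a Picarones results file.

```python
from statistics import mean, median

# Hypothetical per-document CER values for three engines.
per_document_cer = {
    "engine_a": [0.02, 0.03, 0.04, 0.50],  # one badly failed page
    "engine_b": [0.05, 0.05, 0.06, 0.07],
    "engine_c": [0.10, 0.12, 0.09, 0.11],
}


def rank_by_median_cer(scores: dict[str, list[float]]) -> list[tuple[str, float]]:
    """Return (engine, median CER) pairs, best engine first."""
    ranked = ((name, median(values)) for name, values in scores.items())
    return sorted(ranked, key=lambda item: item[1])


ranking = rank_by_median_cer(per_document_cer)
leader, runner_up = ranking[0], ranking[1]
gap = runner_up[1] - leader[1]

print("leader:", leader[0])
print("gap to second place:", round(gap, 3))
print("mean CER of leader:", round(mean(per_document_cer[leader[0]]), 4))
```

With these numbers `engine_a` leads on the median even though a single failed page pushes its mean above that of `engine_b`. A median is less sensitive to such outliers, which is why the "Worst cases" filter in step 4 is still worth checking.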
## Side panels
Two side panels enrich the report:
- **Glossary** (`?` icon next to each metric): definition, what
it measures, usage, limits, primary reference. 25 bilingual
entries, opened by clicking `?`.
- **Advanced mode** (`⚙` button in nav): visible columns picker,
per-stratum filters (script type), opt-in personal composite
score with explicit "no universal weighting" warning.
All settings are URL-stateful (shareable).
## Export
A "⬇ CSV" button in the navigation exports the current view (with
all customization filters applied) to CSV for Excel/LibreOffice.
JSON, ALTO XML and PAGE XML exports are available via CLI flags
on `picarones run` and `picarones report`.
## `--lazy-images` mode for large corpora
Sprint A5 (item M-16). By default, the HTML report is a **single,
portable file**: all images are embedded as base64 within the
HTML. This is convenient for sharing by email, but the file becomes
heavy beyond ~50 documents:
| Corpus size | Inline HTML | Lazy HTML |
|---|---|---|
| 10 docs | ~5 MB | ~3 MB + ~2 MB assets |
| 50 docs | ~50 MB | ~3 MB + ~10 MB assets |
| 500 docs | ~250 MB (slow to load) | ~3 MB + ~100 MB lazy-loaded |
For digital libraries benchmarking thousands of documents, enable
the lazy mode:
```bash
picarones report --results results.json --output report.html --lazy-images
```
The report stays **self-contained** as long as you copy `report.html`
and the `report-assets/` folder together, side by side. Images are
referenced by relative path and loaded by the browser on demand
(HTML5 `loading="lazy"` attribute).
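
When archiving or sharing a lazy report, the HTML file and its assets folder have to travel together. The helper below is a hypothetical convenience script, not part of Picarones; it assumes the assets folder is named `report-assets` and sits next to the HTML file, as described above.

```python
import shutil
from pathlib import Path


def copy_lazy_report(html_path, dest_dir, assets_dirname="report-assets"):
    """Copy a lazy-mode report and its sibling assets folder into dest_dir.

    Hypothetical helper: the folder name is taken from this guide and
    can be overridden if your report uses a different one.
    """
    html_path = Path(html_path)
    dest_dir = Path(dest_dir)
    assets_src = html_path.parent / assets_dirname

    # Fail early rather than ship an HTML file with broken image links.
    if not assets_src.is_dir():
        raise FileNotFoundError(f"missing assets folder: {assets_src}")

    dest_dir.mkdir(parents=True, exist_ok=True)
    copied_html = Path(shutil.copy2(html_path, dest_dir / html_path.name))
    # Keep the same relative layout so the image paths still resolve.
    shutil.copytree(assets_src, dest_dir / assets_dirname, dirs_exist_ok=True)
    return copied_html
```

For example, `copy_lazy_report("out/report.html", "/mnt/share/benchmark")` would place `report.html` and `report-assets/` side by side in the destination folder (both paths are placeholders).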
## Further reading
- [Glossary] (embedded in report, accessible via `?` icons)
- [docs/explanation/narrative-engine.en.md](../developer/narrative-engine.en.md): adding a detector
- [docs/developer/extending-glossary.en.md](../developer/extending-glossary.en.md): enriching the glossary
- [SPECS.md](../../SPECS.md) β€” full project specifications