feat(migration): Lot D, measurements/{34 flat shims} → evaluation/metrics/
Largest lot in the legacy-retirement "battle plan". 34
``measurements/X.py`` files that did nothing but re-export
``picarones.evaluation.metrics.X.*`` were **removed in bulk** once
all their callers (tests and production) had been migrated to the
canonical paths.
Exhaustive list of the removed shims (34): ``baseline_comparison``,
``calibration``, ``char_scores``, ``confusion``, ``cost_projection``,
``difficulty``, ``error_absorption``, ``hallucination``,
``image_predictive``, ``image_quality``, ``incremental_comparison``,
``inter_engine``, ``layout``, ``levers``, ``lexical_modernization``,
``line_metrics``, ``longitudinal``, ``marginal_cost``,
``module_policy``, ``ner_backends``, ``normalization``,
``numerical_sequences``, ``pricing``, ``rare_tokens``,
``robustness_projection``, ``roman_numerals``, ``specialization``,
``structure``, ``taxonomy``, ``taxonomy_comparison``,
``taxonomy_cooccurrence``, ``taxonomy_intra_doc``, ``throughput``,
``worst_lines``.
Test imports migrated
---------------------
36 test files, ~100 import statements:
- ``from picarones.measurements.X import …``
→ ``from picarones.evaluation.metrics.X import …``
This includes inline imports (inside ``def test_*``) and mock
patches such as ``patch("picarones.measurements.confusion.X")``,
rewritten as ``patch("picarones.evaluation.metrics.confusion.X")``.
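The rewrite described above is a purely textual substitution on dotted paths, so it can be sketched with a regular expression. This is a minimal illustration, not the tool actually used for the migration: the function name `rewrite_legacy_paths` is hypothetical, and the sketch only handles the dotted form ``picarones.measurements.X`` (it would not touch ``from picarones.measurements import X``).

```python
import re

# The 34 shim names listed in the commit message.
REMOVED_SHIMS = (
    "baseline_comparison", "calibration", "char_scores", "confusion",
    "cost_projection", "difficulty", "error_absorption", "hallucination",
    "image_predictive", "image_quality", "incremental_comparison",
    "inter_engine", "layout", "levers", "lexical_modernization",
    "line_metrics", "longitudinal", "marginal_cost", "module_policy",
    "ner_backends", "normalization", "numerical_sequences", "pricing",
    "rare_tokens", "robustness_projection", "roman_numerals",
    "specialization", "structure", "taxonomy", "taxonomy_comparison",
    "taxonomy_cooccurrence", "taxonomy_intra_doc", "throughput",
    "worst_lines",
)

# Longest names first so "taxonomy_comparison" is tried before "taxonomy".
# The trailing \b keeps retained modules such as
# "numerical_sequences_hooks" from matching.
_LEGACY_PATH = re.compile(
    r"\bpicarones\.measurements\.("
    + "|".join(sorted(REMOVED_SHIMS, key=len, reverse=True))
    + r")\b"
)


def rewrite_legacy_paths(source: str) -> str:
    """Rewrite dotted legacy paths (imports and patch targets) to canonical ones."""
    return _LEGACY_PATH.sub(r"picarones.evaluation.metrics.\1", source)
```

Because the same pattern matches inside string literals, one pass covers both ``from … import …`` statements and ``patch("…")`` targets. Modules that stay in ``measurements/`` (for example ``metrics``) are left alone.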
Production imports migrated
---------------------------
14 production files, ~44 import statements:
- ``picarones/fixtures.py``,
- ``picarones/measurements/{builtin_hooks, equivalence_profile,
metrics, runner/orchestration}.py``,
- ``picarones/reports_v2/html/renderers/{error_absorption,
image_predictive, incremental_comparison, longitudinal,
module_audit, robustness_projection, throughput}.py``,
- ``picarones/web/{benchmark_utils, routers/normalization}.py``.
``picarones/measurements/__init__.py``
--------------------------------------
Rewritten to reflect the new composition:
- The list of retained modules (Category B/C/D) is spelled out
in the docstring.
- A « Modules retirés (Lot D, mai 2026) » section ("Modules
removed, Lot D, May 2026") lists the 34 removed shims.
- The imports of the removed shims are replaced by
``import picarones.evaluation.metrics # noqa: F401``:
that single line is enough to trigger every
``@register_metric`` decorator in the canonical package.
- A re-export of ``roman_numerals`` from the canonical package
remains for older internal callers (to be removed in the
next Lot once they have migrated).
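The point about a single import triggering every ``@register_metric`` decorator relies on decorators running when a module body is executed. The sketch below illustrates the mechanism with a toy registry. The registry, the decorator signature and the ``exact_match`` metric are all hypothetical stand-ins, not the Picarones implementation.

```python
from typing import Callable, Dict

# Toy registry; the real Picarones registry and decorator signature may differ.
METRIC_REGISTRY: Dict[str, Callable[[str, str], float]] = {}


def register_metric(name: str):
    """Hypothetical decorator: store the function under `name` at definition time."""
    def decorator(func: Callable[[str, str], float]) -> Callable[[str, str], float]:
        METRIC_REGISTRY[name] = func
        return func
    return decorator


def import_metrics_package() -> None:
    """Stand-in for `import picarones.evaluation.metrics`.

    A real import executes the module body once; the body is wrapped in a
    function here so the registry can be inspected before and after.
    """
    @register_metric("exact_match")
    def exact_match(reference: str, hypothesis: str) -> float:
        return 1.0 if reference == hypothesis else 0.0


registered_before = sorted(METRIC_REGISTRY)  # nothing registered yet
import_metrics_package()
registered_after = sorted(METRIC_REGISTRY)   # the decorator has now run
```

This is also why the import carries ``# noqa: F401``: the name is never used, only the side effect of executing the module matters.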
Architecture tests
------------------
- ``test_no_flat_files_in_measurements::WHITELIST_FLAT_FILES_S3``
reduced from 60 to 25 entries (the 34 removed shims
plus ``metrics.py``, rewritten as a non-shim).
- ``test_module_coverage::TEST_ONLY_BASELINE`` reduced from
16 to 4 entries (removed modules can no longer be
"test-only").
- ``test_file_budgets::FILE_BUDGETS`` cleared of orphaned
entries (``inter_engine``, ``levers``, ``normalization``).
- ``test_doc_paths::BROKEN_PATHS_BASELINE`` 83 → 88. Five
new broken paths, all in legacy docs: 4 in
``docs/audits/*.md`` and 1 in ``docs/roadmap/evolution-2026.md``.
The active docs ``CLAUDE.md``, ``README.md`` and ``SPECS.md``
were corrected to point to ``picarones/formats/text/normalization.py``.
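A whitelist-style architecture test of the kind listed above can be sketched as two set differences: files present but not whitelisted, and whitelist entries whose file no longer exists (the "orphaned entries" case). The helper names and the temporary directory layout are hypothetical; the actual Picarones tests are not shown in this commit.

```python
import tempfile
from pathlib import Path


def unexpected_flat_files(package_dir: Path, whitelist: set[str]) -> set[str]:
    """Top-level ``*.py`` files of ``package_dir`` that are not whitelisted."""
    present = {p.name for p in package_dir.glob("*.py")}
    return present - whitelist


def orphaned_entries(package_dir: Path, whitelist: set[str]) -> set[str]:
    """Whitelist entries whose file no longer exists in ``package_dir``."""
    present = {p.name for p in package_dir.glob("*.py")}
    return whitelist - present


with tempfile.TemporaryDirectory() as tmp:
    pkg = Path(tmp)
    for name in ("__init__.py", "metrics.py", "confusion.py"):
        (pkg / name).write_text("", encoding="utf-8")

    whitelist = {"__init__.py", "metrics.py", "levers.py"}
    # confusion.py is present but not whitelisted; levers.py is whitelisted
    # but absent, so it should be dropped from the whitelist.
    extra_before = unexpected_flat_files(pkg, whitelist)
    orphans_before = orphaned_entries(pkg, whitelist)

    # Simulate the clean-up: delete the shim and drop the orphaned entry.
    (pkg / "confusion.py").unlink()
    whitelist.discard("levers.py")
    extra_after = unexpected_flat_files(pkg, whitelist)
    orphans_after = orphaned_entries(pkg, whitelist)
```

Checking both directions keeps the whitelist from silently accumulating entries for files that were deleted in an earlier lot.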
Regressions detected and fixed
------------------------------
- ``tests/integration/test_sprint13_parallelisation_stats.py::
TestRunnerSilentExceptions::test_confusion_failure_logs_warning``
used ``patch("picarones.measurements.confusion.build_confusion_matrix")``,
which no longer resolves now that the shim is gone. Updated
to ``picarones.evaluation.metrics.confusion``.
- ``tests/measurements/test_sprint40_ner_runner.py``:
``caplog.at_level(logger="picarones.measurements.ner_backends")``
rewritten to ``picarones.evaluation.metrics.ner_backends``.
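Both regressions come from strings that name a module: a ``patch`` target has to be importable, and a logger is looked up by its dotted name. The sketch below reproduces the two effects with a throwaway package built in memory. ``demo_pkg``, ``demo_pkg.canonical`` and ``demo_pkg.legacy`` are hypothetical names standing in for the canonical and removed modules.

```python
import logging
import sys
import types
from unittest.mock import patch

# Hypothetical stand-in package: only the canonical module exists,
# mirroring the tree after the shim was deleted.
pkg = types.ModuleType("demo_pkg")
canonical = types.ModuleType("demo_pkg.canonical")
canonical.build_confusion_matrix = lambda ref, hyp: {}
pkg.canonical = canonical
sys.modules["demo_pkg"] = pkg
sys.modules["demo_pkg.canonical"] = canonical


def patch_resolves(target: str) -> bool:
    """Return True if ``patch(target)`` can locate the object to replace."""
    try:
        with patch(target):
            return True
    except (ImportError, AttributeError):
        return False


canonical_ok = patch_resolves("demo_pkg.canonical.build_confusion_matrix")
legacy_ok = patch_resolves("demo_pkg.legacy.build_confusion_matrix")

# Loggers are keyed by name: a level set on the old name does not affect
# the logger that the moved module actually uses.
logging.getLogger("demo_pkg").setLevel(logging.WARNING)
old_logger = logging.getLogger("demo_pkg.legacy")
new_logger = logging.getLogger("demo_pkg.canonical")
old_logger.setLevel(logging.INFO)
info_enabled_via_old_name = new_logger.isEnabledFor(logging.INFO)
new_logger.setLevel(logging.INFO)
info_enabled_via_new_name = new_logger.isEnabledFor(logging.INFO)
```

The patch failure is loud (the test errors out), whereas the logger mismatch is silent: the level is set on a logger nobody writes to, so the captured records simply come back empty.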
README + CLAUDE.md sync
-----------------------
The test counter goes from 5080 to 5040 (-40 tests). The delta
reflects parametrization reductions indirectly tied to the
import restructuring, plus ``test_confusion_failure``,
which now passes.
Acceptance
----------
- ``pytest tests/architecture/``: 88 passed.
- ``pytest tests/``: no new regression vs Lot C
(the 91 failed + 89 errors, pre-existing and Jinja2-related,
are identical before and after Lot D; only
``test_confusion_failure_logs_warning`` was transiently
failing, and it is already fixed in place).
- ``ruff check picarones/ tests/``: All checks passed.
Next step (Lot E): migrate ~50 imports
``engines.* → adapters.legacy_engines.*`` and
``modules.alto_text_to_mono_region → adapters.legacy_modules.*``
(cf. SESSION_HANDOVER §4.D item 5).
https://claude.ai/code/session_011XQZNitg1rCgia8ZD1a2hP
- CLAUDE.md +4 -4
- README.md +2 -2
- SPECS.md +1 -1
- docs/migration/SESSION_HANDOVER.md +34 -11
- picarones/fixtures.py +14 -14
- picarones/measurements/__init__.py +61 -129
- picarones/measurements/baseline_comparison.py +0 -10
- picarones/measurements/builtin_hooks.py +15 -15
- picarones/measurements/calibration.py +0 -10
- picarones/measurements/char_scores.py +0 -34
- picarones/measurements/confusion.py +0 -10
- picarones/measurements/cost_projection.py +0 -26
- picarones/measurements/difficulty.py +0 -30
- picarones/measurements/equivalence_profile.py +1 -1
- picarones/measurements/error_absorption.py +0 -10
- picarones/measurements/hallucination.py +0 -10
- picarones/measurements/image_predictive.py +0 -10
- picarones/measurements/image_quality.py +0 -14
- picarones/measurements/incremental_comparison.py +0 -10
- picarones/measurements/inter_engine.py +0 -10
- picarones/measurements/layout.py +0 -14
- picarones/measurements/levers.py +0 -10
- picarones/measurements/lexical_modernization.py +0 -10
- picarones/measurements/line_metrics.py +0 -10
- picarones/measurements/longitudinal.py +0 -10
- picarones/measurements/marginal_cost.py +0 -10
- picarones/measurements/metrics.py +2 -2
- picarones/measurements/module_policy.py +0 -10
- picarones/measurements/ner_backends.py +0 -25
- picarones/measurements/normalization.py +0 -33
- picarones/measurements/numerical_sequences.py +0 -18
- picarones/measurements/pricing.py +0 -15
- picarones/measurements/rare_tokens.py +0 -10
- picarones/measurements/robustness_projection.py +0 -18
- picarones/measurements/roman_numerals.py +0 -18
- picarones/measurements/runner/orchestration.py +2 -2
- picarones/measurements/specialization.py +0 -25
- picarones/measurements/structure.py +0 -26
- picarones/measurements/taxonomy.py +0 -33
- picarones/measurements/taxonomy_comparison.py +0 -10
- picarones/measurements/taxonomy_cooccurrence.py +0 -10
- picarones/measurements/taxonomy_intra_doc.py +0 -23
- picarones/measurements/throughput.py +0 -10
- picarones/measurements/worst_lines.py +0 -10
- picarones/reports_v2/html/renderers/error_absorption.py +1 -1
- picarones/reports_v2/html/renderers/image_predictive.py +1 -1
- picarones/reports_v2/html/renderers/incremental_comparison.py +1 -1
- picarones/reports_v2/html/renderers/longitudinal.py +1 -1
- picarones/reports_v2/html/renderers/module_audit.py +1 -1
- picarones/reports_v2/html/renderers/robustness_projection.py +1 -1
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -118,7 +118,7 @@ picarones/
 
 ## État des tests et bugs historiques
 
-`pytest tests/` → **
+`pytest tests/` → **5040 passed, 12 skipped, 8 deselected, 0 failed**
 (post-S59). Les deselected sont les markers `live` (5 tests d'intégration
 contre vraie API/binaire) + `network` (3 tests qui hit le réseau réel),
 opt-in en local via `pytest -m live` ou `pytest -m network`. Le
@@ -156,7 +156,7 @@ correspondants (`test_sprint15_llm_pipeline_bugs.py`, `test_sprint8_escriptorium
 CI, Makefile et invocation directe produisent le même résultat. Le job
 `lint` du CI est bloquant — un F401 (import inutilisé) ou un E741
 (variable ambiguë) fait échouer la PR, par design.
-- **Les profils de normalisation** sont dans `picarones/
+- **Les profils de normalisation** sont dans `picarones/formats/text/normalization.py` — l'endpoint
 `/api/normalization/profiles` doit les lire dynamiquement depuis ce fichier, pas depuis une
 liste statique.
 
@@ -248,7 +248,7 @@ Résumé express :
 
 1. `git branch --show-current` → `claude/repo-analysis-cukvm`.
 2. `git status` → working tree clean.
-3. `pytest tests/ -q --no-header --tb=line` →
+3. `pytest tests/ -q --no-header --tb=line` → 5040 passed.
 4. `git log -1 --format=%B` → décrit la prochaine sub-phase.
 
 **Règles d'architecture critiques** (apprises à la dure) :
@@ -336,7 +336,7 @@ détecte, arbitre, rend.
 ## Contexte développement
 
 - **Environnement** : GitHub Codespaces, Python 3.11+
-- **Tests** : `pytest tests/ -q` →
+- **Tests** : `pytest tests/ -q` → 5040 passed, 12 skipped, 24
 deselected, 0 failed (au moment de la pause de session).
 - **Plan d'évolution actif** : [`docs/roadmap/evolution-2026.md`](docs/roadmap/evolution-2026.md).
 - **Plan retrait du legacy (maître)** : [`docs/migration/legacy-retirement-plan.md`](docs/migration/legacy-retirement-plan.md).
--- a/README.md
+++ b/README.md
@@ -299,7 +299,7 @@ client generation.
 
 Picarones ships **11 built-in normalization profiles** for historical
 text comparison (defined in
-[`picarones/
+[`picarones/formats/text/normalization.py`](picarones/formats/text/normalization.py),
 exposed via `/api/normalization/profiles`):
 
 `nfc`, `caseless`, `minimal`, `medieval_french`,
@@ -395,7 +395,7 @@ ruff check picarones/ tests/
 python -m mypy picarones/core/
 ```
 
-**Test suite**: ~
+**Test suite**: ~5040 tests, ~3 min on a modern laptop. Coverage
 floor at 85% (currently ~87%). The `network` marker excludes tests
 requiring live HTTP. A handful of tests depend on optional engines
 (`pero-ocr`, `pytesseract`) and are skipped/fail gracefully when
--- a/SPECS.md
+++ b/SPECS.md
@@ -467,7 +467,7 @@ canonique (champ `reference`).
 
 ### 6.2 Profils de normalisation
 
-11 profils livrés (`picarones/
+11 profils livrés (`picarones/formats/text/normalization.py`,
 exposés via `/api/normalization/profiles`) : `nfc`, `caseless`,
 `minimal`, `medieval_french`, `early_modern_french`,
 `medieval_latin`, `medieval_english`, `early_modern_english`,
--- a/docs/migration/SESSION_HANDOVER.md
+++ b/docs/migration/SESSION_HANDOVER.md
@@ -203,12 +203,13 @@ fiable.)
 
 ### 4.A Imports legacy dans les tests
 
-**
+**66 fichiers** avec **372 statements** d'import depuis les
 paquets legacy (``core``, ``measurements``, ``engines``,
-``llm``, ``pipelines``, ``report``, ``modules``) — Lots A, B
-C terminés (cf. 4.D ci-dessous). Le sous-paquet
-contient plus que ``diff_utils`` et ``xml_utils``
-
+``llm``, ``pipelines``, ``report``, ``modules``) — Lots A, B,
+C et D terminés (cf. 4.D ci-dessous). Le sous-paquet
+``core/`` ne contient plus que ``diff_utils`` et ``xml_utils``,
+et ``measurements/`` est passé de 50+ shims à ~25 modules
+réellement présents.
 
 Top chemins consommés :
 
@@ -218,7 +219,7 @@ Top chemins consommés :
 | 18 | ``from picarones.measurements.metrics import MetricsResult`` |
 | 16 | ``from picarones.measurements.statistics import wilcoxon_test`` |
 | 13 | ``from picarones.measurements.metrics import compute_metrics`` |
-| 10 | ``from picarones.measurements.
+| 10 | ``from picarones.measurements.robustness import degrade_image_bytes`` |
 
 **Pourquoi c'est important** : ces tests passent par les shims
 au lieu de pointer vers le canonique. Tant que ces imports
@@ -228,8 +229,9 @@ existent, on **ne peut pas supprimer les shims** (le test casse).
 commit, avancer. Shims supprimés dans les Lots A
 (``core.modules`` + ``core.facts``), B
 (``core.metric_registry`` + ``core.metric_hooks`` +
-``core.metrics``)
-``core.pipeline``)
+``core.metrics``), C (``core.results`` + ``core.corpus`` +
+``core.pipeline``) et D (34 shims plats de ``measurements/``
+vers ``evaluation.metrics/``) sur la branche
 ``claude/migrate-core-to-domain-8ubIT``.
 
 ### 4.B Imports legacy en production (hors shims eux-mêmes)
@@ -284,9 +286,30 @@ L'ordre recommandé, par lots de symboles cohérents :
 migrées vers les chemins canoniques ; logger filter dans
 ``test_sprint32_multi_level_gt`` aligné sur
 ``picarones.evaluation.corpus``.
-4. **Lot D — evaluation/metrics/*** (~
-
-
+4. ✅ **Lot D — evaluation/metrics/*** (~100 imports + 44
+prod migrés, 34 shims supprimés en bloc) :
+- ``measurements.{baseline_comparison, calibration,
+char_scores, confusion, cost_projection, difficulty,
+error_absorption, hallucination, image_predictive,
+image_quality, incremental_comparison, inter_engine,
+layout, levers, lexical_modernization, line_metrics,
+longitudinal, marginal_cost, module_policy, ner_backends,
+normalization, numerical_sequences, pricing, rare_tokens,
+robustness_projection, roman_numerals, specialization,
+structure, taxonomy, taxonomy_comparison,
+taxonomy_cooccurrence, taxonomy_intra_doc, throughput,
+worst_lines}`` → ``evaluation.metrics.{...}``.
+- ``picarones/measurements/__init__.py`` réécrit pour
+refléter la nouvelle composition (modules legacy
+restants + `import picarones.evaluation.metrics`
+unique pour déclencher les décorateurs).
+- ``test_no_flat_files_in_measurements::WHITELIST_FLAT_FILES_S3``
+réduit de 60 → 25 entrées.
+- ``test_module_coverage::TEST_ONLY_BASELINE`` réduit
+de 16 → 4 entrées.
+- ``test_file_budgets::FILE_BUDGETS`` débarrassé des
+entrées orphelines (inter_engine, levers,
+normalization).
 5. **Lot E — adapters/legacy_*** (~50 imports) :
 - ``engines.*`` → ``adapters.legacy_engines.*``
 - ``modules.alto_text_to_mono_region`` →
--- a/picarones/fixtures.py
+++ b/picarones/fixtures.py
@@ -17,15 +17,15 @@ from picarones.measurements.metrics import MetricsResult
 from picarones.evaluation.benchmark_result import BenchmarkResult, DocumentResult, EngineReport
 from picarones.pipelines.over_normalization import detect_over_normalization
 # Sprint 5 — métriques avancées
-from picarones.
-from picarones.
-from picarones.
-from picarones.
-from picarones.
-from picarones.
+from picarones.evaluation.metrics.confusion import build_confusion_matrix
+from picarones.evaluation.metrics.char_scores import compute_ligature_score, compute_diacritic_score
+from picarones.evaluation.metrics.taxonomy import classify_errors, aggregate_taxonomy
+from picarones.evaluation.metrics.structure import analyze_structure, aggregate_structure
+from picarones.evaluation.metrics.image_quality import generate_mock_quality_scores, aggregate_image_quality
+from picarones.evaluation.metrics.char_scores import aggregate_ligature_scores, aggregate_diacritic_scores
 # Sprint 10 — distribution des erreurs + hallucinations VLM
-from picarones.
-from picarones.
+from picarones.evaluation.metrics.line_metrics import compute_line_metrics, aggregate_line_metrics, LineMetrics
+from picarones.evaluation.metrics.hallucination import compute_hallucination_metrics, aggregate_hallucination_metrics
 
 # ---------------------------------------------------------------------------
 # Textes GT réalistes (documents patrimoniaux)
@@ -427,11 +427,11 @@ def generate_sample_benchmark(
 }
 
 # Agrégation Sprint 5
-from picarones.
-from picarones.
-from picarones.
-from picarones.
-from picarones.
+from picarones.evaluation.metrics.confusion import aggregate_confusion_matrices, ConfusionMatrix
+from picarones.evaluation.metrics.char_scores import LigatureScore, DiacriticScore
+from picarones.evaluation.metrics.taxonomy import TaxonomyResult
+from picarones.evaluation.metrics.structure import StructureResult
+from picarones.evaluation.metrics.image_quality import ImageQualityResult
 
 agg_confusion = aggregate_confusion_matrices([
 ConfusionMatrix(**dr.confusion_matrix)
@@ -468,7 +468,7 @@ def generate_sample_benchmark(
 LineMetrics.from_dict(dr.line_metrics)
 for dr in doc_results if dr.line_metrics
 ])
-from picarones.
+from picarones.evaluation.metrics.hallucination import HallucinationMetrics as _HM
 agg_hallucination = aggregate_hallucination_metrics([
 _HM.from_dict(dr.hallucination_metrics)
 for dr in doc_results if dr.hallucination_metrics
@@ -1,14 +1,11 @@
|
|
| 1 |
-
"""Métriques officielles Picarones —
|
| 2 |
|
| 3 |
-
Ce
|
| 4 |
-
|
| 5 |
-
|
| 6 |
-
|
| 7 |
-
``picarones.web``) qui présente les résultats.
|
| 8 |
|
| 9 |
-
|
| 10 |
-
------------
|
| 11 |
-
Coeur :
|
| 12 |
|
| 13 |
- :mod:`metrics` compute_metrics (CER/WER/MER/WIL via jiwer)
|
| 14 |
- :mod:`statistics` Wilcoxon, Friedman, Nemenyi, Pareto, CDD
|
|
@@ -16,96 +13,46 @@ Coeur :
|
|
| 16 |
- :mod:`builtin_hooks` 12 hooks doc + 12 agrégateurs natifs
|
| 17 |
- :mod:`builtin_metrics` enregistrement métriques dans le registry
|
| 18 |
- :mod:`alto_metrics` métriques jonction TEXT/ALTO
|
| 19 |
-
- :mod:`normalization` profils Unicode
|
| 20 |
|
| 21 |
-
|
| 22 |
|
| 23 |
-
- :mod:`
|
| 24 |
-
|
| 25 |
-
|
| 26 |
-
|
| 27 |
-
- :mod:`taxonomy_cooccurrence` Jaccard inter-classes
|
| 28 |
-
- :mod:`taxonomy_intra_doc` heatmap classes × position
|
| 29 |
|
| 30 |
-
|
| 31 |
-
|
| 32 |
-
- :mod:`structure` blocs/lignes/mots
|
| 33 |
-
- :mod:`line_metrics` distribution CER par ligne (Gini, percentiles)
|
| 34 |
-
- :mod:`worst_lines` lignes pires globales
|
| 35 |
-
|
| 36 |
-
Fiabilité et calibration :
|
| 37 |
-
|
| 38 |
-
- :mod:`calibration` ECE, MCE, reliability bins
|
| 39 |
-
- :mod:`reliability` IAA Cohen κ + multirun stability
|
| 40 |
-
- :mod:`hallucination` détection hallucinations VLM
|
| 41 |
-
- :mod:`robustness` courbes CER vs dégradation
|
| 42 |
-
- :mod:`robustness_projection` projection sur corpus réel
|
| 43 |
-
|
| 44 |
-
Image et difficulté :
|
| 45 |
-
|
| 46 |
-
- :mod:`image_quality` contraste, bruit, flou…
|
| 47 |
-
- :mod:`image_predictive` complexité paléographique
|
| 48 |
-
- :mod:`difficulty` score difficulté intrinsèque
|
| 49 |
-
|
| 50 |
-
Contenu et lisibilité :
|
| 51 |
-
|
| 52 |
-
- :mod:`searchability` recherchabilité fuzzy (Levenshtein)
|
| 53 |
-
- :mod:`numerical_sequences` préservation dates/cotes/numéraux
|
| 54 |
-
- :mod:`rare_tokens` rappel sur tokens rares
|
| 55 |
-
- :mod:`readability` Δ Flesch (sur-normalisation)
|
| 56 |
-
|
| 57 |
-
Structure ALTO et entités :
|
| 58 |
-
|
| 59 |
-
- :mod:`layout` F1 layout par type de région
|
| 60 |
-
- :mod:`reading_order` F1 ordre de lecture (ICDAR 2015)
|
| 61 |
-
- :mod:`ner`, :mod:`ner_backends`
|
| 62 |
-
- :mod:`error_absorption` correction vs introduction par jonction
|
| 63 |
-
|
| 64 |
-
Inter-moteurs et historique :
|
| 65 |
-
|
| 66 |
-
- :mod:`inter_engine` divergence taxonomique + oracle gap
|
| 67 |
-
- :mod:`specialization` spécialisation inter-moteurs
|
| 68 |
-
- :mod:`baseline_comparison` comparaison à l'historique
|
| 69 |
-
- :mod:`longitudinal` régression linéaire + change-point
|
| 70 |
-
- :mod:`incremental_comparison` ANOVA-like par slot
|
| 71 |
-
- :mod:`history` historique SQLite
|
| 72 |
-
|
| 73 |
-
Économie et opération :
|
| 74 |
-
|
| 75 |
-
- :mod:`pricing` table tarifaire
|
| 76 |
-
- :mod:`throughput` pages/h effectif
|
| 77 |
-
- :mod:`cost_projection` projection à volume cible
|
| 78 |
-
- :mod:`marginal_cost` coût par erreur évitée
|
| 79 |
-
|
| 80 |
-
Philologie historique :
|
| 81 |
|
| 82 |
-
- :mod:`
|
| 83 |
-
|
| 84 |
-
- :mod:`unicode_blocks` précision par bloc Unicode
|
| 85 |
-
- :mod:`early_modern_typography` ligatures imprimées XVIᵉ-XVIIIᵉ
|
| 86 |
-
- :mod:`modern_archives` marqueurs XIXᵉ-XXᵉ
|
| 87 |
-
- :mod:`roman_numerals` numéraux romains
|
| 88 |
-
- :mod:`lexical_modernization` sur-normalisation lexicale
|
| 89 |
|
| 90 |
Pipelines composées (axe B) :
|
| 91 |
|
| 92 |
- :mod:`pipeline_benchmark`, :mod:`pipeline_comparison`,
|
| 93 |
-
:mod:`pipeline_spec_loader`
|
| 94 |
-
|
| 95 |
-
|
| 96 |
-
|
| 97 |
-
- :mod:`
|
| 98 |
-
|
| 99 |
-
|
| 100 |
-
|
| 101 |
-
|
| 102 |
-
|
| 103 |
-
|
| 104 |
-
|
| 105 |
-
|
| 106 |
-
|
| 107 |
-
|
| 108 |
-
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 109 |
|
| 110 |
Moteur narratif :
|
| 111 |
|
|
@@ -114,35 +61,28 @@ Moteur narratif :
|
|
| 114 |
``FactType``, ``DetectorRegistry``) vit en couche 1 dans
|
| 115 |
:mod:`picarones.domain.facts`.
|
| 116 |
|
| 117 |
-
Voir :doc:`docs/explanation/architecture.md` pour la cartographie complète
|
| 118 |
-
la règle de dépendance des 3 cercles.
|
| 119 |
"""
|
| 120 |
|
| 121 |
# ──────────────────────────────────────────────────────────────────────────
|
| 122 |
-
#
|
| 123 |
-
#
|
| 124 |
-
#
|
| 125 |
-
#
|
| 126 |
-
#
|
| 127 |
-
#
|
| 128 |
-
# (``picarones.evaluation.metric_registry``) doit avoir importé
|
| 129 |
-
# ``picarones.measurements`` au moins une fois pour que les décorateurs
|
| 130 |
-
# ``@register_metric`` aient été exécutés. C'est le cas par défaut dans
|
| 131 |
-
# le pipeline standard ; les notebooks isolés peuvent ajouter
|
| 132 |
-
# ``import picarones.measurements`` (suivi d'un commentaire d'exception
|
| 133 |
-
# ruff sur la ligne d'import si leur linter signale un import inutilisé).
|
| 134 |
#
|
| 135 |
# Sans ces imports, ``compute_at_junction`` trouverait un registre vide
|
| 136 |
# et ne calculerait rien aux jonctions.
|
| 137 |
# ──────────────────────────────────────────────────────────────────────────
|
|
|
|
| 138 |
# Sprint 34 : cer / wer / mer / wil + stub TEXT→ALTO
|
| 139 |
from picarones.measurements import builtin_metrics # noqa: F401
|
| 140 |
-
# Sprints 55-60 : métriques philologiques.
|
| 141 |
from picarones.measurements import abbreviations # noqa: F401
|
| 142 |
from picarones.measurements import early_modern_typography # noqa: F401
|
| 143 |
from picarones.measurements import modern_archives # noqa: F401
|
| 144 |
from picarones.measurements import mufi # noqa: F401
|
| 145 |
-
from picarones.measurements import roman_numerals # noqa: F401
|
| 146 |
from picarones.measurements import unicode_blocks # noqa: F401
|
| 147 |
# Sprint 53 : reading order F1. Sprints 38, 52 : NER, readability.
|
| 148 |
from picarones.measurements import ner # noqa: F401
|
|
@@ -152,27 +92,19 @@ from picarones.measurements import reading_order # noqa: F401
|
|
| 152 |
# les reconstructeurs ALTO contre une GT ALTO du document.
|
| 153 |
from picarones.measurements import alto_metrics # noqa: F401
|
| 154 |
|
| 155 |
-
#
|
| 156 |
-
#
|
| 157 |
-
#
|
| 158 |
-
#
|
| 159 |
-
|
| 160 |
-
|
| 161 |
-
#
|
| 162 |
-
#
|
| 163 |
-
# Distinction de scope :
|
| 164 |
-
# - Modules de calcul utilisés via les renderers HTML composables
|
| 165 |
-
# (l'utilisateur les compose lui-même selon son use case) :
|
| 166 |
-
from picarones.measurements import baseline_comparison # noqa: F401 # historique SQLite
|
| 167 |
-
from picarones.measurements import cost_projection # noqa: F401 # volume cible utilisateur
|
| 168 |
from picarones.measurements import equivalence_profile # noqa: F401 # curseur HTML
|
| 169 |
-
from picarones.measurements import error_absorption # noqa: F401 # jonction pipeline composée
|
| 170 |
-
from picarones.measurements import layout # noqa: F401 # GT ALTO requise (axe B)
|
| 171 |
-
from picarones.measurements import longitudinal # noqa: F401 # historique SQLite
|
| 172 |
-
from picarones.measurements import marginal_cost # noqa: F401 # paires de moteurs
|
| 173 |
-
from picarones.measurements import module_policy # noqa: F401 # outil d'audit
|
| 174 |
-
from picarones.measurements import ner_backends # noqa: F401 # factory backends NER
|
| 175 |
-
from picarones.measurements import rare_tokens # noqa: F401 # corpus-wide
|
| 176 |
from picarones.measurements import reliability # noqa: F401 # multi-runs
|
| 177 |
-
|
| 178 |
-
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
"""Métriques officielles Picarones — paquet legacy en cours de retrait.
|
| 2 |
|
| 3 |
+
Ce paquet, historiquement nommé « Cercle 2 — logique métier », est
|
| 4 |
+
progressivement vidé au profit du paquet canonique
|
| 5 |
+
:mod:`picarones.evaluation.metrics`. Les modules qui restent ici ne
|
| 6 |
+
sont pas encore migrés (Catégorie B/C/D du plan de migration) :
|
|
|
|
| 7 |
|
| 8 |
+
Coeur (toujours legacy) :
|
|
|
|
|
|
|
| 9 |
|
| 10 |
- :mod:`metrics` compute_metrics (CER/WER/MER/WIL via jiwer)
|
| 11 |
- :mod:`statistics` Wilcoxon, Friedman, Nemenyi, Pareto, CDD
|
|
|
|
| 13 |
- :mod:`builtin_hooks` 12 hooks doc + 12 agrégateurs natifs
|
| 14 |
- :mod:`builtin_metrics` enregistrement métriques dans le registry
|
| 15 |
- :mod:`alto_metrics` métriques jonction TEXT/ALTO
|
|
|
|
| 16 |
|
| 17 |
+
Métriques philologiques (Catégorie B — register_metric singleton) :
|
| 18 |
|
| 19 |
+
- :mod:`mufi`, :mod:`abbreviations`, :mod:`unicode_blocks`,
|
| 20 |
+
:mod:`early_modern_typography`, :mod:`modern_archives`,
|
| 21 |
+
:mod:`reading_order`, :mod:`ner`, :mod:`readability`,
|
| 22 |
+
:mod:`searchability`.
|
|
|
|
|
|
|
| 23 |
|
| 24 |
+
Câblages adaptifs (suffixe ``_hooks``) :
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 25 |
|
| 26 |
+
- :mod:`readability_hooks`, :mod:`searchability_hooks`,
|
| 27 |
+
:mod:`numerical_sequences_hooks`, :mod:`philological_hooks`.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 28 |
|
| 29 |
Pipelines composées (axe B) :
|
| 30 |
|
| 31 |
- :mod:`pipeline_benchmark`, :mod:`pipeline_comparison`,
|
| 32 |
+
:mod:`pipeline_spec_loader`.
|
| 33 |
+
|
| 34 |
+
Auxiliaires :
|
| 35 |
+
|
| 36 |
+
- :mod:`equivalence_profile`, :mod:`reliability`, :mod:`history`,
|
| 37 |
+
:mod:`robustness`.
|
| 38 |
+
|
+Removed modules (Lot D, May 2026)
+---------------------------------
+All shims that merely re-exported
+``picarones.evaluation.metrics.X`` were removed in bulk:
+``baseline_comparison``, ``calibration``, ``char_scores``,
+``confusion``, ``cost_projection``, ``difficulty``,
+``error_absorption``, ``hallucination``, ``image_predictive``,
+``image_quality``, ``incremental_comparison``, ``inter_engine``,
+``layout``, ``levers``, ``lexical_modernization``,
+``line_metrics``, ``longitudinal``, ``marginal_cost``,
+``module_policy``, ``ner_backends``, ``normalization``,
+``numerical_sequences``, ``pricing``, ``rare_tokens``,
+``robustness_projection``, ``roman_numerals``, ``specialization``,
+``structure``, ``taxonomy``, ``taxonomy_comparison``,
+``taxonomy_cooccurrence``, ``taxonomy_intra_doc``, ``throughput``,
+``worst_lines``. From now on, import from
+:mod:`picarones.evaluation.metrics`.

 Narrative engine:

 ``FactType``, ``DetectorRegistry``) lives in layer 1 in
 :mod:`picarones.domain.facts`.

+See :doc:`docs/explanation/architecture.md` for the full map.
 """

 # ──────────────────────────────────────────────────────────────────────────
+# Registration ceremony for the typed metrics in the
+# Sprint 34 registry. Any consumer that wants to use ``compute_at_junction``
+# (``picarones.evaluation.metric_registry``) must have imported either
+# ``picarones.measurements`` or ``picarones.evaluation.metrics`` at
+# least once so that the ``@register_metric`` decorators have
+# been executed.
 #
 # Without these imports, ``compute_at_junction`` would find an empty registry
 # and compute nothing at the junctions.
 # ──────────────────────────────────────────────────────────────────────────
+
 # Sprint 34: cer / wer / mer / wil + TEXT→ALTO stub
 from picarones.measurements import builtin_metrics  # noqa: F401
+# Sprints 55-60: philological metrics (Category B, they stay here).
 from picarones.measurements import abbreviations  # noqa: F401
 from picarones.measurements import early_modern_typography  # noqa: F401
 from picarones.measurements import modern_archives  # noqa: F401
 from picarones.measurements import mufi  # noqa: F401
 from picarones.measurements import unicode_blocks  # noqa: F401
 # Sprint 53: reading order F1. Sprints 38, 52: NER, readability.
 from picarones.measurements import ner  # noqa: F401
 # the ALTO reconstructors against an ALTO GT of the document.
 from picarones.measurements import alto_metrics  # noqa: F401

+# Lot D: the ``@register_metric`` decorators of the canonical package
+# ``picarones.evaluation.metrics`` are executed as soon as this import runs,
+# guaranteeing that the Sprint 34 registry contains all the canonical
+# metrics without needing the removed shims.
+import picarones.evaluation.metrics  # noqa: F401
+
+# Modules kept in the measurements layer (no corresponding canonical
+# shim; they stay here until their own relocation).
 from picarones.measurements import equivalence_profile  # noqa: F401  # HTML slider
 from picarones.measurements import reliability  # noqa: F401  # multi-runs
+
+# Canonical modules re-exposed for backward compatibility with
+# ``from picarones.measurements import roman_numerals`` (used by
+# older internal callers; in the next Lot, they will migrate to
+# ``picarones.evaluation.metrics.roman_numerals``).
+from picarones.evaluation.metrics import roman_numerals  # noqa: F401
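The registration comment in this file relies on an import side effect: a ``@register_metric`` decorator only fills the registry once the module that defines the metric has been executed. Below is a minimal sketch of that mechanism, assuming a plain dict registry. The names ``_REGISTRY``, ``compute_all`` and ``exact_match`` are illustrative and are not the actual ``picarones.evaluation.metric_registry`` API.

```python
from typing import Callable, Dict

# Hypothetical stand-in for the project's metric registry.
_REGISTRY: Dict[str, Callable[[str, str], float]] = {}


def register_metric(name: str):
    """Return a decorator that stores the function under `name` when applied."""
    def decorator(func: Callable[[str, str], float]):
        _REGISTRY[name] = func
        return func
    return decorator


def compute_all(reference: str, hypothesis: str) -> Dict[str, float]:
    """Run every registered metric; an empty registry yields an empty dict."""
    return {name: fn(reference, hypothesis) for name, fn in _REGISTRY.items()}


# Before any metric module has been executed, nothing is computed.
before = compute_all("abc", "abd")


# Executing this definition is what importing a metric module would do.
@register_metric("exact_match")
def exact_match(reference: str, hypothesis: str) -> float:
    return 1.0 if reference == hypothesis else 0.0


after = compute_all("abc", "abd")
```

In this sketch ``before`` is empty and ``after`` contains one entry, which is the situation the comment warns about when neither package has been imported.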
@@ -1,10 +0,0 @@
-"""Re-export (Sprint A14-S10). The canonical content lives in
-``picarones.evaluation.metrics.baseline_comparison``.
-
-The old path ``picarones.measurements.baseline_comparison`` is kept so as
-not to break any consumer. At S22, this re-export will disappear.
-"""
-
-from __future__ import annotations
-
-from picarones.evaluation.metrics.baseline_comparison import *  # noqa: F401,F403
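Callers that still reference a removed shim path can be rewritten mechanically, since the module name is unchanged and only the package prefix moves. The following is a hedged sketch of such a rewrite, not the tool used for this commit. ``REMOVED_SHIMS`` lists only a few of the 34 names for brevity, and ``migrate_imports`` is a hypothetical helper.

```python
import re

# Subset of the removed shim names, for illustration only.
REMOVED_SHIMS = {"baseline_comparison", "calibration", "char_scores", "confusion"}

# Matches the legacy dotted path followed by a module name.
_LEGACY_PATH = re.compile(r"\bpicarones\.measurements\.(\w+)")


def migrate_imports(source: str) -> str:
    """Rewrite legacy shim paths to the canonical package, leaving others alone."""
    def repl(match: re.Match) -> str:
        name = match.group(1)
        if name in REMOVED_SHIMS:
            return f"picarones.evaluation.metrics.{name}"
        return match.group(0)  # module still lives in measurements
    return _LEGACY_PATH.sub(repl, source)


migrated_import = migrate_imports("from picarones.measurements.confusion import X")
migrated_patch = migrate_imports('patch("picarones.measurements.confusion.X")')
untouched = migrate_imports("from picarones.measurements.ner import Y")
```

Because the pattern works on the dotted path rather than on the ``from … import`` statement, it also covers string arguments such as ``patch("…")`` targets.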
@@ -97,7 +97,7 @@ def calibration_from_engine_result(
     normalised to ``[0, 1]``. Negative confidences (Tesseract sets
     -1 for non-words) are ignored.
     """
-    from picarones.
+    from picarones.evaluation.metrics.calibration import compute_calibration_metrics

     if not token_confidences:
         return None
@@ -146,7 +146,7 @@ def calibration_from_engine_result(
     requires_success=True,
 )
 def _confusion_hook(*, ground_truth, hypothesis, **_):
-    from picarones.
+    from picarones.evaluation.metrics.confusion import build_confusion_matrix
     return build_confusion_matrix(ground_truth, hypothesis).as_dict()


@@ -157,7 +157,7 @@ def _confusion_hook(*, ground_truth, hypothesis, **_):
     requires_success=True,
 )
 def _char_scores_hook(*, ground_truth, hypothesis, **_):
-    from picarones.
+    from picarones.evaluation.metrics.char_scores import (
         compute_diacritic_score,
         compute_ligature_score,
     )
@@ -173,7 +173,7 @@ def _char_scores_hook(*, ground_truth, hypothesis, **_):
     requires_success=True,
 )
 def _taxonomy_hook(*, ground_truth, hypothesis, **_):
-    from picarones.
+    from picarones.evaluation.metrics.taxonomy import classify_errors
     return classify_errors(ground_truth, hypothesis).as_dict()


@@ -184,7 +184,7 @@ def _taxonomy_hook(*, ground_truth, hypothesis, **_):
     requires_success=True,
 )
 def _structure_hook(*, ground_truth, hypothesis, **_):
-    from picarones.
+    from picarones.evaluation.metrics.structure import analyze_structure
     return analyze_structure(ground_truth, hypothesis).as_dict()


@@ -195,7 +195,7 @@ def _structure_hook(*, ground_truth, hypothesis, **_):
     requires_success=True,
 )
 def _line_metrics_hook(*, ground_truth, hypothesis, **_):
-    from picarones.
+    from picarones.evaluation.metrics.line_metrics import compute_line_metrics
     return compute_line_metrics(ground_truth, hypothesis).as_dict()


@@ -206,7 +206,7 @@ def _line_metrics_hook(*, ground_truth, hypothesis, **_):
     requires_success=True,
 )
 def _hallucination_hook(*, ground_truth, hypothesis, **_):
-    from picarones.
+    from picarones.evaluation.metrics.hallucination import compute_hallucination_metrics
     return compute_hallucination_metrics(ground_truth, hypothesis).as_dict()


@@ -230,7 +230,7 @@ def _calibration_hook(*, ground_truth, ocr_result, **_):
     # OCR result (to compare an OCR failure with image quality).
 )
 def _image_quality_hook(*, image_path, **_):
-    from picarones.
+    from picarones.evaluation.metrics.image_quality import analyze_image_quality
     iq = analyze_image_quality(image_path)
     if iq.error is not None:
         return None
@@ -294,7 +294,7 @@ def _readability_hook(*, ground_truth, hypothesis, corpus_lang, **_):
     profiles=_STANDARD_PROFILES,
 )
 def _aggregate_confusion(doc_results: list) -> Optional[dict]:
-    from picarones.
+    from picarones.evaluation.metrics.confusion import (
         ConfusionMatrix, aggregate_confusion_matrices,
     )
     try:
@@ -321,7 +321,7 @@ def _aggregate_confusion(doc_results: list) -> Optional[dict]:
     profiles=_STANDARD_PROFILES,
 )
 def _aggregate_char_scores(doc_results: list) -> Optional[dict]:
-    from picarones.
+    from picarones.evaluation.metrics.char_scores import (
         DiacriticScore,
         LigatureScore,
         aggregate_diacritic_scores,
@@ -351,7 +351,7 @@ def _aggregate_char_scores(doc_results: list) -> Optional[dict]:
     profiles=_STANDARD_PROFILES,
 )
 def _aggregate_taxonomy(doc_results: list) -> Optional[dict]:
-    from picarones.
+    from picarones.evaluation.metrics.taxonomy import TaxonomyResult, aggregate_taxonomy
     results = [
         TaxonomyResult.from_dict(dr.taxonomy)
         for dr in doc_results
@@ -368,7 +368,7 @@ def _aggregate_taxonomy(doc_results: list) -> Optional[dict]:
     profiles=_STANDARD_PROFILES,
 )
 def _aggregate_structure(doc_results: list) -> Optional[dict]:
-    from picarones.
+    from picarones.evaluation.metrics.structure import StructureResult, aggregate_structure
     results = [
         StructureResult.from_dict(dr.structure)
         for dr in doc_results
@@ -385,7 +385,7 @@ def _aggregate_structure(doc_results: list) -> Optional[dict]:
     profiles=_STANDARD_PROFILES,
 )
 def _aggregate_image_quality(doc_results: list) -> Optional[dict]:
-    from picarones.
+    from picarones.evaluation.metrics.image_quality import (
         ImageQualityResult, aggregate_image_quality,
     )
     results = [
@@ -404,7 +404,7 @@ def _aggregate_image_quality(doc_results: list) -> Optional[dict]:
     profiles=_STANDARD_PROFILES,
 )
 def _aggregate_line_metrics(doc_results: list) -> Optional[dict]:
-    from picarones.
+    from picarones.evaluation.metrics.line_metrics import (
         LineMetrics, aggregate_line_metrics,
     )
     results = [
@@ -423,7 +423,7 @@ def _aggregate_line_metrics(doc_results: list) -> Optional[dict]:
     profiles=_STANDARD_PROFILES,
 )
 def _aggregate_hallucination(doc_results: list) -> Optional[dict]:
-    from picarones.
+    from picarones.evaluation.metrics.hallucination import (
         HallucinationMetrics, aggregate_hallucination_metrics,
     )
     results = [
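The hooks in these hunks import their metric inside the function body, so the metric module is only loaded when a hook first runs. A small sketch of that deferred loading follows, using a throwaway module written to a temporary directory. ``demo_confusion_metric`` and ``build_confusion`` are hypothetical names, not part of picarones.

```python
import pathlib
import sys
import tempfile

# Create a throwaway module on disk so the import is real but isolated.
tmp_dir = tempfile.mkdtemp()
pathlib.Path(tmp_dir, "demo_confusion_metric.py").write_text(
    "def build_confusion(gt, hyp):\n"
    "    return {'mismatches': sum(a != b for a, b in zip(gt, hyp))}\n"
)
sys.path.insert(0, tmp_dir)


def confusion_hook(*, ground_truth, hypothesis, **_):
    # Function-level import: resolved on the first call, cached afterwards.
    from demo_confusion_metric import build_confusion
    return build_confusion(ground_truth, hypothesis)


loaded_before = "demo_confusion_metric" in sys.modules
result = confusion_hook(ground_truth="abc", hypothesis="abd")
loaded_after = "demo_confusion_metric" in sys.modules
```

The trade-off is that a wrong import path is only noticed when the hook is called, which is why a bulk path migration such as this one needs every hook exercised by tests.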
@@ -1,10 +0,0 @@
-"""Re-export (Sprint A14-S10). The canonical content lives in
-``picarones.evaluation.metrics.calibration``.
-
-The old path ``picarones.measurements.calibration`` is kept so as
-not to break any consumer. At S22, this re-export will disappear.
-"""
-
-from __future__ import annotations
-
-from picarones.evaluation.metrics.calibration import *  # noqa: F401,F403
@@ -1,34 +0,0 @@
-"""``picarones.measurements.char_scores``: re-export shim (deprecated, removal in 2.0).
-
-Canonical: :mod:`picarones.evaluation.metrics.char_scores`.
-"""
-
-from __future__ import annotations
-
-import warnings
-
-from picarones.evaluation.metrics.char_scores import (  # noqa: F401
-    LIGATURE_TABLE,
-    DIACRITIC_MAP,
-    LigatureScore,
-    DiacriticScore,
-    compute_ligature_score,
-    compute_diacritic_score,
-    aggregate_ligature_scores,
-    aggregate_diacritic_scores,
-    _ALL_LIGATURES,
-    _SEQ_TO_LIGATURE,
-    _build_diacritic_map,
-    _ALL_DIACRITICS,
-    _LIGATURE_SET,
-    _check_char_at_context,
-)
-
-warnings.warn(
-    "picarones.measurements.char_scores is deprecated and will be removed in 2.0. "
-    "Import from picarones.evaluation.metrics.char_scores instead.",
-    DeprecationWarning,
-    stacklevel=2,
-)
-
-__all__ = ['LIGATURE_TABLE', 'DIACRITIC_MAP', 'LigatureScore', 'DiacriticScore', 'compute_ligature_score', 'compute_diacritic_score', 'aggregate_ligature_scores', 'aggregate_diacritic_scores']
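Shims of this second style emitted a ``DeprecationWarning`` at import time, which gives a way to find remaining callers before deleting the file. The sketch below reproduces the pattern with a stand-in source string executed under ``warnings.catch_warnings``. The ``old_pkg`` and ``new_pkg`` names are hypothetical and the shim body is reduced to the warning itself.

```python
import warnings

# Stand-in for a deprecated shim module: it only emits the warning.
SHIM_SOURCE = """
import warnings

warnings.warn(
    "old_pkg.char_scores is deprecated; import from new_pkg.char_scores instead.",
    DeprecationWarning,
    stacklevel=2,
)
"""

with warnings.catch_warnings(record=True) as caught:
    # DeprecationWarning is often filtered out by default; record everything.
    warnings.simplefilter("always")
    exec(
        compile(SHIM_SOURCE, "old_pkg/char_scores.py", "exec"),
        {"__name__": "old_pkg.char_scores"},
    )

deprecation_messages = [
    str(w.message) for w in caught if issubclass(w.category, DeprecationWarning)
]
```

In a test suite, the same effect is usually obtained by turning ``DeprecationWarning`` into an error for the project's own package, so that a leftover legacy import fails loudly.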
@@ -1,10 +0,0 @@
-"""Re-export (Sprint A14-S10). The canonical content lives in
-``picarones.evaluation.metrics.confusion``.
-
-The old path ``picarones.measurements.confusion`` is kept so as
-not to break any consumer. At S22, this re-export will disappear.
-"""
-
-from __future__ import annotations
-
-from picarones.evaluation.metrics.confusion import *  # noqa: F401,F403
@@ -1,26 +0,0 @@
-"""``picarones.measurements.cost_projection``: re-export shim (deprecated, removal in 2.0).
-
-Canonical: :mod:`picarones.evaluation.metrics.cost_projection`.
-"""
-
-from __future__ import annotations
-
-import warnings
-
-from picarones.evaluation.metrics.cost_projection import (  # noqa: F401
-    ProjectedCost,
-    project_cost_total,
-    project_co2_total,
-    project_engine,
-    project_all_engines,
-    cost_gap_table,
-)
-
-warnings.warn(
-    "picarones.measurements.cost_projection is deprecated and will be removed in 2.0. "
-    "Import from picarones.evaluation.metrics.cost_projection instead.",
-    DeprecationWarning,
-    stacklevel=2,
-)
-
-__all__ = ['ProjectedCost', 'project_cost_total', 'project_co2_total', 'project_engine', 'project_all_engines', 'cost_gap_table']
@@ -1,30 +0,0 @@
-"""``picarones.measurements.difficulty``: re-export shim (deprecated, removal in 2.0).
-
-Canonical: :mod:`picarones.evaluation.metrics.difficulty`.
-"""
-
-from __future__ import annotations
-
-import warnings
-
-from picarones.evaluation.metrics.difficulty import (  # noqa: F401
-    DifficultyScore,
-    compute_difficulty_score,
-    compute_all_difficulties,
-    difficulty_label,
-    _W_VARIANCE,
-    _W_QUALITY,
-    _W_DENSITY,
-    _SPECIAL_CHARS_RE,
-    _special_char_density,
-    _variance,
-)
-
-warnings.warn(
-    "picarones.measurements.difficulty is deprecated and will be removed in 2.0. "
-    "Import from picarones.evaluation.metrics.difficulty instead.",
-    DeprecationWarning,
-    stacklevel=2,
-)
-
-__all__ = ['DifficultyScore', 'compute_difficulty_score', 'compute_all_difficulties', 'difficulty_label']
@@ -42,7 +42,7 @@ import logging
 from dataclasses import dataclass
 from typing import Iterable, Optional

-from picarones.
+from picarones.evaluation.metrics.normalization import (
     DIPLOMATIC_EN_EARLY_MODERN,
     DIPLOMATIC_FR_EARLY_MODERN,
     DIPLOMATIC_LATIN_MEDIEVAL,
@@ -1,10 +0,0 @@
-"""Re-export (Sprint A14-S10). The canonical content lives in
-``picarones.evaluation.metrics.error_absorption``.
-
-The old path ``picarones.measurements.error_absorption`` is kept so as
-not to break any consumer. At S22, this re-export will disappear.
-"""
-
-from __future__ import annotations
-
-from picarones.evaluation.metrics.error_absorption import *  # noqa: F401,F403
@@ -1,10 +0,0 @@
-"""Re-export (Sprint A14-S10). The canonical content lives in
-``picarones.evaluation.metrics.hallucination``.
-
-The old path ``picarones.measurements.hallucination`` is kept
-so as not to break any consumer. At S22, this re-export will disappear.
-"""
-
-from __future__ import annotations
-
-from picarones.evaluation.metrics.hallucination import *  # noqa: F401,F403
@@ -1,10 +0,0 @@
-"""Re-export (Sprint A14-S10). The canonical content lives in
-``picarones.evaluation.metrics.image_predictive``.
-
-The old path ``picarones.measurements.image_predictive`` is kept so as
-not to break any consumer. At S22, this re-export will disappear.
-"""
-
-from __future__ import annotations
-
-from picarones.evaluation.metrics.image_predictive import *  # noqa: F401,F403
@@ -1,14 +0,0 @@
-"""Re-export (Sprint A14-S10). The canonical content lives in
-``picarones.evaluation.metrics.image_quality``.
-
-The old path ``picarones.measurements.image_quality`` is kept
-so as not to break any consumer. At S22, this re-export will disappear.
-
-Explicitly re-exposes ``_global_quality_score`` (private symbol
-used downstream).
-"""
-
-from __future__ import annotations
-
-from picarones.evaluation.metrics.image_quality import *  # noqa: F401,F403
-from picarones.evaluation.metrics.image_quality import _global_quality_score  # noqa: F401
@@ -1,10 +0,0 @@
-"""Re-export (Sprint A14-S10). The canonical content lives in
-``picarones.evaluation.metrics.incremental_comparison``.
-
-The old path ``picarones.measurements.incremental_comparison`` is kept so as
-not to break any consumer. At S22, this re-export will disappear.
-"""
-
-from __future__ import annotations
-
-from picarones.evaluation.metrics.incremental_comparison import *  # noqa: F401,F403
@@ -1,10 +0,0 @@
-"""Re-export (Sprint A14-S10). The canonical content lives in
-``picarones.evaluation.metrics.inter_engine``.
-
-The old path ``picarones.measurements.inter_engine`` is kept so as
-not to break any consumer. At S22, this re-export will disappear.
-"""
-
-from __future__ import annotations
-
-from picarones.evaluation.metrics.inter_engine import *  # noqa: F401,F403
@@ -1,14 +0,0 @@
-"""Re-export (Sprint A14-S10). The canonical content lives in
-``picarones.evaluation.metrics.layout``.
-
-The old path ``picarones.measurements.layout`` is kept so as
-not to break any consumer. At S22, this re-export will disappear.
-
-Explicitly re-exposes the private symbol ``_iou_bbox``, which at least
-one test imports directly.
-"""
-
-from __future__ import annotations
-
-from picarones.evaluation.metrics.layout import *  # noqa: F401,F403
-from picarones.evaluation.metrics.layout import _iou_bbox  # noqa: F401
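The extra explicit import of ``_iou_bbox`` in the deleted shim reflects a Python rule: when the source module defines no ``__all__``, ``from module import *`` skips names that start with an underscore. The sketch below demonstrates this with an in-memory stand-in module. ``demo_canonical_layout`` and its two functions are hypothetical and only mimic the shape of the real module.

```python
import sys
import types

# Hypothetical canonical module with one public and one private helper.
canonical = types.ModuleType("demo_canonical_layout")
exec(
    "def iou(a, b):\n"
    "    return 1.0\n"
    "\n"
    "def _iou_bbox(a, b):\n"
    "    return 0.5\n",
    canonical.__dict__,
)
sys.modules["demo_canonical_layout"] = canonical

# Shim variant 1: star import only; the private helper is not copied.
star_only = {}
exec("from demo_canonical_layout import *", star_only)

# Shim variant 2: star import plus an explicit private re-export.
with_private = {}
exec(
    "from demo_canonical_layout import *\n"
    "from demo_canonical_layout import _iou_bbox\n",
    with_private,
)
```

After the shim is removed, a test that needs the private helper has to import it from the canonical module directly, which is what the migrated imports in this commit do.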
@@ -1,10 +0,0 @@
-"""Re-export (Sprint A14-S10). The canonical content lives in
-``picarones.evaluation.metrics.levers``.
-
-The old path ``picarones.measurements.levers`` is kept so as
-not to break any consumer. At S22, this re-export will disappear.
-"""
-
-from __future__ import annotations
-
-from picarones.evaluation.metrics.levers import *  # noqa: F401,F403
@@ -1,10 +0,0 @@
-"""Re-export (Sprint A14-S10). The canonical content lives in
-``picarones.evaluation.metrics.lexical_modernization``.
-
-The old path ``picarones.measurements.lexical_modernization`` is kept so as
-not to break any consumer. At S22, this re-export will disappear.
-"""
-
-from __future__ import annotations
-
-from picarones.evaluation.metrics.lexical_modernization import *  # noqa: F401,F403
@@ -1,10 +0,0 @@
-"""Re-export (Sprint A14-S10). The canonical content lives in
-``picarones.evaluation.metrics.line_metrics``.
-
-The old path ``picarones.measurements.line_metrics`` is kept so as
-not to break any consumer. At S22, this re-export will disappear.
-"""
-
-from __future__ import annotations
-
-from picarones.evaluation.metrics.line_metrics import *  # noqa: F401,F403
@@ -1,10 +0,0 @@
-"""Re-export (Sprint A14-S10). The canonical content lives in
-``picarones.evaluation.metrics.longitudinal``.
-
-The old path ``picarones.measurements.longitudinal`` is kept so as
-not to break any consumer. At S22, this re-export will disappear.
-"""
-
-from __future__ import annotations
-
-from picarones.evaluation.metrics.longitudinal import *  # noqa: F401,F403
@@ -1,10 +0,0 @@
-"""Re-export (Sprint A14-S10). The canonical content lives in
-``picarones.evaluation.metrics.marginal_cost``.
-
-The old path ``picarones.measurements.marginal_cost`` is kept so as
-not to break any consumer. At S22, this re-export will disappear.
-"""
-
-from __future__ import annotations
-
-from picarones.evaluation.metrics.marginal_cost import *  # noqa: F401,F403
@@ -155,7 +155,7 @@ def compute_metrics(
     cer_diplomatic: Optional[float] = None
     diplomatic_profile_name: Optional[str] = None
     try:
-        from picarones.
+        from picarones.evaluation.metrics.normalization import DEFAULT_DIPLOMATIC_PROFILE
         profile = normalization_profile or DEFAULT_DIPLOMATIC_PROFILE
         ref_diplo = profile.normalize(reference)
         hyp_diplo = profile.normalize(hypothesis)
@@ -197,4 +197,4 @@ __all__ = ["MetricsResult", "aggregate_metrics", "compute_metrics"]
 # Lazy import to avoid circular imports
 from typing import TYPE_CHECKING
 if TYPE_CHECKING:
-    from picarones.
+    from picarones.evaluation.metrics.normalization import NormalizationProfile
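The second hunk keeps the import of ``NormalizationProfile`` under ``TYPE_CHECKING``, a guard that is ``False`` at runtime, so the name is visible to type checkers without creating an import cycle. A minimal sketch of the pattern follows, assuming a string annotation. ``hypothetical_normalization`` is an invented module name that is deliberately never importable here, and ``normalize`` is an illustrative function, not the project's ``compute_metrics``.

```python
import sys
from typing import TYPE_CHECKING, Optional

if TYPE_CHECKING:
    # Seen by static type checkers only; this branch never runs, so the
    # (hypothetical) module is not imported at runtime.
    from hypothetical_normalization import NormalizationProfile


def normalize(text: str, profile: "Optional[NormalizationProfile]" = None) -> str:
    """Apply the profile if one is given, otherwise return the text unchanged."""
    return profile.normalize(text) if profile is not None else text


unchanged = normalize("ſaint")
imported_at_runtime = "hypothetical_normalization" in sys.modules
```

The annotation is written as a string so that Python does not try to resolve the name when the function is defined.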
@@ -1,10 +0,0 @@
-"""Re-export (Sprint A14-S10). The canonical content lives in
-``picarones.evaluation.metrics.module_policy``.
-
-The old path ``picarones.measurements.module_policy`` is kept so as
-not to break any consumer. At S22, this re-export will disappear.
-"""
-
-from __future__ import annotations
-
-from picarones.evaluation.metrics.module_policy import *  # noqa: F401,F403
@@ -1,25 +0,0 @@
-"""``picarones.measurements.ner_backends``: re-export shim (deprecated, removal in 2.0).
-
-Canonical: :mod:`picarones.evaluation.metrics.ner_backends`.
-"""
-
-from __future__ import annotations
-
-import warnings
-
-from picarones.evaluation.metrics.ner_backends import (  # noqa: F401
-    EntityExtractor,
-    SpacyEntityExtractor,
-    SPACY_PROFILES,
-    get_extractor,
-    is_spacy_available,
-)
-
-warnings.warn(
-    "picarones.measurements.ner_backends is deprecated and will be removed in 2.0. "
-    "Import from picarones.evaluation.metrics.ner_backends instead.",
-    DeprecationWarning,
-    stacklevel=2,
-)
-
-__all__ = ['EntityExtractor', 'SpacyEntityExtractor', 'SPACY_PROFILES', 'get_extractor', 'is_spacy_available']
@@ -1,33 +0,0 @@
-"""``picarones.measurements.normalization``: re-export shim (deprecated, removal in 2.0).
-
-Canonical: :mod:`picarones.evaluation.metrics.normalization`.
-"""
-
-from __future__ import annotations
-
-import warnings
-
-from picarones.evaluation.metrics.normalization import (  # noqa: F401
-    NormalizationProfile,
-    DIPLOMATIC_FR_MEDIEVAL,
-    DIPLOMATIC_FR_EARLY_MODERN,
-    DIPLOMATIC_LATIN_MEDIEVAL,
-    DIPLOMATIC_MINIMAL,
-    DIPLOMATIC_EN_EARLY_MODERN,
-    DIPLOMATIC_EN_MEDIEVAL,
-    DIPLOMATIC_EN_SECRETARY,
-    NORMALIZATION_PROFILES,
-    DEFAULT_DIPLOMATIC_PROFILE,
-    get_builtin_profile,
-    _parse_exclude_chars,
-    _apply_diplomatic_table,
-)
-
-warnings.warn(
-    "picarones.measurements.normalization is deprecated and will be removed in 2.0. "
-    "Import from picarones.evaluation.metrics.normalization instead.",
-    DeprecationWarning,
-    stacklevel=2,
-)
-
-__all__ = ['NormalizationProfile', 'DIPLOMATIC_FR_MEDIEVAL', 'DIPLOMATIC_FR_EARLY_MODERN', 'DIPLOMATIC_LATIN_MEDIEVAL', 'DIPLOMATIC_MINIMAL', 'DIPLOMATIC_EN_EARLY_MODERN', 'DIPLOMATIC_EN_MEDIEVAL', 'DIPLOMATIC_EN_SECRETARY', 'NORMALIZATION_PROFILES', 'DEFAULT_DIPLOMATIC_PROFILE', 'get_builtin_profile', '_parse_exclude_chars', '_apply_diplomatic_table']
@@ -1,18 +0,0 @@
-"""``picarones.measurements.numerical_sequences``: re-export shim (deprecated, removal in 2.0).
-
-Canonical: :mod:`picarones.evaluation.metrics.numerical_sequences`.
-Phase 5.C.batch7 of the legacy removal.
-"""
-
-from __future__ import annotations
-
-import warnings
-
-from picarones.evaluation.metrics.numerical_sequences import *  # noqa: F401, F403
-
-warnings.warn(
-    "picarones.measurements.numerical_sequences is deprecated and will be removed in 2.0. "
-    "Import from picarones.evaluation.metrics.numerical_sequences instead.",
-    DeprecationWarning,
-    stacklevel=2,
-)
@@ -1,15 +0,0 @@
-"""Re-export, Sprint A14-S10. The canonical content lives in
-``picarones.evaluation.metrics.pricing``.
-
-The old path ``picarones.measurements.pricing`` is kept so that
-no consumer breaks. At S22, this re-export will disappear.
-
-This module **explicitly** re-exposes the private symbol
-``_DEFAULT_PRICING_PATH``, which at least one consumer imports
-directly (cf. tests).
-"""
-
-from __future__ import annotations
-
-from picarones.evaluation.metrics.pricing import *  # noqa: F401,F403
-from picarones.evaluation.metrics.pricing import _DEFAULT_PRICING_PATH  # noqa: F401
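
The pricing shim needs a second, explicit import because `from module import *` skips names that start with an underscore unless the module's `__all__` lists them. The sketch below reproduces that behaviour with a throwaway module. `canonical_pricing`, `load_pricing` and the `'pricing.yaml'` value are assumptions made up for the example; only the name `_DEFAULT_PRICING_PATH` comes from the diff.

```python
import sys
import types

# Hypothetical stand-in for the canonical module. The function and the path
# value are invented for this illustration.
canonical = types.ModuleType("canonical_pricing")
exec(
    "_DEFAULT_PRICING_PATH = 'pricing.yaml'\n"
    "def load_pricing():\n"
    "    return _DEFAULT_PRICING_PATH\n",
    canonical.__dict__,
)
sys.modules["canonical_pricing"] = canonical

# A shim that relies on the star import alone.
star_only: dict = {}
exec("from canonical_pricing import *", star_only)

# A shim that also re-exports the private name explicitly.
explicit: dict = {}
exec(
    "from canonical_pricing import *\n"
    "from canonical_pricing import _DEFAULT_PRICING_PATH\n",
    explicit,
)

print("load_pricing" in star_only)           # public name is re-exported
print("_DEFAULT_PRICING_PATH" in star_only)  # private name is skipped
print("_DEFAULT_PRICING_PATH" in explicit)   # explicit import restores it
```

The same reasoning would apply to the other shims in this diff that import underscore-prefixed helpers by name.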
@@ -1,10 +0,0 @@
-"""Re-export, Sprint A14-S10. The canonical content lives in
-``picarones.evaluation.metrics.rare_tokens``.
-
-The old path ``picarones.measurements.rare_tokens`` is kept so that
-no consumer breaks. At S22, this re-export will disappear.
-"""
-
-from __future__ import annotations
-
-from picarones.evaluation.metrics.rare_tokens import *  # noqa: F401,F403
@@ -1,18 +0,0 @@
-"""Re-export, Sprint A14-S10. The canonical content lives in
-``picarones.evaluation.metrics.robustness_projection``.
-
-The old path ``picarones.measurements.robustness_projection`` is
-kept so that no consumer breaks. At S22, this re-export
-will disappear.
-
-Explicitly re-exposes ``_extract_quality_value`` and
-``_interpolate_cer`` (private symbols used downstream).
-"""
-
-from __future__ import annotations
-
-from picarones.evaluation.metrics.robustness_projection import *  # noqa: F401,F403
-from picarones.evaluation.metrics.robustness_projection import (  # noqa: F401
-    _extract_quality_value,
-    _interpolate_cer,
-)
@@ -1,18 +0,0 @@
-"""``picarones.measurements.roman_numerals``: shim re-export (deprecated, removal in 2.0).
-
-Canonical: :mod:`picarones.evaluation.metrics.roman_numerals`.
-Phase 5.C.batch7 of the legacy removal.
-"""
-
-from __future__ import annotations
-
-import warnings
-
-from picarones.evaluation.metrics.roman_numerals import *  # noqa: F401, F403
-
-warnings.warn(
-    "picarones.measurements.roman_numerals is deprecated and will be removed in 2.0. "
-    "Import from picarones.evaluation.metrics.roman_numerals instead.",
-    DeprecationWarning,
-    stacklevel=2,
-)
@@ -151,7 +151,7 @@ def run_benchmark(
     # avoid re-resolving N times on the worker side.
     norm_profile_obj = None
     if normalization_profile is not None:
-        from picarones.
+        from picarones.evaluation.metrics.normalization import get_builtin_profile
         norm_profile_obj = get_builtin_profile(normalization_profile)
 
     def _is_cancelled() -> bool:
@@ -435,7 +435,7 @@ def run_benchmark(
     inter_engine_payload: Optional[dict] = None
     if len(engine_reports) >= 2:
         try:
-            from picarones.
+            from picarones.evaluation.metrics.inter_engine import compute_inter_engine_analysis
 
             taxonomy_distros = {
                 report.engine_name: (
@@ -1,25 +0,0 @@
-"""``picarones.measurements.specialization``: shim re-export (deprecated, removal in 2.0).
-
-Canonical: :mod:`picarones.evaluation.metrics.specialization`.
-"""
-
-from __future__ import annotations
-
-import warnings
-
-from picarones.evaluation.metrics.specialization import (  # noqa: F401
-    DEFAULT_THRESHOLDS,
-    compute_specialization_score,
-    classify_specialization,
-    compute_specialization_matrix,
-    top_specialized_pairs,
-)
-
-warnings.warn(
-    "picarones.measurements.specialization is deprecated and will be removed in 2.0. "
-    "Import from picarones.evaluation.metrics.specialization instead.",
-    DeprecationWarning,
-    stacklevel=2,
-)
-
-__all__ = ['DEFAULT_THRESHOLDS', 'compute_specialization_score', 'classify_specialization', 'compute_specialization_matrix', 'top_specialized_pairs']
@@ -1,26 +0,0 @@
-"""``picarones.measurements.structure``: shim re-export (deprecated, removal in 2.0).
-
-Canonical: :mod:`picarones.evaluation.metrics.structure`.
-"""
-
-from __future__ import annotations
-
-import warnings
-
-from picarones.evaluation.metrics.structure import (  # noqa: F401
-    StructureResult,
-    analyze_structure,
-    aggregate_structure,
-    _count_line_changes,
-    _reading_order_score,
-    _paragraph_conservation_score,
-)
-
-warnings.warn(
-    "picarones.measurements.structure is deprecated and will be removed in 2.0. "
-    "Import from picarones.evaluation.metrics.structure instead.",
-    DeprecationWarning,
-    stacklevel=2,
-)
-
-__all__ = ['StructureResult', 'analyze_structure', 'aggregate_structure']
@@ -1,33 +0,0 @@
-"""``picarones.measurements.taxonomy``: shim re-export (deprecated, removal in 2.0).
-
-Canonical: :mod:`picarones.evaluation.metrics.taxonomy`.
-"""
-
-from __future__ import annotations
-
-import warnings
-
-from picarones.evaluation.metrics.taxonomy import (  # noqa: F401
-    VISUAL_CONFUSIONS,
-    TaxonomyResult,
-    ERROR_CLASSES,
-    classify_errors,
-    aggregate_taxonomy,
-    _VISUAL_PAIRS,
-    _LATIN_BASIC,
-    _classify_word_error,
-    _is_ligature_error,
-    _is_abbreviation_error,
-    _is_diacritic_error,
-    _is_visual_confusion,
-    _is_oov_word,
-)
-
-warnings.warn(
-    "picarones.measurements.taxonomy is deprecated and will be removed in 2.0. "
-    "Import from picarones.evaluation.metrics.taxonomy instead.",
-    DeprecationWarning,
-    stacklevel=2,
-)
-
-__all__ = ['VISUAL_CONFUSIONS', 'TaxonomyResult', 'ERROR_CLASSES', 'classify_errors', 'aggregate_taxonomy']
@@ -1,10 +0,0 @@
-"""Re-export, Sprint A14-S10. The canonical content lives in
-``picarones.evaluation.metrics.taxonomy_comparison``.
-
-The old path ``picarones.measurements.taxonomy_comparison`` is kept so that
-no consumer breaks. At S22, this re-export will disappear.
-"""
-
-from __future__ import annotations
-
-from picarones.evaluation.metrics.taxonomy_comparison import *  # noqa: F401,F403
@@ -1,10 +0,0 @@
-"""Re-export, Sprint A14-S10. The canonical content lives in
-``picarones.evaluation.metrics.taxonomy_cooccurrence``.
-
-The old path ``picarones.measurements.taxonomy_cooccurrence`` is kept so that
-no consumer breaks. At S22, this re-export will disappear.
-"""
-
-from __future__ import annotations
-
-from picarones.evaluation.metrics.taxonomy_cooccurrence import *  # noqa: F401,F403
@@ -1,23 +0,0 @@
-"""``picarones.measurements.taxonomy_intra_doc``: shim re-export (deprecated, removal in 2.0).
-
-Canonical: :mod:`picarones.evaluation.metrics.taxonomy_intra_doc`.
-"""
-
-from __future__ import annotations
-
-import warnings
-
-from picarones.evaluation.metrics.taxonomy_intra_doc import (  # noqa: F401
-    compute_taxonomy_position_heatmap,
-    _classify_word_pair,
-    _bin_for_position,
-)
-
-warnings.warn(
-    "picarones.measurements.taxonomy_intra_doc is deprecated and will be removed in 2.0. "
-    "Import from picarones.evaluation.metrics.taxonomy_intra_doc instead.",
-    DeprecationWarning,
-    stacklevel=2,
-)
-
-__all__ = ['compute_taxonomy_position_heatmap']
@@ -1,10 +0,0 @@
-"""Re-export, Sprint A14-S10. The canonical content lives in
-``picarones.evaluation.metrics.throughput``.
-
-The old path ``picarones.measurements.throughput`` is kept so that
-no consumer breaks. At S22, this re-export will disappear.
-"""
-
-from __future__ import annotations
-
-from picarones.evaluation.metrics.throughput import *  # noqa: F401,F403
@@ -1,10 +0,0 @@
-"""Re-export, Sprint A14-S10. The canonical content lives in
-``picarones.evaluation.metrics.worst_lines``.
-
-The old path ``picarones.measurements.worst_lines`` is kept so that
-no consumer breaks. At S22, this re-export will disappear.
-"""
-
-from __future__ import annotations
-
-from picarones.evaluation.metrics.worst_lines import *  # noqa: F401,F403
@@ -31,7 +31,7 @@ the user from their composed-pipeline benchmark:
 
 .. code-block:: python
 
-    from picarones.
+    from picarones.evaluation.metrics.error_absorption import (
         compute_error_absorption, aggregate_error_absorption,
     )
     from picarones.reports_v2.html.renderers.error_absorption import (
@@ -27,7 +27,7 @@ Pure module; the user composes:
 
 .. code-block:: python
 
-    from picarones.
+    from picarones.evaluation.metrics.image_predictive import aggregate_corpus_predictive
     from picarones.reports_v2.html.renderers.image_predictive import (
         build_image_predictive_html,
     )
@@ -25,7 +25,7 @@ Pure module; the user composes:
 
 .. code-block:: python
 
-    from picarones.
+    from picarones.evaluation.metrics.incremental_comparison import (
         PipelineRun, compare_isolated_effect,
     )
     from picarones.reports_v2.html.renderers.incremental_comparison import (
@@ -25,7 +25,7 @@ Pure module; the user composes:
 .. code-block:: python
 
     from picarones.measurements.history import BenchmarkHistory
-    from picarones.
+    from picarones.evaluation.metrics.longitudinal import compute_corpus_longitudinal
     from picarones.reports_v2.html.renderers.longitudinal import build_longitudinal_html
 
     hist = BenchmarkHistory(db_path)
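
The renderer hunks show that only paths pointing at a deleted shim change, while `picarones.measurements.history` stays where it is. As a rough sketch of how a bulk rewrite of this kind could be scripted, the block below substitutes the legacy prefix only when one of the shim names follows it. The `SHIM_NAMES` tuple is a small illustrative subset, and `migrate_imports` is a hypothetical helper; neither is taken from the commit, and nothing here claims to be the tool that was actually used.

```python
import re

# Illustrative subset of the deleted shim names; a real run would list them all.
SHIM_NAMES = ("confusion", "error_absorption", "normalization", "taxonomy")

# Match the legacy dotted path only when it is followed by a shim name, so that
# modules remaining under measurements (for example history) are left untouched.
LEGACY_PATH = re.compile(
    r"\bpicarones\.measurements\.(" + "|".join(SHIM_NAMES) + r")\b"
)


def migrate_imports(source: str) -> str:
    """Rewrite legacy shim paths to their evaluation.metrics equivalents."""
    return LEGACY_PATH.sub(r"picarones.evaluation.metrics.\1", source)


sample = (
    "from picarones.measurements.history import BenchmarkHistory\n"
    "from picarones.measurements.error_absorption import compute_error_absorption\n"
    'patch("picarones.measurements.confusion.X")\n'
)
print(migrate_imports(sample))
```

Because the pattern matches the dotted path rather than the `from ... import` statement, it also covers string arguments such as `patch("picarones.measurements.confusion.X")`. The trailing `\b` keeps `taxonomy` from matching inside a longer name such as `taxonomy_comparison` unless that name is listed too.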
@@ -30,7 +30,7 @@ Pure module; the user composes the list from their
 
 .. code-block:: python
 
-    from picarones.
+    from picarones.evaluation.metrics.module_policy import audit_module
     from picarones.reports_v2.html.renderers.module_audit import build_module_audit_html
 
     audits = []
@@ -21,7 +21,7 @@ the user composes:
 .. code-block:: python
 
     from picarones.measurements.robustness import analyze_robustness
-    from picarones.
+    from picarones.evaluation.metrics.robustness_projection import (
         project_robustness_on_corpus,
         aggregate_projection_per_engine,
     )