Ma-Ri-Ba-Ku / Picarones
Claude committed about 1 month ago

Commit b5c2eaf (unverified) · 1 parent: cfac168

feat(adapters/ocr): Sprint A14-S33, native GoogleVisionAdapter (no shim)

Native migration of the legacy picarones.engines.google_vision to
BaseOCRAdapter (S26). Not a shim.

picarones/adapters/ocr/google_vision.py
---------------------------------------
- GoogleVisionAdapter(BaseOCRAdapter), execution_mode = "io".
- Keyword-only constructor: name, language_hints (default ["fr"]),
  feature_type (DOCUMENT_TEXT_DETECTION or TEXT_DETECTION),
  api_key/credentials_path (override the GOOGLE_API_KEY/
  GOOGLE_APPLICATION_CREDENTIALS env vars), timeout_seconds.
- Constructor validation: name is alphanumeric plus _ and -,
  feature_type is in the valid set, timeout > 0.
- Routing: credentials → google-cloud-vision SDK; otherwise api_key →
  REST via urllib; otherwise OCRAdapterError.
- DOCUMENT_TEXT_DETECTION → fullTextAnnotation.text (the concatenated
  text). TEXT_DETECTION → textAnnotations[0].description.
- SDK ImportError → clear OCRAdapterError with the pip install command.
- HTTP errors (HTTPError and others) are wrapped in OCRAdapterError.
- Error in response.error → OCRAdapterError carrying the Google message.
- Writes to <stem>.<name>.txt next to the image.
- Artifact id "<doc>:<name>:raw_text".
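
The routing precedence listed above can be sketched in isolation. This is a minimal sketch and not the adapter's code: `select_route` and its `env` parameter are hypothetical names introduced for illustration, and `RuntimeError` stands in for `OCRAdapterError`. Only the two environment variable names come from the commit.

```python
from __future__ import annotations

import os
from collections.abc import Mapping


def select_route(
    credentials_path: str | None = None,
    api_key: str | None = None,
    env: Mapping[str, str] | None = None,
) -> str:
    """Return "sdk" or "rest" following the precedence described above."""
    env = os.environ if env is None else env
    # Explicit arguments win over environment variables.
    creds = credentials_path or env.get("GOOGLE_APPLICATION_CREDENTIALS")
    key = api_key or env.get("GOOGLE_API_KEY")
    if creds:
        return "sdk"   # service account JSON -> google-cloud-vision SDK
    if key:
        return "rest"  # plain API key -> REST call through urllib
    # Stand-in for OCRAdapterError in this sketch.
    raise RuntimeError("no Google Vision authentication configured")


print(select_route(api_key="k", env={}))
print(select_route(api_key="k", env={"GOOGLE_APPLICATION_CREDENTIALS": "/sa.json"}))
```

Under this rule a service account found in the environment selects the SDK route even when an explicit `api_key` is passed, which appears to match the order of the checks in `execute`.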

Dedicated S33 tests (29 new)
----------------------------
- Constructor: defaults, custom name/feature_type/language_hints,
  rejects invalid feature_type, empty/invalid name, non-positive
  timeout.
- Contract: isinstance BaseOCRAdapter, input/output_types,
  execution_mode = "io".
- Auth: no auth → OCRAdapterError, explicit credentials_path takes
  priority over env, env fallback, explicit api_key takes priority.
- InputValidation: IMAGE missing, no URI, nonexistent image → all
  OCRAdapterError.
- REST: DOCUMENT_TEXT_DETECTION extracts fullTextAnnotation.text,
  TEXT_DETECTION extracts textAnnotations[0].description, empty
  responses → empty text, error in response → OCRAdapterError, writes
  <stem>.<name>.txt.
- SDK: credentials_path routes to the SDK, missing SDK →
  OCRAdapterError, SDK exception → wrapped OCRAdapterError.
- ArtifactID: uses the adapter name.
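
The REST tests listed above never touch the network: they patch `urllib.request.urlopen` so that it yields a canned JSON body. A minimal sketch of that pattern follows. `fetch_json` is a hypothetical stand-in for the adapter's REST call, and `fake_urlopen` is an illustrative helper name.

```python
import json
import urllib.request
from unittest.mock import MagicMock, patch


def fetch_json(url: str, timeout: float = 60.0) -> dict:
    """Hypothetical stand-in for the adapter's REST call."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def fake_urlopen(payload: dict):
    """Patch urlopen so the `with` block yields a canned JSON body."""
    resp = MagicMock()
    resp.read.return_value = json.dumps(payload).encode("utf-8")
    resp.__enter__.return_value = resp  # supports `with urlopen(...) as resp`
    return patch("urllib.request.urlopen", return_value=resp)


canned = {"responses": [{"fullTextAnnotation": {"text": "Bonjour"}}]}
with fake_urlopen(canned):
    result = fetch_json("https://vision.googleapis.com/v1/images:annotate")
print(result["responses"][0]["fullTextAnnotation"]["text"])
```

Because the function under test looks up `urllib.request.urlopen` at call time, patching the module attribute is enough and no HTTP request is made.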

No confidences in S33
---------------------
Confidences (legacy S50 Word.confidence in pages.blocks.paragraphs)
are deferred to the dedicated ConfidenceArtifact sprint.
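
For orientation only, here is a rough sketch of what walking word confidences in a REST-style `fullTextAnnotation` could look like. It is not part of this commit and not the planned ConfidenceArtifact design. `iter_word_confidences` is a hypothetical name and the sample data is hand-written for illustration.

```python
from __future__ import annotations

from typing import Any, Iterator


def iter_word_confidences(
    full_text_annotation: dict[str, Any],
) -> Iterator[tuple[str, float | None]]:
    """Yield (word, confidence) pairs from a REST-style fullTextAnnotation dict."""
    for page in full_text_annotation.get("pages", []):
        for block in page.get("blocks", []):
            for paragraph in block.get("paragraphs", []):
                for word in paragraph.get("words", []):
                    # A word's text is spread over its symbols.
                    text = "".join(
                        s.get("text", "") for s in word.get("symbols", [])
                    )
                    yield text, word.get("confidence")


# Hand-written sample, not real API output.
sample = {
    "pages": [{
        "blocks": [{
            "paragraphs": [{
                "words": [
                    {"confidence": 0.98,
                     "symbols": [{"text": "B"}, {"text": "o"}, {"text": "n"}]},
                    {"symbols": [{"text": "!"}]},  # confidence may be absent
                ],
            }],
        }],
    }],
}
for word, confidence in iter_word_confidences(sample):
    print(word, confidence)
```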

Tests: 4700 passed, 11 skipped (vs 4671 before: +29 for S33).
Lint: ruff check picarones/ tests/ → All checks passed.

https://claude.ai/code/session_011XQZNitg1rCgia8ZD1a2hP

Files changed (4)

README.md +1 -1
picarones/adapters/ocr/__init__.py +2 -0
picarones/adapters/ocr/google_vision.py +298 -0
tests/adapters/ocr/test_sprint_a14_s33_google_vision_adapter.py +418 -0

README.md CHANGED

@@ -396,7 +396,7 @@ ruff check picarones/ tests/
 python -m mypy picarones/core/
 ```
-**Test suite**: ~4690 tests, ~3 min on a modern laptop. Coverage
+**Test suite**: ~4720 tests, ~3 min on a modern laptop. Coverage
 floor at 85% (currently ~87%). The `network` marker excludes tests
 requiring live HTTP. A handful of tests depend on optional engines
 (`pero-ocr`, `pytesseract`) and are skipped/fail gracefully when

picarones/adapters/ocr/__init__.py CHANGED

@@ -20,6 +20,7 @@ dédiés, **natifs** au nouveau contrat (pas de shim sur le legacy
 from __future__ import annotations
 from picarones.adapters.ocr.base import BaseOCRAdapter, OCRAdapterError
+from picarones.adapters.ocr.google_vision import GoogleVisionAdapter
 from picarones.adapters.ocr.mistral_ocr import MistralOCRAdapter
 from picarones.adapters.ocr.pero_ocr import PeroOCRAdapter
 from picarones.adapters.ocr.precomputed import PrecomputedTextAdapter
@@ -28,6 +29,7 @@ from picarones.adapters.ocr.tesseract import TesseractAdapter
 __all__ = [
     "BaseOCRAdapter",
     "OCRAdapterError",
+    "GoogleVisionAdapter",
     "MistralOCRAdapter",
     "PeroOCRAdapter",
     "PrecomputedTextAdapter",

picarones/adapters/ocr/google_vision.py ADDED

@@ -0,0 +1,298 @@

+"""Native ``GoogleVisionAdapter`` (Sprint A14-S33).
+Native migration of the legacy ``picarones.engines.google_vision.GoogleVisionEngine``
+to the ``BaseOCRAdapter`` contract (S26).  **Not a shim**.
+The legacy engine stays in place until S46.
+BnF use case
+------------
+Google Cloud Vision offers two OCR modes:
+- ``DOCUMENT_TEXT_DETECTION`` (default): optimized for dense,
+  multilingual text. Returns a hierarchical ``fullTextAnnotation``
+  (pages → blocks → paragraphs → words → symbols) with a flat
+  ``text`` field.
+- ``TEXT_DETECTION``: short mode; returns only
+  ``textAnnotations[0].description``.
+The adapter routes automatically to the SDK (service account auth) or
+to direct REST (API key auth) depending on the available configuration.
+Configuration
+-------------
+Constructor:
+- ``name`` (default ``"google_vision"``).
+- ``language_hints`` (default ``["fr"]``): Vision API language hints.
+- ``feature_type`` (default ``"DOCUMENT_TEXT_DETECTION"``).
+- ``api_key``: Google API key.  If ``None``, reads ``GOOGLE_API_KEY``.
+- ``credentials_path``: path to a service account JSON.  If
+  ``None``, reads ``GOOGLE_APPLICATION_CREDENTIALS``.
+- ``timeout_seconds`` (default 60).
+At least one of the two authentication methods (SDK or REST) must be
+available.
+Avoiding over-engineering
+-------------------------
+- No confidence extraction (legacy S50, deferred).
+- No pre-validation of the service account JSON; the SDK does that.
+- No batch support: one call per image.
+"""
+from __future__ import annotations
+import base64
+import json
+import os
+import urllib.error
+import urllib.request
+from pathlib import Path
+from typing import Any
+from picarones.adapters.ocr.base import BaseOCRAdapter, OCRAdapterError
+from picarones.domain.artifacts import Artifact, ArtifactType
+_VALID_FEATURE_TYPES = frozenset({"DOCUMENT_TEXT_DETECTION", "TEXT_DETECTION"})
+class GoogleVisionAdapter(BaseOCRAdapter):
+    """Google Cloud Vision adapter, native to the S26 contract.
+    Parameters
+    ----------
+    name:
+        Human-readable identifier.  Default ``"google_vision"``.
+    language_hints:
+        Vision API language hints.  Default ``["fr"]``.
+    feature_type:
+        ``"DOCUMENT_TEXT_DETECTION"`` (default) or ``"TEXT_DETECTION"``.
+    api_key:
+        Explicit API key.  If ``None``, reads ``GOOGLE_API_KEY``.
+    credentials_path:
+        Explicit service account JSON path.  If ``None``, reads
+        ``GOOGLE_APPLICATION_CREDENTIALS``.
+    timeout_seconds:
+        HTTP timeout (REST).  Default 60.
+    Raises
+    ------
+    OCRAdapterError
+        At construction if name, feature_type, or timeout_seconds is
+        invalid.
+    """
+    input_types = frozenset({ArtifactType.IMAGE})
+    output_types = frozenset({ArtifactType.RAW_TEXT})
+    execution_mode = "io"
+    def __init__(
+        self,
+        *,
+        name: str = "google_vision",
+        language_hints: list[str] | None = None,
+        feature_type: str = "DOCUMENT_TEXT_DETECTION",
+        api_key: str | None = None,
+        credentials_path: str | None = None,
+        timeout_seconds: float = 60.0,
+    ) -> None:
+        if not name or not name.strip():
+            raise OCRAdapterError(
+                "GoogleVisionAdapter : name vide non autorisé.",
+            )
+        if not all(c.isalnum() or c in "_-" for c in name):
+            raise OCRAdapterError(
+                f"GoogleVisionAdapter : name invalide {name!r} — "
+                "alphanumérique + _ - uniquement.",
+            )
+        if feature_type not in _VALID_FEATURE_TYPES:
+            raise OCRAdapterError(
+                f"GoogleVisionAdapter : feature_type invalide "
+                f"{feature_type!r}.  Valeurs valides : "
+                f"{sorted(_VALID_FEATURE_TYPES)}.",
+            )
+        if timeout_seconds <= 0:
+            raise OCRAdapterError(
+                f"GoogleVisionAdapter : timeout_seconds doit être > 0, "
+                f"reçu {timeout_seconds}.",
+            )
+        self._name = name
+        self._language_hints = list(language_hints or ["fr"])
+        self._feature_type = feature_type
+        self._explicit_api_key = api_key
+        self._explicit_credentials = credentials_path
+        self._timeout = timeout_seconds
+    @property
+    def name(self) -> str:
+        return self._name
+    @property
+    def feature_type(self) -> str:
+        return self._feature_type
+    def _resolve_credentials_path(self) -> str | None:
+        return self._explicit_credentials or os.environ.get(
+            "GOOGLE_APPLICATION_CREDENTIALS",
+        )
+    def _resolve_api_key(self) -> str | None:
+        return self._explicit_api_key or os.environ.get("GOOGLE_API_KEY")
+    def execute(
+        self,
+        inputs: dict[ArtifactType, Artifact],
+        params: dict[str, Any],
+        context: Any,
+    ) -> dict[ArtifactType, Artifact]:
+        """Run Google Vision OCR on the given image.
+        Routing:
+        - If a service account JSON is available
+          (``credentials_path`` or ``GOOGLE_APPLICATION_CREDENTIALS``)
+          → go through the ``google-cloud-vision`` SDK.
+        - Otherwise, if a plain API key is available
+          (``api_key`` or ``GOOGLE_API_KEY``) → go through direct REST
+          via ``urllib``.
+        - Otherwise → ``OCRAdapterError``.
+        """
+        if ArtifactType.IMAGE not in inputs:
+            raise OCRAdapterError(
+                f"{self.name} : input IMAGE manquant.",
+            )
+        image_artifact = inputs[ArtifactType.IMAGE]
+        if image_artifact.uri is None:
+            raise OCRAdapterError(
+                f"{self.name} : artefact image "
+                f"{image_artifact.id!r} sans URI.",
+            )
+        image_path = Path(image_artifact.uri)
+        if not image_path.exists():
+            raise OCRAdapterError(
+                f"{self.name} : image introuvable {image_path!r}.",
+            )
+        creds = self._resolve_credentials_path()
+        api_key = self._resolve_api_key()
+        if creds:
+            text = self._call_via_sdk(image_path)
+        elif api_key:
+            text = self._call_via_rest(image_path, api_key)
+        else:
+            raise OCRAdapterError(
+                f"{self.name} : authentification manquante. Définir "
+                "GOOGLE_APPLICATION_CREDENTIALS (service account JSON) "
+                "ou GOOGLE_API_KEY.",
+            )
+        text_path = (
+            image_path.parent / f"{image_path.stem}.{self.name}.txt"
+        )
+        text_path.write_text(text, encoding="utf-8")
+        return {
+            ArtifactType.RAW_TEXT: Artifact(
+                id=f"{context.document_id}:{self.name}:raw_text",
+                document_id=context.document_id,
+                type=ArtifactType.RAW_TEXT,
+                produced_by_step="ocr",
+                uri=str(text_path),
+            ),
+        }
+    # ──────────────────────────────────────────────────────────────
+    # SDK / REST
+    # ──────────────────────────────────────────────────────────────
+    def _call_via_sdk(self, image_path: Path) -> str:
+        try:
+            from google.cloud import vision
+        except ImportError as exc:
+            raise OCRAdapterError(
+                f"{self.name} : SDK google-cloud-vision non installé. "
+                "Installer avec : pip install google-cloud-vision",
+            ) from exc
+        try:
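+            # NOTE: an explicit credentials_path only selects this SDK route;
+            # it is not passed to the client, which resolves credentials on
+            # its own (e.g. from GOOGLE_APPLICATION_CREDENTIALS).
+            client = vision.ImageAnnotatorClient()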
+            image = vision.Image(content=image_path.read_bytes())
+            ctx = vision.ImageContext(language_hints=self._language_hints)
+            if self._feature_type == "DOCUMENT_TEXT_DETECTION":
+                response = client.document_text_detection(
+                    image=image, image_context=ctx,
+                )
+                text = response.full_text_annotation.text
+            else:
+                response = client.text_detection(
+                    image=image, image_context=ctx,
+                )
+                texts = response.text_annotations
+                text = texts[0].description if texts else ""
+        except Exception as exc:
+            raise OCRAdapterError(
+                f"{self.name} : SDK Google Vision a levé : "
+                f"{type(exc).__name__}: {exc}",
+            ) from exc
+        return text
+    def _call_via_rest(self, image_path: Path, api_key: str) -> str:
+        image_b64 = base64.b64encode(
+            image_path.read_bytes(),
+        ).decode("ascii")
+        payload = json.dumps({
+            "requests": [{
+                "image": {"content": image_b64},
+                "features": [
+                    {"type": self._feature_type, "maxResults": 1},
+                ],
+                "imageContext": {"languageHints": self._language_hints},
+            }],
+        }).encode("utf-8")
+        req = urllib.request.Request(
+            "https://vision.googleapis.com/v1/images:annotate",
+            data=payload,
+            headers={
+                "Content-Type": "application/json",
+                "X-Goog-Api-Key": api_key,
+            },
+        )
+        try:
+            with urllib.request.urlopen(req, timeout=self._timeout) as resp:
+                result = json.loads(resp.read().decode("utf-8"))
+        except urllib.error.HTTPError as exc:
+            body = ""
+            try:
+                body = exc.read().decode("utf-8")
+            except Exception:  # noqa: BLE001
+                pass
+            raise OCRAdapterError(
+                f"{self.name} : Google Vision API erreur {exc.code} : {body}",
+            ) from exc
+        except Exception as exc:
+            raise OCRAdapterError(
+                f"{self.name} : erreur API Google Vision : "
+                f"{type(exc).__name__}: {exc}",
+            ) from exc
+        responses = result.get("responses", [{}])
+        if not responses:
+            return ""
+        r = responses[0]
+        if "error" in r:
+            raise OCRAdapterError(
+                f"{self.name} : Google Vision API erreur : {r['error']}",
+            )
+        if self._feature_type == "DOCUMENT_TEXT_DETECTION":
+            full = r.get("fullTextAnnotation") or {}
+            return full.get("text", "") if isinstance(full, dict) else ""
+        # TEXT_DETECTION
+        texts = r.get("textAnnotations", [])
+        return texts[0]["description"] if texts else ""
+__all__ = ["GoogleVisionAdapter"]
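
The output naming used at the end of `execute` can be illustrated on its own. The following is a minimal sketch, assuming POSIX-style paths. `output_text_path` and `raw_text_artifact_id` are hypothetical helper names that do not exist in the adapter; they only mirror the two f-strings in the diff above.

```python
from pathlib import PurePosixPath


def output_text_path(image_path: PurePosixPath, adapter_name: str) -> PurePosixPath:
    """Sibling text file: <stem>.<name>.txt next to the image."""
    return image_path.parent / f"{image_path.stem}.{adapter_name}.txt"


def raw_text_artifact_id(document_id: str, adapter_name: str) -> str:
    """Artifact id in the "<doc>:<name>:raw_text" form."""
    return f"{document_id}:{adapter_name}:raw_text"


image = PurePosixPath("scans/page.png")
print(output_text_path(image, "google_vision"))  # scans/page.google_vision.txt
print(output_text_path(image, "my_gv"))          # scans/page.my_gv.txt
print(raw_text_artifact_id("d1", "my_gv"))       # d1:my_gv:raw_text
```

Including the adapter name in both the file name and the artifact id means two adapters with distinct names can process the same image without overwriting each other's output.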

tests/adapters/ocr/test_sprint_a14_s33_google_vision_adapter.py ADDED

@@ -0,0 +1,418 @@

+"""Sprint A14-S33: ``GoogleVisionAdapter`` native to the S26 contract."""
+from __future__ import annotations
+import json
+import sys
+from pathlib import Path
+from unittest.mock import MagicMock, patch
+import pytest
+from picarones.adapters.ocr import (
+    BaseOCRAdapter,
+    GoogleVisionAdapter,
+    OCRAdapterError,
+)
+from picarones.domain.artifacts import Artifact, ArtifactType
+from picarones.pipeline.types import RunContext
+def _make_image_artifact(uri: str) -> Artifact:
+    return Artifact(
+        id="d1:img",
+        document_id="d1",
+        type=ArtifactType.IMAGE,
+        uri=uri,
+    )
+def _make_context() -> RunContext:
+    return RunContext(
+        document_id="d1",
+        code_version="1.0.0",
+        pipeline_name="test",
+    )
+def _make_dummy_image(tmp_path: Path) -> Path:
+    path = tmp_path / "page.png"
+    path.write_bytes(b"PNG_FAKE_BYTES")
+    return path
+# ──────────────────────────────────────────────────────────────────────
+# Constructor
+# ──────────────────────────────────────────────────────────────────────
+class TestGoogleVisionConstructor:
+    def test_defaults(self) -> None:
+        adapter = GoogleVisionAdapter()
+        assert adapter.name == "google_vision"
+        assert adapter.feature_type == "DOCUMENT_TEXT_DETECTION"
+    def test_custom_name(self) -> None:
+        adapter = GoogleVisionAdapter(name="my_gv")
+        assert adapter.name == "my_gv"
+    def test_text_detection_feature(self) -> None:
+        adapter = GoogleVisionAdapter(feature_type="TEXT_DETECTION")
+        assert adapter.feature_type == "TEXT_DETECTION"
+    def test_rejects_invalid_feature_type(self) -> None:
+        with pytest.raises(OCRAdapterError, match="feature_type"):
+            GoogleVisionAdapter(feature_type="UNKNOWN_FEATURE")
+    def test_rejects_empty_name(self) -> None:
+        with pytest.raises(OCRAdapterError, match="vide"):
+            GoogleVisionAdapter(name="")
+    def test_rejects_invalid_chars_in_name(self) -> None:
+        with pytest.raises(OCRAdapterError, match="invalide"):
+            GoogleVisionAdapter(name="bad name")
+    def test_rejects_non_positive_timeout(self) -> None:
+        with pytest.raises(OCRAdapterError, match="timeout"):
+            GoogleVisionAdapter(timeout_seconds=0)
+    def test_default_language_hints(self) -> None:
+        adapter = GoogleVisionAdapter()
+        # Check that the hints are stored (private but accessible).
+        assert adapter._language_hints == ["fr"]
+    def test_custom_language_hints(self) -> None:
+        adapter = GoogleVisionAdapter(language_hints=["en", "lat"])
+        assert adapter._language_hints == ["en", "lat"]
+# ──────────────────────────────────────────────────────────────────────
+# BaseOCRAdapter contract
+# ──────────────────────────────────────────────────────────────────────
+class TestGoogleVisionContract:
+    def test_inherits_base_adapter(self) -> None:
+        adapter = GoogleVisionAdapter()
+        assert isinstance(adapter, BaseOCRAdapter)
+    def test_input_types(self) -> None:
+        assert GoogleVisionAdapter.input_types == frozenset({ArtifactType.IMAGE})
+    def test_output_types(self) -> None:
+        assert GoogleVisionAdapter.output_types == frozenset({ArtifactType.RAW_TEXT})
+    def test_execution_mode_is_io(self) -> None:
+        assert GoogleVisionAdapter.execution_mode == "io"
+# ──────────────────────────────────────────────────────────────────────
+# Auth resolution
+# ──────────────────────────────────────────────────────────────────────
+class TestGoogleVisionAuth:
+    def test_no_auth_raises(self, tmp_path: Path) -> None:
+        adapter = GoogleVisionAdapter()
+        image_path = _make_dummy_image(tmp_path)
+        artifact = _make_image_artifact(str(image_path))
+        with patch.dict("os.environ", {}, clear=True):
+            with pytest.raises(OCRAdapterError, match="authentification manquante"):
+                adapter.execute(
+                    inputs={ArtifactType.IMAGE: artifact},
+                    params={},
+                    context=_make_context(),
+                )
+    def test_explicit_credentials_path_takes_priority(self) -> None:
+        adapter = GoogleVisionAdapter(credentials_path="/explicit/creds.json")
+        with patch.dict(
+            "os.environ",
+            {"GOOGLE_APPLICATION_CREDENTIALS": "/env/creds.json"},
+        ):
+            assert adapter._resolve_credentials_path() == "/explicit/creds.json"
+    def test_env_credentials_fallback(self) -> None:
+        adapter = GoogleVisionAdapter()
+        with patch.dict(
+            "os.environ",
+            {"GOOGLE_APPLICATION_CREDENTIALS": "/env/creds.json"},
+        ):
+            assert adapter._resolve_credentials_path() == "/env/creds.json"
+    def test_explicit_api_key_takes_priority(self) -> None:
+        adapter = GoogleVisionAdapter(api_key="explicit_key")
+        with patch.dict("os.environ", {"GOOGLE_API_KEY": "env_key"}):
+            assert adapter._resolve_api_key() == "explicit_key"
+# ──────────────────────────────────────────────────────────────────────
+# Input validation
+# ──────────────────────────────────────────────────────────────────────
+class TestGoogleVisionInputValidation:
+    def test_missing_image_input_raises(self) -> None:
+        adapter = GoogleVisionAdapter(api_key="x")
+        with pytest.raises(OCRAdapterError, match="IMAGE manquant"):
+            adapter.execute(inputs={}, params={}, context=_make_context())
+    def test_image_artifact_without_uri_raises(self) -> None:
+        adapter = GoogleVisionAdapter(api_key="x")
+        artifact = Artifact(
+            id="d1:img",
+            document_id="d1",
+            type=ArtifactType.IMAGE,
+            uri=None,
+        )
+        with pytest.raises(OCRAdapterError, match="sans URI"):
+            adapter.execute(
+                inputs={ArtifactType.IMAGE: artifact},
+                params={},
+                context=_make_context(),
+            )
+    def test_image_path_does_not_exist_raises(self) -> None:
+        adapter = GoogleVisionAdapter(api_key="x")
+        artifact = _make_image_artifact("/nonexistent/img.png")
+        with pytest.raises(OCRAdapterError, match="introuvable"):
+            adapter.execute(
+                inputs={ArtifactType.IMAGE: artifact},
+                params={},
+                context=_make_context(),
+            )
+# ──────────────────────────────────────────────────────────────────────
+# REST API path (api_key)
+# ──────────────────────────────────────────────────────────────────────
+class TestGoogleVisionREST:
+    def _mock_urlopen(self, response_dict: dict):
+        mock_resp = MagicMock()
+        mock_resp.read.return_value = json.dumps(response_dict).encode("utf-8")
+        mock_resp.__enter__.return_value = mock_resp
+        return patch("urllib.request.urlopen", return_value=mock_resp)
+    def test_document_text_detection_returns_full_text(
+        self, tmp_path: Path,
+    ) -> None:
+        adapter = GoogleVisionAdapter(api_key="x")
+        image_path = _make_dummy_image(tmp_path)
+        artifact = _make_image_artifact(str(image_path))
+        response = {
+            "responses": [{
+                "fullTextAnnotation": {"text": "Bonjour\nle monde"},
+            }],
+        }
+        with self._mock_urlopen(response):
+            result = adapter.execute(
+                inputs={ArtifactType.IMAGE: artifact},
+                params={},
+                context=_make_context(),
+            )
+        out_text = Path(result[ArtifactType.RAW_TEXT].uri).read_text(
+            encoding="utf-8",
+        )
+        assert out_text == "Bonjour\nle monde"
+    def test_text_detection_returns_first_annotation(
+        self, tmp_path: Path,
+    ) -> None:
+        adapter = GoogleVisionAdapter(
+            api_key="x", feature_type="TEXT_DETECTION",
+        )
+        image_path = _make_dummy_image(tmp_path)
+        artifact = _make_image_artifact(str(image_path))
+        response = {
+            "responses": [{
+                "textAnnotations": [
+                    {"description": "Texte court"},
+                ],
+            }],
+        }
+        with self._mock_urlopen(response):
+            result = adapter.execute(
+                inputs={ArtifactType.IMAGE: artifact},
+                params={},
+                context=_make_context(),
+            )
+        out_text = Path(result[ArtifactType.RAW_TEXT].uri).read_text(
+            encoding="utf-8",
+        )
+        assert out_text == "Texte court"
+    def test_empty_responses_returns_empty_text(self, tmp_path: Path) -> None:
+        adapter = GoogleVisionAdapter(api_key="x")
+        image_path = _make_dummy_image(tmp_path)
+        artifact = _make_image_artifact(str(image_path))
+        with self._mock_urlopen({"responses": [{}]}):
+            result = adapter.execute(
+                inputs={ArtifactType.IMAGE: artifact},
+                params={},
+                context=_make_context(),
+            )
+        out_text = Path(result[ArtifactType.RAW_TEXT].uri).read_text(
+            encoding="utf-8",
+        )
+        assert out_text == ""
+    def test_api_error_in_response_raises(self, tmp_path: Path) -> None:
+        adapter = GoogleVisionAdapter(api_key="x")
+        image_path = _make_dummy_image(tmp_path)
+        artifact = _make_image_artifact(str(image_path))
+        response = {
+            "responses": [{
+                "error": {"code": 7, "message": "Permission denied"},
+            }],
+        }
+        with self._mock_urlopen(response):
+            with pytest.raises(OCRAdapterError, match="Permission denied"):
+                adapter.execute(
+                    inputs={ArtifactType.IMAGE: artifact},
+                    params={},
+                    context=_make_context(),
+                )
+    def test_writes_to_stem_name_pattern(self, tmp_path: Path) -> None:
+        adapter = GoogleVisionAdapter(api_key="x", name="my_gv")
+        image_path = _make_dummy_image(tmp_path)
+        artifact = _make_image_artifact(str(image_path))
+        response = {"responses": [{"fullTextAnnotation": {"text": "x"}}]}
+        with self._mock_urlopen(response):
+            result = adapter.execute(
+                inputs={ArtifactType.IMAGE: artifact},
+                params={},
+                context=_make_context(),
+            )
+        out_path = Path(result[ArtifactType.RAW_TEXT].uri)
+        assert out_path.name == "page.my_gv.txt"
+# ──────────────────────────────────────────────────────────────────────
+# SDK path (credentials_path)
+# ──────────────────────────────────────────────────────────────────────
+class TestGoogleVisionSDK:
+    def test_credentials_path_routes_to_sdk(self, tmp_path: Path) -> None:
+        creds_path = tmp_path / "creds.json"
+        creds_path.write_text("{}")
+        adapter = GoogleVisionAdapter(credentials_path=str(creds_path))
+        image_path = _make_dummy_image(tmp_path)
+        artifact = _make_image_artifact(str(image_path))
+        # Mock the google.cloud.vision SDK
+        mock_response = MagicMock()
+        mock_response.full_text_annotation.text = "SDK output text"
+        mock_client = MagicMock()
+        mock_client.document_text_detection.return_value = mock_response
+        fake_vision = MagicMock()
+        fake_vision.ImageAnnotatorClient = MagicMock(return_value=mock_client)
+        fake_vision.Image = MagicMock(return_value="image_obj")
+        fake_vision.ImageContext = MagicMock(return_value="ctx_obj")
+        fake_module = MagicMock()
+        fake_module.vision = fake_vision
+        with patch.dict(sys.modules, {
+            "google": fake_module,
+            "google.cloud": fake_module,
+            "google.cloud.vision": fake_vision,
+        }):
+            result = adapter.execute(
+                inputs={ArtifactType.IMAGE: artifact},
+                params={},
+                context=_make_context(),
+            )
+        out_text = Path(result[ArtifactType.RAW_TEXT].uri).read_text(
+            encoding="utf-8",
+        )
+        assert out_text == "SDK output text"
+    def test_sdk_missing_raises_clean_error(self, tmp_path: Path) -> None:
+        creds_path = tmp_path / "creds.json"
+        creds_path.write_text("{}")
+        adapter = GoogleVisionAdapter(credentials_path=str(creds_path))
+        image_path = _make_dummy_image(tmp_path)
+        artifact = _make_image_artifact(str(image_path))
+        with patch.dict(sys.modules, {
+            "google.cloud.vision": None,
+            "google.cloud": None,
+        }):
+            with pytest.raises(OCRAdapterError, match="google-cloud-vision"):
+                adapter.execute(
+                    inputs={ArtifactType.IMAGE: artifact},
+                    params={},
+                    context=_make_context(),
+                )
+    def test_sdk_internal_error_wrapped(self, tmp_path: Path) -> None:
+        creds_path = tmp_path / "creds.json"
+        creds_path.write_text("{}")
+        adapter = GoogleVisionAdapter(credentials_path=str(creds_path))
+        image_path = _make_dummy_image(tmp_path)
+        artifact = _make_image_artifact(str(image_path))
+        mock_client = MagicMock()
+        mock_client.document_text_detection.side_effect = RuntimeError(
+            "SDK boom",
+        )
+        fake_vision = MagicMock()
+        fake_vision.ImageAnnotatorClient = MagicMock(return_value=mock_client)
+        fake_vision.Image = MagicMock(return_value="image_obj")
+        fake_vision.ImageContext = MagicMock(return_value="ctx_obj")
+        fake_module = MagicMock()
+        fake_module.vision = fake_vision
+        with patch.dict(sys.modules, {
+            "google": fake_module,
+            "google.cloud": fake_module,
+            "google.cloud.vision": fake_vision,
+        }):
+            with pytest.raises(OCRAdapterError, match="RuntimeError.*SDK boom"):
+                adapter.execute(
+                    inputs={ArtifactType.IMAGE: artifact},
+                    params={},
+                    context=_make_context(),
+                )
+# ──────────────────────────────────────────────────────────────────────
+# Artifact ID
+# ──────────────────────────────────────────────────────────────────────
+class TestGoogleVisionArtifactID:
+    def test_artifact_id_uses_adapter_name(self, tmp_path: Path) -> None:
+        adapter = GoogleVisionAdapter(api_key="x", name="custom_gv")
+        image_path = _make_dummy_image(tmp_path)
+        artifact = _make_image_artifact(str(image_path))
+        response = {"responses": [{"fullTextAnnotation": {"text": "x"}}]}
+        mock_resp = MagicMock()
+        mock_resp.read.return_value = json.dumps(response).encode("utf-8")
+        mock_resp.__enter__.return_value = mock_resp
+        with patch("urllib.request.urlopen", return_value=mock_resp):
+            result = adapter.execute(
+                inputs={ArtifactType.IMAGE: artifact},
+                params={},
+                context=_make_context(),
+            )
+        produced = result[ArtifactType.RAW_TEXT]
+        assert produced.id == "d1:custom_gv:raw_text"
+        assert produced.document_id == "d1"
+        assert produced.produced_by_step == "ocr"
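
The missing-SDK test above relies on a standard import-system behavior: a `None` entry in `sys.modules` makes importing that name raise `ImportError`. A minimal sketch of the technique follows, using the stdlib module `colorsys` as a stand-in for the optional SDK. `load_optional` is a hypothetical helper, and `RuntimeError` plays the role of `OCRAdapterError`.

```python
import importlib
import sys
from unittest.mock import patch


def load_optional(module_name: str):
    """Import a module, turning a missing dependency into a clear error."""
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        raise RuntimeError(
            f"{module_name} is not installed; install it to use this feature"
        ) from exc


# A None entry in sys.modules makes the import machinery raise ImportError.
with patch.dict(sys.modules, {"colorsys": None}):
    try:
        load_optional("colorsys")
        outcome = "imported"
    except RuntimeError:
        outcome = "missing"
print(outcome)

# Outside the patch the real module is importable again.
print(load_optional("colorsys").__name__)
```

`patch.dict` restores `sys.modules` when the block exits, so the simulated absence does not leak into other tests.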