hawkdev committed
Commit f11f984 · Parent(s): a46999a

fixing errors
README.md CHANGED
@@ -21,14 +21,20 @@ This folder is a **drop-in replacement** for the course Space
 
 1. On Hugging Face, **Duplicate** the template Space above (or create a new Gradio Space and copy these files into the repo root).
 2. The repo **`README.md` frontmatter must include `hf_oauth: true`** so Hugging Face injects `OAUTH_CLIENT_ID` / OAuth for `gr.LoginButton()` (required by Gradio 5.10+).
-3. In the Space **Settings → Repository secrets**, add:
-   - **`HF_TOKEN`**: a Hugging Face access token with **read** permission (for Inference API / serverless models).
-   - If you see **402 Payment Required** during evaluation, your **Inference Provider credits** are exhausted — add [billing / PRO](https://huggingface.co/settings/billing) or the run will fail on LLM-heavy tasks. Several tasks are solved **without** the LLM (deterministic) to save quota.
+3. In the Space **Settings → Repository secrets**, add **at least one** LLM credential:
+   - **`GROQ_API_KEY`** (recommended, [free tier](https://console.groq.com)): the agent uses **Groq’s OpenAI-compatible API** for chat, Whisper-class ASR, and vision when this is set. You do **not** need Hugging Face Inference credits or a paid Hub plan for the LLM.
+   - **`OPENAI_API_KEY`** (optional): same wiring via the OpenAI SDK; used only if `GROQ_API_KEY` is **not** set.
+   - **`HF_TOKEN`**: Hugging Face token with **read** permission — used only when neither Groq nor OpenAI keys are set (Hub **Inference** / serverless). If you see **402 Payment Required**, your **Inference Provider credits** are exhausted; prefer **`GROQ_API_KEY`** instead of adding billing, or the run will fail on LLM-heavy tasks. Several tasks are solved **without** the LLM (deterministic) to save quota.
 4. Optional **Variables** (or secrets) to tune models:
    - `HF_INFERENCE_PROVIDER` — **omit by default** so the client uses **`auto`**: the first [inference provider](https://hf.co/settings/inference-providers) that supports your **chosen model** on the Hub. Do **not** set `hf-inference` unless that model lists it — many chat models (e.g. Qwen2.5-7B-Instruct) only support **together** / **featherless-ai**, and forcing `hf-inference` yields **404**. If the auto order hits a provider that returns **401** (e.g. Novita), reorder providers in HF settings or pin e.g. `HF_INFERENCE_PROVIDER=together`.
-   - `GAIA_TEXT_MODEL` — default `Qwen/Qwen2.5-7B-Instruct` (broad provider mapping via Together).
-   - `GAIA_ASR_MODEL` — default `openai/whisper-large-v3`
-   - `GAIA_VISION_MODEL` — default `meta-llama/Llama-3.2-11B-Vision-Instruct`
+   - `GAIA_GROQ_CHAT_MODEL` — default `llama-3.1-8b-instant` (Groq chat + tool calls).
+   - `GAIA_OPENAI_CHAT_MODEL` — default `gpt-4o-mini` (when using `OPENAI_API_KEY`).
+   - `GAIA_TEXT_MODEL` — default `Qwen/Qwen2.5-7B-Instruct` (HF Inference only).
+   - `GAIA_GROQ_ASR_MODEL` / `GAIA_OPENAI_ASR_MODEL` — Groq default `whisper-large-v3`; OpenAI default `whisper-1`.
+   - `GAIA_GROQ_VISION_MODEL` — default `llama-3.2-11b-vision-preview`.
+   - `GAIA_OPENAI_VISION_MODEL` — default `gpt-4o-mini`.
+   - `GAIA_ASR_MODEL` — HF-only default `openai/whisper-large-v3`
+   - `GAIA_VISION_MODEL` — HF-only default `meta-llama/Llama-3.2-11B-Vision-Instruct`
    - `GAIA_API_URL` — default `https://agents-course-unit4-scoring.hf.space`
    - `GAIA_USE_CACHE` — `1` (default) or `0` to disable `gaia_answers_cache.json`
 
@@ -40,7 +46,7 @@ Keep the Space **public** so `agent_code` (`…/tree/main`) verifies for the leaderboard
 cd gaia_unit4_space
 python -m venv .venv && source .venv/bin/activate
 pip install -r requirements.txt
-export HF_TOKEN=hf_...
+export GROQ_API_KEY=gsk_...   # or: export HF_TOKEN=hf_...
 python run_local_eval.py
 ```
 
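The credential precedence the README describes (Groq first, then OpenAI, then the Hugging Face token) can be sanity-checked in a few lines. This is a minimal sketch, not repo code: `pick_backend` is a hypothetical stand-in for the repo's `detect_llm_backend`, taking an env dict instead of reading `os.environ` so it is easy to test.

```python
# Hypothetical helper mirroring the credential precedence described above:
# a Groq key wins, then an OpenAI key, otherwise fall back to HF Inference.
def pick_backend(env: dict) -> str:
    if env.get("GROQ_API_KEY", "").strip():
        return "groq"
    if env.get("OPENAI_API_KEY", "").strip():
        return "openai"
    return "hf"

print(pick_backend({"GROQ_API_KEY": "gsk_demo"}))  # groq
print(pick_backend({}))                            # hf
```

Note that a whitespace-only key is treated as unset, matching the `.strip()` checks in the committed code.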
__pycache__/agent.cpython-312.pyc CHANGED
Binary files a/__pycache__/agent.cpython-312.pyc and b/__pycache__/agent.cpython-312.pyc differ
 
__pycache__/llm_backends.cpython-312.pyc ADDED
Binary file (3.11 kB).
 
agent.py CHANGED
@@ -1,16 +1,27 @@
-"""GAIA Unit 4 agent: tool-calling loop via Hugging Face Inference API."""
+"""GAIA Unit 4 agent: tool-calling loop via Groq, OpenAI, or Hugging Face Inference."""
 
 from __future__ import annotations
 
 import os
 from typing import Any, Optional
 
-from huggingface_hub import InferenceClient
-
 from answer_normalize import normalize_answer
 from inference_client_factory import inference_client_kwargs
+from llm_backends import (
+    chat_complete_openai,
+    detect_llm_backend,
+    groq_chat_model,
+    hf_chat_model,
+    make_openai_sdk_client,
+    openai_chat_model,
+)
 from tools.registry import TOOL_DEFINITIONS, deterministic_attempt, dispatch_tool
 
+try:
+    from huggingface_hub import InferenceClient
+except ImportError:
+    InferenceClient = None  # type: ignore
+
 SYSTEM_PROMPT = """You solve GAIA benchmark questions for the Hugging Face Agents Course.
 
 Hard rules:
@@ -36,21 +47,34 @@ class GaiaAgent:
             or os.environ.get("HF_TOKEN")
             or os.environ.get("HUGGINGFACEHUB_API_TOKEN")
         )
-        self.text_model = text_model or os.environ.get(
-            "GAIA_TEXT_MODEL", "Qwen/Qwen2.5-7B-Instruct"
-        )
+        self.backend = detect_llm_backend()
+        if self.backend == "groq":
+            self.text_model = text_model or groq_chat_model()
+            self._oa_client, _ = make_openai_sdk_client("groq")
+            self._hf_client = None
+        elif self.backend == "openai":
+            self.text_model = text_model or openai_chat_model()
+            self._oa_client, _ = make_openai_sdk_client("openai")
+            self._hf_client = None
+        else:
+            self.text_model = text_model or hf_chat_model()
+            self._oa_client = None
+            self._hf_client: Optional[InferenceClient] = None
+
         self.max_iterations = max_iterations
-        self._client: Optional[InferenceClient] = None
 
-    def _get_client(self) -> InferenceClient:
-        if self._client is None:
+    def _get_hf_client(self) -> InferenceClient:
+        if InferenceClient is None:
+            raise RuntimeError("huggingface_hub is not installed.")
+        if self._hf_client is None:
             if not self.hf_token:
                 raise RuntimeError(
-                    "HF_TOKEN or HUGGINGFACEHUB_API_TOKEN is required for GaiaAgent."
+                    "HF_TOKEN or HUGGINGFACEHUB_API_TOKEN is required when using "
+                    "Hugging Face Inference (no GROQ_API_KEY / OPENAI_API_KEY set)."
                 )
            kw = inference_client_kwargs(self.hf_token)
-            self._client = InferenceClient(**kw)
-        return self._client
+            self._hf_client = InferenceClient(**kw)
+        return self._hf_client
 
     def __call__(
         self,
@@ -62,9 +86,10 @@ class GaiaAgent:
         if det is not None:
             return normalize_answer(det)
 
-        if not self.hf_token:
+        if self.backend == "hf" and not self.hf_token:
             return normalize_answer(
-                "Error: missing HF_TOKEN; cannot run LLM tools for this question."
+                "Error: set GROQ_API_KEY (free Groq tier), OPENAI_API_KEY, or HF_TOKEN "
+                "for LLM inference."
             )
 
         user_text = _build_user_payload(question, attachment_path, task_id)
@@ -73,36 +98,48 @@ class GaiaAgent:
             {"role": "user", "content": user_text},
         ]
 
-        client = self._get_client()
         last_text = ""
 
         for _ in range(self.max_iterations):
            try:
-                completion = client.chat_completion(
-                    messages=messages,
-                    model=self.text_model,
-                    tools=TOOL_DEFINITIONS,
-                    tool_choice="auto",
-                    max_tokens=1024,
-                    temperature=0.15,
-                )
+                if self.backend in ("groq", "openai"):
+                    assert self._oa_client is not None
+                    completion = chat_complete_openai(
+                        self._oa_client,
+                        model=self.text_model,
+                        messages=messages,
+                        tools=TOOL_DEFINITIONS,
+                        max_tokens=1024,
+                        temperature=0.15,
+                    )
+                    msg = completion.choices[0].message
+                else:
+                    client = self._get_hf_client()
+                    completion = client.chat_completion(
+                        messages=messages,
+                        model=self.text_model,
+                        tools=TOOL_DEFINITIONS,
+                        tool_choice="auto",
+                        max_tokens=1024,
+                        temperature=0.15,
+                    )
+                    msg = completion.choices[0].message
            except Exception as e:
                es = str(e)
                if "402" in es or "Payment Required" in es or "depleted" in es.lower():
                    last_text = (
                        "Error: Hugging Face Inference credits exhausted (402). "
-                        "Add prepaid credits, enable PRO, or set HF_INFERENCE_PROVIDER to a provider "
-                        "with quota. See https://huggingface.co/settings/billing"
+                        "Set Space secret GROQ_API_KEY (free at https://console.groq.com) "
+                        "to use Groq instead, or add HF billing."
                    )
                else:
                    last_text = f"Inference error: {e}"
                break
 
-            choice = completion.choices[0]
-            msg = choice.message
            last_text = (msg.content or "").strip()
+            tool_calls = getattr(msg, "tool_calls", None)
 
-            if msg.tool_calls:
+            if tool_calls:
                messages.append(
                    {
                        "role": "assistant",
@@ -113,14 +150,14 @@ class GaiaAgent:
                                "type": "function",
                                "function": {
                                    "name": tc.function.name,
-                                    "arguments": tc.function.arguments,
+                                    "arguments": tc.function.arguments or "{}",
                                },
                            }
-                            for tc in msg.tool_calls
+                            for tc in tool_calls
                        ],
                    }
                )
-                for tc in msg.tool_calls:
+                for tc in tool_calls:
                    name = tc.function.name
                    args = tc.function.arguments or "{}"
                    result = dispatch_tool(name, args, hf_token=self.hf_token)
@@ -136,7 +173,8 @@ class GaiaAgent:
            if last_text:
                break
 
-            if choice.finish_reason == "length":
+            fr = getattr(completion.choices[0], "finish_reason", None)
+            if fr == "length":
                last_text = "Error: model hit max length without an answer."
                break
 
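One subtle hardening in this diff is re-serializing tool-call arguments as `tc.function.arguments or "{}"`. A minimal sketch shows why (the `_Fn` class below is a hypothetical stand-in for the SDK's function object, not repo code): some backends return `None` for an argument-less tool call, and `json.loads(None)` raises, so substituting the empty JSON object keeps the round-trip safe.

```python
import json

# Hypothetical stand-in for an SDK tool-call "function" object whose
# arguments came back as None (seen with some OpenAI-compatible backends).
class _Fn:
    def __init__(self, name, arguments):
        self.name = name
        self.arguments = arguments

fn = _Fn("web_search", None)
entry = {
    "type": "function",
    # the `or "{}"` from the diff: never store None as arguments
    "function": {"name": fn.name, "arguments": fn.arguments or "{}"},
}
args = json.loads(entry["function"]["arguments"])
print(args)  # prints {}
```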
llm_backends.py ADDED
@@ -0,0 +1,61 @@
+"""Optional OpenAI-compatible backends (Groq free tier, OpenAI) to avoid HF Inference credits."""
+
+from __future__ import annotations
+
+import os
+from typing import Any, Literal, Optional
+
+Backend = Literal["groq", "openai", "hf"]
+
+
+def detect_llm_backend() -> Backend:
+    if os.environ.get("GROQ_API_KEY", "").strip():
+        return "groq"
+    if os.environ.get("OPENAI_API_KEY", "").strip():
+        return "openai"
+    return "hf"
+
+
+def groq_chat_model() -> str:
+    return os.environ.get("GAIA_GROQ_CHAT_MODEL", "llama-3.1-8b-instant")
+
+
+def openai_chat_model() -> str:
+    return os.environ.get("GAIA_OPENAI_CHAT_MODEL", "gpt-4o-mini")
+
+
+def hf_chat_model() -> str:
+    return os.environ.get("GAIA_TEXT_MODEL", "Qwen/Qwen2.5-7B-Instruct")
+
+
+def make_openai_sdk_client(backend: Backend):
+    """Return (client, base_url_label) for Groq or direct OpenAI."""
+    from openai import OpenAI
+
+    if backend == "groq":
+        key = os.environ["GROQ_API_KEY"].strip()
+        return OpenAI(api_key=key, base_url="https://api.groq.com/openai/v1"), "groq"
+    key = os.environ["OPENAI_API_KEY"].strip()
+    base = os.environ.get("OPENAI_BASE_URL", "https://api.openai.com/v1").strip()
+    return OpenAI(api_key=key, base_url=base), "openai"
+
+
+def chat_complete_openai(
+    client: Any,
+    *,
+    model: str,
+    messages: list[dict[str, Any]],
+    tools: Optional[list[dict[str, Any]]],
+    max_tokens: int,
+    temperature: float,
+) -> Any:
+    kwargs: dict[str, Any] = {
+        "model": model,
+        "messages": messages,
+        "max_tokens": max_tokens,
+        "temperature": temperature,
+    }
+    if tools:
+        kwargs["tools"] = tools
+        kwargs["tool_choice"] = "auto"
+    return client.chat.completions.create(**kwargs)
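The kwargs assembly in `chat_complete_openai` can be exercised without a network call. The sketch below is a hypothetical standalone copy of that logic (`build_chat_kwargs` is not in the repo): `tools` / `tool_choice` are attached only when a non-empty tool list is given, which keeps plain chat requests valid for providers that reject an empty tools array.

```python
# Hypothetical standalone copy of the kwargs logic in chat_complete_openai:
# tools/tool_choice are only included when a non-empty tool list is passed.
def build_chat_kwargs(model, messages, tools=None, max_tokens=1024, temperature=0.15):
    kwargs = {
        "model": model,
        "messages": messages,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    if tools:
        kwargs["tools"] = tools
        kwargs["tool_choice"] = "auto"
    return kwargs

print("tool_choice" in build_chat_kwargs("llama-3.1-8b-instant", []))  # False
```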
requirements.txt CHANGED
@@ -8,5 +8,6 @@ lxml>=5.0.0
 duckduckgo-search>=6.0.0
 wikipedia>=1.4.0
 huggingface_hub>=0.26.0
+openai>=1.40.0
 youtube-transcript-api>=0.6.0
 Pillow>=10.0.0
run_local_eval.py CHANGED
@@ -66,7 +66,9 @@ def main() -> None:
     print(f"{len(items)} questions")
 
     token = os.environ.get("HF_TOKEN") or os.environ.get("HUGGINGFACEHUB_API_TOKEN")
-    agent = GaiaAgent(hf_token=token) if token else None
+    has_groq = bool(os.environ.get("GROQ_API_KEY", "").strip())
+    has_openai = bool(os.environ.get("OPENAI_API_KEY", "").strip())
+    agent = GaiaAgent(hf_token=token) if (token or has_groq or has_openai) else None
 
     out: list[dict] = []
     for item in items:
@@ -85,7 +87,7 @@ def main() -> None:
                 from tools.registry import deterministic_attempt
 
                 d = deterministic_attempt(str(q), local)
-                ans = d if d is not None else "NO_HF_TOKEN"
+                ans = d if d is not None else "NO_LLM_KEYS"
             finally:
                 if local and Path(local).is_file():
                     Path(local).unlink(missing_ok=True)
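The new gate in `run_local_eval.py` can be unit-tested in isolation. This is a hedged sketch: `has_any_llm_credential` is a hypothetical helper mirroring the three checks in the diff, taking an env dict so it needs no real environment variables.

```python
# Hypothetical helper mirroring run_local_eval's gate: a GaiaAgent is built
# only when at least one credential (HF token, Groq key, OpenAI key) exists.
def has_any_llm_credential(env: dict) -> bool:
    token = env.get("HF_TOKEN") or env.get("HUGGINGFACEHUB_API_TOKEN")
    has_groq = bool(env.get("GROQ_API_KEY", "").strip())
    has_openai = bool(env.get("OPENAI_API_KEY", "").strip())
    return bool(token or has_groq or has_openai)

print(has_any_llm_credential({"GROQ_API_KEY": "gsk_x"}))  # True
print(has_any_llm_credential({}))                         # False
```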
tools/__pycache__/gaia_deterministic.cpython-312.pyc CHANGED
Binary files a/tools/__pycache__/gaia_deterministic.cpython-312.pyc and b/tools/__pycache__/gaia_deterministic.cpython-312.pyc differ
 
tools/__pycache__/media_tools.cpython-312.pyc CHANGED
Binary files a/tools/__pycache__/media_tools.cpython-312.pyc and b/tools/__pycache__/media_tools.cpython-312.pyc differ
 
tools/media_tools.py CHANGED
@@ -1,22 +1,66 @@
 import base64
 import os
 from pathlib import Path
-from typing import Optional
+from typing import Any, Optional
 
 from inference_client_factory import make_inference_client
 
 
+def _groq_openai_client():
+    k = os.environ.get("GROQ_API_KEY", "").strip()
+    if not k:
+        return None
+    from openai import OpenAI
+
+    return OpenAI(api_key=k, base_url="https://api.groq.com/openai/v1")
+
+
+def _openai_platform_client():
+    k = os.environ.get("OPENAI_API_KEY", "").strip()
+    if not k:
+        return None
+    from openai import OpenAI
+
+    base = os.environ.get("OPENAI_BASE_URL", "https://api.openai.com/v1").strip()
+    return OpenAI(api_key=k, base_url=base)
+
+
 def transcribe_audio(
     file_path: str,
     *,
     hf_token: Optional[str] = None,
     model: Optional[str] = None,
 ) -> str:
+    gc = _groq_openai_client()
+    if gc:
+        mid = model or os.environ.get("GAIA_GROQ_ASR_MODEL", "whisper-large-v3")
+        try:
+            with open(file_path, "rb") as audio_f:
+                tr = gc.audio.transcriptions.create(
+                    model=mid,
+                    file=audio_f,
+                )
+            return (tr.text or "").strip()
+        except Exception as e:
+            return f"ASR error (Groq): {e}"
+
+    oc = _openai_platform_client()
+    if oc:
+        mid = model or os.environ.get("GAIA_OPENAI_ASR_MODEL", "whisper-1")
+        try:
+            with open(file_path, "rb") as audio_f:
+                tr = oc.audio.transcriptions.create(model=mid, file=audio_f)
+            return (tr.text or "").strip()
+        except Exception as e:
+            return f"ASR error (OpenAI): {e}"
+
     token = hf_token or os.environ.get("HF_TOKEN") or os.environ.get(
         "HUGGINGFACEHUB_API_TOKEN"
     )
     if not token:
-        return "Error: HF_TOKEN not set for speech recognition."
+        return (
+            "Error: set GROQ_API_KEY (free), OPENAI_API_KEY, or HF_TOKEN for speech."
+        )
     mid = model or os.environ.get("GAIA_ASR_MODEL", "openai/whisper-large-v3")
     client = make_inference_client(token)
     try:
@@ -26,6 +70,37 @@ def transcribe_audio(
         return f"ASR error: {e}"
 
 
+def _vision_chat_openai(
+    client: Any,
+    *,
+    model: str,
+    file_path: Path,
+    question: str,
+) -> str:
+    raw = file_path.read_bytes()
+    b64 = base64.b64encode(raw).decode("ascii")
+    mime = "image/png" if file_path.suffix.lower() == ".png" else "image/jpeg"
+    data_url = f"data:{mime};base64,{b64}"
+    comp = client.chat.completions.create(
+        model=model,
+        messages=[
+            {
+                "role": "user",
+                "content": [
+                    {"type": "text", "text": question},
+                    {
+                        "type": "image_url",
+                        "image_url": {"url": data_url},
+                    },
+                ],
+            }
+        ],
+        max_tokens=512,
+        temperature=0.2,
+    )
+    return (comp.choices[0].message.content or "").strip()
+
+
 def analyze_image_with_vlm(
     file_path: str,
     question: str,
@@ -33,18 +108,39 @@
     hf_token: Optional[str] = None,
     model: Optional[str] = None,
 ) -> str:
-    """Use a vision-language chat model via HF Inference (image as data URL)."""
+    path = Path(file_path)
+    if not path.is_file():
+        return f"Error: image not found: {file_path}"
+
+    gc = _groq_openai_client()
+    if gc:
+        mid = model or os.environ.get(
+            "GAIA_GROQ_VISION_MODEL",
+            "llama-3.2-11b-vision-preview",
+        )
+        try:
+            return _vision_chat_openai(gc, model=mid, file_path=path, question=question)
+        except Exception as e:
+            return f"Vision error (Groq): {e}"
+
+    oc = _openai_platform_client()
+    if oc:
+        mid = model or os.environ.get("GAIA_OPENAI_VISION_MODEL", "gpt-4o-mini")
+        try:
+            return _vision_chat_openai(oc, model=mid, file_path=path, question=question)
+        except Exception as e:
+            return f"Vision error (OpenAI): {e}"
+
     token = hf_token or os.environ.get("HF_TOKEN") or os.environ.get(
         "HUGGINGFACEHUB_API_TOKEN"
     )
     if not token:
-        return "Error: HF_TOKEN not set for vision."
+        return (
+            "Error: set GROQ_API_KEY, OPENAI_API_KEY, or HF_TOKEN for vision."
+        )
     mid = model or os.environ.get(
         "GAIA_VISION_MODEL", "meta-llama/Llama-3.2-11B-Vision-Instruct"
     )
-    path = Path(file_path)
-    if not path.is_file():
-        return f"Error: image not found: {file_path}"
     raw = path.read_bytes()
     b64 = base64.b64encode(raw).decode("ascii")
     mime = "image/png" if path.suffix.lower() == ".png" else "image/jpeg"
@@ -79,7 +175,11 @@ def visual_question_short(
     hf_token: Optional[str] = None,
     model: Optional[str] = None,
 ) -> str:
-    """Fallback VQA task (shorter answers)."""
+    if _groq_openai_client() or _openai_platform_client():
+        return analyze_image_with_vlm(
+            file_path, question, hf_token=hf_token, model=model
+        )
+
     token = hf_token or os.environ.get("HF_TOKEN") or os.environ.get(
         "HUGGINGFACEHUB_API_TOKEN"
     )
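The vision path in `tools/media_tools.py` embeds the image as a base64 data URL. The sketch below isolates that packing (`image_data_url` is a hypothetical helper, not repo code); note the diff only special-cases `.png` and labels every other suffix `image/jpeg`.

```python
import base64

# Hypothetical helper mirroring the data-URL packing in _vision_chat_openai:
# ".png" gets image/png; any other suffix is sent as image/jpeg.
def image_data_url(raw: bytes, suffix: str) -> str:
    mime = "image/png" if suffix.lower() == ".png" else "image/jpeg"
    return f"data:{mime};base64,{base64.b64encode(raw).decode('ascii')}"

print(image_data_url(b"\x89PNG", ".PNG")[:22])  # data:image/png;base64,
```

A GIF or WebP would therefore be mislabeled as JPEG; most vision endpoints sniff the bytes anyway, but converting such inputs first is safer.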