Enhance the LLM client: support multiple inference providers and error handling; refine the call logic for generating Lean code
README.md CHANGED

@@ -9,3 +9,50 @@ short_description: A pipeline for evaluating LLMs' ability to complete Lean 4 proofs
---

Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference

## Feature Overview

This project provides a Streamlit page: enter a natural-language math problem → call Hugging Face Inference to generate a `PLAN:` section plus a single `lean4` code block → compile and verify the code via `lake env <repl>` inside the bundled Mathlib project (a run passes only if there are no errors and no `sorry`).
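For a concrete picture of what "passes", here is a hypothetical model response in the required shape (this example is illustrative, not produced by the app; the lemma name `add_pos` is a Mathlib fact we assume is available):

PLAN: Both summands are positive, so their sum is positive via `add_pos`.

```lean4
import Mathlib

-- Compiles with no errors and no `sorry`, so it would count as a pass.
theorem sum_pos (a b : ℝ) (ha : 0 < a) (hb : 0 < b) : 0 < a + b :=
  add_pos ha hb
```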
## Hugging Face Spaces Configuration

### Required Secrets

- `HF_TOKEN`: required for calling Hugging Face Inference. If it is unset, the UI reports an error and stops.

### Optional Variables

- `HF_MODEL_ID`: the model ID to call.
  - Default: `deepseek-ai/DeepSeek-Prover-V2-7B`
  - Note: not every model supports `text-generation`/`chat-completion` on HF Serverless Inference. If you hit a provider-related error, first try a model known to be available, or use an Endpoint.
- `HF_BASE_URL`: a custom Inference Endpoint base URL (set this when you use a dedicated Endpoint).
  - Purpose: sidesteps Serverless Inference's restrictions on which models/tasks are served; usually more stable.
- `HF_PROVIDER`: explicitly select an inference provider.
  - Example: `hf-inference`
  - Note: the available providers vary across `huggingface_hub` versions and runtime environments; when unsure, leave it unset.
- `LEAN_REPL_BIN`: path to the repl executable.
  - Container default: `/app/repl/.lake/build/bin/repl`
  - Usually does not need to be changed.

A quick way to sanity-check these settings before saving them to the Space is sketched below.
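A minimal local smoke test (not part of the app; it mirrors the variable handling described above and assumes a recent `huggingface_hub` release where the `provider` argument exists):

```python
import os

from huggingface_hub import InferenceClient

token = os.environ["HF_TOKEN"]
base_url = os.environ.get("HF_BASE_URL")
provider = os.environ.get("HF_PROVIDER")
model_id = os.environ.get("HF_MODEL_ID", "deepseek-ai/DeepSeek-Prover-V2-7B")

# `model` and `base_url` are mutually exclusive on InferenceClient,
# so prefer the dedicated Endpoint when one is configured.
if base_url:
    client = InferenceClient(base_url=base_url, token=token)
elif provider:
    client = InferenceClient(model=model_id, token=token, provider=provider)
else:
    client = InferenceClient(model=model_id, token=token)

# If this call raises (for example a provider lookup failure),
# the Space would fail in the same way.
print(client.text_generation("theorem t : 1 = 1 :=", max_new_tokens=16))
```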
## Troubleshooting

### 1) `StopIteration` / no provider found

This usually means that the current model + task combination (e.g. `text-generation`) has no available provider in your inference environment. You can:

1. Change `HF_MODEL_ID` to a model that is already available on Hugging Face Inference;
2. Use a dedicated Inference Endpoint: set `HF_BASE_URL` and make sure `HF_TOKEN` has access to it;
3. If needed, set `HF_PROVIDER` (e.g. `hf-inference`) to select a provider explicitly.

### 2) Model output cannot be parsed (no standard code block)

The parsing logic requires the model to:

- first output a plan section starting with `PLAN:`;
- then output one, and only one, fenced code block, tagged `lean` or `lean4`.

If the UI reports "failed to generate a standard Lean code block", inspect the raw model output at the bottom of the page and adjust the prompt or switch models accordingly. The extraction itself is a single regex, sketched below.
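For reference, a minimal sketch of how the extraction behaves, using the same regex that appears in `app.py` (the sample text is made up):

```python
import re

# A made-up model reply in the expected shape.
sample = "PLAN: trivial by rfl.\n```lean4\ntheorem t : 1 = 1 := rfl\n```"

# The same pattern app.py uses to pull out the fenced lean/lean4 block;
# re.DOTALL lets the block body span multiple lines.
blocks = re.findall(r"```(?:lean4?|lean)\s*\r?\n(.*?)\r?\n```", sample, flags=re.DOTALL)
print(blocks)  # ['theorem t : 1 = 1 := rfl']
```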
app.py CHANGED

@@ -6,6 +6,108 @@ import os
 from huggingface_hub import InferenceClient
 
 
+def _make_inference_client(model_id: str) -> InferenceClient:
+    """Create an InferenceClient with broad compatibility across huggingface_hub versions.
+
+    Supports:
+    - HF Serverless Inference via token
+    - Custom Inference Endpoint via base_url
+    - Optional provider selection via env
+
+    Env vars:
+    - HF_TOKEN (required in this app)
+    - HF_BASE_URL (optional): custom endpoint base URL
+    - HF_PROVIDER (optional): e.g. "hf-inference"
+    """
+
+    token = os.environ.get("HF_TOKEN")
+    base_url = os.environ.get("HF_BASE_URL")
+    provider = os.environ.get("HF_PROVIDER")
+
+    base_kwargs: dict[str, object] = {}
+    if base_url:
+        base_kwargs["base_url"] = base_url
+    if provider:
+        base_kwargs["provider"] = provider
+
+    # Try a few constructor signatures to tolerate different hub versions.
+    candidates = [
+        {"model": model_id, "token": token, **base_kwargs},
+        {"model": model_id, "api_key": token, **base_kwargs},
+        {"token": token, **base_kwargs},
+        {"api_key": token, **base_kwargs},
+        {**base_kwargs},
+    ]
+
+    last_err: Exception | None = None
+    for kwargs in candidates:
+        filtered = {k: v for k, v in kwargs.items() if v is not None}
+        try:
+            return InferenceClient(**filtered)
+        except (TypeError, ValueError) as e:  # unknown kwarg, or model + base_url passed together
+            last_err = e
+            continue
+
+    # Should be unreachable, but keep a safe fallback.
+    if last_err is not None:
+        raise last_err
+    return InferenceClient()
+
+
+def _call_llm(client: InferenceClient, prompt: str, model_id: str) -> str:
+    """Call the model with robust fallbacks.
+
+    Primary: text_generation
+    Fallback: chat completion (OpenAI-style)
+    """
+
+    # 1) Try text-generation first.
+    try:
+        return client.text_generation(
+            prompt,
+            model=model_id,
+            max_new_tokens=4096,
+            temperature=0.2,
+            top_p=0.95,
+        )
+    except TypeError:
+        # Older hub versions may not accept `model=` here if model is set in client.
+        return client.text_generation(
+            prompt,
+            max_new_tokens=4096,
+            temperature=0.2,
+            top_p=0.95,
+        )
+    except StopIteration as e:
+        # huggingface_hub currently may raise StopIteration when it cannot find
+        # a provider for the requested (task, model).
+        first_err: Exception = e
+    except Exception as e:
+        first_err = e
+
+    # 2) Fall back to chat completion if available.
+    try:
+        chat = client.chat.completions.create(
+            model=model_id,
+            messages=[{"role": "user", "content": prompt}],
+            max_tokens=4096,
+            temperature=0.2,
+            top_p=0.95,
+        )
+        content = chat.choices[0].message.content
+        if isinstance(content, str) and content.strip():
+            return content
+        raise RuntimeError("chat-completion returned empty content")
+    except Exception:
+        if isinstance(first_err, StopIteration):
+            raise RuntimeError(
+                "Hugging Face Inference could not find an available inference provider for this model."
+            ) from first_err
+        raise
+
+    # Note: other exceptions are handled by the caller for better UI reporting.
+
+
 def _extract_lean_code_blocks(text: str) -> list[str]:
     matches = re.findall(
         r"```(?:lean4?|lean)\s*\r?\n(.*?)\r?\n```",

@@ -37,11 +139,11 @@ st.title("Online evaluation of Lean 4 proving ability")
 st.markdown("Enter a natural-language math problem; an LLM generates a proof and the Lean 4 compiler verifies it in real time.")
 
 # Configure the LLM client (recommended: add the HF_TOKEN secret in the HF Space settings)
-client = InferenceClient(api_key=os.environ.get("HF_TOKEN"))
-
 # Default model (can be overridden via an environment variable)
 MODEL_ID = os.environ.get("HF_MODEL_ID", "deepseek-ai/DeepSeek-Prover-V2-7B")
 
+client = _make_inference_client(MODEL_ID)
+
 # User input
 nl_problem = st.text_area(
     "Enter a math problem (natural language):", "Prove: for any real number x, if x > 0, then x + 1/x >= 2."

@@ -91,13 +193,21 @@ Requirements:
 3) The final Lean code must compile with Mathlib. Do NOT use `sorry`.
 """.strip()
 
-gen_text = client.text_generation(
-    prompt,
-    model=MODEL_ID,
-    max_new_tokens=4096,
-    temperature=0.2,
-    top_p=0.95,
-)
+try:
+    gen_text = _call_llm(client, prompt, MODEL_ID)
+except Exception as e:
+    # Provide actionable guidance instead of crashing.
+    st.error("The call to Hugging Face Inference failed.")
+    st.write("Error message:", str(e))
+
+    st.info(
+        "Possible cause: the model does not support `text-generation`/`chat-completion` on Hugging Face Serverless Inference.\n"
+        "Possible fixes:\n"
+        "1) In the Space Variables, set `HF_MODEL_ID` to a model that is available on Inference;\n"
+        "2) Use a Hugging Face Inference Endpoint: point `HF_BASE_URL` at your Endpoint and make sure `HF_TOKEN` has access;\n"
+        "3) To select a provider explicitly, set `HF_PROVIDER` (e.g. `hf-inference`)."
+    )
+    st.stop()
 
 plan = _extract_plan(gen_text)
 code_blocks = _extract_lean_code_blocks(gen_text)