NorthernTribe-Research committed on
Commit b628dab · verified · 1 Parent(s): 2b52bbd

Deploy Conjecture Lean+AI Lab Space

Files changed (3)
  1. README.md +42 -6
  2. app.py +531 -0
  3. requirements.txt +3 -0
README.md CHANGED
@@ -1,12 +1,48 @@
  ---
- title: Math Conjecture Lean Ai Lab
- emoji: ⚡
- colorFrom: purple
- colorTo: blue
  sdk: gradio
- sdk_version: 6.9.0
  app_file: app.py
  pinned: false
  ---

- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
  ---
+ title: Conjecture Lean+AI Lab
  sdk: gradio
+ sdk_version: "6.6.0"
+ python_version: "3.10"
  app_file: app.py
  pinned: false
+ emoji: "🔬"
  ---

+ # Conjecture Lean+AI Lab
+
+ This Space is tailored for conjecture-focused math research workflows that
+ combine:
+
+ - AI-assisted reasoning and strategy generation
+ - dataset-grounded retrieval from `NorthernTribe-Research/math-conjecture-training-corpus`
+ - model execution using `NorthernTribe-Research/math-conjecture-model` (with fallback)
+ - explicit work trace emission on every run
+ - Lean-oriented verification stubs for formal follow-up
+
+ ## What this Space does
+
+ 1. Accepts a conjecture statement and analysis mode.
+ 2. Retrieves relevant examples from the dataset (validation/test splits).
+ 3. Builds a transparent prompt with evidence context.
+ 4. Runs model inference (preferred repo, then fallback).
+ 5. Displays:
+    - full work log,
+    - retrieved evidence table,
+    - exact prompt sent to the model,
+    - model answer with explicit work sections,
+    - generated Lean theorem skeleton for formal verification.
+
+ ## Runtime secrets
+
+ Set at least one token secret in Space settings:
+
+ - `HF_TOKEN` or `HUGGINGFACE_HUB_TOKEN`
+
+ Optional:
+
+ - `HF_USERNAME`
+
+ ## Notes
+
+ - The app is intentionally designed for auditability and reproducibility.
+ - Every run should expose intermediate work artifacts in the UI.
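The retrieval in step 2 above is, per `app.py`, a lightweight keyword-overlap scorer rather than an embedding search. A standalone sketch of that scoring idea (the example rows here are made up for illustration):

```python
import re
from typing import Any, Dict, Sequence


def tokenize(text: str) -> list[str]:
    # Keep lowercase alphanumeric terms of length >= 3, as in app.py.
    return re.findall(r"[a-zA-Z0-9_]{3,}", (text or "").lower())


def score_example(query_terms: Sequence[str], row: Dict[str, Any]) -> float:
    # Score = term-overlap count plus overlap density; 0.0 if nothing matches.
    text = " ".join(str(row.get(k, "")) for k in ("prompt", "target")).lower()
    terms = set(tokenize(text))
    overlap = len(set(query_terms) & terms)
    if not terms or overlap == 0:
        return 0.0
    return overlap + overlap / max(1, len(query_terms))


query = tokenize("every finite Sidon set can be extended")
relevant = {"prompt": "Sidon set extension", "target": "finite difference set"}
unrelated = {"prompt": "matrix rank", "target": "eigenvalues"}
print(score_example(query, relevant) > score_example(query, unrelated))  # → True
```

Because the score is pure term overlap, rows that merely repeat the conjecture's vocabulary rank above rows that are semantically related but use different wording.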
app.py ADDED
@@ -0,0 +1,531 @@
+ #!/usr/bin/env python3
+ """Lean+AI conjecture analysis Space app with explicit work traces."""
+
+ from __future__ import annotations
+
+ import datetime as dt
+ import heapq
+ import os
+ import re
+ import textwrap
+ from pathlib import Path
+ from typing import Any, Dict, Iterable, List, Optional, Sequence, Tuple
+
+ import gradio as gr
+ from datasets import load_dataset
+ from huggingface_hub import InferenceClient, hf_hub_download
+
+
+ DEFAULT_DATASET_REPO = "NorthernTribe-Research/math-conjecture-training-corpus"
+ DEFAULT_MODEL_REPO = "NorthernTribe-Research/math-conjecture-model"
+ DEFAULT_FALLBACK_MODEL = "deepseek-ai/deepseek-math-7b-instruct"
+ WORKSPACE = Path(__file__).resolve().parent / "workspace"
+ DATA_CACHE = WORKSPACE / "data"
+
+ DEFAULT_CONJECTURE = (
+     "Erdos-style Sidon set conjecture: every finite Sidon set can be extended "
+     "to a finite perfect difference set."
+ )
+
+
+ UI_CSS = r"""
+ :root {
+   --bg-0: #04101a;
+   --bg-1: #081826;
+   --panel: #0d2433;
+   --panel-2: #113044;
+   --border: rgba(170, 225, 250, 0.22);
+   --text: #eaf6ff;
+   --muted: #9ec0d4;
+   --cyan: #54d7ff;
+   --teal: #5df2c1;
+   --amber: #ffc56f;
+ }
+
+ body, .gradio-container {
+   color: var(--text) !important;
+   background:
+     radial-gradient(circle at 8% 0%, rgba(84, 215, 255, 0.18), rgba(84, 215, 255, 0) 34%),
+     radial-gradient(circle at 88% 0%, rgba(93, 242, 193, 0.14), rgba(93, 242, 193, 0) 30%),
+     linear-gradient(140deg, var(--bg-0) 0%, var(--bg-1) 100%) !important;
+ }
+
+ .gradio-container .main {
+   max-width: 1320px !important;
+ }
+
+ .block {
+   background: linear-gradient(180deg, rgba(15, 36, 51, 0.94), rgba(9, 27, 39, 0.95)) !important;
+   border: 1px solid var(--border) !important;
+   border-radius: 16px !important;
+ }
+
+ .hero {
+   border: 1px solid rgba(84, 215, 255, 0.34);
+   border-radius: 18px;
+   padding: 20px;
+   background:
+     radial-gradient(circle at 90% 10%, rgba(84, 215, 255, 0.15), transparent 36%),
+     radial-gradient(circle at 0% 0%, rgba(93, 242, 193, 0.14), transparent 30%),
+     linear-gradient(165deg, rgba(10, 30, 43, 0.98), rgba(7, 20, 31, 0.98));
+ }
+
+ .hero h1 {
+   margin: 0 0 8px 0;
+   letter-spacing: 0.02em;
+ }
+
+ .hero p {
+   margin: 0;
+   color: var(--muted);
+ }
+
+ .chip-row {
+   margin-top: 14px;
+   display: flex;
+   flex-wrap: wrap;
+   gap: 8px;
+ }
+
+ .chip-row span {
+   border: 1px solid rgba(84, 215, 255, 0.35);
+   border-radius: 999px;
+   padding: 4px 10px;
+   font-size: 0.7rem;
+   text-transform: uppercase;
+   letter-spacing: 0.12em;
+   color: var(--muted);
+   background: rgba(8, 24, 35, 0.95);
+ }
+
+ .status-ok {
+   color: var(--teal);
+ }
+
+ .status-warn {
+   color: var(--amber);
+ }
+ """
+
+
+ def now_iso() -> str:
+     return dt.datetime.now(dt.timezone.utc).replace(microsecond=0).isoformat()
+
+
+ def ensure_workspace() -> None:
+     DATA_CACHE.mkdir(parents=True, exist_ok=True)
+
+
+ def auth_token() -> Optional[str]:
+     token = (os.environ.get("HF_TOKEN") or os.environ.get("HUGGINGFACE_HUB_TOKEN") or "").strip()
+     return token or None
+
+
+ def tokenize(text: str) -> List[str]:
+     return re.findall(r"[a-zA-Z0-9_]{3,}", (text or "").lower())
+
+
+ def clip(text: str, max_len: int) -> str:
+     s = (text or "").strip()
+     if len(s) <= max_len:
+         return s
+     return s[: max_len - 3] + "..."
+
+
+ def score_example(query_terms: Sequence[str], row: Dict[str, Any]) -> float:
+     text = " ".join(
+         str(row.get(k, "")) for k in ("prompt", "target", "task_type", "family", "source_dataset")
+     ).lower()
+     terms = set(tokenize(text))
+     if not terms:
+         return 0.0
+     overlap = len(set(query_terms) & terms)
+     if overlap == 0:
+         return 0.0
+     density = overlap / max(1, len(query_terms))
+     return overlap + density
+
+
+ def iter_rows_from_split(parquet_path: Path, limit: int) -> Iterable[Dict[str, Any]]:
+     dataset = load_dataset("parquet", data_files={"rows": str(parquet_path)}, split="rows")
+     capped = min(limit, len(dataset))
+     for idx in range(capped):
+         yield dataset[idx]
+
+
+ def retrieve_evidence(
+     *,
+     dataset_repo_id: str,
+     query: str,
+     token: Optional[str],
+     rows_to_scan: int,
+     top_k: int,
+ ) -> Tuple[List[Dict[str, Any]], List[str]]:
+     ensure_workspace()
+     logs: List[str] = []
+     logs.append(f"[{now_iso()}] Evidence retrieval started for repo: {dataset_repo_id}")
+     query_terms = tokenize(query)
+     if not query_terms:
+         query_terms = tokenize(DEFAULT_CONJECTURE)
+
+     heap: List[Tuple[float, int, Dict[str, Any]]] = []
+     total_scanned = 0
+
+     for split in ("validation", "test"):
+         cached = hf_hub_download(
+             repo_id=dataset_repo_id,
+             repo_type="dataset",
+             filename=f"{split}.parquet",
+             token=token,
+         )
+         logs.append(f"[{now_iso()}] Downloaded {split}.parquet to cache")
+         split_quota = max(1, rows_to_scan // 2)
+         for idx, row in enumerate(iter_rows_from_split(Path(cached), split_quota)):
+             total_scanned += 1
+             score = score_example(query_terms, row)
+             if score <= 0:
+                 continue
+             packed = {
+                 "split": split,
+                 "row_index": idx,
+                 "score": round(score, 4),
+                 "family": str(row.get("family") or ""),
+                 "task_type": str(row.get("task_type") or ""),
+                 "source_dataset": str(row.get("source_dataset") or ""),
+                 "prompt": clip(str(row.get("prompt") or ""), 320),
+                 "target": clip(str(row.get("target") or ""), 420),
+             }
+             # Tie-break on global scan order: row indices repeat across splits,
+             # and ties on (score, idx) would otherwise fall through to comparing
+             # dicts, which raises TypeError inside heapq.
+             if len(heap) < top_k:
+                 heapq.heappush(heap, (score, total_scanned, packed))
+             elif score > heap[0][0]:
+                 heapq.heapreplace(heap, (score, total_scanned, packed))
+
+     best = [item[2] for item in sorted(heap, key=lambda x: (-x[0], x[1]))]
+     logs.append(f"[{now_iso()}] Scanned rows: {total_scanned}; retained evidence: {len(best)}")
+     return best, logs
+
+
+ def build_prompt(
+     *,
+     conjecture: str,
+     goal_mode: str,
+     evidence_rows: Sequence[Dict[str, Any]],
+     always_show_work: bool,
+ ) -> str:
+     evidence_lines: List[str] = []
+     for idx, row in enumerate(evidence_rows, start=1):
+         evidence_lines.append(
+             textwrap.dedent(
+                 f"""\
+                 Evidence {idx}
+                 - split: {row.get("split")}
+                 - family: {row.get("family")}
+                 - task_type: {row.get("task_type")}
+                 - source_dataset: {row.get("source_dataset")}
+                 - prompt: {row.get("prompt")}
+                 - target_snippet: {row.get("target")}
+                 """
+             ).strip()
+         )
+
+     work_policy = (
+         "Always show your full step-by-step work, assumptions, candidate constructions, "
+         "and verification plan. Do not hide intermediate reasoning."
+         if always_show_work
+         else "Provide concise reasoning."
+     )
+
+     return textwrap.dedent(
+         f"""\
+         You are NorthernTribe Research's Lean+AI conjecture assistant.
+         Goal mode: {goal_mode}
+         {work_policy}
+
+         Conjecture:
+         {conjecture}
+
+         Use the evidence below from our conjecture training corpus.
+         If the evidence is insufficient, say so explicitly and propose the next formal checks.
+
+         {chr(10).join(evidence_lines) if evidence_lines else "No evidence retrieved."}
+
+         Output format:
+         1) Claim status (supported / refuted / unknown)
+         2) Full work log
+         3) Candidate proof or counterexample strategy
+         4) Lean verification plan
+         5) Risks and uncertainty
+         """
+     ).strip()
+
+
+ def infer_with_repo(
+     *,
+     repo_id: str,
+     prompt: str,
+     token: Optional[str],
+     max_new_tokens: int,
+     temperature: float,
+ ) -> str:
+     client = InferenceClient(model=repo_id, token=token)
+     try:
+         return str(
+             client.text_generation(
+                 prompt,
+                 max_new_tokens=max_new_tokens,
+                 temperature=temperature,
+                 return_full_text=False,
+                 do_sample=temperature > 0,
+             )
+         )
+     except Exception:
+         # Try chat-style fallback if provider only exposes chat completion.
+         response = client.chat.completions.create(
+             model=repo_id,
+             messages=[
+                 {"role": "system", "content": "You are a mathematical conjecture assistant."},
+                 {"role": "user", "content": prompt},
+             ],
+             max_tokens=max_new_tokens,
+             temperature=temperature,
+         )
+         return str(response.choices[0].message.content or "")
+
+
+ def known_reference_hint(conjecture: str) -> str:
+     lower = conjecture.lower()
+     if "sidon" in lower and "perfect difference" in lower:
+         return (
+             "Reference note: arXiv:2510.19804 (v2, Jan 16, 2026) reports a counterexample "
+             "to the Sidon extension conjecture, including Lean artifacts."
+         )
+     return "No direct literature hint matched from built-in references."
+
+
+ def build_lean_stub(conjecture: str, status: str) -> str:
+     sanitized = clip(conjecture.replace("\n", " "), 220)
+     theorem_name = "conjecture_attempt"
+     if "sidon" in conjecture.lower():
+         theorem_name = "sidon_extension_attempt"
+     return textwrap.dedent(
+         f"""\
+         /- Auto-generated Lean sketch from Conjecture Lean+AI Lab -/
+         import Mathlib
+
+         namespace ConjectureLab
+
+         /-- Conjecture statement (natural language):
+         {sanitized}
+         -/
+         theorem {theorem_name} : True := by
+           -- `True` is a placeholder goal so this stub elaborates.
+           -- Status from model attempt: {status}
+           -- TODO:
+           -- 1) Encode finite set / Sidon / perfect difference definitions.
+           -- 2) Formalize candidate construction or counterexample.
+           -- 3) Prove theorem or derive contradiction.
+           trivial
+
+         end ConjectureLab
+         """
+     )
+
+
+ def analyze_conjecture(
+     conjecture: str,
+     goal_mode: str,
+     dataset_repo_id: str,
+     model_repo_id: str,
+     fallback_model_id: str,
+     rows_to_scan: int,
+     top_k: int,
+     temperature: float,
+     max_new_tokens: int,
+     always_show_work: bool,
+ ) -> Tuple[str, str, List[Dict[str, Any]], str, str, str]:
+     work_log: List[str] = []
+     work_log.append(f"[{now_iso()}] Analysis requested")
+     work_log.append(f"[{now_iso()}] Goal mode: {goal_mode}")
+     work_log.append(f"[{now_iso()}] Dataset repo: {dataset_repo_id}")
+     work_log.append(f"[{now_iso()}] Preferred model repo: {model_repo_id}")
+
+     token = auth_token()
+     if token:
+         work_log.append(f"[{now_iso()}] HF token source detected from runtime secrets")
+     else:
+         work_log.append(f"[{now_iso()}] No HF token found; proceeding with public-access paths only")
+
+     try:
+         evidence, retrieval_log = retrieve_evidence(
+             dataset_repo_id=dataset_repo_id,
+             query=conjecture,
+             token=token,
+             rows_to_scan=max(100, int(rows_to_scan)),
+             top_k=max(1, int(top_k)),
+         )
+         work_log.extend(retrieval_log)
+     except Exception as exc:
+         evidence = []
+         work_log.append(f"[{now_iso()}] Evidence retrieval error: {type(exc).__name__}: {exc}")
+
+     prompt = build_prompt(
+         conjecture=conjecture or DEFAULT_CONJECTURE,
+         goal_mode=goal_mode,
+         evidence_rows=evidence,
+         always_show_work=always_show_work,
+     )
+
+     model_used = model_repo_id
+     answer = ""
+     try:
+         answer = infer_with_repo(
+             repo_id=model_repo_id,
+             prompt=prompt,
+             token=token,
+             max_new_tokens=int(max_new_tokens),
+             temperature=float(temperature),
+         )
+         work_log.append(f"[{now_iso()}] Inference succeeded using preferred model repo")
+     except Exception as exc:
+         work_log.append(
+             f"[{now_iso()}] Preferred model inference failed ({type(exc).__name__}). "
+             f"Falling back to {fallback_model_id}."
+         )
+         model_used = fallback_model_id
+         try:
+             answer = infer_with_repo(
+                 repo_id=fallback_model_id,
+                 prompt=prompt,
+                 token=token,
+                 max_new_tokens=int(max_new_tokens),
+                 temperature=float(temperature),
+             )
+             work_log.append(f"[{now_iso()}] Fallback model inference succeeded")
+         except Exception as inner_exc:
+             work_log.append(f"[{now_iso()}] Fallback inference failed: {type(inner_exc).__name__}: {inner_exc}")
+             answer = (
+                 "Model inference unavailable. Based on known literature context, "
+                 "this conjecture may require a counterexample-driven disproof workflow. "
+                 "Use retrieved evidence plus Lean formalization to validate next steps."
+             )
+
+     reference_hint = known_reference_hint(conjecture)
+     work_log.append(f"[{now_iso()}] {reference_hint}")
+
+     status = "unknown"
+     lowered = answer.lower()
+     if "refuted" in lowered or "counterexample" in lowered:
+         status = "refuted_candidate"
+     elif "proved" in lowered or "supported" in lowered:
+         status = "supported_candidate"
+
+     lean_stub = build_lean_stub(conjecture, status=status)
+     summary = (
+         f"### Conjecture Analysis Status\n\n"
+         f"- `status`: **{status}**\n"
+         f"- `model_used`: `{model_used}`\n"
+         f"- `evidence_rows`: `{len(evidence)}`\n"
+         f"- `timestamp`: `{now_iso()}`\n"
+     )
+
+     if always_show_work:
+         answer = (
+             f"### Model Attempt\n\n{answer.strip()}\n\n"
+             f"### Literature Hint\n\n{reference_hint}\n\n"
+             f"### Explicit Work Policy\n\n"
+             f"This run was configured to show full intermediate work in the UI."
+         )
+
+     return (
+         summary,
+         answer.strip(),
+         evidence,
+         "\n".join(work_log),
+         prompt,
+         lean_stub,
+     )
+
+
+ with gr.Blocks(title="Conjecture Lean+AI Lab", css=UI_CSS) as demo:
+     gr.HTML(
+         """
+         <section class="hero">
+           <h1>Conjecture Lean+AI Lab</h1>
+           <p>
+             A dedicated Space for conjecture analysis using NorthernTribe datasets and models.
+             Every run emits an explicit work log, retrieved evidence, prompt trace, and Lean stub.
+           </p>
+           <div class="chip-row">
+             <span>Lean-Oriented</span>
+             <span>Dataset-Grounded</span>
+             <span>Model-Assisted</span>
+             <span>Full Work Trace</span>
+           </div>
+         </section>
+         """
+     )
+
+     with gr.Row():
+         dataset_repo_id = gr.Textbox(label="Dataset Repo", value=DEFAULT_DATASET_REPO)
+         model_repo_id = gr.Textbox(label="Preferred Model Repo", value=DEFAULT_MODEL_REPO)
+         fallback_model_id = gr.Textbox(label="Fallback Base Model", value=DEFAULT_FALLBACK_MODEL)
+
+     conjecture_text = gr.Textbox(
+         label="Conjecture",
+         lines=5,
+         value=DEFAULT_CONJECTURE,
+     )
+
+     with gr.Row():
+         goal_mode = gr.Radio(
+             label="Goal Mode",
+             choices=[
+                 "Try to disprove (counterexample search)",
+                 "Try to support (proof sketch search)",
+                 "Balanced analysis (both directions)",
+             ],
+             value="Balanced analysis (both directions)",
+         )
+         always_show_work = gr.Checkbox(label="Always Show Full Work", value=True)
+
+     with gr.Row():
+         rows_to_scan = gr.Slider(label="Rows to Scan", minimum=200, maximum=6000, step=100, value=1800)
+         top_k = gr.Slider(label="Top Evidence Rows", minimum=3, maximum=25, step=1, value=8)
+         max_new_tokens = gr.Slider(label="Max New Tokens", minimum=200, maximum=2000, step=50, value=900)
+         temperature = gr.Slider(label="Temperature", minimum=0.0, maximum=1.2, step=0.05, value=0.2)
+
+     run_btn = gr.Button("Run Conjecture Analysis", variant="primary")
+
+     status_md = gr.Markdown()
+     answer_md = gr.Markdown()
+     evidence_df = gr.Dataframe(
+         label="Retrieved Evidence (from dataset)",
+         wrap=True,
+         interactive=False,
+     )
+     work_log_box = gr.Textbox(label="Work Log (always visible)", lines=18, max_lines=30, interactive=False)
+     prompt_box = gr.Textbox(label="Prompt Sent to Model", lines=14, max_lines=22, interactive=False)
+     lean_box = gr.Code(label="Lean Verification Stub", language="lean")
+
+     run_btn.click(
+         fn=analyze_conjecture,
+         inputs=[
+             conjecture_text,
+             goal_mode,
+             dataset_repo_id,
+             model_repo_id,
+             fallback_model_id,
+             rows_to_scan,
+             top_k,
+             temperature,
+             max_new_tokens,
+             always_show_work,
+         ],
+         outputs=[status_md, answer_md, evidence_df, work_log_box, prompt_box, lean_box],
+     )
+
+
+ demo.queue(default_concurrency_limit=2)
+
+
+ if __name__ == "__main__":
+     demo.launch()
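The Lean skeleton emitted by `build_lean_stub` is only a scaffold for formal follow-up. A minimal standalone shape that actually elaborates in Lean 4 (assuming Mathlib is available in the project) uses `True` as a placeholder goal until the conjecture statement is formalized:

```lean
/- Minimal compilable shape of the generated sketch. `True` stands in for
   the yet-to-be-formalized conjecture statement. -/
import Mathlib

namespace ConjectureLab

/-- Conjecture statement (natural language) goes in this docstring. -/
theorem sidon_extension_attempt : True := by
  -- TODO: encode Sidon / perfect difference definitions and replace `True`.
  trivial

end ConjectureLab
```

Once real definitions replace `True`, `trivial` will no longer close the goal and the proof obligation becomes genuine.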
requirements.txt ADDED
@@ -0,0 +1,3 @@
+ gradio>=6.6.0,<7
+ huggingface_hub>=1.5.0
+ datasets>=2.21.0,<3