\" f\"🔄 Reliability Testing — k={k} trials\" f\"
\" f\"pass@{k} = P(≥1 of {k} trials passes) — optimistic bound | \" f\"pass^{k} = P(ALL {k} trials pass) — reliability estimate
\" ) table = ( \"
\" \"\" f\"| Evaluator | \" f\"Avg | \" f\"pass@{k} | \" f\"pass^{k} | \" f\"Verdict | \" \"
\" ) for r in rows: color, icon = verdict_style.get(r[\"Verdict\"], (\"#888\", \"?\")) table += ( f\"\" f\"| {r['Evaluator']} | \" f\"{r['Avg Score']} | \" f\"{r[f'pass@{k}']} | \" f\"{r[f'pass^{k}']} | \" f\"{icon} {r['Verdict']} | \" \"
\" ) table += \"
\" summary = ( f\"
\" f\"Overall — pass@{k}: {rel_report.overall_pass_at_k:.0%}\" f\" | pass^{k}: {rel_report.overall_pass_hat_k:.0%}\" f\" | avg score: {rel_report.avg_score:.0%}
\" ) return header + table + summary def run_evaluation( trace_json: str, use_session: bool, use_trace: bool, use_span: bool, sel_session: list, sel_trace: list, sel_span: list, threshold: float, k_trials: int, eval_mode_radio: str, hf_token: str, exp_response: str, exp_trajectory: str, assertions_text: str, progress=gr.Progress(track_tqdm=True), ): # ── 1. Parse input ──────────────────────────────────────────────────── progress(0.05, desc=\"Parsing trace…\") try: session = parse_trace(trace_json) except Exception as e: err = ( f\"
❌ Parse error: {e}
\" ) return err, None, None, None, err # ── 2. Build ground truth ───────────────────────────────────────────── gt = None if exp_response.strip() or exp_trajectory.strip() or assertions_text.strip(): traj = ( [t.strip() for t in exp_trajectory.split(\",\") if t.strip()] if exp_trajectory.strip() else None ) asrt = ( [a.strip() for a in assertions_text.splitlines() if a.strip()] if assertions_text.strip() else None ) gt = GroundTruth( expected_response=exp_response.strip() or None, expected_trajectory=traj, assertions=asrt, ) # ── 3. Resolve selected evaluators ─────────────────────────────────── sess_evals = sel_session if use_session else [] trace_evals = sel_trace if use_trace else [] span_evals = sel_span if use_span else [] if not sess_evals and not trace_evals and not span_evals: warn = \"
⚠️ No evaluators selected — please enable at least one level.
\" return warn, None, None, None, warn # ── 4. Build LLM judge (if requested) ──────────────────────────────── use_llm = eval_mode_radio == \"LLM Judge (QwQ-32B)\" mode = EvalMode.LLM if use_llm else EvalMode.HEURISTIC judge = None if use_llm: token = hf_token.strip() or None judge = LLMJudge(api_key=token) if not judge.available: warn = \"
⚠️ LLM mode selected but no HF Token provided — falling back to heuristic.
\" mode = EvalMode.HEURISTIC # ── 5. Run evaluation (single or k trials) ───────────────────────────── progress(0.15, desc=\"Running evalua",
+ "readme_body": "# 🧪 AI Agent Evaluation Pipeline\n\n> Evaluate AI agents at **Session**, **Trace**, and **Span** levels — inspired by [Amazon Bedrock AgentCore Evaluations](https://docs.aws.amazon.com/bedrock-agentcore/latest/devguide/evaluations.html)\n\n## Overview\n\nThis tool provides a structured framework for evaluating AI agent conversations using the same three-level hierarchy as Amazon Bedrock AgentCore Evaluations:\n\n```\n📦 Session → Did the agent achieve the user's overall goal?\n └── 🔄 Trace → Per-turn quality (helpfulness, coherence, relevance...)\n └── 🔧 Span → Per tool-call accuracy\n```\n\n## Features\n\n- **14 built-in evaluators** (1 session + 11 trace + 2 span)\n- **Heuristic mode** — works offline, no API key required\n- **3 demo traces** (Simple Q&A, Tool Calling, Multi-turn)\n- **Ground truth support** — `expected_response`, `expected_trajectory`, `assertions`\n- **Visual results** — radar chart, bar chart, heatmap, score cards\n\n## Evaluators\n\n### 📦 Session Level (1)\n\n| Evaluator | Description |\n| ----------------- | --------------------------------------------------- |\n| Goal Success Rate | Did the agent fully achieve the user's stated goal? |\n\n### 🔄 Trace Level (11)\n\n| Evaluator | Description |\n| ----------------------- | ----------------------------------------------------------- |\n| Helpfulness | Does the response help the user progress toward their goal? |\n| Correctness | Is the response factually correct? |\n| Coherence | Is the reasoning logically consistent and well-structured? |\n| Conciseness | Is the response appropriately concise? |\n| Faithfulness | Is the response consistent with conversation history? |\n| Harmfulness | Does the response contain harmful content? |\n| Instruction Following | Does the agent follow its system prompt? |\n| Response Relevance | Does the response address what was asked? |\n| Context Relevance | Was the retrieved context relevant? 
(RAG) |\n| Refusal Appropriateness | Did the agent correctly handle refusals? |\n| Stereotyping / Bias | Is there demographic bias in the response? |\n\n### 🔧 Span Level (2)\n\n| Evaluator | Description |\n| ----------------------- | -------------------------------------- |\n| Tool Selection Accuracy | Did the agent choose the right tool? |\n| Tool Parameter Accuracy | Did the agent pass correct parameters? |\n\n## JSON Trace Format\n\n```json\n{\n \"session_id\": \"my_session\",\n \"user_goal\": \"The user's overall goal for this conversation\",\n \"system_prompt\": \"(optional) System instructions given to the agent\",\n \"traces\": [\n {\n \"trace_id\": \"t1\",\n \"user_input\": \"User's message\",\n \"agent_response\": \"Agent's reply\",\n \"retrieved_context\": \"(optional) RAG context\",\n \"spans\": [\n {\n \"span_id\": \"s1\",\n \"span_type\": \"TOOL_CALL\",\n \"tool_name\": \"my_tool\",\n \"tool_input\": { \"param\": \"value\" },\n \"tool_output\": \"Tool result\",\n \"duration_ms\": 250\n }\n ]\n }\n ]\n}\n```\n\n## Ground Truth Support\n\nOptional reference inputs for more precise evaluation:\n\n- **`expected_response`** — What the final response should look like (enables Correctness scoring)\n- **`expected_trajectory`** — Expected tool call sequence (enables TrajectoryMatch scoring)\n- **`assertions`** — Natural language assertions about the session (enables GoalSuccessRate scoring)\n\n## Running Locally\n\n```bash\ngit clone https://github.com/your-org/ai-agent-eval-pipeline\ncd ai-agent-eval-pipeline\npip install -r requirements.txt\n\n# Gradio UI\npython app.py # http://localhost:7860\n\n# REST API\npython api.py # http://localhost:8000\n# or\nuvicorn api:app --reload --port 8000\n```\n\n## Integration — Zero Changes to Your Agent\n\n### Option 1 — Python Wrapper\n\n```python\nfrom src.wrapper import SessionTracer\n\nwith SessionTracer(\n goal=\"Interview a Python candidate\",\n system_prompt=\"You are a technical interviewer...\",\n) as tracer:\n 
for user_msg in conversation:\n # Your agent code — completely unchanged\n response = my_agent.invoke(user_msg)\n\n # Optional: capture tool calls made during this turn\n span = tracer.new_span()\n span.log_span(\"search_kb\", {\"query\": user_msg}, kb_result)\n\n tracer.log_trace(user_msg, response, span)\n\n report = tracer.evaluate()\n print(f\"Overall: {report.overall_score:.0%}\")\n tracer.save(\"traces/session_001.json\")\n```\n\n### Option 2 — REST API\n\n```bash\n# Start the server\npython api.py # → http://localhost:8000\n\n# Evaluate a session\ncurl -X POST http://localhost:8000/evaluate/quick \\\n -H \"Content-Type: application/json\" \\\n -d '{\n \"trace\": {\n \"session_id\": \"interview_001\",\n \"user_goal\": \"Assess Python skills\",\n \"traces\": [\n {\n \"trace_id\": \"t1\",\n \"user_input\": \"What is a decorator?\",\n \"agent_response\": \"A decorator is a function that wraps another function...\",\n \"spans\": []\n }\n ]\n }\n }'\n```\n\nAPI docs auto-generated at `http://localhost:8000/docs`.\n\n## Architecture\n\n```\napp.py # Gradio UI entry point\napi.py # FastAPI REST server\nsrc/\n├── models.py # Session / Trace / Span / EvalScore data classes\n├── parser.py # JSON trace parser\n├── evaluators.py # All 14 evaluators (heuristic + LLM-ready)\n├── runner.py # Evaluation orchestrator\n├── visualizer.py # Plotly charts\n└── wrapper.py # SessionTracer — captures agent conversations\ndemos/\n├── simple_qa.json # Demo: Simple Q&A\n├── tool_calling.json # Demo: Tool calling\n└── multi_turn.json # Demo: Multi-turn with tools\n```\n\n## Roadmap\n\n### ✅ MVP Complete\n\n- [x] **Gradio UI** — 14 evaluators, Session / Trace / Span levels, 3 demo traces\n- [x] **Agent Wrapper** (`src/wrapper.py`) — `SessionTracer` + `trace_agent` decorator\n- [x] **REST API** (`api.py`) — `POST /evaluate`, `POST /evaluate/quick`, `GET /evaluators`\n- [x] **LLM-as-Judge** (`src/llm_judge.py`) — `Qwen/Qwen3.6-27B` via HF Inference API\n- [x] **pass@k / pass^k** 
(`src/reliability.py`) — multi-trial reliability metrics\n- [x] **Golden Dataset Generator** — Nemotron-3-Nano-30B, 8 tech interview domains\n- [x] **Deployed** — `build-small-hackathon/AI-agent-Evaluation-pipeline`\n\n### 📋 Future (post-MVP)\n\n- [ ] Export results as JSON / CSV\n- [ ] Custom evaluator builder (user-defined prompt templates)\n- [ ] Dataset management for regression testing\n- [ ] Online monitoring mode\n\n## Inspiration\n\nThis project is inspired by the architecture and evaluator design of [Amazon Bedrock AgentCore Evaluations](https://aws.amazon.com/blogs/machine-learning/build-reliable-ai-agents-with-amazon-bedrock-agentcore-evaluations/), re-implemented as an open-source Gradio application.\n\n## License\n\nMIT",
+ "app_file_source": "#!/usr/bin/env python3\n\"\"\"\nAI Agent Evaluation Pipeline — Gradio MVP\n==========================================\nEvaluate AI agents at 3 hierarchical levels, inspired by\nAmazon Bedrock AgentCore Evaluations.\n\n 📦 Session — Did the agent achieve the user's goal?\n 🔄 Trace — Per-turn quality (11 evaluators)\n 🔧 Span — Per tool-call accuracy (2 evaluators)\n\nRun locally : python app.py\nHuggingFace : app_file = app.py (Gradio SDK)\n\"\"\"\n\nimport json\nimport os\nimport sys\nfrom pathlib import Path\n\n# Ensure src/ is importable whether run from repo root or HF Spaces\n_ROOT = Path(__file__).parent\nsys.path.insert(0, str(_ROOT))\n\nimport gradio as gr\n\n# HF ZeroGPU Spaces require at least one @spaces.GPU-decorated function\n# to be detected at module load. The actual evaluation and dataset\n# generation work in this app uses the cloud InferenceClient and runs\n# without local GPU compute; the placeholder below exists only to\n# satisfy the runtime's static check. `spaces` is pre-installed on\n# ZeroGPU hardware; we guard the import so the app still loads if it\n# is missing (e.g. 
local CPU dev).\ntry:\n import spaces as _spaces\nexcept ImportError:\n class _spaces_stub:\n @staticmethod\n def GPU(fn, duration: int = 60):\n return fn\n _spaces = _spaces_stub()\n\n\n@_spaces.GPU\ndef _zero_gpu_healthcheck() -> dict:\n \"\"\"Placeholder GPU function detected by the ZeroGPU runtime.\"\"\"\n try:\n import torch\n return {\"cuda_available\": bool(torch.cuda.is_available())}\n except ImportError:\n return {\"cuda_available\": False, \"note\": \"torch not installed\"}\n\n\nfrom src.evaluators import (\n ALL_EVALUATORS,\n DEFAULT_TRACE_EVALS,\n SESSION_EVALUATORS,\n SPAN_EVALUATORS,\n TRACE_EVALUATORS,\n)\nfrom src.llm_judge import LLMJudge\nfrom src.models import EvalLevel, EvalMode, GroundTruth\nfrom src.parser import format_trace_tree, parse_trace\nfrom src.reliability import compute_reliability\nfrom src.runner import EvalRunner\nfrom src.visualizer import create_bar_chart, create_radar_chart, create_trace_timeline\n\n# ─── Load demo traces ───────────────────────────────────────────────────────\n\n_DEMOS = _ROOT / \"demos\"\n\n\ndef _load_demo(name: str) -> str:\n p = _DEMOS / f\"{name}.json\"\n return p.read_text(encoding=\"utf-8\") if p.exists() else \"{}\"\n\n\nDEMO_SIMPLE_QA = _load_demo(\"simple_qa\")\nDEMO_TOOL_CALLING = _load_demo(\"tool_calling\")\nDEMO_MULTI_TURN = _load_demo(\"multi_turn\")\n\n# ─── UI helpers ─────────────────────────────────────────────────────────────\n\n_LEVEL_COLOR = {\n EvalLevel.SESSION: \"#9B59B6\",\n EvalLevel.TRACE: \"#3498DB\",\n EvalLevel.SPAN: \"#27AE60\",\n}\n\n_LEVEL_ICON = {\n EvalLevel.SESSION: \"📦\",\n EvalLevel.TRACE: \"🔄\",\n EvalLevel.SPAN: \"🔧\",\n}\n\n\ndef _bar_color(score: float) -> str:\n if score >= 0.8:\n return \"#4CAF50\"\n elif score >= 0.6:\n return \"#FF9800\"\n return \"#F44336\"\n\n\ndef _bg_color(score: float) -> str:\n if score >= 0.8:\n return \"rgba(76,175,80,0.12)\"\n elif score >= 0.6:\n return \"rgba(255,152,0,0.12)\"\n return \"rgba(244,67,54,0.12)\"\n\n\ndef 
render_score_card(score) -> str:\n color = _bar_color(score.score)\n bg = _bg_color(score.score)\n badge_color = _LEVEL_COLOR.get(score.level, \"#888\")\n level_icon = _LEVEL_ICON.get(score.level, \"\")\n\n return f\"\"\"\n
\n
\n
\n {level_icon} {score.level.value}\n {score.evaluator_display}\n
\n
{score.score_pct}%\n
\n
\n
\n \n {score.target_label} · {score.mode.value} mode\n
\n {score.explanation}\n
\n
\"\"\"\n\n\ndef render_overall_banner(report) -> str:\n s = report.overall_score\n color = _bar_color(s)\n passed = sum(1 for x in report.scores if x.passed)\n total = len(report.scores)\n status = \"PASS ✅\" if s >= 0.6 else \"NEEDS REVIEW ⚠️\"\n\n # Level breakdown\n sess_avg = (\n sum(x.score for x in report.session_scores) / len(report.session_scores)\n if report.session_scores\n else None\n )\n trace_avg = (\n sum(x.score for x in report.trace_scores) / len(report.trace_scores)\n if report.trace_scores\n else None\n )\n span_avg = (\n sum(x.score for x in report.span_scores) / len(report.span_scores)\n if report.span_scores\n else None\n )\n\n def level_chip(label, avg, icon, level):\n if avg is None:\n return \"\"\n c = _bar_color(avg)\n bc = _LEVEL_COLOR.get(level, \"#888\")\n return (\n f'
'\n f'
{icon} {label}
'\n f'
{avg:.0%}
'\n f\"
\"\n )\n\n chips = \" \".join(\n [\n level_chip(\"SESSION\", sess_avg, \"📦\", EvalLevel.SESSION),\n level_chip(\"TRACE\", trace_avg, \"🔄\", EvalLevel.TRACE),\n level_chip(\"SPAN\", span_avg, \"🔧\", EvalLevel.SPAN),\n ]\n )\n\n return f\"\"\"\n
\n
\n
\n
OVERALL SCORE
\n
{s:.0%}
\n
\n {passed}/{total} evaluators passed · \n {len(report.session.traces)} turn(s) · \n {report.elapsed_seconds:.2f}s · \n {report.eval_mode.value} mode\n
\n
\n
\n
\n
\n
\"\"\"\n\n\ndef parse_and_preview(trace_json: str) -> str:\n if not trace_json or not trace_json.strip():\n return \"*Paste or load a JSON trace above to see a preview.*\"\n try:\n session = parse_trace(trace_json)\n return format_trace_tree(session)\n except Exception as e:\n return f\"❌ **Parse error:** `{e}`\\n\\nCheck that your JSON is valid and contains `user_goal` + `traces`.\"\n\n\n# ─── Benchmark functions ──────────────────────────────────────────────────────\n\n\ndef load_records_from_url(url: str) -> list:\n \"\"\"Load JSONL records from a HF dataset repo URL (data/golden_dataset.jsonl).\"\"\"\n from urllib.parse import urlparse\n\n from huggingface_hub import hf_hub_download\n\n parsed = urlparse(url)\n if \"huggingface.co\" not in parsed.netloc or \"/datasets/\" not in parsed.path:\n raise ValueError(f\"Not a HF dataset URL: {url}\")\n repo_id = parsed.path.split(\"/datasets/\")[1].strip(\"/\").split(\"/\")[0]\n path = hf_hub_download(\n repo_id=repo_id,\n filename=\"data/golden_dataset.jsonl\",\n repo_type=\"dataset\",\n )\n with open(path, encoding=\"utf-8\") as f:\n return [json.loads(line) for line in f if line.strip()]\n\n\ndef parse_pasted_jsonl(text: str) -> list:\n \"\"\"Parse pasted JSONL content into list of records.\"\"\"\n return [json.loads(line) for line in text.splitlines() if line.strip()]\n\n\ndef call_openai_compat(\n url: str, scenario: dict, api_key: str, model: str, timeout: int = 60\n) -> str:\n \"\"\"POST to an OpenAI-compatible /v1/chat/completions endpoint.\"\"\"\n import requests\n\n headers = {\"Content-Type\": \"application/json\"}\n if api_key.strip():\n headers[\"Authorization\"] = f\"Bearer {api_key.strip()}\"\n body = {\n \"messages\": [\n {\"role\": \"system\", \"content\": scenario.get(\"system_prompt\", \"\")},\n {\"role\": \"user\", \"content\": scenario[\"initial_message\"]},\n ],\n }\n if model.strip():\n body[\"model\"] = model.strip()\n r = requests.post(url, json=body, headers=headers, timeout=timeout)\n 
r.raise_for_status()\n data = r.json()\n return data[\"choices\"][0][\"message\"][\"content\"]\n\n\ndef build_trace_json(rec: dict, agent_response: str) -> str:\n \"\"\"Build a parseable trace JSON from a dataset record + agent response.\"\"\"\n scenario = rec.get(\"scenario\", {})\n return json.dumps(\n {\n \"session_id\": rec.get(\"id\", \"unknown\"),\n \"user_goal\": scenario.get(\"user_goal\", \"\"),\n \"system_prompt\": scenario.get(\"system_prompt\"),\n \"traces\": [\n {\n \"trace_id\": \"t1\",\n \"user_input\": scenario.get(\"initial_message\", \"\"),\n \"agent_response\": agent_response,\n }\n ],\n },\n ensure_ascii=False,\n )\n\n\ndef run_benchmark(\n dataset_url: str,\n pasted_jsonl: str,\n agent_url: str,\n api_key: str,\n model_name: str,\n use_session: bool,\n use_trace: bool,\n use_span: bool,\n sel_session: list,\n sel_trace: list,\n sel_span: list,\n threshold: float,\n progress=gr.Progress(track_tqdm=True),\n):\n \"\"\"Run benchmark: load dataset, call agent for each record, eval, aggregate.\"\"\"\n\n def render_status(phase: str, done: int, total: int, current_id: str = \"\") -> str:\n pct = int(done / total * 100) if total else 0\n current = f\" · ⏳ {current_id}\" if current_id else \"\"\n return (\n f\"
\"\n f\"
\"\n f\"{phase} · {done}/{total} ({pct}%){current}
\"\n f\"
\"\n )\n\n def render_table(rows: list) -> str:\n if not rows:\n return \"\"\n body = \"\"\n for r in rows:\n color = \"#4CAF50\" if r[\"passed\"] else \"#F44336\"\n icon = \"✅\" if r[\"passed\"] else \"⚠️\"\n score = r[\"score\"]\n score_str = f\"{score:.0%}\" if isinstance(score, float) else \"—\"\n err_cell = (\n f\"
{r['error']}
\"\n if r.get(\"error\")\n else \"\"\n )\n body += (\n \"
\"\n f\"| {r['id']} | \"\n f\"{r['domain']} | \"\n f\"{r['difficulty']} | \"\n f\"{score_str} {icon} | \"\n f\"{err_cell} | \"\n \"
\"\n )\n return (\n \"
\"\n \"\"\n \"| ID | \"\n \"Domain | \"\n \"Difficulty | \"\n \"Score | \"\n \"Error | \"\n \"
\" + body + \"
\"\n )\n\n def render_aggregate(rows: list, total: int) -> str:\n scored = [r for r in rows if isinstance(r[\"score\"], float)]\n if not scored:\n return \"\"\n ok = sum(1 for r in scored if r[\"passed\"])\n avg = sum(r[\"score\"] for r in scored) / len(scored)\n by_domain: dict = {}\n for r in scored:\n d = r[\"domain\"] or \"—\"\n by_domain.setdefault(d, []).append(r[\"score\"])\n domain_chips = \" \".join(\n f\"
\"\n f\"{d}: {sum(s)/len(s):.0%}\"\n for d, s in sorted(by_domain.items())\n )\n return (\n f\"
\"\n f\"
📊 Aggregate
\"\n f\"
\"\n f\"Passed: {ok}/{len(scored)} \"\n f\" · Avg: {avg:.0%}\"\n f\" · Threshold: {threshold:.0%}
\"\n f\"
{domain_chips}
\"\n )\n\n def panel(*htmls: str) -> str:\n return \"\".join(h for h in htmls if h)\n\n progress(0.02, desc=\"Loading dataset…\")\n yield panel(render_status(\"Loading dataset\", 0, 1)), \"📂 Loading dataset…\"\n try:\n if pasted_jsonl.strip():\n records = parse_pasted_jsonl(pasted_jsonl)\n source = \"pasted JSONL\"\n else:\n records = load_records_from_url(dataset_url.strip())\n source = dataset_url.strip()\n except Exception as e:\n err = f\"❌ Failed to load dataset: {e}\"\n yield (\n panel(f\"
{err}
\"),\n f\"ERROR: {e}\\nPaste JSONL directly if the URL is empty or unreachable.\",\n )\n return\n\n if not records:\n yield (\n panel(\"
⚠️ Dataset loaded but empty.
\"),\n \"No records found in source.\",\n )\n return\n\n total = len(records)\n log_lines = [f\"✅ Loaded {total} records from {source}\"]\n yield (\n panel(\n render_status(\"Loaded\", total, total),\n f\"
📂 {total} records loaded from {source}
\",\n ),\n \"\\n\".join(log_lines),\n )\n\n if not agent_url.strip():\n yield (\n panel(\"
❌ Agent URL is empty.
\"),\n \"ERROR: Provide an OpenAI-compatible chat completions URL.\",\n )\n return\n\n sess_evals = sel_session if use_session else []\n trace_evals = sel_trace if use_trace else []\n span_evals = sel_span if use_span else []\n runner = EvalRunner(\n selected_session_evals=sess_evals,\n selected_trace_evals=trace_evals,\n selected_span_evals=span_evals,\n threshold=threshold,\n mode=EvalMode.HEURISTIC,\n )\n\n results = []\n for i, rec in enumerate(records):\n rid = rec.get(\"id\", f\"rec_{i}\")\n domain = rec.get(\"domain\", \"\")\n difficulty = rec.get(\"difficulty\", \"\")\n progress(0.1 + 0.85 * i / total, desc=f\"Running {rid}…\")\n log_lines.append(f\"⏳ {rid} ({domain}/{difficulty})…\")\n yield (\n panel(render_status(\"Running\", i, total, rid), render_table(results)),\n \"\\n\".join(log_lines),\n )\n\n try:\n scenario = rec.get(\"scenario\") or {}\n agent_out = call_openai_compat(\n agent_url.strip(),\n scenario,\n api_key or \"\",\n model_name or \"\",\n timeout=60,\n )\n trace_json = build_trace_json(rec, agent_out)\n session = parse_trace(trace_json)\n gt_data = rec.get(\"ground_truth\") or {}\n gt = GroundTruth(\n expected_response=gt_data.get(\"expected_response\"),\n expected_trajectory=gt_data.get(\"expected_trajectory\"),\n assertions=gt_data.get(\"assertions\"),\n )\n report = runner.run(session, gt)\n score = report.overall_score\n results.append(\n {\n \"id\": rid,\n \"domain\": domain,\n \"difficulty\": difficulty,\n \"score\": score,\n \"passed\": score >= threshold,\n \"error\": None,\n }\n )\n log_lines[-1] = f\"✅ {rid} — {score:.0%}\"\n except Exception as e:\n results.append(\n {\n \"id\": rid,\n \"domain\": domain,\n \"difficulty\": difficulty,\n \"score\": None,\n \"passed\": False,\n \"error\": f\"{type(e).__name__}: {str(e)[:80]}\",\n }\n )\n log_lines[-1] = f\"✗ {rid} — {type(e).__name__}: {str(e)[:60]}\"\n\n yield (\n panel(render_status(\"Running\", i + 1, total), render_table(results)),\n \"\\n\".join(log_lines),\n )\n\n 
progress(1.0, desc=\"Done!\")\n yield (\n panel(\n render_status(\"Done\", total, total),\n render_table(results),\n render_aggregate(results, total),\n ),\n \"\\n\".join(log_lines),\n )\n\n\n# ─── Main evaluation function ────────────────────────────────────────────────\n\n\ndef render_reliability(rel_report, k: int) -> str:\n \"\"\"Render pass@k / pass^k as an HTML table.\"\"\"\n if not rel_report or not rel_report.evaluator_results:\n return \"\"\n rows = rel_report.summary_table()\n verdict_style = {\n \"reliable\": (\"#4CAF50\", \"✅\"),\n \"unstable\": (\"#FF9800\", \"⚠️\"),\n \"unreliable\": (\"#F44336\", \"❌\"),\n }\n header = (\n f\"
\"\n f\"🔄 Reliability Testing — k={k} trials
\"\n f\"
\"\n f\"pass@{k} = P(≥1 of {k} trials passes) — optimistic bound | \"\n f\"pass^{k} = P(ALL {k} trials pass) — reliability estimate
\"\n )\n table = (\n \"
\"\n \"\"\n f\"| Evaluator | \"\n f\"Avg | \"\n f\"pass@{k} | \"\n f\"pass^{k} | \"\n f\"Verdict | \"\n \"
\"\n )\n for r in rows:\n color, icon = verdict_style.get(r[\"Verdict\"], (\"#888\", \"?\"))\n table += (\n f\"\"\n f\"| {r['Evaluator']} | \"\n f\"{r['Avg Score']} | \"\n f\"{r[f'pass@{k}']} | \"\n f\"{r[f'pass^{k}']} | \"\n f\"{icon} {r['Verdict']} | \"\n \"
\"\n )\n table += \"
\"\n\n summary = (\n f\"
\"\n f\"Overall — pass@{k}: {rel_report.overall_pass_at_k:.0%}\"\n f\" | pass^{k}: {rel_report.overall_pass_hat_k:.0%}\"\n f\" | avg score: {rel_report.avg_score:.0%}
\"\n )\n return header + table + summary\n\n\ndef run_evaluation(\n trace_json: str,\n use_session: bool,\n use_trace: bool,\n use_span: bool,\n sel_session: list,\n sel_trace: list,\n sel_span: list,\n threshold: float,\n k_trials: int,\n eval_mode_radio: str,\n hf_token: str,\n exp_response: str,\n exp_trajectory: str,\n assertions_text: str,\n progress=gr.Progress(track_tqdm=True),\n):\n # ── 1. Parse input ────────────────────────────────────────────────────\n progress(0.05, desc=\"Parsing trace…\")\n try:\n session = parse_trace(trace_json)\n except Exception as e:\n err = (\n f\"
❌ Parse error: {e}
\"\n )\n return err, None, None, None, err\n\n # ── 2. Build ground truth ─────────────────────────────────────────────\n gt = None\n if exp_response.strip() or exp_trajectory.strip() or assertions_text.strip():\n traj = (\n [t.strip() for t in exp_trajectory.split(\",\") if t.strip()]\n if exp_trajectory.strip()\n else None\n )\n asrt = (\n [a.strip() for a in assertions_text.splitlines() if a.strip()]\n if assertions_text.strip()\n else None\n )\n gt = GroundTruth(\n expected_response=exp_response.strip() or None,\n expected_trajectory=traj,\n assertions=asrt,\n )\n\n # ── 3. Resolve selected evaluators ───────────────────────────────────\n sess_evals = sel_session if use_session else []\n trace_evals = sel_trace if use_trace else []\n span_evals = sel_span if use_span else []\n\n if not sess_evals and not trace_evals and not span_evals:\n warn = \"
⚠️ No evaluators selected — please enable at least one level.
\"\n return warn, None, None, None, warn\n\n # ── 4. Build LLM judge (if requested) ────────────────────────────────\n use_llm = eval_mode_radio == \"LLM Judge (QwQ-32B)\"\n mode = EvalMode.LLM if use_llm else EvalMode.HEURISTIC\n judge = None\n if use_llm:\n token = hf_token.strip() or None\n judge = LLMJudge(api_key=token)\n if not judge.available:\n warn = \"
⚠️ LLM mode selected but no HF Token provided — falling back to heuristic.
\"\n mode = EvalMode.HEURISTIC\n\n # ── 5. Run evaluation (single or k trials) ─────────────────────────────\n progress(0.15, desc=\"Running evalua"
},
{
"id": "build-small-hackathon/AI-Puppet-Theater",
@@ -85,15 +91,17 @@
],
"models": [],
"datasets": [],
- "likes": 1,
+ "likes": 2,
"sdk": "gradio",
"license": "",
"created_at": "2026-06-05T17:19:57+00:00",
- "last_modified": "2026-06-07T10:57:39+00:00",
+ "last_modified": "2026-06-07T14:35:03+00:00",
"host": "https://build-small-hackathon-ai-puppet-theater.hf.space",
"url": "https://huggingface.co/spaces/build-small-hackathon/AI-Puppet-Theater",
"app_file": "app.py",
- "app_file_embedding_text": "render_stage session render_notes create_show premise reset_show advance_one_beat advance_full_act throw_audience_prop prop_name summon_audience_actor actor_name request_audience_finale AI Puppet Theater Enter a premise and create a show. No show yet. The transcript will appear here. director_lines.extend join premise.strip create_show_from_premise run_one_beat throw_prop summon_actor request_finale gr.Blocks title gr.State gr.Markdown gr.HTML value label gr.Textbox lines interactive create_button.click inputs outputs run_one_button.click run_full_button.click throw_prop_button.click summon_actor_button.click request_finale_button.click reset_button.click __main__ app.launch css actor_cards.append Setting: Premise: Beat of Transcript: No puppet lines yet. The first beat will be added in the next milestone. enumerate start Director Log: # AI Puppet Theater Create a tiny improv stage from a premise. This public shell is ready for puppet casting, short scenes, audience interruptions, and behind-the-scenes traces in later milestones. gr.Row placeholder gr.Button variant gr.Dropdown choices allow_custom_value none active Now speaking Latest: Audience: Props on stage: escape transcript_lines.append Trace Events: No premise yet. Add a premise to raise the curtain. Create a show before running a beat. sleep Create a show before throwing a prop. Create a show before summoning an actor. Create a show before requesting a finale. AI Puppet Theater Create Show Run One Beat Run Full Act Reset Throw Prop Summon Actor Request Finale Stage Transcript
Goal: Style: Tools: Holding: - Create a show before running the full act. Premise A moon detective interrogates a suspicious toaster... primary rubber duck Prop Professor Button , . : egg flowers tomato tiny crown scroll nothing"
+ "app_file_embedding_text": "from html import escape import os from time import sleep import gradio as gr from puppet_theater import ( DEFAULT_OPENBMB_MODEL_ID, TheaterSession, create_show_from_premise, get_backend_status, request_finale, run_one_beat, summon_actor, throw_prop, warm_up_openbmb, ) EMPTY_STAGE = \"\"\"
AI Puppet Theater
Enter a premise and raise the curtain.
\"\"\" EMPTY_TRANSCRIPT = \"No show yet. The transcript will appear here.\" EMPTY_DIRECTOR_LOG = \"No director notes yet.\" EMPTY_TRACE = \"No trace events yet.\" EMPTY_BACKEND = ( \"Active backend: deterministic\\n\" \"OpenBMB model id: openbmb/MiniCPM5-1B\\n\" \"Model status: unloaded\\n\" \"Fallback: deterministic safety path enabled\" ) BACKEND_CHOICES = [\"deterministic\", \"openbmb\"] OPENBMB_MODEL_ID = os.getenv(\"OPENBMB_MODEL_ID\", DEFAULT_OPENBMB_MODEL_ID) DEFAULT_MAX_NEW_TOKENS = 80 DEFAULT_TEMPERATURE = 0.8 PLAYBACK_DELAY_SECONDS = 0.75 PROP_EMOJI = { \"rubber duck\": \"🐤\", \"duck\": \"🐤\", \"egg\": \"🥚\", \"flowers\": \"💐\", \"flower\": \"💐\", \"tomato\": \"🍅\", \"crown\": \"👑\", \"tiny crown\": \"👑\", \"scroll\": \"📜\", \"banana\": \"🍌\", \"mirror\": \"🪞\", } CUSTOM_CSS = \"\"\" body, .gradio-container { background: radial-gradient(circle at 50% 0%, rgba(127, 29, 29, 0.18), transparent 28rem), linear-gradient(180deg, #0b1020 0%, #070914 100%) !important; color: #f8efe4 !important; } .gradio-container { max-width: 1180px !important; padding-top: 1rem !important; font-family: Inter, ui-sans-serif, system-ui, -apple-system, BlinkMacSystemFont, \"Segoe UI\", sans-serif; } .gradio-container .prose, .gradio-container label, .gradio-container span, .gradio-container p { color: #f8efe4; } .gradio-container textarea, .gradio-container input { background: rgba(10, 12, 23, 0.82) !important; border-color: rgba(246, 196, 83, 0.24) !important; color: #f8efe4 !important; } .gradio-container textarea::placeholder, .gradio-container input::placeholder { color: #9f8c7a !important; } .gradio-container footer { color: rgba(203, 183, 161, 0.62) !important; } .gradio-container .block, .gradio-container .form, .gradio-container .panel, .gradio-container .tabs, .gradio-container .tabitem { background: rgba(34, 17, 31, 0.56) !important; border-color: rgba(246, 196, 83, 0.18) !important; } .gradio-container label, .gradio-container .block-title, .gradio-container .label-wrap 
{ color: #f8efe4 !important; } .gradio-container .block-info, .gradio-container .label-wrap span, .gradio-container label > span { background: rgba(34, 17, 31, 0.88) !important; border: 1px solid rgba(246, 196, 83, 0.28) !important; border-radius: 6px !important; color: #ffd166 !important; font-weight: 700 !important; } .gradio-container .wrap, .gradio-container .styler, .gradio-container .form, .gradio-container .form > *, .gradio-container .block > div { background-color: transparent !important; } .gradio-container select, .gradio-container [role=\"listbox\"], .gradio-container [role=\"combobox\"] { background: rgba(10, 12, 23, 0.82) !important; border-color: rgba(246, 196, 83, 0.24) !important; color: #f8efe4 !important; } .app-title h1 { color: #f8efe4; font-family: Georgia, \"Times New Roman\", serif; font-size: 2.15rem; letter-spacing: 0; margin-bottom: 0; text-align: center; } .app-title p { color: #cbb7a1; font-size: 0.95rem; margin: 0.15rem 0 0.8rem; text-align: center; } .gradio-container h3, .gradio-container h3 span, .gradio-container .prose h3, .gradio-container .prose h3 span { color: #f8efe4 !important; } .premise-panel { background: rgba(42, 20, 38, 0.72); border-color: rgba(246, 196, 83, 0.3); box-shadow: 0 16px 32px rgba(0, 0, 0, 0.2); padding: 0.55rem 0.65rem 0.65rem; } .premise-panel .block, .premise-panel .wrap, .premise-panel .styler, .premise-panel .form, .premise-panel .block > div { background: rgba(42, 20, 38, 0.78) !important; } .control-panel { background: rgb ... 
dience-action, .prop-pile { font-size: 0.78rem; max-width: 39rem; padding: 0.22rem 0.5rem; } .prop-token { margin: 0.08rem; padding: 0.12rem 0.4rem; } .beat-counter { font-size: 0.84rem; margin-top: 0.34rem; } .stage-floorboards { height: 40px; } .control-panel { margin-top: 0 !important; padding: 0.42rem; } .control-panel h3 { margin-bottom: 0.2rem; } .gradio-container .row { gap: 0.55rem !important; } .stage-output + .row, .stage-output + div, .control-panel + .control-panel { margin-top: 0.45rem !important; } .transcript-section, .gradio-container .accordion { margin-top: 0.55rem !important; } @media (max-width: 760px) { .puppet-stage { min-height: 430px; } .stage-backdrop { padding: 0.52rem 2.15rem; } .actor-row { grid-template-columns: repeat(2, minmax(0, 1fr)); } .speech-line { font-size: 0.8rem; } } \"\"\" def render_stage(session: TheaterSession | None) -> str: if session is None: return EMPTY_STAGE actor_cards = [] latest_beat = session.transcript[-1] if session.transcript else None latest_speaker = latest_beat.speaker if latest_beat else None for actor in session.actors: active_class = \" active\" if actor.name == latest_speaker else \"\" active_label = '
Now speaking
' if actor.name == latest_speaker else \"\" role_line = actor.goal.split(\".\", maxsplit=1)[0] held_prop = actor.held_prop or \"nothing\" held_emoji = PROP_EMOJI.get(held_prop.lower(), \"🎁\") if actor.held_prop else \"\" actor_cards.append( f\"\"\"
{escape(actor.avatar)}
{escape(actor.name)}
{active_label}
{escape(role_line)}
Holding: {escape((held_emoji + \" \") if held_emoji else \"\")}{escape(held_prop)}
\"\"\" ) latest_line = \"\" if latest_beat is not None: latest_line = f\"\"\"
{escape(latest_beat.speaker)}
{escape(latest_beat.line)}
\"\"\" audience_action = \"\" if session.latest_audience_action is not None: audience_action = f\"\"\"
Audience: {escape(session.latest_audience_action)}
\"\"\" prop_pile = \"\" if session.props: prop_tokens = \"\".join( f'
{escape(PROP_EMOJI.get(prop.lower(), \"🎁\"))} {escape(prop)}' for prop in session.props ) prop_pile = f\"\"\"
Props on stage: {prop_tokens}
\"\"\" return f\"\"\"
{escape(session.show_title)}
Setting: {escape(session.setting)}
Premise: {escape(session.premise)}
{latest_line}
{''.join(actor_cards)}
{audience_action} {prop_pile}
Beat {session.beat_index} of {session.max_beats}
\"\"\" def render_transcript(session: TheaterSession | None) -> str: if session is None: return EMPTY_TRANSCRIPT transcript_lines = [ \"Transcript:\", \"No puppet lines yet. The first beat will be added in the next milestone.\", ] if session.transcript: transcript_lines = [\"Transcript:\"] for index, beat in enumerate(session.transcript, start=1): transcript_lines.append(f\"{index}. {beat.speaker}: {beat.line}\") return \"\\n\".join(transcript_lines) def render_director_log(session: TheaterSession | None) -> str: if session is None: return EMPTY_DIRECTOR_LOG return \"\\n\".join(f\"- {entry}\" for entry in session.director_log) def render_trace(session: TheaterSession | None) -> str: if session is None: return EMPTY_TRACE return \"\\n\".join(f\"- {entry}\" for entry in session.trace_events) def normalize_backend_name(backend_name: str | None) -> str: return backend_name if backend_name in BACKEND_CHOICES else \"determinist",
+ "readme_body": "AI Puppet Theater is a public Gradio Space for building short interactive puppet shows from a user premise.",
+ "app_file_source": "from html import escape\nimport os\nfrom time import sleep\n\nimport gradio as gr\n\nfrom puppet_theater import (\n DEFAULT_OPENBMB_MODEL_ID,\n TheaterSession,\n create_show_from_premise,\n get_backend_status,\n request_finale,\n run_one_beat,\n summon_actor,\n throw_prop,\n warm_up_openbmb,\n)\n\n\nEMPTY_STAGE = \"\"\"\n
\n
\n
\n
AI Puppet Theater
\n
Enter a premise and raise the curtain.
\n
\n
\n
\n\"\"\"\n\nEMPTY_TRANSCRIPT = \"No show yet. The transcript will appear here.\"\nEMPTY_DIRECTOR_LOG = \"No director notes yet.\"\nEMPTY_TRACE = \"No trace events yet.\"\nEMPTY_BACKEND = (\n \"Active backend: deterministic\\n\"\n \"OpenBMB model id: openbmb/MiniCPM5-1B\\n\"\n \"Model status: unloaded\\n\"\n \"Fallback: deterministic safety path enabled\"\n)\nBACKEND_CHOICES = [\"deterministic\", \"openbmb\"]\nOPENBMB_MODEL_ID = os.getenv(\"OPENBMB_MODEL_ID\", DEFAULT_OPENBMB_MODEL_ID)\nDEFAULT_MAX_NEW_TOKENS = 80\nDEFAULT_TEMPERATURE = 0.8\nPLAYBACK_DELAY_SECONDS = 0.75\nPROP_EMOJI = {\n \"rubber duck\": \"🐤\",\n \"duck\": \"🐤\",\n \"egg\": \"🥚\",\n \"flowers\": \"💐\",\n \"flower\": \"💐\",\n \"tomato\": \"🍅\",\n \"crown\": \"👑\",\n \"tiny crown\": \"👑\",\n \"scroll\": \"📜\",\n \"banana\": \"🍌\",\n \"mirror\": \"🪞\",\n}\n\nCUSTOM_CSS = \"\"\"\nbody,\n.gradio-container {\n background:\n radial-gradient(circle at 50% 0%, rgba(127, 29, 29, 0.18), transparent 28rem),\n linear-gradient(180deg, #0b1020 0%, #070914 100%) !important;\n color: #f8efe4 !important;\n}\n.gradio-container {\n max-width: 1180px !important;\n padding-top: 1rem !important;\n font-family: Inter, ui-sans-serif, system-ui, -apple-system, BlinkMacSystemFont, \"Segoe UI\", sans-serif;\n}\n.gradio-container .prose,\n.gradio-container label,\n.gradio-container span,\n.gradio-container p {\n color: #f8efe4;\n}\n.gradio-container textarea,\n.gradio-container input {\n background: rgba(10, 12, 23, 0.82) !important;\n border-color: rgba(246, 196, 83, 0.24) !important;\n color: #f8efe4 !important;\n}\n.gradio-container textarea::placeholder,\n.gradio-container input::placeholder {\n color: #9f8c7a !important;\n}\n.gradio-container footer {\n color: rgba(203, 183, 161, 0.62) !important;\n}\n.gradio-container .block,\n.gradio-container .form,\n.gradio-container .panel,\n.gradio-container .tabs,\n.gradio-container .tabitem {\n background: rgba(34, 17, 31, 0.56) !important;\n border-color: rgba(246, 196, 83, 0.18) 
!important;\n}\n.gradio-container label,\n.gradio-container .block-title,\n.gradio-container .label-wrap {\n color: #f8efe4 !important;\n}\n.gradio-container .block-info,\n.gradio-container .label-wrap span,\n.gradio-container label > span {\n background: rgba(34, 17, 31, 0.88) !important;\n border: 1px solid rgba(246, 196, 83, 0.28) !important;\n border-radius: 6px !important;\n color: #ffd166 !important;\n font-weight: 700 !important;\n}\n.gradio-container .wrap,\n.gradio-container .styler,\n.gradio-container .form,\n.gradio-container .form > *,\n.gradio-container .block > div {\n background-color: transparent !important;\n}\n.gradio-container select,\n.gradio-container [role=\"listbox\"],\n.gradio-container [role=\"combobox\"] {\n background: rgba(10, 12, 23, 0.82) !important;\n border-color: rgba(246, 196, 83, 0.24) !important;\n color: #f8efe4 !important;\n}\n.app-title h1 {\n color: #f8efe4;\n font-family: Georgia, \"Times New Roman\", serif;\n font-size: 2.15rem;\n letter-spacing: 0;\n margin-bottom: 0;\n text-align: center;\n}\n.app-title p {\n color: #cbb7a1;\n font-size: 0.95rem;\n margin: 0.15rem 0 0.8rem;\n text-align: center;\n}\n.gradio-container h3,\n.gradio-container h3 span,\n.gradio-container .prose h3,\n.gradio-container .prose h3 span {\n color: #f8efe4 !important;\n}\n.premise-panel {\n background: rgba(42, 20, 38, 0.72);\n border-color: rgba(246, 196, 83, 0.3);\n box-shadow: 0 16px 32px rgba(0, 0, 0, 0.2);\n padding: 0.55rem 0.65rem 0.65rem;\n}\n.premise-panel .block,\n.premise-panel .wrap,\n.premise-panel .styler,\n.premise-panel .form,\n.premise-panel .block > div {\n background: rgba(42, 20, 38, 0.78) !important;\n}\n.control-panel {\n background: rgba(34, 17, 31, 0.76);\n border: 1px solid rgba(246, 196, 83, 0.22);\n border-radius: 8px;\n box-shadow: 0 14px 34px rgba(0, 0, 0, 0.22);\n padding: 0.55rem;\n}\n.control-panel .block,\n.control-panel .wrap,\n.control-panel .styler,\n.control-panel .form,\n.control-panel .block > div {\n 
background: rgba(34, 17, 31, 0.78) !important;\n}\n.control-panel .row,\n.premise-panel .row {\n background: transparent !important;\n}\n.control-panel h3 {\n color: #f8efe4;\n margin: 0 0 0.35rem;\n font-size: 1rem;\n}\n.control-panel .prose,\n.control-panel .prose h3,\n.control-panel h3 * {\n color: #f8efe4 !important;\n}\n.puppet-stage {\n min-height: 430px;\n border: 5px solid #3b0a16;\n border-radius: 14px;\n background:\n linear-gradient(90deg, rgba(59, 10, 22, 0.98) 0 10%, transparent 10% 90%, rgba(59, 10, 22, 0.98) 90% 100%),\n linear-gradient(180deg, rgba(42, 20, 38, 0.96), rgba(13, 6, 14, 0.98));\n color: #f8efe4;\n display: flex;\n flex-direction: column;\n align-items: stretch;\n justify-content: stretch;\n position: relative;\n overflow: hidden;\n box-shadow:\n 0 24px 48px rgba(0, 0, 0, 0.38),\n inset 0 0 42px rgba(0, 0, 0, 0.58);\n}\n.puppet-stage::before,\n.puppet-stage::after {\n content: \"\";\n position: absolute;\n top: 0;\n bottom: 0;\n width: 13%;\n background:\n repeating-linear-gradient(90deg, rgba(255, 255, 255, 0.04) 0 14px, transparent 14px 28px),\n linear-gradient(180deg, #8b1e3f 0%, #7f1d1d 54%, #3b0a16 100%);\n box-shadow: inset -16px 0 28px rgba(0, 0, 0, 0.22);\n z-index: 2;\n}\n.puppet-stage::before {\n left: 0;\n}\n.puppet-stage::after {\n right: 0;\n transform: scaleX(-1);\n}\n.stage-valance {\n height: 48px;\n background:\n repeating-linear-gradient(90deg, rgba(255, 255, 255, 0.06) 0 22px, transparent 22px 44px),\n linear-gradient(180deg, #8b1e3f 0%, #7f1d1d 100%);\n border-bottom: 4px solid #f6c453;\n box-shadow: 0 10px 20px rgba(0, 0, 0, 0.34);\n position: relative;\n z-index: 3;\n}\n.stage-backdrop {\n background:\n radial-gradient(circle at 50% 8%, rgba(255, 224, 150, 0.28), transparent 19rem),\n radial-gradient(circle at 24% 58%, rgba(255, 224, 150, 0.12), transparent 14rem),\n linear-gradient(180deg, #2a1426 0%, #22111f 62%, #130911 100%);\n flex: 1;\n padding: 0.72rem 7.2rem 0.8rem;\n position: relative;\n z-index: 
1;\n}\n.stage-backdrop::after {\n background: linear-gradient(180deg, transparent 0%, rgba(124, 63, 23, 0.46) 100%);\n bottom: 0;\n content: \"\";\n height: 32%;\n left: 0;\n position: absolute;\n right: 0;\n}\n.stage-marquee {\n color: #fff7ed;\n font-family: Georgia, \"Times New Roman\", serif;\n font-size: 1.6rem;\n font-weight: 700;\n letter-spacing: 0;\n text-align: center;\n text-shadow: 0 4px 18px rgba(0, 0, 0, 0.72);\n position: relative;\n z-index: 2;\n overflow-wrap: anywhere;\n}\n.stage-copy {\n max-width: 54rem;\n color: #cbb7a1;\n font-size: 0.84rem;\n line-height: 1.35;\n margin: 0.25rem auto 0;\n text-align: center;\n position: relative;\n z-index: 2;\n}\n.stage-copy strong {\n color: #f8efe4;\n}\n.empty-stage-copy {\n color: #cbb7a1;\n font-size: 1rem;\n margin-top: 5.8rem;\n text-align: center;\n position: relative;\n z-index: 2;\n}\n.stage-floorboards {\n height: 58px;\n background:\n repeating-linear-gradient(90deg, rgba(255, 255, 255, 0.08) 0 2px, transparent 2px 72px),\n linear-gradient(180deg, #8a4b22 0%, #7c3f17 100%);\n border-top: 2px solid rgba(246, 196, 83, 0.28);\n position: relative;\n z-index: 3;\n}\n.speech-bubble {\n animation: bubble-in 0.24s ease-out;\n background: rgba(18, 10, 18, 0.82);\n border: 1px solid rgba(246, 196, 83, 0.5);\n border-radius: 16px;\n box-shadow: 0 18px 30px rgba(0, 0, 0, 0.34);\n color: #f8efe4;\n margin: 0.55rem auto 0;\n max-width: 46rem;\n padding: 0.72rem 0.95rem;\n position: relative;\n text-align: center;\n z-index: 4;\n}\n.speech-bubble::after {\n border-left: 10px solid transparent;\n border-right: 10px solid transparent;\n border-top: 12px solid rgba(246, 196, 83, 0.5);\n bottom: -12px;\n content: \"\";\n left: 50%;\n position: absolute;\n transform: translateX(-50%);\n}\n.speech-speaker {\n color: #ffd166;\n font-size: 0.78rem;\n font-weight: 800;\n letter-spacing: 0.08em;\n margin-bottom: 0.18rem;\n text-transform: uppercase;\n}\n.speech-line {\n color: #f8efe4;\n font-size: 0.96rem;\n 
line-height: 1.35;\n}\n.actor-row {\n display: grid;\n grid-template-columns: repeat(auto-fit, minmax(132px, 1fr));\n gap: 0.55rem;\n margin-top: 0.72rem;\n position: relative;\n z-index: 3;\n}\n.actor-card {\n background: rgba(70, 38, 36, 0.72);\n border: 1px solid rgba(246, 196, 83, 0.45);\n border-radius: 16px 16px 10px 10px;\n box-shadow: 0 14px 28px rgba(0, 0, 0, 0.28);\n min-height: 132px;\n padding: 0.58rem 0.62rem 0.72rem;\n position: relative;\n transform-origin: bottom center;\n text-align: center;\n}\n.actor-card::after {\n background: #7c3f17;\n border-radius: 0 0 8px 8px;\n bottom: -22px;\n box-shadow: inset 0 -5px 8px rgba(0, 0, 0, 0.2);\n content: \"\";\n height: 22px;\n left: calc(50% - 8px);\n position: absolute;\n width: 16px;\n}\n.actor-card.active {\n animation: puppet-bounce 0.78s ease-in-out infinite alternate;\n border-color: #ffd166;\n box-shadow:\n 0 0 0 2px rgba(255, 209, 102, 0.22),\n 0 0 34px rgba(255, 209, 102, 0.46),\n 0 16px 34px rgba(0, 0, 0, 0.34);\n}\n.actor-avatar {\n background: radial-gradient(circle, rgba(255, 209, 102, 0.2), rgba(59, 10, 22, 0.3));\n border: 1px solid rgba(246, 196, 83, 0.34);\n border-radius: 999px;\n display: inline-grid;\n font-size: 1.7rem;\n height: 3rem;\n place-items: center;\n text-align: center;\n width: 3rem;\n}\n.actor-name {\n color: #f8efe4;\n font-weight: 700;\n line-height: 1.15;\n margin-top: 0.35rem;\n text-align: center;\n}\n.speaking-pill {\n background: #ffd166;\n border-radius: 999px;\n color: #3b0a16;\n display: inline-block;\n font-size: 0.64rem;\n font-weight: 800;\n margin-top: 0.26rem;\n padding: 0.12rem 0.44rem;\n text-transform: uppercase;\n}\n.actor-detail {\n color: #cbb7a1;\n font-size: 0.72rem;\n line-height: 1.28;\n margin-top: 0.35rem;\n}\n.actor-detail strong {\n color: #f8efe4;\n}\n.held-prop {\n margin-top: 0.42rem;\n}\n.held-prop span {\n background: rgba(246, 196, 83, 0.14);\n border: 1px solid rgba(246, 196, 83, 0.32);\n border-radius: 999px;\n color: #ffd166;\n display: 
inline-block;\n font-size: 0.68rem;\n font-weight: 700;\n padding: 0.12rem 0.42rem;\n}\n.beat-counter {\n color: #ffd166;\n font-weight: 800;\n margin-top: 0.55rem;\n position: relative;\n text-align: center;\n z-index: 3;\n}\n.stage-events {\n display: grid;\n gap: 0.4rem;\n margin-top: 0.55rem;\n position: relative;\n z-index: 3;\n}\n.audience-action,\n.prop-pile {\n background: rgba(42, 20, 38, 0.7);\n border: 1px solid rgba(246, 196, 83, 0.25);\n border-radius: 999px;\n color: #f8efe4;\n margin: 0 auto;\n max-width: 48rem;\n padding: 0.38rem;\n text-align: center;\n width: 100%;\n}\n.audience-action strong,\n.prop-pile strong {\n color: #ffd166;\n}\n.prop-token {\n animation: prop-pop 0.22s ease-out;\n background: rgba(246, 196, 83, 0.17);\n border: 1px solid rgba(246, 196, 83, 0.5);\n border-radius: 999px;\n color: #fff7ed;\n display: inline-block;\n margin: 0.2rem;\n padding: 0.22rem 0.55rem;\n}\n.gradio-container button.primary,\n.gradio-container button.primary-action,\n.gradio-container button.run-one-action {\n background: #f97316 !important;\n border-color: #f97316 !important;\n box-shadow: 0 10px 24px rgba(249, 115, 22, 0.25) !important;\n color: #fff7ed !important;\n}\n.gradio-container button.secondary,\n.gradio-container button.secondary-action,\n.gradio-container button.audience-action-button {\n background: #3f3148 !important;\n border-color: rgba(246, 196, 83, 0.22) !important;\n color: #f8efe4 !important;\n}\n.gradio-container button.reset-action {\n background: #3b0a16 !important;\n border-color: rgba(246, 196, 83, 0.24) !important;\n color: #f8efe4 !important;\n}\n.transcript-box,\n.gradio-container .accordion {\n background: rgba(13, 6, 14, 0.58) !important;\n border-color: rgba(246, 196, 83, 0.18) !important;\n color: #f8efe4 !important;\n}\n@keyframes puppet-bounce {\n from { transform: translateY(0) rotate(-0.4deg); }\n to { transform: translateY(-7px) rotate(0.7deg); }\n}\n@keyframes bubble-in {\n from { opacity: 0; transform: 
translateY(8px); }\n to { opacity: 1; transform: translateY(0); }\n}\n@keyframes prop-pop {\n from { opacity: 0; transform: scale(0.86); }\n to { opacity: 1; transform: scale(1); }\n}\n@media (max-width: 760px) {\n .puppet-stage {\n min-height: 560px;\n }\n .puppet-stage::before,\n .puppet-stage::after {\n width: 7%;\n }\n .stage-backdrop {\n padding: 0.8rem 1.4rem;\n }\n .stage-marquee {\n font-size: 1.2rem;\n }\n .actor-row {\n grid-template-columns: repeat(2, minmax(0, 1fr));\n }\n .actor-card {\n min-height: 126px;\n }\n}\n\n/* Final Gradio chrome overrides: keep the whole app in the theater palette. */\n.gradio-container {\n width: min(1200px, calc(100vw - 2rem)) !important;\n}\n.gradio-container .gr-group {\n background: rgba(34, 17, 31, 0.84) !important;\n border: 1px solid rgba(246, 196, 83, 0.2) !important;\n border-radius: 8px !important;\n color: #f8efe4 !important;\n}\n.gradio-container .gr-group .form,\n.gradio-container .gr-group .block,\n.gradio-container .gr-group .wrap,\n.gradio-container .gr-group .wrap-inner,\n.gradio-container .gr-group .secondary-wrap,\n.gradio-container .gr-group .input-container,\n.gradio-container .gr-group label {\n background: transparent !important;\n color: #f8efe4 !important;\n}\n.gradio-container input,\n.gradio-container textarea,\n.gradio-container select,\n.gradio-container .dropdown-container,\n.gradio-container .wrap-inner {\n background: rgba(10, 12, 23, 0.9) !important;\n color: #f8efe4 !important;\n}\n.gradio-container .control-panel input,\n.gradio-container .control-panel textarea,\n.gradio-container .control-panel .wrap-inner,\n.gradio-container .premise-panel textarea {\n border: 1px solid rgba(246, 196, 83, 0.24) !important;\n}\n.gradio-container button {\n background: #3f3148 !important;\n border: 1px solid rgba(246, 196, 83, 0.24) !important;\n color: #f8efe4 !important;\n}\n.gradio-container button.primary,\n.gradio-container button.primary-action,\n.gradio-container button.run-one-action {\n 
background: #f97316 !important;\n border-color: #f97316 !important;\n color: #fff7ed !important;\n}\n.gradio-container button.reset-action {\n background: #3b0a16 !important;\n border-color: rgba(246, 196, 83, 0.32) !important;\n}\n.gradio-container .html-container,\n.gradio-container .gradio-style {\n width: 100% !important;\n}\n.puppet-stage {\n min-height: 500px;\n width: 100%;\n}\n.puppet-stage::before,\n.puppet-stage::after {\n width: clamp(56px, 9%, 110px);\n}\n.stage-backdrop {\n padding: 0.78rem clamp(4.1rem, 11vw, 8.8rem) 0.72rem;\n}\n.stage-marquee {\n font-size: clamp(1.25rem, 2.1vw, 1.72rem);\n white-space: normal;\n}\n.speech-bubble {\n margin-top: 0.48rem;\n max-width: 44rem;\n padding: 0.58rem 0.82rem;\n}\n.actor-row {\n align-items: end;\n grid-template-columns: repeat(auto-fit, minmax(116px, 1fr));\n gap: 0.62rem;\n margin-top: 0.82rem;\n}\n.actor-card {\n align-content: start;\n background: radial-gradient(circle at 50% 18%, rgba(246, 196, 83, 0.13), rgba(70, 38, 36, 0.72) 58%);\n border-radius: 18px;\n display: grid;\n justify-items: center;\n min-height: 108px;\n padding: 0.5rem 0.45rem 0.56rem;\n}\n.actor-card::after {\n bottom: -20px;\n height: 20px;\n width: 14px;\n}\n.actor-avatar {\n font-size: 2rem;\n height: 3.3rem;\n width: 3.3rem;\n}\n.actor-name {\n font-size: 0.82rem;\n margin-top: 0.28rem;\n}\n.actor-detail {\n display: -webkit-box;\n font-size: 0.66rem;\n line-height: 1.18;\n margin-top: 0.2rem;\n max-width: 11rem;\n min-height: 1.55rem;\n overflow: hidden;\n -webkit-box-orient: vertical;\n -webkit-line-clamp: 2;\n}\n.held-prop {\n margin-top: 0.26rem;\n}\n.held-prop span {\n font-size: 0.62rem;\n padding: 0.08rem 0.34rem;\n}\n.speaking-pill {\n font-size: 0.58rem;\n margin-top: 0.18rem;\n padding: 0.08rem 0.36rem;\n}\n.stage-events {\n gap: 0.32rem;\n margin-top: 0.64rem;\n}\n.audience-action,\n.prop-pile {\n max-width: 45rem;\n padding: 0.3rem 0.55rem;\n}\n@media (max-width: 760px) {\n .gradio-container {\n width: min(100vw, 
calc(100vw - 0.75rem)) !important;\n }\n .puppet-stage::before,\n .puppet-stage::after {\n width: 30px;\n }\n .stage-backdrop {\n padding: 0.75rem 2.45rem;\n }\n .actor-row {\n grid-template-columns: repeat(2, minmax(0, 1fr));\n gap: 0.45rem;\n }\n .actor-card {\n min-height: 102px;\n padding-left: 0.28rem;\n padding-right: 0.28rem;\n }\n}\n\n/* Compact stage pass: keep the theater look, reduce scrolling, and keep controls close. */\n.gradio-container {\n padding-top: 0.65rem !important;\n}\n.app-title h1 {\n font-size: 1.95rem;\n}\n.app-title p {\n margin-bottom: 0.55rem;\n}\n.premise-panel {\n padding: 0.42rem 0.55rem 0.52rem;\n}\n.stage-output,\n.stage-output .html-container,\n.stage-output .gradio-style {\n margin-bottom: 0 !important;\n}\n.puppet-stage {\n min-height: 390px;\n}\n.stage-valance {\n height: 34px;\n border-bottom-width: 3px;\n}\n.stage-backdrop {\n padding: 0.48rem clamp(3.9rem, 9vw, 7.3rem) 0.46rem;\n}\n.stage-marquee {\n font-size: clamp(1.15rem, 1.9vw, 1.52rem);\n}\n.stage-copy {\n font-size: 0.76rem;\n line-height: 1.25;\n margin-top: 0.14rem;\n}\n.speech-bubble {\n border-radius: 12px;\n margin-top: 0.34rem;\n max-width: 40rem;\n padding: 0.42rem 0.7rem;\n}\n.speech-speaker {\n font-size: 0.68rem;\n}\n.speech-line {\n font-size: 0.86rem;\n}\n.actor-row {\n grid-template-columns: repeat(auto-fit, minmax(104px, 1fr));\n gap: 0.5rem;\n margin-top: 0.55rem;\n}\n.actor-card {\n border-radius: 14px;\n min-height: 88px;\n padding: 0.38rem 0.36rem 0.44rem;\n}\n.actor-card::after {\n bottom: -16px;\n height: 16px;\n}\n.actor-avatar {\n font-size: 1.65rem;\n height: 2.55rem;\n width: 2.55rem;\n}\n.actor-name {\n font-size: 0.74rem;\n margin-top: 0.2rem;\n}\n.actor-detail {\n font-size: 0.6rem;\n line-height: 1.12;\n margin-top: 0.14rem;\n min-height: 1.35rem;\n}\n.speaking-pill {\n font-size: 0.52rem;\n margin-top: 0.14rem;\n}\n.held-prop {\n margin-top: 0.18rem;\n}\n.held-prop span {\n font-size: 0.55rem;\n}\n.stage-events {\n gap: 0.24rem;\n 
margin-top: 0.46rem;\n}\n.audience-action,\n.prop-pile {\n font-size: 0.78rem;\n max-width: 39rem;\n padding: 0.22rem 0.5rem;\n}\n.prop-token {\n margin: 0.08rem;\n padding: 0.12rem 0.4rem;\n}\n.beat-counter {\n font-size: 0.84rem;\n margin-top: 0.34rem;\n}\n.stage-floorboards {\n height: 40px;\n}\n.control-panel {\n margin-top: 0 !important;\n padding: 0.42rem;\n}\n.control-panel h3 {\n margin-bottom: 0.2rem;\n}\n.gradio-container .row {\n gap: 0.55rem !important;\n}\n.stage-output + .row,\n.stage-output + div,\n.control-panel + .control-panel {\n margin-top: 0.45rem !important;\n}\n.transcript-section,\n.gradio-container .accordion {\n margin-top: 0.55rem !important;\n}\n@media (max-width: 760px) {\n .puppet-stage {\n min-height: 430px;\n }\n .stage-backdrop {\n padding: 0.52rem 2.15rem;\n }\n .actor-row {\n grid-template-columns: repeat(2, minmax(0, 1fr));\n }\n .speech-line {\n font-size: 0.8rem;\n }\n}\n\"\"\"\n\n\ndef render_stage(session: TheaterSession | None) -> str:\n if session is None:\n return EMPTY_STAGE\n\n actor_cards = []\n latest_beat = session.transcript[-1] if session.transcript else None\n latest_speaker = latest_beat.speaker if latest_beat else None\n for actor in session.actors:\n active_class = \" active\" if actor.name == latest_speaker else \"\"\n active_label = '
Now speaking
' if actor.name == latest_speaker else \"\"\n role_line = actor.goal.split(\".\", maxsplit=1)[0]\n held_prop = actor.held_prop or \"nothing\"\n held_emoji = PROP_EMOJI.get(held_prop.lower(), \"🎁\") if actor.held_prop else \"\"\n actor_cards.append(\n f\"\"\"\n
\n
{escape(actor.avatar)}
\n
{escape(actor.name)}
\n {active_label}\n
{escape(role_line)}
\n
Holding: {escape((held_emoji + \" \") if held_emoji else \"\")}{escape(held_prop)}
\n
\n \"\"\"\n )\n latest_line = \"\"\n if latest_beat is not None:\n latest_line = f\"\"\"\n
\n
{escape(latest_beat.speaker)}
\n
{escape(latest_beat.line)}
\n
\n \"\"\"\n audience_action = \"\"\n if session.latest_audience_action is not None:\n audience_action = f\"\"\"\n
\n Audience: {escape(session.latest_audience_action)}\n
\n \"\"\"\n prop_pile = \"\"\n if session.props:\n prop_tokens = \"\".join(\n f'
{escape(PROP_EMOJI.get(prop.lower(), \"🎁\"))} {escape(prop)}'\n for prop in session.props\n )\n prop_pile = f\"\"\"\n
\n Props on stage: {prop_tokens}\n
\n \"\"\"\n\n return f\"\"\"\n
\n
\n
\n
{escape(session.show_title)}
\n
\n Setting: {escape(session.setting)}
\n Premise: {escape(session.premise)}\n
\n {latest_line}\n
\n {''.join(actor_cards)}\n
\n
\n {audience_action}\n {prop_pile}\n
\n
Beat {session.beat_index} of {session.max_beats}
\n
\n
\n
\n \"\"\"\n\n\ndef render_transcript(session: TheaterSession | None) -> str:\n if session is None:\n return EMPTY_TRANSCRIPT\n\n transcript_lines = [\n \"Transcript:\",\n \"No puppet lines yet. The first beat will be added in the next milestone.\",\n ]\n if session.transcript:\n transcript_lines = [\"Transcript:\"]\n for index, beat in enumerate(session.transcript, start=1):\n transcript_lines.append(f\"{index}. {beat.speaker}: {beat.line}\")\n\n return \"\\n\".join(transcript_lines)\n\n\ndef render_director_log(session: TheaterSession | None) -> str:\n if session is None:\n return EMPTY_DIRECTOR_LOG\n return \"\\n\".join(f\"- {entry}\" for entry in session.director_log)\n\n\ndef render_trace(session: TheaterSession | None) -> str:\n if session is None:\n return EMPTY_TRACE\n return \"\\n\".join(f\"- {entry}\" for entry in session.trace_events)\n\n\ndef normalize_backend_name(backend_name: str | None) -> str:\n return backend_name if backend_name in BACKEND_CHOICES else \"determinist"
},
{
"id": "build-small-hackathon/ai-study-buddy",
@@ -109,11 +117,35 @@
"sdk": "gradio",
"license": "apache-2.0",
"created_at": "2026-06-01T13:45:43+00:00",
- "last_modified": "2026-06-07T09:45:47+00:00",
+ "last_modified": "2026-06-07T14:46:54+00:00",
"host": "https://build-small-hackathon-ai-study-buddy.hf.space",
"url": "https://huggingface.co/spaces/build-small-hackathon/ai-study-buddy",
"app_file": "app.py",
- "app_file_embedding_text": "build_prompt message mode get_response history summarize text quiz simple study_plan InferenceClient model token You are AI Study Buddy, created by Areeba Iqbal. Rules: - Always explain step-by-step - Give examples - Be clear and student-friendly - If asked who created you: \"I am AI Study Buddy, created by Areeba Iqbal.\" demo.launch server_name server_port messages.append gr.Blocks theme css title gr.HTML gr.Radio value label gr.ChatInterface fn additional_inputs examples gr.Markdown gr.Textbox click meta-llama/Llama-3.1-8B-Instruct os.getenv 📚 Study Mode 💻 Coding Mode 🧮 Math Solver 📝 Exam Prep Explain simply for students with examples. Act as a senior programmer. Debug and improve code. Solve step-by-step with explanation. Give short exam-focused answers. Mode: User Question: client.chat_completion messages max_tokens temperature 📚 AI Study Buddy Learn smarter with AI-powered guidance ## ⚡ Quick Actions gr.Row ## 🗓️ Study Plan Generator Created by Areeba Iqbal 0.0.0.0 API_KEY mode_prompts.get role content system user gr.themes.Soft AI Study Buddy Select Mode Quick Input Enter Topic / Exam Detail Plan Output gr.Button ❌ Error: Generate Plan Explain recursion Solve quadratic equation What is AI? Debug Python code 📖 Summarize 📝 Quiz 💡 Simple Summarize: Generate 5 MCQs: Explain simply: Make 7-day study plan for:"
+ "app_file_embedding_text": "build_prompt message mode get_response history summarize text quiz simple study_plan InferenceClient model token You are AI Study Buddy, created by Areeba Iqbal. Rules: - Always explain step-by-step - Give examples - Be clear and student-friendly - If asked who created you: \"I am AI Study Buddy, created by Areeba Iqbal.\" demo.launch server_name server_port messages.append gr.Blocks theme css title gr.HTML gr.Radio value label gr.ChatInterface fn additional_inputs examples gr.Markdown gr.Textbox click meta-llama/Llama-3.1-8B-Instruct os.getenv 📚 Study Mode 💻 Coding Mode 🧮 Math Solver 📝 Exam Prep Explain simply for students with examples. Act as a senior programmer. Debug and improve code. Solve step-by-step with explanation. Give short exam-focused answers. Mode: User Question: client.chat_completion messages max_tokens temperature 📚 AI Study Buddy Learn smarter with AI-powered guidance ## ⚡ Quick Actions gr.Row ## 🗓️ Study Plan Generator Created by Areeba Iqbal 0.0.0.0 API_KEY mode_prompts.get role content system user gr.themes.Soft AI Study Buddy Select Mode Quick Input Enter Topic / Exam Detail Plan Output gr.Button ❌ Error: Generate Plan Explain recursion Solve quadratic equation What is AI? Debug Python code 📖 Summarize 📝 Quiz 💡 Simple Summarize: Generate 5 MCQs: Explain simply: Make 7-day study plan for:",
+ "readme_body": "Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference",
+ "app_file_source": "import gradio as gr\nimport os\nimport spaces\nfrom huggingface_hub import InferenceClient\n\n# -----------------------------\n# 🔑 API KEY FIXED\n# -----------------------------\nclient = InferenceClient(\n model=\"meta-llama/Llama-3.1-8B-Instruct\",\n token=os.getenv(\"API_KEY\") # 👈 FIXED NAME (recommended)\n)\n\n# -----------------------------\n# SYSTEM PROMPT\n# -----------------------------\nSYSTEM_PROMPT = \"\"\"\nYou are AI Study Buddy, created by Areeba Iqbal.\n\nRules:\n- Always explain step-by-step\n- Give examples\n- Be clear and student-friendly\n- If asked who created you: \"I am AI Study Buddy, created by Areeba Iqbal.\"\n\"\"\"\n\n# -----------------------------\n# MODE CONTROL\n# -----------------------------\ndef build_prompt(message, mode):\n mode_prompts = {\n \"📚 Study Mode\": \"Explain simply for students with examples.\",\n \"💻 Coding Mode\": \"Act as a senior programmer. Debug and improve code.\",\n \"🧮 Math Solver\": \"Solve step-by-step with explanation.\",\n \"📝 Exam Prep\": \"Give short exam-focused answers.\"\n }\n\n return f\"\"\"\n{SYSTEM_PROMPT}\n\nMode: {mode_prompts.get(mode, \"\")}\n\nUser Question:\n{message}\n\"\"\"\n\n# -----------------------------\n# MAIN CHAT FUNCTION\n# -----------------------------\n@spaces.GPU\ndef get_response(message, history, mode):\n\n messages = [{\"role\": \"system\", \"content\": SYSTEM_PROMPT}]\n\n for msg in history:\n messages.append(msg)\n\n messages.append({\"role\": \"user\", \"content\": build_prompt(message, mode)})\n\n try:\n response = client.chat_completion(\n messages=messages,\n max_tokens=1024,\n temperature=0.7\n )\n\n return response.choices[0].message.content\n\n except Exception as e:\n return f\"❌ Error: {e}\"\n\n\n# -----------------------------\n# QUICK ACTIONS\n# -----------------------------\ndef summarize(text):\n return client.chat_completion(\n messages=[{\"role\": \"user\", \"content\": \"Summarize: \" + text}],\n max_tokens=500\n 
).choices[0].message.content\n\n\ndef quiz(text):\n return client.chat_completion(\n messages=[{\"role\": \"user\", \"content\": \"Generate 5 MCQs: \" + text}],\n max_tokens=500\n ).choices[0].message.content\n\n\ndef simple(text):\n return client.chat_completion(\n messages=[{\"role\": \"user\", \"content\": \"Explain simply: \" + text}],\n max_tokens=500\n ).choices[0].message.content\n\n\ndef study_plan(text):\n return client.chat_completion(\n messages=[{\"role\": \"user\", \"content\": f\"Make 7-day study plan for: {text}\"}],\n max_tokens=700\n ).choices[0].message.content\n\n\n# -----------------------------\n# UI\n# -----------------------------\ncss = \"\"\"\n.main-container {\n max-width: 900px;\n margin: auto;\n}\n#title { text-align:center; }\n#subtitle { text-align:center; color:gray; }\n#footer { text-align:center; color:gray; font-size:14px; }\n\"\"\"\n\nwith gr.Blocks(\n theme=gr.themes.Soft(),\n css=css,\n title=\"AI Study Buddy\"\n) as demo:\n\n gr.HTML(\"\"\"\n
\n
📚 AI Study Buddy
\n
Learn smarter with AI-powered guidance
\n
\n \"\"\")\n\n # ---------------- MODE SELECT ----------------\n mode = gr.Radio(\n [\"📚 Study Mode\", \"💻 Coding Mode\", \"🧮 Math Solver\", \"📝 Exam Prep\"],\n value=\"📚 Study Mode\",\n label=\"Select Mode\"\n )\n\n # ---------------- CHAT ----------------\n chatbot = gr.ChatInterface(\n fn=get_response,\n additional_inputs=[mode],\n examples=[\n [\"Explain recursion\"],\n [\"Solve quadratic equation\"],\n [\"What is AI?\"],\n [\"Debug Python code\"]\n ]\n )\n\n # ---------------- QUICK ACTIONS ----------------\n gr.Markdown(\"## ⚡ Quick Actions\")\n\n quick_input = gr.Textbox(label=\"Quick Input\")\n\n with gr.Row():\n gr.Button(\"📖 Summarize\").click(summarize, quick_input, gr.Textbox())\n gr.Button(\"📝 Quiz\").click(quiz, quick_input, gr.Textbox())\n gr.Button(\"💡 Simple\").click(simple, quick_input, gr.Textbox())\n\n # ---------------- STUDY PLAN ----------------\n gr.Markdown(\"## 🗓️ Study Plan Generator\")\n\n plan_input = gr.Textbox(label=\"Enter Topic / Exam Detail\")\n plan_output = gr.Textbox(label=\"Plan Output\")\n\n gr.Button(\"Generate Plan\").click(study_plan, plan_input, plan_output)\n\n # ---------------- FOOTER ----------------\n gr.HTML(\"\"\"\n \n \"\"\")\n\ndemo.launch(server_name=\"0.0.0.0\", server_port=7860)"
+ },
+ {
+ "id": "build-small-hackathon/AmazingDigitalPetDentures",
+ "title": "AmazingDigitalPetDentures",
+ "summary": "The Amazing Digital Pet Dentures feeds on your Adventures",
+ "tags": [
+ "gradio",
+ "region:us"
+ ],
+ "models": [],
+ "datasets": [],
+ "likes": 1,
+ "sdk": "gradio",
+ "license": "",
+ "created_at": "2026-06-05T15:14:32+00:00",
+ "last_modified": "2026-06-07T18:55:12+00:00",
+ "host": "https://build-small-hackathon-amazingdigitalpetdentures.hf.space",
+ "url": "https://huggingface.co/spaces/build-small-hackathon/AmazingDigitalPetDentures",
+ "app_file": "app.py",
+ "app_file_embedding_text": "_strip_fences text best_html parse_reply content reasoning answer_markdown prose doc iframe_for raw_html empty_preview_doc empty_preview local_reply message run_model messages user_message convo_to_history convo latest_html chat_turn history hydrate new_session build_app os.environ.setdefault Amazing Digital Pet Dentures — HTML Toy Maker re.compile GRADIO_SSR_MODE false .*? Remove ``` code fences but keep their contents. re.sub text.replace Slice out the real HTML document: from the LAST (or ) to the LAST . The real doc is generated AFTER any reasoning, so taking the last opener avoids reasoning that merely *mentions* tags (which produced broken fragments before). text.lower low.rfind strip Split a raw model reply into (thinking, prose, html_doc_or_None). The assistant chat bubble: the friendly line + the full HTML as a code block. html.escape quote Your toy will appear here. 🎪 Fallback when the model layer can't be imported/run (e.g. no GPU locally). I couldn't reach the model. This runs in-process on **ZeroGPU** via llama-cpp-python — check that the Space has ZeroGPU enabled and see the logs. Always returns {\"content\", \"reasoning\"}. Rebuild the chatbot from the persisted convo on reload (thinking is live-only). reversed list history.append sent.append convo.append On page load, restore the chat + last toy from the persisted BrowserState. Clear chat + history + preview (panel stays on, showing the empty placeholder). __main__ app.launch css ssr_mode print file traceback.print_exc role assistant Hi! I'm the dentures 🦷 — describe anything (a game, a widget, a visualizer, a clock…) and I'll build it as a live HTML toy. Hit 🧹 New session to start over. ```[a-zA-Z0-9]*\\n? ``` Tell me what to build — e.g. 'a bouncing ball that follows my mouse'. model_generate isinstance gr.update gr.Blocks title fill_width [app] model layer not available — using fallback replies. 
Reason: ℹ️ **Hugging Face note:** the `---` block at the very top of this file is the Space\n> config. **Do not delete it** — it tells the Space how to run. Everything below it is just\n> this page.\n\n---\n\n## How it works (architecture)\n\n| File | Role |\n|---|---|\n| `app.py` | Gradio UI: chat + the adventure window (renders games in an `