Decision-Fish committed on
Commit aa39c2d · verified · 1 Parent(s): 6e19a5c

Upload 2 files

Files changed (2)
  1. CAT_universal_prompt.txt +53 -0
  2. app.py +400 -0
CAT_universal_prompt.txt ADDED
@@ -0,0 +1,53 @@
+ CAT Universal Prompt (CAT_universal_prompt.txt)
+
+ Opening Context
+ You are an AI Mentor guiding a student through a simulation. The student will help a fictional character, in a realistic NYC entry-level job, navigate a decision or dilemma directly related to the Module Learning Objectives below. You will generate one realistic scenario aligned with these objectives, starting with a vivid, in-character request for help that encourages the student to ask questions and perform the relevant analyses taught in the module.
+
+ [LEARNING OBJECTIVES]
+ {LEARNING_OBJECTIVES}
+
+ Rubric Criteria for Evaluation
+ The evaluator will use the rubric below. Throughout the conversation, gently guide the student to address these items. Do not do analyses for the student; instead, prompt them for inputs and let them run the tools.
+ {RUBRIC}
+
+ Rules for the Simulation
+ - Stay in character during roleplay until the scene ends.
+ - Keep responses concise (2–4 sentences) and focused on helping the student think, not on solving the problem for them.
+ - Let the student lead—do not introduce decision-making tools; if they choose one, ask for their inputs and let them run it.
+ - Encourage the student to perform the relevant analyses taught in the module as part of their reasoning.
+ - Aim for about 15–20 meaningful exchanges total.
+ - Before moving to the wrap-up, ensure the student has addressed the key elements in this module’s rubric and learning objectives. If any appear missing, ask a brief, relevant question to prompt for them.
+
+ Scene Wrap & Transition to Evaluation
+ - After roughly 7–9 student turns, begin closing the scene.
+ - In character, warmly acknowledge the student’s efforts and invite a final contribution:
+   "Thanks for walking through this with me. We’ve covered a lot. Before we wrap up, is there anything else you’d like me to consider before I give you a preliminary assessment?"
+ - If the student says no:
+   "Okay, thank you. Let’s step back and review how you approached this situation."
+ - If the student says yes:
+   Provide one short, neutral acknowledgment only (no new roleplay branches):
+   "I appreciate you sharing that. I’ll take it into account."
+   Then transition to mentor mode.
+ - Hard cap: If 10 student turns are reached without the wrap-up, trigger it automatically.
+
+ Evaluation Phase (Mentor Mode)
+ - Drop character completely.
+ - Ask the student to name at least two decision-making tools they used and confirm whether they applied them accurately.
+ - Using the module rubric:
+   - Assign a score for each category.
+   - Provide up to 100 words of feedback per category in a warm, professional tone.
+   - Include at least one quote or paraphrase from the conversation to support each score.
+ - Be specific and constructive. Award full marks only if all criteria are met.
+ - Invite revision:
+   "Would you like to revise any part of your reasoning or recommendation before receiving your final score?"
+ - If they revise, reassess and give updated scores and feedback.
+
+ Fictional Consequence
+ - After scoring, describe a fictional but plausible consequence of the character’s decision-making process tied to the student’s performance:
+   - Excellent: Significant success or positive impact.
+   - Satisfactory: Middling result with some improvement needed.
+   - Unsatisfactory: Realistic setback or risk from weaker reasoning.
+ - Keep it brief (2–3 sentences), professional, and relevant.
+
+ Starting the Simulation
+ Generate one realistic scenario aligned with the learning objectives. Begin with a short, vivid description of the situation and an in-character request for guidance. Then wait for the student to respond.
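For reference, `assemble_prompt` and `parse_rubric_from_module` in app.py look for `MODULE NAME:`, `LEARNING OBJECTIVES:`, and `RUBRIC:` marker lines, with ALL-CAPS headers ending each section. A hypothetical sketch of a module file in that layout (the section names come from the code; the content itself is invented):

```text
MODULE NAME:
Module 01: Decision Tools (hypothetical example)

LEARNING OBJECTIVES:
- Frame a workplace decision and identify what information is needed
- Apply at least one decision-making tool taught in the module

RUBRIC:
- States the decision and information needs clearly
- Applies the appropriate tool or framework correctly
- Justifies the conclusion and notes at least one limitation or tradeoff
```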
app.py ADDED
@@ -0,0 +1,400 @@
+ import os
+ import re
+ from pathlib import Path
+
+ import gradio as gr
+ from dotenv import load_dotenv
+ from openai import OpenAI
+
+ # Type aliases
+ from typing import List, cast
+ from openai.types.chat import ChatCompletionMessageParam
+
+ UNIVERSAL_PROMPT_PATH = "CAT_universal_prompt.txt"
+ MODULE_DIR = "modules"  # module files live in the /modules subfolder
+
+ load_dotenv()
+ client = OpenAI()
+
+
+ def call_model(system_prompt: str, history: list[dict[str, str]]) -> str:
+     # Build as simple dicts first
+     msgs: list[dict[str, str]] = [{"role": "system", "content": system_prompt}]
+     for m in history:
+         role = m.get("role")
+         content = m.get("content")
+         if role in ("user", "assistant") and isinstance(content, str):
+             msgs.append({"role": role, "content": content})
+
+     # Cast once at the call site to satisfy the SDK types
+     typed_msgs = cast(List[ChatCompletionMessageParam], msgs)
+
+     resp = client.chat.completions.create(
+         model="gpt-4o-mini",
+         messages=typed_msgs,
+         temperature=0.4,
+     )
+     return resp.choices[0].message.content or ""
+
+ def load_text(path: str | Path) -> str:
+     with open(path, "r", encoding="utf-8") as f:
+         return f.read()
+
+
+ def assemble_prompt(universal_prompt_text: str, module_text: str) -> str:
+     def extract(label: str) -> str:
+         marker = label + ":"
+         start = module_text.find(marker)
+         if start == -1:
+             return ""
+         start += len(marker)
+         next_markers = ["\nLEARNING OBJECTIVES:", "\nRUBRIC:", "\nMODULE NAME:"]
+         end_positions = [module_text.find(m, start) for m in next_markers if module_text.find(m, start) != -1]
+         end = min(end_positions) if end_positions else len(module_text)
+         return module_text[start:end].strip()
+
+     learning_objectives = extract("LEARNING OBJECTIVES")
+     rubric = extract("RUBRIC")
+
+     prompt = universal_prompt_text.replace("{LEARNING_OBJECTIVES}", learning_objectives)
+     prompt = prompt.replace("{RUBRIC}", rubric)
+     return prompt
+
+ def init_state():
+     return {
+         "assembled_prompt": "",
+         "history": [],
+         "rubric_items": [],
+         "mode": "roleplay",
+         "mentor_step": 0,
+         "student_name": "",
+     }
+
+
+ def start_session(module_file, student_name=""):
+     state = init_state()
+     state["student_name"] = student_name
+
+     universal = load_text(UNIVERSAL_PROMPT_PATH)
+     module_text = load_text(Path(MODULE_DIR) / module_file)
+
+     # Parse the full RUBRIC section once and keep a structured copy
+     state["rubric_items"] = parse_rubric_from_module(module_text)
+     print(f"[CAT] Parsed {len(state['rubric_items'])} rubric items for this module.")
+
+     # Personalize lightly with the student's first name
+     name_hint = (
+         f"\n\n[Student first name: {student_name}. Use it naturally once in the opening; don’t overuse.]"
+         if student_name else ""
+     )
+     state["assembled_prompt"] = assemble_prompt(universal, module_text) + name_hint
+
+     state["history"].append({"role": "system", "content": state["assembled_prompt"]})
+     opening = call_model(state["assembled_prompt"], state["history"])
+     state["history"].append({"role": "assistant", "content": opening})
+     return state, state["history"]
+
+ def chat(user_msg, state):
+     if not user_msg.strip():
+         return "", state["history"], state
+
+     # Shortcut: typing "grade" acts like pressing the Assess button
+     if user_msg.strip().lower() == "grade":
+         hist, st = assess_fn(state)
+         return "", hist, st
+
+     # If the scene is finished, ignore further input and return cleanly
+     if state.get("mode") == "done":
+         return "", state["history"], state
+
+     # Unknown or legacy modes: ignore input (roleplay and mentor are handled below)
+     if state.get("mode") not in ("roleplay", "mentor"):
+         return "", state["history"], state
+
+     state["history"].append({"role": "user", "content": user_msg})
+
+     if state["mode"] == "roleplay":
+         reply = call_model(state["assembled_prompt"], state["history"])
+         state["history"].append({"role": "assistant", "content": reply})
+         return "", state["history"], state
+
+     if state["mode"] == "mentor":
+         # Step 1: general intro (no assumption of tools)
+         if state.get("mentor_step", 0) == 0:
+             eval_intro = (
+                 "Before we wrap up: name two specific concepts, tools, or frameworks you used in this scenario, "
+                 "and in one short sentence each say how you applied them. If you didn’t use any, name two insights "
+                 "you learned and how you would apply them next time."
+             )
+             state["history"].append({"role": "assistant", "content": eval_intro})
+             state["mentor_step"] = 1
+             return "", state["history"], state
+
+         # Step 2: concise rubric-based evaluation
+         else:
+             # Concise rubric-based evaluation (hidden): pass instruction via system_prompt
+             eval_request = (
+                 "Evaluate the student's performance using the module rubric. Provide these sections: "
+                 "Overall rating (Unsatisfactory, Satisfactory, or Excellent) with a one-sentence justification; "
+                 "Career competencies; Uniquely human capacities; Argument analysis; Ethical frameworks; ESG awareness; "
+                 "Application; Interaction quality; Strength; Area to improve; Advice for next time; Fictional consequence. "
+                 "Quote at least one student phrase. Keep the whole evaluation under 180 words."
+             )
+             try:
+                 reply = call_model(
+                     state["assembled_prompt"] + "\n\n" + eval_request,
+                     state["history"]  # call_model drops system entries from history by design
+                 )
+             except Exception as e:
+                 # Never let chat return None; show a friendly error and allow retry
+                 state["history"].append({
+                     "role": "assistant",
+                     "content": f"[Assessment error: {e}] Please press Assess again in a few seconds."
+                 })
+                 return "", state["history"], state
+
+             state["history"].append({"role": "assistant", "content": reply})
+             state["mode"] = "done"
+             return "", state["history"], state
+
+     # Safety net: ensure a consistent return shape if a future branch falls through
+     return "", state["history"], state
+
+ RUBRIC_FALLBACK = [
+     "States the decision and information needs clearly",
+     "Applies the appropriate tool or framework correctly",
+     "Shows steps or calculations and a decision rule, tool, or framework",
+     "Justifies the conclusion and notes at least one limitation or tradeoff",
+ ]
+
+ # --- Helpers for rubric evaluation JSON ---
+ import json
+
+ def _safe_json_loads(s: str):
+     try:
+         return json.loads(s)
+     except Exception:
+         # crude but robust: try to extract a {...} block if the model wrapped it in prose
+         start = s.find("{")
+         end = s.rfind("}")
+         if start != -1 and end != -1 and end > start:
+             try:
+                 return json.loads(s[start:end + 1])
+             except Exception:
+                 return None
+         return None
+
+ def _format_assessment_readable(assess_obj):
+     """
+     assess_obj schema:
+     {
+       "criteria": [{"id": "...", "level": "no|partial|full", "points": 0|0.5|1, "evidence": "..."}],
+       "total_points": float,
+       "max_points": float,
+       "summary": "≤180 words narrative"
+     }
+     """
+     if not isinstance(assess_obj, dict) or "criteria" not in assess_obj:
+         return "[Assessment parsing error: invalid JSON]"
+     lines = []
+     total = assess_obj.get("total_points", 0)
+     maxp = assess_obj.get("max_points", 0)
+     lines.append(f"Score: {total:g}/{maxp:g}")
+     lines.append("")
+     for c in assess_obj["criteria"]:
+         lid = c.get("id", "?")
+         level = c.get("level", "?")
+         pts = c.get("points", "?")
+         ev = c.get("evidence", "")
+         lines.append(f"- {lid}: {level} ({pts}) — {ev}")
+     if assess_obj.get("summary"):
+         lines.append("")
+         lines.append(assess_obj["summary"])
+     return "\n".join(lines)
+ # --- end helpers ---
+
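The fallback path in `_safe_json_loads` can be exercised standalone. A minimal sketch (the function is copied here under a hypothetical name so it runs on its own, and the wrapped string is invented):

```python
import json


def safe_json_loads(s: str):
    # Mirrors _safe_json_loads above: try strict parsing first, then fall back
    # to the outermost {...} span in case the model wrapped the JSON in prose.
    try:
        return json.loads(s)
    except Exception:
        start, end = s.find("{"), s.rfind("}")
        if start != -1 and end > start:
            try:
                return json.loads(s[start:end + 1])
            except Exception:
                return None
        return None


wrapped = 'Sure! Here is the JSON:\n{"results": [{"id": "C1", "meets": true}]}\nHope that helps.'
print(safe_json_loads(wrapped))
# → {'results': [{'id': 'C1', 'meets': True}]}
```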
+ # --- Assess: rubric-based, per-criterion meets/does-not-meet (JSON output) ---
+ def assess_fn(state):
+     """
+     One press:
+       1) Adds END OF SCENE visibly once.
+       2) Runs a rubric-based evaluation using a dedicated evaluator system prompt.
+          Output schema is strict JSON with a per-criterion meets: true/false verdict.
+     If already done: no-op.
+     Returns: (chat_history, state)
+     """
+     # If already finalized, do nothing
+     if state.get("mode") == "done":
+         return state["history"], state
+
+     # 1) Show the scene break once
+     if not (
+         state["history"]
+         and state["history"][-1].get("role") == "assistant"
+         and state["history"][-1].get("content", "").strip() == "END OF SCENE"
+     ):
+         state["history"].append({"role": "assistant", "content": "END OF SCENE"})
+
+     # Enter mentor mode and skip any intro
+     state["mode"] = "mentor"
+     state["mentor_step"] = 2
+
+     # 2) Build rubric payload from the current module; fall back only if needed
+     raw_items = state.get("rubric_items") or []
+     if not isinstance(raw_items, list) or len(raw_items) == 0:
+         raw_items = RUBRIC_FALLBACK[:]  # last resort
+
+     rubric = []
+     for i, item in enumerate(raw_items, start=1):
+         rid = item.get("id") if isinstance(item, dict) else str(item)
+         rubric.append({"id": rid or f"Criterion {i}"})
+
+     # 3) Dedicated evaluator prompt: yes/no evidence per item (simple, deterministic)
+     assessor_system = (
+         "You are the Evaluator for the Conversational Assessment Tool (CAT).\n"
+         "For EACH rubric item, decide if the student provided reasonable, college-level evidence.\n"
+         "Rules:\n"
+         "- 'meets' = true only if the student shows specific, relevant reasoning/evidence for that item.\n"
+         "- Otherwise 'meets' = false.\n"
+         "Return STRICT JSON ONLY (no prose outside JSON):\n"
+         "{\n"
+         '  "results": [\n'
+         '    {"id": "<criterion id>", "meets": true|false, "evidence": "<short quote or brief reason>"}\n'
+         "  ]\n"
+         "}"
+     )
+
+     # 4) Provide the actual rubric text to the model as context
+     module_rubric_text = "\n".join(f"- {c['id']}" for c in rubric)
+     history_for_eval = list(state["history"]) + [
+         {"role": "assistant", "content": "Evaluate against these rubric items:\n" + module_rubric_text}
+     ]
+
+     try:
+         model_raw = call_model(assessor_system, history_for_eval)
+
+         # Parse and normalize
+         data = _safe_json_loads(model_raw)
+         if not data or "results" not in data or not isinstance(data["results"], list):
+             raise ValueError("Invalid evaluator JSON")
+
+         # Align results to rubric order
+         results = []
+         by_id = {str(r.get("id", "")): r for r in data["results"]}
+         for c in rubric:
+             cid = c["id"]
+             r = by_id.get(cid, {})
+             meets = bool(r.get("meets") is True)
+             evidence = str(r.get("evidence") or "")
+             results.append({"id": cid, "meets": meets, "evidence": evidence})
+
+         met = sum(1 for r in results if r["meets"])
+         total = len(results)
+         pct = (met / total) if total else 0.0
+
+         if total > 0 and met == total:
+             overall = "Full Credit"
+         elif pct >= 0.50:
+             overall = "Partial Credit"
+         else:
+             overall = "No Credit"
+
+         # Render readable output
+         lines = [f"Overall: {overall}", f"Met: {met}/{total} ({round(pct*100)}%)", ""]
+         for r in results:
+             mark = "✅" if r["meets"] else "❌"
+             ev = f" — {r['evidence']}" if r["evidence"] else ""
+             lines.append(f"- {mark} {r['id']}{ev}")
+         readable = "\n".join(lines)
+
+     except Exception as e:
+         readable = f"[Assessment error: {e}]"
+
+     state["history"].append({"role": "assistant", "content": readable})
+     state["mode"] = "done"
+     return state["history"], state
+
+ # --- end Assess ---
+
+ def parse_rubric_from_module(module_text: str):
+     """
+     Extracts the full RUBRIC section from a module text file and returns a list of items.
+     - Captures everything after a line that says 'RUBRIC' (with optional colon)
+       until the next ALL-CAPS header or file end.
+     - Accepts bullets like -, *, •, or 1), 1., etc.
+     - Falls back to RUBRIC_FALLBACK if nothing is found.
+     """
+     if not module_text:
+         return RUBRIC_FALLBACK[:]
+
+     # 1) Slice out the RUBRIC block
+     block_re = re.compile(
+         r'^\s*RUBRIC\s*:?\s*$([\s\S]*?)(?=^\s*[A-Z][A-Z\s/&\-]{3,}\s*:?\s*$|^\Z)',
+         re.MULTILINE
+     )
+     m = block_re.search(module_text)
+     if not m:
+         return RUBRIC_FALLBACK[:]
+     block = m.group(1).strip()
+
+     # 2) Collect bullet-like lines
+     items = []
+     for line in block.splitlines():
+         # Keep original text but trim whitespace
+         raw = line.strip()
+         if not raw:
+             continue
+         # Match common bullet or numbered list starters
+         if re.match(r'^(\-|\*|•|\d+[\.\)]|\([a-z]\))\s+', raw, re.IGNORECASE):
+             # strip the bullet prefix
+             cleaned = re.sub(r'^(\-|\*|•|\d+[\.\)]|\([a-z]\))\s+', '', raw, flags=re.IGNORECASE).strip()
+             if cleaned:
+                 items.append(cleaned)
+         else:
+             # Some rubrics are paragraph-style; treat non-empty lines as items,
+             # but avoid obvious section labels like "Notes:" inside the block
+             if not re.match(r'^\s*(notes?|example|weight|scale)\s*:?\s*$', raw, re.IGNORECASE):
+                 items.append(raw)
+
+     # 3) Deduplicate while preserving order
+     seen = set()
+     deduped = []
+     for it in items:
+         if it not in seen:
+             seen.add(it)
+             deduped.append(it)
+
+     return deduped or RUBRIC_FALLBACK[:]
+
+ with gr.Blocks(title="CAT (MVP)") as demo:
+     gr.Markdown("## 😼 Conversational Assessment Tool (CAT) — MVP")
+     with gr.Row():
+         module_file = gr.Dropdown(
+             label="Select Module File",
+             choices=[p.name for p in sorted(Path(MODULE_DIR).glob("module*.txt"))],
+             value="module01.txt",  # assumes modules/module01.txt exists
+             interactive=True
+         )
+         name_tb = gr.Textbox(label="Your first name", placeholder="e.g., Maya", value="", interactive=True)
+         start_btn = gr.Button("Start")  # kept inside the row
+
+     chatbot = gr.Chatbot(label="CAT Conversation", type="messages")
+     user_in = gr.Textbox(label="Your message", placeholder="Type here and press Enter")
+     state = gr.State(init_state())
+     assess_btn = gr.Button("Assess", variant="primary")
+
+     def _start(module_name, student_name):
+         student_name = student_name.strip()
+         if not student_name:
+             # Return a valid state object plus a warning message in the chat
+             return init_state(), [{"role": "assistant", "content": "⚠ Please enter your first name before starting."}]
+         st, hist = start_session(module_name, student_name)
+         return st, hist
+
+     start_btn.click(_start, [module_file, name_tb], [state, chatbot])
+     user_in.submit(chat, [user_in, state], [user_in, chatbot, state])
+     # Clicking Assess triggers the mentor/evaluator flow
+     assess_btn.click(
+         fn=assess_fn,
+         inputs=[state],
+         outputs=[chatbot, state]
+     )
+
+
+ if __name__ == "__main__":
+     demo.launch()
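The RUBRIC-block regex in `parse_rubric_from_module` is the trickiest part of the file, so a standalone sanity check is useful. This sketch reuses the same pattern on an invented module snippet (the sample text and variable names are hypothetical):

```python
import re

# Same pattern as parse_rubric_from_module: capture everything after a
# "RUBRIC:" line up to the next ALL-CAPS header (e.g., "NOTES:") or end of file.
BLOCK_RE = re.compile(
    r'^\s*RUBRIC\s*:?\s*$([\s\S]*?)(?=^\s*[A-Z][A-Z\s/&\-]{3,}\s*:?\s*$|^\Z)',
    re.MULTILINE,
)

sample = """MODULE NAME:
Decision Tools 101

LEARNING OBJECTIVES:
- Frame a workplace decision

RUBRIC:
- States the decision clearly
- Applies the appropriate tool correctly

NOTES:
Internal use only.
"""

m = BLOCK_RE.search(sample)
block = m.group(1).strip() if m else ""
# Strip common bullet prefixes, as the parser does
items = [
    re.sub(r'^(\-|\*|•|\d+[\.\)])\s+', '', line.strip())
    for line in block.splitlines()
    if line.strip()
]
print(items)
# → ['States the decision clearly', 'Applies the appropriate tool correctly']
```

Note that the "NOTES:" section is correctly excluded by the ALL-CAPS-header lookahead, which is why the parser can tolerate trailing material after the rubric.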