gsavin committed on
Commit
2999669
·
2 Parent(s): 4310b90ccccaf7

Merge branch 'main' of https://github.com/DeltaZN/gradio-mcp-hackaton into feat/improve-image-generation

Browse files
src/agent/game_generator.py DELETED
File without changes
src/agent/image_agent.py ADDED
@@ -0,0 +1,82 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ from pydantic import BaseModel, Field
2
+ from typing import Literal, Optional
3
+ from agent.llm import create_light_llm
4
+ from langchain_core.messages import SystemMessage, HumanMessage
5
+ import logging
6
+
7
+ logger = logging.getLogger(__name__)
8
+
9
+
10
+ IMAGE_GENERATION_SYSTEM_PROMPT = """You are an AI agent for a visual novel game. Your role is to process an incoming scene description and determine if the visual scene needs to change. If it does, you will generate a new `scene_description`. This `scene_description` MUST BE a highly detailed image prompt, specifically engineered for an AI image generation model, and it MUST adhere to the strict first-person perspective detailed below.
11
+
12
+ **Your Core Tasks & Output Structure:**
13
+ Your output MUST be a `ChangeScene` object. You need to:
14
+ 1. **Determine Change Type:** Decide if the scene requires a "change_completely", "modify", or "no_change" and set this in the `change_scene` field of the output object.
15
+ 2. **Generate FPS Image Prompt:** If your decision is "change_completely" or "modify", you MUST then generate the image prompt and place it in the `scene_description` field of the output object. If "no_change", this field can be null or empty.
16
+
17
+ **Mandatory: First-Person Perspective (FPS) for Image Prompts**
18
+ The image prompt you generate for the `scene_description` field MUST strictly describe the scene from a first-person perspective (FPS), as if the player is looking directly through the character's eyes.
19
+ * **Viewpoint:** All descriptions must be from the character's eye level, looking forward or as indicated by the scene.
20
+ * **Character Visibility:** The scene must be depicted strictly as if looking through the character's eyes. NO part of the character's own body (e.g., hands, arms, feet, clothing on them) should be visible or described in the prompt. The view is purely what is external to the character.
21
+ * **Immersion:** Focus on what the character directly sees and perceives in their immediate environment. Use phrasing that reflects this, for example: "I see...", "Before me lies...", "Looking through the grimy window...", "The corridor stretches out in front of me."
22
+
23
+ **Guidelines for Crafting the FPS Image Prompt (for `scene_description` field):**
24
+ When generating the image prompt, ensure it's detailed and considers the following aspects, all from the character's first-person viewpoint:
25
+
26
+ 1. **Subject & Focus (as seen by the character):**
27
+ * What is the primary subject or point of interest directly in the character's view?
28
+ * Describe any other characters visible to the POV character: their appearance (from the character's perspective), clothing, expressions, posture, and actions.
29
+ * Detail key objects, items, or environmental elements the character is interacting with or observing.
30
+
31
+ 2. **Setting & Environment (from the character's perspective):**
32
+ * Describe the immediate surroundings as the character would see them.
33
+ * Time of day and weather conditions as perceived by the character.
34
+ * Specific architectural or natural features visible in the character's field of view.
35
+
36
+ 3. **Art Style & Medium:**
37
+ * Specify the desired visual style (e.g., photorealistic, anime, manga, watercolor, oil painting, pixel art, 3D render, concept art, comic book).
38
+ * Mention any specific artist influences if relevant (e.g., "in the style of Studio Ghibli").
39
+
40
+ 4. **Composition & Framing (from the character's viewpoint):**
41
+ * How is the scene framed from the character's eyes? (e.g., "looking straight ahead at a door," "view through a sniper scope," "gazing up at a tall tower").
42
+ * Describe the arrangement of elements as perceived by the character. Avoid terms like "medium shot" or "wide shot" unless they can be rephrased from an FPS view (e.g., "a wide vista opens up before me").
43
+
44
+ 5. **Lighting & Atmosphere (as perceived by the character):**
45
+ * Describe lighting conditions (e.g., "bright sunlight streams through the window in front of me," "only the dim glow of my flashlight illuminates the passage ahead," "neon signs reflect off the wet street I'm looking at").
46
+ * What is the overall mood or atmosphere from the character's perspective? (e.g., "a tense silence hangs in the air as I look down the dark hallway," "a sense of peace as I gaze at the sunset over the mountains").
47
+
48
+ 6. **Color Palette:**
49
+ * Specify dominant colors or a color scheme relevant to what the character sees.
50
+
51
+ 7. **Details & Keywords:**
52
+ * Include crucial details from the input scene description that the character would notice.
53
+ * Use descriptive adjectives and strong keywords.
54
+
55
+ **Example for the `scene_description` field (the FPS image prompt):**
56
+ "FPS view. Through the cockpit window of a futuristic hovercar, a sprawling neon-lit cyberpunk city stretches out under a stormy, rain-lashed sky. Rain streaks across the glass. The hum of the engine is palpable. Photorealistic, Blade Runner style. Cool blue and vibrant pink neon palette."
57
+ """
58
+
59
+
60
class ChangeScene(BaseModel):
    # Structured decision returned by the image agent: whether the visuals
    # must change and, if so, the first-person image prompt to render.
    # (No class docstring on purpose: pydantic feeds docstrings into the
    # structured-output schema sent to the LLM.)
    change_scene: Literal["change_completely", "modify", "no_change"] = Field(
        description="Whether the scene should be completely changed, just modified or not changed at all"
    )
    # FPS-style image prompt; per the system prompt it may be None/empty
    # when change_scene == "no_change".
    scene_description: Optional[str] = None
65
+
66
+
67
image_prompt_generator_llm = create_light_llm(0.1).with_structured_output(ChangeScene)


async def generate_image_prompt(scene_description: str, request_id: str) -> ChangeScene:
    """Ask the light LLM whether the scene's visuals should change.

    Returns a ChangeScene decision whose scene_description, when present,
    is a first-person image prompt for the image generation model.
    """
    logger.info(f"Generating image prompt for the current scene: {request_id}")
    conversation = [
        SystemMessage(content=IMAGE_GENERATION_SYSTEM_PROMPT),
        HumanMessage(content=scene_description),
    ]
    decision = await image_prompt_generator_llm.ainvoke(conversation)
    logger.info(f"Image prompt generated: {request_id}")
    return decision
src/agent/llm.py CHANGED
@@ -1,74 +1,59 @@
1
- from langchain_google_genai import ChatGoogleGenerativeAI
 
2
  import logging
 
 
3
  from config import settings
4
 
5
  logger = logging.getLogger(__name__)
6
 
7
- _google_api_keys_list = []
8
- _current_google_key_idx = 0
 
9
 
10
 
11
- def create_llm(temperature: float = settings.temperature, top_p: float = settings.top_p):
12
- global _google_api_keys_list, _current_google_key_idx
 
13
 
14
- if not _google_api_keys_list:
15
- api_keys_str = settings.gemini_api_keys.get_secret_value()
16
- if api_keys_str:
17
- _google_api_keys_list = [key.strip() for key in api_keys_str.split(',') if key.strip()]
18
-
19
- if not _google_api_keys_list:
20
- logger.error("Google API keys are not configured or are empty in settings.")
21
- raise ValueError("Google API keys are not configured or are invalid for round-robin.")
22
 
23
- if not _google_api_keys_list: # Safeguard, though previous block should handle it.
24
- logger.error("No Google API keys available for round-robin.")
25
- raise ValueError("No Google API keys available for round-robin.")
 
26
 
27
- key_index_to_use = _current_google_key_idx
28
- selected_api_key = _google_api_keys_list[key_index_to_use]
29
-
30
- _current_google_key_idx = (key_index_to_use + 1) % len(_google_api_keys_list)
31
-
32
- logger.debug(f"Using Google API key at index {key_index_to_use} (ending with ...{selected_api_key[-4:] if len(selected_api_key) > 4 else selected_api_key}) for round-robin.")
33
 
 
 
 
 
 
34
  return ChatGoogleGenerativeAI(
35
- model="gemini-2.5-flash-preview-05-20",
36
- google_api_key=selected_api_key,
37
  temperature=temperature,
38
  top_p=top_p,
39
- thinking_budget=1024
40
  )
41
 
42
 
43
  def create_light_llm(temperature: float = settings.temperature, top_p: float = settings.top_p):
44
- global _google_api_keys_list, _current_google_key_idx
45
-
46
- if not _google_api_keys_list:
47
- api_keys_str = settings.gemini_api_keys.get_secret_value()
48
- if api_keys_str:
49
- _google_api_keys_list = [key.strip() for key in api_keys_str.split(',') if key.strip()]
50
-
51
- if not _google_api_keys_list:
52
- logger.error("Google API keys are not configured or are empty in settings.")
53
- raise ValueError("Google API keys are not configured or are invalid for round-robin.")
54
-
55
- if not _google_api_keys_list: # Safeguard, though previous block should handle it.
56
- logger.error("No Google API keys available for round-robin.")
57
- raise ValueError("No Google API keys available for round-robin.")
58
-
59
- key_index_to_use = _current_google_key_idx
60
- selected_api_key = _google_api_keys_list[key_index_to_use]
61
-
62
- _current_google_key_idx = (key_index_to_use + 1) % len(_google_api_keys_list)
63
-
64
- logger.debug(f"Using Google API key at index {key_index_to_use} (ending with ...{selected_api_key[-4:] if len(selected_api_key) > 4 else selected_api_key}) for round-robin.")
65
-
66
  return ChatGoogleGenerativeAI(
67
  model="gemini-2.0-flash",
68
- google_api_key=selected_api_key,
69
  temperature=temperature,
70
  top_p=top_p
71
  )
72
 
73
- def create_precise_llm():
 
 
74
  return create_llm(temperature=0, top_p=1)
 
1
+ """Utility functions for working with the language model."""
2
+
3
  import logging
4
+ from langchain_google_genai import ChatGoogleGenerativeAI
5
+
6
  from config import settings
7
 
8
  logger = logging.getLogger(__name__)
9
 
10
+ _API_KEYS: list[str] = []
11
+ _current_key_idx = 0
12
+ MODEL_NAME = "gemini-2.5-flash-preview-05-20"
13
 
14
 
15
def _get_api_key() -> str:
    """Return an API key using round-robin selection.

    Lazily parses the comma-separated key list from settings on first use.

    Raises:
        ValueError: if no usable API keys are configured.
    """
    global _API_KEYS, _current_key_idx

    if not _API_KEYS:
        # NOTE(review): the pre-refactor code read settings.gemini_api_keys
        # (plural) — confirm the settings field was actually renamed.
        keys_str = settings.gemini_api_key.get_secret_value()
        if keys_str:
            _API_KEYS = [k.strip() for k in keys_str.split(",") if k.strip()]
        if not _API_KEYS:
            msg = "Google API keys are not configured or invalid"
            logger.error(msg)
            raise ValueError(msg)

    # Capture the index before advancing so the log names the key actually
    # returned (previously the post-increment index was logged, which
    # pointed at the *next* key).
    idx = _current_key_idx
    key = _API_KEYS[idx]
    _current_key_idx = (idx + 1) % len(_API_KEYS)
    logger.debug("Using Google API key index %s", idx)
    return key
32
 
 
 
 
 
 
 
33
 
34
def create_llm(
    temperature: float = settings.temperature,
    top_p: float = settings.top_p,
) -> ChatGoogleGenerativeAI:
    """Create a standard LLM instance."""
    # thinking_budget caps the model's internal reasoning-token spend.
    llm = ChatGoogleGenerativeAI(
        model=MODEL_NAME,
        google_api_key=_get_api_key(),
        temperature=temperature,
        top_p=top_p,
        thinking_budget=1024,
    )
    return llm
46
 
47
 
48
def create_light_llm(temperature: float = settings.temperature, top_p: float = settings.top_p) -> ChatGoogleGenerativeAI:
    """Create a lightweight LLM instance for cheap auxiliary calls.

    Uses the small "gemini-2.0-flash" model instead of MODEL_NAME; rotates
    API keys the same way as create_llm. Docstring and return annotation
    added for consistency with the sibling factory functions.
    """
    return ChatGoogleGenerativeAI(
        model="gemini-2.0-flash",
        google_api_key=_get_api_key(),
        temperature=temperature,
        top_p=top_p,
    )
55
 
56
+
57
def create_precise_llm() -> ChatGoogleGenerativeAI:
    """Return an LLM tuned for deterministic output."""
    # Zero temperature with full nucleus keeps sampling as greedy as possible.
    return create_llm(temperature=0, top_p=1)
src/agent/llm_agent.py CHANGED
@@ -5,7 +5,7 @@ import logging
5
  from agent.image_agent import ChangeScene
6
  import asyncio
7
  from agent.music_agent import generate_music_prompt
8
- from agent.image_agent import generate_scene_image
9
  import uuid
10
 
11
  logger = logging.getLogger(__name__)
@@ -57,7 +57,7 @@ async def process_user_input(input: str) -> MultiAgentResponse:
57
 
58
  music_prompt_task = generate_music_prompt(current_state, request_id)
59
 
60
- change_scene_task = generate_scene_image(current_state, request_id)
61
 
62
  music_prompt, change_scene = await asyncio.gather(music_prompt_task, change_scene_task)
63
 
 
5
  from agent.image_agent import ChangeScene
6
  import asyncio
7
  from agent.music_agent import generate_music_prompt
8
+ from agent.image_agent import generate_image_prompt
9
  import uuid
10
 
11
  logger = logging.getLogger(__name__)
 
57
 
58
  music_prompt_task = generate_music_prompt(current_state, request_id)
59
 
60
+ change_scene_task = generate_image_prompt(current_state, request_id)
61
 
62
  music_prompt, change_scene = await asyncio.gather(music_prompt_task, change_scene_task)
63
 
src/agent/llm_graph.py CHANGED
@@ -1,14 +1,144 @@
1
- from agent.tools import available_tools
2
- from agent.llm import create_llm
3
- from langgraph.graph import MessagesState
4
- class CustomState(MessagesState):
5
- """Расширенное состояние графа."""
6
 
 
 
 
 
 
 
7
 
 
 
 
 
 
 
 
 
 
 
8
 
9
- llm = create_llm().bind_tools(available_tools)
10
 
 
 
 
11
 
 
 
 
 
 
 
 
 
12
 
13
 
 
 
 
14
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ """LangGraph setup for the interactive fiction agent."""
 
 
 
 
2
 
3
+ import logging
4
+ from dataclasses import dataclass
5
+ from typing import Any, Dict, Optional
6
+ import asyncio
7
+ from langgraph.graph import END, StateGraph
8
+ from agent.image_agent import generate_image_prompt
9
 
10
+ from agent.tools import (
11
+ check_ending,
12
+ generate_scene,
13
+ generate_scene_image,
14
+ generate_story_frame,
15
+ update_state_with_choice,
16
+ )
17
+ from agent.state import get_user_state
18
+ from audio.audio_generator import change_music_tone
19
+ logger = logging.getLogger(__name__)
20
 
 
21
 
22
@dataclass
class GraphState:
    """Mutable state passed between graph nodes."""

    user_hash: Optional[str] = None  # session id; also passed as request id to sub-agents
    step: Optional[str] = None  # "start" or "choose" — selects the graph route
    setting: Optional[str] = None  # world setting (start step only)
    character: Optional[Dict[str, Any]] = None  # character info (start step only)
    genre: Optional[str] = None  # genre (start step only)
    choice_text: Optional[str] = None  # player's last choice (choose step only)
    scene: Optional[Dict[str, Any]] = None  # scene dict produced by the node
    ending: Optional[Dict[str, Any]] = None  # result of the ending check
34
 
35
 
36
async def node_entry(state: GraphState) -> GraphState:
    """Pass-through entry node; only logs the incoming state."""
    logger.debug("[Graph] entry state: %s", state)
    return state
39
 
40
+
41
def route_step(state: GraphState) -> str:
    """Map the incoming step to the graph node that should handle it."""
    routes = {"start": "init_game", "choose": "player_step"}
    target = routes.get(state.step)
    if target is None:
        # Unknown steps fall back to starting a fresh game.
        logger.warning("route_step received unknown step '%s'", state.step)
        return "init_game"
    return target
48
+
49
+
50
async def node_init_game(state: GraphState) -> GraphState:
    """Initialize a new game: story frame, first scene, and its image.

    Stores the first scene on the state and returns it for the caller.
    """
    logger.debug("[Graph] node_init_game state: %s", state)
    await generate_story_frame.ainvoke(
        {
            "user_hash": state.user_hash,
            "setting": state.setting,
            "character": state.character,
            "genre": state.genre,
        }
    )
    first_scene = await generate_scene.ainvoke(
        {"user_hash": state.user_hash, "last_choice": "start"}
    )
    # user_hash doubles as the request id for the image agent.
    change_scene = await generate_image_prompt(first_scene["description"], state.user_hash)
    # Lazy %-style args for consistency with the rest of this module's logging.
    logger.info("Change scene: %s", change_scene)
    await generate_scene_image.ainvoke(
        {
            "user_hash": state.user_hash,
            "scene_id": first_scene["scene_id"],
            "change_scene": change_scene,
        }
    )
    state.scene = first_scene
    return state
74
+
75
+
76
async def node_player_step(state: GraphState) -> GraphState:
    """Apply the player's choice, check for an ending, and build the next scene."""
    logger.debug("[Graph] node_player_step state: %s", state)
    user_state = get_user_state(state.user_hash)
    scene_id = user_state.current_scene_id

    if state.choice_text:
        await update_state_with_choice.ainvoke(
            {
                "user_hash": state.user_hash,
                "scene_id": scene_id,
                "choice_text": state.choice_text,
            }
        )

    ending = await check_ending.ainvoke({"user_hash": state.user_hash})
    state.ending = ending
    if ending.get("ending_reached", False):
        # The story is over; no further scene is generated.
        return state

    next_scene = await generate_scene.ainvoke(
        {
            "user_hash": state.user_hash,
            "last_choice": state.choice_text,
        }
    )
    change_scene = await generate_image_prompt(next_scene["description"], state.user_hash)
    # NOTE(review): user_state.assets[scene_id] raises KeyError if no image was
    # stored for the previous scene — confirm that cannot happen here.
    image_task = generate_scene_image.ainvoke(
        {
            "user_hash": state.user_hash,
            "scene_id": next_scene["scene_id"],
            "current_image": user_state.assets[scene_id],
            "change_scene": change_scene,
        }
    )
    music_task = change_music_tone(state.user_hash, next_scene["music"])
    # Image and music generation are independent, so run them concurrently.
    await asyncio.gather(image_task, music_task)
    state.scene = next_scene
    return state
110
+
111
+
112
def route_ending(state: GraphState) -> str:
    """Route to game_over when an ending was reached, otherwise finish the turn."""
    if state.ending.get("ending_reached"):
        return "game_over"
    return "continue"
114
+
115
+
116
async def node_game_over(state: GraphState) -> GraphState:
    """Terminal node: the game has ended; state passes through unchanged."""
    logger.info("[Graph] Game over for user %s", state.user_hash)
    return state
119
+
120
+
121
def build_llm_game_graph() -> StateGraph:
    """Assemble and compile the game's state graph.

    entry -> (init_game | player_step); player_step may route to game_over.
    """
    g = StateGraph(GraphState)
    for name, node in (
        ("entry", node_entry),
        ("init_game", node_init_game),
        ("player_step", node_player_step),
        ("game_over", node_game_over),
    ):
        g.add_node(name, node)

    g.set_entry_point("entry")
    g.add_conditional_edges(
        "entry",
        route_step,
        {"init_game": "init_game", "player_step": "player_step"},
    )
    g.add_edge("init_game", END)
    g.add_conditional_edges(
        "player_step",
        route_ending,
        {"game_over": "game_over", "continue": END},
    )
    g.add_edge("game_over", END)
    return g.compile()


# Module-level compiled graph used by the runner.
llm_game_graph = build_llm_game_graph()
src/agent/models.py ADDED
@@ -0,0 +1,104 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ """Pydantic models representing game state and LLM outputs."""
2
+
3
+ from typing import Dict, List, Optional, Set
4
+
5
+ from pydantic import BaseModel, Field
6
+
7
+
8
class Milestone(BaseModel):
    """Milestone that can be achieved during the story."""

    id: str           # stable identifier referenced in prompts/history
    description: str  # human-readable milestone text


class Ending(BaseModel):
    """Possible game ending."""

    id: str
    type: str  # "good" or "bad"
    condition: str  # prose condition; evaluated by the LLM ending check
    description: Optional[str] = None  # may be filled in later from the story frame


class StoryFrame(BaseModel):
    """Overall plot information generated by the LLM."""

    lore: str
    goal: str
    milestones: List[Milestone]
    endings: List[Ending]
    setting: str
    character: Dict[str, str]
    genre: str


class StoryFrameLLM(BaseModel):
    """Structure returned by the LLM for story frame generation."""

    # StoryFrame minus the player-provided fields (setting/character/genre),
    # which the caller merges in afterwards.
    lore: str
    goal: str
    milestones: List[Milestone]
    endings: List[Ending]


class SceneChoice(BaseModel):
    """User choice leading to another scene."""

    text: str
    next_scene_short_desc: str


class PlayerOption(BaseModel):
    """Option presented to the player in a scene."""

    option_description: str = Field(
        description=(
            "Description of the option, e.g. '[Say] Hello!' or "
            "'Go to the forest'"
        )
    )


class Scene(BaseModel):
    """Game scene with choices and optional assets."""

    scene_id: str
    description: str
    choices: List[SceneChoice]
    image: Optional[str] = None  # presumably a path/URL of the generated image — confirm
    music: Optional[str] = None  # presumably a music prompt/asset reference — confirm


class SceneLLM(BaseModel):
    """Structure expected from the LLM when generating a scene."""

    description: str
    choices: List[SceneChoice]


class EndingCheckResult(BaseModel):
    """Result returned from the LLM when checking for an ending."""

    ending_reached: bool = Field(default=False)
    ending: Optional[Ending] = None  # populated only when ending_reached is True


class UserChoice(BaseModel):
    """Single player choice recorded in the history."""

    scene_id: str
    choice_text: str
    timestamp: Optional[str] = None  # optional free-form timestamp; format not enforced here


class UserState(BaseModel):
    """State stored for each user."""

    story_frame: Optional[StoryFrame] = None
    current_scene_id: Optional[str] = None
    scenes: Dict[str, Scene] = Field(default_factory=dict)
    milestones_achieved: Set[str] = Field(default_factory=set)
    user_choices: List[UserChoice] = Field(default_factory=list)
    ending: Optional[Ending] = None
    assets: Dict[str, str] = Field(default_factory=dict)  # presumably scene_id -> image — verify against image tools
src/agent/music_agent.py ADDED
@@ -0,0 +1,47 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ from pydantic import BaseModel
2
+ from agent.llm import create_light_llm
3
+ from langchain_core.messages import SystemMessage, HumanMessage
4
+ import logging
5
+
6
+ logger = logging.getLogger(__name__)
7
+
8
+ music_options = """Instruments: 303 Acid Bass, 808 Hip Hop Beat, Accordion, Alto Saxophone, Bagpipes, Balalaika Ensemble, Banjo, Bass Clarinet, Bongos, Boomy Bass, Bouzouki, Buchla Synths, Cello, Charango, Clavichord, Conga Drums, Didgeridoo, Dirty Synths, Djembe, Drumline, Dulcimer, Fiddle, Flamenco Guitar, Funk Drums, Glockenspiel, Guitar, Hang Drum, Harmonica, Harp, Harpsichord, Hurdy-gurdy, Kalimba, Koto, Lyre, Mandolin, Maracas, Marimba, Mbira, Mellotron, Metallic Twang, Moog Oscillations, Ocarina, Persian Tar, Pipa, Precision Bass, Ragtime Piano, Rhodes Piano, Shamisen, Shredding Guitar, Sitar, Slide Guitar, Smooth Pianos, Spacey Synths, Steel Drum, Synth Pads, Tabla, TR-909 Drum Machine, Trumpet, Tuba, Vibraphone, Viola Ensemble, Warm Acoustic Guitar, Woodwinds, ...
9
+ Music Genre: Acid Jazz, Afrobeat, Alternative Country, Baroque, Bengal Baul, Bhangra, Bluegrass, Blues Rock, Bossa Nova, Breakbeat, Celtic Folk, Chillout, Chiptune, Classic Rock, Contemporary R&B, Cumbia, Deep House, Disco Funk, Drum & Bass, Dubstep, EDM, Electro Swing, Funk Metal, G-funk, Garage Rock, Glitch Hop, Grime, Hyperpop, Indian Classical, Indie Electronic, Indie Folk, Indie Pop, Irish Folk, Jam Band, Jamaican Dub, Jazz Fusion, Latin Jazz, Lo-Fi Hip Hop, Marching Band, Merengue, New Jack Swing, Minimal Techno, Moombahton, Neo-Soul, Orchestral Score, Piano Ballad, Polka, Post-Punk, 60s Psychedelic Rock, Psytrance, R&B, Reggae, Reggaeton, Renaissance Music, Salsa, Shoegaze, Ska, Surf Rock, Synthpop, Techno, Trance, Trap Beat, Trip Hop, Vaporwave, Witch house, ...
10
+ Mood/Description: Acoustic Instruments, Ambient, Bright Tones, Chill, Crunchy Distortion, Danceable, Dreamy, Echo, Emotional, Ethereal Ambience, Experimental, Fat Beats, Funky, Glitchy Effects, Huge Drop, Live Performance, Lo-fi, Ominous Drone, Psychedelic, Rich Orchestration, Saturated Tones, Subdued Melody, Sustained Chords, Swirling Phasers, Tight Groove, Unsettling, Upbeat, Virtuoso, Weird Noises, ...
11
+ """
12
+ system_prompt = f"""
13
+ You are a music agent responsible for generating appropriate music tones for scenes in a visual novel game.
14
+
15
+ Your task is to analyze the current scene description and generate a detailed music prompt that captures:
16
+ 1. The emotional atmosphere
17
+ 2. The intensity level
18
+ 3. The genre/style that best fits the scene
19
+ 4. Specific instruments that would enhance the mood
20
+
21
+ You have access to a wide range of musical elements including:
22
+ {music_options}
23
+
24
+ When generating a music prompt:
25
+ - Consider the scene's context, mood, and any suspense elements
26
+ - Choose instruments that complement the scene's atmosphere
27
+ - Select a genre that matches the story's setting and tone
28
+ - Include specific mood descriptors to guide the music generation
29
+
30
+ Your output should be a concise but detailed prompt that the music generation model can use to create an appropriate soundtrack for the scene.
31
+ """
32
+
33
+
34
class MusicPrompt(BaseModel):
    # Free-text prompt for the music generation model.
    prompt: str


llm = create_light_llm(0.1).with_structured_output(MusicPrompt)


async def generate_music_prompt(scene_description: str, request_id: str) -> str:
    """Produce a music-generation prompt matching the given scene description."""
    logger.info(f"Generating music prompt for the current scene: {request_id}")
    conversation = [
        SystemMessage(content=system_prompt),
        HumanMessage(content=scene_description),
    ]
    result = await llm.ainvoke(conversation)
    logger.info(f"Music prompt generated: {request_id}")
    return result.prompt
src/agent/prompts.py ADDED
@@ -0,0 +1,38 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
# Prompt templates for the story-frame / scene / ending-check LLM calls.
# Fixed garbled translation instructions ("a langueage of setting language",
# "a language of lore language") so the model receives coherent English.

STORY_FRAME_PROMPT = """
You are a narrative game designer. Use the player data below to
create a story frame for an interactive adventure.
Setting: {setting}
Character: {character}
Genre: {genre}
Return ONLY a JSON object with:
- lore: brief world description
- goal: main player objective
- milestones: 2-4 key events (id, description)
- endings: good/bad endings (id, type, condition, description)
Translate the lore, goal, milestones and endings into
the language of the setting.
"""

SCENE_PROMPT = """
Using the provided lore and history, generate the next scene.
Lore: {lore}
Goal: {goal}
Milestones: {milestones}
Endings: {endings}
History: {history}
Last choice: {last_choice}
Respond ONLY with JSON containing:
- description: short summary of the scene
- choices: exactly two dicts {{"text": ..., "next_scene_short_desc": ...}}
Translate the scene description and choices into the language of the lore.
"""

ENDING_CHECK_PROMPT = """
History: {history}
Endings: {endings}
Check if any ending conditions are met.
If none are met return ending_reached: false.
If an ending is reached return ending_reached: true and provide the
ending object (id, type, description).
Respond ONLY with JSON.
"""
src/agent/runner.py ADDED
@@ -0,0 +1,67 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ """Entry point for executing a graph step."""
2
+
3
+ import logging
4
+ from dataclasses import asdict
5
+ from typing import Dict, Optional
6
+
7
+ from agent.llm_graph import GraphState, llm_game_graph
8
+ from agent.models import UserState
9
+ from agent.state import get_user_state
10
+
11
logger = logging.getLogger(__name__)


async def process_step(
    user_hash: str,
    step: str,
    setting: Optional[str] = None,
    character: Optional[dict] = None,
    genre: Optional[str] = None,
    choice_text: Optional[str] = None,
) -> Dict:
    """Run one interaction step through the graph.

    Args:
        user_hash: Session identifier for the player.
        step: "start" (new game; requires setting/character/genre) or
            "choose" (requires choice_text).

    Returns:
        A dict with either an "ending" (game over) or the current "scene",
        plus a "game_over" flag.

    Raises:
        ValueError: if required parameters for the given step are missing.
    """
    logger.info("[Runner] Step %s for user %s", step, user_hash)

    # Validate before building any graph state. Explicit raises instead of
    # assert so the checks survive `python -O`.
    if step == "start" and not (setting and character and genre):
        raise ValueError("Missing start parameters")
    if step == "choose" and not choice_text:
        raise ValueError("choice_text is required")

    graph_state = GraphState(user_hash=user_hash, step=step)
    if step == "start":
        graph_state.setting = setting
        graph_state.character = character
        graph_state.genre = genre
    elif step == "choose":
        graph_state.choice_text = choice_text

    final_state = await llm_game_graph.ainvoke(asdict(graph_state))

    user_state: UserState = get_user_state(user_hash)
    response: Dict = {}

    ending = final_state.get("ending")
    if ending and ending.get("ending_reached"):
        ending_info = ending["ending"]
        # Backfill the ending description from the story frame when the
        # LLM's ending-check result omitted it.
        if not ending_info.get("description") and user_state.story_frame:
            for e in user_state.story_frame.endings:
                if e.id == ending_info.get("id"):
                    ending_info["description"] = e.description
                    break
        response["ending"] = ending_info
        response["game_over"] = True
    else:
        if (
            user_state.current_scene_id
            and user_state.current_scene_id in user_state.scenes
        ):
            # NOTE(review): .dict() is deprecated in pydantic v2; the file
            # mixes .dict() and model_dump() — confirm the pydantic version.
            current_scene = user_state.scenes[user_state.current_scene_id].dict()
        else:
            current_scene = final_state.get("scene")
        response["scene"] = current_scene
        response["game_over"] = False

    return response
src/agent/state.py ADDED
@@ -0,0 +1,24 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ """Simple in-memory user state storage."""
2
+
3
+ from typing import Dict
4
+
5
+ from agent.models import UserState
6
+
7
_USER_STATE: Dict[str, UserState] = {}


def get_user_state(user_hash: str) -> UserState:
    """Fetch the state for *user_hash*, creating a fresh one on first access."""
    state = _USER_STATE.get(user_hash)
    if state is None:
        state = UserState()
        _USER_STATE[user_hash] = state
    return state


def set_user_state(user_hash: str, state: UserState) -> None:
    """Store *state* under *user_hash*."""
    _USER_STATE[user_hash] = state


def reset_user_state(user_hash: str) -> None:
    """Discard any stored progress for *user_hash* and start clean."""
    _USER_STATE[user_hash] = UserState()
src/agent/tools.py CHANGED
@@ -1,40 +1,172 @@
1
- from langchain_core.tools import tool
2
- from typing import Annotated, Any, Dict, List
3
- from images.image_generator import generate_image
4
- from langgraph.prebuilt import InjectedState
5
  import logging
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
6
 
7
  logger = logging.getLogger(__name__)
8
 
 
9
  def _err(msg: str) -> str:
10
  logger.error(msg)
11
- return f"{{ 'error': '{msg}' }}"
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
12
 
13
- def _success(msg: str) -> str:
14
- logger.info(msg)
15
- return f"{{ 'success': '{msg}' }}"
16
 
17
  @tool
18
  async def generate_scene_image(
19
- prompt: Annotated[
20
- str,
21
- "The prompt to generate an image from"
22
- ],
23
- state: InjectedState,
24
- ) -> Annotated[
25
- str,
26
- "The path to the generated image"
27
- ]:
28
- """
29
- Generate an image from a prompt and set current scene image.
30
- """
31
  try:
32
- image_path, img_description = generate_image(prompt)
33
- state["current_scene"]["image"] = image_path
34
- state["current_scene"]["image_description"] = img_description
35
- return _success(f"Image generated and set as current scene image: {img_description}")
36
- except Exception as e:
37
- return _err(str(e))
38
-
39
-
40
- available_tools = [generate_scene_image]
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ """LLM tools used by the game graph."""
2
+
 
 
3
  import logging
4
+ import uuid
5
+ from typing import Annotated, Dict
6
+
7
+ from langchain_core.tools import tool
8
+
9
+ from agent.llm import create_llm
10
+ from agent.models import (
11
+ EndingCheckResult,
12
+ Scene,
13
+ SceneChoice,
14
+ SceneLLM,
15
+ StoryFrame,
16
+ StoryFrameLLM,
17
+ UserChoice,
18
+ )
19
+ from agent.prompts import ENDING_CHECK_PROMPT, SCENE_PROMPT, STORY_FRAME_PROMPT
20
+ from agent.state import get_user_state, set_user_state
21
+ from images.image_generator import modify_image, generate_image
22
+ from agent.image_agent import ChangeScene
23
 
24
  logger = logging.getLogger(__name__)
25
 
26
+
27
def _err(msg: str) -> str:
    """Log *msg* as an error and return it wrapped in an error payload string."""
    # NOTE(review): single quotes make this invalid strict JSON — confirm
    # consumers treat the return value as opaque text only.
    logger.error(msg)
    return f"{{'error': '{msg}'}}"
30
+
31
+
32
@tool
async def generate_story_frame(
    user_hash: Annotated[str, "User session ID"],
    setting: Annotated[str, "Game world setting"],
    character: Annotated[Dict[str, str], "Character info"],
    genre: Annotated[str, "Genre"],
) -> Annotated[Dict, "Generated story frame"]:
    """Create the initial story frame and store it in user state."""
    llm = create_llm().with_structured_output(StoryFrameLLM)
    rendered_prompt = STORY_FRAME_PROMPT.format(
        setting=setting,
        character=character,
        genre=genre,
    )
    frame_llm: StoryFrameLLM = await llm.ainvoke(rendered_prompt)
    # Merge the LLM output with the player-provided fields.
    story_frame = StoryFrame(
        lore=frame_llm.lore,
        goal=frame_llm.goal,
        milestones=frame_llm.milestones,
        endings=frame_llm.endings,
        setting=setting,
        character=character,
        genre=genre,
    )
    user_state = get_user_state(user_hash)
    user_state.story_frame = story_frame
    set_user_state(user_hash, user_state)
    # NOTE(review): .dict() is deprecated under pydantic v2 — confirm version.
    return story_frame.dict()
60
+
61
+
62
@tool
async def generate_scene(
    user_hash: Annotated[str, "User session ID"],
    last_choice: Annotated[str, "Last user choice"],
) -> Annotated[Dict, "Generated scene"]:
    """Generate a new scene based on the current user state.

    Builds the scene prompt from the story frame plus the full choice
    history, asks the LLM for a structured ``SceneLLM``, and stores the
    resulting ``Scene`` as the current one. Retries once when the model
    returns fewer than two choices, and fails loudly if the retry also
    comes up short (the UI needs exactly two options).
    """
    state = get_user_state(user_hash)
    if not state.story_frame:
        return _err("Story frame not initialized")
    llm = create_llm().with_structured_output(SceneLLM)
    prompt = SCENE_PROMPT.format(
        lore=state.story_frame.lore,
        goal=state.story_frame.goal,
        milestones=",".join(m.id for m in state.story_frame.milestones),
        endings=",".join(e.id for e in state.story_frame.endings),
        history="; ".join(
            f"{c.scene_id}:{c.choice_text}" for c in state.user_choices
        ),
        last_choice=last_choice,
    )
    resp: SceneLLM = await llm.ainvoke(prompt)
    if len(resp.choices) < 2:
        # One retry with an explicit instruction; the model occasionally
        # returns a single choice.
        resp = await llm.ainvoke(
            prompt + "\nThe scene must contain exactly two choices."
        )
    if len(resp.choices) < 2:
        # Still short after the retry: report an error instead of
        # silently producing an unplayable scene with < 2 options.
        return _err("LLM did not return two scene choices")
    scene_id = str(uuid.uuid4())
    choices = [
        SceneChoice(**ch.model_dump())
        if hasattr(ch, "model_dump")
        else SceneChoice(**ch)
        for ch in resp.choices[:2]
    ]
    scene = Scene(
        scene_id=scene_id,
        description=resp.description,
        choices=choices,
        image=None,
        music=None,
    )
    state.current_scene_id = scene_id
    state.scenes[scene_id] = scene
    set_user_state(user_hash, state)
    # Pydantic v2 deprecates .dict(); prefer model_dump() when available.
    return scene.model_dump() if hasattr(scene, "model_dump") else scene.dict()
103
 
 
 
 
104
 
105
@tool
async def generate_scene_image(
    user_hash: Annotated[str, "User session ID"],
    scene_id: Annotated[str, "Scene ID"],
    change_scene: Annotated[ChangeScene, "Prompt for image generation"],
    current_image: Annotated[str, "Current image"] | None = None,
) -> Annotated[str, "Path to generated image"]:
    """Generate an image for a scene and save the path in the state.

    When the decision is "change_completely" or "modify", a new image is
    produced; otherwise the current image path is kept as-is. Either way
    the resulting path is written back onto the scene and the asset map.
    """
    try:
        new_path = current_image
        if change_scene.change_scene in ("change_completely", "modify"):
            if current_image is None:
                pending = generate_image(change_scene.scene_description)
            else:
                # Modify the existing image rather than regenerating from
                # scratch, so the update keeps the same visual style.
                pending = modify_image(
                    current_image, change_scene.scene_description
                )
            new_path, _ = await pending
        state = get_user_state(user_hash)
        if scene_id in state.scenes:
            state.scenes[scene_id].image = new_path
        state.assets[scene_id] = new_path
        set_user_state(user_hash, state)
        return new_path
    except Exception as exc:  # noqa: BLE001
        return _err(str(exc))
130
+
131
+
132
@tool
async def update_state_with_choice(
    user_hash: Annotated[str, "User session ID"],
    scene_id: Annotated[str, "Scene ID"],
    choice_text: Annotated[str, "Chosen option"],
) -> Annotated[Dict, "Updated state"]:
    """Record the player's choice (with a UTC timestamp) in the state."""
    from datetime import datetime, timezone

    state = get_user_state(user_hash)
    state.user_choices.append(
        UserChoice(
            scene_id=scene_id,
            choice_text=choice_text,
            # datetime.utcnow() is deprecated and returns a naive value;
            # use an explicitly timezone-aware UTC timestamp instead.
            timestamp=datetime.now(timezone.utc).isoformat(),
        )
    )
    set_user_state(user_hash, state)
    # Pydantic v2 deprecates .dict(); prefer model_dump() when available.
    return state.model_dump() if hasattr(state, "model_dump") else state.dict()
151
+
152
+
153
@tool
async def check_ending(
    user_hash: Annotated[str, "User session ID"],
) -> Annotated[Dict, "Ending check result"]:
    """Check whether the choice history satisfies any ending condition.

    Asks the LLM to compare the user's accumulated choices against the
    story frame's ending conditions; persists the ending on the state
    when one is reached and reports the outcome to the caller.
    """
    state = get_user_state(user_hash)
    if not state.story_frame:
        return _err("No story frame")
    llm = create_llm().with_structured_output(EndingCheckResult)
    history = "; ".join(f"{c.scene_id}:{c.choice_text}" for c in state.user_choices)
    prompt = ENDING_CHECK_PROMPT.format(
        history=history,
        endings=",".join(f"{e.id}:{e.condition}" for e in state.story_frame.endings),
    )
    resp: EndingCheckResult = await llm.ainvoke(prompt)
    if resp.ending_reached and resp.ending:
        state.ending = resp.ending
        set_user_state(user_hash, state)
        # Pydantic v2 deprecates .dict(); prefer model_dump() when available.
        ending_payload = (
            resp.ending.model_dump()
            if hasattr(resp.ending, "model_dump")
            else resp.ending.dict()
        )
        return {"ending_reached": True, "ending": ending_payload}
    return {"ending_reached": False}
src/config.py CHANGED
@@ -1,16 +1,18 @@
1
  from dotenv import load_dotenv
2
-
3
- load_dotenv()
4
  from pydantic_settings import BaseSettings
5
  import logging
6
  from pydantic import SecretStr
7
 
 
 
 
8
  logging.basicConfig(
9
  level=logging.INFO,
10
  format="%(levelname)s:\t%(asctime)s [%(name)s] %(message)s",
11
  datefmt="%Y-%m-%d %H:%M:%S %z",
12
  )
13
 
 
14
  class BaseAppSettings(BaseSettings):
15
  """Base settings class with common configuration."""
16
 
@@ -18,7 +20,8 @@ class BaseAppSettings(BaseSettings):
18
  env_file = ".env"
19
  env_file_encoding = "utf-8"
20
  extra = "ignore"
21
-
 
22
  class AppSettings(BaseAppSettings):
23
  gemini_api_key: SecretStr
24
  gemini_api_keys: SecretStr
 
1
  from dotenv import load_dotenv
 
 
2
  from pydantic_settings import BaseSettings
3
  import logging
4
  from pydantic import SecretStr
5
 
6
+ load_dotenv()
7
+
8
+
9
  logging.basicConfig(
10
  level=logging.INFO,
11
  format="%(levelname)s:\t%(asctime)s [%(name)s] %(message)s",
12
  datefmt="%Y-%m-%d %H:%M:%S %z",
13
  )
14
 
15
+
16
  class BaseAppSettings(BaseSettings):
17
  """Base settings class with common configuration."""
18
 
 
20
  env_file = ".env"
21
  env_file_encoding = "utf-8"
22
  extra = "ignore"
23
+
24
+
25
  class AppSettings(BaseAppSettings):
26
  gemini_api_key: SecretStr
27
  gemini_api_keys: SecretStr
src/css.py CHANGED
@@ -118,6 +118,14 @@ img {
118
  display: none !important;
119
  }
120
 
 
 
 
 
 
 
 
 
121
  /* Make form element transparent */
122
  .overlay-content .form {
123
  background: transparent !important;
@@ -144,4 +152,4 @@ loading_css_styles = """
144
  font-size: 2em;
145
  text-align: center;
146
  }
147
- """
 
118
  display: none !important;
119
  }
120
 
121
+ /* Position the back button in the top-right corner */
122
+ #back-btn {
123
+ position: fixed !important;
124
+ top: 10px !important;
125
+ right: 10px !important;
126
+ z-index: 20 !important;
127
+ }
128
+
129
  /* Make form element transparent */
130
  .overlay-content .form {
131
  background: transparent !important;
 
152
  font-size: 2em;
153
  text-align: center;
154
  }
155
+ """
src/game_constructor.py CHANGED
@@ -5,6 +5,8 @@ from game_setting import Character, GameSetting, get_user_story
5
  from game_state import story, state, get_current_scene
6
  from agent.llm_agent import process_user_input
7
  from images.image_generator import generate_image
 
 
8
  from audio.audio_generator import start_music_generation
9
  import asyncio
10
  from config import settings
@@ -144,60 +146,22 @@ async def start_game_with_settings(
144
  )
145
 
146
  game_setting = GameSetting(character=character, setting=setting_desc, genre=genre)
147
-
148
- # Initialize the game story with the custom settings
149
- initial_story = f"""Welcome to your story, {game_setting.character.name}!
150
-
151
- Setting: {game_setting.setting}
152
-
153
- You are {game_setting.character.name}, a {game_setting.character.age}-year-old character. {game_setting.character.background}
154
-
155
- Your personality: {game_setting.character.personality}
156
-
157
- Genre: {game_setting.genre}
158
-
159
- You find yourself at the beginning of your adventure. The world around you feels alive with possibilities. What do you choose to do first?
160
-
161
- NOTE FOR THE ASSISTANT: YOU HAVE TO GENERATE A NEW IMAGE FOR THE START SCENE.
162
- """
163
-
164
- response = await process_user_input(initial_story)
165
-
166
- music_tone = response.music_prompt
167
-
168
- asyncio.create_task(start_music_generation(user_hash, music_tone))
169
-
170
- img = "forest.jpg"
171
- img_description = ""
172
-
173
- img_path, img_description = await generate_image(
174
- response.change_scene.scene_description
175
  )
176
- if img_path:
177
- img = img_path
178
-
179
- story["start"] = {
180
- "text": response.game_message,
181
- "image": img,
182
- "choices": {
183
- option.option_description: asyncio.create_task(
184
- process_user_input(
185
- get_user_story(
186
- response.game_message,
187
- response.change_scene.scene_description,
188
- option.option_description,
189
- )
190
- )
191
- ) if settings.pregenerate_next_scene else None
192
- for option in response.player_options
193
- },
194
- "music_tone": response.music_prompt,
195
- "img_description": img_description,
196
- }
197
- state["scene"] = "start"
198
 
199
- # Get the current scene data
200
- scene_text, scene_image, scene_choices = get_current_scene()
 
 
201
 
202
  return (
203
  gr.update(visible=False), # loading indicator
 
5
  from game_state import story, state, get_current_scene
6
  from agent.llm_agent import process_user_input
7
  from images.image_generator import generate_image
8
+ from game_setting import Character, GameSetting
9
+ from agent.runner import process_step
10
  from audio.audio_generator import start_music_generation
11
  import asyncio
12
  from config import settings
 
146
  )
147
 
148
  game_setting = GameSetting(character=character, setting=setting_desc, genre=genre)
149
+
150
+ asyncio.create_task(start_music_generation(user_hash, "neutral"))
151
+
152
+ # Run the LLM graph to initialize the story
153
+ result = await process_step(
154
+ user_hash=user_hash,
155
+ step="start",
156
+ setting=game_setting.setting,
157
+ character=game_setting.character.model_dump(),
158
+ genre=game_setting.genre,
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
159
  )
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
160
 
161
+ scene = result["scene"]
162
+ scene_text = scene["description"]
163
+ scene_image = scene.get("image", "")
164
+ scene_choices = [ch["text"] for ch in scene.get("choices", [])]
165
 
166
  return (
167
  gr.update(visible=False), # loading indicator
src/main.py CHANGED
@@ -2,14 +2,13 @@ import gradio as gr
2
  from css import custom_css, loading_css_styles
3
  from audio.audio_generator import (
4
  update_audio,
5
- change_music_tone,
6
  cleanup_music_session,
7
  )
8
  import logging
9
  from agent.llm_agent import process_user_input
10
  from images.image_generator import modify_image
 
11
  import uuid
12
- from game_state import story, state
13
  from game_constructor import (
14
  SETTING_SUGGESTIONS,
15
  CHARACTER_SUGGESTIONS,
@@ -25,80 +24,49 @@ from config import settings
25
  logger = logging.getLogger(__name__)
26
 
27
 
28
- def return_to_constructor():
29
- """Return to the game constructor interface, ensure loading is hidden."""
 
 
 
 
 
 
30
  return (
31
  gr.update(visible=False), # loading_indicator
32
  gr.update(visible=True), # constructor_interface
33
  gr.update(visible=False), # game_interface
34
  gr.update(visible=False), # error_message
 
35
  )
36
 
37
 
38
  async def update_scene(user_hash: str, choice):
39
  logger.info(f"Updating scene with choice: {choice}")
40
- if isinstance(choice, str):
41
- old_scene = state["scene"]
42
- new_scene = str(uuid.uuid4())
43
- story[new_scene] = {
44
- **story[old_scene],
45
- }
46
- state["scene"] = new_scene
47
-
48
- user_story = get_user_story(
49
- story[old_scene]["text"], story[old_scene]["img_description"], choice
50
- )
51
-
52
- response = await (
53
- story[old_scene]["choices"][choice] or process_user_input(user_story)
54
- )
55
 
56
- story[new_scene]["text"] = response.game_message
57
-
58
- story[new_scene]["choices"] = {
59
- option.option_description: asyncio.create_task(
60
- process_user_input(
61
- get_user_story(
62
- response.game_message,
63
- response.change_scene.scene_description,
64
- option.option_description,
65
- )
66
- )
67
- )
68
- if settings.pregenerate_next_scene
69
- else None
70
- for option in response.player_options
71
- }
72
-
73
- img_task = None
74
- # always modify the image to avoid hallucinations in which image is being generated in entirely different style
75
- if (
76
- response.change_scene.change_scene == "change_completely"
77
- or response.change_scene.change_scene == "modify"
78
- ):
79
- img_task = modify_image(
80
- story[old_scene]["image"], response.change_scene.scene_description
81
- )
82
- else:
83
- img_task = asyncio.sleep(0)
84
 
85
- # run both tasks in parallel
86
- img_res, _ = await asyncio.gather(
87
- img_task, change_music_tone(user_hash, response.music_prompt)
 
 
 
 
88
  )
89
 
90
- if img_res and response.change_scene.change_scene:
91
- img_path, img_description = img_res
92
- if img_path:
93
- story[new_scene]["image"] = img_path
94
- story[new_scene]["img_description"] = img_description
95
-
96
- scene = story[state["scene"]]
97
  return (
98
- scene["text"],
99
- scene["image"],
100
  gr.Radio(
101
- choices=scene["choices"],
102
  label="What do you choose?",
103
  value=None,
104
  elem_classes=["choice-buttons"],
@@ -261,8 +229,12 @@ with gr.Blocks(
261
  gr.Markdown("# 🎮 Your Interactive Story")
262
 
263
  with gr.Row():
264
- back_btn = gr.Button("⬅️ Back to Constructor", variant="secondary")
265
  gr.Markdown("### Playing your custom game!")
 
 
 
 
 
266
 
267
  # Audio component for background music
268
  audio_out = gr.Audio(
@@ -349,12 +321,13 @@ with gr.Blocks(
349
 
350
  back_btn.click(
351
  fn=return_to_constructor,
352
- inputs=[],
353
  outputs=[
354
  loading_indicator,
355
  constructor_interface,
356
  game_interface,
357
  error_message,
 
358
  ],
359
  )
360
 
 
2
  from css import custom_css, loading_css_styles
3
  from audio.audio_generator import (
4
  update_audio,
 
5
  cleanup_music_session,
6
  )
7
  import logging
8
  from agent.llm_agent import process_user_input
9
  from images.image_generator import modify_image
10
+ from agent.runner import process_step
11
  import uuid
 
12
  from game_constructor import (
13
  SETTING_SUGGESTIONS,
14
  CHARACTER_SUGGESTIONS,
 
24
  logger = logging.getLogger(__name__)
25
 
26
 
27
async def return_to_constructor(user_hash: str):
    """Leave the game view and return to the constructor screen.

    Wipes the server-side session state, stops the music session, and
    issues a fresh session hash so the next game cannot pick up stale
    state from the previous run.
    """
    from agent.state import reset_user_state

    reset_user_state(user_hash)
    await cleanup_music_session(user_hash)
    fresh_hash = str(uuid.uuid4())
    return (
        gr.update(visible=False),     # loading_indicator
        gr.update(visible=True),      # constructor_interface
        gr.update(visible=False),     # game_interface
        gr.update(visible=False),     # error_message
        gr.update(value=fresh_hash),  # local_storage
    )
42
 
43
 
44
  async def update_scene(user_hash: str, choice):
45
  logger.info(f"Updating scene with choice: {choice}")
46
+ if not isinstance(choice, str):
47
+ return gr.update(), gr.update(), gr.update()
 
 
 
 
 
 
 
 
 
 
 
 
 
48
 
49
+ result = await process_step(
50
+ user_hash=user_hash,
51
+ step="choose",
52
+ choice_text=choice,
53
+ )
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
54
 
55
+ if result.get("game_over"):
56
+ ending = result["ending"]
57
+ ending_text = ending.get("description") or ending.get("condition", "")
58
+ return (
59
+ gr.update(value=ending_text),
60
+ gr.update(value=None),
61
+ gr.Radio(choices=[], label="", value=None),
62
  )
63
 
64
+ scene = result["scene"]
 
 
 
 
 
 
65
  return (
66
+ scene["description"],
67
+ scene.get("image", ""),
68
  gr.Radio(
69
+ choices=[ch["text"] for ch in scene.get("choices", [])],
70
  label="What do you choose?",
71
  value=None,
72
  elem_classes=["choice-buttons"],
 
229
  gr.Markdown("# 🎮 Your Interactive Story")
230
 
231
  with gr.Row():
 
232
  gr.Markdown("### Playing your custom game!")
233
+ back_btn = gr.Button(
234
+ "⬅️ Back to Constructor",
235
+ variant="secondary",
236
+ elem_id="back-btn",
237
+ )
238
 
239
  # Audio component for background music
240
  audio_out = gr.Audio(
 
321
 
322
  back_btn.click(
323
  fn=return_to_constructor,
324
+ inputs=[local_storage],
325
  outputs=[
326
  loading_indicator,
327
  constructor_interface,
328
  game_interface,
329
  error_message,
330
+ local_storage,
331
  ],
332
  )
333