Spaces:
Configuration error
Configuration error
File size: 15,958 Bytes
aedcb93 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 | import os
import random
import re
from typing import Optional, Tuple
from langchain_ollama import ChatOllama
import requests
from .base import AgentLike
from ..utils.parsing import (
extract_legal_moves, slice_board_and_moves, strip_think, MOVE_RE, extract_forbidden
)
# I seperated Prompts from the code
from ..prompts import PromptPack, get_prompt_pack
# 🧩 Import strategies
from ..strategies.base import Strategy
from ..strategies.aggressive_strategy import AggressiveStrategy
from ..strategies.defensive_strategy import DefensiveStrategy
from ..strategies.random_move import RandomStrategy
class OllamaAgent(AgentLike):
def __init__(
self,
model_name: str,
system_prompt: Optional[str] = None,
host: Optional[str] = None,
prompt_pack: Optional[PromptPack | str] = None,
strategy: Optional[Strategy] = None,
**kwargs,
):
self.model_name = model_name
self.STRATEGIC_GUIDANCE = """
You are a skilled Stratego player.
You must choose the SINGLE best legal move from the given board, legal moves, forbidden moves, and move history.
GENERAL RULES:
1. Output EXACTLY ONE MOVE in the form [A0 B0].
2. NEVER output explanations, commentary, or reasoning.
3. Try to choose a move that would be legal in Stratego rules.
4. NEVER repeat a previous move unless it creates a tactical advantage (capture, reveal, escape).
5. AVOID back-and-forth oscillations (e.g., A5->A6 then A6->A5).
6. It would be considered a SERIOUS MISTAKE, which leads you to lose the game, to attempt illegal moves such as moving a Flag or Bomb, moving in an impossible way, moving upon its own pieces, or trying to move opponent's pieces.
STRATEGIC PRINCIPLES:
1. Avoid random or pointless shuffling of pieces.
2. Prefer moves that improve board position, uncover information, or apply pressure.
3. Avoid moving high-value officers (Marshal, General, Colonel) blindly into unknown pieces.
4. Prefer advancing Scouts for reconnaissance.
5. Avoid moving bombs unless revealed and forced.
6. Do NOT walk pieces next to the same unknown piece repeatedly without purpose.
7. Do NOT afraid to sacrifice low-rank pieces for information gain.
CAPTURE & SAFETY RULES:
1. If you can capture a known weaker enemy piece safely, prefer that move.
2. NEVER attack a higher-ranked or unknown piece with a valuable piece unless strategically justified.
3. If the enemy piece is revealed as weaker, press the advantage.
4. If your piece is threatened, retreat or reposition instead of repeating the last move.
USE OF HISTORY:
1. Avoid repeating cycles recognized in the history (e.g., A->B->A->B).
2. Track revealed enemy pieces from history and use rank knowledge:
- If they moved, they are not Bombs or Flags.
- If they captured, infer their rank and avoid attacking with weaker pieces.
3. If an enemy repeatedly retreats from your piece, continue safe pressure.
POSITIONING RULES:
1. Advance pieces that have strategic value while keeping your formation stable.
2. Keep bombs guarding high-value territory; avoid unnecessary bomb movement.
3. Push on flanks where the opponent retreats often.
4. Maintain escape squares for your high-ranking leaders.
ENDGAME LOGIC:
1. Prioritize discovering and attacking the opponent's flag location.
2. Secure safe paths for Miners to remove bombs.
3. In endgame, prioritize mobility and avoid blockades caused by your own pieces.
CHOOSE THE BEST MOVE:
Evaluate all legal moves and pick the one that:
- improves position, OR
- pressures an opponent safely, OR
- increases information, OR
- avoids known traps or loops, OR
- ensures safety of valuable pieces.
Output ONLY one legal move in the exact format [A0 B0]. Nothing else.
"""
# self.VALIDATION_GUIDANCE = """
# You are validating a Stratego move. Decide if the move obeys Stratego rules given the board and history.
# Rules to enforce:
# - Pieces cannot move into lakes or off-board.
# - Immovable pieces (Bomb, Flag) cannot move.
# - A piece cannot capture its own piece.
# - Only Scouts can move more than one square in straight lines; others move exactly one square orthogonally.
# - No diagonal movement.
# - Respect revealed information from history (if it moved before, it is not a Bomb/Flag).
# - If an 'Available Moves:' list is present, moves not in that list are almost always invalid.
# - If a 'FORBIDDEN' list is present, those moves are invalid.
# - On small custom boards (size <= 5), there are NO lakes unless the board explicitly shows '~'. If you do not see '~', assume no lakes exist.
# Respond with either:
# - VALID
# - INVALID: <short reason>
# """
if isinstance(prompt_pack, str) or prompt_pack is None:
self.prompt_pack: PromptPack = get_prompt_pack(prompt_pack)
else:
self.prompt_pack = prompt_pack
if system_prompt is not None:
self.system_prompt = system_prompt
else:
# if there is already an existing updated prompt, we use that one
prompt_path = os.path.join(os.path.dirname(__file__), "..", "prompts", "current_prompt.txt")
if os.path.exists(prompt_path):
with open(prompt_path, "r", encoding="utf-8") as f:
self.system_prompt = f.read()
else:
self.system_prompt = self.prompt_pack.system
self.initial_prompt = self.system_prompt
# Setup Ollama client
base_url = host or os.getenv("OLLAMA_HOST", "http://localhost:11434")
model_kwargs = {
"temperature": kwargs.pop("temperature", 0.1),
"top_p": kwargs.pop("top_p", 0.9),
"repeat_penalty": kwargs.pop("repeat_penalty", 1.05),
"num_predict": kwargs.pop("num_predict", 24),
**kwargs,
}
# Only print connection message if explicitly enabled (for CLI use, not web UI)
# print("🚀 Connecting to Ollama at:", base_url)
self.client = ChatOllama(model=model_name, base_url=base_url, model_kwargs=model_kwargs)
# Simple move history tracking
self.move_history = []
self.player_id = None
def set_move_history(self, history):
"""Set the recent move history for this agent."""
self.move_history = history
# def _validate_move(self, context: str, move: str) -> Tuple[bool, str]:
# """Ask the LLM to self-check legality based on board + history."""
# prompt = (
# self.VALIDATION_GUIDANCE
# + "\n\nBOARD + HISTORY CONTEXT:\n"
# + context
# + f"\n\nCANDIDATE MOVE: {move}\nRespond strictly with VALID or INVALID and a reason."
# )
# verdict = self._llm_once(prompt)
# if not verdict:
# return False, "empty validation response"
# verdict_upper = verdict.strip().upper()
# if verdict_upper.startswith("VALID"):
# return True, ""
# if verdict_upper.startswith("INVALID"):
# reason = verdict.split(":", 1)[1].strip() if ":" in verdict else "marked invalid"
# return False, reason
# return False, f"unrecognized verdict: {verdict[:60]}"
# Run one LLM call
def _llm_once(self, prompt: str) -> str:
"""Send request directly to Ollama REST API (fixes Windows LangChain bug)."""
try:
response = requests.post(
"http://localhost:11434/api/generate",
json={
"model": self.model_name,
"prompt": prompt,
"stream": False
},
timeout=300
)
if response.status_code == 200:
data = response.json()
return (data.get("response") or "").strip()
else:
print(f"Ollama returned HTTP {response.status_code}: {response.text}")
return ""
except Exception as e:
print(f"Ollama request failed: {e}")
return ""
def __call__(self, observation: str) -> str:
# Build context
slim = slice_board_and_moves(observation)
available_moves = set(extract_legal_moves(observation))
forbidden_moves = set(extract_forbidden(observation))
prompt_history_lines = []
for line in observation.splitlines():
if line.startswith("Turn ") or "played[" in line:
prompt_history_lines.append(line)
history = "\n".join(prompt_history_lines)
full_context = slim + ("\n\nMOVE HISTORY:\n" + history if history else "")
def _detect_board_size(obs: str) -> Optional[int]:
"""Infer board size from numeric header (e.g., '0 1 2 3')."""
header_re = re.compile(r"^\s*0(\s+\d+)+\s*$")
lines = obs.splitlines()
for i in range(len(lines) - 1, -1, -1):
if header_re.match(lines[i].strip()):
nums = [int(n) for n in lines[i].split() if n.isdigit()]
if nums:
return max(nums) + 1
return None
def _build_board_map(obs: str) -> dict[str, str]:
size_local = _detect_board_size(obs)
if not size_local:
return {}
lines = obs.splitlines()
header_idx = None
header_re = re.compile(r"^\s*0(\s+\d+)+\s*$")
for i in range(len(lines)):
if header_re.match(lines[i].strip()):
header_idx = i
break
if header_idx is None:
return {}
board_map: dict[str, str] = {}
# Expect size_local lines after header
for r in range(size_local):
line_idx = header_idx + 1 + r
if line_idx >= len(lines):
break
parts = lines[line_idx].split()
if not parts:
continue
row_label = parts[0]
cells = parts[1:]
if len(cells) < size_local:
continue
for c in range(size_local):
pos = f"{row_label.upper()}{c}"
board_map[pos] = cells[c]
return board_map
board_map = _build_board_map(observation)
# >>> THE CRITICAL FIX <<<
guidance = (
self.STRATEGIC_GUIDANCE
+ "\n\n"
+ self.prompt_pack.guidance(full_context)
)
recent_moves = set()
if len(self.move_history) >= 2:
recent_moves = {m["move"] for m in self.move_history[-2:]}
last_error = None
last_raw: str = ""
invalid_memory = []
BARE_MOVE_RE = re.compile(r"\b([A-Z]\d+)\s+([A-Z]\d+)\b")
def _extract_move(raw: str):
m = MOVE_RE.search(raw or "")
if m:
return m.group(0)
m2 = BARE_MOVE_RE.search(raw or "")
if m2:
return f"[{m2.group(1)} {m2.group(2)}]"
return None
# generation + self-validation loop (4 attempts max)
for attempt in range(4):
decorated_guidance = guidance
if invalid_memory:
decorated_guidance += "\n\nPreviously invalid moves (avoid these):\n" + "\n".join(invalid_memory)
raw = self._llm_once(decorated_guidance)
last_raw = raw or last_raw
if not raw:
last_error = "empty response (timeout or HTTP error)"
continue
mv = _extract_move(raw)
if not mv:
last_error = f"no move found in response: {raw[:80]!r}"
continue
# Geometric sanity check: block diagonals and multi-step moves from non-Scout pieces
try:
src, dst = mv.strip("[]").split()
sr, sc = ord(src[0]) - 65, int(src[1:])
dr, dc = ord(dst[0]) - 65, int(dst[1:])
drow = abs(dr - sr)
dcol = abs(dc - sc)
src_token = board_map.get(src, "")
# Block moving empty/unknown/lake squares
if src_token in {"", ".", "?", "~"}:
invalid_memory.append(f"{mv} (source not movable)")
last_error = "source not movable"
continue
# Diagonal
if drow > 0 and dcol > 0:
invalid_memory.append(f"{mv} (diagonal not allowed)")
last_error = "diagonal"
continue
# Multi-step non-Scout
if drow + dcol > 1:
is_scout = src_token.upper() in {"SC", "SCOUT"}
if not is_scout:
invalid_memory.append(f"{mv} (non-Scout multi-step)")
last_error = "non-Scout multi-step"
continue
except Exception:
pass
# quick deterministic veto using env-provided lists
if available_moves and mv not in available_moves:
invalid_memory.append(f"{mv} (not in Available Moves)")
last_error = f"{mv} not in Available Moves"
print(f" LLM proposed move not in Available Moves: {mv}")
continue
if mv in forbidden_moves:
invalid_memory.append(f"{mv} (in FORBIDDEN)")
last_error = f"{mv} in FORBIDDEN"
print(f" LLM proposed forbidden move {mv}")
continue
if mv in recent_moves and len(recent_moves) > 0:
last_error = f"repeated move {mv}"
print(f" LLM proposed recent move {mv}, trying alternatives...")
continue
if available_moves:
return mv
def _first_valid_from_list(candidates):
for mv in candidates:
if available_moves and mv not in available_moves:
# print(f" Fallback move not in Available Moves: {mv}")
continue
if mv in forbidden_moves:
# print(f" Fallback forbidden move: {mv}")
continue
if mv in recent_moves and len(recent_moves) > 0:
continue
if available_moves:
return mv
# is_valid, reason = self._validate_move(full_context, mv)
# if is_valid:
# return mv
# print(f" Fallback invalid move {mv}: {reason}")
return None
if last_raw:
candidates = MOVE_RE.findall(last_raw or "")
if candidates:
mv = _first_valid_from_list(candidates)
if mv:
return mv
# Try to pick a random valid move from available moves
obs_moves = MOVE_RE.findall(observation)
if obs_moves:
mv = _first_valid_from_list(obs_moves)
if mv:
return mv
non_recent = [mv for mv in obs_moves if mv not in recent_moves]
if non_recent:
return random.choice(non_recent)
return random.choice(obs_moves)
print(f"[AGENT] {self.model_name} failed to produce valid move after retries.")
if last_error:
print(f" Last error: {last_error}")
return ""
|