# BayesianPyMc / bayesian_llm_assistant.py

import google.generativeai as genai
import json
import re
import graphviz
import io
from PIL import Image


class BayesianLLMAssistant:
    """
    LLM Q&A assistant for Bayesian hierarchical models (with dynamic DAG generation).

    Helps users understand Bayesian analysis results and can generate
    custom DAG diagrams from a description.
    """

    def __init__(self, api_key, session_id, api_provider="Google Gemini"):
        """
        Initialize the LLM assistant.

        Args:
            api_key: API key (Gemini or Claude)
            session_id: unique session identifier
            api_provider: API provider ("Google Gemini" or "Anthropic Claude")
        """
        self.api_provider = api_provider
        self.session_id = session_id
        self.conversation_history = []

        if api_provider == "Google Gemini":
            # genai is already imported at module level
            genai.configure(api_key=api_key)
            self.model = genai.GenerativeModel('gemini-2.0-flash-exp')
            self.client = None
        else:  # Anthropic Claude
            import anthropic
            self.client = anthropic.Anthropic(api_key=api_key)
            self.model_name = "claude-sonnet-4-5-20250929"
            self.model = None

        # System prompt (includes the DAG-generation capability)
        self.system_prompt = """You are an expert Bayesian statistician specializing in hierarchical models and meta-analysis, particularly in the context of Pokémon battle statistics.

	IMPORTANT - Language Instruction:
	- Always respond in the SAME language as the user's question
	- If user asks in Traditional Chinese (繁體中文), respond in Traditional Chinese
	- If user asks in English, respond in English
	- Maintain language consistency throughout the conversation

	你是一位精通貝氏階層模型和統合分析的統計專家，特別專注於寶可夢對戰統計分析。

	Your role is to help users understand Bayesian hierarchical model results analyzing
	win rate comparisons between Fire-type and Water-type Pokémon across different matchup pairs.
	你的角色是幫助使用者理解貝氏階層模型分析結果，
	了解火系與水系寶可夢在不同配對組合下的勝率比較。

	NEW CAPABILITY: DAG Diagram Generation | 新能力：DAG 圖生成
	When users ask you to draw, create, or visualize a DAG (Directed Acyclic Graph) or model structure, you can generate Graphviz DOT code.
	當用戶要求你繪製、創建或視覺化 DAG（有向無環圖）或模型結構時，你可以生成 Graphviz DOT 代碼。

	How to generate DAG code:
	1. Detect requests like: "draw a DAG", "show me the model structure", "visualize the relationships", "畫一個 DAG 圖", "顯示模型結構"
	2. Generate Graphviz DOT code wrapped in special tags:
	```graphviz
	digraph G {
	// Your DOT code here
	}
	```
	3. The system will automatically render it as an image

	IMPORTANT - Font and Label Instructions for DAG:
	- NEVER use Chinese characters in node labels
	- Use ONLY English labels, or use English + romanized Chinese
	- DO NOT set fontname in the graph
	- Example of good labels: "d (overall effect)" or "delta[i] (pair-specific)"
	- Example of bad labels: "整體效應" or any Chinese text

	重要 - DAG 圖的字型和標籤指示：
	- 絕對不要在節點標籤中使用中文字
	- 只使用英文標籤，或使用「英文 + 拼音」
	- 不要設定 fontname
	- 好的標籤範例："d (overall effect)" 或 "delta[i] (pair-specific)"
	- 不好的標籤範例："整體效應" 或任何中文

	Example DAG code for Bayesian hierarchical model:
	```graphviz
	digraph BayesianModel {
	rankdir=TB;
	node [shape=ellipse, style=filled, fillcolor=lightblue];

	// Priors
	d [label="d\n(Fire vs Water overall)", fillcolor=lightyellow];
	tau [label="tau\n(precision)", fillcolor=lightyellow];
	sigma [label="sigma = 1/√tau", shape=diamond, fillcolor=lightgray];

	// Hierarchy
	d -> delta [label="mean"];
	tau -> delta [label="precision"];
	sigma -> delta [style=dashed];

	delta [label="delta[i]\n(pair-specific)", fillcolor=lightgreen];
	mu [label="mu[i]\n(baseline)", fillcolor=lightyellow];

	// Likelihood
	delta -> pt [label="effect"];
	mu -> pc;
	mu -> pt;

	pc [label="pc[i]\n(Water win rate)", shape=diamond, fillcolor=lightgray];
	pt [label="pt[i]\n(Fire win rate)", shape=diamond, fillcolor=lightgray];

	pc -> rc_obs [label="probability"];
	pt -> rt_obs [label="probability"];

	rc_obs [label="rc_obs[i]\n(Water wins)", shape=box, fillcolor=lightcoral];
	rt_obs [label="rt_obs[i]\n(Fire wins)", shape=box, fillcolor=lightcoral];
	}
	```

	You should:
	1. Explain Bayesian concepts in simple, accessible terms
	2. Interpret posterior distributions, HDI (Highest Density Interval), and credible intervals
	3. Explain hierarchical structure and why it's useful
	4. Help users understand heterogeneity (sigma) between different matchup pairs
	5. Discuss the practical significance of Fire vs Water type advantages
	6. Provide insights about which matchup pairs favor Fire-types the most
	7. Suggest team building strategies based on the statistical findings
	8. Clarify differences between Bayesian and frequentist approaches
	9. Explain MCMC diagnostics (R-hat, ESS) when relevant
	10. Generate custom DAG diagrams based on user descriptions

	你應該：
	1. 用簡單易懂的方式解釋貝氏概念
	2. 詮釋後驗分佈、HDI（最高密度區間）和可信區間
	3. 解釋階層結構及其優勢
	4. 幫助使用者理解不同配對間的異質性（sigma）
	5. 討論火系與水系屬性優劣勢的實際意義
	6. 提供哪些配對組合中火系最具優勢的見解
	7. 根據統計發現提出組隊策略建議
	8. 說明貝氏方法與頻率論方法的差異
	9. 適時解釋 MCMC 診斷指標（R-hat、ESS）
	10. 根據用戶描述生成客製化 DAG 圖

	Key concepts to explain when relevant:
	- Bayesian Hierarchical Model: Borrows strength across matchup pairs, shrinkage effect
	- Prior & Posterior: How data updates beliefs
	- HDI (Highest Density Interval): 95% most credible values
	- d (overall effect): Average log odds ratio of Fire vs Water across all pairs
	- sigma (between-pair variation): How much different matchup pairs vary in Fire advantage
	- delta (pair-specific effects): Each matchup pair's individual Fire advantage/disadvantage
	- Odds Ratio: exp(d) - how much more likely Fire-types are to win compared to Water-types
	- MCMC: Markov Chain Monte Carlo sampling method
	- Convergence: R-hat < 1.1, good ESS (effective sample size)
	- DAG (Directed Acyclic Graph): Visual representation of model structure

	重要概念解釋（當相關時）：
	- 貝氏階層模型：跨配對借用資訊，收縮效應
	- 先驗與後驗：資料如何更新信念
	- HDI（最高密度區間）：95% 最可信的數值範圍
	- d（整體效應）：火系相對於水系的平均對數勝算比（跨所有配對）
	- sigma（配對間變異）：不同配對組合的火系優勢差異程度
	- delta（配對特定效應）：每組配對的個別火系優勢/劣勢
	- 勝算比：exp(d) - 火系相對於水系獲勝的可能性倍數
	- MCMC：馬可夫鏈蒙地卡羅抽樣方法
	- 收斂性：R-hat < 1.1，良好的 ESS（有效樣本數）
	- DAG（有向無環圖）：模型結構的視覺化表示

	When discussing Pokémon type matchups:
	- Connect statistical findings to type advantage mechanics (Water typically beats Fire in core games)
	- Explain why Fire vs Water matchups show certain patterns
	- Discuss individual matchup variations and their causes (e.g., specific Pokémon abilities, stats)
	- Identify which Fire/Water Pokémon pairs show unusual results (Fire winning despite type disadvantage)
	- Consider team building and type coverage implications

	討論寶可夢屬性對抗時：
	- 將統計發現連結到屬性相剋機制（水系通常剋火系）
	- 解釋火系對水系的對戰模式為何呈現特定趨勢
	- 討論個別配對的變異及其可能原因（例如特殊能力、數值差異）
	- 識別哪些火/水系配對顯示異常結果（火系儘管屬性不利仍獲勝）
	- 考慮組隊和屬性覆蓋的影響

	Always be clear, educational, and engaging. Use examples when helpful.
	Format responses with proper markdown for better readability.

	請務必清晰、具教育性、引人入勝。適時使用範例說明。使用適當的 Markdown 格式以提升可讀性。"""

    def get_response(self, user_message, analysis_results=None):
        """
        Get the AI response (supports DAG generation).

        Args:
            user_message: the user's message
            analysis_results: dict of analysis results (optional)

        Returns:
            tuple: (response text, DAG image or None)
        """
        # Prepare context information
        context = ""
        if analysis_results:
            context = self._prepare_context(analysis_results)

        # Append the user message to the history
        self.conversation_history.append({
            "role": "user",
            "content": user_message
        })

        try:
            # Build the full prompt
            full_prompt = self.system_prompt

            if context:
                full_prompt += f"\n\n## Current Analysis Context:\n{context}"

            # Render the earlier turns of the conversation as text
            conversation_text = "\n\n## Conversation History:\n"
            for msg in self.conversation_history[:-1]:
                role = "User" if msg["role"] == "user" else "Assistant"
                conversation_text += f"\n{role}: {msg['content']}\n"

            # Assemble the final prompt
            final_prompt = full_prompt + conversation_text + f"\nUser: {user_message}\n\nAssistant:"

            # Call the selected provider's API
            if self.api_provider == "Google Gemini":
                response = self.model.generate_content(
                    final_prompt,
                    generation_config=genai.types.GenerationConfig(
                        temperature=0.7,
                        max_output_tokens=4000,
                    )
                )
                assistant_message = response.text

            else:  # Anthropic Claude
                response = self.client.messages.create(
                    model=self.model_name,
                    max_tokens=4000,
                    temperature=0.7,
                    system=self.system_prompt,
                    messages=[
                        {"role": "user", "content": final_prompt}
                    ]
                )
                assistant_message = response.content[0].text

            # Check whether the reply contains Graphviz code
            dag_image = self._extract_and_render_dag(assistant_message)

            # Append the assistant reply to the history
            self.conversation_history.append({
                "role": "assistant",
                "content": assistant_message
            })

            return assistant_message, dag_image

        except Exception as e:
            # Drop the unanswered user turn so a retry does not duplicate it
            self.conversation_history.pop()
            error_msg = f"❌ Error: {str(e)}\n\nPlease check your API key and try again."
            return error_msg, None



    def _extract_and_render_dag(self, text):
        """
        Extract Graphviz code from text and render it as an image.

        Args:
            text: text that may contain Graphviz code

        Returns:
            PIL Image or None
        """
        # Method 1: try to extract a ```graphviz ... ``` fenced block
        pattern1 = r'```graphviz\s*\n(.*?)\n```'
        matches = re.findall(pattern1, text, re.DOTALL)

        if matches:
            dot_code = matches[0]
        else:
            # Method 2: try to extract a bare digraph ... } block (no markdown fence)
            # pattern2 = r'(digraph\s+\w+\s*\{.*?\n\})'
            pattern2 = r'(digraph\s+\w+\s*\{.*\})'
            matches = re.findall(pattern2, text, re.DOTALL)

            if not matches:
                return None

            dot_code = matches[0]

        try:
            # Render with Graphviz
            graph = graphviz.Source(dot_code)
            png_bytes = graph.pipe(format='png')

            # Convert to a PIL Image
            img = Image.open(io.BytesIO(png_bytes))

            return img

        except Exception as e:
            print(f"Failed to render DAG: {e}")
            return None


    def _prepare_context(self, results):
        """Prepare context information from the analysis results."""

        if not results:
            return "目前尚無分析結果。No analysis results available yet."

        overall = results['overall']
        interp = results['interpretation']
        diag = results['diagnostics']

        # Find the significant types
        sig_types = [
            results['trial_labels'][i]
            for i, sig in enumerate(results['by_trial']['delta_significant'])
            if sig
        ]

        context = f"""
## Current Bayesian Hierarchical Model Analysis | 目前的貝氏階層模型分析

### Overall Effect | 整體效應
- d (Log Odds Ratio) | d（對數勝算比）:
  - Mean | 平均: {overall['d_mean']:.4f}
  - SD | 標準差: {overall['d_sd']:.4f}
  - 95% HDI: [{overall['d_hdi_low']:.4f}, {overall['d_hdi_high']:.4f}]

- sigma (Between-type Variation) | sigma（屬性間變異）:
  - Mean | 平均: {overall['sigma_mean']:.4f}
  - SD | 標準差: {overall['sigma_sd']:.4f}
  - 95% HDI: [{overall['sigma_hdi_low']:.4f}, {overall['sigma_hdi_high']:.4f}]

- Odds Ratio | 勝算比:
  - Mean | 平均: {overall['or_mean']:.4f}
  - SD | 標準差: {overall['or_sd']:.4f}
  - 95% HDI: [{overall['or_hdi_low']:.4f}, {overall['or_hdi_high']:.4f}]

### Model Diagnostics | 模型診斷
- R-hat (d): {f"{diag['rhat_d']:.4f}" if diag['rhat_d'] is not None else 'N/A'} {'✓' if diag['rhat_d'] and diag['rhat_d'] < 1.1 else '✗'}
- R-hat (sigma): {f"{diag['rhat_sigma']:.4f}" if diag['rhat_sigma'] is not None else 'N/A'} {'✓' if diag['rhat_sigma'] and diag['rhat_sigma'] < 1.1 else '✗'}
- ESS (d): {int(diag['ess_d']) if diag['ess_d'] is not None else 'N/A'}
- ESS (sigma): {int(diag['ess_sigma']) if diag['ess_sigma'] is not None else 'N/A'}
- Convergence | 收斂狀態: {'✓ Converged 已收斂' if diag['converged'] else '✗ Not Converged 未收斂'}

### Interpretation | 結果解釋
- Overall Effect | 整體效應: {interp['overall_effect']}
- Significance | 顯著性: {interp['overall_significance']}
- Effect Size | 效果大小: {interp['effect_size']}
- Heterogeneity | 異質性: {interp['heterogeneity']}

### Significant Types | 顯著的屬性
{len(sig_types)} out of {results['n_trials']} types show significant effects:
{len(sig_types)} 個屬性（共 {results['n_trials']} 個）顯示顯著的效應：
{', '.join(sig_types) if sig_types else 'None 無'}

### Number of Types Analyzed | 分析的屬性數量
{results['n_trials']} types in total 共 {results['n_trials']} 個屬性

### Key Finding | 關鍵發現
{
    f"On average, Fire-type Pokémon are {overall['or_mean']:.2f} times more likely to win compared to Water-type (95% HDI: [{overall['or_hdi_low']:.2f}, {overall['or_hdi_high']:.2f}]). 平均而言，火系寶可夢獲勝的可能性是水系的 {overall['or_mean']:.2f} 倍（95% HDI: [{overall['or_hdi_low']:.2f}, {overall['or_hdi_high']:.2f}]）。"
    if overall['or_mean'] > 1
    else "Interestingly, the data suggests no clear Fire-type advantage, or even a slight disadvantage. 有趣的是，資料顯示火系並無明顯優勢，甚至可能略有劣勢。"
}

The variation between types (sigma = {overall['sigma_mean']:.3f}) indicates {interp['heterogeneity'].lower()}.
屬性間的變異（sigma = {overall['sigma_mean']:.3f}）表示{interp['heterogeneity'].lower()}。
"""
        return context

    def draw_custom_dag(self, description):
        """
        Generate a custom DAG diagram from a user description.

        Args:
            description: the user's description of the DAG

        Returns:
            tuple: (explanation text, DAG image or None)
        """
        prompt = f"""Based on the following description, generate a Graphviz DOT code for a DAG diagram:

	User description: {description}

	Please:
	1. Create a clear and informative DAG
	2. Use appropriate node shapes (ellipse for random variables, box for observed data, diamond for deterministic nodes)
	3. Use different colors to distinguish node types
	4. CRITICAL: Use ONLY English labels - NO Chinese characters in node labels
	5. Add labels to explain what each node represents (in English)
	6. Wrap your DOT code in ```graphviz ``` tags
	7. Provide a brief explanation in Traditional Chinese about what the diagram shows

	根據以下描述，生成 Graphviz DOT 代碼的 DAG 圖：

	用戶描述：{description}

	請：
	1. 創建清晰且有資訊性的 DAG
	2. 使用適當的節點形狀（橢圓代表隨機變數，矩形代表觀測資料，菱形代表確定性節點）
	3. 使用不同顏色區分節點類型
	4. 重要：節點標籤必須使用英文，不能使用中文
	5. 添加標籤說明每個節點代表什麼（用英文）
	6. 將 DOT 代碼包在 ```graphviz ``` 標籤中
	7. 用繁體中文簡要說明圖表顯示什麼"""

        return self.get_response(prompt, None)

    # All pre-existing methods are kept below
    def generate_summary(self, analysis_results):
        """Automatically generate a summary of the analysis results."""

        summary_prompt = """請根據提供的貝氏階層模型分析結果生成一份完整的總結報告，包含：

	1. 模型目的：簡述這個階層模型在分析什麼
	2. 整體發現：
	- 速度對勝率有什麼整體影響？
	- d 和勝算比告訴我們什麼？
	- HDI 的意義是什麼？
	3. 屬性間差異：
	- sigma 告訴我們什麼？
	- 哪些屬性特別受速度影響？
	4. 模型品質：
	- 模型收斂得好嗎？（R-hat、ESS）
	- 結果可信嗎？
	5. 實戰啟示：
	- 訓練師如何運用這些資訊？
	- 哪些屬性應該優先考慮速度？

	請用清楚的繁體中文 Markdown 格式撰寫，包含適當的章節標題。"""

        text, _ = self.get_response(summary_prompt, analysis_results)
        return text

    def explain_metric(self, metric_name, analysis_results):
        """Explain a specific metric."""

        # Display labels are inserted into the Chinese prompt below
        metric_explanations = {
            'd': 'd (整體對數勝算比)',
            'sigma': 'sigma (屬性間變異)',
            'or_speed': 'Odds Ratio (勝算比)',
            'hdi': '95% HDI (最高密度區間)',
            'delta': 'delta (屬性特定效應)',
            'rhat': 'R-hat (收斂診斷)',
            'ess': 'ESS (有效樣本數)'
        }

        metric_display = metric_explanations.get(metric_name, metric_name)

        explain_prompt = f"""請在這次貝氏階層模型分析的脈絡下，解釋以下指標：

	指標：{metric_display}

	請包含：
	1. 這個指標在貝氏統計中測量什麼？
	2. 在本次分析中得到的數值是多少？
	3. 如何從寶可夢對戰的角度詮釋這個數值？
	4. 與頻率論統計的對應指標有何不同？
	5. 有什麼需要注意的限制或注意事項？

	請用繁體中文回答。"""

        text, _ = self.get_response(explain_prompt, analysis_results)
        return text

    def explain_bayesian_vs_frequentist(self):
        """Explain the differences between Bayesian and frequentist statistics."""

        explain_prompt = """請用簡單的方式解釋貝氏統計和頻率論統計的差異，特別是在寶可夢對戰分析的情境下。

	請涵蓋：
	1. 兩者的根本哲學差異是什麼？
	2. p 值 vs HDI（可信區間）有什麼不同？
	3. 為什麼我們用階層模型來分析多個屬性？
	4. 貝氏方法的優勢和限制是什麼？
	5. 什麼時候該用貝氏、什麼時候該用頻率論？

	請用寶可夢的實際例子讓說明更具體易懂，全程使用繁體中文。"""

        text, _ = self.get_response(explain_prompt, None)
        return text

    def explain_hierarchical_model(self):
        """Explain the concept of a hierarchical model."""

        explain_prompt = """請用簡單的方式解釋貝氏階層模型，特別是在寶可夢屬性分析的情境下。

	請涵蓋：
	1. 什麼是階層模型？為什麼要用階層結構？
	2. 「借用資訊」(borrowing strength) 是什麼意思？
	3. 收縮效應 (shrinkage) 如何運作？
	4. 為什麼階層模型適合分析多個屬性？
	5. d、sigma、delta 之間的關係是什麼？

	請用寶可夢的實際例子讓說明更具體易懂，全程使用繁體中文。"""

        text, _ = self.get_response(explain_prompt, None)
        return text

    def battle_strategy_advice(self, analysis_results):
        """Provide battle strategy advice."""

        strategy_prompt = """根據貝氏階層模型的分析結果，請為寶可夢訓練師提供實際的對戰策略建議。

	請考慮：
	1. 整體而言，速度對勝率的影響有多大？
	2. 哪些屬性特別受益於速度？哪些不受影響？
	3. 訓練師在組建隊伍時應該如何權衡速度？
	4. 有沒有屬性可以忽略速度、專注其他數值？
	5. 對競技對戰有什麼啟示？

	請具體且可操作，使用繁體中文回答。"""

        text, _ = self.get_response(strategy_prompt, analysis_results)
        return text

    def compare_types(self, analysis_results):
        """Compare different types."""

        compare_prompt = """請比較分析結果中不同屬性對速度的反應差異。

	請說明：
	1. 哪些屬性對速度最敏感？為什麼？
	2. 哪些屬性對速度不敏感？可能的原因是什麼？
	3. 屬性間的異質性（sigma）告訴我們什麼？
	4. 有沒有令人意外的發現？
	5. 這些差異對組隊策略有什麼啟示？

	請用繁體中文回答。"""

        text, _ = self.get_response(compare_prompt, analysis_results)
        return text

    def reset_conversation(self):
        """Reset the conversation history."""
        self.conversation_history = []
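

# ---------------------------------------------------------------------------
# Illustrative sketch, not part of the original upload.
#
# The DOT-extraction step of `_extract_and_render_dag` can be tried without an
# API key or a Graphviz install. `extract_dot_code` is a hypothetical
# standalone helper (the name is an assumption) that follows the same two
# strategies as the method: first look for a ```graphviz fenced block, then
# fall back to a bare `digraph Name { ... }` block. It only returns the DOT
# text; rendering is left to graphviz.Source as in the class.
# ---------------------------------------------------------------------------
import re  # already imported at the top of this module; repeated so the sketch runs standalone


def extract_dot_code(text):
    """Return the first DOT snippet found in `text`, or None if there is none."""
    # Strategy 1: a markdown fence tagged `graphviz`
    fenced = re.findall(r'```graphviz\s*\n(.*?)\n```', text, re.DOTALL)
    if fenced:
        return fenced[0]

    # Strategy 2: a bare digraph block; greedy, so it runs to the last `}`
    bare = re.findall(r'(digraph\s+\w+\s*\{.*\})', text, re.DOTALL)
    return bare[0] if bare else None


if __name__ == "__main__":
    # A made-up model reply, used only to show the helper in action
    sample_reply = (
        "Here is the model structure:\n"
        "```graphviz\n"
        "digraph G {\n"
        "  d -> delta;\n"
        "}\n"
        "```\n"
        "The arrow shows that delta depends on d."
    )
    print(extract_dot_code(sample_reply))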