Space: Wen1201/mcnemar

Wen1201 committed on Jan 8

Commit 6b5847f (verified) · 1 Parent(s): 02f7b03

Upload 6 files

Files changed (6)

README.md +177 -10
app.py +540 -0
mcnemar_core.py +241 -0
mcnemar_llm_assistant.py +251 -0
mcnemar_utils.py +338 -0
requirements.txt +6 -0

README.md CHANGED

@@ -1,12 +1,179 @@
----
-title: Mcnemar
-emoji: 🐠
-colorFrom: yellow
-colorTo: red
-sdk: gradio
-sdk_version: 6.2.0
-app_file: app.py
-pinned: false
 ---
-Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference

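+# McNemar Test Analysis System - Pokémon Battle Feature Analysis
+## 📋 Overview
+A Streamlit-based McNemar test analysis system built for Pokémon battle data, with an AI assistant that explains the statistics and suggests battle strategies.
+## 🎯 Main Features
+### 1. McNemar Test Analysis
+- ✅ Statistical significance test (p-value)
+- ✅ Odds ratio calculation
+- ✅ 95% confidence interval estimation
+- ✅ Discordant pair analysis
+- ✅ Effect size assessment
+### 2. Visualizations
+- 📊 Contingency table heatmap
+- 📈 Odds ratio forest plot
+- 📉 Discordant pair distribution chart
+- 🎨 Significance level visualization
+### 3. AI Assistant
+- 💬 Natural-language conversation
+- 📖 Explanations of statistical metrics
+- 🎮 Battle strategy suggestions
+- 📚 McNemar test tutorial
+- 🔍 In-depth analysis of results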
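+## 📦 Installation
+### 1. Install dependencies
+```bash
+pip install -r requirements.txt
+```
+### 2. Prepare the data
+Place the Pokémon battle CSV file in the same directory as the app and name it `poke_mc_hong_2.csv`.
+**Data format requirements:**
+- The file must contain `cs_<feature>` and `cn_<feature>` columns
+- cs_ = winning side (Champion/Winner)
+- cn_ = losing side (Challenger/Loser)
+- Values are 0 or 1 (0 = lower, 1 = higher)
+**Example columns:**
+```
+cs_HP, cn_HP          # hit points
+cs_Attack, cn_Attack  # attack
+cs_Speed, cn_Speed    # speed
+cs_Defense, cn_Defense # defense
+```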
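The format rules above can be checked before loading a file into the app. The following is a minimal sketch using only the standard library; the function name `check_paired_columns` is illustrative and not part of this commit, and it assumes values are written exactly as `0` or `1` (a cell such as `1.0` would be counted as invalid).

```python
import csv
import io


def check_paired_columns(csv_text: str) -> dict:
    """Return {feature: number_of_invalid_rows} for every feature
    that has both a cs_ and a cn_ column."""
    reader = csv.DictReader(io.StringIO(csv_text))
    fields = reader.fieldnames or []
    # A feature qualifies only when both the winner and loser columns exist
    features = [
        name[len("cs_"):]
        for name in fields
        if name.startswith("cs_") and "cn_" + name[len("cs_"):] in fields
    ]
    invalid = {feature: 0 for feature in features}
    for row in reader:
        for feature in features:
            pair = (row["cs_" + feature], row["cn_" + feature])
            # Count the row as invalid if either side is not a 0/1 flag
            if any(value not in ("0", "1") for value in pair):
                invalid[feature] += 1
    return invalid


sample = "cs_HP,cn_HP,cs_Speed,cn_Speed\n1,0,1,1\n0,1,2,0\n"
print(check_paired_columns(sample))  # {'HP': 0, 'Speed': 1}
```

A feature that appears with only one of the two prefixes is skipped entirely, which matches the README requirement that both columns be present.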
+### 3. Set the OpenAI API key
+- Enter your OpenAI API key in the left sidebar
+- The API key is used for the AI assistant feature
+### 4. Run the app
+```bash
+streamlit run app.py
+```
+## 🔧 File Structure
+```
+mcnemar_app/
+├── app.py                      # Streamlit main program
+├── mcnemar_core.py             # McNemar test core logic
+├── mcnemar_llm_assistant.py    # AI chat assistant
+├── mcnemar_utils.py            # Visualization utilities
+├── requirements.txt            # Dependencies
+├── README.md                   # Documentation
+└── poke_mc_hong_2.csv          # Pokémon data (provide your own)
+```
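+## 📊 Usage
+### Step 1: Load data
+1. Choose "使用預設資料集" (use the default dataset) or "上傳您的資料" (upload your data)
+2. If uploading, make sure the CSV follows the required format
+### Step 2: Run the analysis
+1. On the "McNemar 分析" (McNemar analysis) tab, select the feature to analyze
+2. Click the "開始分析" (start analysis) button
+3. Review the four result tabs:
+   - 📊 Overview: key metrics and summary
+   - 📉 Contingency table: paired data analysis
+   - 🎯 Odds ratio: effect size assessment
+   - 📋 Detailed report: full text report
+### Step 3: Use the AI assistant
+1. Switch to the "AI 助手" (AI assistant) tab
+2. Type a question in the chat box, or click a quick-question button
+3. The AI answers with explanations and suggestions based on the analysis results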
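+## 💡 Statistical Metrics
+### McNemar statistic
+The test statistic for whether two paired proportions differ. In the asymptotic test it is a chi-square statistic; `mcnemar_core.py` calls statsmodels with `exact=True`, in which case the reported statistic is min(b, c) from the exact binomial test.
+### p-value
+- p < 0.05: significant (reject the null hypothesis)
+- p ≥ 0.05: not significant (fail to reject the null hypothesis)
+### Odds ratio (OR)
+Computed as b / c from the discordant pairs.
+- OR > 1: the winner is more likely to be the higher one
+- OR = 1: no difference
+- OR < 1: the loser is more likely to be the higher one
+### Discordant pairs
+- **b**: number of pairs where the winner is high and the loser is low
+- **c**: number of pairs where the winner is low and the loser is high
+- The McNemar test uses only these discordant pairs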
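Because only b and c enter the test, the exact p-value can be reproduced by hand. The sketch below is a pure-Python illustration of the two-sided exact (binomial) McNemar test, which to my understanding is what statsmodels computes when `exact=True`; the function name `mcnemar_exact_p` is hypothetical and not part of this commit.

```python
from math import comb


def mcnemar_exact_p(b: int, c: int) -> float:
    """Two-sided exact McNemar p-value from the discordant counts b and c."""
    n = b + c
    if n == 0:
        return 1.0  # no discordant pairs: no evidence of a difference
    k = min(b, c)
    # Under the null hypothesis each discordant pair falls in b or c with
    # probability 0.5, so min(b, c) follows a Binomial(n, 0.5) lower tail.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    # Double the one-sided tail and cap at 1 for the two-sided p-value
    return min(1.0, 2.0 * tail)


print(mcnemar_exact_p(5, 0))    # 0.0625
print(mcnemar_exact_p(10, 10))  # 1.0
```

Note that five discordant pairs all pointing the same way still give p = 0.0625, so a feature needs at least six one-sided discordant pairs before the exact test can reach p < 0.05.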
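+## 🎮 Use Cases
+### 1. Feature importance analysis
+Determine which Pokémon features (HP, Attack, Speed, etc.) most affect the outcome
+### 2. Team-building strategy
+Pick Pokémon that score higher on the features the statistics flag as important
+### 3. Understanding battle mechanics
+Understand the role each feature plays in actual battles
+### 4. Teaching
+Learn the principles and applications of the McNemar test
+## ⚙️ Technical Architecture
+### Core technologies
+- **Streamlit**: web application framework
+- **pandas**: data processing
+- **statsmodels**: McNemar test
+- **plotly**: interactive visualization
+- **OpenAI GPT-4o-mini**: AI assistant
+### Design highlights
+- ✅ Session isolation (multi-user support)
+- ✅ Thread safety
+- ✅ Automatic cleanup of expired data
+- ✅ Responsive UI
+- ✅ Comprehensive error handling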
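+## 🔒 Privacy and Security
+- The statistical analysis runs locally; the AI assistant sends the analysis summary and your questions to the OpenAI API
+- Session data is stored separately per session
+- Results older than 1 hour are removed when cleanup runs (at exit or via the sidebar button)
+- The API key is not persisted; it is held only in the Streamlit session state
+## 📝 Sample Questions (for the AI assistant)
+- "Why is this feature significant / not significant?"
+- "What does an odds ratio of 2.5 mean?"
+- "How should I build my team?"
+- "What are discordant pairs?"
+- "How does the McNemar test differ from the t-test?"
+- "What does this result imply for real battles?"
+## 🚀 Roadmap
+- [ ] Batch analysis of multiple features
+- [ ] Feature importance ranking
+- [ ] RAG literature retrieval system
+- [ ] Analysis history
+- [ ] Result comparison
+- [ ] PDF report export
+## 📧 Contact
+For questions or suggestions, please contact the development team.
+## 📄 License
+This project is for academic research and teaching use only.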
 ---
+**Powered by Streamlit & OpenAI GPT-4o-mini** 🚀

app.py ADDED

	@@ -0,0 +1,540 @@

+import streamlit as st
+import pandas as pd
+import uuid
+from datetime import datetime, timedelta
+import atexit
+import os
+# 頁面配置
+st.set_page_config(
+    page_title="McNemar Test Analysis - Pokémon Battles",
+    page_icon="⚔️",
+    layout="wide",
+    initial_sidebar_state="expanded"
+)
+# 自定義 CSS
+st.markdown("""
+<style>
+    .streamlit-expanderHeader {
+        background-color: #e8f1f8;
+        border: 1px solid #b0cfe8;
+        border-radius: 5px;
+        font-weight: 600;
+        color: #1b4f72;
+    }
+    .streamlit-expanderHeader:hover {
+        background-color: #d0e7f8;
+    }
+    .stMetric {
+        background-color: #f8fbff;
+        padding: 10px;
+        border-radius: 5px;
+        border: 1px solid #d0e4f5;
+    }
+    .stButton > button {
+        width: 100%;
+        border-radius: 20px;
+        font-weight: 600;
+        transition: all 0.3s ease;
+    }
+    .stButton > button:hover {
+        transform: translateY(-2px);
+        box-shadow: 0 4px 8px rgba(0,0,0,0.2);
+    }
+    .success-box {
+        background-color: #d4edda;
+        border: 1px solid #c3e6cb;
+        border-radius: 5px;
+        padding: 10px;
+        margin: 10px 0;
+    }
+    .warning-box {
+        background-color: #fff3cd;
+        border: 1px solid #ffeaa7;
+        border-radius: 5px;
+        padding: 10px;
+        margin: 10px 0;
+    }
+</style>
+""", unsafe_allow_html=True)
+# 導入自定義模組
+from mcnemar_core import McNemarAnalyzer
+from mcnemar_llm_assistant import McNemarLLMAssistant
+from mcnemar_utils import (
+    plot_contingency_table_heatmap,
+    plot_odds_ratio_forest,
+    plot_discordant_pairs,
+    plot_p_value_significance,
+    create_results_summary_table,
+    export_results_to_text
+)
+# Cleanup helper
+def cleanup_old_sessions():
+    """Remove session results older than 1 hour."""
+    current_time = datetime.now()
+    for session_id in list(McNemarAnalyzer._session_results.keys()):
+        result = McNemarAnalyzer._session_results.get(session_id)
+        if result:
+            result_time = datetime.fromisoformat(result['timestamp'])
+            if current_time - result_time > timedelta(hours=1):
+                McNemarAnalyzer.clear_session_results(session_id)
+# Register the cleanup to run at interpreter exit
+atexit.register(cleanup_old_sessions)
+# 初始化 session state
+if 'session_id' not in st.session_state:
+    st.session_state.session_id = str(uuid.uuid4())
+if 'analysis_results' not in st.session_state:
+    st.session_state.analysis_results = None
+if 'chat_history' not in st.session_state:
+    st.session_state.chat_history = []
+if 'analyzer' not in st.session_state:
+    st.session_state.analyzer = None
+# 標題
+st.title("⚔️ McNemar Test Analysis System")
+st.markdown("### 寶可夢對戰特徵顯著性分析")
+st.markdown("---")
+# Sidebar
+with st.sidebar:
+    st.header("⚙️ 配置設定")
+    # OpenAI API Key
+    api_key = st.text_input(
+        "OpenAI API Key",
+        type="password",
+        help="輸入您的 OpenAI API key 以使用 AI 助手"
+    )
+    if api_key:
+        st.session_state.api_key = api_key
+        st.success("✅ API Key 已載入")
+    st.markdown("---")
+    # 清理按鈕
+    if st.button("🧹 清理過期資料"):
+        cleanup_old_sessions()
+        st.success("✅ 清理完成")
+        st.rerun()
+    st.markdown("---")
+    # 資料來源選擇
+    st.subheader("📊 資料來源")
+    data_source = st.radio(
+        "選擇資料來源：",
+        ["使用預設資料集", "上傳您的資料"]
+    )
+    uploaded_file = None
+    if data_source == "上傳您的資料":
+        uploaded_file = st.file_uploader(
+            "上傳 CSV 檔案",
+            type=['csv'],
+            help="上傳寶可夢對戰資料（需包含 cs_特徵 和 cn_特徵 欄位）"
+        )
+        with st.expander("📖 資料格式說明"):
+            st.markdown("""
+            **必要欄位格式：**
+            - `cs_特徵名稱`: 勝方的特徵值 (0 或 1)
+            - `cn_特徵名稱`: 敗方的特徵值 (0 或 1)
+            **範例：**
+            - `cs_HP`: 勝方 HP 是否較高
+            - `cn_HP`: 敗方 HP 是否較高
+            - `cs_Attack`: 勝方攻擊是否較高
+            - `cn_Attack`: 敗方攻擊是否較高
+            **數值含義：**
+            - 1 = 較高/較快/較重
+            - 0 = 較低/較慢/較輕
+            """)
+    st.markdown("---")
+    # 關於系統
+    with st.expander("ℹ️ 關於此系統"):
+        st.markdown("""
+        **McNemar 檢定分析系統**
+        本系統使用 McNemar 檢定來分析寶可夢對戰中，
+        勝方與敗方在各項特徵上是否有顯著差異。
+        **主要功能：**
+        - 🔬 統計顯著性檢定
+        - 📊 勝算比分析
+        - 📈 視覺化圖表
+        - 💬 AI 助手解釋
+        - 🎮 對戰策略建議
+        **適用場景：**
+        - 分析哪些特徵對勝負影響最大
+        - 理解特徵重要性排序
+        - 制定組隊策略
+        """)
+# 主要內容區 - 雙 Tab
+tab1, tab2 = st.tabs(["📊 McNemar 分析", "💬 AI 助手"])
+# Tab 1: McNemar 分析
+with tab1:
+    st.header("📊 McNemar 檢定分析")
+    # 載入資料
+    if data_source == "使用預設資料集":
+        # 檢查預設資料是否存在
+        default_data_path = "poke_mc_hong_2.csv"
+        if os.path.exists(default_data_path):
+            df = pd.read_csv(default_data_path)
+            st.success(f"✅ 已載入預設資料集（{len(df)} 筆對戰記錄）")
+        else:
+            st.warning("⚠️ 找不到預設資料集，請上傳您的資料")
+            df = None
+    else:
+        if uploaded_file is not None:
+            df = pd.read_csv(uploaded_file)
+            st.success(f"✅ 已載入資料（{len(df)} 筆對戰記錄）")
+        else:
+            df = None
+            st.info("📁 請在左側上傳 CSV 檔案")
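+    if df is not None:
+        # Create the analyzer once per session
+        if st.session_state.analyzer is None:
+            st.session_state.analyzer = McNemarAnalyzer(st.session_state.session_id)
+        # Reload on every rerun so a changed data source or new upload takes effect
+        st.session_state.analyzer.load_data(df)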
+        # 取得可用特徵
+        available_features = st.session_state.analyzer.get_available_features()
+        if not available_features:
+            st.error("❌ 資料中找不到有效的特徵欄位（需要 cs_* 和 cn_* 格式）")
+        else:
+            # 參數設定區
+            st.subheader("🎯 選擇分析特徵")
+            col1, col2 = st.columns([3, 1])
+            with col1:
+                selected_feature = st.selectbox(
+                    "選擇要分析的特徵：",
+                    options=available_features,
+                    format_func=lambda x: McNemarAnalyzer.FEATURE_LABELS.get(x, x),
+                    help="選擇一個特徵來比較勝方與敗方的差異"
+                )
+            with col2:
+                st.markdown("<br>", unsafe_allow_html=True)
+                analyze_button = st.button("🔬 開始分析", type="primary", use_container_width=True)
+            # 執行分析
+            if analyze_button:
+                with st.spinner("分析中..."):
+                    try:
+                        results = st.session_state.analyzer.run_analysis(selected_feature)
+                        st.session_state.analysis_results = results
+                        st.success("✅ 分析完成！")
+                    except Exception as e:
+                        st.error(f"❌ 分析失敗: {str(e)}")
+            # 顯示結果
+            if st.session_state.analysis_results:
+                results = st.session_state.analysis_results
+                st.markdown("---")
+                st.subheader(f"📈 分析結果：{results['feature_label']}")
+                # 建立結果 tabs
+                result_tabs = st.tabs([
+                    "📊 概覽",
+                    "📉 列聯表",
+                    "🎯 勝算比",
+                    "📋 詳細報告"
+                ])
+                # Tab: 概覽
+                with result_tabs[0]:
+                    # 關鍵指標
+                    metric_cols = st.columns(4)
+                    # 顯著性指示
+                    sig_emoji = "✅" if results['interpretation']['is_significant'] else "⚠️"
+                    metric_cols[0].metric(
+                        "統計顯著性",
+                        results['interpretation']['significance'].split('(')[0].strip(),
+                        sig_emoji
+                    )
+                    metric_cols[1].metric(
+                        "p 值",
+                        f"{results['p_value']:.4f}",
+                        "顯著" if results['p_value'] < 0.05 else "不顯著"
+                    )
+                    metric_cols[2].metric(
+                        "勝算比 (OR)",
+                        f"{results['odds_ratio']:.3f}",
+                        results['interpretation']['effect_size'].split('(')[0].strip()
+                    )
+                    metric_cols[3].metric(
+                        "不一致配對數",
+                        results['discordant_n'],
+                        f"b={results['discordant_b']}, c={results['discordant_c']}"
+                    )
+                    st.markdown("---")
+                    # 摘要表格
+                    st.markdown("### 📝 結果摘要")
+                    summary_df = create_results_summary_table(results)
+                    st.dataframe(summary_df, use_container_width=True, hide_index=True)
+                    st.markdown("---")
+                    # 視覺化：p 值顯著性
+                    st.markdown("### 🎨 顯著性水準視覺化")
+                    p_value_fig = plot_p_value_significance(results['p_value'])
+                    st.plotly_chart(p_value_fig, use_container_width=True)
+                # Tab: 列聯表
+                with result_tabs[1]:
+                    st.markdown("### 📊 配對列聯表")
+                    col1, col2 = st.columns([1, 1])
+                    with col1:
+                        st.dataframe(
+                            results['contingency_table_labeled'],
+                            use_container_width=True
+                        )
+                    with col2:
+                        heatmap_fig = plot_contingency_table_heatmap(
+                            results['contingency_table_labeled'],
+                            results['feature_label']
+                        )
+                        st.plotly_chart(heatmap_fig, use_container_width=True)
+                    st.markdown("---")
+                    # 不一致配對分析
+                    st.markdown("### 🔍 不一致配對分析")
+                    st.info(f"""
+                    **不一致配對** 是 McNemar 檢定的關鍵：
+                    - **b = {results['discordant_b']}**: 勝方{results['label_pos'].split()[1]}且敗方{results['label_neg'].split()[1]}的配對數
+                    - **c = {results['discordant_c']}**: 勝方{results['label_neg'].split()[1]}且敗方{results['label_pos'].split()[1]}的配對數
+                    McNemar 檢定只使用這些不一致的配對來判斷是否有顯著差異。
+                    """)
+                    discordant_fig = plot_discordant_pairs(
+                        results['discordant_b'],
+                        results['discordant_c'],
+                        results['label_pos'],
+                        results['label_neg']
+                    )
+                    st.plotly_chart(discordant_fig, use_container_width=True)
+                # Tab: 勝算比
+                with result_tabs[2]:
+                    st.markdown("### 🎯 勝算比 (Odds Ratio) 分析")
+                    or_fig = plot_odds_ratio_forest(
+                        results['odds_ratio'],
+                        results['ci_low'],
+                        results['ci_high'],
+                        results['feature_label']
+                    )
+                    st.plotly_chart(or_fig, use_container_width=True)
+                    st.markdown("---")
+                    # 解釋勝算比
+                    st.markdown("### 📖 勝算比解釋")
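+                    if results['odds_ratio'] > 1:
+                        interpretation = f"""
+                        The odds ratio is **{results['odds_ratio']:.3f}**, which means:
+                        - Among discordant pairs, the winner is the higher one on **{results['feature_label']}** about **{results['odds_ratio']:.2f} times** as often as the loser
+                        - 95% confidence interval: [{results['ci_low']:.3f}, {results['ci_high']:.3f}]
+                        - Effect size: {results['interpretation']['effect_size']}
+                        **Conclusion**: Pokémon with higher {results['feature_label']} tend to win more often.
+                        """
+                    elif results['odds_ratio'] < 1:
+                        interpretation = f"""
+                        The odds ratio is **{results['odds_ratio']:.3f}**, which means:
+                        - Among discordant pairs, the loser is the higher one on **{results['feature_label']}** about **{1/results['odds_ratio']:.2f} times** as often as the winner
+                        - 95% confidence interval: [{results['ci_low']:.3f}, {results['ci_high']:.3f}]
+                        - Effect size: {results['interpretation']['effect_size']}
+                        **Conclusion**: Pokémon with lower {results['feature_label']} tend to win more often (an uncommon pattern).
+                        """
+                    else:
+                        interpretation = f"""
+                        The odds ratio is **1.0**, which means:
+                        - Winners and losers do not differ on **{results['feature_label']}** among discordant pairs
+                        - This feature shows no association with the battle outcome
+                        """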
+                    st.markdown(interpretation)
+                # Tab: 詳細報告
+                with result_tabs[3]:
+                    st.markdown("### 📋 完整分析報告")
+                    # 生成文字報告
+                    text_report = export_results_to_text(results)
+                    st.text_area(
+                        "報告內容",
+                        text_report,
+                        height=400
+                    )
+                    # 下載按鈕
+                    st.download_button(
+                        label="📥 下載完整報告 (.txt)",
+                        data=text_report,
+                        file_name=f"mcnemar_report_{results['feature_name']}_{results['timestamp'][:10]}.txt",
+                        mime="text/plain"
+                    )
+# Tab 2: AI 助手
+with tab2:
+    st.header("💬 AI 分析助手")
+    if not st.session_state.get('api_key'):
+        st.warning("⚠️ 請在左側輸入您的 OpenAI API Key 以使用 AI 助手")
+    elif st.session_state.analysis_results is None:
+        st.info("ℹ️ 請先在「McNemar 分析」頁面執行分析")
+    else:
+        # 初始化 LLM 助手
+        if 'llm_assistant' not in st.session_state:
+            st.session_state.llm_assistant = McNemarLLMAssistant(
+                api_key=st.session_state.api_key,
+                session_id=st.session_state.session_id
+            )
+        # 聊天容器
+        chat_container = st.container()
+        with chat_container:
+            for message in st.session_state.chat_history:
+                with st.chat_message(message["role"]):
+                    st.markdown(message["content"])
+        # 使用者輸入
+        if prompt := st.chat_input("詢問關於分析結果的任何問題..."):
+            # 添加使用者訊息
+            st.session_state.chat_history.append({
+                "role": "user",
+                "content": prompt
+            })
+            with st.chat_message("user"):
+                st.markdown(prompt)
+            # AI 回應
+            with st.chat_message("assistant"):
+                with st.spinner("思考中..."):
+                    try:
+                        response = st.session_state.llm_assistant.get_response(
+                            user_message=prompt,
+                            analysis_results=st.session_state.analysis_results
+                        )
+                        st.markdown(response)
+                    except Exception as e:
+                        error_msg = f"❌ 錯誤: {str(e)}\n\n請檢查 API key 或重新表達問題。"
+                        st.error(error_msg)
+                        response = error_msg
+            # 添加助手回應
+            st.session_state.chat_history.append({
+                "role": "assistant",
+                "content": response
+            })
+        st.markdown("---")
+        # 快速問題按鈕
+        st.subheader("💡 快速問題")
+        quick_questions = [
+            "📊 給我這次分析的總結",
+            "🎯 解釋 p 值的意義",
+            "🔍 解釋勝算比",
+            "⚔️ 這對對戰策略有什麼啟示？",
+            "❓ 什麼是 McNemar 檢定？"
+        ]
+        cols = st.columns(len(quick_questions))
+        for idx, (col, question) in enumerate(zip(cols, quick_questions)):
+            if col.button(question, key=f"quick_{idx}"):
+                # 根據問題選擇對應的方法
+                if "總結" in question:
+                    response = st.session_state.llm_assistant.generate_summary(
+                        st.session_state.analysis_results
+                    )
+                elif "p 值" in question:
+                    response = st.session_state.llm_assistant.explain_metric(
+                        'p_value',
+                        st.session_state.analysis_results
+                    )
+                elif "勝算比" in question:
+                    response = st.session_state.llm_assistant.explain_metric(
+                        'odds_ratio',
+                        st.session_state.analysis_results
+                    )
+                elif "策略" in question:
+                    response = st.session_state.llm_assistant.battle_strategy_advice(
+                        st.session_state.analysis_results
+                    )
+                elif "McNemar" in question:
+                    response = st.session_state.llm_assistant.explain_mcnemar_test()
+                else:
+                    response = st.session_state.llm_assistant.get_response(
+                        question,
+                        st.session_state.analysis_results
+                    )
+                st.session_state.chat_history.append({
+                    "role": "user",
+                    "content": question
+                })
+                st.session_state.chat_history.append({
+                    "role": "assistant",
+                    "content": response
+                })
+                st.rerun()
+        # 重置對話按鈕
+        st.markdown("---")
+        if st.button("🔄 重置對話"):
+            st.session_state.llm_assistant.reset_conversation()
+            st.session_state.chat_history = []
+            st.success("✅ 對話已重置")
+            st.rerun()
+# Footer
+st.markdown("---")
+st.markdown(
+    f"""
+    <div style='text-align: center'>
+        <p>⚔️ McNemar Test Analysis System for Pokémon Battles | Built with Streamlit & OpenAI</p>
+        <p>Session ID: {st.session_state.session_id[:8]} | Powered by GPT-4o-mini</p>
+    </div>
+    """,
+    unsafe_allow_html=True
+)
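The one-hour expiry rule in `cleanup_old_sessions` can be separated from the Streamlit and class state so that it is easy to test. The following is a minimal sketch under that assumption; `expired_session_ids` and its parameters are hypothetical names, and the only thing taken from the commit is the `timestamp` field stored as an ISO 8601 string.

```python
from datetime import datetime, timedelta


def expired_session_ids(results_by_session, now, max_age=timedelta(hours=1)):
    """Return the ids of sessions whose stored result is older than max_age."""
    expired = []
    for session_id, result in results_by_session.items():
        # Each result carries the ISO 8601 timestamp written at analysis time
        created = datetime.fromisoformat(result["timestamp"])
        if now - created > max_age:
            expired.append(session_id)
    return expired


store = {
    "a": {"timestamp": "2025-01-08T10:30:00"},
    "b": {"timestamp": "2025-01-08T11:30:00"},
}
print(expired_session_ids(store, now=datetime(2025, 1, 8, 12, 0, 0)))  # ['a']
```

Passing `now` in as an argument, rather than calling `datetime.now()` inside the function, keeps the rule deterministic under test.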

mcnemar_core.py ADDED

	@@ -0,0 +1,241 @@

+import pandas as pd
+import numpy as np
+import re
+from statsmodels.stats.contingency_tables import mcnemar
+from datetime import datetime
+import threading
+class McNemarAnalyzer:
+    """
+    McNemar 檢定分析器
+    支援多用戶同時使用，每個 session 獨立處理
+    """
+    # 類別級的鎖，用於執行緒安全
+    _lock = threading.Lock()
+    # 儲存各 session 的分析結果
+    _session_results = {}
+    # 特徵標籤對應
+    FEATURE_LABELS = {
+        'HP': 'HP（血量）',
+        'Attack': '攻擊',
+        'Defense': '防禦',
+        'SpAtk': '特攻',
+        'SpDef': '特防',
+        'Speed': '速度',
+        'height': '身高',
+        'weight': '體重',
+        'base_experience': '基礎經驗值',
+    }
+    # 數值描述文字
+    DEFAULT_VALUE_TEXT = {1: '較高', 0: '較低'}
+    FEATURE_VALUE_TEXT = {
+        'HP': DEFAULT_VALUE_TEXT,
+        'Attack': DEFAULT_VALUE_TEXT,
+        'Defense': DEFAULT_VALUE_TEXT,
+        'SpAtk': DEFAULT_VALUE_TEXT,
+        'SpDef': DEFAULT_VALUE_TEXT,
+        'Speed': {1: '較快', 0: '較慢'},
+        'height': {1: '較高', 0: '較矮'},
+        'weight': {1: '較重', 0: '較輕'},
+        'base_experience': DEFAULT_VALUE_TEXT,
+    }
+    def __init__(self, session_id):
+        """
+        初始化分析器
+        Args:
+            session_id: 唯一的 session 識別碼
+        """
+        self.session_id = session_id
+        self.df = None
+    def load_data(self, csv_path_or_df):
+        """
+        載入資料
+        Args:
+            csv_path_or_df: CSV 檔案路徑或 DataFrame
+        """
+        if isinstance(csv_path_or_df, str):
+            self.df = pd.read_csv(csv_path_or_df)
+        else:
+            self.df = csv_path_or_df.copy()
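+    def get_available_features(self):
+        """
+        Return the features that have both a cs_ and a cn_ column.
+        Returns:
+            list: feature names with the cs_ prefix removed
+        """
+        if self.df is None:
+            return []
+        columns = set(self.df.columns)
+        # Strip only the leading "cs_" (str.replace would also remove later occurrences)
+        candidates = [col[len('cs_'):] for col in self.df.columns if col.startswith('cs_')]
+        # Keep a feature only when the matching losing-side column exists
+        features = [name for name in candidates if f'cn_{name}' in columns]
+        return features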
+    def run_analysis(self, feature_name):
+        """
+        執行 McNemar 檢定分析
+        Args:
+            feature_name: 特徵名稱（例如 'HP', 'Attack'）
+        Returns:
+            dict: 包含所有分析結果的字典
+        """
+        with self._lock:
+            try:
+                if self.df is None:
+                    raise ValueError("請先載入資料")
+                # 1. 準備資料
+                feature_a = f"cs_{feature_name}"  # Winner
+                feature_b = f"cn_{feature_name}"  # Loser
+                if feature_a not in self.df.columns or feature_b not in self.df.columns:
+                    raise ValueError(f"找不到特徵 {feature_name} 的欄位")
+                var_a = self.df[feature_a]
+                var_b = self.df[feature_b]
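+                # 2. Build the contingency table (category 1 listed before 0)
+                var_a = pd.Categorical(var_a, categories=[1, 0], ordered=True)
+                var_b = pd.Categorical(var_b, categories=[1, 0], ordered=True)
+                # crosstab can omit categories that never occur; reindex restores the full 2x2 layout
+                ct = pd.crosstab(var_a, var_b).reindex(index=[1, 0], columns=[1, 0], fill_value=0)
+                ctm = ct.copy()
+                # 3. Run the McNemar test. With exact=True statsmodels uses the binomial test,
+                #    ignores `correction`, and reports min(b, c) as the statistic
+                result = mcnemar(ctm, exact=True, correction=True)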
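+                # 4. Compute the odds ratio from the discordant cells
+                b = int(ctm.at[1, 0])  # winner high & loser low
+                c = int(ctm.at[0, 1])  # winner low & loser high
+                # Haldane-Anscombe correction: add 0.5 to both counts when either is zero (avoids division by zero)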
+                bh = b + 0.5 if b == 0 or c == 0 else float(b)
+                ch = c + 0.5 if b == 0 or c == 0 else float(c)
+                or_ratio = bh / ch
+                ln_or = np.log(or_ratio)
+                se = np.sqrt(1.0 / bh + 1.0 / ch)
+                z = 1.96
+                ci_low = float(np.exp(ln_or - z * se))
+                ci_high = float(np.exp(ln_or + z * se))
+                n_discordant = b + c
+                # 5. 準備標籤
+                feature_label = self.FEATURE_LABELS.get(feature_name, feature_name)
+                label_pos, label_neg = self._get_value_labels(feature_name)
+                # 6. 準備列聯表（含標籤）
+                ct_labeled = self._create_labeled_table(ct, feature_name)
+                # 7. 整理結果
+                results = {
+                    'feature_name': feature_name,
+                    'feature_label': feature_label,
+                    'contingency_table': ctm.to_dict(),
+                    'contingency_table_labeled': ct_labeled,
+                    'mcnemar_statistic': float(result.statistic),
+                    'p_value': float(result.pvalue),
+                    'odds_ratio': round(or_ratio, 3),
+                    'ci_low': round(ci_low, 3),
+                    'ci_high': round(ci_high, 3),
+                    'discordant_b': b,
+                    'discordant_c': c,
+                    'discordant_n': n_discordant,
+                    'label_pos': label_pos,
+                    'label_neg': label_neg,
+                    'interpretation': self._interpret_results(result.pvalue, or_ratio),
+                    'timestamp': datetime.now().isoformat()
+                }
+                # 儲存到 session results
+                self._session_results[self.session_id] = results
+                return results
+            except Exception as e:
+                # Chain the original exception to keep its traceback; the caller in app.py adds the "分析失敗" prefix
+                raise RuntimeError(str(e)) from e
+    def _create_labeled_table(self, ct, feature_name):
+        """建立帶標籤的列聯表"""
+        feature_label = self.FEATURE_LABELS.get(feature_name, feature_name)
+        mapping = self.FEATURE_VALUE_TEXT.get(feature_name, self.DEFAULT_VALUE_TEXT)
+        # 創建標籤
+        row_labels = [f"勝方 {mapping.get(1, '較高')}", f"勝方 {mapping.get(0, '較低')}"]
+        col_labels = [f"敗方 {mapping.get(1, '較高')}", f"敗方 {mapping.get(0, '較低')}"]
+        # 重建 DataFrame
+        labeled_table = pd.DataFrame(
+            ct.values,
+            index=row_labels,
+            columns=col_labels
+        )
+        # 添加總計
+        labeled_table['總數'] = labeled_table.sum(axis=1)
+        labeled_table.loc['總數'] = labeled_table.sum()
+        return labeled_table
+    def _get_value_labels(self, feature_name):
+        """取得特徵的數值標籤"""
+        name = self.FEATURE_LABELS.get(feature_name, feature_name)
+        m = self.FEATURE_VALUE_TEXT.get(feature_name, self.DEFAULT_VALUE_TEXT)
+        pos = f"{name} {m[1]}"  # 1
+        neg = f"{name} {m[0]}"  # 0
+        return pos, neg
+    def _interpret_results(self, p_value, odds_ratio):
+        """解釋分析結果"""
+        # 顯著性判斷
+        if p_value < 0.001:
+            significance = "極顯著 (p < 0.001)"
+        elif p_value < 0.01:
+            significance = "非常顯著 (p < 0.01)"
+        elif p_value < 0.05:
+            significance = "顯著 (p < 0.05)"
+        else:
+            significance = "不顯著 (p ≥ 0.05)"
+        # 效果大小判斷
+        if odds_ratio > 2:
+            effect_size = "大效果 (OR > 2)"
+        elif odds_ratio > 1.5:
+            effect_size = "中等效果 (OR > 1.5)"
+        elif odds_ratio > 1:
+            effect_size = "小效果 (OR > 1)"
+        elif odds_ratio == 1:
+            effect_size = "無差異 (OR = 1)"
+        else:
+            effect_size = "反向效果 (OR < 1)"
+        return {
+            'significance': significance,
+            'effect_size': effect_size,
+            'is_significant': p_value < 0.05
+        }
+    @classmethod
+    def get_session_results(cls, session_id):
+        """獲取特定 session 的結果"""
+        return cls._session_results.get(session_id)
+    @classmethod
+    def clear_session_results(cls, session_id):
+        """清除特定 session 的結果"""
+        if session_id in cls._session_results:
+            del cls._session_results[session_id]
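Step 4 of `run_analysis` (odds ratio, Haldane-Anscombe correction, and the 95% confidence interval on the log scale) can be read in isolation as the sketch below. It follows the same arithmetic as the code above but uses only the standard library; the function name `paired_odds_ratio` is illustrative and not part of this commit.

```python
import math


def paired_odds_ratio(b: int, c: int, z: float = 1.96):
    """Return (odds_ratio, ci_low, ci_high) from the discordant counts b and c."""
    # Haldane-Anscombe correction: add 0.5 to both counts only when one is zero
    if b == 0 or c == 0:
        bh, ch = b + 0.5, c + 0.5
    else:
        bh, ch = float(b), float(c)
    odds_ratio = bh / ch
    # Standard error of ln(OR) for paired data
    se = math.sqrt(1.0 / bh + 1.0 / ch)
    ci_low = math.exp(math.log(odds_ratio) - z * se)
    ci_high = math.exp(math.log(odds_ratio) + z * se)
    return odds_ratio, ci_low, ci_high


odds_ratio, ci_low, ci_high = paired_odds_ratio(30, 10)
print(round(odds_ratio, 3), round(ci_low, 3), round(ci_high, 3))
```

Because the interval is symmetric on the log scale, `ci_low * ci_high` equals the squared odds ratio, which is a quick sanity check on any reported interval.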

mcnemar_llm_assistant.py ADDED

	@@ -0,0 +1,251 @@

+from openai import OpenAI
+import json
+class McNemarLLMAssistant:
+    """
+    McNemar 檢定 LLM 問答助手
+    協助用戶理解 McNemar 檢定分析結果
+    """
+    def __init__(self, api_key, session_id):
+        """
+        初始化 LLM 助手
+        Args:
+            api_key: OpenAI API key
+            session_id: 唯一的 session 識別碼
+        """
+        self.client = OpenAI(api_key=api_key)
+        self.session_id = session_id
+        self.conversation_history = []
+        # 系統提示詞
+        self.system_prompt = """You are an expert statistician specializing in McNemar's test and paired categorical data analysis, particularly in the context of Pokémon battle statistics.
+Your role is to help users understand their McNemar's test results for Pokémon battles, where we compare whether winning and losing Pokémon differ on specific features (HP, Attack, Speed, etc.).
+You should:
+1. Explain McNemar's test concepts in simple, accessible terms
+2. Interpret statistical significance (p-values) clearly
+3. Explain odds ratios and confidence intervals in context
+4. Help users understand what discordant pairs mean
+5. Discuss the practical significance of results for Pokémon battles
+6. Provide insights about which features matter most for winning
+7. Suggest battle strategies based on the statistical findings
+8. Clarify limitations and assumptions of McNemar's test
+Key concepts to explain when relevant:
+- **McNemar's test**: Tests if proportions differ between paired binary data
+- **p-value**: Probability of seeing results at least this extreme if winners and losers truly did not differ (< 0.05 is conventionally treated as significant)
+- **Odds Ratio**: How much more likely winners are to have higher values than losers
+- **Discordant pairs**: Cases where winner and loser differ on the feature
+- **Concordant pairs**: Cases where both have same value (not used in test)
+When discussing Pokémon battles:
+- Connect statistical findings to battle mechanics
+- Explain why certain stats matter more (e.g., Speed determines who attacks first)
+- Discuss type advantages and battle strategies
+- Use Pokémon-specific terminology naturally
+Always be clear, educational, and engaging. Use examples when helpful.
+Format responses with proper markdown for better readability."""
+    def get_response(self, user_message, analysis_results=None):
+        """
+        獲取 AI 回應
+        Args:
+            user_message: 用戶訊息
+            analysis_results: 分析結果字典（可選）
+        Returns:
+            str: AI 回應
+        """
+        # 準備上下文資訊
+        context = ""
+        if analysis_results:
+            context = self._prepare_context(analysis_results)
+        # 添加用戶訊息到歷史
+        self.conversation_history.append({
+            "role": "user",
+            "content": user_message
+        })
+        # 構建訊息列表
+        messages = [
+            {"role": "system", "content": self.system_prompt}
+        ]
+        if context:
+            messages.append({"role": "system", "content": f"Current Analysis Context:\n{context}"})
+        # 加入對話歷史
+        messages.extend(self.conversation_history)
+        try:
+            # 調用 OpenAI API
+            response = self.client.chat.completions.create(
+                model="gpt-4o-mini",
+                messages=messages,
+                temperature=0.7,
+                max_tokens=1500
+            )
+            assistant_message = response.choices[0].message.content
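+            # Append the assistant reply to the history
+            self.conversation_history.append({
+                "role": "assistant",
+                "content": assistant_message
+            })
+            return assistant_message
+        except Exception as e:
+            # Drop the unanswered user message so a retry does not send it twice
+            if self.conversation_history and self.conversation_history[-1]["role"] == "user":
+                self.conversation_history.pop()
+            return f"❌ Error: {str(e)}\n\nPlease check your API key and try again."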
+    def _prepare_context(self, results):
+        """Prepare context text describing the analysis results."""
+        if not results:
+            return "No analysis results available yet."
+        context = f"""
+## Current McNemar Test Analysis
+### Feature Analyzed
+- Feature: {results['feature_label']} ({results['feature_name']})
+- Positive label (1): {results['label_pos']}
+- Negative label (0): {results['label_neg']}
+### Contingency Table
+```
+                    Loser High      Loser Low       Total
+Winner High         {results['contingency_table'].get(1, {}).get(1, 0):<15} {results['contingency_table'].get(1, {}).get(0, 0):<15} {results['contingency_table'].get(1, {}).get(1, 0) + results['contingency_table'].get(1, {}).get(0, 0)}
+Winner Low          {results['contingency_table'].get(0, {}).get(1, 0):<15} {results['contingency_table'].get(0, {}).get(0, 0):<15} {results['contingency_table'].get(0, {}).get(1, 0) + results['contingency_table'].get(0, {}).get(0, 0)}
+```
+### Statistical Test Results
+- **McNemar Statistic**: {results['mcnemar_statistic']:.4f}
+- **p-value**: {results['p_value']:.4f}
+- **Significance**: {results['interpretation']['significance']}
+- **Is Significant?**: {'Yes' if results['interpretation']['is_significant'] else 'No'}
+### Odds Ratio Analysis
+- **Odds Ratio**: {results['odds_ratio']:.3f}
+- **95% Confidence Interval**: [{results['ci_low']:.3f}, {results['ci_high']:.3f}]
+- **Effect Size**: {results['interpretation']['effect_size']}
+### Discordant Pairs (key for McNemar test)
+- **Winner High & Loser Low (b)**: {results['discordant_b']} pairs
+- **Winner Low & Loser High (c)**: {results['discordant_c']} pairs
+- **Total Discordant Pairs**: {results['discordant_n']} pairs
+### Interpretation
+{
+    f"The results show that winners and losers DIFFER SIGNIFICANTLY on {results['feature_label']}."
+    if results['interpretation']['is_significant']
+    else f"The results show NO SIGNIFICANT DIFFERENCE between winners and losers on {results['feature_label']}."
+}
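+{
+    f"Winners are {results['odds_ratio']:.2f} times more likely to have higher {results['feature_label']} than losers."
+    if results['odds_ratio'] > 1
+    else f"Losers are {1/results['odds_ratio']:.2f} times more likely to have higher {results['feature_label']} than winners."
+    if results['odds_ratio'] > 0
+    else f"No finite ratio can be stated for {results['feature_label']} because the odds ratio is {results['odds_ratio']:.2f}."
+}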
+"""
+        return context
+    def generate_summary(self, analysis_results):
+        """Automatically generate a summary of the analysis results."""
+        summary_prompt = """Based on the McNemar test results provided, please generate a comprehensive summary that includes:
+1. **What was tested**: Briefly explain what feature was analyzed and what the test measures
+2. **Statistical Findings**:
+   - Is the result statistically significant?
+   - What does the p-value tell us?
+   - What does the odds ratio mean in practical terms?
+3. **Battle Implications**: What does this mean for Pokémon battles?
+4. **Key Insights**: The most important takeaway from these results
+5. **Recommendations**: How trainers could use this information
+Format the summary in clear markdown with appropriate sections."""
+        return self.get_response(summary_prompt, analysis_results)
+    def explain_metric(self, metric_name, analysis_results):
+        """Explain a specific metric."""
+        metric_explanations = {
+            'mcnemar_statistic': 'McNemar Statistic',
+            'p_value': 'p-value',
+            'odds_ratio': 'Odds Ratio',
+            'confidence_interval': '95% Confidence Interval',
+            'discordant_pairs': 'Discordant Pairs'
+        }
+        metric_display = metric_explanations.get(metric_name, metric_name)
+        explain_prompt = f"""Please explain the following metric in the context of this McNemar test analysis:
+Metric: {metric_display}
+Include:
+1. What this metric measures in general
+2. The value obtained in this analysis
+3. How to interpret this value for Pokémon battles
+4. What it tells us about the importance of this feature
+5. Any limitations or caveats to consider"""
+        return self.get_response(explain_prompt, analysis_results)
+    def compare_features(self):
+        """Suggest how to compare different features."""
+        compare_prompt = """I'd like to understand how different Pokémon features (HP, Attack, Defense, Speed, etc.) compare in terms of their importance for winning battles.
+Could you:
+1. Explain which features are typically most important in Pokémon battles
+2. Discuss how McNemar's test helps us identify important features
+3. Suggest which features I should analyze next
+4. Explain how features might interact (e.g., Speed + Attack)"""
+        return self.get_response(compare_prompt, None)
+    def explain_mcnemar_test(self):
+        """Explain the basic concepts of McNemar's test."""
+        explain_prompt = """Please explain McNemar's test in simple terms, specifically in the context of Pokémon battle analysis.
+Cover:
+1. What McNemar's test is and when to use it
+2. Why it's appropriate for comparing winner vs. loser features
+3. What "paired data" means in this context
+4. The difference between discordant and concordant pairs
+5. How to interpret the results
+Use Pokémon examples to make it concrete and easy to understand."""
+        return self.get_response(explain_prompt, None)
+    def battle_strategy_advice(self, analysis_results):
+        """Provide battle strategy advice."""
+        strategy_prompt = f"""Based on the McNemar test results for {analysis_results['feature_label']}, please provide practical battle strategy advice for Pokémon trainers.
+Consider:
+1. Should trainers prioritize this feature when building teams?
+2. How important is this feature compared to others?
+3. Are there specific Pokémon types or strategies that benefit most?
+4. What are the implications for competitive play?
+5. Any exceptions or special cases to be aware of?
+Be specific and actionable."""
+        return self.get_response(strategy_prompt, analysis_results)
+    def reset_conversation(self):
+        """Reset the conversation history."""
+        self.conversation_history = []
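
The context built by `_prepare_context` reports a McNemar statistic, a p-value, an odds ratio and a 95% confidence interval, all of which depend only on the discordant counts `b` and `c`. The following is a minimal sketch of one common way to obtain those numbers, assuming the chi-square form without continuity correction and a Wald interval on the log odds ratio. The function name `mcnemar_summary` is hypothetical, and `mcnemar_core.py` (not shown in this part of the diff) may compute them differently, for example with an exact test or a correction.

```python
import math


def mcnemar_summary(b: int, c: int, z: float = 1.96) -> dict:
    """Summarise paired binary data from its two discordant counts.

    b: pairs where the winner is high and the loser is low
    c: pairs where the winner is low and the loser is high
    z: normal quantile for the interval (1.96 gives roughly 95%)
    """
    if b <= 0 or c <= 0:
        # Assumption: this sketch applies no zero-cell correction.
        raise ValueError("both discordant counts must be positive")

    # Chi-square statistic without continuity correction (1 degree of freedom).
    statistic = (b - c) ** 2 / (b + c)
    # Survival function of a chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2))

    # Odds ratio from discordant pairs and a Wald interval on the log scale.
    odds_ratio = b / c
    se_log_or = math.sqrt(1 / b + 1 / c)
    ci_low = math.exp(math.log(odds_ratio) - z * se_log_or)
    ci_high = math.exp(math.log(odds_ratio) + z * se_log_or)

    return {
        "mcnemar_statistic": statistic,
        "p_value": p_value,
        "odds_ratio": odds_ratio,
        "ci_low": ci_low,
        "ci_high": ci_high,
        "discordant_b": b,
        "discordant_c": c,
        "discordant_n": b + c,
    }


if __name__ == "__main__":
    print(mcnemar_summary(b=30, c=10))
```

The returned keys mirror the ones read from `results` in the code above, so the sketch can double as a reference for the shape of that dictionary. Concordant pairs never enter the calculation, which is why the prompt describes them as unused.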

mcnemar_utils.py ADDED

	@@ -0,0 +1,338 @@

+import plotly.graph_objects as go
+import plotly.express as px
+import pandas as pd
+import numpy as np
+def plot_contingency_table_heatmap(ct_labeled, feature_label, title="列聯表熱力圖"):
+    """
+    Plot the contingency table as a heatmap.
+    Args:
+        ct_labeled: Labeled contingency table DataFrame
+        feature_label: Feature label
+        title: Chart title
+    Returns:
+        plotly figure
+    """
+    # Drop the totals row and column
+    ct_display = ct_labeled.iloc[:-1, :-1].copy()
+    # Build the annotation text
+    annotations = []
+    for i, row in enumerate(ct_display.index):
+        for j, col in enumerate(ct_display.columns):
+            annotations.append(
+                dict(
+                    x=j,
+                    y=i,
+                    text=str(ct_display.iloc[i, j]),
+                    font=dict(size=16, color='white' if ct_display.iloc[i, j] > ct_display.values.max()/2 else 'black'),
+                    showarrow=False
+                )
+            )
+    fig = go.Figure(data=go.Heatmap(
+        z=ct_display.values,
+        x=ct_display.columns,
+        y=ct_display.index,
+        colorscale='Blues',
+        showscale=True,
+        hoverongaps=False,
+        hovertemplate='%{y}<br>%{x}<br>配對數: %{z}<extra></extra>'
+    ))
+    fig.update_layout(
+        title=f'{title}<br><sub>{feature_label}</sub>',
+        xaxis_title='敗方 (Loser)',
+        yaxis_title='勝方 (Winner)',
+        width=600,
+        height=500,
+        template='plotly_white',
+        annotations=annotations
+    )
+    return fig
+def plot_odds_ratio_forest(or_value, ci_low, ci_high, feature_label):
+    """
+    Plot a forest plot of the odds ratio.
+    Args:
+        or_value: Odds ratio
+        ci_low: Lower bound of the 95% confidence interval
+        ci_high: Upper bound of the 95% confidence interval
+        feature_label: Feature label
+    Returns:
+        plotly figure
+    """
+    fig = go.Figure()
+    # Reference line (OR = 1)
+    fig.add_shape(
+        type="line",
+        x0=1, x1=1,
+        y0=-0.5, y1=0.5,
+        line=dict(color="red", width=2, dash="dash"),
+    )
+    # Confidence interval
+    fig.add_trace(go.Scatter(
+        x=[ci_low, ci_high],
+        y=[0, 0],
+        mode='lines',
+        line=dict(color='#2d6ca2', width=3),
+        showlegend=False,
+        hovertemplate='95% CI: [%{x:.3f}]<extra></extra>'
+    ))
+    # Point estimate
+    fig.add_trace(go.Scatter(
+        x=[or_value],
+        y=[0],
+        mode='markers',
+        marker=dict(
+            size=15,
+            color='#d62728',
+            line=dict(color='white', width=2)
+        ),
+        showlegend=False,
+        hovertemplate=f'OR: {or_value:.3f}<extra></extra>'
+    ))
+    # Add the value label; annotation positions on a log axis are given in log10 units
+    fig.add_annotation(
+        x=np.log10(or_value),
+        y=0.15,
+        text=f"OR = {or_value:.3f}<br>95% CI [{ci_low:.3f}, {ci_high:.3f}]",
+        showarrow=False,
+        font=dict(size=12, color='#1b4f72'),
+        bgcolor='rgba(255,255,255,0.8)',
+        bordercolor='#2d6ca2',
+        borderwidth=1,
+        borderpad=4
+    )
+    fig.update_layout(
+        title=f'勝算比 (Odds Ratio)<br><sub>{feature_label}</sub>',
+        xaxis_title='Odds Ratio',
+        yaxis=dict(
+            showticklabels=False,
+            showgrid=False,
+            zeroline=False
+        ),
+        width=700,
+        height=300,
+        template='plotly_white',
+        xaxis=dict(type='log', showgrid=True),
+        hovermode='closest'
+    )
+    return fig
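+def plot_discordant_pairs(b, c, label_pos, label_neg):
+    """
+    Plot a bar chart comparing the two kinds of discordant pairs.
+    Args:
+        b: Number of pairs with cs=1 & cn=0
+        c: Number of pairs with cs=0 & cn=1
+        label_pos: Positive label
+        label_neg: Negative label
+    Returns:
+        plotly figure
+    """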
+    fig = go.Figure()
+    categories = [
+        f'勝方 {label_pos.split()[1]}<br>敗方 {label_neg.split()[1]}',
+        f'勝方 {label_neg.split()[1]}<br>敗方 {label_pos.split()[1]}'
+    ]
+    values = [b, c]
+    colors = ['#2d6ca2', '#d62728']
+    fig.add_trace(go.Bar(
+        x=categories,
+        y=values,
+        marker=dict(
+            color=colors,
+            line=dict(color='white', width=2)
+        ),
+        text=values,
+        textposition='outside',
+        textfont=dict(size=16, color='black'),
+        hovertemplate='%{x}<br>配對數: %{y}<extra></extra>'
+    ))
+    fig.update_layout(
+        title='不一致配對分布',
+        xaxis_title='配對類型',
+        yaxis_title='配對數量',
+        width=600,
+        height=400,
+        template='plotly_white',
+        showlegend=False
+    )
+    return fig
+def plot_p_value_significance(p_value):
+    """
+    Plot a p-value significance indicator.
+    Args:
+        p_value: p-value
+    Returns:
+        plotly figure
+    """
+    fig = go.Figure()
+    # Significance thresholds
+    thresholds = [0.001, 0.01, 0.05, 1.0]
+    labels = ['p < 0.001<br>(極顯著)', 'p < 0.01<br>(非常顯著)',
+              'p < 0.05<br>(顯著)', 'p ≥ 0.05<br>(不顯著)']
+    colors = ['#1a5f1a', '#2d8b2d', '#5cb85c', '#d9534f']
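+    # Find the interval containing the p-value; default to the last
+    # bucket so that p = 1.0 is shown as "not significant"
+    current_idx = len(thresholds) - 1
+    for i, thresh in enumerate(thresholds):
+        if p_value < thresh:
+            current_idx = i
+            break
+    # Draw the interval bars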
+    for i in range(len(thresholds)):
+        opacity = 1.0 if i == current_idx else 0.3
+        fig.add_trace(go.Bar(
+            x=[labels[i]],
+            y=[1],
+            marker=dict(color=colors[i], opacity=opacity),
+            showlegend=False,
+            hovertemplate=f'{labels[i]}<extra></extra>'
+        ))
+    # Add the p-value label
+    fig.add_annotation(
+        x=labels[current_idx],
+        y=1.1,
+        text=f"<b>p = {p_value:.4f}</b>",
+        showarrow=True,
+        arrowhead=2,
+        arrowsize=1,
+        arrowwidth=2,
+        arrowcolor='black',
+        # Bold via <b>: the font `weight` property is not accepted by older plotly releases such as the pinned 5.18.0
+        font=dict(size=14, color='black'),
+        bgcolor='yellow',
+        bordercolor='black',
+        borderwidth=2,
+        borderpad=4
+    )
+    fig.update_layout(
+        title='顯著性水準',
+        xaxis_title='',
+        yaxis_title='',
+        yaxis=dict(showticklabels=False, showgrid=False),
+        width=700,
+        height=300,
+        template='plotly_white',
+        showlegend=False
+    )
+    return fig
+def create_results_summary_table(results):
+    """
+    Build a summary table of the results.
+    Args:
+        results: Dictionary of analysis results
+    Returns:
+        pandas DataFrame
+    """
+    summary_data = {
+        '項目': [
+            '特徵',
+            'McNemar 統計量',
+            'p 值',
+            '顯著性',
+            '勝算比 (OR)',
+            '95% 信賴區間',
+            '不一致配對數',
+            '效果大小'
+        ],
+        '數值': [
+            results['feature_label'],
+            f"{results['mcnemar_statistic']:.4f}",
+            f"{results['p_value']:.4f}",
+            results['interpretation']['significance'],
+            f"{results['odds_ratio']:.3f}",
+            f"[{results['ci_low']:.3f}, {results['ci_high']:.3f}]",
+            f"{results['discordant_n']} (b={results['discordant_b']}, c={results['discordant_c']})",
+            results['interpretation']['effect_size']
+        ]
+    }
+    return pd.DataFrame(summary_data)
+def export_results_to_text(results):
+    """
+    Export the results as a plain-text report.
+    Args:
+        results: Dictionary of analysis results
+    Returns:
+        str: Formatted text report
+    """
+    report = f"""
+==============================================
+McNemar 檢定分析報告
+==============================================
+分析特徵: {results['feature_label']} ({results['feature_name']})
+分析時間: {results['timestamp']}
+----------------------------------------------
+1. 列聯表
+----------------------------------------------
+{results['contingency_table_labeled'].to_string()}
+----------------------------------------------
+2. McNemar 檢定結果
+----------------------------------------------
+McNemar 統計量: {results['mcnemar_statistic']:.4f}
+p 值: {results['p_value']:.4f}
+顯著性: {results['interpretation']['significance']}
+----------------------------------------------
+3. 勝算比分析
+----------------------------------------------
+勝算比 (OR): {results['odds_ratio']:.3f}
+95% 信賴區間: [{results['ci_low']:.3f}, {results['ci_high']:.3f}]
+效果大小: {results['interpretation']['effect_size']}
+----------------------------------------------
+4. 不一致配對
+----------------------------------------------
+勝方{results['label_pos'].split()[1]}且敗方{results['label_neg'].split()[1]} (b): {results['discordant_b']}
+勝方{results['label_neg'].split()[1]}且敗方{results['label_pos'].split()[1]} (c): {results['discordant_c']}
+總不一致配對數: {results['discordant_n']}
+----------------------------------------------
+5. 解釋
+----------------------------------------------
+{'結果顯示勝方和敗方在此特徵上有顯著差異。' if results['interpretation']['is_significant'] else '結果顯示勝方和敗方在此特徵上無顯著差異。'}
+勝算比為 {results['odds_ratio']:.3f}，表示{
+    '勝方在此特徵上較高的機率是敗方的 ' + str(round(results['odds_ratio'], 2)) + ' 倍。'
+    if results['odds_ratio'] > 1
+    else '敗方在此特徵上較高的機率是勝方的 ' + str(round(1/results['odds_ratio'], 2)) + ' 倍。'
+    if results['odds_ratio'] > 0
+    else '勝算比為 0 或無法計算，無法換算為倍數。'
+}
+==============================================
+"""
+    return report
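
`plot_p_value_significance` above maps a p-value onto one of four significance buckets. As a rough sketch of that lookup in isolation, the version below uses `bisect` so that every value in [0, 1], including exactly 1.0, lands in a bucket. The names `significance_bucket`, `CUTOFFS` and `BUCKET_LABELS` are hypothetical and are not part of `mcnemar_utils.py`.

```python
from bisect import bisect_right

# Upper cut-offs of the three "significant" buckets; anything at or above
# the last cut-off falls into the final "not significant" bucket.
CUTOFFS = [0.001, 0.01, 0.05]
BUCKET_LABELS = ["p < 0.001", "p < 0.01", "p < 0.05", "p >= 0.05"]


def significance_bucket(p_value: float) -> int:
    """Return the index of the bucket that contains p_value."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p_value must lie in [0, 1]")
    # bisect_right counts the cut-offs that are <= p_value, which is exactly
    # the index of the first bucket whose upper bound exceeds p_value.
    return bisect_right(CUTOFFS, p_value)


if __name__ == "__main__":
    for p in (0.0004, 0.03, 0.05, 1.0):
        print(p, BUCKET_LABELS[significance_bucket(p)])
```

Because the final bucket has no upper cut-off in the list, no special case is needed for p = 1.0, which an exact McNemar test can return when the two discordant counts are equal.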

requirements.txt ADDED

	@@ -0,0 +1,6 @@

+streamlit==1.31.0
+pandas==2.1.4
+numpy==1.26.3
+plotly==5.18.0
+statsmodels==0.14.1
+openai>=1.30.0