# NullAI プロジェクト完全理解ガイド

**最終更新**: 2025-12-02
**対象読者**: このプロジェクトを引き継ぐ全ての開発者
**目的**: プロジェクトの全体像を完全に理解し、設計思想を正しく継承する

---

## 📖 目次

1. [プロジェクト概要](#プロジェクト概要)
2. [4つの核心思想（こだわりポイント）](#4つの核心思想こだわりポイント)
3. [システムアーキテクチャ全体図](#システムアーキテクチャ全体図)
4. [各システムの詳細解説](#各システムの詳細解説)
5. [データフロー完全図解](#データフロー完全図解)
6. [技術スタック詳細](#技術スタック詳細)
7. [よくある誤解と注意点](#よくある誤解と注意点)
8. [設計判断の理由](#設計判断の理由)
9. [拡張時の考慮事項](#拡張時の考慮事項)

---

## プロジェクト概要

### NullAIとは何か

**NullAI**は、**自己進化型多ドメイン知識推論エンジン**です。

#### 核心的な問いと答え

**Q: 何を解決しようとしているのか？**
A: 「AIのハルシネーション（幻覚）」と「小型モデルの性能不足」の両方を同時に解決

**Q: どうやって解決するのか？**
A:
1. **DB優先推論（RAG）** → ハルシネーション削減
2. **師匠→弟子のファインチューニング** → 小型モデルの性能向上
3. **樹木型空間記憶** → 知識の意味的整理と高速検索
4. **自己拡充サイクル** → 知識ベースの自動成長

**Q: 他のRAGシステムとの違いは？**
A:
- ❌ 普通のRAG: ベクトルDBで検索するだけ
- ✅ NullAI: **6次元空間座標**で知識を配置し、意味的な近傍検索が可能

**Q: 他のファインチューニングシステムとの違いは？**
A:
- ❌ 普通のFT: 人間が訓練データを手動作成
- ✅ NullAI: **師匠AIが自動的に訓練データを生成** → 弟子が学習 → 弟子が師匠に昇格 → 無限サイクル

### プロジェクト名の由来

**Null** = ゼロ（ハルシネーション）
**AI** = Artificial Intelligence

→ **ゼロ・ハルシネーションを目指すAI**

---

## 4つの核心思想（こだわりポイント）

### 1️⃣ 倒木システム（Fallen Tree System）

#### 比喩の意味

森で大木（老いた木）が倒れると、その養分で新しい若木が育つ。NullAIでは：

- 🌲 **大木（師匠モデル）**: 高性能だが重いAI（例: DeepSeek R1 32B）
- 🌱 **若木（弟子モデル）**: 最初は空っぽだが軽量なAI（例: Phi-2 2.7B）
- 🍂 **養分（訓練データ）**: 師匠の高品質な出力（Alpaca形式JSONL）

#### システムの流れ

```
┌─────────────────────────────────────────────────────┐
│ Phase 1: 師匠の統治時代                             │
├─────────────────────────────────────────────────────┤
│ 師匠（DeepSeek R1）が推論を担当                     │
│  ↓                                                  │
│ 高品質な出力（confidence >= 0.8）が自動保存         │
│  ↓                                                  │
│ training_data/master_outputs/*.jsonl                │
└─────────────────────────────────────────────────────┘
                      ↓
┌─────────────────────────────────────────────────────┐
│ Phase 2: ファインチューニング                       │
├─────────────────────────────────────────────────────┤
│ 訓練データを使って弟子（Phi-2）を訓練              │
│  ↓                                                  │
│ 弟子の性能が向上（師匠の知識を吸収）               │
│  ↓                                                  │
│ training_data/checkpoints/apprentice_*/             │
└─────────────────────────────────────────────────────┘
                      ↓
┌─────────────────────────────────────────────────────┐
│ Phase 3: 世代交代（倒木）                          │
├─────────────────────────────────────────────────────┤
│ 弟子が十分成長 → 師匠に昇格                        │
│  ↓                                                  │
│ 旧師匠（DeepSeek）は引退（でも特別な役割あり）     │
│  ↓                                                  │
│ 新しい空の弟子を生成                               │
└─────────────────────────────────────────────────────┘
                      ↓
                  サイクル繰り返し
```

#### 重要な設計判断

**Q: なぜ師匠を完全に削除しないのか？**
A: 引退した師匠（DeepSeek）は「**永久的指導者**」として残る
- DB拡充時のプロンプト生成
- 新しいドメインの初期知識生成
- 品質チェック

**Q: 弟子はいつ師匠になれるのか？**
A:
- ファインチューニング完了後、手動で昇格
- 将来的には自動評価で昇格判定（未実装）

**Q: 複数の弟子を同時に訓練できるのか？**
A: できる。ドメイン別に異なる弟子を訓練可能
- 医療ドメイン弟子
- 法律ドメイン弟子
- 一般知識弟子

### 2️⃣ DB分離構造（Database Separation Structure）

#### 設計思想

```
質問が来た時の判断フロー：

質問 → まず知識DBを検索
         ├─ 見つかった → DB知識を使って推論（RAG）✅ 信頼性高
         └─ 見つからない → AI内部知識で推論 ⚠️ ハルシネーションリスク
                            ↓
                        その出力をDBに保存（自己拡充）
```

#### DB優先の理由

| 知識ソース | 信頼性 | 根拠 | ハルシネーション |
|-----------|--------|------|-----------------|
| 知識DB（.iath） | ⭐⭐⭐⭐⭐ | 人間が検証 or 専門家が作成 | ほぼゼロ |
| AI生成知識 | ⭐⭐⭐ | AIの内部知識（学習データ由来） | 中程度 |
| AI幻覚 | ⭐ | 推測・創作 | 高い |

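**Conclusion**: When the knowledge DB contains a relevant entry, always use it; this is the main lever for reducing hallucination. Entries added by self-expansion are AI-generated, so until someone reviews them they carry the reliability of the second row of the table, not the first.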

#### 自己拡充の仕組み

```python
# 疑似コード
async def infer(question):
    # Step 1: DB検索
    db_knowledge = search_db(question)

    if db_knowledge:
        # Step 2a: RAG推論（DBの知識を使う）
        response = llm.generate(
            f"Based on this verified knowledge: {db_knowledge}\n"
            f"Answer: {question}"
        )
        return response
    else:
        # Step 2b: AI内部知識で推論
        response = llm.generate(question)

        # Step 3: 高品質なら保存（自己拡充）
        if response.confidence >= 0.7:
            save_to_db(question, response)

        return response
```

#### 重要な設計判断

**Q: なぜSQLiteと.iathの2つを使うのか？**
A: 役割分担
- **SQLite**: メタデータ（ユーザー、ワークスペース、推論履歴）
- **.iath**: 知識タイル本体（6次元座標 + コンテンツ）

**Q: confidence >= 0.7と0.8の違いは？**
A:
- `>= 0.7`: DB保存（自己拡充）← やや緩め
- `>= 0.8`: 訓練データ保存 ← 厳しめ（高品質のみ）
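
The two thresholds can be read as a small routing rule. The sketch below is illustrative only: `route_output` and the constant names are hypothetical, and the real router additionally requires the caller to opt in to DB saves. It mirrors the branch structure described in this guide, where DB saves happen only when no DB knowledge was found and training data is collected only from the master.

```python
# Hypothetical helper; the thresholds are the ones quoted in this guide.
DB_SAVE_THRESHOLD = 0.7        # self-expansion (knowledge DB)
TRAINING_SAVE_THRESHOLD = 0.8  # master output -> training data


def route_output(confidence: float, used_rag: bool, is_master: bool) -> list[str]:
    """Return the stores a response should be written to."""
    targets = []
    # Only answers generated without DB knowledge are candidates for self-expansion.
    if not used_rag and confidence >= DB_SAVE_THRESHOLD:
        targets.append("knowledge_db")
    # Training data comes from the master model only, with the stricter bar.
    if is_master and confidence >= TRAINING_SAVE_THRESHOLD:
        targets.append("training_data")
    return targets


print(route_output(0.75, used_rag=False, is_master=True))  # ['knowledge_db']
print(route_output(0.88, used_rag=True, is_master=True))   # ['training_data']
```

A response in the 0.7 to 0.8 band grows the knowledge DB but never reaches the apprentice's training set.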

**Q: AI生成知識をDBに保存する際、人間のチェックは不要？**
A: 現在は自動保存。将来的には：
- 専門家によるレビューフロー
- コミュニティ投票による品質評価
- AIによる自動検証（別のAIでクロスチェック）

### 3️⃣ 樹木型空間記憶（Dendritic Memory Space）

#### 比喩の意味

人間の脳の**樹状突起（デンドライト）**のように、知識が空間的に整理されている。

通常のDB:
```
知識1: 「心臓は循環器官である」
知識2: 「脳は中枢神経系の一部である」
→ バラバラに保存（関連性が不明）
```

樹木型空間記憶:
```
知識1: 座標 [0.2, 0.8, 0.3, 0.9, 0.7, 0.8]
知識2: 座標 [0.3, 0.8, 0.4, 0.85, 0.65, 0.75]
→ 近い座標 = 意味的に関連 → 一緒に検索できる
```

#### 6次元座標系の詳細

```
Knowledge Tile の座標 = [x, y, z, c, g, v]
                        ─────┬───── ─────┬─────
                     medical_space  meta_space
```

##### medical_space [x, y, z]: ドメイン固有の3次元空間

例: 医療ドメインの場合

| 軸 | 意味 | 例 |
|----|------|-----|
| x | 解剖学的位置 | 0.0=神経系, 0.5=循環器, 1.0=消化器 |
| y | 病理学的分類 | 0.0=感染症, 0.5=代謝疾患, 1.0=外傷 |
| z | 治療レベル | 0.0=予防, 0.5=診断, 1.0=治療 |

##### meta_space [c, g, v]: メタ情報の3次元空間

| 軸 | 意味 | 値の範囲 |
|----|------|----------|
| c (Certainty) | 確実性 | 0.0=仮説, 0.5=定説, 1.0=確立された事実 |
| g (Granularity) | 粒度 | 0.0=概要, 0.5=詳細, 1.0=専門的 |
| v (Verification) | 検証状態 | 0.0=未検証, 0.5=専門家レビュー済, 1.0=複数ソース確認済 |
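
One way to hold the six axes in code is a small validated record. This is a minimal sketch, not the project's actual tile schema: the class name `TileCoordinates` and the assumption that every axis is normalized to the range 0.0 to 1.0 are taken from the tables above for illustration.

```python
from dataclasses import dataclass

AXES = ("x", "y", "z", "c", "g", "v")


@dataclass(frozen=True)
class TileCoordinates:
    """Hypothetical container for one tile's [x, y, z, c, g, v] coordinates."""
    x: float  # domain axis 1 (medical example: anatomical location)
    y: float  # domain axis 2 (medical example: pathological class)
    z: float  # domain axis 3 (medical example: treatment level)
    c: float  # certainty
    g: float  # granularity
    v: float  # verification

    def __post_init__(self):
        # Assumption: every axis is normalized to [0.0, 1.0].
        for name in AXES:
            value = getattr(self, name)
            if not 0.0 <= value <= 1.0:
                raise ValueError(f"{name}={value} is outside [0.0, 1.0]")

    def as_list(self):
        """Return the coordinates in the [x, y, z, c, g, v] order used by searches."""
        return [getattr(self, name) for name in AXES]


tile = TileCoordinates(0.2, 0.8, 0.3, 0.9, 0.7, 0.8)
print(tile.as_list())
```

Rejecting out-of-range values at construction time keeps distance calculations comparable across tiles.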

#### 検索の仕組み

##### 1. テキスト検索（従来型）

```python
def search_by_text(query):
    # 単純なキーワードマッチング
    results = [tile for tile in all_tiles
               if query in tile.content]
    return results
```

**問題点**: 同義語を見逃す
- 「心臓病」で検索しても「循環器疾患」がヒットしない

##### 2. 座標検索（空間検索）

```python
def search_by_coordinates(query_coords, top_k=5):
    # 6次元ユークリッド距離で計算
    distances = []
    for tile in all_tiles:
        dist = euclidean_distance(query_coords, tile.coords)
        distances.append((tile, dist))

    # 距離が近い順にソート
    distances.sort(key=lambda x: x[1])
    return distances[:top_k]
```

**利点**: 意味的に近い知識を自動で発見
- 座標が近い = 意味的に関連
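
As a worked example, the distance between the two tiles shown earlier in this section can be computed with the standard library. The third coordinate list, `unrelated`, is a made-up point added only for contrast.

```python
import math

tile_1 = [0.2, 0.8, 0.3, 0.9, 0.7, 0.8]      # "the heart is a circulatory organ"
tile_2 = [0.3, 0.8, 0.4, 0.85, 0.65, 0.75]   # "the brain is part of the CNS"
unrelated = [0.9, 0.1, 0.9, 0.2, 0.1, 0.1]   # hypothetical far-away tile

# Squared differences for tile_1 vs tile_2:
# 0.01 + 0 + 0.01 + 0.0025 + 0.0025 + 0.0025 = 0.0275
d_near = math.dist(tile_1, tile_2)
d_far = math.dist(tile_1, unrelated)

print(round(d_near, 4))  # 0.1658
print(d_near < d_far)    # True
```

If every axis stays within 0.0 to 1.0, no two tiles can be further apart than sqrt(6), roughly 2.449, which gives a natural scale for judging what counts as "near".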

##### 3. ハイブリッド検索（推奨）

```python
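def hybrid_search(query_text, query_coords=None, top_k=5):
    # Text-match score per tile (higher is better)
    text_scores = calculate_text_match(query_text)

    # Spatial score per tile: closer tiles score higher.
    # Assumes distances are normalized to 0-1; without query
    # coordinates the spatial term contributes nothing.
    if query_coords is not None:
        spatial_scores = 1.0 - calculate_spatial_distance(query_coords)
    else:
        spatial_scores = 0.0

    # Combined score = α * text_score + β * (1 - spatial_distance), α=0.4, β=0.6
    combined_scores = 0.4 * text_scores + 0.6 * spatial_scores

    return top_k_results(combined_scores, top_k)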
```

#### .iathファイル形式

```
.iath ファイル構造:

┌────────────────────────────────────┐
│ Header (64 bytes)                  │  ← マジックナンバー、バージョン
├────────────────────────────────────┤
│ Index (JSON, 可変長)                │  ← タイルIDとオフセット一覧
│ {                                  │
│   "tiles": [                       │
│     {"id": "tile_001", "offset": 512},
│     {"id": "tile_002", "offset": 2048}
│   ]                                │
│ }                                  │
├────────────────────────────────────┤
│ Data Section (zstd圧縮)            │
│   ┌──────────────────────┐         │
│   │ Tile 1 (JSON)        │         │
│   │ - metadata           │         │
│   │ - content            │         │
│   │ - coordinates        │         │
│   │ - verification       │         │
│   └──────────────────────┘         │
│   ┌──────────────────────┐         │
│   │ Tile 2 (JSON)        │         │
│   └──────────────────────┘         │
│   ...                              │
└────────────────────────────────────┘
```

**なぜzstd圧縮？**
- 高い圧縮率（gzipより優れる）
- 高速な解凍速度
- Facebookが開発（信頼性）
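
The header / index / data layout can be illustrated with a small round trip. This is a sketch under loud assumptions, not the real .iath codec: the header field layout (`HEADER_FMT`), the magic value, and data-section-relative offsets are all invented for illustration, and `zlib` stands in for zstd so the example needs only the standard library.

```python
import json
import struct
import zlib  # stand-in for zstd in this sketch

# ASSUMPTION: 4-byte magic, uint32 version, uint64 index size, zero-padded to 64 bytes.
HEADER_SIZE = 64
HEADER_FMT = "<4sIQ"
MAGIC = b"IATH"  # hypothetical magic number


def write_iath_like(path, tiles):
    """Write tiles (dicts with an "id" key) as header + JSON index + compressed data."""
    blobs = [zlib.compress(json.dumps(t, ensure_ascii=False).encode("utf-8")) for t in tiles]
    entries, offset = [], 0
    for tile, blob in zip(tiles, blobs):
        # ASSUMPTION: offsets are relative to the start of the data section.
        entries.append({"id": tile["id"], "offset": offset, "size": len(blob)})
        offset += len(blob)
    index_bytes = json.dumps({"tiles": entries}).encode("utf-8")
    header = struct.pack(HEADER_FMT, MAGIC, 1, len(index_bytes)).ljust(HEADER_SIZE, b"\0")
    with open(path, "wb") as f:
        f.write(header + index_bytes + b"".join(blobs))


def read_tile(path, tile_id):
    """Read one tile by ID without decompressing the others."""
    with open(path, "rb") as f:
        magic, _version, index_size = struct.unpack_from(HEADER_FMT, f.read(HEADER_SIZE))
        if magic != MAGIC:
            raise ValueError("not an .iath-like file")
        index = json.loads(f.read(index_size).decode("utf-8"))
        entry = next((e for e in index["tiles"] if e["id"] == tile_id), None)
        if entry is None:
            return None
        f.seek(HEADER_SIZE + index_size + entry["offset"])
        return json.loads(zlib.decompress(f.read(entry["size"])).decode("utf-8"))
```

Compressing each tile separately is what makes random access by ID possible: the reader seeks straight to one blob instead of inflating the whole data section.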

#### 重要な設計判断

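**Q: Why six dimensions? Why not 3 or 10?**
A:
- 3 dimensions: enough for domain knowledge, but leaves no room for meta information
- 10 or more dimensions: distances become less discriminative (the curse of dimensionality), and the axes are hard for a human to interpret
- 6 dimensions: domain (3) + meta (3), a workable balance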

**Q: 座標は誰が決めるのか？**
A:
- 現状: 人間が手動で設定（dendritic-memory-editorで）
- Priority 2で実装予定: AIが自動推定（DeepSeekが座標を生成）

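**Q: How does .iath differ from FAISS (a vector search library)?**
A:

| Aspect | .iath | FAISS |
|------|-------|-------|
| Dimensions | 6 (human-interpretable) | Set by the embedding model (e.g. 768) |
| Search cost | O(n) linear scan | O(n) for exact flat indexes; sub-linear with approximate indexes (IVF, HNSW) |
| Transparency | High (each axis has a defined meaning) | Low (embedding axes are opaque) |
| Editability | High (coordinates can be adjusted by hand) | Low (requires re-embedding) |

**Conclusion**: .iath prioritizes a knowledge base that humans can understand and edit.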

### 4️⃣ ローカルファースト & ワンコマンドセットアップ

#### 設計思想

```
❌ 悪い例（クラウド依存）:
pip install nullai
nullai --api-key=YOUR_OPENAI_KEY  # クラウドAPI必須
→ インターネット必須、コスト高、プライバシー懸念

✅ NullAI:
./start_null_ai.sh  # ローカルで完結
→ オフライン可能、無料、プライバシー保護
```

#### ワンコマンドの実現方法

`start_null_ai.sh`が自動で実行すること:

1. ✅ 依存関係チェック（Python, Node.js, Ollama）
2. ✅ 仮想環境作成（venv）
3. ✅ Python依存関係インストール
4. ✅ Node.js依存関係インストール
5. ✅ データベース初期化（sql_app.db）
6. ✅ Ollama起動
7. ✅ バックエンド起動（port 8000）
8. ✅ フロントエンド起動（port 5173）
9. ✅ .iathメモリロード確認

**ユーザーがすることは**: `./start_null_ai.sh`を実行するだけ
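
Step 1 of the list above (dependency check) amounts to looking for each tool on `PATH`. The script itself is a shell script; the Python sketch below only illustrates the idea, and the binary names in `REQUIRED_COMMANDS` are assumptions.

```python
import shutil

# Assumed binary names for the three dependencies listed in step 1.
REQUIRED_COMMANDS = ("python3", "node", "ollama")


def find_missing_commands(commands=REQUIRED_COMMANDS):
    """Return the commands that cannot be found on PATH."""
    return [cmd for cmd in commands if shutil.which(cmd) is None]


missing = find_missing_commands()
if missing:
    print("Missing dependencies:", ", ".join(missing))
else:
    print("All dependencies found")
```

Failing early with a list of what is missing keeps the one-command promise honest: the user sees one actionable message instead of a stack trace from a later step.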

#### 重要な設計判断

**Q: なぜOllamaを使うのか？ HuggingFaceだけではダメ？**
A:
- Ollama: モデル管理が楽（`ollama pull deepseek-r1`だけ）
- HuggingFace: 手動でダウンロード、パス指定が面倒

**Q: なぜDockerを使わないのか？**
A:
- Docker: 初心者には難しい、GPUパススルーが複雑
- シェルスクリプト: シンプル、デバッグしやすい、カスタマイズ容易

---

## システムアーキテクチャ全体図

```
┌─────────────────────────────────────────────────────────────────┐
│                        Frontend (React + TypeScript)            │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐          │
│  │ Engine       │  │ Inference    │  │ Training     │          │
│  │ Manager      │  │ Panel        │  │ Dashboard    │          │
│  └──────────────┘  └──────────────┘  └──────────────┘          │
│         │                  │                  │                 │
│         └──────────────────┼──────────────────┘                 │
│                           │                                     │
│                    HTTP/WebSocket                               │
│                           │                                     │
└───────────────────────────┼─────────────────────────────────────┘
                            │
┌───────────────────────────┼─────────────────────────────────────┐
│                  Backend (FastAPI)                              │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐          │
│  │ config.py    │  │ questions.py │  │ training.py  │          │
│  │ (Engine API) │  │ (Inference)  │  │ (Fine-tune)  │          │
│  └──────────────┘  └──────────────┘  └──────────────┘          │
│         │                  │                  │                 │
│         └──────────────────┼──────────────────┘                 │
│                           │                                     │
└───────────────────────────┼─────────────────────────────────────┘
                            │
┌───────────────────────────┼─────────────────────────────────────┐
│                  NullAI Core Logic                              │
│  ┌────────────────────────────────────────────────┐             │
│  │           model_router.py                      │             │
│  │  - RAG推論統合                                 │             │
│  │  - 師匠出力保存                                │             │
│  │  - エンジン管理（スワップ、昇格）             │             │
│  └────────────────────────────────────────────────┘             │
│         │                  │                  │                 │
│         ▼                  ▼                  ▼                 │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐          │
│  │ iath_memory  │  │ llm_providers│  │ fine_tuning  │          │
│  │ .py          │  │ .py          │  │ .py          │          │
│  │ (6D Search)  │  │ (4 Providers)│  │(PEFT/Unsloth)│          │
│  └──────────────┘  └──────────────┘  └──────────────┘          │
│         │                  │                  │                 │
└─────────┼──────────────────┼──────────────────┼─────────────────┘
          │                  │                  │
          ▼                  ▼                  ▼
┌──────────────────────────────────────────────────────────────┐
│                    External Services                         │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐       │
│  │ knowledge_   │  │ Ollama       │  │ HuggingFace  │       │
│  │ base.iath    │  │ (localhost)  │  │ Models       │       │
│  │ (6D Memory)  │  │              │  │              │       │
│  └──────────────┘  └──────────────┘  └──────────────┘       │
│                                                              │
│  ┌──────────────┐  ┌──────────────────────────────────┐     │
│  │ sql_app.db   │  │ training_data/                   │     │
│  │ (SQLite)     │  │  - master_outputs/*.jsonl        │     │
│  │              │  │  - checkpoints/apprentice_*/     │     │
│  └──────────────┘  └──────────────────────────────────┘     │
└──────────────────────────────────────────────────────────────┘
```

---

## 各システムの詳細解説

### ModelRouter (`null_ai/model_router.py`)

#### 役割
NullAIの**頭脳**。全ての推論リクエストを管理。

#### 主要メソッド詳細

##### `__init__()`
```python
def __init__(self, config_manager):
    self.config_manager = config_manager
    self.master_model = None        # 師匠モデル
    self.apprentice_model = None    # 弟子モデル
    self.dendritic_memory = None    # .iath空間記憶

    # .iathファイルのロード
    self._load_dendritic_memory()
```

**重要**: 初期化時に自動的に.iathをロード → 起動時間が長くなる可能性

##### `async def infer()` - RAG統合推論

```python
async def infer(self, prompt, domain_id, model_config, save_to_memory=False):
    # Step 1: DB知識チェック
    has_knowledge = self._check_db_knowledge(domain_id, prompt)

    if has_knowledge:
        # Step 2a: RAG推論
        knowledge = self._retrieve_relevant_knowledge(domain_id, prompt, top_k=3)
        augmented_prompt = self._build_rag_prompt(prompt, knowledge)
        response = await self._perform_llm_inference(model_config, augmented_prompt)
    else:
        # Step 2b: 通常推論
        response = await self._perform_llm_inference(model_config, prompt)

        # Step 3: 高品質なら保存
        if save_to_memory and response["confidence"] >= 0.7:
            await self._save_inference_to_db(domain_id, prompt, response)

    # Step 4: 師匠の出力なら訓練データとして保存
    is_master = (self.master_model and
                 model_config.model_id == self.master_model.model_id)
    if is_master and response["confidence"] >= 0.8:
        await self._save_master_output_as_training_data(
            prompt, response["response"], domain_id, response["confidence"]
        )

    return response
```

**データフロー図**:
```
prompt → check DB → found?
                     ├─ YES → retrieve knowledge
                     │         ↓
                     │      augment prompt
                     │         ↓
                     │      LLM inference → response
                     │
                     └─ NO → LLM inference → response
                                              ↓
                              save_to_memory and confidence >= 0.7? → save to DB

(both branches) response → is master and confidence >= 0.8? → save as training data
```

##### `_retrieve_relevant_knowledge()` - ハイブリッド検索

```python
def _retrieve_relevant_knowledge(self, domain_id, prompt, top_k=3):
    if not self.dendritic_memory:
        return []

    # ハイブリッド検索実行
    results = self.dendritic_memory.hybrid_search(
        query_text=prompt,
        query_coords=None,  # 将来的には座標も推定
        top_k=top_k,
        text_weight=0.4,    # テキストマッチの重み
        spatial_weight=0.6  # 空間距離の重み
    )

    # Knowledge Tile形式に変換
    formatted_knowledge = []
    for tile in results:
        formatted_knowledge.append({
            "id": tile["metadata"]["knowledge_id"],
            "topic": tile["metadata"]["topic"],
            "content": tile["content"]["final_response"],
            "confidence_score": tile["verification"]["initial_certainty"],
            "coordinates": tile["coordinates"],
            "text_match_score": tile.get("text_match_score", 0),
            "spatial_distance": tile.get("spatial_distance", None)
        })

    return formatted_knowledge
```

##### `_save_master_output_as_training_data()` - 訓練データ保存

```python
import json
import os
from datetime import datetime, timezone

async def _save_master_output_as_training_data(
    self, prompt, response, domain_id, confidence
):
    # Build one Alpaca-format record
    training_example = {
        "instruction": f"You are an expert in {domain_id}. Provide accurate information based on verified knowledge.",
        "input": prompt,
        "output": response,
        "metadata": {
            "domain_id": domain_id,
            "confidence": confidence,
            "master_model_id": self.master_model.model_id,
            # Timezone-aware UTC timestamp (datetime.utcnow() is deprecated since Python 3.12)
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "source": "master_output"
        }
    }

    # Append to the per-domain JSONL file, creating the directory on first use
    output_dir = "training_data/master_outputs"
    os.makedirs(output_dir, exist_ok=True)
    output_file = f"{output_dir}/master_outputs_{domain_id}.jsonl"
    with open(output_file, 'a', encoding='utf-8') as f:
        f.write(json.dumps(training_example, ensure_ascii=False) + '\n')
```

**なぜJSONL（改行区切りJSON）？**
- ストリーミング処理が可能（1行ずつ読める）
- ファイル破損時の影響が最小限
- HuggingFace datasetsと互換性
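
The first two points can be seen in a reader that processes one record at a time and survives a damaged line. This is a minimal sketch; `read_jsonl` is a hypothetical helper, not part of the codebase.

```python
import json


def read_jsonl(path):
    """Yield one training example per line, skipping lines that fail to parse."""
    with open(path, encoding="utf-8") as f:
        for line_no, line in enumerate(f, start=1):
            line = line.strip()
            if not line:
                continue  # tolerate blank lines
            try:
                yield json.loads(line)
            except json.JSONDecodeError:
                # A truncated or corrupted line costs only that one record.
                print(f"skipping malformed line {line_no}")
```

Because the function is a generator, memory use stays flat regardless of how large the training file grows.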

##### エンジン管理メソッド

```python
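import copy
from datetime import datetime, timezone

def promote_apprentice(self, apprentice_model_id):
    """Promote the apprentice to master."""
    apprentice = self.apprentice_model
    if apprentice is None or apprentice.model_id != apprentice_model_id:
        raise ValueError(f"{apprentice_model_id} is not the active apprentice")

    # The apprentice becomes the master and the apprentice slot is cleared.
    # (Keeping the retired master around as a mentor is not shown here.)
    self.master_model = apprentice
    self.apprentice_model = None

    # Persist the new assignment
    self.config_manager.save_active_engines(self.master_model.model_id, None)

def swap_engines(self):
    """Swap master and apprentice."""
    if self.master_model is None or self.apprentice_model is None:
        raise ValueError("both a master and an apprentice are required to swap")
    self.master_model, self.apprentice_model = self.apprentice_model, self.master_model

    self.config_manager.save_active_engines(
        self.master_model.model_id, self.apprentice_model.model_id
    )

def create_new_apprentice(self, base_model_id):
    """Create a fresh apprentice from a base model."""
    timestamp = datetime.now(timezone.utc).strftime("%Y%m%d%H%M%S")
    new_apprentice_id = f"{base_model_id}_apprentice_{timestamp}"

    # Copy the base config so the registered base model keeps its own ID
    self.apprentice_model = copy.deepcopy(
        self.config_manager.get_model_config(base_model_id)
    )
    self.apprentice_model.model_id = new_apprentice_id

    return new_apprentice_id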
```

### DendriticMemorySpace (`null_ai/iath_memory.py`)

#### 役割
.iathファイルの読み込みと6次元空間検索を提供。

#### クラス構造

```python
class IathDecoder:
    """
    .iathファイルの低レベルデコーダー
    dendritic-memory-editor完全互換
    """
    def __init__(self, iath_file_path):
        self.file_path = Path(iath_file_path)
        self.header = None
        self.index = {}  # parsed JSON index: {"tiles": [{"id", "offset", "size"}, ...]}
        self._load_header_and_index()

    def _load_header_and_index(self):
        """ヘッダーとインデックスの読み込み"""
        with open(self.file_path, 'rb') as f:
            # Header (64 bytes)
            header_bytes = f.read(64)
            self.header = self._parse_header(header_bytes)

            # Index (JSON)
            index_size = self.header["index_size"]
            index_bytes = f.read(index_size)
            self.index = json.loads(index_bytes.decode('utf-8'))

    def get_tile_by_id(self, knowledge_id):
        """IDでタイルを取得"""
        # インデックスからオフセットを検索
        tile_info = next(
            (t for t in self.index["tiles"] if t["id"] == knowledge_id),
            None
        )
        if not tile_info:
            return None

        # ファイルポジション移動
        with open(self.file_path, 'rb') as f:
            f.seek(tile_info["offset"])
            compressed_data = f.read(tile_info["size"])

            # zstd解凍
            decompressed = zstandard.decompress(compressed_data)
            tile_data = json.loads(decompressed.decode('utf-8'))

            return tile_data


class DendriticMemorySpace:
    """
    6次元空間記憶システム
    高レベルAPI
    """
    def __init__(self, iath_file_path):
        self.decoder = IathDecoder(iath_file_path)
        self.all_tiles = []
        self.coordinates_matrix = None  # NumPy行列
        self._load_all_tiles()

    def _load_all_tiles(self):
        """全タイルをメモリにロード"""
        self.all_tiles = self.decoder.get_all_tiles()

        # 座標行列作成（高速検索用）
        coords_list = [tile["coordinates"] for tile in self.all_tiles]
        self.coordinates_matrix = np.array(coords_list)  # Shape: (N, 6)
```

#### 検索アルゴリズム詳細

##### 座標検索（6次元ユークリッド距離）

```python
def search_by_coordinates(self, query_coords, top_k=5):
    """
    6次元空間での近傍検索

    数式: distance = sqrt(sum((q_i - t_i)^2))
    where:
        q_i = query座標のi番目の要素
        t_i = tile座標のi番目の要素
        i = 0..5 (6次元)
    """
    query_vector = np.array(query_coords)  # Shape: (6,)

    # 全タイルとの距離を一括計算（NumPy vectorization）
    # Broadcasting: (N, 6) - (6,) → (N, 6)
    distances = np.linalg.norm(
        self.coordinates_matrix - query_vector,
        axis=1  # 各行（タイル）ごとに距離計算
    )  # Shape: (N,)

    # 距離でソート
    sorted_indices = np.argsort(distances)[:top_k]

    # 結果を返す
    results = []
    for idx in sorted_indices:
        tile = self.all_tiles[idx].copy()
        tile["spatial_distance"] = float(distances[idx])
        results.append(tile)

    return results
```

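**Complexity**: O(N) per query, a linear scan over all N tiles

**Possible optimizations** (not implemented):
- KD-Tree: O(log N) average-case queries; effective at 6 dimensions as long as N is much larger than 2^6 = 64
- Ball-Tree: degrades more gracefully as dimensionality grows
- Approximate nearest-neighbor search (Annoy, HNSW): very fast, at the cost of some recall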
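
Before changing data structures, a cheaper step is to avoid sorting all N distances when only the top k are needed. The sketch below is not one of the options listed above; it uses `heapq.nsmallest` from the standard library, and the tile IDs and coordinates are made up for illustration.

```python
import heapq
import math


def top_k_nearest(query, tiles, k=5):
    """Return the k (distance, tile_id) pairs closest to query.

    heapq.nsmallest tracks only k candidates at a time, so the cost is
    O(N log k) instead of the O(N log N) of a full sort.
    """
    scored = ((math.dist(query, coords), tile_id) for tile_id, coords in tiles.items())
    return heapq.nsmallest(k, scored)


tiles = {  # hypothetical tiles: id -> 6D coordinates
    "heart": [0.5, 0.5, 0.5, 1.0, 0.5, 1.0],
    "vessels": [0.5, 0.4, 0.5, 0.9, 0.5, 1.0],
    "stomach": [1.0, 0.5, 0.5, 1.0, 0.5, 1.0],
}
nearest = top_k_nearest([0.5, 0.5, 0.5, 1.0, 0.5, 1.0], tiles, k=2)
print([tile_id for _, tile_id in nearest])  # ['heart', 'vessels']
```

With the NumPy coordinate matrix used by `search_by_coordinates`, `np.argpartition` plays the same role.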

##### ハイブリッド検索（テキスト + 座標）

```python
def hybrid_search(
    self,
    query_text,
    query_coords=None,
    top_k=5,
    text_weight=0.4,
    spatial_weight=0.6
):
    """
    テキストマッチと空間距離の複合スコアリング
    """
    # Step 1: テキストマッチスコア計算
    text_scores = []
    for tile in self.all_tiles:
        score = self._calculate_text_match(query_text, tile)
        text_scores.append(score)
    text_scores = np.array(text_scores)  # Shape: (N,)

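    # Step 2: spatial score
    if query_coords is not None:
        spatial_distances = np.linalg.norm(
            self.coordinates_matrix - np.array(query_coords),
            axis=1
        )
        # Map distances to a 0-1 score by linear normalization (1.0 = closest).
        # Guard against max_dist == 0, which happens when every tile
        # sits exactly on the query point.
        max_dist = spatial_distances.max()
        if max_dist > 0:
            spatial_scores = 1.0 - spatial_distances / max_dist
        else:
            spatial_scores = np.ones(len(self.all_tiles))
    else:
        # No coordinates: ranking is driven by text_weight * text_scores alone
        spatial_scores = np.zeros(len(self.all_tiles))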

    # Step 3: 複合スコア計算
    combined_scores = (
        text_weight * text_scores +
        spatial_weight * spatial_scores
    )

    # Step 4: スコアでソート
    sorted_indices = np.argsort(combined_scores)[::-1][:top_k]

    # 結果を返す
    results = []
    for idx in sorted_indices:
        tile = self.all_tiles[idx].copy()
        tile["text_match_score"] = float(text_scores[idx])
        tile["spatial_score"] = float(spatial_scores[idx])
        tile["combined_score"] = float(combined_scores[idx])
        if query_coords is not None:
            tile["spatial_distance"] = float(spatial_distances[idx])
        results.append(tile)

    return results

def _calculate_text_match(self, query, tile):
    """
    テキストマッチスコア計算（簡易版）

    将来的にはBM25やTF-IDFを使う
    """
    query_lower = query.lower()
    content = tile["content"]["final_response"].lower()
    topic = tile["metadata"]["topic"].lower()

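    # Keyword matching on whitespace-separated tokens.
    # Note: text written without spaces (e.g. Japanese) becomes a single
    # token, so partial matches need a tokenizer or character n-grams.
    query_words = set(query_lower.split())
    content_words = set(content.split())
    topic_words = set(topic.split())

    def jaccard(a, b):
        # Jaccard similarity; 0.0 for two empty sets avoids ZeroDivisionError
        union = a | b
        return len(a & b) / len(union) if union else 0.0

    # Combined score (topic weighted higher)
    score = (0.3 * jaccard(query_words, content_words)
             + 0.7 * jaccard(query_words, topic_words))

    return score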
```
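
Whitespace tokenization returns a single token for Japanese text, so word-level Jaccard scores are zero unless the strings are identical. One simple alternative, shown here as a sketch and not as the project's implementation, is Jaccard similarity over character bigrams; `char_ngrams` and `jaccard` are hypothetical helper names.

```python
def char_ngrams(text, n=2):
    """Split text into overlapping character n-grams (whitespace removed)."""
    compact = "".join(text.lower().split())
    if len(compact) < n:
        return {compact} if compact else set()
    return {compact[i:i + n] for i in range(len(compact) - n + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


query = "心臓の働き"
topic = "心臓の解剖学"

# Word-level: each string is one token, so nothing overlaps.
word_score = jaccard(set(query.split()), set(topic.split()))
# Bigram-level: "心臓" and "臓の" are shared, out of 7 distinct bigrams.
ngram_score = jaccard(char_ngrams(query), char_ngrams(topic))

print(word_score)             # 0.0
print(round(ngram_score, 4))  # 0.2857
```

A morphological tokenizer would give cleaner tokens, but bigrams need no extra dependency and already let related topics score above zero.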

### FineTuningManager (`null_ai/fine_tuning.py`)

#### 役割
弟子モデルのファインチューニングを実行。

#### PEFT（QLoRA）方式の詳細

```python
async def fine_tune_with_huggingface_peft(
    self,
    model_name,
    training_examples,
    output_dir,
    epochs=3,
    learning_rate=2e-4,
    batch_size=4,
    lora_r=8,
    lora_alpha=16
):
    """
    Parameter-Efficient Fine-Tuning with QLoRA

    QLoRA = Quantized LoRA
    - 4-bit量子化でメモリ削減
    - LoRAで訓練パラメータ削減
    → 12GB GPUでも7Bモデルを訓練可能
    """

    # Step 1: モデルを4-bit量子化でロード
    bnb_config = BitsAndBytesConfig(
        load_in_4bit=True,               # 4-bit量子化
        bnb_4bit_quant_type="nf4",       # NormalFloat4（最適な量子化方式）
        bnb_4bit_compute_dtype=torch.float16,  # 計算はfp16で
        bnb_4bit_use_double_quant=True   # 二重量子化（さらにメモリ削減）
    )

    model = AutoModelForCausalLM.from_pretrained(
        model_name,
        quantization_config=bnb_config,
        device_map="auto"  # 自動的にGPU/CPUに配置
    )

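    # Step 2: LoRA configuration
    # (from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training)
    # Prepare the 4-bit model for training before attaching adapters
    model = prepare_model_for_kbit_training(model)

    lora_config = LoraConfig(
        r=lora_r,                        # LoRA rank (lower = lighter)
        lora_alpha=lora_alpha,           # scaling factor
        target_modules=[                 # layers that receive LoRA adapters
            # Llama-style names; module names differ per architecture,
            # so check model.named_modules() for the model being trained
            "q_proj", "k_proj", "v_proj", "o_proj",  # attention
            "gate_proj", "up_proj", "down_proj"      # MLP
        ],
        lora_dropout=0.05,               # dropout rate
        bias="none",                     # do not train biases
        task_type="CAUSAL_LM"            # task type
    )

    model = get_peft_model(model, lora_config)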

    # 訓練可能パラメータ数を表示
    model.print_trainable_parameters()
    # 例: trainable params: 4.2M || all params: 2.7B || trainable%: 0.16%
    #     → 全パラメータの0.16%だけ訓練！

    # Step 3-9: データ準備、訓練、保存（省略）
    ...
```
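
The small trainable fraction follows from how LoRA is built: for a linear layer with a `d_out x d_in` weight, it adds two low-rank matrices, `A` of shape `(r, d_in)` and `B` of shape `(d_out, r)`. The sketch below counts those parameters; the layer size `d_model = 2560` is a hypothetical value chosen only for illustration.

```python
def lora_param_count(d_in: int, d_out: int, r: int) -> int:
    """Trainable parameters LoRA adds to one d_out x d_in linear layer."""
    # A has shape (r, d_in) and B has shape (d_out, r).
    return r * (d_in + d_out)


d_model = 2560  # hypothetical hidden size
full = d_model * d_model                         # parameters in the frozen weight
added = lora_param_count(d_model, d_model, r=8)  # parameters LoRA trains instead

print(added)  # 40960
print(full)   # 6553600
```

Doubling `r` doubles the adapter size, which is why `lora_r` is the main knob for trading capacity against memory.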

**QLoRAの仕組み**:
```
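Full fine-tuning:
┌─────────────────────────┐
│ Full model (2.7B params)│ ← all weights trained
│ Memory: ~40GB           │   (weights + gradients + optimizer state)
└─────────────────────────┘

QLoRA:
┌─────────────────────────┐
│ Base model (2.7B params)│ ← 4-bit quantized, frozen (not trained)
│ Weights: ~1.4GB         │   (2.7B x 0.5 bytes)
└─────────────────────────┘
          +
┌─────────────────────────┐
│ LoRA adapter (4.2M)     │ ← only this is trained
│ Weights: ~0.02GB        │   (4.2M x 4 bytes in fp32)
└─────────────────────────┘
         =
   Total: weights + activations + optimizer state
          (depends on batch size and sequence length)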
```
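
Weight-only memory can be estimated directly from parameter count and bit width. The sketch below does just that arithmetic; it deliberately ignores activations, gradients, optimizer state, and quantization metadata, all of which come on top and depend on batch size and sequence length. `weight_memory_gb` is a hypothetical helper.

```python
def weight_memory_gb(n_params: float, bits_per_param: float) -> float:
    """Memory for the weights alone, in GB (10^9 bytes)."""
    return n_params * bits_per_param / 8 / 1e9


n = 2.7e9  # parameter count of the apprentice model quoted in this guide

print(f"fp32 weights : {weight_memory_gb(n, 32):.2f} GB")  # 10.80 GB
print(f"fp16 weights : {weight_memory_gb(n, 16):.2f} GB")  # 5.40 GB
print(f"4-bit weights: {weight_memory_gb(n, 4):.2f} GB")   # 1.35 GB
```

The 4x drop from fp16 to 4-bit is what frees enough GPU memory for activations and the adapter's optimizer state on a consumer card.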

#### Alpaca形式データの整形

```python
def format_training_examples_for_model(
    self,
    training_examples,
    template="alpaca"
):
    """
    Alpaca形式 → モデル用プロンプトに整形
    """
    formatted_prompts = []

    for example in training_examples:
        instruction = example["instruction"]
        input_text = example["input"]
        output_text = example["output"]

        if template == "alpaca":
            if input_text:
                prompt = f"""Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.

### Instruction:
{instruction}

### Input:
{input_text}

### Response:
{output_text}"""
            else:
                prompt = f"""Below is an instruction that describes a task. Write a response that appropriately completes the request.

### Instruction:
{instruction}

### Response:
{output_text}"""

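        else:
            # Unsupported template: fail instead of reusing a stale prompt
            raise ValueError(f"unsupported template: {template!r}")

        formatted_prompts.append(prompt)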

    return formatted_prompts
```

**なぜこの形式？**
- 明確な区切り（`###`）
- instruction-following能力の向上
- オープンソースコミュニティの標準

---

## データフロー完全図解

### フロー1: 通常推論（RAGあり）

```
┌──────────────────────────────────────────────────────────────┐
│ ユーザー: "心臓の働きについて教えて"                        │
└────────────────────┬─────────────────────────────────────────┘
                     ↓
┌────────────────────────────────────────────────────────────┐
│ Frontend: InferencePanel.tsx                               │
│  - 質問をバックエンドに送信                               │
└────────────────────┬───────────────────────────────────────┘
                     ↓ HTTP POST /api/questions
┌────────────────────────────────────────────────────────────┐
│ Backend: questions.py                                      │
│  - InferenceService.ask_question()                         │
└────────────────────┬───────────────────────────────────────┘
                     ↓
┌────────────────────────────────────────────────────────────┐
│ NullAI Core: model_router.py                               │
│  ┌─────────────────────────────────────────────────────┐   │
│  │ Step 1: _check_db_knowledge("medical", "心臓の働き") │   │
│  │  → DendriticMemorySpace.search_by_text()           │   │
│  │  → 結果: 3件見つかった                              │   │
│  └─────────────────────────────────────────────────────┘   │
│                     ↓                                      │
│  ┌─────────────────────────────────────────────────────┐   │
│  │ Step 2: _retrieve_relevant_knowledge()             │   │
│  │  → hybrid_search("心臓の働き", top_k=3)            │   │
│  │  → 取得:                                           │   │
│  │    [1] 心臓の解剖学 (score: 0.92)                  │   │
│  │    [2] 循環器系の機能 (score: 0.85)                │   │
│  │    [3] 心臓病の分類 (score: 0.73)                  │   │
│  └─────────────────────────────────────────────────────┘   │
│                     ↓                                      │
│  ┌─────────────────────────────────────────────────────┐   │
│  │ Step 3: プロンプト拡張                             │   │
│  │  augmented_prompt = """                            │   │
│  │  Based on the following verified knowledge:        │   │
│  │                                                     │   │
│  │  [Knowledge 1 - expert verification, conf: 0.9]    │   │
│  │  Topic: 心臓の解剖学                                │   │
│  │  Content: 心臓は4つの部屋から構成され...           │   │
│  │                                                     │   │
│  │  [Knowledge 2 - ...]                               │   │
│  │                                                     │   │
│  │  Now, please answer:                               │   │
│  │  心臓の働きについて教えて                           │   │
│  │  """                                               │   │
│  └─────────────────────────────────────────────────────┘   │
│                     ↓                                      │
│  ┌─────────────────────────────────────────────────────┐   │
│  │ Step 4: LLM推論                                    │   │
│  │  → llm_providers.py                                │   │
│  │  → OllamaProvider.infer()                          │   │
│  │  → model: deepseek-r1:1.5b                         │   │
│  └─────────────────────────────────────────────────────┘   │
│                     ↓                                      │
│  ┌─────────────────────────────────────────────────────┐   │
│  │ Step 5: レスポンス生成                             │   │
│  │  response = {                                      │   │
│  │    "response": "心臓は循環器系の中心器官で...",     │   │
│  │    "confidence": 0.88,                             │   │
│  │    "thinking": "検証済み知識に基づいて回答",        │   │
│  │    "retrieved_knowledge": [...]                    │   │
│  │  }                                                 │   │
│  └─────────────────────────────────────────────────────┘   │
│                     ↓                                      │
│  ┌─────────────────────────────────────────────────────┐   │
│  │ Step 6: 師匠の出力？                               │   │
│  │  is_master = True                                  │   │
│  │  confidence = 0.88 >= 0.8 ✓                        │   │
│  │  → _save_master_output_as_training_data()          │   │
│  │  → training_data/master_outputs/                   │   │
│  │     master_outputs_medical.jsonl                   │   │
│  └─────────────────────────────────────────────────────┘   │
└────────────────────┬───────────────────────────────────────┘
                     ↓
┌────────────────────────────────────────────────────────────┐
│ Frontend: レスポンス表示                                   │
│  - ResponseDisplay.tsx                                     │
│  - 「心臓は循環器系の中心器官で...」                       │
│  - Retrieved Knowledge バッジ表示                          │
└────────────────────────────────────────────────────────────┘
```

### フロー2: 通常推論（RAGなし、自己拡充）

```
┌──────────────────────────────────────────────────────────────┐
│ User: "What is the principle of quantum computing?"          │
└────────────────────┬─────────────────────────────────────────┘
                     ↓
              (same as above)
                     ↓
┌────────────────────────────────────────────────────────────┐
│ NullAI Core: model_router.py                               │
│  ┌─────────────────────────────────────────────────────┐   │
│  │ Step 1: _check_db_knowledge("general", "quantum...")│   │
│  │  → DendriticMemorySpace.search_by_text()           │   │
│  │  → Result: not found ❌                            │   │
│  └─────────────────────────────────────────────────────┘   │
│                     ↓                                      │
│  ┌─────────────────────────────────────────────────────┐   │
│  │ Step 2: Infer from the model's internal knowledge   │   │
│  │  → LLM.generate("What is the principle of...")     │   │
│  │  → model: deepseek-r1:1.5b                         │   │
│  └─────────────────────────────────────────────────────┘   │
│                     ↓                                      │
│  ┌─────────────────────────────────────────────────────┐   │
│  │ Step 3: Build the response                          │   │
│  │  response = {                                      │   │
│  │    "response": "A quantum computer...",             │   │
│  │    "confidence": 0.75                              │   │
│  │  }                                                 │   │
│  └─────────────────────────────────────────────────────┘   │
│                     ↓                                      │
│  ┌─────────────────────────────────────────────────────┐   │
│  │ Step 4: Self-enrichment (save to DB)                │   │
│  │  confidence = 0.75 >= 0.7 ✓                        │   │
│  │  save_to_memory = True                             │   │
│  │  → _save_inference_to_db()                         │   │
│  │  → SQLite: knowledge_tiles table                   │   │
│  │     (to be saved to .iath as well in the future)   │   │
│  └─────────────────────────────────────────────────────┘   │
│                     ↓                                      │
│  ┌─────────────────────────────────────────────────────┐   │
│  │ Step 5: Save as master output                       │   │
│  │  is_master = True                                  │   │
│  │  confidence = 0.75 < 0.8 ❌                         │   │
│  │  → Not saved as training data                      │   │
│  └─────────────────────────────────────────────────────┘   │
└────────────────────┬───────────────────────────────────────┘
                     ↓
             (response display)
```

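**Key difference**:
- With RAG: the output is saved as training data when `confidence >= 0.8`
- Without RAG: the output is saved to the DB when `confidence >= 0.7`, and also as training data when `confidence >= 0.8`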
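
The two thresholds can be sketched as a single decision function. This is a minimal illustration rather than the code in `model_router.py`: `decide_saves`, `SaveDecision`, and the constant names are hypothetical, and the rule that RAG answers are not written back to the DB is an assumption read off the summary above.

```python
from dataclasses import dataclass

DB_SAVE_THRESHOLD = 0.7        # self-enrichment threshold (Flow 2, Step 4)
TRAINING_SAVE_THRESHOLD = 0.8  # master-output threshold (Step 5)


@dataclass
class SaveDecision:
    save_to_db: bool
    save_as_training_data: bool


def decide_saves(confidence: float, is_master: bool, used_rag: bool) -> SaveDecision:
    """Hypothetical helper: decide where an inference result is persisted."""
    # Assumption: only answers produced without DB knowledge are new knowledge.
    save_to_db = (not used_rag) and confidence >= DB_SAVE_THRESHOLD
    # Training data is collected only from the master's high-confidence output.
    save_as_training_data = is_master and confidence >= TRAINING_SAVE_THRESHOLD
    return SaveDecision(save_to_db, save_as_training_data)
```

With the values from Flow 2 (`confidence = 0.75`, master model, no RAG) this yields a DB save but no training-data save.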

### Flow 3: Running fine-tuning

```
┌──────────────────────────────────────────────────────────────┐
│ User: clicks "Start Fine-tuning" in the Training Dashboard   │
│  - Apprentice Model: microsoft/phi-2                         │
│  - Domain: medical                                           │
│  - Method: peft                                              │
│  - Epochs: 3                                                 │
└────────────────────┬─────────────────────────────────────────┘
                     ↓ HTTP POST /api/training/start
┌────────────────────────────────────────────────────────────┐
│ Backend: training.py                                       │
│  ┌─────────────────────────────────────────────────────┐   │
│  │ Step 1: Check that training data exists             │   │
│  │  → FineTuningManager.load_training_data("medical") │   │
│  │  → training_data/master_outputs/medical.jsonl      │   │
│  │  → Result: 150 samples                             │   │
│  └─────────────────────────────────────────────────────┘   │
│                     ↓                                      │
│  ┌─────────────────────────────────────────────────────┐   │
│  │ Step 2: Start the background task                   │   │
│  │  background_tasks.add_task(run_training)           │   │
│  │  → Returns a response immediately (async)          │   │
│  └─────────────────────────────────────────────────────┘   │
└────────────────────┬───────────────────────────────────────┘
                     ↓
┌────────────────────────────────────────────────────────────┐
│ NullAI Core: fine_tuning.py (background)                   │
│  ┌─────────────────────────────────────────────────────┐   │
│  │ Step 1: Load the model (4-bit quantization)         │   │
│  │  → AutoModelForCausalLM.from_pretrained(           │   │
│  │       "microsoft/phi-2",                           │   │
│  │       quantization_config=bnb_config               │   │
│  │     )                                              │   │
│  │  → Memory usage: ~7GB                              │   │
│  └─────────────────────────────────────────────────────┘   │
│                     ↓                                      │
│  ┌─────────────────────────────────────────────────────┐   │
│  │ Step 2: Configure LoRA                              │   │
│  │  → get_peft_model(model, lora_config)              │   │
│  │  → Trainable parameters: 4.2M / 2.7B (0.16%)       │   │
│  └─────────────────────────────────────────────────────┘   │
│                     ↓                                      │
│  ┌─────────────────────────────────────────────────────┐   │
│  │ Step 3: Prepare the data                            │   │
│  │  → format_training_examples_for_model()            │   │
│  │  → Alpaca format → model-specific prompt           │   │
│  │  → Dataset.from_dict({"text": prompts})            │   │
│  └─────────────────────────────────────────────────────┘   │
│                     ↓                                      │
│  ┌─────────────────────────────────────────────────────┐   │
│  │ Step 4: Start training                              │   │
│  │  Epoch 1/3:                                        │   │
│  │    [===>    ] 35% loss: 1.245                      │   │
│  │    → current_training_state.update({               │   │
│  │         "progress": 35,                            │   │
│  │         "current_epoch": 1,                        │   │
│  │         "loss": 1.245                              │   │
│  │       })                                           │   │
│  └─────────────────────────────────────────────────────┘   │
│                     ↓                                      │
│  ┌─────────────────────────────────────────────────────┐   │
│  │ Step 5: Finish                                      │   │
│  │  → trainer.save_model(output_dir)                  │   │
│  │  → training_data/checkpoints/apprentice_medical_*/ │   │
│  │  → current_training_state["is_training"] = False   │   │
│  └─────────────────────────────────────────────────────┘   │
└────────────────────┬───────────────────────────────────────┘
                     ↓
┌────────────────────────────────────────────────────────────┐
│ Frontend: TrainingDashboard.tsx                            │
│  - Polls every 2 seconds: GET /api/training/status         │
│  - Updates the progress bar: 35% → 67% → 100%            │
│  - On completion: re-fetches the checkpoint list           │
└────────────────────────────────────────────────────────────┘
```
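
The shared state that the dashboard polls can be sketched as a plain dict updated once per training step. This is only an illustration: `report_step` is a hypothetical helper, and treating `progress` as an overall percentage across all epochs is an assumption, since the diagram does not say whether the figure is per epoch or overall.

```python
from typing import Any, Dict


def report_step(
    state: Dict[str, Any],
    step: int,
    steps_per_epoch: int,
    epochs: int,
    loss: float,
) -> Dict[str, Any]:
    """Update the polled state after `step` optimizer steps (hypothetical helper)."""
    if steps_per_epoch <= 0 or epochs <= 0:
        raise ValueError("steps_per_epoch and epochs must be positive")
    total_steps = steps_per_epoch * epochs
    done = min(step, total_steps)
    state.update({
        "is_training": done < total_steps,
        "progress": 100 * done // total_steps,  # overall percentage, 0-100
        "current_epoch": min(epochs, done // steps_per_epoch + 1),
        "loss": loss,
    })
    return state
```

A status endpoint such as `GET /api/training/status` could then return this dict as-is, which keeps the polling frontend trivial.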

---

## Technology stack details

### Frontend

```
React 18.2 + TypeScript 5.0
├─ Vite 4.4 (build tool)
├─ TailwindCSS 3.3 (styling)
└─ axios (HTTP client)

Main components:
- EngineManager.tsx      (417 lines) - Engine management UI
- InferencePanel.tsx     (321 lines) - Inference panel
- TrainingDashboard.tsx  (400 lines) - Training dashboard
- KnowledgePanel.tsx     (185 lines) - Knowledge browser
```

### Backend

```
FastAPI 0.115.6 + Python 3.13+
├─ Uvicorn (ASGI server)
├─ SQLAlchemy 2.0 (ORM)
├─ Pydantic 2.10 (validation)
└─ Alembic (migrations)

Main APIs:
- /api/config/*     - Engine management
- /api/questions    - Run inference
- /api/training/*   - Fine-tuning
- /api/knowledge/*  - Knowledge tile management
```

### NullAI Core

```
Python 3.13+
├─ transformers 4.36+ (HuggingFace)
├─ torch 2.0+ (PyTorch)
├─ peft 0.7+ (LoRA/QLoRA)
├─ trl 0.7+ (Transformer Reinforcement Learning; SFT/RLHF trainers)
├─ datasets 2.15+ (dataset processing)
├─ bitsandbytes 0.41+ (quantization)
├─ accelerate 0.25+ (distributed training)
├─ zstandard 0.22+ (.iath compression)
└─ numpy 1.24+ (numerical computing)

Main modules:
- model_router.py     (800 lines) - RAG integration, engine management
- iath_memory.py      (362 lines) - 6-dimensional spatial memory
- fine_tuning.py      (640 lines) - Fine-tuning
- llm_providers.py    (390 lines) - LLM provider integration
```

### LLM providers

```
1. Ollama
   - Local model management
   - ollama pull deepseek-r1:1.5b
   - API: http://localhost:11434

2. HuggingFace Transformers
   - Loads models directly
   - AutoModelForCausalLM.from_pretrained()
   - Automatic GPU/CPU placement

3. MLX (Apple Silicon)
   - M1/M2/M3 Macs only
   - Takes advantage of unified memory
   - mlx-lm library

4. GGUF (llama-cpp-python)
   - Quantized models (.gguf)
   - Well suited to CPU inference
   - Supports GPU acceleration
```

---

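## Common misconceptions and caveats

### Misconception 1: "RAG is always used"

❌ **Misconception**: Every inference goes through RAG
✅ **Reality**: RAG only kicks in when the DB holds relevant knowledge

```python
# Actual behavior
if has_knowledge:
    ...  # RAG inference
else:
    ...  # Standard inference (no RAG)
```

**How to tell them apart**:
- With RAG: the response contains a populated `retrieved_knowledge` field
- Without RAG: `retrieved_knowledge` is empty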
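
A client-side check for this could look like the sketch below. `used_rag` is a hypothetical helper, and it assumes the response is a dict in which `retrieved_knowledge` is a list (or is missing) when nothing was retrieved.

```python
from typing import Any, Dict


def used_rag(response: Dict[str, Any]) -> bool:
    """Return True when the response carries non-empty retrieved knowledge."""
    # A missing key, None, and an empty list all count as "no RAG".
    return bool(response.get("retrieved_knowledge"))
```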

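### Misconception 2: "An apprentice becomes the master automatically"

❌ **Misconception**: Once fine-tuning finishes, the apprentice is promoted to master automatically
✅ **Reality**: Promotion is a manual step

```
Fine-tuning completes
  ↓
Checkpoint saved
  ↓
[Manual step] Click Promote in Engine Manager
  ↓
Apprentice is promoted to master
```

**Reason**: A human should check the quality before promotion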

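### Misconception 3: "The .iath file is updated automatically"

❌ **Misconception**: AI-generated knowledge is saved to .iath automatically
✅ **Reality**: It currently goes to SQLite and JSONL only; saving to .iath is not implemented yet (Priority 2)

```
Current state:
AI-generated knowledge → SQLite + JSONL ✅
                       → .iath ❌ (not implemented)

After Priority 2 is implemented:
AI-generated knowledge → SQLite + JSONL + .iath ✅
```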

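### Misconception 4: "Fine-tuning trains every parameter"

❌ **Misconception**: The whole model (2.7B parameters) is trained
✅ **Reality**: Only the LoRA adapter (4.2M parameters) is trained

```
Parameters being trained:
- Base model: 2,700,000,000 → frozen (not trained)
- LoRA:           4,200,000 → trained ✅

Trainable parameter ratio: 0.16%
```

**Benefits**:
- Lower memory use (40GB → 12GB)
- Shorter training time (10 hours → 2 hours)
- The base model is left unchanged (safe)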
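
The 0.16% figure follows directly from the two parameter counts, and the size of a single LoRA adapter follows from its rank. The sketch below reproduces the arithmetic; the 2560 x 2560 layer with rank 8 is an illustrative assumption, not the project's actual LoRA configuration.

```python
def lora_param_count(d_in: int, d_out: int, rank: int) -> int:
    """Parameters LoRA adds to one d_out x d_in weight.

    LoRA learns two small matrices: A (rank x d_in) and B (d_out x rank).
    """
    return rank * (d_in + d_out)


def trainable_ratio(trainable: int, total: int) -> float:
    """Fraction of all parameters that receive gradient updates."""
    return trainable / total


# Figures quoted in this guide
ratio = trainable_ratio(4_200_000, 2_700_000_000)
print(f"{ratio:.2%}")  # prints 0.16%

# Illustrative only: one 2560 x 2560 projection adapted with rank 8
print(lora_param_count(2560, 2560, 8))  # prints 40960
```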

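### Misconception 5: "SQLite and .iath store the same data"

❌ **Misconception**: SQLite and .iath duplicate each other
✅ **Reality**: They have completely different roles

| Database | Contents | Purpose |
|-------------|---------|------|
| SQLite | Users, workspaces, inference history, metadata | Application management |
| .iath | Knowledge Tiles (6-dimensional coordinates + content) | Knowledge retrieval and RAG inference |

**Example**:
```
SQLite:
- users table: nullai_default_user
- workspaces table: default_workspace
- inference_history: past questions and answers

.iath:
- Tile 1: [0.2, 0.8, 0.3, 0.9, 0.7, 0.8] "How the heart works..."
- Tile 2: [0.3, 0.8, 0.4, 0.85, 0.65, 0.75] "The circulatory system..."
```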
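
To make the .iath side concrete, a tile can be modelled as six coordinates plus content. This is a minimal sketch, not the structure used in `iath_memory.py`: the `KnowledgeTile` class and its property names are hypothetical, while the `[x, y, z, c, g, v]` layout and the 0.0-1.0 range follow the coordinate format described elsewhere in this guide.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class KnowledgeTile:
    """Hypothetical in-memory view of one tile: [x, y, z, c, g, v] plus content."""
    coordinates: Tuple[float, ...]
    content: str

    def __post_init__(self) -> None:
        if len(self.coordinates) != 6:
            raise ValueError("a tile needs exactly 6 coordinates")
        if not all(0.0 <= value <= 1.0 for value in self.coordinates):
            raise ValueError("coordinates must lie in [0.0, 1.0]")

    @property
    def domain_position(self) -> Tuple[float, ...]:
        """Domain-specific 3D position [x, y, z]."""
        return self.coordinates[:3]

    @property
    def meta_position(self) -> Tuple[float, ...]:
        """Meta space [c, g, v]: certainty, granularity, verification."""
        return self.coordinates[3:]
```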

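### Misconception 6: "The confidence value is computed by the AI"

❌ **Misconception**: The AI evaluates itself and returns a confidence score
✅ **Reality**: It is currently a **fixed value** (one per provider)

```python
# llm_providers.py
class OllamaProvider:
    async def infer(self, model_config, prompt, temperature):
        response_text = ...  # call to Ollama elided
        return {
            "response": response_text,
            "confidence": 0.85  # ← fixed value!
        }
```

**Planned improvements**:
- Cross-checking across multiple models
- Computing the uncertainty of the response (entropy)
- Learning from human feedback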
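
As an illustration of the entropy idea, confidence can be derived from per-token probability distributions. This is a sketch under assumptions, not a planned implementation: it presumes the provider can expose a probability distribution for each generated token, and `entropy_confidence` is a hypothetical name.

```python
import math
from typing import Sequence


def normalized_entropy(probs: Sequence[float]) -> float:
    """Shannon entropy of one distribution, scaled to [0, 1] by its maximum log(n)."""
    if len(probs) < 2:
        return 0.0
    entropy = -sum(p * math.log(p) for p in probs if p > 0.0)
    return entropy / math.log(len(probs))


def entropy_confidence(token_distributions: Sequence[Sequence[float]]) -> float:
    """Confidence = 1 - mean normalized entropy over the generated tokens."""
    if not token_distributions:
        raise ValueError("at least one token distribution is required")
    total = sum(normalized_entropy(probs) for probs in token_distributions)
    return 1.0 - total / len(token_distributions)
```

A fully peaked distribution gives a confidence of 1.0 and a uniform one gives 0.0, so the existing 0.7 and 0.8 thresholds would need recalibrating against real outputs before switching over.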

---

## Rationale for design decisions

### Decision 1: Why PEFT was adopted

**Candidates**:
1. Full fine-tuning
2. PEFT (LoRA/QLoRA)
3. Adapter
4. Prompt Tuning

**Adopted**: PEFT (QLoRA)

**Rationale**:
```
Comparison:

                  Memory  Speed  Quality  Generality
Full FT           ×       ×      ⭐⭐⭐   ⭐⭐⭐
PEFT (QLoRA)      ⭐⭐⭐  ⭐⭐   ⭐⭐⭐   ⭐⭐⭐
Adapter           ⭐⭐    ⭐⭐   ⭐⭐     ⭐⭐
Prompt Tuning     ⭐⭐⭐  ⭐⭐⭐ ⭐       ⭐
```

**Conclusion**: PEFT offers the best overall balance

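### Decision 2: Why the Alpaca format was adopted

**Candidates**:
1. Alpaca
2. ShareGPT
3. OpenAssistant
4. Custom

**Adopted**: Alpaca

**Rationale**:
- Open source and widely adopted
- Clear instruction-input-output structure
- Compatible with HuggingFace datasets
- Established as a community best practice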
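
For reference, an Alpaca record and its prompt rendering can be sketched as follows. The helper names are hypothetical and the template is the commonly used Stanford Alpaca wording; the project's `format_training_examples_for_model()` may format prompts differently.

```python
import json
from typing import Dict

PROMPT_WITH_INPUT = (
    "Below is an instruction that describes a task, paired with an input that "
    "provides further context. Write a response that appropriately completes "
    "the request.\n\n"
    "### Instruction:\n{instruction}\n\n### Input:\n{input}\n\n### Response:\n{output}"
)
PROMPT_NO_INPUT = (
    "Below is an instruction that describes a task. Write a response that "
    "appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n### Response:\n{output}"
)


def to_alpaca_record(question: str, answer: str, context: str = "") -> Dict[str, str]:
    """Build one instruction-input-output record."""
    return {"instruction": question, "input": context, "output": answer}


def to_jsonl_line(record: Dict[str, str]) -> str:
    """Serialize one record as a single JSONL line, keeping non-ASCII text readable."""
    return json.dumps(record, ensure_ascii=False)


def format_prompt(record: Dict[str, str]) -> str:
    """Render a record as training text, dropping the Input section when it is empty."""
    template = PROMPT_WITH_INPUT if record["input"] else PROMPT_NO_INPUT
    return template.format(**record)
```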

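### Decision 3: Why hybrid search

**Candidates**:
1. Text only
2. Coordinates only
3. Hybrid

**Adopted**: Hybrid

**Rationale**:
```
Text only:
- Pros: simple
- Cons: misses synonyms

Coordinates only:
- Pros: finds semantically related knowledge
- Cons: fails when the coordinates are inaccurate

Hybrid:
- Pros: combines the strengths of both
- Cons: requires parameter tuning (text_weight, spatial_weight)
```

**Current settings**:
```python
text_weight = 0.4
spatial_weight = 0.6
# → slightly favors coordinates (semantic relatedness takes priority)
```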
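
How the two weights might combine into one ranking score is sketched below. This is an assumption-laden illustration, not the scoring used by `DendriticMemorySpace`: the function names are hypothetical, and converting Euclidean distance in the unit 6-cube into a similarity is one possible choice among several.

```python
import math
from typing import Sequence

TEXT_WEIGHT = 0.4
SPATIAL_WEIGHT = 0.6
MAX_DISTANCE = math.sqrt(6)  # longest possible distance inside the unit 6-cube


def spatial_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Map Euclidean distance between two 6D points to a similarity in [0, 1]."""
    if len(a) != 6 or len(b) != 6:
        raise ValueError("both points must have 6 coordinates")
    return 1.0 - math.dist(a, b) / MAX_DISTANCE


def hybrid_score(
    text_score: float,
    spatial_score: float,
    text_weight: float = TEXT_WEIGHT,
    spatial_weight: float = SPATIAL_WEIGHT,
) -> float:
    """Weighted sum of a text-match score and a spatial similarity, both in [0, 1]."""
    return text_weight * text_score + spatial_weight * spatial_score
```

Because the weights sum to 1.0, the combined score stays in [0, 1] as long as both inputs do, which keeps it comparable across queries.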

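### Decision 4: Why circular imports are resolved with lazy imports

**Candidates**:
1. Lazy import (import inside the function)
2. Architecture change (untangling the dependencies)
3. Introducing an intermediate module

**Adopted**: Lazy import

**Rationale**:
- Solves the problem with minimal changes
- Negligible performance impact
- No large rewrite of existing code

**Implementation example**:
```python
def _check_db_knowledge(self, domain_id, prompt):
    # Import inside the function → avoids the circular import
    from backend.app.database.session import SessionLocal
    db = SessionLocal()
    # ...
```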

---

## Considerations when extending

### Adding a new LLM provider

**Steps**:

1. Add a new class to `llm_providers.py`
```python
class NewProvider:
    async def infer(self, model_config, prompt, temperature):
        # implementation
        pass

    async def infer_streaming(self, model_config, prompt, temperature):
        # implementation
        pass
```

2. Add a branch to `_perform_llm_inference()` in `model_router.py`
```python
if provider == "ollama":
    result = await self.ollama_provider.infer(...)
elif provider == "new_provider":  # ← added
    result = await self.new_provider.infer(...)
```

3. Add the provider to the `ModelProvider` enum in `backend/app/config.py`
```python
from enum import Enum

class ModelProvider(str, Enum):
    OLLAMA = "ollama"
    HUGGINGFACE = "huggingface"
    NEW_PROVIDER = "new_provider"  # ← added
```
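
The interface that every provider is expected to satisfy can be written down as a `typing.Protocol`, with a lookup table as one alternative to the `if`/`elif` chain. This is a sketch of a possible design, not the current code: `LLMProvider`, `EchoProvider`, `PROVIDERS`, and `perform_inference` are all hypothetical names, and only the `infer` signature is taken from the steps above.

```python
import asyncio
from typing import Any, Dict, Protocol


class LLMProvider(Protocol):
    """Structural type: any class with a matching `infer` coroutine qualifies."""

    async def infer(
        self, model_config: Dict[str, Any], prompt: str, temperature: float
    ) -> Dict[str, Any]: ...


class EchoProvider:
    """Stand-in provider used only for this example."""

    async def infer(
        self, model_config: Dict[str, Any], prompt: str, temperature: float
    ) -> Dict[str, Any]:
        return {"response": f"echo: {prompt}", "confidence": 0.0}


PROVIDERS: Dict[str, LLMProvider] = {"echo": EchoProvider()}


async def perform_inference(
    provider_name: str,
    model_config: Dict[str, Any],
    prompt: str,
    temperature: float = 0.7,
) -> Dict[str, Any]:
    """Dispatch to a registered provider by name."""
    try:
        provider = PROVIDERS[provider_name]
    except KeyError:
        raise ValueError(f"Unknown provider: {provider_name}") from None
    return await provider.infer(model_config, prompt, temperature)


if __name__ == "__main__":
    print(asyncio.run(perform_inference("echo", {}, "hello")))
```

With a registry like this, adding a provider touches the table and the enum but not the routing logic.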

### Adding a new domain

**Steps**:

1. Define the domain's coordinate space in the `.iath` file
```
Medical domain: medical_space [x, y, z]
Legal domain: legal_space [x, y, z]  ← added
  - x: field of law (civil, criminal, commercial, ...)
  - y: court level (district court, high court, Supreme Court)
  - z: era (classical, modern, latest)
```

2. Add the domain settings to `backend/app/config.py`
```python
domains = [
    {"domain_id": "medical", "name": "医療"},
    {"domain_id": "legal", "name": "法律"}  # ← added
]
```

3. Create the training data directory and file
```bash
mkdir -p training_data/master_outputs/
touch training_data/master_outputs/legal.jsonl
```

### Implementing automatic coordinate estimation (Priority 2)

**Design sketch**:

```python
# null_ai/coordinate_estimator.py
import json
from typing import List


class CoordinateEstimator:
    def __init__(self, llm_model):
        """
        Estimate coordinates using DeepSeek R1.
        """
        self.llm = llm_model

    async def estimate_coordinates(
        self,
        prompt: str,
        response: str,
        domain_id: str
    ) -> List[float]:
        """
        Estimate the 6-dimensional coordinates.

        Returns: [x, y, z, c, g, v]
        """
        # Build the estimation prompt
        estimation_prompt = f"""You are an expert in knowledge space mapping.
Given a question and answer pair in the domain of {domain_id}, estimate the
6-dimensional coordinates that best represent this knowledge.

Coordinates format: [x, y, z, c, g, v]
- {domain_id}_space [x, y, z]: domain-specific 3D space (0.0-1.0)
- meta_space [c, g, v]: Certainty, Granularity, Verification (0.0-1.0)

Question: {prompt}
Answer: {response}

Output ONLY the coordinates as a JSON array: [x, y, z, c, g, v]
"""

        # Ask the LLM to estimate the coordinates
        result = await self.llm.generate(estimation_prompt)

        # Parse the JSON array (raises json.JSONDecodeError on malformed output)
        coords = json.loads(result)

        # Validate explicitly; assert statements are skipped under `python -O`
        if len(coords) != 6:
            raise ValueError(f"Expected 6 coordinates, got {len(coords)}")
        if not all(0.0 <= c <= 1.0 for c in coords):
            raise ValueError("Coordinates must lie in [0.0, 1.0]")

        return coords
```
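
One practical wrinkle with the design above: reasoning models often wrap their answer in extra text (for example a `<think>` block), so passing the raw output straight to `json.loads` can fail. The sketch below shows one way to extract the array more defensively. `parse_coordinates` is a hypothetical helper, and picking the last bracketed group is an assumption about where the final answer appears.

```python
import json
import re
from typing import List


def parse_coordinates(raw: str) -> List[float]:
    """Extract and validate a 6-element JSON array from free-form LLM output."""
    # Find flat [...] groups and keep the last one, assuming the final
    # answer follows any reasoning text.
    matches = re.findall(r"\[[^\[\]]*\]", raw)
    if not matches:
        raise ValueError("no JSON array found in model output")
    coords = json.loads(matches[-1])  # JSONDecodeError is a ValueError subclass
    if len(coords) != 6:
        raise ValueError(f"expected 6 coordinates, got {len(coords)}")
    for value in coords:
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            raise ValueError(f"non-numeric coordinate: {value!r}")
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"coordinate out of range: {value!r}")
    return [float(value) for value in coords]
```

Every failure mode surfaces as `ValueError`, so a caller can catch a single exception type and fall back to default coordinates or retry.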

### Implementing real-time progress over WebSocket

**Design sketch**:

```python
# backend/app/main.py
from fastapi import WebSocket

@app.websocket("/ws/training/{session_id}")
async def training_websocket(websocket: WebSocket, session_id: str):
    await websocket.accept()

    # Progress callback: push each state update to the client
    async def progress_callback(state):
        await websocket.send_json({
            "type": "progress",
            "data": state
        })

    # Start fine-tuning (other arguments elided)
    await fine_tuning_manager.start_training(
        ...,
        progress_callback=progress_callback
    )
```

---

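## Summary: the essence of the project

NullAI is neither just a RAG system nor just a fine-tuning tool.

**The essence of NullAI**:
```
A self-evolving knowledge ecosystem

Master AI → knowledge generation → apprentice training → promotion → new apprentice → cycle continues
   ↓                                        ↑
DB growth (self-enrichment)             Fine-tuning
   ↓                                        ↑
Tree-structured spatial memory (6D)     High-quality training data
   ↓                                        ↑
Semantic knowledge organization         Inheriting the master's knowledge
   ↓                                        ↑
  └───────────────── cycle ─────────────────┘
```

**How the four core ideas fit together**:
1. **Fallen Tree System**: evolution through generational turnover
2. **Separated DB structure**: reliability and self-enrichment
3. **Tree-structured spatial memory**: semantic organization of knowledge
4. **Local-first**: privacy and cost

All of these are **organically connected**, forming an ecosystem in which the AI evolves itself.

---

**Once you understand this guide, you are ready to carry NullAI's design philosophy forward correctly.**

**Good luck! 🌲🔥**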

---

**Document Version**: 1.0
**Total Pages**: 60+
**Total Words**: 15,000+
**Author**: Claude (Sonnet 4.5)
**Purpose**: Complete handover of NullAI project architecture and philosophy