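The Complete Guide to the NullAI Project

Last updated: 2025-12-02
Audience: every developer who takes over this project
Purpose: to understand the project as a whole and carry its design philosophy forward correctly

Project Overview

What is NullAI?

NullAI is a self-evolving, multi-domain knowledge reasoning engine.

Core questions and answers

Q: What problem is it trying to solve? A: It tackles two problems at once: AI hallucination and the limited capability of small models.

Q: How does it solve them? A:

DB-first inference (RAG) → fewer hallucinations
Master → apprentice fine-tuning → better small-model performance
Dendritic spatial memory → semantic organization of knowledge and fast retrieval
Self-expansion cycle → automatic growth of the knowledge base

Q: How does it differ from other RAG systems? A:

❌ Typical RAG: only searches a vector DB
✅ NullAI: places knowledge at 6-dimensional spatial coordinates, enabling semantic nearest-neighbor search

Q: How does it differ from other fine-tuning systems? A:

❌ Typical fine-tuning: humans create the training data by hand
✅ NullAI: the master AI generates training data automatically → the apprentice learns → the apprentice is promoted to master → the cycle repeats indefinitely

Origin of the project name

Null = zero (hallucinations), AI = Artificial Intelligence

→ An AI that aims for zero hallucinations

Four Core Ideas (the points we care most about)

1️⃣ Fallen Tree System

Meaning of the metaphor

When a great old tree falls in a forest, its nutrients feed new saplings. In NullAI:

🌲 Great tree (master model): a high-performance but heavy AI (e.g., DeepSeek R1 32B)
🌱 Sapling (apprentice model): a lightweight AI that starts out empty (e.g., Phi-2 2.7B)
🍂 Nutrients (training data): the master's high-quality outputs (Alpaca-format JSONL)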
System flow

┌─────────────────────────────────────────────────────┐
│ Phase 1: The master's reign                         │
├─────────────────────────────────────────────────────┤
│ The master (DeepSeek R1) handles inference          │
│  ↓                                                  │
│ High-quality outputs (confidence >= 0.8) auto-saved │
│  ↓                                                  │
│ training_data/master_outputs/*.jsonl                │
└─────────────────────────────────────────────────────┘
                      ↓
┌─────────────────────────────────────────────────────┐
│ Phase 2: Fine-tuning                                │
├─────────────────────────────────────────────────────┤
│ Train the apprentice (Phi-2) on the training data   │
│  ↓                                                  │
│ Apprentice improves (absorbs the master's knowledge)│
│  ↓                                                  │
│ training_data/checkpoints/apprentice_*/             │
└─────────────────────────────────────────────────────┘
                      ↓
┌─────────────────────────────────────────────────────┐
│ Phase 3: Generational change (the tree falls)       │
├─────────────────────────────────────────────────────┤
│ Apprentice has grown enough → promoted to master    │
│  ↓                                                  │
│ Old master (DeepSeek) retires (keeps a special role)│
│  ↓                                                  │
│ A new, empty apprentice is created                  │
└─────────────────────────────────────────────────────┘
                      ↓
                  (cycle repeats)

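Key design decisions

Q: Why not delete the master entirely? A: The retired master (DeepSeek) stays on as a "permanent mentor" and handles:

Prompt generation during DB expansion
Initial knowledge generation for new domains
Quality checks

Q: When can an apprentice become the master? A:

After fine-tuning completes, promotion is done manually
Promotion based on automatic evaluation is planned (not yet implemented)

Q: Can several apprentices be trained at the same time? A: Yes. A separate apprentice can be trained for each domain:

Medical-domain apprentice
Legal-domain apprentice
General-knowledge apprentice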

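2️⃣ Database Separation Structure

Design philosophy

Decision flow when a question arrives:

Question → search the knowledge DB first
         ├─ Found → infer using DB knowledge (RAG) ✅ high reliability
         └─ Not found → infer from the model's internal knowledge ⚠️ hallucination risk
                            ↓
                        Save that output to the DB (self-expansion)

Why DB-first

Knowledge source	Reliability	Basis	Hallucination
Knowledge DB (.iath)	⭐⭐⭐⭐⭐	Verified by humans or written by experts	Near zero
AI-generated knowledge	⭐⭐⭐	The model's internal knowledge (from its training data)	Moderate
AI hallucination	⭐	Guesswork and fabrication	High

Conclusion: whatever is in the knowledge DB is always used → fewer hallucinations

How self-expansion works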

# Pseudocode
async def infer(question):
    # Step 1: search the DB
    db_knowledge = search_db(question)

    if db_knowledge:
        # Step 2a: RAG inference (use the DB knowledge)
        response = llm.generate(
            f"Based on this verified knowledge: {db_knowledge}\n"
            f"Answer: {question}"
        )
        return response
    else:
        # Step 2b: infer from the model's internal knowledge
        response = llm.generate(question)

        # Step 3: save the answer if it is high quality (self-expansion)
        if response.confidence >= 0.7:
            save_to_db(question, response)

        return response

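Key design decisions

Q: Why use both SQLite and .iath? A: They have separate roles

SQLite: metadata (users, workspaces, inference history)
.iath: the knowledge tiles themselves (6D coordinates + content)

Q: What is the difference between confidence >= 0.7 and 0.8? A:

>= 0.7: saved to the DB (self-expansion) ← somewhat lenient
>= 0.8: saved as training data ← stricter (high quality only)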
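The two thresholds can be summarized in a small helper. This is an illustrative sketch rather than the project's code: the function name `storage_actions` and its parameters are assumptions, while the 0.7 and 0.8 cut-offs, the "only when the DB had no answer" condition, and the "master outputs only" condition come from this guide.

```python
DB_SAVE_THRESHOLD = 0.7        # self-expansion threshold stated in the guide
TRAINING_SAVE_THRESHOLD = 0.8  # training-data threshold stated in the guide


def storage_actions(confidence: float, used_rag: bool, is_master: bool) -> list[str]:
    """Return the stores a response would be written to (hypothetical helper)."""
    actions = []
    # Self-expansion applies only when the answer did not come from DB knowledge.
    if not used_rag and confidence >= DB_SAVE_THRESHOLD:
        actions.append("knowledge_db")
    # Only master outputs become training data, and only above the stricter bar.
    if is_master and confidence >= TRAINING_SAVE_THRESHOLD:
        actions.append("training_data")
    return actions


print(storage_actions(0.75, used_rag=False, is_master=True))
print(storage_actions(0.88, used_rag=True, is_master=True))
```

A response at 0.75 confidence clears the DB bar but not the training-data bar, which is the gap the two thresholds are meant to create.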

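Q: Is no human check needed when AI-generated knowledge is saved to the DB? A: At present it is saved automatically. Planned for the future:

A review flow run by domain experts
Quality rating through community voting
Automatic verification by AI (cross-checking with a different model)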

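3️⃣ Dendritic Memory Space

Meaning of the metaphor

Like the dendrites of neurons in the human brain, knowledge is organized spatially.

A conventional DB:

Knowledge 1: "The heart is a circulatory organ"
Knowledge 2: "The brain is part of the central nervous system"
→ stored separately (the relationship between them is unknown)

Dendritic spatial memory:

Knowledge 1: coordinates [0.2, 0.8, 0.3, 0.9, 0.7, 0.8]
Knowledge 2: coordinates [0.3, 0.8, 0.4, 0.85, 0.65, 0.75]
→ nearby coordinates = semantically related → can be retrieved together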
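To make "nearby" concrete, the distance between the two example tiles can be computed directly. The sketch below uses only the standard library; `far_tile` is a made-up coordinate added for contrast and does not come from the project.

```python
import math

tile_1 = [0.2, 0.8, 0.3, 0.9, 0.7, 0.8]     # Knowledge 1 from the example above
tile_2 = [0.3, 0.8, 0.4, 0.85, 0.65, 0.75]  # Knowledge 2 from the example above
far_tile = [0.9, 0.1, 0.9, 0.2, 0.1, 0.1]   # hypothetical unrelated tile

# math.dist computes the Euclidean distance between two points of equal dimension.
near = math.dist(tile_1, tile_2)
far = math.dist(tile_1, far_tile)
print(f"near: {near:.3f}, far: {far:.3f}")
```

The two example tiles are about 0.166 apart (the square root of 0.0275), far closer than the contrast tile, so a nearest-neighbor query around either one would return the other first.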

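Details of the 6D coordinate system

Knowledge Tile coords = [x, y, z, c, g, v]
                        ─────┬───── ─────┬─────
                     medical_space  meta_space

medical_space [x, y, z]: a domain-specific 3D space

Example: the medical domain

Axis	Meaning	Example
x	Anatomical location	0.0 = nervous system, 0.5 = circulatory, 1.0 = digestive
y	Pathological category	0.0 = infectious disease, 0.5 = metabolic disease, 1.0 = trauma
z	Care level	0.0 = prevention, 0.5 = diagnosis, 1.0 = treatment

meta_space [c, g, v]: a 3D space of meta-information

Axis	Meaning	Value range
c (Certainty)	Certainty	0.0 = hypothesis, 0.5 = accepted theory, 1.0 = established fact
g (Granularity)	Granularity	0.0 = overview, 0.5 = detailed, 1.0 = specialist-level
v (Verification)	Verification status	0.0 = unverified, 0.5 = expert-reviewed, 1.0 = confirmed by multiple sources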
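One way to hold such a coordinate in code is a small validated record. This is a minimal sketch, not the project's data model: the class name `TileCoordinates`, the property names, and the range check are assumptions based on the 0.0 to 1.0 scales in the tables above.

```python
from dataclasses import astuple, dataclass


@dataclass(frozen=True)
class TileCoordinates:
    """Hypothetical container for a tile's 6D coordinates."""
    x: float  # domain axis 1 (e.g., anatomical location)
    y: float  # domain axis 2 (e.g., pathological category)
    z: float  # domain axis 3 (e.g., care level)
    c: float  # certainty
    g: float  # granularity
    v: float  # verification status

    def __post_init__(self):
        # Every axis in the tables above is described on a 0.0-1.0 scale.
        for name, value in zip("xyzcgv", astuple(self)):
            if not 0.0 <= value <= 1.0:
                raise ValueError(f"{name}={value} is outside [0.0, 1.0]")

    @property
    def domain_space(self):
        """The domain-specific half (called medical_space in the example)."""
        return (self.x, self.y, self.z)

    @property
    def meta_space(self):
        """The meta-information half."""
        return (self.c, self.g, self.v)


# A circulatory, metabolic, treatment-level tile that is established fact.
coords = TileCoordinates(x=0.5, y=0.5, z=1.0, c=1.0, g=0.5, v=0.5)
print(coords.domain_space, coords.meta_space)
```

Rejecting out-of-range values at construction time keeps hand-edited coordinates from silently distorting distance calculations later.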

How search works

1. Text search (conventional)

def search_by_text(query):
    # Simple keyword (substring) matching
    results = [tile for tile in all_tiles
               if query in tile.content]
    return results

Problem: synonyms are missed

Searching for "heart disease" does not match "cardiovascular disease"

2. Coordinate search (spatial search)

def search_by_coordinates(query_coords, top_k=5):
    # Compute the 6D Euclidean distance to every tile
    distances = []
    for tile in all_tiles:
        dist = euclidean_distance(query_coords, tile.coords)
        distances.append((tile, dist))

    # Sort by ascending distance
    distances.sort(key=lambda x: x[1])
    return distances[:top_k]

Advantage: semantically close knowledge is found automatically

Nearby coordinates = semantically related

3. Hybrid search (recommended)

def hybrid_search(query_text, query_coords=None, top_k=5):
    # Text-match score
    text_scores = calculate_text_match(query_text)

    # Spatial score; without coordinates the spatial term contributes nothing
    if query_coords is not None:
        spatial_scores = calculate_spatial_distance(query_coords)
    else:
        spatial_scores = 0.0

    # Combined score = α * text_score + β * (1 - spatial_distance)
    combined_scores = 0.4 * text_scores + 0.6 * spatial_scores

    return top_k_results(combined_scores)

.iath file format

.iath file structure:

┌────────────────────────────────────┐
│ Header (64 bytes)                  │  ← magic number, version
├────────────────────────────────────┤
│ Index (JSON, variable length)      │  ← list of tile IDs and offsets
│ {                                  │
│   "tiles": [                       │
│     {"id": "tile_001", "offset": 512},
│     {"id": "tile_002", "offset": 2048}
│   ]                                │
│ }                                  │
├────────────────────────────────────┤
│ Data Section (zstd-compressed)     │
│   ┌──────────────────────┐         │
│   │ Tile 1 (JSON)        │         │
│   │ - metadata           │         │
│   │ - content            │         │
│   │ - coordinates        │         │
│   │ - verification       │         │
│   └──────────────────────┘         │
│   ┌──────────────────────┐         │
│   │ Tile 2 (JSON)        │         │
│   └──────────────────────┘         │
│   ...                              │
└────────────────────────────────────┘
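The header / index / data layering can be illustrated with a toy container. Everything specific here is an assumption made for the sketch and not the real .iath specification: the `IATH` magic bytes, the `<4sHI` header fields, the `size` entry in the index, and offsets counted from the start of the data section. `zlib` stands in for zstd so that the example needs only the standard library.

```python
import io
import json
import struct
import zlib  # stand-in for zstd; the real format is described as zstd-compressed

HEADER_SIZE = 64
HEADER_FORMAT = "<4sHI"  # assumed: 4-byte magic, uint16 version, uint32 index size
MAGIC = b"IATH"          # assumed magic number


def write_container(tiles: dict) -> bytes:
    """Serialize tiles as a fixed header + JSON index + compressed data section."""
    entries, data = [], b""
    for tile_id, tile in tiles.items():
        blob = zlib.compress(json.dumps(tile).encode("utf-8"))
        # Offsets here are relative to the start of the data section (assumption).
        entries.append({"id": tile_id, "offset": len(data), "size": len(blob)})
        data += blob
    index = json.dumps({"tiles": entries}).encode("utf-8")
    header = struct.pack(HEADER_FORMAT, MAGIC, 1, len(index)).ljust(HEADER_SIZE, b"\0")
    return header + index + data


def read_tile(raw: bytes, tile_id: str):
    """Read one tile by seeking to its offset instead of decoding the whole file."""
    stream = io.BytesIO(raw)
    magic, _version, index_size = struct.unpack_from(HEADER_FORMAT, stream.read(HEADER_SIZE))
    if magic != MAGIC:
        raise ValueError("unexpected magic number")
    index = json.loads(stream.read(index_size))
    data_start = HEADER_SIZE + index_size
    for entry in index["tiles"]:
        if entry["id"] == tile_id:
            stream.seek(data_start + entry["offset"])
            return json.loads(zlib.decompress(stream.read(entry["size"])))
    return None


raw = write_container({"tile_001": {"content": "a"}, "tile_002": {"content": "b"}})
print(read_tile(raw, "tile_002"))
```

The point of the index is visible in `read_tile`: a single tile is located and decompressed without touching the others.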

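Why zstd compression?

High compression ratio (typically better than gzip)
Fast decompression
Developed at Facebook (a track record of reliability)

Key design decisions

Q: Why 6 dimensions? Wouldn't 3 or 10 work? A:

3 dimensions: only domain knowledge fits; meta-information cannot be expressed
10 or more dimensions: curse of dimensionality (search slows down) and too hard for humans to interpret
6 dimensions: domain (3) + meta (3) = a good balance

Q: Who decides the coordinates? A:

Currently: set manually by a human (in dendritic-memory-editor)
Planned for Priority 2: automatic estimation by AI (DeepSeek generates the coordinates)

Q: How does .iath differ from FAISS (a vector similarity search library)? A:

Feature	.iath	FAISS
Coordinate dimensions	6 (human-interpretable)	e.g., 768 (depends on the embedding model)
Search speed	O(n) linear scan	Sub-linear with approximate indexes (a flat index is also O(n))
Semantic transparency	High (each axis has a defined meaning)	Low (black box)
Ease of editing	High (coordinates can be adjusted by hand)	Low (re-embedding required)

Conclusion: .iath prioritizes a knowledge base that humans can understand and edit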

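4️⃣ Local-First & One-Command Setup

Design philosophy

❌ Bad example (cloud-dependent):
pip install nullai
nullai --api-key=YOUR_OPENAI_KEY  # cloud API required
→ requires an internet connection, costly, privacy concerns

✅ NullAI:
./start_null_ai.sh  # runs entirely locally
→ works offline once dependencies and models are downloaded, free, privacy-preserving

How the one-command setup works

What start_null_ai.sh does automatically:

✅ Dependency check (Python, Node.js, Ollama)
✅ Virtual environment creation (venv)
✅ Python dependency installation
✅ Node.js dependency installation
✅ Database initialization (sql_app.db)
✅ Ollama startup
✅ Backend startup (port 8000)
✅ Frontend startup (port 5173)
✅ .iath memory load check

All the user has to do: run ./start_null_ai.sh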
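The first step, the dependency check, amounts to asking whether each tool is on `PATH`. The sketch below shows that idea in Python for illustration only; the actual script is a shell script, and the executable names `python3`, `node`, and `ollama` are assumptions about how the three dependencies are invoked.

```python
import shutil

REQUIRED_COMMANDS = ["python3", "node", "ollama"]  # assumed executable names


def find_missing(commands):
    """Return the commands that cannot be found on PATH."""
    return [cmd for cmd in commands if shutil.which(cmd) is None]


missing = find_missing(REQUIRED_COMMANDS)
if missing:
    print("Missing dependencies:", ", ".join(missing))
else:
    print("All dependencies found")
```

Failing fast at this step, before creating the virtual environment or starting servers, gives the user one clear message instead of a later, harder-to-read error.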

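Key design decisions

Q: Why use Ollama? Isn't HuggingFace alone enough? A:

Ollama: easy model management (just ollama pull deepseek-r1)
HuggingFace: manual downloads and path configuration are tedious

Q: Why not use Docker? A:

Docker: hard for beginners, and GPU passthrough is complicated
Shell script: simple, easy to debug, easy to customize

Overall system architecture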

┌─────────────────────────────────────────────────────────────────┐
│                        Frontend (React + TypeScript)            │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐          │
│  │ Engine       │  │ Inference    │  │ Training     │          │
│  │ Manager      │  │ Panel        │  │ Dashboard    │          │
│  └──────────────┘  └──────────────┘  └──────────────┘          │
│         │                  │                  │                 │
│         └──────────────────┼──────────────────┘                 │
│                           │                                     │
│                    HTTP/WebSocket                               │
│                           │                                     │
└───────────────────────────┼─────────────────────────────────────┘
                            │
┌───────────────────────────┼─────────────────────────────────────┐
│                  Backend (FastAPI)                              │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐          │
│  │ config.py    │  │ questions.py │  │ training.py  │          │
│  │ (Engine API) │  │ (Inference)  │  │ (Fine-tune)  │          │
│  └──────────────┘  └──────────────┘  └──────────────┘          │
│         │                  │                  │                 │
│         └──────────────────┼──────────────────┘                 │
│                           │                                     │
└───────────────────────────┼─────────────────────────────────────┘
                            │
┌───────────────────────────┼─────────────────────────────────────┐
│                  NullAI Core Logic                              │
│  ┌────────────────────────────────────────────────┐             │
│  │           model_router.py                      │             │
│  │  - RAG inference integration                   │             │
│  │  - Saving master outputs                       │             │
│  │  - Engine management (swap, promotion)         │             │
│  └────────────────────────────────────────────────┘             │
│         │                  │                  │                 │
│         ▼                  ▼                  ▼                 │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐          │
│  │ iath_memory  │  │ llm_providers│  │ fine_tuning  │          │
│  │ .py          │  │ .py          │  │ .py          │          │
│  │ (6D Search)  │  │ (4 Providers)│  │ (PEFT/Unslo) │          │
│  └──────────────┘  └──────────────┘  └──────────────┘          │
│         │                  │                  │                 │
└─────────┼──────────────────┼──────────────────┼─────────────────┘
          │                  │                  │
          ▼                  ▼                  ▼
┌──────────────────────────────────────────────────────────────┐
│                    External Services                         │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐       │
│  │ knowledge_   │  │ Ollama       │  │ HuggingFace  │       │
│  │ base.iath    │  │ (localhost)  │  │ Models       │       │
│  │ (6D Memory)  │  │              │  │              │       │
│  └──────────────┘  └──────────────┘  └──────────────┘       │
│                                                              │
│  ┌──────────────┐  ┌──────────────────────────────────┐     │
│  │ sql_app.db   │  │ training_data/                   │     │
│  │ (SQLite)     │  │  - master_outputs/*.jsonl        │     │
│  │              │  │  - checkpoints/apprentice_*/     │     │
│  └──────────────┘  └──────────────────────────────────┘     │
└──────────────────────────────────────────────────────────────┘

Detailed walkthrough of each system

ModelRouter (`null_ai/model_router.py`)

Role

The brain of NullAI. It manages every inference request.

Key methods in detail

`__init__()`

def __init__(self, config_manager):
    self.config_manager = config_manager
    self.master_model = None        # master model
    self.apprentice_model = None    # apprentice model
    self.dendritic_memory = None    # .iath spatial memory

    # Load the .iath file
    self._load_dendritic_memory()

Important: the .iath file is loaded automatically at initialization, so startup may take longer.

`async def infer()` - RAG-integrated inference

async def infer(self, prompt, domain_id, model_config, save_to_memory=False):
    # Step 1: check whether the DB has relevant knowledge
    has_knowledge = self._check_db_knowledge(domain_id, prompt)

    if has_knowledge:
        # Step 2a: RAG inference
        knowledge = self._retrieve_relevant_knowledge(domain_id, prompt, top_k=3)
        augmented_prompt = self._build_rag_prompt(prompt, knowledge)
        response = await self._perform_llm_inference(model_config, augmented_prompt)
    else:
        # Step 2b: plain inference
        response = await self._perform_llm_inference(model_config, prompt)

        # Step 3: save the answer if it is high quality
        if save_to_memory and response["confidence"] >= 0.7:
            await self._save_inference_to_db(domain_id, prompt, response)

    # Step 4: if this is the master's output, save it as training data
    is_master = (self.master_model and
                 model_config.model_id == self.master_model.model_id)
    if is_master and response["confidence"] >= 0.8:
        await self._save_master_output_as_training_data(
            prompt, response["response"], domain_id, response["confidence"]
        )

    return response

Data flow diagram:

prompt → check DB → found?
                     ├─ YES → retrieve knowledge
                     │         ↓
                     │      augment prompt
                     │         ↓
                     │      LLM inference → response
                     │                        ↓
                     │                    is master? → save as training data
                     │
                     └─ NO → LLM inference → response
                                              ↓
                                          confidence >= 0.7? → save to DB

`_retrieve_relevant_knowledge()` - hybrid search

def _retrieve_relevant_knowledge(self, domain_id, prompt, top_k=3):
    if not self.dendritic_memory:
        return []

    # Run the hybrid search
    results = self.dendritic_memory.hybrid_search(
        query_text=prompt,
        query_coords=None,  # coordinates will be estimated in the future;
                            # until then the spatial term is zero and ranking is text-only
        top_k=top_k,
        text_weight=0.4,    # weight of the text match
        spatial_weight=0.6  # weight of the spatial distance
    )

    # Convert to the Knowledge Tile format
    formatted_knowledge = []
    for tile in results:
        formatted_knowledge.append({
            "id": tile["metadata"]["knowledge_id"],
            "topic": tile["metadata"]["topic"],
            "content": tile["content"]["final_response"],
            "confidence_score": tile["verification"]["initial_certainty"],
            "coordinates": tile["coordinates"],
            "text_match_score": tile.get("text_match_score", 0),
            "spatial_distance": tile.get("spatial_distance", None)
        })

    return formatted_knowledge

`_save_master_output_as_training_data()` - saving training data

import json
from datetime import datetime, timezone

async def _save_master_output_as_training_data(
    self, prompt, response, domain_id, confidence
):
    # Build the record in Alpaca format
    training_example = {
        "instruction": f"You are an expert in {domain_id}. Provide accurate information based on verified knowledge.",
        "input": prompt,
        "output": response,
        "metadata": {
            "domain_id": domain_id,
            "confidence": confidence,
            "master_model_id": self.master_model.model_id,
            # Timezone-aware UTC timestamp (datetime.utcnow() is deprecated since Python 3.12)
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "source": "master_output"
        }
    }

    # Append one line to the JSONL file
    output_file = f"training_data/master_outputs/master_outputs_{domain_id}.jsonl"
    with open(output_file, 'a', encoding='utf-8') as f:
        f.write(json.dumps(training_example, ensure_ascii=False) + '\n')

Why JSONL (newline-delimited JSON)?

Streaming-friendly (records can be read one line at a time)
A corrupted file has limited impact (damage is confined to individual lines)
Compatible with HuggingFace datasets
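The first two properties can be seen in a short reader. This is a sketch under assumptions: `iter_jsonl` is a hypothetical helper, and the in-memory sample stands in for a `master_outputs` file whose second record was cut off mid-write.

```python
import io
import json


def iter_jsonl(lines):
    """Yield parsed records one at a time, skipping blank or corrupted lines."""
    for line_number, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue
        try:
            yield json.loads(line)
        except json.JSONDecodeError:
            # Damage stays confined to this line; the rest of the file is usable.
            print(f"skipping corrupted line {line_number}")


sample = io.StringIO(
    '{"instruction": "a", "input": "q1", "output": "r1"}\n'
    '{"instruction": "a", "input": "q2", "outp\n'  # simulated truncated write
    '{"instruction": "a", "input": "q3", "output": "r3"}\n'
)
records = list(iter_jsonl(sample))
print([record["input"] for record in records])
```

Because the reader is a generator, the same function works on an open file handle without loading the whole file into memory.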

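Engine management methods

import copy
from datetime import datetime, timezone

def promote_apprentice(self, apprentice_model_id):
    """Promote the apprentice to master."""
    if self.apprentice_model is None:
        raise ValueError("No apprentice model to promote")

    # Retire the current master
    old_master = self.master_model

    # Promote the apprentice to master
    self.master_model = self.apprentice_model

    # Clear the apprentice slot
    self.apprentice_model = None

    # Persist the configuration
    self.config_manager.save_active_engines(
        self.master_model.model_id, None
    )

def swap_engines(self):
    """Swap the master and the apprentice."""
    # Without this guard, the master would become None and the save below would fail
    if self.apprentice_model is None:
        raise ValueError("No apprentice model to swap with")

    self.master_model, self.apprentice_model = (
        self.apprentice_model, self.master_model
    )

    self.config_manager.save_active_engines(
        self.master_model.model_id,
        self.apprentice_model.model_id if self.apprentice_model else None
    )

def create_new_apprentice(self, base_model_id):
    """Create a new, empty apprentice."""
    # Derive a new ID from the base model and the current UTC time
    timestamp = datetime.now(timezone.utc).strftime("%Y%m%d%H%M%S")
    new_apprentice_id = f"{base_model_id}_apprentice_{timestamp}"

    # Copy the base config so the original entry keeps its own model_id
    base_config = self.config_manager.get_model_config(base_model_id)
    self.apprentice_model = copy.copy(base_config)
    self.apprentice_model.model_id = new_apprentice_id

    return new_apprentice_id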

DendriticMemorySpace (`null_ai/iath_memory.py`)

Role

Reads .iath files and provides 6D spatial search.

Class structure

import json
from pathlib import Path

import numpy as np
import zstandard

class IathDecoder:
    """
    Low-level decoder for .iath files.
    Fully compatible with dendritic-memory-editor.
    """
    def __init__(self, iath_file_path):
        self.file_path = Path(iath_file_path)
        self.header = None
        self.index = {}  # parsed JSON index, e.g. {"tiles": [...]}
        self._load_header_and_index()

    def _load_header_and_index(self):
        """Read the header and the index."""
        with open(self.file_path, 'rb') as f:
            # Header (64 bytes)
            header_bytes = f.read(64)
            self.header = self._parse_header(header_bytes)

            # Index (JSON)
            index_size = self.header["index_size"]
            index_bytes = f.read(index_size)
            self.index = json.loads(index_bytes.decode('utf-8'))

    def get_tile_by_id(self, knowledge_id):
        """Fetch a tile by ID."""
        # Look up the tile's offset in the index
        tile_info = next(
            (t for t in self.index["tiles"] if t["id"] == knowledge_id),
            None
        )
        if not tile_info:
            return None

        # Seek to the tile's position in the file
        # (this relies on each index entry also recording the compressed size)
        with open(self.file_path, 'rb') as f:
            f.seek(tile_info["offset"])
            compressed_data = f.read(tile_info["size"])

            # Decompress with zstd
            decompressed = zstandard.decompress(compressed_data)
            tile_data = json.loads(decompressed.decode('utf-8'))

            return tile_data


class DendriticMemorySpace:
    """
    6D spatial memory system.
    High-level API.
    """
    def __init__(self, iath_file_path):
        self.decoder = IathDecoder(iath_file_path)
        self.all_tiles = []
        self.coordinates_matrix = None  # NumPy matrix
        self._load_all_tiles()

    def _load_all_tiles(self):
        """Load every tile into memory."""
        self.all_tiles = self.decoder.get_all_tiles()

        # Build the coordinate matrix (for fast search)
        coords_list = [tile["coordinates"] for tile in self.all_tiles]
        self.coordinates_matrix = np.array(coords_list)  # Shape: (N, 6)

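Search algorithm details

Coordinate search (6D Euclidean distance)

def search_by_coordinates(self, query_coords, top_k=5):
    """
    Nearest-neighbor search in 6D space.

    Formula: distance = sqrt(sum((q_i - t_i)^2))
    where:
        q_i = i-th element of the query coordinates
        t_i = i-th element of the tile coordinates
        i = 0..5 (6 dimensions)
    """
    query_vector = np.array(query_coords)  # Shape: (6,)

    # Compute the distance to every tile at once (NumPy vectorization)
    # Broadcasting: (N, 6) - (6,) → (N, 6)
    distances = np.linalg.norm(
        self.coordinates_matrix - query_vector,
        axis=1  # one distance per row (tile)
    )  # Shape: (N,)

    # Sort by distance and keep the closest top_k
    sorted_indices = np.argsort(distances)[:top_k]

    # Build the results
    results = []
    for idx in sorted_indices:
        tile = self.all_tiles[idx].copy()
        tile["spatial_distance"] = float(distances[idx])
        results.append(tile)

    return results

Complexity: O(N) distance computations, proportional to the number of tiles N (linear scan); the full argsort adds O(N log N)

Optimization options (not implemented):

KD-Tree: roughly O(log N) per query in low dimensions; the benefit shrinks as dimensionality grows
Ball-Tree: remains relatively effective in higher dimensions
Approximate nearest-neighbor search (Annoy, HNSW): very fast, at the cost of some accuracy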
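Independent of the index structures listed above, a smaller step is to avoid sorting all N distances when only `top_k` results are needed. The sketch below is not one of the listed options and is not the project's code: it uses the standard library's `heapq.nsmallest`, and the tile IDs and the third coordinate are illustrative. With NumPy, `np.argpartition` plays a similar role.

```python
import heapq
import math


def top_k_nearest(query, tiles, k=5):
    """Return the k (distance, tile_id) pairs closest to `query`."""
    scored = ((math.dist(query, coords), tile_id) for tile_id, coords in tiles.items())
    # nsmallest keeps at most k candidates in a heap: O(N log k) instead of O(N log N).
    return heapq.nsmallest(k, scored)


tiles = {
    "tile_001": [0.2, 0.8, 0.3, 0.9, 0.7, 0.8],
    "tile_002": [0.3, 0.8, 0.4, 0.85, 0.65, 0.75],
    "tile_003": [0.9, 0.1, 0.9, 0.2, 0.1, 0.1],  # hypothetical distant tile
}
nearest = top_k_nearest([0.2, 0.8, 0.3, 0.9, 0.7, 0.8], tiles, k=2)
print([tile_id for _, tile_id in nearest])
```

This still scans every tile, so it only trims the sorting cost; a tree or approximate index is what removes the linear scan itself.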

Hybrid search (text + coordinates)

def hybrid_search(
    self,
    query_text,
    query_coords=None,
    top_k=5,
    text_weight=0.4,
    spatial_weight=0.6
):
    """
    Combined scoring from text match and spatial distance.
    """
    # Step 1: compute the text-match scores
    text_scores = []
    for tile in self.all_tiles:
        score = self._calculate_text_match(query_text, tile)
        text_scores.append(score)
    text_scores = np.array(text_scores)  # Shape: (N,)

    # Step 2: compute the spatial scores
    # ("is not None" also works when query_coords is a NumPy array)
    if query_coords is not None:
        spatial_distances = np.linalg.norm(
            self.coordinates_matrix - np.array(query_coords),
            axis=1
        )
        # Convert distances to 0-1 scores by linear inversion: 1 - d / max_d
        max_dist = spatial_distances.max()
        if max_dist > 0:
            spatial_scores = 1.0 - (spatial_distances / max_dist)
        else:
            # Every tile sits exactly on the query point
            spatial_scores = np.ones(len(self.all_tiles))
    else:
        spatial_scores = np.zeros(len(self.all_tiles))

    # Step 3: compute the combined scores
    combined_scores = (
        text_weight * text_scores +
        spatial_weight * spatial_scores
    )

    # Step 4: sort by descending score
    sorted_indices = np.argsort(combined_scores)[::-1][:top_k]

    # Build the results
    results = []
    for idx in sorted_indices:
        tile = self.all_tiles[idx].copy()
        tile["text_match_score"] = float(text_scores[idx])
        tile["spatial_score"] = float(spatial_scores[idx])
        tile["combined_score"] = float(combined_scores[idx])
        if query_coords is not None:
            tile["spatial_distance"] = float(spatial_distances[idx])
        results.append(tile)

    return results

def _calculate_text_match(self, query, tile):
    """
    Text-match score (simplified version).

    BM25 or TF-IDF is planned for the future.
    Note: str.split() tokenizes on whitespace, so unsegmented
    Japanese text is treated as a single token.
    """
    query_lower = query.lower()
    content = tile["content"]["final_response"].lower()
    topic = tile["metadata"]["topic"].lower()

    # Keyword matching
    query_words = set(query_lower.split())
    content_words = set(content.split())
    topic_words = set(topic.split())

    # Jaccard similarity (guarded so two empty sets do not divide by zero)
    content_union = query_words | content_words
    topic_union = query_words | topic_words
    content_jaccard = len(query_words & content_words) / len(content_union) if content_union else 0.0
    topic_jaccard = len(query_words & topic_words) / len(topic_union) if topic_union else 0.0

    # Combined score (the topic is weighted more heavily)
    score = 0.3 * content_jaccard + 0.7 * topic_jaccard

    return score
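A small worked example makes the text score easier to check by hand. The strings below are invented for illustration; the 0.3 / 0.7 weights are the ones used in the method above, and `jaccard` is a hypothetical standalone helper.

```python
def jaccard(a: set, b: set) -> float:
    """Jaccard similarity |A ∩ B| / |A ∪ B|, defined as 0.0 for two empty sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


query_words = set("heart function".split())
topic_words = set("heart anatomy".split())
content_words = set("the heart pumps blood".split())

topic_score = jaccard(query_words, topic_words)      # 1 shared word / 3 distinct words
content_score = jaccard(query_words, content_words)  # 1 shared word / 5 distinct words
score = 0.3 * content_score + 0.7 * topic_score
print(round(score, 3))

# Whitespace tokenization yields a single token for unsegmented Japanese text,
# so a Japanese query only matches a tile whose whole text is identical.
print(len("心臓の働きについて教えて".split()))
```

The second print shows why the simplified scorer needs a proper tokenizer (or the planned BM25 / TF-IDF step) before it is useful on Japanese queries.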

FineTuningManager (`null_ai/fine_tuning.py`)

Role

Runs fine-tuning of the apprentice model.

The PEFT (QLoRA) method in detail

import torch
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

async def fine_tune_with_huggingface_peft(
    self,
    model_name,
    training_examples,
    output_dir,
    epochs=3,
    learning_rate=2e-4,
    batch_size=4,
    lora_r=8,
    lora_alpha=16
):
    """
    Parameter-Efficient Fine-Tuning with QLoRA

    QLoRA = Quantized LoRA
    - 4-bit quantization reduces memory use
    - LoRA reduces the number of trainable parameters
    → a 7B model can be trained even on a 12GB GPU
    """

    # Step 1: load the model with 4-bit quantization
    bnb_config = BitsAndBytesConfig(
        load_in_4bit=True,               # 4-bit quantization
        bnb_4bit_quant_type="nf4",       # NormalFloat4 (designed for normally distributed weights)
        bnb_4bit_compute_dtype=torch.float16,  # compute in fp16
        bnb_4bit_use_double_quant=True   # double quantization (saves more memory)
    )

    model = AutoModelForCausalLM.from_pretrained(
        model_name,
        quantization_config=bnb_config,
        device_map="auto"  # place layers on GPU/CPU automatically
    )

    # Step 2: LoRA configuration
    lora_config = LoraConfig(
        r=lora_r,                        # LoRA rank (lower = lighter)
        lora_alpha=lora_alpha,           # scaling factor
        target_modules=[                 # layers that receive LoRA adapters
            # These names follow LLaMA-style models; module names vary by
            # architecture, so check model.named_modules() for the model in use
            "q_proj", "k_proj", "v_proj", "o_proj",  # Attention
            "gate_proj", "up_proj", "down_proj"      # MLP
        ],
        lora_dropout=0.05,               # dropout rate
        bias="none",                     # biases are not trained
        task_type="CAUSAL_LM"            # task type
    )

    model = get_peft_model(model, lora_config)

    # Print the number of trainable parameters
    model.print_trainable_parameters()
    # e.g.: trainable params: 4.2M || all params: 2.7B || trainable%: 0.16%
    #     → only 0.16% of all parameters are trained

    # Steps 3-9: data preparation, training, saving (omitted)
    ...

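How QLoRA works:

Standard (full) fine-tuning:
┌─────────────────────────┐
│ Full model (2.7B params)│ ← everything is trained
│ Memory: ~40GB           │
└─────────────────────────┘

QLoRA:
┌─────────────────────────┐
│ Base model (2.7B params)│ ← 4-bit quantized, frozen (not trained)
│ Memory: ~7GB            │
└─────────────────────────┘
          +
┌─────────────────────────┐
│ LoRA adapter (4.2M)     │ ← only this is trained
│ Memory: ~0.5GB          │
└─────────────────────────┘
         =
   Total memory: ~12GB (rough figures; the gap above the ~7.5GB shown is presumably activations, optimizer state, and other overhead)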
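The orders of magnitude can be sanity-checked with back-of-the-envelope arithmetic. The sketch below covers weights only, so it will not reproduce the totals in the diagram: activations, gradients, optimizer state, and quantization constants all add to it. The helper names and the hidden size of 2560 are assumptions for illustration.

```python
def weight_memory_gb(num_params: float, bits_per_param: int) -> float:
    """Memory for the weights alone, in GB (10^9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


def lora_params(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters LoRA adds to one d_in x d_out linear layer."""
    # LoRA learns A (rank x d_in) and B (d_out x rank) instead of the full matrix.
    return rank * (d_in + d_out)


print(weight_memory_gb(2.7e9, 16))      # fp16 weights of a 2.7B-parameter model
print(weight_memory_gb(2.7e9, 4))       # the same weights at 4 bits
print(lora_params(2560, 2560, rank=8))  # one square projection; 2560 is an assumed hidden size
```

Going from 16 to 4 bits cuts weight storage by a factor of four, and a rank-8 adapter on a 2560 x 2560 layer trains about 41 thousand parameters instead of roughly 6.6 million.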

Formatting Alpaca-style data

def format_training_examples_for_model(
    self,
    training_examples,
    template="alpaca"
):
    """
    Convert Alpaca-format records into prompts for the model.
    """
    formatted_prompts = []

    for example in training_examples:
        instruction = example["instruction"]
        input_text = example["input"]
        output_text = example["output"]

        if template == "alpaca":
            if input_text:
                prompt = f"""Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.

### Instruction:
{instruction}

### Input:
{input_text}

### Response:
{output_text}"""
            else:
                prompt = f"""Below is an instruction that describes a task. Write a response that appropriately completes the request.

### Instruction:
{instruction}

### Response:
{output_text}"""

        formatted_prompts.append(prompt)

    return formatted_prompts

なぜこの形式？

明確な区切り（###）
instruction-following能力の向上
オープンソースコミュニティの標準

Complete data-flow walkthrough

Flow 1: Normal inference (with RAG)

┌──────────────────────────────────────────────────────────────┐
│ User: "Tell me how the heart works"                          │
└────────────────────┬─────────────────────────────────────────┘
                     ↓
┌────────────────────────────────────────────────────────────┐
│ Frontend: InferencePanel.tsx                               │
│  - Sends the question to the backend                       │
└────────────────────┬───────────────────────────────────────┘
                     ↓ HTTP POST /api/questions
┌────────────────────────────────────────────────────────────┐
│ Backend: questions.py                                      │
│  - InferenceService.ask_question()                         │
└────────────────────┬───────────────────────────────────────┘
                     ↓
┌────────────────────────────────────────────────────────────┐
│ NullAI Core: model_router.py                               │
│  ┌─────────────────────────────────────────────────────┐   │
│  │ Step 1: _check_db_knowledge("medical",              │   │
│  │             "how the heart works")                  │   │
│  │  → DendriticMemorySpace.search_by_text()           │   │
│  │  → Result: 3 matches found                          │   │
│  └─────────────────────────────────────────────────────┘   │
│                     ↓                                      │
│  ┌─────────────────────────────────────────────────────┐   │
│  │ Step 2: _retrieve_relevant_knowledge()             │   │
│  │  → hybrid_search("how the heart works", top_k=3)    │   │
│  │  → Retrieved:                                       │   │
│  │    [1] Anatomy of the heart (score: 0.92)           │   │
│  │    [2] Circulatory system function (score: 0.85)    │   │
│  │    [3] Heart disease classification (score: 0.73)   │   │
│  └─────────────────────────────────────────────────────┘   │
│                     ↓                                      │
│  ┌─────────────────────────────────────────────────────┐   │
│  │ Step 3: Prompt augmentation                         │   │
│  │  augmented_prompt = """                            │   │
│  │  Based on the following verified knowledge:        │   │
│  │                                                     │   │
│  │  [Knowledge 1 - expert verification, conf: 0.9]    │   │
│  │  Topic: Anatomy of the heart                        │   │
│  │  Content: The heart consists of four chambers...    │   │
│  │                                                     │   │
│  │  [Knowledge 2 - ...]                               │   │
│  │                                                     │   │
│  │  Now, please answer:                               │   │
│  │  Tell me how the heart works                        │   │
│  │  """                                               │   │
│  └─────────────────────────────────────────────────────┘   │
│                     ↓                                      │
│  ┌─────────────────────────────────────────────────────┐   │
│  │ Step 4: LLM inference                               │   │
│  │  → llm_providers.py                                │   │
│  │  → OllamaProvider.infer()                          │   │
│  │  → model: deepseek-r1:1.5b                         │   │
│  └─────────────────────────────────────────────────────┘   │
│                     ↓                                      │
│  ┌─────────────────────────────────────────────────────┐   │
│  │ Step 5: Response generation                         │   │
│  │  response = {                                      │   │
│  │    "response": "The heart is the central...",       │   │
│  │    "confidence": 0.88,                             │   │
│  │    "thinking": "Answer from verified knowledge",    │   │
│  │    "retrieved_knowledge": [...]                    │   │
│  │  }                                                 │   │
│  └─────────────────────────────────────────────────────┘   │
│                     ↓                                      │
│  ┌─────────────────────────────────────────────────────┐   │
│  │ Step 6: Master output?                              │   │
│  │  is_master = True                                  │   │
│  │  confidence = 0.88 >= 0.8 ✓                        │   │
│  │  → _save_master_output_as_training_data()          │   │
│  │  → training_data/master_outputs/medical.jsonl      │   │
│  └─────────────────────────────────────────────────────┘   │
└────────────────────┬───────────────────────────────────────┘
                     ↓
┌────────────────────────────────────────────────────────────┐
│ Frontend: displays the response                            │
│  - ResponseDisplay.tsx                                     │
│  - "The heart is the central..."                           │
│  - Shows the Retrieved Knowledge badge                     │
└────────────────────────────────────────────────────────────┘

Flow 2: Normal inference (no RAG, self-expansion)

┌──────────────────────────────────────────────────────────────┐
│ User: "How do quantum computers work?"                       │
└────────────────────┬─────────────────────────────────────────┘
                     ↓
                  (same as above)
                     ↓
┌────────────────────────────────────────────────────────────┐
│ NullAI Core: model_router.py                               │
│  ┌─────────────────────────────────────────────────────┐   │
│  │ Step 1: _check_db_knowledge("general", "quantum...")│   │
│  │  → DendriticMemorySpace.search_by_text()           │   │
│  │  → Result: nothing found ❌                         │   │
│  └─────────────────────────────────────────────────────┘   │
│                     ↓                                      │
│  ┌─────────────────────────────────────────────────────┐   │
│  │ Step 2: Infer from the model's internal knowledge   │   │
│  │  → LLM.generate("How do quantum computers work?")   │   │
│  │  → model: deepseek-r1:1.5b                         │   │
│  └─────────────────────────────────────────────────────┘   │
│                     ↓                                      │
│  ┌─────────────────────────────────────────────────────┐   │
│  │ Step 3: Response generation                         │   │
│  │  response = {                                      │   │
│  │    "response": "Quantum computers...",              │   │
│  │    "confidence": 0.75                              │   │
│  │  }                                                 │   │
│  └─────────────────────────────────────────────────────┘   │
│                     ↓                                      │
│  ┌─────────────────────────────────────────────────────┐   │
│  │ Step 4: Self-expansion (save to DB)                 │   │
│  │  confidence = 0.75 >= 0.7 ✓                        │   │
│  │  save_to_memory = True                             │   │
│  │  → _save_inference_to_db()                         │   │
│  │  → SQLite: knowledge_tiles table                    │   │
│  │     (to be saved to .iath as well in future)        │   │
│  └─────────────────────────────────────────────────────┘   │
│                     ↓                                      │
│  ┌─────────────────────────────────────────────────────┐   │
│  │ Step 5: Save as master output?                      │   │
│  │  is_master = True                                  │   │
│  │  confidence = 0.75 < 0.8 ❌                         │   │
│  │  → Not saved as training data                       │   │
│  └─────────────────────────────────────────────────────┘   │
└────────────────────┬───────────────────────────────────────┘
                     ↓
              (レスポンス表示)

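Key difference:

With RAG: the output is saved as training data when confidence >= 0.8
Without RAG: the output is saved to the DB when confidence >= 0.7, and additionally as training data when confidence >= 0.8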
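The two thresholds can be summarized in a small decision helper. This is only a sketch of the rule described in the flows: the names `SaveDecision` and `decide_saves` are hypothetical and do not come from `model_router.py`, and the assumption that training data is saved only for master engines is inferred from the `is_master = True` check in Step 5.

```python
from dataclasses import dataclass

DB_SAVE_THRESHOLD = 0.7        # self-expansion threshold from the flow
TRAINING_SAVE_THRESHOLD = 0.8  # training-data threshold from the flow


@dataclass
class SaveDecision:
    save_to_db: bool
    save_as_training_data: bool


def decide_saves(confidence: float, used_rag: bool, is_master: bool) -> SaveDecision:
    """Decide where an inference result should be stored (hypothetical helper)."""
    # Self-expansion applies only when the answer did NOT come from the DB.
    save_to_db = (not used_rag) and confidence >= DB_SAVE_THRESHOLD
    # Training data is collected only from sufficiently confident master outputs.
    save_as_training_data = is_master and confidence >= TRAINING_SAVE_THRESHOLD
    return SaveDecision(save_to_db, save_as_training_data)


# The example from Flow 2: confidence 0.75, no RAG, master engine.
print(decide_saves(0.75, used_rag=False, is_master=True))
```

With the Flow 2 values the result is saved to the DB but not as training data, matching Steps 4 and 5.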

Flow 3: Running fine-tuning

┌──────────────────────────────────────────────────────────────┐
│ User: clicks "Start Fine-tuning" in Training Dashboard       │
│  - Apprentice Model: microsoft/phi-2                         │
│  - Domain: medical                                           │
│  - Method: peft                                              │
│  - Epochs: 3                                                 │
└────────────────────┬─────────────────────────────────────────┘
                     ↓ HTTP POST /api/training/start
┌────────────────────────────────────────────────────────────┐
│ Backend: training.py                                       │
│  ┌─────────────────────────────────────────────────────┐   │
│  │ Step 1: Check that training data exists             │   │
│  │  → FineTuningManager.load_training_data("medical")  │   │
│  │  → training_data/master_outputs/medical.jsonl       │   │
│  │  → Result: 150 samples                              │   │
│  └─────────────────────────────────────────────────────┘   │
│                     ↓                                      │
│  ┌─────────────────────────────────────────────────────┐   │
│  │ Step 2: Start the background task                   │   │
│  │  background_tasks.add_task(run_training)            │   │
│  │  → Response returned immediately (async)            │   │
│  └─────────────────────────────────────────────────────┘   │
└────────────────────┬───────────────────────────────────────┘
                     ↓
┌────────────────────────────────────────────────────────────┐
│ NullAI Core: fine_tuning.py (background)                   │
│  ┌─────────────────────────────────────────────────────┐   │
│  │ Step 1: Load the model (4-bit quantization)         │   │
│  │  → AutoModelForCausalLM.from_pretrained(            │   │
│  │       "microsoft/phi-2",                            │   │
│  │       quantization_config=bnb_config                │   │
│  │     )                                               │   │
│  │  → Memory usage: ~7GB                               │   │
│  └─────────────────────────────────────────────────────┘   │
│                     ↓                                      │
│  ┌─────────────────────────────────────────────────────┐   │
│  │ Step 2: LoRA configuration                          │   │
│  │  → get_peft_model(model, lora_config)               │   │
│  │  → Trainable parameters: 4.2M / 2.7B (0.16%)        │   │
│  └─────────────────────────────────────────────────────┘   │
│                     ↓                                      │
│  ┌─────────────────────────────────────────────────────┐   │
│  │ Step 3: Prepare the data                            │   │
│  │  → format_training_examples_for_model()             │   │
│  │  → Alpaca format → model-specific prompts           │   │
│  │  → Dataset.from_dict({"text": prompts})             │   │
│  └─────────────────────────────────────────────────────┘   │
│                     ↓                                      │
│  ┌─────────────────────────────────────────────────────┐   │
│  │ Step 4: Start training                              │   │
│  │  Epoch 1/3:                                         │   │
│  │    [===>    ] 35% loss: 1.245                       │   │
│  │    → current_training_state.update({                │   │
│  │         "progress": 35,                             │   │
│  │         "current_epoch": 1,                         │   │
│  │         "loss": 1.245                               │   │
│  │       })                                            │   │
│  └─────────────────────────────────────────────────────┘   │
│                     ↓                                      │
│  ┌─────────────────────────────────────────────────────┐   │
│  │ Step 5: Done                                        │   │
│  │  → trainer.save_model(output_dir)                   │   │
│  │  → training_data/checkpoints/apprentice_medical_*/  │   │
│  │  → current_training_state["is_training"] = False    │   │
│  └─────────────────────────────────────────────────────┘   │
└────────────────────┬───────────────────────────────────────┘
                     ↓
┌────────────────────────────────────────────────────────────┐
│ Frontend: TrainingDashboard.tsx                            │
│  - Polls every 2 seconds: GET /api/training/status         │
│  - Updates the progress bar: 35% → 67% → 100%              │
│  - On completion: reloads the checkpoint list              │
└────────────────────────────────────────────────────────────┘
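The data-preparation step (Alpaca records turned into prompt strings, then wrapped as `{"text": prompts}`) can be sketched as follows. The prompt template and the function names `format_alpaca_record` and `load_prompts` are assumptions made for illustration; the actual template used by `format_training_examples_for_model()` may differ.

```python
import json

# Hypothetical prompt templates; the real ones in fine_tuning.py may differ.
PROMPT_WITH_INPUT = (
    "### Instruction:\n{instruction}\n\n"
    "### Input:\n{input}\n\n"
    "### Response:\n{output}"
)
PROMPT_NO_INPUT = (
    "### Instruction:\n{instruction}\n\n"
    "### Response:\n{output}"
)


def format_alpaca_record(record: dict) -> str:
    """Turn one Alpaca-style record into a single training prompt string."""
    if (record.get("input") or "").strip():
        return PROMPT_WITH_INPUT.format(
            instruction=record["instruction"],
            input=record["input"],
            output=record["output"],
        )
    return PROMPT_NO_INPUT.format(
        instruction=record["instruction"], output=record["output"]
    )


def load_prompts(jsonl_lines) -> dict:
    """Parse JSONL lines into the {"text": [...]} mapping passed to Dataset.from_dict."""
    prompts = [
        format_alpaca_record(json.loads(line))
        for line in jsonl_lines
        if line.strip()  # skip blank lines
    ]
    return {"text": prompts}
```

The returned mapping has the same shape as the argument shown in Step 3, so it could be handed to `Dataset.from_dict` directly.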

Technology Stack Details

Frontend

React 18.2 + TypeScript 5.0
├─ Vite 4.4 (build tool)
├─ TailwindCSS 3.3 (styling)
└─ axios (HTTP client)

Main components:
- EngineManager.tsx      (417 lines) - engine management UI
- InferencePanel.tsx     (321 lines) - inference panel
- TrainingDashboard.tsx  (400 lines) - training dashboard
- KnowledgePanel.tsx     (185 lines) - knowledge browser

Backend

FastAPI 0.115.6 + Python 3.13+
├─ Uvicorn (ASGI server)
├─ SQLAlchemy 2.0 (ORM)
├─ Pydantic 2.10 (validation)
└─ Alembic (migrations)

Main APIs:
- /api/config/*     - engine management
- /api/questions    - run inference
- /api/training/*   - fine-tuning
- /api/knowledge/*  - knowledge tile management

NullAI Core

Python 3.13+
├─ transformers 4.36+ (HuggingFace)
├─ torch 2.0+ (PyTorch)
├─ peft 0.7+ (LoRA/QLoRA)
├─ trl 0.7+ (Transformer Reinforcement Learning; fine-tuning trainers)
├─ datasets 2.15+ (dataset handling)
├─ bitsandbytes 0.41+ (quantization)
├─ accelerate 0.25+ (distributed training)
├─ zstandard 0.22+ (.iath compression)
└─ numpy 1.24+ (numerical computation)

Main modules:
- model_router.py     (800 lines) - RAG integration, engine management
- iath_memory.py      (362 lines) - 6-dimensional spatial memory
- fine_tuning.py      (640 lines) - fine-tuning
- llm_providers.py    (390 lines) - LLM provider integration

LLM Providers

1. Ollama
   - Local model management
   - ollama pull deepseek-r1:1.5b
   - API: http://localhost:11434

2. HuggingFace Transformers
   - Loads models directly
   - AutoModelForCausalLM.from_pretrained()
   - Automatic GPU/CPU placement

3. MLX (Apple Silicon)
   - Apple Silicon Macs only (M1/M2/M3 and later)
   - Takes advantage of unified memory
   - mlx-lm library

4. GGUF (llama-cpp-python)
   - Quantized models (.gguf)
   - Well suited to CPU inference
   - Supports GPU acceleration

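Common Misconceptions and Caveats

Misconception 1: "RAG is always used"

❌ Misconception: every inference uses RAG
✅ Reality: RAG is triggered only when the DB contains relevant knowledge

# Actual behavior (simplified)
if has_knowledge:
    ...  # RAG inference: answer grounded in the retrieved knowledge
else:
    ...  # Plain inference (no RAG): answer from the model's internal knowledge

How to tell the two apart:

With RAG: the response contains a populated retrieved_knowledge field
Without RAG: retrieved_knowledge is empty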

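Misconception 2: "The apprentice becomes the master automatically"

❌ Misconception: once fine-tuning finishes, the apprentice is promoted to master automatically
✅ Reality: promotion is a manual operation

Fine-tuning completes
  ↓
Checkpoint is saved
  ↓
[Manual step] Click Promote in Engine Manager
  ↓
Apprentice is promoted to master

Reason: a human should check quality before promotion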

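Misconception 3: "The .iath file is updated automatically"

❌ Misconception: AI-generated knowledge is saved to .iath automatically
✅ Reality: it is currently saved only to SQLite and JSONL; saving to .iath is not yet implemented (Priority 2)

Current state:
AI-generated knowledge → SQLite + JSONL ✅
                       → .iath ❌ (not implemented)

After Priority 2 is implemented:
AI-generated knowledge → SQLite + JSONL + .iath ✅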

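Misconception 4: "Fine-tuning trains every parameter"

❌ Misconception: the whole model (2.7B parameters) is trained
✅ Reality: only the LoRA adapter (4.2M parameters) is trained

Parameters involved:
- Base model: 2,700,000,000 → frozen (not trained)
- LoRA:           4,200,000 → trained ✅

Trainable parameter ratio: 0.16%

Benefits:

Lower memory use (40GB → 12GB)
Shorter training time (10 hours → 2 hours)
The base model is left unchanged (safe)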
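The 0.16% figure can be checked with a few lines of arithmetic. The sketch below also shows where LoRA's small parameter count comes from: each adapted weight matrix gains two low-rank factors. The rank of 8 and the 2560×2560 layer shape are illustrative assumptions, not values taken from the project's `lora_config`.

```python
def lora_params_per_layer(d_in: int, d_out: int, rank: int) -> int:
    """LoRA adds two low-rank matrices per adapted layer: A (rank x d_in) and B (d_out x rank)."""
    return rank * (d_in + d_out)


def trainable_ratio_percent(trainable: int, total: int) -> float:
    """Share of trainable parameters, as a percentage."""
    return 100.0 * trainable / total


# Figures quoted in this guide: 4.2M LoRA parameters on a 2.7B-parameter base model.
ratio = trainable_ratio_percent(4_200_000, 2_700_000_000)
print(f"{ratio:.2f}%")  # 0.16%

# Illustrative only: a rank-8 adapter on a hypothetical 2560 x 2560 projection.
print(lora_params_per_layer(2560, 2560, rank=8))  # 40960
```

A full 2560×2560 matrix has about 6.55M weights, so the rank-8 adapter in this example trains well under 1% of that layer.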

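Misconception 5: "SQLite and .iath store the same data"

❌ Misconception: SQLite and .iath duplicate each other
✅ Reality: they play different roles

Store	Contents	Purpose
SQLite	Users, workspaces, inference history, metadata	Application management
.iath	Knowledge Tiles (6-dimensional coordinates + content)	Knowledge search and RAG inference

Note: until .iath saving is implemented (see Misconception 3), AI-generated knowledge is held in the SQLite knowledge_tiles table.

Example:

SQLite:
- users table: nullai_default_user
- workspaces table: default_workspace
- inference_history: past questions and answers

.iath:
- Tile 1: [0.2, 0.8, 0.3, 0.9, 0.7, 0.8] "How the heart works..."
- Tile 2: [0.3, 0.8, 0.4, 0.85, 0.65, 0.75] "The circulatory system..."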

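Misconception 6: "The confidence value is computed by the AI"

❌ Misconception: the AI evaluates itself and returns a confidence
✅ Reality: it is currently a fixed value per provider

# llm_providers.py
class OllamaProvider:
    async def infer(self, model_config, prompt, temperature):
        response_text = ...  # text returned by the Ollama API (call omitted)
        return {
            "response": response_text,
            "confidence": 0.85  # ← hard-coded!
        }

Planned improvements:

Cross-check answers across multiple models
Compute the uncertainty of the response (entropy)
Learn from human feedback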
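As one possible shape for the entropy-based improvement, the sketch below turns per-token probability distributions into a confidence score. It is not part of `llm_providers.py`: the function names are hypothetical, and it assumes the provider can expose a probability distribution (for example, over its top-k candidates) for each generated token.

```python
import math
from typing import Sequence


def normalized_entropy(probs: Sequence[float]) -> float:
    """Shannon entropy of one distribution, scaled to the range [0, 1]."""
    if len(probs) < 2:
        return 0.0
    entropy = -sum(p * math.log(p) for p in probs if p > 0.0)
    return entropy / math.log(len(probs))  # divide by the maximum possible entropy


def entropy_confidence(token_distributions: Sequence[Sequence[float]]) -> float:
    """Confidence = 1 - mean normalized entropy over all generated tokens."""
    if not token_distributions:
        return 0.0
    mean_entropy = sum(
        normalized_entropy(dist) for dist in token_distributions
    ) / len(token_distributions)
    return 1.0 - mean_entropy
```

A model that is certain about every token scores 1.0, while one that is choosing uniformly at random scores 0.0; real outputs would fall in between and could replace the fixed 0.85.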

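Rationale for Design Decisions

Decision 1: Why PEFT was adopted

Candidates:

Full fine-tuning
PEFT (LoRA/QLoRA)
Adapter
Prompt Tuning

Adopted: PEFT (QLoRA)

Reasons:

Comparison table:

                  Memory  Speed  Quality  Versatility
Full FT           ×       ×      ⭐⭐⭐    ⭐⭐⭐
PEFT (QLoRA)      ⭐⭐⭐   ⭐⭐    ⭐⭐⭐    ⭐⭐⭐
Adapter           ⭐⭐     ⭐⭐    ⭐⭐      ⭐⭐
Prompt Tuning     ⭐⭐⭐   ⭐⭐⭐  ⭐        ⭐

Conclusion: PEFT (QLoRA) offers the best overall balance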

Decision 2: Why the Alpaca format was adopted

Candidates:

Alpaca
ShareGPT
OpenAssistant
Custom

Adopted: Alpaca

Reasons:

Open source and widely adopted
Clear instruction-input-output structure
Compatible with HuggingFace datasets
Follows community best practice

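Decision 3: Why hybrid search

Candidates:

Text only
Coordinates only
Hybrid

Adopted: Hybrid

Reasons:

Text only:
- Pro: simple
- Con: misses synonyms

Coordinates only:
- Pro: finds semantically related knowledge
- Con: fails when the coordinates are inaccurate

Hybrid:
- Pro: combines the strengths of both
- Con: needs parameter tuning (text_weight, spatial_weight)

Current settings:

text_weight = 0.4
spatial_weight = 0.6
# → slightly favors coordinates (prioritizes semantic relatedness)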
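To make the weighting concrete, here is a minimal sketch of a hybrid score that combines a text score and a spatial score with the weights above. The similarity functions are simplified stand-ins chosen for illustration (keyword overlap, and Euclidean distance rescaled by the diagonal of the 6-dimensional unit cube); the actual scoring in `DendriticMemorySpace` may be defined differently.

```python
import math
from typing import Sequence

TEXT_WEIGHT = 0.4     # text_weight from the current settings
SPATIAL_WEIGHT = 0.6  # spatial_weight from the current settings
MAX_DISTANCE = math.sqrt(6)  # largest possible distance inside the 6-D unit cube


def spatial_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Map 6-D Euclidean distance to a similarity in [0, 1] (assumed mapping)."""
    return 1.0 - math.dist(a, b) / MAX_DISTANCE


def text_similarity(query: str, content: str) -> float:
    """Toy keyword overlap: fraction of query words that appear in the content."""
    words = query.lower().split()
    if not words:
        return 0.0
    text = content.lower()
    return sum(word in text for word in words) / len(words)


def hybrid_score(query: str, query_coords: Sequence[float],
                 tile_content: str, tile_coords: Sequence[float]) -> float:
    """Weighted sum of the text score and the spatial score."""
    return (TEXT_WEIGHT * text_similarity(query, tile_content)
            + SPATIAL_WEIGHT * spatial_similarity(query_coords, tile_coords))


# Coordinates borrowed from the example tiles in this guide.
tile_1 = [0.2, 0.8, 0.3, 0.9, 0.7, 0.8]
tile_2 = [0.3, 0.8, 0.4, 0.85, 0.65, 0.75]
print(hybrid_score("heart", tile_1, "The circulatory system...", tile_2))
```

In the usage line the query shares no keywords with the tile, yet the score stays high because the coordinates are close, which is the case the spatial term is meant to cover.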

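Decision 4: Why the circular import was resolved with a lazy import

Candidates:

Lazy import (import inside the function)
Architecture change (untangling the dependencies)
Introducing an intermediate module

Adopted: Lazy import

Reasons:

Solves the problem with minimal changes
Negligible performance impact
No large-scale rewrite of existing code

Implementation example:

def _check_db_knowledge(self, domain_id, prompt):
    # Import inside the function → avoids the circular import
    from backend.app.database.session import SessionLocal
    db = SessionLocal()
    try:
        ...  # query the DB here
    finally:
        db.close()  # always release the session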

Considerations When Extending the System

Adding a new LLM provider

Steps:

1. Add a new class to llm_providers.py

class NewProvider:
    async def infer(self, model_config, prompt, temperature):
        # implementation goes here
        pass

    async def infer_streaming(self, model_config, prompt, temperature):
        # implementation goes here
        pass

2. Add a branch to _perform_llm_inference() in model_router.py

if provider == "ollama":
    result = await self.ollama_provider.infer(...)
elif provider == "new_provider":  # ← added
    result = await self.new_provider.infer(...)

3. Add a member to the ModelProvider enum in backend/app/config.py

from enum import Enum

class ModelProvider(str, Enum):
    OLLAMA = "ollama"
    HUGGINGFACE = "huggingface"
    NEW_PROVIDER = "new_provider"  # ← added
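If the if/elif chain in step 2 grows unwieldy, one alternative is a registry keyed by provider name. The sketch below is an assumption about how that could look, not how `model_router.py` is currently written; `LLMProvider`, `EchoProvider`, `PROVIDERS`, and `perform_inference` are hypothetical names, and only the `infer(model_config, prompt, temperature)` signature is taken from this guide.

```python
import asyncio
from typing import Any, Dict, Protocol


class LLMProvider(Protocol):
    """Interface every provider is expected to satisfy."""

    async def infer(self, model_config: Dict[str, Any], prompt: str,
                    temperature: float) -> Dict[str, Any]: ...


class EchoProvider:
    """Toy provider used only to exercise the interface."""

    async def infer(self, model_config, prompt, temperature):
        return {"response": f"echo: {prompt}", "confidence": 0.5}


# Adding a provider becomes a one-line registration instead of a new elif branch.
PROVIDERS: Dict[str, LLMProvider] = {"echo": EchoProvider()}


async def perform_inference(provider: str, model_config: Dict[str, Any],
                            prompt: str, temperature: float = 0.7) -> Dict[str, Any]:
    """Look up the provider by name and delegate the call to it."""
    try:
        impl = PROVIDERS[provider]
    except KeyError:
        raise ValueError(f"Unknown provider: {provider}") from None
    return await impl.infer(model_config, prompt, temperature)


result = asyncio.run(perform_inference("echo", {}, "hello"))
print(result)
```

The trade-off is one more level of indirection in exchange for keeping the router unchanged when providers are added.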

Adding a new domain

Steps:

1. Define the domain's coordinate space in the .iath file

Medical domain: medical_space [x, y, z]
Legal domain: legal_space [x, y, z]  ← added
  - x: field of law (civil, criminal, commercial, ...)
  - y: court level (district court, high court, supreme court)
  - z: era (classical, modern, latest)

2. Add the domain settings to backend/app/config.py

domains = [
    {"domain_id": "medical", "name": "Medical"},
    {"domain_id": "legal", "name": "Legal"}  # ← added
]

3. Create the training data directory

mkdir -p training_data/master_outputs/
touch training_data/master_outputs/master_outputs_legal.jsonl

Implementing automatic coordinate estimation (Priority 2)

Design proposal:

# null_ai/coordinate_estimator.py
import json
from typing import List


class CoordinateEstimator:
    def __init__(self, llm_model):
        """
        Estimate coordinates using DeepSeek R1
        """
        self.llm = llm_model

    async def estimate_coordinates(
        self,
        prompt: str,
        response: str,
        domain_id: str
    ) -> List[float]:
        """
        Estimate the 6-dimensional coordinates

        Returns: [x, y, z, c, g, v]
        """
        # Build the prompt
        estimation_prompt = f"""You are an expert in knowledge space mapping.
Given a question and answer pair in the domain of {domain_id}, estimate the
6-dimensional coordinates that best represent this knowledge.

Coordinates format: [x, y, z, c, g, v]
- {domain_id}_space [x, y, z]: domain-specific 3D space (0.0-1.0)
- meta_space [c, g, v]: Certainty, Granularity, Verification (0.0-1.0)

Question: {prompt}
Answer: {response}

Output ONLY the coordinates as a JSON array: [x, y, z, c, g, v]
"""

        # Ask the LLM to estimate the coordinates
        result = await self.llm.generate(estimation_prompt)

        # Parse the JSON
        coords = json.loads(result)

        # Validate (explicit checks, since assert is skipped under python -O)
        if len(coords) != 6:
            raise ValueError(f"expected 6 coordinates, got {len(coords)}")
        if not all(0.0 <= c <= 1.0 for c in coords):
            raise ValueError("coordinates must lie in [0.0, 1.0]")

        return coords
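Calling `json.loads` directly on the model output only works if the model returns nothing but the array. A more forgiving parser could look like the sketch below; `parse_coordinates` is a hypothetical helper, and the handling of `<think>...</think>` blocks assumes the model may emit its reasoning in that form before the answer.

```python
import json
import re
from typing import List


def parse_coordinates(raw: str) -> List[float]:
    """Extract the last JSON array of six numbers in [0, 1] from raw LLM text."""
    # Drop a reasoning block if the model emitted one (assumed tag format).
    cleaned = re.sub(r"<think>.*?</think>", "", raw, flags=re.DOTALL)
    # Collect every flat bracketed span and try the most recent one first.
    candidates = re.findall(r"\[[^\[\]]*\]", cleaned)
    for candidate in reversed(candidates):
        try:
            coords = json.loads(candidate)
        except json.JSONDecodeError:
            continue
        is_numeric = all(
            isinstance(c, (int, float)) and not isinstance(c, bool) for c in coords
        )
        if len(coords) == 6 and is_numeric and all(0.0 <= c <= 1.0 for c in coords):
            return [float(c) for c in coords]
    raise ValueError("no valid 6-D coordinate array found in model output")


print(parse_coordinates("Sure! Coordinates: [0.2, 0.8, 0.3, 0.9, 0.7, 0.8]"))
```

Raising `ValueError` on failure lets the caller decide whether to retry the estimation or fall back to default coordinates.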

Implementing real-time progress over WebSocket

Design proposal:

# backend/app/main.py
from fastapi import WebSocket

# `app` (the FastAPI instance) and `fine_tuning_manager` are assumed
# to be defined elsewhere in this module.

@app.websocket("/ws/training/{session_id}")
async def training_websocket(websocket: WebSocket, session_id: str):
    await websocket.accept()

    # Progress callback
    async def progress_callback(state):
        await websocket.send_json({
            "type": "progress",
            "data": state
        })

    # Start fine-tuning
    await fine_tuning_manager.start_training(
        ...,
        progress_callback=progress_callback
    )

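Summary: The Essence of the Project

NullAI is neither just a RAG system nor just a fine-tuning tool.

The essence of NullAI:

A self-evolving knowledge ecosystem

Master AI → knowledge generation → apprentice training → promotion → new apprentice → cycle continues
   ↓                                        ↑
DB expansion (self-expansion)           Fine-tuning
   ↓                                        ↑
Dendritic Memory Space (6-D coords)     High-quality training data
   ↓                                        ↑
Semantic knowledge organization         Inheriting the master's knowledge
   ↓                                        ↑
   └──────────────── cycle ─────────────────┘

How the four core ideas fit together:

Fallen Tree System: evolution through generational change
Database Separation Structure: reliability plus self-expansion
Dendritic Memory Space: semantic organization of knowledge
Local-first: privacy and cost

Together these form an ecosystem in which the AI evolves on its own.

Once you understand this guide, you can carry NullAI's design philosophy forward correctly.

Good luck! 🌲🔥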

Document Version: 1.0
Total Pages: 60+
Total Words: 15,000+
Author: Claude (Sonnet 4.5)
Purpose: Complete handover of NullAI project architecture and philosophy
