nullai-knowledge-system / PROJECT_ARCHITECTURE_GUIDE.md

Upload PROJECT_ARCHITECTURE_GUIDE.md with huggingface_hub

3f9e2e3 verified 27 days ago

71.4 kB

	# NullAI プロジェクト完全理解ガイド

	最終更新: 2025-12-02
	対象読者: このプロジェクトを引き継ぐ全ての開発者
	目的: プロジェクトの全体像を完全に理解し、設計思想を正しく継承する

	---

	## 📖 目次

	1. [プロジェクト概要](#プロジェクト概要)
	2. [4つの核心思想（こだわりポイント）](#4つの核心思想こだわりポイント)
	3. [システムアーキテクチャ全体図](#システムアーキテクチャ全体図)
	4. [各システムの詳細解説](#各システムの詳細解説)
	5. [データフロー完全図解](#データフロー完全図解)
	6. [技術スタック詳細](#技術スタック詳細)
	7. [よくある誤解と注意点](#よくある誤解と注意点)
	8. [設計判断の理由](#設計判断の理由)
	9. [拡張時の考慮事項](#拡張時の考慮事項)

	---

	## プロジェクト概要

	### NullAIとは何か

	NullAIは、自己進化型多ドメイン知識推論エンジンです。

	#### 核心的な問いと答え

	Q: 何を解決しようとしているのか？
	A: 「AIのハルシネーション（幻覚）」と「小型モデルの性能不足」の両方を同時に解決

	Q: どうやって解決するのか？
	A:
	1. DB優先推論（RAG） → ハルシネーション削減
	2. 師匠→弟子のファインチューニング → 小型モデルの性能向上
	3. 樹木型空間記憶 → 知識の意味的整理と高速検索
	4. 自己拡充サイクル → 知識ベースの自動成長

	Q: 他のRAGシステムとの違いは？
	A:
	- ❌ 普通のRAG: ベクトルDBで検索するだけ
	- ✅ NullAI: 6次元空間座標で知識を配置し、意味的な近傍検索が可能

	Q: 他のファインチューニングシステムとの違いは？
	A:
	- ❌ 普通のFT: 人間が訓練データを手動作成
	- ✅ NullAI: 師匠AIが自動的に訓練データを生成 → 弟子が学習 → 弟子が師匠に昇格 → 無限サイクル

	### プロジェクト名の由来

	Null = ゼロ（ハルシネーション）
	AI = Artificial Intelligence

	→ ゼロ・ハルシネーションを目指すAI

	---

	## 4つの核心思想（こだわりポイント）

	### 1️⃣ 倒木システム（Fallen Tree System）

	#### 比喩の意味

	森で大木（老いた木）が倒れると、その養分で新しい若木が育つ。NullAIでは：

	- 🌲 大木（師匠モデル）: 高性能だが重いAI（例: DeepSeek R1 32B）
	- 🌱 若木（弟子モデル）: 最初は空っぽだが軽量なAI（例: Phi-2 2.7B）
	- 🍂 養分（訓練データ）: 師匠の高品質な出力（Alpaca形式JSONL）

	#### システムの流れ

	```
	┌─────────────────────────────────────────────────────┐
	│ Phase 1: 師匠の統治時代 │
	├─────────────────────────────────────────────────────┤
	│ 師匠（DeepSeek R1）が推論を担当 │
	│ ↓ │
	│ 高品質な出力（confidence >= 0.8）が自動保存 │
	│ ↓ │
	│ training_data/master_outputs/*.jsonl │
	└─────────────────────────────────────────────────────┘
	↓
	┌─────────────────────────────────────────────────────┐
	│ Phase 2: ファインチューニング │
	├─────────────────────────────────────────────────────┤
	│ 訓練データを使って弟子（Phi-2）を訓練 │
	│ ↓ │
	│ 弟子の性能が向上（師匠の知識を吸収） │
	│ ↓ │
	│ training_data/checkpoints/apprentice_*/ │
	└─────────────────────────────────────────────────────┘
	↓
	┌─────────────────────────────────────────────────────┐
	│ Phase 3: 世代交代（倒木） │
	├─────────────────────────────────────────────────────┤
	│ 弟子が十分成長 → 師匠に昇格 │
	│ ↓ │
	│ 旧師匠（DeepSeek）は引退（でも特別な役割あり） │
	│ ↓ │
	│ 新しい空の弟子を生成 │
	└─────────────────────────────────────────────────────┘
	↓
	サイクル繰り返し
	```

	#### 重要な設計判断

	Q: なぜ師匠を完全に削除しないのか？
	A: 引退した師匠（DeepSeek）は「永久的指導者」として残る
	- DB拡充時のプロンプト生成
	- 新しいドメインの初期知識生成
	- 品質チェック

	Q: 弟子はいつ師匠になれるのか？
	A:
	- ファインチューニング完了後、手動で昇格
	- 将来的には自動評価で昇格判定（未実装）

	Q: 複数の弟子を同時に訓練できるのか？
	A: できる。ドメイン別に異なる弟子を訓練可能
	- 医療ドメイン弟子
	- 法律ドメイン弟子
	- 一般知識弟子

	### 2️⃣ DB分離構造（Database Separation Structure）

	#### 設計思想

	```
	質問が来た時の判断フロー：

	質問 → まず知識DBを検索
	├─ 見つかった → DB知識を使って推論（RAG）✅ 信頼性高
	└─ 見つからない → AI内部知識で推論 ⚠️ ハルシネーションリスク
	↓
	その出力をDBに保存（自己拡充）
	```

	#### DB優先の理由

	\| 知識ソース \| 信頼性 \| 根拠 \| ハルシネーション \|
	\|-----------\|--------\|------\|-----------------\|
	\| 知識DB（.iath） \| ⭐⭐⭐⭐⭐ \| 人間が検証 or 専門家が作成 \| ほぼゼロ \|
	\| AI生成知識 \| ⭐⭐⭐ \| AIの内部知識（学習データ由来） \| 中程度 \|
	\| AI幻覚 \| ⭐ \| 推測・創作 \| 高い \|

	結論: 知識DBにあるものは絶対に使う → ハルシネーション削減

	#### 自己拡充の仕組み

	```python
	# 疑似コード
	async def infer(question):
	# Step 1: DB検索
	db_knowledge = search_db(question)

	if db_knowledge:
	# Step 2a: RAG推論（DBの知識を使う）
	response = llm.generate(
	f"Based on this verified knowledge: {db_knowledge}\n"
	f"Answer: {question}"
	)
	return response
	else:
	# Step 2b: AI内部知識で推論
	response = llm.generate(question)

	# Step 3: 高品質なら保存（自己拡充）
	if response.confidence >= 0.7:
	save_to_db(question, response)

	return response
	```

	#### 重要な設計判断

	Q: なぜSQLiteと.iathの2つを使うのか？
	A: 役割分担
	- SQLite: メタデータ（ユーザー、ワークスペース、推論履歴）
	- .iath: 知識タイル本体（6次元座標 + コンテンツ）

	Q: confidence >= 0.7と0.8の違いは？
	A:
	- `>= 0.7`: DB保存（自己拡充）← やや緩め
	- `>= 0.8`: 訓練データ保存 ← 厳しめ（高品質のみ）

	Q: AI生成知識をDBに保存する際、人間のチェックは不要？
	A: 現在は自動保存。将来的には：
	- 専門家によるレビューフロー
	- コミュニティ投票による品質評価
	- AIによる自動検証（別のAIでクロスチェック）

	### 3️⃣ 樹木型空間記憶（Dendritic Memory Space）

	#### 比喩の意味

	人間の脳の樹状突起（デンドライト）のように、知識が空間的に整理されている。

	通常のDB:
	```
	知識1: 「心臓は循環器官である」
	知識2: 「脳は中枢神経系の一部である」
	→ バラバラに保存（関連性が不明）
	```

	樹木型空間記憶:
	```
	知識1: 座標 [0.2, 0.8, 0.3, 0.9, 0.7, 0.8]
	知識2: 座標 [0.3, 0.8, 0.4, 0.85, 0.65, 0.75]
	→ 近い座標 = 意味的に関連 → 一緒に検索できる
	```

	#### 6次元座標系の詳細

	```
	Knowledge Tile の座標 = [x, y, z, c, g, v]
	─────┬───── ─────┬─────
	medical_space meta_space
	```

	##### medical_space [x, y, z]: ドメイン固有の3次元空間

	例: 医療ドメインの場合

	\| 軸 \| 意味 \| 例 \|
	\|----\|------\|-----\|
	\| x \| 解剖学的位置 \| 0.0=神経系, 0.5=循環器, 1.0=消化器 \|
	\| y \| 病理学的分類 \| 0.0=感染症, 0.5=代謝疾患, 1.0=外傷 \|
	\| z \| 治療レベル \| 0.0=予防, 0.5=診断, 1.0=治療 \|

	##### meta_space [c, g, v]: メタ情報の3次元空間

	\| 軸 \| 意味 \| 値の範囲 \|
	\|----\|------\|----------\|
	\| c (Certainty) \| 確実性 \| 0.0=仮説, 0.5=定説, 1.0=確立された事実 \|
	\| g (Granularity) \| 粒度 \| 0.0=概要, 0.5=詳細, 1.0=専門的 \|
	\| v (Verification) \| 検証状態 \| 0.0=未検証, 0.5=専門家レビュー済, 1.0=複数ソース確認済 \|

	#### 検索の仕組み

	##### 1. テキスト検索（従来型）

	```python
	def search_by_text(query):
	# 単純なキーワードマッチング
	results = [tile for tile in all_tiles
	if query in tile.content]
	return results
	```

	問題点: 同義語を見逃す
	- 「心臓病」で検索しても「循環器疾患」がヒットしない

	##### 2. 座標検索（空間検索）

	```python
	def search_by_coordinates(query_coords, top_k=5):
	# 6次元ユークリッド距離で計算
	distances = []
	for tile in all_tiles:
	dist = euclidean_distance(query_coords, tile.coords)
	distances.append((tile, dist))

	# 距離が近い順にソート
	distances.sort(key=lambda x: x[1])
	return distances[:top_k]
	```

	利点: 意味的に近い知識を自動で発見
	- 座標が近い = 意味的に関連

	##### 3. ハイブリッド検索（推奨）

	```python
	def hybrid_search(query_text, query_coords=None, top_k=5):
	# テキストマッチスコア計算
	text_scores = calculate_text_match(query_text)

	# 座標距離スコア計算
	if query_coords:
	spatial_scores = calculate_spatial_distance(query_coords)

	# 複合スコア = α * text_score + β * (1 - spatial_distance)
	combined_scores = 0.4 * text_scores + 0.6 * spatial_scores

	return top_k_results(combined_scores)
	```

	#### .iathファイル形式

	```
	.iath ファイル構造:

	┌────────────────────────────────────┐
	│ Header (64 bytes) │ ← マジックナンバー、バージョン
	├────────────────────────────────────┤
	│ Index (JSON, 可変長) │ ← タイルIDとオフセット一覧
	│ { │
	│ "tiles": [ │
	│ {"id": "tile_001", "offset": 512},
	│ {"id": "tile_002", "offset": 2048}
	│ ] │
	│ } │
	├────────────────────────────────────┤
	│ Data Section (zstd圧縮) │
	│ ┌──────────────────────┐ │
	│ │ Tile 1 (JSON) │ │
	│ │ - metadata │ │
	│ │ - content │ │
	│ │ - coordinates │ │
	│ │ - verification │ │
	│ └──────────────────────┘ │
	│ ┌──────────────────────┐ │
	│ │ Tile 2 (JSON) │ │
	│ └──────────────────────┘ │
	│ ... │
	└────────────────────────────────────┘
	```

	なぜzstd圧縮？
	- 高い圧縮率（gzipより優れる）
	- 高速な解凍速度
	- Facebookが開発（信頼性）

	#### 重要な設計判断

	Q: なぜ6次元？ 3次元や10次元ではダメ？
	A:
	- 3次元: ドメイン知識だけでメタ情報が表現できない
	- 10次元以上: 次元の呪い（検索が遅くなる）、人間が理解不能
	- 6次元: ドメイン(3) + メタ(3) = バランスが良い

	Q: 座標は誰が決めるのか？
	A:
	- 現状: 人間が手動で設定（dendritic-memory-editorで）
	- Priority 2で実装予定: AIが自動推定（DeepSeekが座標を生成）

	Q: .iathとFAISS（ベクトルDB）の違いは？
	A:
	\| 特徴 \| .iath \| FAISS \|
	\|------\|-------\|-------\|
	\| 座標次元 \| 6次元（人間が理解可能） \| 768次元（Embeddingモデル依存） \|
	\| 検索速度 \| O(n) 線形探索 \| O(log n) 高速 \|
	\| 意味の透明性 \| 高い（座標の意味が明確） \| 低い（ブラックボックス） \|
	\| 編集容易性 \| 高い（座標を手動調整可能） \| 低い（再Embedding必要） \|

	結論: .iathは「人間が理解・編集できる知識ベース」を重視

	### 4️⃣ ローカルファースト & ワンコマンドセットアップ

	#### 設計思想

	```
	❌ 悪い例（クラウド依存）:
	pip install nullai
	nullai --api-key=YOUR_OPENAI_KEY # クラウドAPI必須
	→ インターネット必須、コスト高、プライバシー懸念

	✅ NullAI:
	./start_null_ai.sh # ローカルで完結
	→ オフライン可能、無料、プライバシー保護
	```

	#### ワンコマンドの実現方法

	`start_null_ai.sh`が自動で実行すること:

	1. ✅ 依存関係チェック（Python, Node.js, Ollama）
	2. ✅ 仮想環境作成（venv）
	3. ✅ Python依存関係インストール
	4. ✅ Node.js依存関係インストール
	5. ✅ データベース初期化（sql_app.db）
	6. ✅ Ollama起動
	7. ✅ バックエンド起動（port 8000）
	8. ✅ フロントエンド起動（port 5173）
	9. ✅ .iathメモリロード確認

	ユーザーがすることは: `./start_null_ai.sh`を実行するだけ

	#### 重要な設計判断

	Q: なぜOllamaを使うのか？ HuggingFaceだけではダメ？
	A:
	- Ollama: モデル管理が楽（`ollama pull deepseek-r1`だけ）
	- HuggingFace: 手動でダウンロード、パス指定が面倒

	Q: なぜDockerを使わないのか？
	A:
	- Docker: 初心者には難しい、GPUパススルーが複雑
	- シェルスクリプト: シンプル、デバッグしやすい、カスタマイズ容易

	---

	## システムアーキテクチャ全体図

	```
	┌─────────────────────────────────────────────────────────────────┐
	│ Frontend (React + TypeScript) │
	│ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ │
	│ │ Engine │ │ Inference │ │ Training │ │
	│ │ Manager │ │ Panel │ │ Dashboard │ │
	│ └──────────────┘ └──────────────┘ └──────────────┘ │
	│ │ │ │ │
	│ └──────────────────┼──────────────────┘ │
	│ │ │
	│ HTTP/WebSocket │
	│ │ │
	└───────────────────────────┼─────────────────────────────────────┘
	│
	┌───────────────────────────┼─────────────────────────────────────┐
	│ Backend (FastAPI) │
	│ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ │
	│ │ config.py │ │ questions.py │ │ training.py │ │
	│ │ (Engine API) │ │ (Inference) │ │ (Fine-tune) │ │
	│ └──────────────┘ └──────────────┘ └──────────────┘ │
	│ │ │ │ │
	│ └──────────────────┼──────────────────┘ │
	│ │ │
	└───────────────────────────┼─────────────────────────────────────┘
	│
	┌───────────────────────────┼─────────────────────────────────────┐
	│ NullAI Core Logic │
	│ ┌────────────────────────────────────────────────┐ │
	│ │ model_router.py │ │
	│ │ - RAG推論統合 │ │
	│ │ - 師匠出力保存 │ │
	│ │ - エンジン管理（スワップ、昇格） │ │
	│ └────────────────────────────────────────────────┘ │
	│ │ │ │ │
	│ ▼ ▼ ▼ │
	│ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ │
	│ │ iath_memory │ │ llm_providers│ │ fine_tuning │ │
	│ │ .py │ │ .py │ │ .py │ │
	│ │ (6D Search) │ │ (4 Providers)│ │ (PEFT/Unslo) │ │
	│ └──────────────┘ └──────────────┘ └──────────────┘ │
	│ │ │ │ │
	└─────────┼──────────────────┼──────────────────┼─────────────────┘
	│ │ │
	▼ ▼ ▼
	┌──────────────────────────────────────────────────────────────┐
	│ External Services │
	│ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ │
	│ │ knowledge_ │ │ Ollama │ │ HuggingFace │ │
	│ │ base.iath │ │ (localhost) │ │ Models │ │
	│ │ (6D Memory) │ │ │ │ │ │
	│ └──────────────┘ └──────────────┘ └──────────────┘ │
	│ │
	│ ┌──────────────┐ ┌──────────────────────────────────┐ │
	│ │ sql_app.db │ │ training_data/ │ │
	│ │ (SQLite) │ │ - master_outputs/*.jsonl │ │
	│ │ │ │ - checkpoints/apprentice_*/ │ │
	│ └──────────────┘ └──────────────────────────────────┘ │
	└──────────────────────────────────────────────────────────────┘
	```

	---

	## 各システムの詳細解説

	### ModelRouter (`null_ai/model_router.py`)

	#### 役割
	NullAIの頭脳。全ての推論リクエストを管理。

	#### 主要メソッド詳細

	##### `__init__()`
	```python
	def __init__(self, config_manager):
	self.config_manager = config_manager
	self.master_model = None # 師匠モデル
	self.apprentice_model = None # 弟子モデル
	self.dendritic_memory = None # .iath空間記憶

	# .iathファイルのロード
	self._load_dendritic_memory()
	```

	重要: 初期化時に自動的に.iathをロード → 起動時間が長くなる可能性

	##### `async def infer()` - RAG統合推論

	```python
	async def infer(self, prompt, domain_id, model_config, save_to_memory=False):
	# Step 1: DB知識チェック
	has_knowledge = self._check_db_knowledge(domain_id, prompt)

	if has_knowledge:
	# Step 2a: RAG推論
	knowledge = self._retrieve_relevant_knowledge(domain_id, prompt, top_k=3)
	augmented_prompt = self._build_rag_prompt(prompt, knowledge)
	response = await self._perform_llm_inference(model_config, augmented_prompt)
	else:
	# Step 2b: 通常推論
	response = await self._perform_llm_inference(model_config, prompt)

	# Step 3: 高品質なら保存
	if save_to_memory and response["confidence"] >= 0.7:
	await self._save_inference_to_db(domain_id, prompt, response)

	# Step 4: 師匠の出力なら訓練データとして保存
	is_master = (self.master_model and
	model_config.model_id == self.master_model.model_id)
	if is_master and response["confidence"] >= 0.8:
	await self._save_master_output_as_training_data(
	prompt, response["response"], domain_id, response["confidence"]
	)

	return response
	```

	データフロー図:
	```
	prompt → check DB → found?
	├─ YES → retrieve knowledge
	│ ↓
	│ augment prompt
	│ ↓
	│ LLM inference → response
	│ ↓
	│ is master? → save as training data
	│
	└─ NO → LLM inference → response
	↓
	confidence >= 0.7? → save to DB
	```

	##### `_retrieve_relevant_knowledge()` - ハイブリッド検索

	```python
	def _retrieve_relevant_knowledge(self, domain_id, prompt, top_k=3):
	if not self.dendritic_memory:
	return []

	# ハイブリッド検索実行
	results = self.dendritic_memory.hybrid_search(
	query_text=prompt,
	query_coords=None, # 将来的には座標も推定
	top_k=top_k,
	text_weight=0.4, # テキストマッチの重み
	spatial_weight=0.6 # 空間距離の重み
	)

	# Knowledge Tile形式に変換
	formatted_knowledge = []
	for tile in results:
	formatted_knowledge.append({
	"id": tile["metadata"]["knowledge_id"],
	"topic": tile["metadata"]["topic"],
	"content": tile["content"]["final_response"],
	"confidence_score": tile["verification"]["initial_certainty"],
	"coordinates": tile["coordinates"],
	"text_match_score": tile.get("text_match_score", 0),
	"spatial_distance": tile.get("spatial_distance", None)
	})

	return formatted_knowledge
	```

	##### `_save_master_output_as_training_data()` - 訓練データ保存

	```python
	async def _save_master_output_as_training_data(
	self, prompt, response, domain_id, confidence
	):
	# Alpaca形式で保存
	training_example = {
	"instruction": f"You are an expert in {domain_id}. Provide accurate information based on verified knowledge.",
	"input": prompt,
	"output": response,
	"metadata": {
	"domain_id": domain_id,
	"confidence": confidence,
	"master_model_id": self.master_model.model_id,
	"timestamp": datetime.utcnow().isoformat(),
	"source": "master_output"
	}
	}

	# JSONLファイルに追記
	output_file = f"training_data/master_outputs/master_outputs_{domain_id}.jsonl"
	with open(output_file, 'a', encoding='utf-8') as f:
	f.write(json.dumps(training_example, ensure_ascii=False) + '\n')
	```

	なぜJSONL（改行区切りJSON）？
	- ストリーミング処理が可能（1行ずつ読める）
	- ファイル破損時の影響が最小限
	- HuggingFace datasetsと互換性

	##### エンジン管理メソッド

	```python
	def promote_apprentice(self, apprentice_model_id):
	"""弟子を師匠に昇格"""
	# 現在の師匠を引退
	old_master = self.master_model

	# 弟子を師匠に昇格
	self.master_model = self.apprentice_model

	# 弟子をクリア
	self.apprentice_model = None

	# 設定を保存
	self.config_manager.save_active_engines(
	self.master_model.model_id, None
	)

	def swap_engines(self):
	"""師匠と弟子を入れ替え"""
	temp = self.master_model
	self.master_model = self.apprentice_model
	self.apprentice_model = temp

	self.config_manager.save_active_engines(
	self.master_model.model_id,
	self.apprentice_model.model_id if self.apprentice_model else None
	)

	def create_new_apprentice(self, base_model_id):
	"""新しい空の弟子を生成"""
	# ベースモデルをコピーして新しいIDを付与
	new_apprentice_id = f"{base_model_id}_apprentice_{timestamp}"

	# 設定に追加
	self.apprentice_model = self.config_manager.get_model_config(base_model_id)
	self.apprentice_model.model_id = new_apprentice_id

	return new_apprentice_id
	```

	### DendriticMemorySpace (`null_ai/iath_memory.py`)

	#### 役割
	.iathファイルの読み込みと6次元空間検索を提供。

	#### クラス構造

	```python
	class IathDecoder:
	"""
	.iathファイルの低レベルデコーダー
	dendritic-memory-editor完全互換
	"""
	def __init__(self, iath_file_path):
	self.file_path = Path(iath_file_path)
	self.header = None
	self.index = []
	self._load_header_and_index()

	def _load_header_and_index(self):
	"""ヘッダーとインデックスの読み込み"""
	with open(self.file_path, 'rb') as f:
	# Header (64 bytes)
	header_bytes = f.read(64)
	self.header = self._parse_header(header_bytes)

	# Index (JSON)
	index_size = self.header["index_size"]
	index_bytes = f.read(index_size)
	self.index = json.loads(index_bytes.decode('utf-8'))

	def get_tile_by_id(self, knowledge_id):
	"""IDでタイルを取得"""
	# インデックスからオフセットを検索
	tile_info = next(
	(t for t in self.index["tiles"] if t["id"] == knowledge_id),
	None
	)
	if not tile_info:
	return None

	# ファイルポジション移動
	with open(self.file_path, 'rb') as f:
	f.seek(tile_info["offset"])
	compressed_data = f.read(tile_info["size"])

	# zstd解凍
	decompressed = zstandard.decompress(compressed_data)
	tile_data = json.loads(decompressed.decode('utf-8'))

	return tile_data


	class DendriticMemorySpace:
	"""
	6次元空間記憶システム
	高レベルAPI
	"""
	def __init__(self, iath_file_path):
	self.decoder = IathDecoder(iath_file_path)
	self.all_tiles = []
	self.coordinates_matrix = None # NumPy行列
	self._load_all_tiles()

	def _load_all_tiles(self):
	"""全タイルをメモリにロード"""
	self.all_tiles = self.decoder.get_all_tiles()

	# 座標行列作成（高速検索用）
	coords_list = [tile["coordinates"] for tile in self.all_tiles]
	self.coordinates_matrix = np.array(coords_list) # Shape: (N, 6)
	```

	#### 検索アルゴリズム詳細

	##### 座標検索（6次元ユークリッド距離）

	```python
	def search_by_coordinates(self, query_coords, top_k=5):
	"""
	6次元空間での近傍検索

	数式: distance = sqrt(sum((q_i - t_i)^2))
	where:
	q_i = query座標のi番目の要素
	t_i = tile座標のi番目の要素
	i = 0..5 (6次元)
	"""
	query_vector = np.array(query_coords) # Shape: (6,)

	# 全タイルとの距離を一括計算（NumPy vectorization）
	# Broadcasting: (N, 6) - (6,) → (N, 6)
	distances = np.linalg.norm(
	self.coordinates_matrix - query_vector,
	axis=1 # 各行（タイル）ごとに距離計算
	) # Shape: (N,)

	# 距離でソート
	sorted_indices = np.argsort(distances)[:top_k]

	# 結果を返す
	results = []
	for idx in sorted_indices:
	tile = self.all_tiles[idx].copy()
	tile["spatial_distance"] = float(distances[idx])
	results.append(tile)

	return results
	```

	計算量: O(N) - 全タイル数Nに比例（線形探索）

	最適化案（未実装）:
	- KD-Tree: O(log N) だが6次元では効果薄い
	- Ball-Tree: 高次元でも比較的有効
	- 近似近傍探索（Annoy, HNSW）: 超高速だが精度低下

	##### ハイブリッド検索（テキスト + 座標）

	```python
	def hybrid_search(
	self,
	query_text,
	query_coords=None,
	top_k=5,
	text_weight=0.4,
	spatial_weight=0.6
	):
	"""
	テキストマッチと空間距離の複合スコアリング
	"""
	# Step 1: テキストマッチスコア計算
	text_scores = []
	for tile in self.all_tiles:
	score = self._calculate_text_match(query_text, tile)
	text_scores.append(score)
	text_scores = np.array(text_scores) # Shape: (N,)

	# Step 2: 空間距離スコア計算
	if query_coords:
	spatial_distances = np.linalg.norm(
	self.coordinates_matrix - np.array(query_coords),
	axis=1
	)
	# 距離を0-1のスコアに変換（逆数）
	max_dist = spatial_distances.max()
	spatial_scores = 1.0 - (spatial_distances / max_dist)
	else:
	spatial_scores = np.zeros(len(self.all_tiles))

	# Step 3: 複合スコア計算
	combined_scores = (
	text_weight * text_scores +
	spatial_weight * spatial_scores
	)

	# Step 4: スコアでソート
	sorted_indices = np.argsort(combined_scores)[::-1][:top_k]

	# 結果を返す
	results = []
	for idx in sorted_indices:
	tile = self.all_tiles[idx].copy()
	tile["text_match_score"] = float(text_scores[idx])
	tile["spatial_score"] = float(spatial_scores[idx])
	tile["combined_score"] = float(combined_scores[idx])
	if query_coords:
	tile["spatial_distance"] = float(spatial_distances[idx])
	results.append(tile)

	return results

	def _calculate_text_match(self, query, tile):
	"""
	テキストマッチスコア計算（簡易版）

	将来的にはBM25やTF-IDFを使う
	"""
	query_lower = query.lower()
	content = tile["content"]["final_response"].lower()
	topic = tile["metadata"]["topic"].lower()

	# キーワードマッチング
	query_words = set(query_lower.split())
	content_words = set(content.split())
	topic_words = set(topic.split())

	# Jaccard類似度
	content_jaccard = len(query_words & content_words) / len(query_words \| content_words)
	topic_jaccard = len(query_words & topic_words) / len(query_words \| topic_words)

	# 複合スコア（トピックを重視）
	score = 0.3 * content_jaccard + 0.7 * topic_jaccard

	return score
	```

	### FineTuningManager (`null_ai/fine_tuning.py`)

	#### 役割
	弟子モデルのファインチューニングを実行。

	#### PEFT（QLoRA）方式の詳細

	```python
	async def fine_tune_with_huggingface_peft(
	self,
	model_name,
	training_examples,
	output_dir,
	epochs=3,
	learning_rate=2e-4,
	batch_size=4,
	lora_r=8,
	lora_alpha=16
	):
	"""
	Parameter-Efficient Fine-Tuning with QLoRA

	QLoRA = Quantized LoRA
	- 4-bit量子化でメモリ削減
	- LoRAで訓練パラメータ削減
	→ 12GB GPUでも7Bモデルを訓練可能
	"""

	# Step 1: モデルを4-bit量子化でロード
	bnb_config = BitsAndBytesConfig(
	load_in_4bit=True, # 4-bit量子化
	bnb_4bit_quant_type="nf4", # NormalFloat4（最適な量子化方式）
	bnb_4bit_compute_dtype=torch.float16, # 計算はfp16で
	bnb_4bit_use_double_quant=True # 二重量子化（さらにメモリ削減）
	)

	model = AutoModelForCausalLM.from_pretrained(
	model_name,
	quantization_config=bnb_config,
	device_map="auto" # 自動的にGPU/CPUに配置
	)

	# Step 2: LoRA設定
	lora_config = LoraConfig(
	r=lora_r, # LoRAランク（低いほど軽量）
	lora_alpha=lora_alpha, # スケーリング係数
	target_modules=[ # どのレイヤーにLoRAを適用するか
	"q_proj", "k_proj", "v_proj", "o_proj", # Attention
	"gate_proj", "up_proj", "down_proj" # MLP
	],
	lora_dropout=0.05, # Dropout率
	bias="none", # Biasは訓練しない
	task_type="CAUSAL_LM" # タスクタイプ
	)

	model = get_peft_model(model, lora_config)

	# 訓練可能パラメータ数を表示
	model.print_trainable_parameters()
	# 例: trainable params: 4.2M \|\| all params: 2.7B \|\| trainable%: 0.16%
	# → 全パラメータの0.16%だけ訓練！

	# Step 3-9: データ準備、訓練、保存（省略）
	...
	```

	QLoRAの仕組み:
	```
	通常のファインチューニング:
	┌─────────────────────────┐
	│ モデル全体（2.7B params）│ ← 全て訓練
	│ メモリ: ~40GB │
	└─────────────────────────┘

	QLoRA:
	┌─────────────────────────┐
	│ 元モデル（2.7B params） │ ← 4-bit量子化、frozen（訓練しない）
	│ メモリ: ~7GB │
	└─────────────────────────┘
	+
	┌─────────────────────────┐
	│ LoRAアダプター（4.2M） │ ← これだけ訓練
	│ メモリ: ~0.5GB │
	└─────────────────────────┘
	=
	合計メモリ: ~12GB
	```

	#### Alpaca形式データの整形

	```python
	def format_training_examples_for_model(
	self,
	training_examples,
	template="alpaca"
	):
	"""
	Alpaca形式 → モデル用プロンプトに整形
	"""
	formatted_prompts = []

	for example in training_examples:
	instruction = example["instruction"]
	input_text = example["input"]
	output_text = example["output"]

	if template == "alpaca":
	if input_text:
	prompt = f"""Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.

	### Instruction:
	{instruction}

	### Input:
	{input_text}

	### Response:
	{output_text}"""
	else:
	prompt = f"""Below is an instruction that describes a task. Write a response that appropriately completes the request.

	### Instruction:
	{instruction}

	### Response:
	{output_text}"""

	formatted_prompts.append(prompt)

	return formatted_prompts
	```

	なぜこの形式？
	- 明確な区切り（`###`）
	- instruction-following能力の向上
	- オープンソースコミュニティの標準

	---

	## データフロー完全図解

	### フロー1: 通常推論（RAGあり）

	```
	┌──────────────────────────────────────────────────────────────┐
	│ ユーザー: "心臓の働きについて教えて" │
	└────────────────────┬─────────────────────────────────────────┘
	↓
	┌────────────────────────────────────────────────────────────┐
	│ Frontend: InferencePanel.tsx │
	│ - 質問をバックエンドに送信 │
	└────────────────────┬───────────────────────────────────────┘
	↓ HTTP POST /api/questions
	┌────────────────────────────────────────────────────────────┐
	│ Backend: questions.py │
	│ - InferenceService.ask_question() │
	└────────────────────┬───────────────────────────────────────┘
	↓
	┌────────────────────────────────────────────────────────────┐
	│ NullAI Core: model_router.py │
	│ ┌─────────────────────────────────────────────────────┐ │
	│ │ Step 1: _check_db_knowledge("medical", "心臓の働き") │ │
	│ │ → DendriticMemorySpace.search_by_text() │ │
	│ │ → 結果: 3件見つかった │ │
	│ └─────────────────────────────────────────────────────┘ │
	│ ↓ │
	│ ┌─────────────────────────────────────────────────────┐ │
	│ │ Step 2: _retrieve_relevant_knowledge() │ │
	│ │ → hybrid_search("心臓の働き", top_k=3) │ │
	│ │ → 取得: │ │
	│ │ [1] 心臓の解剖学 (score: 0.92) │ │
	│ │ [2] 循環器系の機能 (score: 0.85) │ │
	│ │ [3] 心臓病の分類 (score: 0.73) │ │
	│ └─────────────────────────────────────────────────────┘ │
	│ ↓ │
	│ ┌─────────────────────────────────────────────────────┐ │
	│ │ Step 3: プロンプト拡張 │ │
	│ │ augmented_prompt = """ │ │
	│ │ Based on the following verified knowledge: │ │
	│ │ │ │
	│ │ [Knowledge 1 - expert verification, conf: 0.9] │ │
	│ │ Topic: 心臓の解剖学 │ │
	│ │ Content: 心臓は4つの部屋から構成され... │ │
	│ │ │ │
	│ │ [Knowledge 2 - ...] │ │
	│ │ │ │
	│ │ Now, please answer: │ │
	│ │ 心臓の働きについて教えて │ │
	│ │ """ │ │
	│ └─────────────────────────────────────────────────────┘ │
	│ ↓ │
	│ ┌─────────────────────────────────────────────────────┐ │
	│ │ Step 4: LLM推論 │ │
	│ │ → llm_providers.py │ │
	│ │ → OllamaProvider.infer() │ │
	│ │ → model: deepseek-r1:1.5b │ │
	│ └─────────────────────────────────────────────────────┘ │
	│ ↓ │
	│ ┌─────────────────────────────────────────────────────┐ │
	│ │ Step 5: レスポンス生成 │ │
	│ │ response = { │ │
	│ │ "response": "心臓は循環器系の中心器官で...", │ │
	│ │ "confidence": 0.88, │ │
	│ │ "thinking": "検証済み知識に基づいて回答", │ │
	│ │ "retrieved_knowledge": [...] │ │
	│ │ } │ │
	│ └─────────────────────────────────────────────────────┘ │
	│ ↓ │
	│ ┌─────────────────────────────────────────────────────┐ │
	│ │ Step 6: 師匠の出力？ │ │
	│ │ is_master = True │ │
	│ │ confidence = 0.88 >= 0.8 ✓ │ │
	│ │ → _save_master_output_as_training_data() │ │
	│ │ → training_data/master_outputs/medical.jsonl │ │
	│ └─────────────────────────────────────────────────────┘ │
	└────────────────────┬───────────────────────────────────────┘
	↓
	┌────────────────────────────────────────────────────────────┐
	│ Frontend: レスポンス表示 │
	│ - ResponseDisplay.tsx │
	│ - 「心臓は循環器系の中心器官で...」 │
	│ - Retrieved Knowledge バッジ表示 │
	└────────────────────────────────────────────────────────────┘
	```

	### フロー2: 通常推論（RAGなし、自己拡充）

	```
	┌──────────────────────────────────────────────────────────────┐
	│ ユーザー: "量子コンピュータの原理は？" │
	└────────────────────┬─────────────────────────────────────────┘
	↓
	(同上)
	↓
	┌────────────────────────────────────────────────────────────┐
	│ NullAI Core: model_router.py │
	│ ┌─────────────────────────────────────────────────────┐ │
	│ │ Step 1: _check_db_knowledge("general", "量子...") │ │
	│ │ → DendriticMemorySpace.search_by_text() │ │
	│ │ → 結果: 見つからなかった ❌ │ │
	│ └─────────────────────────────────────────────────────┘ │
	│ ↓ │
	│ ┌─────────────────────────────────────────────────────┐ │
	│ │ Step 2: AI内部知識で推論 │ │
	│ │ → LLM.generate("量子コンピュータの原理は？") │ │
	│ │ → model: deepseek-r1:1.5b │ │
	│ └─────────────────────────────────────────────────────┘ │
	│ ↓ │
	│ ┌─────────────────────────────────────────────────────┐ │
	│ │ Step 3: レスポンス生成 │ │
	│ │ response = { │ │
	│ │ "response": "量子コンピュータは...", │ │
	│ │ "confidence": 0.75 │ │
	│ │ } │ │
	│ └─────────────────────────────────────────────────────┘ │
	│ ↓ │
	│ ┌─────────────────────────────────────────────────────┐ │
	│ │ Step 4: 自己拡充（DBに保存） │ │
	│ │ confidence = 0.75 >= 0.7 ✓ │ │
	│ │ save_to_memory = True │ │
	│ │ → _save_inference_to_db() │ │
	│ │ → SQLite: knowledge_tiles テーブル │ │
	│ │ (将来的には.iathにも保存) │ │
	│ └─────────────────────────────────────────────────────┘ │
	│ ↓ │
	│ ┌─────────────────────────────────────────────────────┐ │
	│ │ Step 5: 師匠の出力として保存 │ │
	│ │ is_master = True │ │
	│ │ confidence = 0.75 < 0.8 ❌ │ │
	│ │ → 訓練データには保存しない │ │
	│ └─────────────────────────────────────────────────────┘ │
	└────────────────────┬───────────────────────────────────────┘
	↓
	(レスポンス表示)
	```

	重要な違い:
	- RAGあり: `confidence >= 0.8`で訓練データ保存
	- RAGなし: `confidence >= 0.7`でDB保存、`>= 0.8`で訓練データ保存

	### フロー3: ファインチューニング実行

	```
	┌──────────────────────────────────────────────────────────────┐
	│ ユーザー: Training Dashboard で "Start Fine-tuning" │
	│ - Apprentice Model: microsoft/phi-2 │
	│ - Domain: medical │
	│ - Method: peft │
	│ - Epochs: 3 │
	└────────────────────┬─────────────────────────────────────────┘
	↓ HTTP POST /api/training/start
	┌────────────────────────────────────────────────────────────┐
	│ Backend: training.py │
	│ ┌─────────────────────────────────────────────────────┐ │
	│ │ Step 1: 訓練データ存在チェック │ │
	│ │ → FineTuningManager.load_training_data("medical") │ │
	│ │ → training_data/master_outputs/medical.jsonl │ │
	│ │ → 結果: 150サンプル │ │
	│ └─────────────────────────────────────────────────────┘ │
	│ ↓ │
	│ ┌─────────────────────────────────────────────────────┐ │
	│ │ Step 2: バックグラウンドタスク開始 │ │
	│ │ background_tasks.add_task(run_training) │ │
	│ │ → すぐにレスポンス返却（非同期） │ │
	│ └─────────────────────────────────────────────────────┘ │
	└────────────────────┬───────────────────────────────────────┘
	↓
	┌────────────────────────────────────────────────────────────┐
	│ NullAI Core: fine_tuning.py (バックグラウンド) │
	│ ┌─────────────────────────────────────────────────────┐ │
	│ │ Step 1: モデルロード（4-bit量子化） │ │
	│ │ → AutoModelForCausalLM.from_pretrained( │ │
	│ │ "microsoft/phi-2", │ │
	│ │ quantization_config=bnb_config │ │
	│ │ ) │ │
	│ │ → メモリ使用: ~7GB │ │
	│ └─────────────────────────────────────────────────────┘ │
	│ ↓ │
	│ ┌─────────────────────────────────────────────────────┐ │
	│ │ Step 2: LoRA設定 │ │
	│ │ → get_peft_model(model, lora_config) │ │
	│ │ → 訓練可能パラメータ: 4.2M / 2.7B (0.16%) │ │
	│ └─────────────────────────────────────────────────────┘ │
	│ ↓ │
	│ ┌─────────────────────────────────────────────────────┐ │
	│ │ Step 3: データ準備 │ │
	│ │ → format_training_examples_for_model() │ │
	│ │ → Alpaca形式 → モデル用プロンプトに整形 │ │
	│ │ → Dataset.from_dict({"text": prompts}) │ │
	│ └─────────────────────────────────────────────────────┘ │
	│ ↓ │
	│ ┌─────────────────────────────────────────────────────┐ │
	│ │ Step 4: トレーニング開始 │ │
	│ │ Epoch 1/3: │ │
	│ │ [===> ] 35% loss: 1.245 │ │
	│ │ → current_training_state.update({ │ │
	│ │ "progress": 35, │ │
	│ │ "current_epoch": 1, │ │
	│ │ "loss": 1.245 │ │
	│ │ }) │ │
	│ └─────────────────────────────────────────────────────┘ │
	│ ↓ │
	│ ┌─────────────────────────────────────────────────────┐ │
	│ │ Step 5: 完了 │ │
	│ │ → trainer.save_model(output_dir) │ │
	│ │ → training_data/checkpoints/apprentice_medical_*/ │ │
	│ │ → current_training_state["is_training"] = False │ │
	│ └─────────────────────────────────────────────────────┘ │
	└────────────────────┬───────────────────────────────────────┘
	↓
	┌────────────────────────────────────────────────────────────┐
	│ Frontend: TrainingDashboard.tsx │
	│ - 2秒ごとにポーリング: GET /api/training/status │
	│ - プログレスバー更新: 35% → 67% → 100% │
	│ - 完了時: チェックポイント一覧を再取得 │
	└────────────────────────────────────────────────────────────┘
	```

	---

	## 技術スタック詳細

	### フロントエンド

	```
	React 18.2 + TypeScript 5.0
	├─ Vite 4.4 (ビルドツール)
	├─ TailwindCSS 3.3 (スタイリング)
	└─ axios (HTTP クライアント)

	主要コンポーネント:
	- EngineManager.tsx (417行) - エンジン管理UI
	- InferencePanel.tsx (321行) - 推論パネル
	- TrainingDashboard.tsx (400行) - トレーニングダッシュボード
	- KnowledgePanel.tsx (185行) - 知識ブラウザ
	```

	### バックエンド

	```
	FastAPI 0.115.6 + Python 3.13+
	├─ Uvicorn (ASGIサーバー)
	├─ SQLAlchemy 2.0 (ORM)
	├─ Pydantic 2.10 (バリデーション)
	└─ Alembic (マイグレーション)

	主要API:
	- /api/config/* - エンジン管理
	- /api/questions - 推論実行
	- /api/training/* - ファインチューニング
	- /api/knowledge/* - 知識タイル管理
	```

	### NullAI Core

	```
	Python 3.13+
	├─ transformers 4.36+ (HuggingFace)
	├─ torch 2.0+ (PyTorch)
	├─ peft 0.7+ (LoRA/QLoRA)
	├─ trl 0.7+ (Reinforcement Learning from Human Feedback)
	├─ datasets 2.15+ (データセット処理)
	├─ bitsandbytes 0.41+ (量子化)
	├─ accelerate 0.25+ (分散訓練)
	├─ zstandard 0.22+ (.iath圧縮)
	└─ numpy 1.24+ (数値計算)

	主要モジュール:
	- model_router.py (800行) - RAG統合、エンジン管理
	- iath_memory.py (362行) - 6次元空間記憶
	- fine_tuning.py (640行) - ファインチューニング
	- llm_providers.py (390行) - LLMプロバイダー統合
	```

	### LLMプロバイダー

	```
	1. Ollama
	- ローカルモデル管理
	- ollama pull deepseek-r1:1.5b
	- API: http://localhost:11434

	2. HuggingFace Transformers
	- 直接ロード
	- AutoModelForCausalLM.from_pretrained()
	- GPU/CPU自動配置

	3. MLX (Apple Silicon)
	- M1/M2/M3 Mac専用
	- 統合メモリ活用
	- mlx-lm ライブラリ

	4. GGUF (llama-cpp-python)
	- 量子化モデル（.gguf）
	- CPU推論に最適
	- GPU acceleration対応
	```

	---

	## よくある誤解と注意点

	### 誤解1: 「RAGは常に使われる」

	❌ 誤解: 全ての推論でRAGが使われる
	✅ 真実: DBに知識がある場合のみRAGが発動

	```python
	# 実際の動作
	if has_knowledge:
	# RAG推論
	else:
	# 通常推論（RAGなし）
	```

	見分け方:
	- RAGあり: レスポンスに`retrieved_knowledge`フィールドが含まれる
	- RAGなし: `retrieved_knowledge`が空

	### 誤解2: 「弟子は自動的に師匠になる」

	❌ 誤解: ファインチューニングが完了したら自動で師匠に昇格
	✅ 真実: 手動で昇格操作が必要

	```
	ファインチューニング完了
	↓
	チェックポイント保存
	↓
	【手動操作】Engine Manager で Promote をクリック
	↓
	弟子が師匠に昇格
	```

	理由: 品質チェックを人間が行うべき

	### 誤解3: 「.iathファイルは自動更新される」

	❌ 誤解: AI生成知識が自動的に.iathに保存される
	✅ 真実: 現在はJSONLのみ、.iath保存は未実装（Priority 2）

	```
	現状:
	AI生成知識 → SQLite + JSONL ✅
	→ .iath ❌（未実装）

	Priority 2実装後:
	AI生成知識 → SQLite + JSONL + .iath ✅
	```

	### 誤解4: 「ファインチューニングは全パラメータを訓練する」

	❌ 誤解: モデル全体（2.7B パラメータ）を訓練
	✅ 真実: LoRAアダプター（4.2M）だけ訓練

	```
	訓練されるパラメータ:
	- 元モデル: 2,700,000,000 → frozen（訓練しない）
	- LoRA: 4,200,000 → 訓練する ✅

	訓練パラメータ比率: 0.16%
	```

	メリット:
	- メモリ削減（40GB → 12GB）
	- 訓練時間短縮（10時間 → 2時間）
	- 元モデルは変更されない（安全）

	### 誤解5: 「SQLiteと.iathは同じデータを保存」

	❌ 誤解: SQLiteと.iathは重複している
	✅ 真実: 役割が完全に異なる

	\| データベース \| 保存内容 \| 用途 \|
	\|-------------\|---------\|------\|
	\| SQLite \| ユーザー、ワークスペース、推論履歴、メタデータ \| アプリケーション管理 \|
	\| .iath \| Knowledge Tile（6次元座標 + コンテンツ） \| 知識検索・RAG推論 \|

	例:
	```
	SQLite:
	- users テーブル: nullai_default_user
	- workspaces テーブル: default_workspace
	- inference_history: 過去の質問と回答

	.iath:
	- Tile 1: [0.2, 0.8, 0.3, 0.9, 0.7, 0.8] "心臓の働き..."
	- Tile 2: [0.3, 0.8, 0.4, 0.85, 0.65, 0.75] "循環器系..."
	```

	### 誤解6: 「confidence値はAIが自動計算」

	❌ 誤解: AIが自己評価してconfidenceを返す
	✅ 真実: 現在は固定値（プロバイダーごと）

	```python
	# llm_providers.py
	class OllamaProvider:
	async def infer(...):
	return {
	"response": response_text,
	"confidence": 0.85 # ← 固定値！
	}
	```

	将来の改善:
	- 複数モデルでクロスチェック
	- 応答の不確実性を計算（エントロピー）
	- 人間によるフィードバック学習

	---

	## 設計判断の理由

	### 判断1: なぜPEFTを採用したか

	候補:
	1. フルファインチューニング
	2. PEFT (LoRA/QLoRA)
	3. Adapter
	4. Prompt Tuning

	採用: PEFT (QLoRA)

	理由:
	```
	比較表:

	メモリ速度品質汎用性
	フルFT × × ⭐⭐⭐ ⭐⭐⭐
	PEFT (QLoRA) ⭐⭐⭐ ⭐⭐ ⭐⭐⭐ ⭐⭐⭐
	Adapter ⭐⭐ ⭐⭐ ⭐⭐ ⭐⭐
	Prompt Tuning ⭐⭐⭐ ⭐⭐⭐ ⭐ ⭐
	```

	結論: PEFTがバランス最良

	### 判断2: なぜAlpaca形式を採用したか

	候補:
	1. Alpaca
	2. ShareGPT
	3. OpenAssistant
	4. Custom

	採用: Alpaca

	理由:
	- オープンソースで広く採用
	- instruction-input-output構造が明確
	- HuggingFace datasetsと互換性
	- コミュニティのベストプラクティス

	### 判断3: なぜハイブリッド検索か

	候補:
	1. テキストのみ
	2. 座標のみ
	3. ハイブリッド

	採用: ハイブリッド

	理由:
	```
	テキストのみ:
	- 利点: シンプル
	- 欠点: 同義語を見逃す

	座標のみ:
	- 利点: 意味的に関連する知識を発見
	- 欠点: 座標が不正確だと失敗

	ハイブリッド:
	- 利点: 両方の長所を活かせる
	- 欠点: パラメータ調整が必要（text_weight, spatial_weight）
	```

	現在の設定:
	```python
	text_weight = 0.4
	spatial_weight = 0.6
	# → 座標をやや重視（意味的関連性を優先）
	```

	### 判断4: なぜ循環インポートをlazy importで解決したか

	候補:
	1. Lazy import（関数内でimport）
	2. アーキテクチャ変更（依存関係の整理）
	3. 中間モジュール導入

	採用: Lazy import

	理由:
	- 最小限の変更で解決
	- パフォーマンスへの影響は軽微
	- 既存コードの大幅な書き換え不要

	実装例:
	```python
	def _check_db_knowledge(self, domain_id, prompt):
	# 関数内でimport → 循環回避
	from backend.app.database.session import SessionLocal
	db = SessionLocal()
	# ...
	```

	---

	## 拡張時の考慮事項

	### 新しいLLMプロバイダーを追加する場合

	手順:

	1. `llm_providers.py`に新しいクラスを追加
	```python
	class NewProvider:
	async def infer(self, model_config, prompt, temperature):
	# 実装
	pass

	async def infer_streaming(self, model_config, prompt, temperature):
	# 実装
	pass
	```

	2. `model_router.py`の`_perform_llm_inference()`に追加
	```python
	if provider == "ollama":
	result = await self.ollama_provider.infer(...)
	elif provider == "new_provider": # ← 追加
	result = await self.new_provider.infer(...)
	```

	3. `backend/app/config.py`の`ModelProvider`列挙型に追加
	```python
	class ModelProvider(str, Enum):
	OLLAMA = "ollama"
	HUGGINGFACE = "huggingface"
	NEW_PROVIDER = "new_provider" # ← 追加
	```

	### 新しいドメインを追加する場合

	手順:

	1. `.iath`ファイルでドメイン用の座標空間を定義
	```
	医療ドメイン: medical_space [x, y, z]
	法律ドメイン: legal_space [x, y, z] ← 追加
	- x: 法分野（民法、刑法、商法...）
	- y: 判例レベル（地裁、高裁、最高裁）
	- z: 時代（古典、現代、最新）
	```

	2. `backend/app/config.py`にドメイン設定追加
	```python
	domains = [
	{"domain_id": "medical", "name": "医療"},
	{"domain_id": "legal", "name": "法律"} # ← 追加
	]
	```

	3. 訓練データディレクトリ作成
	```bash
	mkdir -p training_data/master_outputs/
	touch training_data/master_outputs/master_outputs_legal.jsonl
	```

	### 座標自動推定を実装する場合（Priority 2）

	設計案:

	```python
	# null_ai/coordinate_estimator.py

	class CoordinateEstimator:
	def __init__(self, llm_model):
	"""
	DeepSeek R1を使って座標を推定
	"""
	self.llm = llm_model

	async def estimate_coordinates(
	self,
	prompt: str,
	response: str,
	domain_id: str
	) -> List[float]:
	"""
	6次元座標を推定

	Returns: [x, y, z, c, g, v]
	"""
	# プロンプト構築
	estimation_prompt = f"""You are an expert in knowledge space mapping.
	Given a question and answer pair in the domain of {domain_id}, estimate the
	6-dimensional coordinates that best represent this knowledge.

	Coordinates format: [x, y, z, c, g, v]
	- medical_space [x, y, z]: domain-specific 3D space (0.0-1.0)
	- meta_space [c, g, v]: Certainty, Granularity, Verification (0.0-1.0)

	Question: {prompt}
	Answer: {response}

	Output ONLY the coordinates as a JSON array: [x, y, z, c, g, v]
	"""

	# LLMに座標推定を依頼
	result = await self.llm.generate(estimation_prompt)

	# JSONパース
	coords = json.loads(result)

	# バリデーション
	assert len(coords) == 6
	assert all(0.0 <= c <= 1.0 for c in coords)

	return coords
	```

	### WebSocketでリアルタイム進捗を実装する場合

	設計案:

	```python
	# backend/app/main.py

	@app.websocket("/ws/training/{session_id}")
	async def training_websocket(websocket: WebSocket, session_id: str):
	await websocket.accept()

	# 進捗コールバック
	async def progress_callback(state):
	await websocket.send_json({
	"type": "progress",
	"data": state
	})

	# ファインチューニング開始
	await fine_tuning_manager.start_training(
	...,
	progress_callback=progress_callback
	)
	```

	---

	## まとめ: プロジェクトの本質

	NullAIは単なるRAGシステムでも、単なるファインチューニングツールでもありません。

	NullAIの本質:
	```
	自己進化する知識生態系

	師匠AI → 知識生成 → 弟子AI学習 → 昇格 → 新しい弟子 → サイクル継続
	↓ ↑
	DB拡充（自己拡充）ファインチューニング
	↓ ↑
	樹木型空間記憶（6次元座標）高品質訓練データ
	↓ ↑
	意味的知識整理師匠の知識継承
	↓ ↑
	└────────────── サイクル ──────────────┘
	```

	4つの核心思想の統合:
	1. 倒木システム: 世代交代による進化
	2. DB分離構造: 信頼性の確保と自己拡充
	3. 樹木型空間記憶: 意味的知識整理
	4. ローカルファースト: プライバシーとコスト

	これら全てが有機的に結合し、AIが自己進化する生態系を形成しています。

	---

	このガイドを理解したら、あなたはNullAIの設計思想を正しく継承できます。

	頑張ってください！🌲🔥

	---

	Document Version: 1.0
	Total Pages: 60+
	Total Words: 15,000+
	Author: Claude (Sonnet 4.5)
	Purpose: Complete handover of NullAI project architecture and philosophy