NullAI Project: Complete Guide

Last updated: 2025-12-02
Audience: every developer who takes over this project
Purpose: understand the project as a whole and carry its design philosophy forward correctly


📖 Table of Contents

  1. Project overview
  2. The four core ideas (points of emphasis)
  3. System architecture overview
  4. Detailed explanation of each system
  5. Complete data-flow diagrams
  6. Technology stack details
  7. Common misunderstandings and caveats
  8. Reasons behind the design decisions
  9. Considerations when extending

Project overview

What is NullAI?

NullAI is a self-evolving, multi-domain knowledge inference engine.

Core questions and answers

Q: What problem is it trying to solve?
A: It tackles both AI hallucination and the limited performance of small models at the same time.

Q: How does it solve them?
A:

  1. DB-first inference (RAG) → fewer hallucinations
  2. Master → apprentice fine-tuning → better small-model performance
  3. Dendritic (tree-structured) spatial memory → semantic organization of knowledge and fast search
  4. Self-expansion cycle → automatic growth of the knowledge base

Q: How does it differ from other RAG systems?
A:

  • ❌ Ordinary RAG: only searches a vector DB
  • ✅ NullAI: places knowledge at 6-dimensional spatial coordinates, enabling semantic nearest-neighbor search

Q: How does it differ from other fine-tuning systems?
A:

  • ❌ Ordinary fine-tuning: humans create the training data by hand
  • ✅ NullAI: the master AI generates training data automatically → the apprentice learns → the apprentice is promoted to master → the cycle repeats indefinitely

Origin of the project name

Null = zero (hallucinations)
AI = Artificial Intelligence

→ An AI that aims for zero hallucinations


4ใคใฎๆ ธๅฟƒๆ€ๆƒณ๏ผˆใ“ใ ใ‚ใ‚Šใƒใ‚คใƒณใƒˆ๏ผ‰

1๏ธโƒฃ ๅ€’ๆœจใ‚ทใ‚นใƒ†ใƒ ๏ผˆFallen Tree System๏ผ‰

ๆฏ”ๅ–ฉใฎๆ„ๅ‘ณ

ๆฃฎใงๅคงๆœจ๏ผˆ่€ใ„ใŸๆœจ๏ผ‰ใŒๅ€’ใ‚Œใ‚‹ใจใ€ใใฎ้คŠๅˆ†ใงๆ–ฐใ—ใ„่‹ฅๆœจใŒ่‚ฒใคใ€‚NullAIใงใฏ๏ผš

  • ๐ŸŒฒ ๅคงๆœจ๏ผˆๅธซๅŒ ใƒขใƒ‡ใƒซ๏ผ‰: ้ซ˜ๆ€ง่ƒฝใ ใŒ้‡ใ„AI๏ผˆไพ‹: DeepSeek R1 32B๏ผ‰
  • ๐ŸŒฑ ่‹ฅๆœจ๏ผˆๅผŸๅญใƒขใƒ‡ใƒซ๏ผ‰: ๆœ€ๅˆใฏ็ฉบใฃใฝใ ใŒ่ปฝ้‡ใชAI๏ผˆไพ‹: Phi-2 2.7B๏ผ‰
  • ๐Ÿ‚ ้คŠๅˆ†๏ผˆ่จ“็ทดใƒ‡ใƒผใ‚ฟ๏ผ‰: ๅธซๅŒ ใฎ้ซ˜ๅ“่ณชใชๅ‡บๅŠ›๏ผˆAlpacaๅฝขๅผJSONL๏ผ‰

ใ‚ทใ‚นใƒ†ใƒ ใฎๆตใ‚Œ

┌─────────────────────────────────────────────────────┐
│ Phase 1: The master's reign                         │
├─────────────────────────────────────────────────────┤
│ The master (DeepSeek R1) handles inference          │
│  ↓                                                  │
│ High-quality outputs (confidence >= 0.8) auto-saved │
│  ↓                                                  │
│ training_data/master_outputs/*.jsonl                │
└─────────────────────────────────────────────────────┘
                      ↓
┌─────────────────────────────────────────────────────┐
│ Phase 2: Fine-tuning                                │
├─────────────────────────────────────────────────────┤
│ Train the apprentice (Phi-2) on the training data   │
│  ↓                                                  │
│ Apprentice improves (absorbs master's knowledge)    │
│  ↓                                                  │
│ training_data/checkpoints/apprentice_*/             │
└─────────────────────────────────────────────────────┘
                      ↓
┌─────────────────────────────────────────────────────┐
│ Phase 3: Generational change (the fallen tree)      │
├─────────────────────────────────────────────────────┤
│ Apprentice has grown enough → promoted to master    │
│  ↓                                                  │
│ Old master (DeepSeek) retires, keeps a special role │
│  ↓                                                  │
│ A new, empty apprentice is created                  │
└─────────────────────────────────────────────────────┘
                      ↓
                  Cycle repeats

้‡่ฆใช่จญ่จˆๅˆคๆ–ญ

Q: ใชใœๅธซๅŒ ใ‚’ๅฎŒๅ…จใซๅ‰Š้™คใ—ใชใ„ใฎใ‹๏ผŸ A: ๅผ•้€€ใ—ใŸๅธซๅŒ ๏ผˆDeepSeek๏ผ‰ใฏใ€Œๆฐธไน…็š„ๆŒ‡ๅฐŽ่€…ใ€ใจใ—ใฆๆฎ‹ใ‚‹

  • DBๆ‹กๅ……ๆ™‚ใฎใƒ—ใƒญใƒณใƒ—ใƒˆ็”Ÿๆˆ
  • ๆ–ฐใ—ใ„ใƒ‰ใƒกใ‚คใƒณใฎๅˆๆœŸ็Ÿฅ่ญ˜็”Ÿๆˆ
  • ๅ“่ณชใƒใ‚งใƒƒใ‚ฏ

Q: ๅผŸๅญใฏใ„ใคๅธซๅŒ ใซใชใ‚Œใ‚‹ใฎใ‹๏ผŸ A:

  • ใƒ•ใ‚กใ‚คใƒณใƒใƒฅใƒผใƒ‹ใƒณใ‚ฐๅฎŒไบ†ๅพŒใ€ๆ‰‹ๅ‹•ใงๆ˜‡ๆ ผ
  • ๅฐ†ๆฅ็š„ใซใฏ่‡ชๅ‹•่ฉ•ไพกใงๆ˜‡ๆ ผๅˆคๅฎš๏ผˆๆœชๅฎŸ่ฃ…๏ผ‰

Q: ่ค‡ๆ•ฐใฎๅผŸๅญใ‚’ๅŒๆ™‚ใซ่จ“็ทดใงใใ‚‹ใฎใ‹๏ผŸ A: ใงใใ‚‹ใ€‚ใƒ‰ใƒกใ‚คใƒณๅˆฅใซ็•ฐใชใ‚‹ๅผŸๅญใ‚’่จ“็ทดๅฏ่ƒฝ

  • ๅŒป็™‚ใƒ‰ใƒกใ‚คใƒณๅผŸๅญ
  • ๆณ•ๅพ‹ใƒ‰ใƒกใ‚คใƒณๅผŸๅญ
  • ไธ€่ˆฌ็Ÿฅ่ญ˜ๅผŸๅญ

2๏ธโƒฃ DBๅˆ†้›ขๆง‹้€ ๏ผˆDatabase Separation Structure๏ผ‰

่จญ่จˆๆ€ๆƒณ

่ณชๅ•ใŒๆฅใŸๆ™‚ใฎๅˆคๆ–ญใƒ•ใƒญใƒผ๏ผš

่ณชๅ• โ†’ ใพใš็Ÿฅ่ญ˜DBใ‚’ๆคœ็ดข
         โ”œโ”€ ่ฆ‹ใคใ‹ใฃใŸ โ†’ DB็Ÿฅ่ญ˜ใ‚’ไฝฟใฃใฆๆŽจ่ซ–๏ผˆRAG๏ผ‰โœ… ไฟก้ ผๆ€ง้ซ˜
         โ””โ”€ ่ฆ‹ใคใ‹ใ‚‰ใชใ„ โ†’ AIๅ†…้ƒจ็Ÿฅ่ญ˜ใงๆŽจ่ซ– โš ๏ธ ใƒใƒซใ‚ทใƒใƒผใ‚ทใƒงใƒณใƒชใ‚นใ‚ฏ
                            โ†“
                        ใใฎๅ‡บๅŠ›ใ‚’DBใซไฟๅญ˜๏ผˆ่‡ชๅทฑๆ‹กๅ……๏ผ‰

DBๅ„ชๅ…ˆใฎ็†็”ฑ

็Ÿฅ่ญ˜ใ‚ฝใƒผใ‚น ไฟก้ ผๆ€ง ๆ นๆ‹  ใƒใƒซใ‚ทใƒใƒผใ‚ทใƒงใƒณ
็Ÿฅ่ญ˜DB๏ผˆ.iath๏ผ‰ โญโญโญโญโญ ไบบ้–“ใŒๆคœ่จผ or ๅฐ‚้–€ๅฎถใŒไฝœๆˆ ใปใผใ‚ผใƒญ
AI็”Ÿๆˆ็Ÿฅ่ญ˜ โญโญโญ AIใฎๅ†…้ƒจ็Ÿฅ่ญ˜๏ผˆๅญฆ็ฟ’ใƒ‡ใƒผใ‚ฟ็”ฑๆฅ๏ผ‰ ไธญ็จ‹ๅบฆ
AIๅนป่ฆš โญ ๆŽจๆธฌใƒปๅ‰ตไฝœ ้ซ˜ใ„

็ต่ซ–: ็Ÿฅ่ญ˜DBใซใ‚ใ‚‹ใ‚‚ใฎใฏ็ตถๅฏพใซไฝฟใ† โ†’ ใƒใƒซใ‚ทใƒใƒผใ‚ทใƒงใƒณๅ‰Šๆธ›

How self-expansion works

# Pseudocode
async def infer(question):
    # Step 1: search the DB
    db_knowledge = search_db(question)

    if db_knowledge:
        # Step 2a: RAG inference (use the knowledge from the DB)
        response = llm.generate(
            f"Based on this verified knowledge: {db_knowledge}\n"
            f"Answer: {question}"
        )
        return response
    else:
        # Step 2b: infer from the AI's internal knowledge
        response = llm.generate(question)

        # Step 3: save the answer if it is high quality (self-expansion)
        if response.confidence >= 0.7:
            save_to_db(question, response)

        return response

้‡่ฆใช่จญ่จˆๅˆคๆ–ญ

Q: ใชใœSQLiteใจ.iathใฎ2ใคใ‚’ไฝฟใ†ใฎใ‹๏ผŸ A: ๅฝนๅ‰ฒๅˆ†ๆ‹…

  • SQLite: ใƒกใ‚ฟใƒ‡ใƒผใ‚ฟ๏ผˆใƒฆใƒผใ‚ถใƒผใ€ใƒฏใƒผใ‚ฏใ‚นใƒšใƒผใ‚นใ€ๆŽจ่ซ–ๅฑฅๆญด๏ผ‰
  • .iath: ็Ÿฅ่ญ˜ใ‚ฟใ‚คใƒซๆœฌไฝ“๏ผˆ6ๆฌกๅ…ƒๅบงๆจ™ + ใ‚ณใƒณใƒ†ใƒณใƒ„๏ผ‰

Q: confidence >= 0.7ใจ0.8ใฎ้•ใ„ใฏ๏ผŸ A:

  • >= 0.7: DBไฟๅญ˜๏ผˆ่‡ชๅทฑๆ‹กๅ……๏ผ‰โ† ใ‚„ใ‚„็ทฉใ‚
  • >= 0.8: ่จ“็ทดใƒ‡ใƒผใ‚ฟไฟๅญ˜ โ† ๅŽณใ—ใ‚๏ผˆ้ซ˜ๅ“่ณชใฎใฟ๏ผ‰
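
The two thresholds can be read as a small routing rule. The sketch below is illustrative only: `destinations` and its parameters are hypothetical names, and the real router additionally requires its `save_to_memory` flag before writing to the DB.

```python
DB_SAVE_THRESHOLD = 0.7        # from the guide: save to the knowledge DB
TRAINING_SAVE_THRESHOLD = 0.8  # from the guide: save as training data


def destinations(confidence: float, from_master: bool, db_hit: bool) -> list[str]:
    """Return where a response would be stored (hypothetical helper)."""
    targets = []
    # Self-expansion applies only when the DB had no matching knowledge.
    if not db_hit and confidence >= DB_SAVE_THRESHOLD:
        targets.append("knowledge_db")
    # Only master outputs become training data, and only above the stricter bar.
    if from_master and confidence >= TRAINING_SAVE_THRESHOLD:
        targets.append("training_data")
    return targets


print(destinations(0.75, from_master=True, db_hit=False))
print(destinations(0.85, from_master=True, db_hit=False))
```

A response at 0.75 is good enough to grow the DB but not to teach an apprentice; at 0.85 it qualifies for both.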

Q: Is no human check needed when AI-generated knowledge is saved to the DB?
A: At present it is saved automatically. Planned for the future:

  • A review flow by domain experts
  • Quality rating through community voting
  • Automatic verification by AI (cross-checking with a different AI)

3๏ธโƒฃ ๆจนๆœจๅž‹็ฉบ้–“่จ˜ๆ†ถ๏ผˆDendritic Memory Space๏ผ‰

ๆฏ”ๅ–ฉใฎๆ„ๅ‘ณ

ไบบ้–“ใฎ่„ณใฎๆจน็Šถ็ช่ตท๏ผˆใƒ‡ใƒณใƒ‰ใƒฉใ‚คใƒˆ๏ผ‰ใฎใ‚ˆใ†ใซใ€็Ÿฅ่ญ˜ใŒ็ฉบ้–“็š„ใซๆ•ด็†ใ•ใ‚Œใฆใ„ใ‚‹ใ€‚

้€šๅธธใฎDB:

็Ÿฅ่ญ˜1: ใ€Œๅฟƒ่‡“ใฏๅพช็’ฐๅ™จๅฎ˜ใงใ‚ใ‚‹ใ€
็Ÿฅ่ญ˜2: ใ€Œ่„ณใฏไธญๆžข็ฅž็ตŒ็ณปใฎไธ€้ƒจใงใ‚ใ‚‹ใ€
โ†’ ใƒใƒฉใƒใƒฉใซไฟๅญ˜๏ผˆ้–ข้€ฃๆ€งใŒไธๆ˜Ž๏ผ‰

ๆจนๆœจๅž‹็ฉบ้–“่จ˜ๆ†ถ:

็Ÿฅ่ญ˜1: ๅบงๆจ™ [0.2, 0.8, 0.3, 0.9, 0.7, 0.8]
็Ÿฅ่ญ˜2: ๅบงๆจ™ [0.3, 0.8, 0.4, 0.85, 0.65, 0.75]
โ†’ ่ฟ‘ใ„ๅบงๆจ™ = ๆ„ๅ‘ณ็š„ใซ้–ข้€ฃ โ†’ ไธ€็ท’ใซๆคœ็ดขใงใใ‚‹
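
To make "nearby" concrete, the distance between the two example tiles can be computed directly. This is a minimal worked example using only the standard library; the comparison against the largest possible distance assumes every axis lies in the 0.0-1.0 range.

```python
import math

tile_1 = [0.2, 0.8, 0.3, 0.9, 0.7, 0.8]      # "The heart is a circulatory organ"
tile_2 = [0.3, 0.8, 0.4, 0.85, 0.65, 0.75]   # "The brain is part of the central nervous system"

# 6-dimensional Euclidean distance between the two coordinate vectors
distance = math.dist(tile_1, tile_2)

# Largest possible distance if every axis lies in 0.0-1.0 (assumption)
max_distance = math.sqrt(6)

print(round(distance, 4))               # 0.1658
print(round(distance / max_distance, 3))
```

The squared differences sum to 0.0275, so the two tiles sit about 0.166 apart, a small fraction of the roughly 2.45 diagonal of the unit 6-cube.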

Details of the 6-dimensional coordinate system

Knowledge Tile coordinates = [x, y, z, c, g, v]
                              ───┬───  ───┬───
                       medical_space  meta_space

medical_space [x, y, z]: a domain-specific 3-dimensional space

Example: the medical domain

| Axis | Meaning | Example values |
|---|---|---|
| x | Anatomical location | 0.0 = nervous system, 0.5 = circulatory system, 1.0 = digestive system |
| y | Pathological category | 0.0 = infectious disease, 0.5 = metabolic disease, 1.0 = trauma |
| z | Treatment level | 0.0 = prevention, 0.5 = diagnosis, 1.0 = treatment |

meta_space [c, g, v]: a 3-dimensional space for meta information

| Axis | Meaning | Value range |
|---|---|---|
| c (Certainty) | Certainty | 0.0 = hypothesis, 0.5 = accepted theory, 1.0 = established fact |
| g (Granularity) | Granularity | 0.0 = overview, 0.5 = detailed, 1.0 = specialist |
| v (Verification) | Verification status | 0.0 = unverified, 0.5 = expert-reviewed, 1.0 = confirmed by multiple sources |
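
One way to keep the six axes from being mixed up in code is a small typed container. The sketch below is not part of NullAI: `TileCoordinates` and its properties are hypothetical, and the 0.0-1.0 range check is an assumption based on the example values in the tables.

```python
from dataclasses import astuple, dataclass


@dataclass(frozen=True)
class TileCoordinates:
    """Hypothetical container for one tile's [x, y, z, c, g, v] coordinates."""
    x: float  # domain axis 1 (anatomical location in the medical example)
    y: float  # domain axis 2 (pathological category)
    z: float  # domain axis 3 (treatment level)
    c: float  # certainty
    g: float  # granularity
    v: float  # verification status

    def __post_init__(self):
        # Assumption: every axis is normalized to the 0.0-1.0 range.
        for name, value in zip("xyzcgv", astuple(self)):
            if not 0.0 <= value <= 1.0:
                raise ValueError(f"{name}={value} is outside 0.0-1.0")

    @property
    def domain_space(self):
        return (self.x, self.y, self.z)

    @property
    def meta_space(self):
        return (self.c, self.g, self.v)


# Circulatory system, metabolic disease, diagnosis; established, overview-level, multi-source
heart = TileCoordinates(0.5, 0.5, 0.5, 1.0, 0.0, 1.0)
print(heart.domain_space, heart.meta_space)
```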

ๆคœ็ดขใฎไป•็ต„ใฟ

1. ใƒ†ใ‚ญใ‚นใƒˆๆคœ็ดข๏ผˆๅพ“ๆฅๅž‹๏ผ‰
def search_by_text(query):
    # ๅ˜็ด”ใชใ‚ญใƒผใƒฏใƒผใƒ‰ใƒžใƒƒใƒใƒณใ‚ฐ
    results = [tile for tile in all_tiles
               if query in tile.content]
    return results

ๅ•้กŒ็‚น: ๅŒ็พฉ่ชžใ‚’่ฆ‹้€ƒใ™

  • ใ€Œๅฟƒ่‡“็—…ใ€ใงๆคœ็ดขใ—ใฆใ‚‚ใ€Œๅพช็’ฐๅ™จ็–พๆ‚ฃใ€ใŒใƒ’ใƒƒใƒˆใ—ใชใ„
2. Coordinate search (spatial search)
def search_by_coordinates(query_coords, top_k=5):
    # Compute the 6-dimensional Euclidean distance to every tile
    distances = []
    for tile in all_tiles:
        dist = euclidean_distance(query_coords, tile.coords)
        distances.append((tile, dist))

    # Sort by ascending distance (nearest first)
    distances.sort(key=lambda x: x[1])
    return distances[:top_k]

Advantage: semantically close knowledge is found automatically

  • Nearby coordinates = semantically related
3. ใƒใ‚คใƒ–ใƒชใƒƒใƒ‰ๆคœ็ดข๏ผˆๆŽจๅฅจ๏ผ‰
def hybrid_search(query_text, query_coords=None, top_k=5):
    # ใƒ†ใ‚ญใ‚นใƒˆใƒžใƒƒใƒใ‚นใ‚ณใ‚ข่จˆ็ฎ—
    text_scores = calculate_text_match(query_text)

    # ๅบงๆจ™่ท้›ขใ‚นใ‚ณใ‚ข่จˆ็ฎ—
    if query_coords:
        spatial_scores = calculate_spatial_distance(query_coords)

    # ่ค‡ๅˆใ‚นใ‚ณใ‚ข = ฮฑ * text_score + ฮฒ * (1 - spatial_distance)
    combined_scores = 0.4 * text_scores + 0.6 * spatial_scores

    return top_k_results(combined_scores)
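
A small worked example shows how the 0.4 / 0.6 weighting plays out. The tile names, text scores, and distances below are invented for illustration, and the distances are assumed to be already normalized to 0.0-1.0.

```python
TEXT_WEIGHT = 0.4
SPATIAL_WEIGHT = 0.6

# Invented candidates: text-match score and normalized spatial distance
candidates = {
    "tile_a": {"text_score": 0.9, "distance": 1.0},  # strong keyword match, far away
    "tile_b": {"text_score": 0.2, "distance": 0.1},  # weak keyword match, very close
    "tile_c": {"text_score": 0.5, "distance": 0.5},
}


def combined(candidate):
    """Combined score = 0.4 * text_score + 0.6 * (1 - distance)."""
    spatial_score = 1.0 - candidate["distance"]
    return TEXT_WEIGHT * candidate["text_score"] + SPATIAL_WEIGHT * spatial_score


ranking = sorted(candidates, key=lambda name: combined(candidates[name]), reverse=True)
for name in ranking:
    print(name, round(combined(candidates[name]), 2))
```

tile_b scores 0.62, tile_c 0.50, and tile_a 0.36: with these weights, spatial proximity outranks a strong keyword match.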

.iath file format

.iath file structure:

┌──────────────────────────────────────────┐
│ Header (64 bytes)                        │  ← magic number, version
├──────────────────────────────────────────┤
│ Index (JSON, variable length)            │  ← list of tile IDs and offsets
│ {                                        │
│   "tiles": [                             │
│     {"id": "tile_001", "offset": 512},   │
│     {"id": "tile_002", "offset": 2048}   │
│   ]                                      │
│ }                                        │
├──────────────────────────────────────────┤
│ Data Section (zstd-compressed)           │
│   ┌──────────────────────┐               │
│   │ Tile 1 (JSON)        │               │
│   │ - metadata           │               │
│   │ - content            │               │
│   │ - coordinates        │               │
│   │ - verification       │               │
│   └──────────────────────┘               │
│   ┌──────────────────────┐               │
│   │ Tile 2 (JSON)        │               │
│   └──────────────────────┘               │
│   ...                                    │
└──────────────────────────────────────────┘

Why zstd compression?

  • High compression ratio (better than gzip)
  • Fast decompression
  • Developed at Facebook (a reliability signal)
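
The header / index / data layout can be sketched as a round trip. This is not the real .iath codec: the header field layout, the `IATH` magic value, the `size` index field, and offsets relative to the data section are all assumptions, and `zlib` stands in for zstd so the example runs with the standard library alone.

```python
import json
import struct
import zlib

HEADER_SIZE = 64
MAGIC = b"IATH"            # hypothetical magic number
HEADER_FORMAT = "<4sII"    # assumed layout: magic, version, index size


def pack_tiles(tiles: dict[str, dict]) -> bytes:
    """Serialize tiles as header + JSON index + compressed tile blobs."""
    index, data = [], b""
    for tile_id, tile in tiles.items():
        blob = zlib.compress(json.dumps(tile).encode("utf-8"))
        # Offsets here are relative to the start of the data section (assumption).
        index.append({"id": tile_id, "offset": len(data), "size": len(blob)})
        data += blob
    index_bytes = json.dumps({"tiles": index}).encode("utf-8")
    header = struct.pack(HEADER_FORMAT, MAGIC, 1, len(index_bytes))
    return header.ljust(HEADER_SIZE, b"\0") + index_bytes + data


def read_tile(buffer: bytes, tile_id: str):
    """Look up one tile through the index and decompress only that blob."""
    magic, _version, index_size = struct.unpack_from(HEADER_FORMAT, buffer, 0)
    if magic != MAGIC:
        raise ValueError("unexpected magic number")
    index = json.loads(buffer[HEADER_SIZE:HEADER_SIZE + index_size].decode("utf-8"))
    data_start = HEADER_SIZE + index_size
    for entry in index["tiles"]:
        if entry["id"] == tile_id:
            start = data_start + entry["offset"]
            blob = buffer[start:start + entry["size"]]
            return json.loads(zlib.decompress(blob).decode("utf-8"))
    return None


tiles = {
    "tile_001": {"content": "The heart is a circulatory organ",
                 "coordinates": [0.2, 0.8, 0.3, 0.9, 0.7, 0.8]},
    "tile_002": {"content": "The brain is part of the central nervous system",
                 "coordinates": [0.3, 0.8, 0.4, 0.85, 0.65, 0.75]},
}
buffer = pack_tiles(tiles)
print(read_tile(buffer, "tile_002")["content"])
```

The point of the index is that a single tile can be read by seeking to its offset, without decompressing the rest of the data section.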

้‡่ฆใช่จญ่จˆๅˆคๆ–ญ

Q: ใชใœ6ๆฌกๅ…ƒ๏ผŸ 3ๆฌกๅ…ƒใ‚„10ๆฌกๅ…ƒใงใฏใƒ€ใƒก๏ผŸ A:

  • 3ๆฌกๅ…ƒ: ใƒ‰ใƒกใ‚คใƒณ็Ÿฅ่ญ˜ใ ใ‘ใงใƒกใ‚ฟๆƒ…ๅ ฑใŒ่กจ็พใงใใชใ„
  • 10ๆฌกๅ…ƒไปฅไธŠ: ๆฌกๅ…ƒใฎๅ‘ชใ„๏ผˆๆคœ็ดขใŒ้…ใใชใ‚‹๏ผ‰ใ€ไบบ้–“ใŒ็†่งฃไธ่ƒฝ
  • 6ๆฌกๅ…ƒ: ใƒ‰ใƒกใ‚คใƒณ(3) + ใƒกใ‚ฟ(3) = ใƒใƒฉใƒณใ‚นใŒ่‰ฏใ„

Q: ๅบงๆจ™ใฏ่ชฐใŒๆฑบใ‚ใ‚‹ใฎใ‹๏ผŸ A:

  • ็พ็Šถ: ไบบ้–“ใŒๆ‰‹ๅ‹•ใง่จญๅฎš๏ผˆdendritic-memory-editorใง๏ผ‰
  • Priority 2ใงๅฎŸ่ฃ…ไบˆๅฎš: AIใŒ่‡ชๅ‹•ๆŽจๅฎš๏ผˆDeepSeekใŒๅบงๆจ™ใ‚’็”Ÿๆˆ๏ผ‰

Q: .iathใจFAISS๏ผˆใƒ™ใ‚ฏใƒˆใƒซDB๏ผ‰ใฎ้•ใ„ใฏ๏ผŸ A:

็‰นๅพด .iath FAISS
ๅบงๆจ™ๆฌกๅ…ƒ 6ๆฌกๅ…ƒ๏ผˆไบบ้–“ใŒ็†่งฃๅฏ่ƒฝ๏ผ‰ 768ๆฌกๅ…ƒ๏ผˆEmbeddingใƒขใƒ‡ใƒซไพๅญ˜๏ผ‰
ๆคœ็ดข้€Ÿๅบฆ O(n) ็ทšๅฝขๆŽข็ดข O(log n) ้ซ˜้€Ÿ
ๆ„ๅ‘ณใฎ้€ๆ˜Žๆ€ง ้ซ˜ใ„๏ผˆๅบงๆจ™ใฎๆ„ๅ‘ณใŒๆ˜Ž็ขบ๏ผ‰ ไฝŽใ„๏ผˆใƒ–ใƒฉใƒƒใ‚ฏใƒœใƒƒใ‚ฏใ‚น๏ผ‰
็ทจ้›†ๅฎนๆ˜“ๆ€ง ้ซ˜ใ„๏ผˆๅบงๆจ™ใ‚’ๆ‰‹ๅ‹•่ชฟๆ•ดๅฏ่ƒฝ๏ผ‰ ไฝŽใ„๏ผˆๅ†Embeddingๅฟ…่ฆ๏ผ‰

็ต่ซ–: .iathใฏใ€Œไบบ้–“ใŒ็†่งฃใƒป็ทจ้›†ใงใใ‚‹็Ÿฅ่ญ˜ใƒ™ใƒผใ‚นใ€ใ‚’้‡่ฆ–

4๏ธโƒฃ ใƒญใƒผใ‚ซใƒซใƒ•ใ‚กใƒผใ‚นใƒˆ & ใƒฏใƒณใ‚ณใƒžใƒณใƒ‰ใ‚ปใƒƒใƒˆใ‚ขใƒƒใƒ—

่จญ่จˆๆ€ๆƒณ

โŒ ๆ‚ชใ„ไพ‹๏ผˆใ‚ฏใƒฉใ‚ฆใƒ‰ไพๅญ˜๏ผ‰:
pip install nullai
nullai --api-key=YOUR_OPENAI_KEY  # ใ‚ฏใƒฉใ‚ฆใƒ‰APIๅฟ…้ ˆ
โ†’ ใ‚คใƒณใ‚ฟใƒผใƒใƒƒใƒˆๅฟ…้ ˆใ€ใ‚ณใ‚นใƒˆ้ซ˜ใ€ใƒ—ใƒฉใ‚คใƒใ‚ทใƒผๆ‡ธๅฟต

โœ… NullAI:
./start_null_ai.sh  # ใƒญใƒผใ‚ซใƒซใงๅฎŒ็ต
โ†’ ใ‚ชใƒ•ใƒฉใ‚คใƒณๅฏ่ƒฝใ€็„กๆ–™ใ€ใƒ—ใƒฉใ‚คใƒใ‚ทใƒผไฟ่ญท

How the one-command setup works

What start_null_ai.sh does automatically:

  1. ✅ Checks dependencies (Python, Node.js, Ollama)
  2. ✅ Creates a virtual environment (venv)
  3. ✅ Installs Python dependencies
  4. ✅ Installs Node.js dependencies
  5. ✅ Initializes the database (sql_app.db)
  6. ✅ Starts Ollama
  7. ✅ Starts the backend (port 8000)
  8. ✅ Starts the frontend (port 5173)
  9. ✅ Confirms that the .iath memory has loaded

All the user has to do: run ./start_null_ai.sh
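
Step 1 (the dependency check) is easy to illustrate. The sketch below is a Python rendering of the idea, not the contents of start_null_ai.sh; the executable names in `REQUIRED_COMMANDS` are assumptions.

```python
import shutil

# Assumed executable names for the three dependencies the guide lists
REQUIRED_COMMANDS = ["python3", "node", "ollama"]


def missing_commands(commands):
    """Return the commands that cannot be found on PATH."""
    return [command for command in commands if shutil.which(command) is None]


missing = missing_commands(REQUIRED_COMMANDS)
if missing:
    print("Missing dependencies:", ", ".join(missing))
else:
    print("All dependencies found")
```

Failing early with a list of what is missing keeps the later steps (venv creation, installs, server startup) from producing harder-to-read errors.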

้‡่ฆใช่จญ่จˆๅˆคๆ–ญ

Q: ใชใœOllamaใ‚’ไฝฟใ†ใฎใ‹๏ผŸ HuggingFaceใ ใ‘ใงใฏใƒ€ใƒก๏ผŸ A:

  • Ollama: ใƒขใƒ‡ใƒซ็ฎก็†ใŒๆฅฝ๏ผˆollama pull deepseek-r1ใ ใ‘๏ผ‰
  • HuggingFace: ๆ‰‹ๅ‹•ใงใƒ€ใ‚ฆใƒณใƒญใƒผใƒ‰ใ€ใƒ‘ใ‚นๆŒ‡ๅฎšใŒ้ขๅ€’

Q: ใชใœDockerใ‚’ไฝฟใ‚ใชใ„ใฎใ‹๏ผŸ A:

  • Docker: ๅˆๅฟƒ่€…ใซใฏ้›ฃใ—ใ„ใ€GPUใƒ‘ใ‚นใ‚นใƒซใƒผใŒ่ค‡้›‘
  • ใ‚ทใ‚งใƒซใ‚นใ‚ฏใƒชใƒ—ใƒˆ: ใ‚ทใƒณใƒ—ใƒซใ€ใƒ‡ใƒใƒƒใ‚ฐใ—ใ‚„ใ™ใ„ใ€ใ‚ซใ‚นใ‚ฟใƒžใ‚คใ‚บๅฎนๆ˜“

System architecture overview

┌─────────────────────────────────────────────────────────────────┐
│                        Frontend (React + TypeScript)            │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐           │
│  │ Engine       │  │ Inference    │  │ Training     │           │
│  │ Manager      │  │ Panel        │  │ Dashboard    │           │
│  └──────────────┘  └──────────────┘  └──────────────┘           │
│         │                  │                  │                 │
│         └──────────────────┼──────────────────┘                 │
│                            │                                    │
│                     HTTP/WebSocket                              │
│                            │                                    │
└────────────────────────────┼────────────────────────────────────┘
                             │
┌────────────────────────────┼────────────────────────────────────┐
│                  Backend (FastAPI)                              │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐           │
│  │ config.py    │  │ questions.py │  │ training.py  │           │
│  │ (Engine API) │  │ (Inference)  │  │ (Fine-tune)  │           │
│  └──────────────┘  └──────────────┘  └──────────────┘           │
│         │                  │                  │                 │
│         └──────────────────┼──────────────────┘                 │
│                            │                                    │
└────────────────────────────┼────────────────────────────────────┘
                             │
┌────────────────────────────┼────────────────────────────────────┐
│                  NullAI Core Logic                              │
│  ┌────────────────────────────────────────────────┐             │
│  │           model_router.py                      │             │
│  │  - RAG inference integration                   │             │
│  │  - Master output saving                        │             │
│  │  - Engine management (swap, promote)           │             │
│  └────────────────────────────────────────────────┘             │
│         │                  │                  │                 │
│         ▼                  ▼                  ▼                 │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐           │
│  │ iath_memory  │  │ llm_providers│  │ fine_tuning  │           │
│  │ .py          │  │ .py          │  │ .py          │           │
│  │ (6D Search)  │  │ (4 Providers)│  │ (PEFT/Unslo) │           │
│  └──────────────┘  └──────────────┘  └──────────────┘           │
│         │                  │                  │                 │
└─────────┼──────────────────┼──────────────────┼─────────────────┘
          │                  │                  │
          ▼                  ▼                  ▼
┌─────────────────────────────────────────────────────────────────┐
│                    External Services                            │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐           │
│  │ knowledge_   │  │ Ollama       │  │ HuggingFace  │           │
│  │ base.iath    │  │ (localhost)  │  │ Models       │           │
│  │ (6D Memory)  │  │              │  │              │           │
│  └──────────────┘  └──────────────┘  └──────────────┘           │
│                                                                 │
│  ┌──────────────┐  ┌──────────────────────────────────┐         │
│  │ sql_app.db   │  │ training_data/                   │         │
│  │ (SQLite)     │  │  - master_outputs/*.jsonl        │         │
│  │              │  │  - checkpoints/apprentice_*/     │         │
│  └──────────────┘  └──────────────────────────────────┘         │
└─────────────────────────────────────────────────────────────────┘

ๅ„ใ‚ทใ‚นใƒ†ใƒ ใฎ่ฉณ็ดฐ่งฃ่ชฌ

ModelRouter (null_ai/model_router.py)

ๅฝนๅ‰ฒ

NullAIใฎ้ ญ่„ณใ€‚ๅ…จใฆใฎๆŽจ่ซ–ใƒชใ‚ฏใ‚จใ‚นใƒˆใ‚’็ฎก็†ใ€‚

ไธป่ฆใƒกใ‚ฝใƒƒใƒ‰่ฉณ็ดฐ

__init__()
def __init__(self, config_manager):
    self.config_manager = config_manager
    self.master_model = None        # ๅธซๅŒ ใƒขใƒ‡ใƒซ
    self.apprentice_model = None    # ๅผŸๅญใƒขใƒ‡ใƒซ
    self.dendritic_memory = None    # .iath็ฉบ้–“่จ˜ๆ†ถ

    # .iathใƒ•ใ‚กใ‚คใƒซใฎใƒญใƒผใƒ‰
    self._load_dendritic_memory()

้‡่ฆ: ๅˆๆœŸๅŒ–ๆ™‚ใซ่‡ชๅ‹•็š„ใซ.iathใ‚’ใƒญใƒผใƒ‰ โ†’ ่ตทๅ‹•ๆ™‚้–“ใŒ้•ทใใชใ‚‹ๅฏ่ƒฝๆ€ง

async def infer() - RAG-integrated inference
async def infer(self, prompt, domain_id, model_config, save_to_memory=False):
    # Step 1: check whether the DB holds relevant knowledge
    has_knowledge = self._check_db_knowledge(domain_id, prompt)

    if has_knowledge:
        # Step 2a: RAG inference
        knowledge = self._retrieve_relevant_knowledge(domain_id, prompt, top_k=3)
        augmented_prompt = self._build_rag_prompt(prompt, knowledge)
        response = await self._perform_llm_inference(model_config, augmented_prompt)
    else:
        # Step 2b: ordinary inference
        response = await self._perform_llm_inference(model_config, prompt)

        # Step 3: save the answer if it is high quality
        if save_to_memory and response["confidence"] >= 0.7:
            await self._save_inference_to_db(domain_id, prompt, response)

    # Step 4: if the output came from the master, save it as training data
    is_master = (self.master_model and
                 model_config.model_id == self.master_model.model_id)
    if is_master and response["confidence"] >= 0.8:
        await self._save_master_output_as_training_data(
            prompt, response["response"], domain_id, response["confidence"]
        )

    return response

Data-flow diagram:

prompt → check DB → found?
                     ├─ YES → retrieve knowledge
                     │         ↓
                     │      augment prompt
                     │         ↓
                     │      LLM inference → response
                     │
                     └─ NO → LLM inference → response
                                              ↓
                                          save_to_memory and confidence >= 0.7? → save to DB

Both branches: response from the master with confidence >= 0.8? → save as training data
_retrieve_relevant_knowledge() - hybrid search
def _retrieve_relevant_knowledge(self, domain_id, prompt, top_k=3):
    if not self.dendritic_memory:
        return []

    # Run the hybrid search
    results = self.dendritic_memory.hybrid_search(
        query_text=prompt,
        query_coords=None,  # coordinates are to be estimated as well in the future
        top_k=top_k,
        text_weight=0.4,    # weight of the text match
        spatial_weight=0.6  # weight of the spatial distance
    )

    # Convert to the Knowledge Tile format
    formatted_knowledge = []
    for tile in results:
        formatted_knowledge.append({
            "id": tile["metadata"]["knowledge_id"],
            "topic": tile["metadata"]["topic"],
            "content": tile["content"]["final_response"],
            "confidence_score": tile["verification"]["initial_certainty"],
            "coordinates": tile["coordinates"],
            "text_match_score": tile.get("text_match_score", 0),
            "spatial_distance": tile.get("spatial_distance", None)
        })

    return formatted_knowledge
_save_master_output_as_training_data() - saving training data
async def _save_master_output_as_training_data(
    self, prompt, response, domain_id, confidence
):
    # Build the record in Alpaca format
    training_example = {
        "instruction": f"You are an expert in {domain_id}. Provide accurate information based on verified knowledge.",
        "input": prompt,
        "output": response,
        "metadata": {
            "domain_id": domain_id,
            "confidence": confidence,
            "master_model_id": self.master_model.model_id,
            "timestamp": datetime.utcnow().isoformat(),
            "source": "master_output"
        }
    }

    # Append to the per-domain JSONL file
    output_file = f"training_data/master_outputs/master_outputs_{domain_id}.jsonl"
    with open(output_file, 'a', encoding='utf-8') as f:
        f.write(json.dumps(training_example, ensure_ascii=False) + '\n')

Why JSONL (newline-delimited JSON)?

  • It can be processed as a stream (read one line at a time)
  • Damage from file corruption stays minimal
  • Compatible with HuggingFace datasets
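
The first two points can be shown with a few lines of standard-library code. This is a minimal sketch: `iter_jsonl` is a hypothetical helper, and the sample records are invented, with the second line deliberately truncated to simulate corruption.

```python
import io
import json


def iter_jsonl(lines):
    """Yield one parsed record per line, skipping lines that fail to parse."""
    for line_number, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue
        try:
            yield json.loads(line)
        except json.JSONDecodeError:
            print(f"skipping corrupt line {line_number}")


# Invented sample: the second record is truncated, as after an interrupted write
sample = io.StringIO(
    '{"input": "q1", "output": "a1"}\n'
    '{"input": "q2", "outp\n'
    '{"input": "q3", "output": "a3"}\n'
)
records = list(iter_jsonl(sample))
print([record["input"] for record in records])
```

Only the damaged line is lost; with a single large JSON array, the same truncation would make the whole file unparseable.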
Engine management methods
def promote_apprentice(self, apprentice_model_id):
    """Promote the apprentice to master."""
    # Retire the current master
    old_master = self.master_model

    # Promote the apprentice to master
    self.master_model = self.apprentice_model

    # Clear the apprentice slot
    self.apprentice_model = None

    # Persist the settings
    self.config_manager.save_active_engines(
        self.master_model.model_id, None
    )

def swap_engines(self):
    """Swap the master and the apprentice."""
    temp = self.master_model
    self.master_model = self.apprentice_model
    self.apprentice_model = temp

    self.config_manager.save_active_engines(
        self.master_model.model_id,
        self.apprentice_model.model_id if self.apprentice_model else None
    )

def create_new_apprentice(self, base_model_id):
    """Create a new, empty apprentice."""
    # Copy the base model's config and give the copy a new ID
    # (assumes `import copy` and `from datetime import datetime` at module level;
    #  the timestamp format is an assumption)
    timestamp = datetime.utcnow().strftime("%Y%m%d%H%M%S")
    new_apprentice_id = f"{base_model_id}_apprentice_{timestamp}"

    # Add it to the settings
    self.apprentice_model = copy.copy(self.config_manager.get_model_config(base_model_id))
    self.apprentice_model.model_id = new_apprentice_id

    return new_apprentice_id

DendriticMemorySpace (null_ai/iath_memory.py)

Role

Provides .iath file loading and 6-dimensional spatial search.

Class structure

class IathDecoder:
    """
    Low-level decoder for .iath files
    Fully compatible with dendritic-memory-editor
    """
    def __init__(self, iath_file_path):
        self.file_path = Path(iath_file_path)
        self.header = None
        self.index = []
        self._load_header_and_index()

    def _load_header_and_index(self):
        """Read the header and the index."""
        with open(self.file_path, 'rb') as f:
            # Header (64 bytes)
            header_bytes = f.read(64)
            self.header = self._parse_header(header_bytes)

            # Index (JSON)
            index_size = self.header["index_size"]
            index_bytes = f.read(index_size)
            self.index = json.loads(index_bytes.decode('utf-8'))

    def get_tile_by_id(self, knowledge_id):
        """Fetch a tile by its ID."""
        # Look up the tile's offset in the index
        tile_info = next(
            (t for t in self.index["tiles"] if t["id"] == knowledge_id),
            None
        )
        if not tile_info:
            return None

        # Seek to the tile's position in the file
        with open(self.file_path, 'rb') as f:
            f.seek(tile_info["offset"])
            compressed_data = f.read(tile_info["size"])

            # Decompress with zstd
            decompressed = zstandard.decompress(compressed_data)
            tile_data = json.loads(decompressed.decode('utf-8'))

            return tile_data


class DendriticMemorySpace:
    """
    6-dimensional spatial memory system
    High-level API
    """
    def __init__(self, iath_file_path):
        self.decoder = IathDecoder(iath_file_path)
        self.all_tiles = []
        self.coordinates_matrix = None  # NumPy matrix
        self._load_all_tiles()

    def _load_all_tiles(self):
        """Load every tile into memory."""
        self.all_tiles = self.decoder.get_all_tiles()

        # Build the coordinate matrix (for fast search)
        coords_list = [tile["coordinates"] for tile in self.all_tiles]
        self.coordinates_matrix = np.array(coords_list)  # Shape: (N, 6)

Search algorithms in detail

Coordinate search (6-dimensional Euclidean distance)
def search_by_coordinates(self, query_coords, top_k=5):
    """
    Nearest-neighbor search in the 6-dimensional space

    Formula: distance = sqrt(sum((q_i - t_i)^2))
    where:
        q_i = i-th element of the query coordinates
        t_i = i-th element of the tile coordinates
        i = 0..5 (6 dimensions)
    """
    query_vector = np.array(query_coords)  # Shape: (6,)

    # Compute the distance to every tile at once (NumPy vectorization)
    # Broadcasting: (N, 6) - (6,) → (N, 6)
    distances = np.linalg.norm(
        self.coordinates_matrix - query_vector,
        axis=1  # one distance per row (tile)
    )  # Shape: (N,)

    # Sort by distance and keep the top_k nearest
    sorted_indices = np.argsort(distances)[:top_k]

    # Build the results
    results = []
    for idx in sorted_indices:
        tile = self.all_tiles[idx].copy()
        tile["spatial_distance"] = float(distances[idx])
        results.append(tile)

    return results

Complexity: O(N), proportional to the total number of tiles N (linear scan)

Optimization ideas (not implemented):

  • KD-Tree: O(log N), but of limited benefit in 6 dimensions
  • Ball-Tree: relatively effective even in higher dimensions
  • Approximate nearest-neighbor search (Annoy, HNSW): extremely fast, at the cost of accuracy
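
Short of a tree index, a lighter tweak is to avoid sorting all N distances when only the top k are needed. The sketch below is not one of the options listed above and is not NullAI code: it still scans every tile, but uses `heapq.nsmallest`, which runs in O(N log k). The tile dictionaries and the query are invented for illustration.

```python
import heapq
import math


def nearest_tiles(query_coords, tiles, top_k=5):
    """Return the top_k tiles closest to query_coords, nearest first."""
    return heapq.nsmallest(
        top_k,
        tiles,
        key=lambda tile: math.dist(query_coords, tile["coordinates"]),
    )


tiles = [
    {"id": "tile_001", "coordinates": [0.2, 0.8, 0.3, 0.9, 0.7, 0.8]},
    {"id": "tile_002", "coordinates": [0.3, 0.8, 0.4, 0.85, 0.65, 0.75]},
    {"id": "tile_003", "coordinates": [0.9, 0.1, 0.9, 0.2, 0.1, 0.1]},
]
query = [0.25, 0.8, 0.35, 0.9, 0.7, 0.8]
print([tile["id"] for tile in nearest_tiles(query, tiles, top_k=2)])
```

With NumPy already in use, `np.argpartition` would play the same role for the vectorized version.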
ใƒใ‚คใƒ–ใƒชใƒƒใƒ‰ๆคœ็ดข๏ผˆใƒ†ใ‚ญใ‚นใƒˆ + ๅบงๆจ™๏ผ‰
def hybrid_search(
    self,
    query_text,
    query_coords=None,
    top_k=5,
    text_weight=0.4,
    spatial_weight=0.6
):
    """
    Composite scoring that combines text match and spatial distance.
    """
    # Step 1: text-match score for every tile
    text_scores = np.array([
        self._calculate_text_match(query_text, tile)
        for tile in self.all_tiles
    ])  # Shape: (N,)

    # Step 2: spatial score
    # Compare against None so that a NumPy array can be passed as query_coords.
    has_coords = query_coords is not None
    if has_coords:
        spatial_distances = np.linalg.norm(
            self.coordinates_matrix - np.asarray(query_coords),
            axis=1
        )
        # Map distance to a 0-1 score: 1 at distance 0, 0 at the farthest tile
        max_dist = spatial_distances.max()
        if max_dist > 0:
            spatial_scores = 1.0 - (spatial_distances / max_dist)
        else:
            # Every tile sits exactly on the query point
            spatial_scores = np.ones(len(self.all_tiles))
    else:
        spatial_scores = np.zeros(len(self.all_tiles))

    # Step 3: composite score
    combined_scores = (
        text_weight * text_scores +
        spatial_weight * spatial_scores
    )

    # Step 4: sort by score, highest first
    sorted_indices = np.argsort(combined_scores)[::-1][:top_k]

    # Build the result list
    results = []
    for idx in sorted_indices:
        tile = self.all_tiles[idx].copy()
        tile["text_match_score"] = float(text_scores[idx])
        tile["spatial_score"] = float(spatial_scores[idx])
        tile["combined_score"] = float(combined_scores[idx])
        if has_coords:
            tile["spatial_distance"] = float(spatial_distances[idx])
        results.append(tile)

    return results
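To make the scoring concrete, here is a small worked example of the same blend in plain Python. It assumes the default weights (0.4 text, 0.6 spatial) and uses invented scores and distances for three tiles; combine_scores is an illustrative helper, not a function from the code base.

```python
def combine_scores(text_scores, distances, text_weight=0.4, spatial_weight=0.6):
    """Blend text scores with distance-derived spatial scores (higher is better)."""
    max_dist = max(distances)
    if max_dist > 0:
        # 1.0 at distance 0, 0.0 at the farthest tile
        spatial_scores = [1.0 - d / max_dist for d in distances]
    else:
        spatial_scores = [1.0] * len(distances)
    return [
        text_weight * t + spatial_weight * s
        for t, s in zip(text_scores, spatial_scores)
    ]


# Hypothetical values for three tiles
text_scores = [0.5, 0.0, 1.0]
distances = [0.0, 1.0, 2.0]

combined = combine_scores(text_scores, distances)
ranking = sorted(range(len(combined)), key=lambda i: combined[i], reverse=True)
print(combined, ranking)
```

Tile 0 wins because it sits on the query point, even though tile 2 has a perfect text match. Note that the farthest tile always receives a spatial score of 0, so spatial scores are relative to the current tile set rather than absolute.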

def _calculate_text_match(self, query, tile):
    """
    Text-match score (simplified version).

    BM25 or TF-IDF is planned as a future replacement.
    """
    query_lower = query.lower()
    content = tile["content"]["final_response"].lower()
    topic = tile["metadata"]["topic"].lower()

    # Keyword matching on whitespace-separated tokens
    query_words = set(query_lower.split())
    content_words = set(content.split())
    topic_words = set(topic.split())

    def jaccard(a, b):
        # Jaccard similarity |a ∩ b| / |a ∪ b|; 0.0 when both sets are empty
        union = a | b
        return len(a & b) / len(union) if union else 0.0

    content_jaccard = jaccard(query_words, content_words)
    topic_jaccard = jaccard(query_words, topic_words)

    # Composite score (topic weighted more heavily)
    score = 0.3 * content_jaccard + 0.7 * topic_jaccard

    return score
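A worked example of the Jaccard-based score may help when tuning the 0.3 / 0.7 weights. The sketch below uses invented English strings. It also shows a limitation of whitespace tokenization: Japanese text without spaces becomes a single token, so it only matches when the strings are identical.

```python
def jaccard(a, b):
    """Jaccard similarity of two sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


query_words = set("heart function".split())
topic_words = set("heart anatomy".split())
content_words = set("the heart pumps blood".split())

topic_j = jaccard(query_words, topic_words)      # 1 shared word out of 3
content_j = jaccard(query_words, content_words)  # 1 shared word out of 5
score = 0.3 * content_j + 0.7 * topic_j
print(round(score, 4))

# Whitespace splitting yields one token for unsegmented Japanese text
japanese_tokens = "心臓の働き".split()
print(japanese_tokens)
```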

FineTuningManager (null_ai/fine_tuning.py)

Role

Runs fine-tuning of the apprentice model.

Details of the PEFT (QLoRA) method

# Imports required by this excerpt
import torch
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

async def fine_tune_with_huggingface_peft(
    self,
    model_name,
    training_examples,
    output_dir,
    epochs=3,
    learning_rate=2e-4,
    batch_size=4,
    lora_r=8,
    lora_alpha=16
):
    """
    Parameter-Efficient Fine-Tuning with QLoRA

    QLoRA = Quantized LoRA
    - 4-bit quantization reduces memory use
    - LoRA reduces the number of trainable parameters
    → even a 12GB GPU can train a 7B model
    """

    # Step 1: Load the model with 4-bit quantization
    bnb_config = BitsAndBytesConfig(
        load_in_4bit=True,               # 4-bit quantization
        bnb_4bit_quant_type="nf4",       # NormalFloat4 (suited to normally distributed weights)
        bnb_4bit_compute_dtype=torch.float16,  # compute in fp16
        bnb_4bit_use_double_quant=True   # double quantization (further memory savings)
    )

    model = AutoModelForCausalLM.from_pretrained(
        model_name,
        quantization_config=bnb_config,
        device_map="auto"  # place layers on GPU/CPU automatically
    )

    # Step 2: LoRA configuration
    lora_config = LoraConfig(
        r=lora_r,                        # LoRA rank (lower is lighter)
        lora_alpha=lora_alpha,           # scaling factor
        target_modules=[                 # layers that receive LoRA (names vary by architecture)
            "q_proj", "k_proj", "v_proj", "o_proj",  # Attention
            "gate_proj", "up_proj", "down_proj"      # MLP
        ],
        lora_dropout=0.05,               # dropout rate
        bias="none",                     # biases are not trained
        task_type="CAUSAL_LM"            # task type
    )

    model = get_peft_model(model, lora_config)

    # Print the number of trainable parameters
    model.print_trainable_parameters()
    # Example: trainable params: 4.2M || all params: 2.7B || trainable%: 0.16%
    #     → only 0.16% of all parameters are trained

    # Steps 3-9: data preparation, training, saving (omitted)
    ...

How QLoRA works:

Regular fine-tuning:
┌─────────────────────────────┐
│ Full model (2.7B params)    │ ← everything is trained
│ Memory: ~40GB               │
└─────────────────────────────┘

QLoRA:
┌─────────────────────────────┐
│ Base model (2.7B params)    │ ← 4-bit quantized, frozen (not trained)
│ Memory: ~7GB                │
└─────────────────────────────┘
              +
┌─────────────────────────────┐
│ LoRA adapter (4.2M)         │ ← only this is trained
│ Memory: ~0.5GB              │
└─────────────────────────────┘
              =
   Total memory: ~12GB
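The small adapter size follows from how LoRA factorizes a weight update: for a layer of shape d_out × d_in, it trains two matrices of shape r × d_in and d_out × r instead of the full matrix. The sketch below works through that arithmetic for one layer. The layer width of 2560 is a hypothetical value chosen for illustration; the actual adapter total depends on the model's layer shapes and on target_modules.

```python
def lora_param_count(d_in, d_out, r):
    """Parameters added by one LoRA pair: A is (r x d_in), B is (d_out x r)."""
    return r * (d_in + d_out)


# Hypothetical square projection layer with the default rank r=8
d_in = d_out = 2560
per_module = lora_param_count(d_in, d_out, r=8)
full_matrix = d_in * d_out

print(per_module)                # parameters trained by LoRA for this layer
print(full_matrix)               # parameters in the frozen weight matrix
print(per_module / full_matrix)  # fraction of the layer that is trainable
```

Doubling the rank doubles the adapter size, which is why lora_r is the main knob for trading adapter capacity against memory.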

Formatting Alpaca-style data

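def format_training_examples_for_model(
    self,
    training_examples,
    template="alpaca"
):
    """
    Format Alpaca-style examples into prompts for the model.
    """
    # Only the Alpaca template is handled below; fail early on anything else
    # rather than reaching an undefined `prompt` inside the loop.
    if template != "alpaca":
        raise ValueError(f"Unsupported template: {template}")

    formatted_prompts = []

    for example in training_examples:
        instruction = example["instruction"]
        input_text = example.get("input", "")
        output_text = example["output"]

        if template == "alpaca":
            if input_text: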
                prompt = f"""Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.

### Instruction:
{instruction}

### Input:
{input_text}

### Response:
{output_text}"""
            else:
                prompt = f"""Below is an instruction that describes a task. Write a response that appropriately completes the request.

### Instruction:
{instruction}

### Response:
{output_text}"""

        formatted_prompts.append(prompt)

    return formatted_prompts

Why this format?

  • Clear delimiters (###)
  • Improves instruction-following ability
  • A de facto standard in the open-source community
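For reference, here is a minimal sketch of reading Alpaca-style records from JSONL text before they reach the formatter. The field names instruction, input, and output come from the code above. The parse_jsonl helper, its validation rules, and the sample records are assumptions made for illustration.

```python
import json

REQUIRED_KEYS = ("instruction", "output")


def parse_jsonl(text):
    """Parse JSONL text into Alpaca-style dicts, defaulting "input" to ""."""
    examples = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # skip blank lines
        record = json.loads(line)
        missing = [key for key in REQUIRED_KEYS if key not in record]
        if missing:
            raise ValueError(f"line {line_no}: missing {missing}")
        record.setdefault("input", "")
        examples.append(record)
    return examples


# Hypothetical records, one JSON object per line
sample = "\n".join([
    json.dumps({"instruction": "Explain how the heart works.",
                "output": "The heart pumps blood through the body."}),
    json.dumps({"instruction": "Summarize the text.",
                "input": "The heart has four chambers.",
                "output": "Four chambers."}),
])

examples = parse_jsonl(sample)
print(len(examples), examples[0]["input"] == "")
```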

Complete data flow diagrams

Flow 1: Normal inference (with RAG)

┌────────────────────────────────────────────────────────────┐
│ User: "Tell me how the heart works"                        │
└────────────────────┬───────────────────────────────────────┘
                     ↓
┌────────────────────────────────────────────────────────────┐
│ Frontend: InferencePanel.tsx                               │
│  - Sends the question to the backend                       │
└────────────────────┬───────────────────────────────────────┘
                     ↓ HTTP POST /api/questions
┌────────────────────────────────────────────────────────────┐
│ Backend: questions.py                                      │
│  - InferenceService.ask_question()                         │
└────────────────────┬───────────────────────────────────────┘
                     ↓
┌────────────────────────────────────────────────────────────┐
│ NullAI Core: model_router.py                               │
│  ┌──────────────────────────────────────────────────────┐  │
│  │ Step 1: _check_db_knowledge("medical", "heart...")   │  │
│  │  → DendriticMemorySpace.search_by_text()             │  │
│  │  → Result: 3 matches found                           │  │
│  └──────────────────────────────────────────────────────┘  │
│                    ↓                                       │
│  ┌──────────────────────────────────────────────────────┐  │
│  │ Step 2: _retrieve_relevant_knowledge()               │  │
│  │  → hybrid_search("how the heart works", top_k=3)     │  │
│  │  → Retrieved:                                        │  │
│  │    [1] Anatomy of the heart (score: 0.92)            │  │
│  │    [2] Cardiovascular system function (score: 0.85)  │  │
│  │    [3] Classification of heart disease (score: 0.73) │  │
│  └──────────────────────────────────────────────────────┘  │
│                    ↓                                       │
│  ┌──────────────────────────────────────────────────────┐  │
│  │ Step 3: Prompt augmentation                          │  │
│  │  augmented_prompt = """                              │  │
│  │  Based on the following verified knowledge:          │  │
│  │                                                      │  │
│  │  [Knowledge 1 - expert verification, conf: 0.9]      │  │
│  │  Topic: Anatomy of the heart                         │  │
│  │  Content: The heart consists of four chambers...     │  │
│  │                                                      │  │
│  │  [Knowledge 2 - ...]                                 │  │
│  │                                                      │  │
│  │  Now, please answer:                                 │  │
│  │  Tell me how the heart works                         │  │
│  │  """                                                 │  │
│  └──────────────────────────────────────────────────────┘  │
│                    ↓                                       │
│  ┌──────────────────────────────────────────────────────┐  │
│  │ Step 4: LLM inference                                │  │
│  │  → llm_providers.py                                  │  │
│  │  → OllamaProvider.infer()                            │  │
│  │  → model: deepseek-r1:1.5b                           │  │
│  └──────────────────────────────────────────────────────┘  │
│                    ↓                                       │
│  ┌──────────────────────────────────────────────────────┐  │
│  │ Step 5: Response generation                          │  │
│  │  response = {                                        │  │
│  │    "response": "The heart is the central organ...",  │  │
│  │    "confidence": 0.88,                               │  │
│  │    "thinking": "Answered from verified knowledge",   │  │
│  │    "retrieved_knowledge": [...]                      │  │
│  │  }                                                   │  │
│  └──────────────────────────────────────────────────────┘  │
│                    ↓                                       │
│  ┌──────────────────────────────────────────────────────┐  │
│  │ Step 6: Is this a master output?                     │  │
│  │  is_master = True                                    │  │
│  │  confidence = 0.88 >= 0.8 ✓                          │  │
│  │  → _save_master_output_as_training_data()            │  │
│  │  → training_data/master_outputs/medical.jsonl        │  │
│  └──────────────────────────────────────────────────────┘  │
└────────────────────┬───────────────────────────────────────┘
                     ↓
┌────────────────────────────────────────────────────────────┐
│ Frontend: response display                                 │
│  - ResponseDisplay.tsx                                     │
│  - "The heart is the central organ..."                     │
│  - Shows the Retrieved Knowledge badge                     │
└────────────────────────────────────────────────────────────┘

Flow 2: Normal inference (no RAG, self-expansion)

┌────────────────────────────────────────────────────────────┐
│ User: "What are the principles of quantum computers?"      │
└────────────────────┬───────────────────────────────────────┘
                     ↓
              (same as above)
                     ↓
┌────────────────────────────────────────────────────────────┐
│ NullAI Core: model_router.py                               │
│  ┌──────────────────────────────────────────────────────┐  │
│  │ Step 1: _check_db_knowledge("general", "quantum...") │  │
│  │  → DendriticMemorySpace.search_by_text()             │  │
│  │  → Result: nothing found ❌                          │  │
│  └──────────────────────────────────────────────────────┘  │
│                    ↓                                       │
│  ┌──────────────────────────────────────────────────────┐  │
│  │ Step 2: Inference from the AI's internal knowledge   │  │
│  │  → LLM.generate("What are the principles...")        │  │
│  │  → model: deepseek-r1:1.5b                           │  │
│  └──────────────────────────────────────────────────────┘  │
│                    ↓                                       │
│  ┌──────────────────────────────────────────────────────┐  │
│  │ Step 3: Response generation                          │  │
│  │  response = {                                        │  │
│  │    "response": "A quantum computer is...",           │  │
│  │    "confidence": 0.75                                │  │
│  │  }                                                   │  │
│  └──────────────────────────────────────────────────────┘  │
│                    ↓                                       │
│  ┌──────────────────────────────────────────────────────┐  │
│  │ Step 4: Self-expansion (save to DB)                  │  │
│  │  confidence = 0.75 >= 0.7 ✓                          │  │
│  │  save_to_memory = True                               │  │
│  │  → _save_inference_to_db()                           │  │
│  │  → SQLite: knowledge_tiles table                     │  │
│  │     (to be saved to .iath as well in the future)     │  │
│  └──────────────────────────────────────────────────────┘  │
│                    ↓                                       │
│  ┌──────────────────────────────────────────────────────┐  │
│  │ Step 5: Save as master output                        │  │
│  │  is_master = True                                    │  │
│  │  confidence = 0.75 < 0.8 ❌                          │  │
│  │  → Not saved as training data                        │  │
│  └──────────────────────────────────────────────────────┘  │
└────────────────────┬───────────────────────────────────────┘
                     ↓
            (response display)

Key difference:

  • With RAG: the answer is saved as training data when confidence >= 0.8
  • Without RAG: the answer is saved to the DB when confidence >= 0.7, and as training data when confidence >= 0.8
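The two thresholds can be read as a small decision function. The sketch below restates the rules from Flow 1 and Flow 2. The thresholds 0.7 and 0.8 come from the diagrams, while the function decide_saves and its return shape are illustrative and do not mirror the actual model_router.py code.

```python
DB_SAVE_THRESHOLD = 0.7        # self-expansion threshold (Flow 2)
TRAINING_SAVE_THRESHOLD = 0.8  # master-output threshold (Flows 1 and 2)


def decide_saves(confidence, used_rag, is_master):
    """Return which stores an answer is written to."""
    return {
        # Self-expansion applies only when the DB had no matching knowledge
        "save_to_db": (not used_rag) and confidence >= DB_SAVE_THRESHOLD,
        # Training data is collected only from master outputs
        "save_as_training_data": is_master
        and confidence >= TRAINING_SAVE_THRESHOLD,
    }


flow1 = decide_saves(confidence=0.88, used_rag=True, is_master=True)
flow2 = decide_saves(confidence=0.75, used_rag=False, is_master=True)
print(flow1)
print(flow2)
```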

Flow 3: Running fine-tuning

┌────────────────────────────────────────────────────────────┐
│ User: clicks "Start Fine-tuning" in the Training Dashboard │
│  - Apprentice Model: microsoft/phi-2                       │
│  - Domain: medical                                         │
│  - Method: peft                                            │
│  - Epochs: 3                                               │
└────────────────────┬───────────────────────────────────────┘
                     ↓ HTTP POST /api/training/start
┌────────────────────────────────────────────────────────────┐
│ Backend: training.py                                       │
│  ┌──────────────────────────────────────────────────────┐  │
│  │ Step 1: Check that training data exists              │  │
│  │  → FineTuningManager.load_training_data("medical")   │  │
│  │  → training_data/master_outputs/medical.jsonl        │  │
│  │  → Result: 150 samples                               │  │
│  └──────────────────────────────────────────────────────┘  │
│                    ↓                                       │
│  ┌──────────────────────────────────────────────────────┐  │
│  │ Step 2: Start the background task                    │  │
│  │  background_tasks.add_task(run_training)             │  │
│  │  → Returns a response immediately (asynchronous)     │  │
│  └──────────────────────────────────────────────────────┘  │
└────────────────────┬───────────────────────────────────────┘
                     ↓
┌────────────────────────────────────────────────────────────┐
│ NullAI Core: fine_tuning.py (background)                   │
│  ┌──────────────────────────────────────────────────────┐  │
│  │ Step 1: Load the model (4-bit quantization)          │  │
│  │  → AutoModelForCausalLM.from_pretrained(             │  │
│  │       "microsoft/phi-2",                             │  │
│  │       quantization_config=bnb_config                 │  │
│  │     )                                                │  │
│  │  → Memory usage: ~7GB                                │  │
│  └──────────────────────────────────────────────────────┘  │
│                    ↓                                       │
│  ┌──────────────────────────────────────────────────────┐  │
│  │ Step 2: LoRA configuration                           │  │
│  │  → get_peft_model(model, lora_config)                │  │
│  │  → Trainable parameters: 4.2M / 2.7B (0.16%)         │  │
│  └──────────────────────────────────────────────────────┘  │
│                    ↓                                       │
│  ┌──────────────────────────────────────────────────────┐  │
│  │ Step 3: Data preparation                             │  │
│  │  → format_training_examples_for_model()              │  │
│  │  → Alpaca format → formatted model prompts           │  │
│  │  → Dataset.from_dict({"text": prompts})              │  │
│  └──────────────────────────────────────────────────────┘  │
│                    ↓                                       │
│  ┌──────────────────────────────────────────────────────┐  │
│  │ Step 4: Start training                               │  │
│  │  Epoch 1/3:                                          │  │
│  │    [===>    ] 35% loss: 1.245                        │  │
│  │    → current_training_state.update({                 │  │
│  │         "progress": 35,                              │  │
│  │         "current_epoch": 1,                          │  │
│  │         "loss": 1.245                                │  │
│  │       })                                             │  │
│  └──────────────────────────────────────────────────────┘  │
│                    ↓                                       │
│  ┌──────────────────────────────────────────────────────┐  │
│  │ Step 5: Completion                                   │  │
│  │  → trainer.save_model(output_dir)                    │  │
│  │  → training_data/checkpoints/apprentice_medical_*/   │  │
│  │  → current_training_state["is_training"] = False     │  │
│  └──────────────────────────────────────────────────────┘  │
└────────────────────┬───────────────────────────────────────┘
                     ↓
┌────────────────────────────────────────────────────────────┐
│ Frontend: TrainingDashboard.tsx                            │
│  - Polls every 2 seconds: GET /api/training/status         │
│  - Updates the progress bar: 35% → 67% → 100%              │
│  - On completion: reloads the checkpoint list              │
└────────────────────────────────────────────────────────────┘

Technology stack details

Frontend

React 18.2 + TypeScript 5.0
├─ Vite 4.4 (build tool)
├─ TailwindCSS 3.3 (styling)
└─ axios (HTTP client)

Main components:
- EngineManager.tsx      (417 lines) - engine management UI
- InferencePanel.tsx     (321 lines) - inference panel
- TrainingDashboard.tsx  (400 lines) - training dashboard
- KnowledgePanel.tsx     (185 lines) - knowledge browser

Backend

FastAPI 0.115.6 + Python 3.13+
├─ Uvicorn (ASGI server)
├─ SQLAlchemy 2.0 (ORM)
├─ Pydantic 2.10 (validation)
└─ Alembic (migrations)

Main APIs:
- /api/config/*     - engine management
- /api/questions    - run inference
- /api/training/*   - fine-tuning
- /api/knowledge/*  - knowledge tile management

NullAI Core

Python 3.13+
├─ transformers 4.36+ (HuggingFace)
├─ torch 2.0+ (PyTorch)
├─ peft 0.7+ (LoRA/QLoRA)
├─ trl 0.7+ (Transformer Reinforcement Learning; SFT and RLHF trainers)
├─ datasets 2.15+ (dataset processing)
├─ bitsandbytes 0.41+ (quantization)
├─ accelerate 0.25+ (distributed training)
├─ zstandard 0.22+ (.iath compression)
└─ numpy 1.24+ (numerical computing)

Main modules:
- model_router.py     (800 lines) - RAG integration, engine management
- iath_memory.py      (362 lines) - 6-dimensional spatial memory
- fine_tuning.py      (640 lines) - fine-tuning
- llm_providers.py    (390 lines) - LLM provider integration

LLM providers

1. Ollama
   - Local model management
   - ollama pull deepseek-r1:1.5b
   - API: http://localhost:11434

2. HuggingFace Transformers
   - Direct loading
   - AutoModelForCausalLM.from_pretrained()
   - Automatic GPU/CPU placement

3. MLX (Apple Silicon)
   - For M1/M2/M3 Macs only
   - Makes use of unified memory
   - mlx-lm library

4. GGUF (llama-cpp-python)
   - Quantized models (.gguf)
   - Well suited to CPU inference
   - GPU acceleration supported

Common misconceptions and caveats

Misconception 1: "RAG is always used"

❌ Misconception: RAG is used for every inference.
✅ Reality: RAG is triggered only when the DB contains relevant knowledge.

# Actual behavior
if has_knowledge:
    ...  # RAG inference
else:
    ...  # plain inference (no RAG)

How to tell them apart:

  • With RAG: the response includes a populated retrieved_knowledge field
  • Without RAG: retrieved_knowledge is empty
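A client can apply this check in one line. The sketch below assumes the response is a dict shaped like the one in Flow 1; the helper name used_rag is illustrative.

```python
def used_rag(response):
    """True when the response carries at least one retrieved knowledge item."""
    # Covers a missing key, None, and an empty list alike
    return bool(response.get("retrieved_knowledge"))


with_rag = {"response": "...", "retrieved_knowledge": [{"topic": "heart"}]}
without_rag = {"response": "...", "retrieved_knowledge": []}

print(used_rag(with_rag), used_rag(without_rag))
```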

Misconception 2: "An apprentice automatically becomes a master"

❌ Misconception: once fine-tuning finishes, the apprentice is promoted to master automatically.
✅ Reality: promotion is a manual step.

Fine-tuning completes
  ↓
Checkpoint saved
  ↓
[Manual step] Click Promote in Engine Manager
  ↓
Apprentice is promoted to master

Reason: quality checks should be performed by a human.

Misconception 3: ".iath files are updated automatically"

❌ Misconception: AI-generated knowledge is saved to .iath automatically.
✅ Reality: it is currently saved only to SQLite and JSONL; saving to .iath is not yet implemented (Priority 2).

Current state:
AI-generated knowledge → SQLite + JSONL ✅
                       → .iath ❌ (not implemented)

After Priority 2 is implemented:
AI-generated knowledge → SQLite + JSONL + .iath ✅

Misconception 4: "Fine-tuning trains all parameters"

❌ Misconception: the whole model (2.7B parameters) is trained.
✅ Reality: only the LoRA adapter (4.2M parameters) is trained.

Parameters being trained:
- Base model: 2,700,000,000 → frozen (not trained)
- LoRA:           4,200,000 → trained ✅

Trainable parameter ratio: 0.16%

Benefits:

  • Lower memory use (40GB → 12GB)
  • Shorter training time (10 hours → 2 hours)
  • The base model is left unchanged (safe)
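The 0.16% figure can be checked directly from the two parameter counts quoted above:

```python
base_params = 2_700_000_000  # frozen base model
lora_params = 4_200_000      # trainable LoRA adapter

ratio_percent = lora_params / base_params * 100
print(f"{ratio_percent:.2f}%")
```

Counting the adapter in the denominator as well (base plus LoRA) changes the result only in the third decimal place, so it rounds to the same figure.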

Misconception 5: "SQLite and .iath store the same data"

❌ Misconception: SQLite and .iath duplicate each other.
✅ Reality: their roles are entirely different.

| Database | Contents | Purpose |
|----------|----------|---------|
| SQLite | Users, workspaces, inference history, metadata | Application management |
| .iath | Knowledge Tiles (6-dimensional coordinates + content) | Knowledge search and RAG inference |

Example:

SQLite:
- users table: nullai_default_user
- workspaces table: default_workspace
- inference_history: past questions and answers

.iath:
- Tile 1: [0.2, 0.8, 0.3, 0.9, 0.7, 0.8] "How the heart works..."
- Tile 2: [0.3, 0.8, 0.4, 0.85, 0.65, 0.75] "The circulatory system..."

Misconception 6: "The confidence value is computed by the AI"

❌ Misconception: the AI evaluates itself and returns a confidence.
✅ Reality: it is currently a fixed value (per provider).

# llm_providers.py
class OllamaProvider:
    async def infer(self, model_config, prompt, temperature):
        response_text = ...  # call to the model omitted
        return {
            "response": response_text,
            "confidence": 0.85  # ← fixed value!
        }

Future improvements:

  • Cross-check across multiple models
  • Compute the uncertainty of the response (entropy)
  • Learn from human feedback
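As one possible shape for the entropy-based improvement, the sketch below turns per-token probability distributions into a confidence between 0 and 1. This is an assumption-laden illustration, not the planned implementation: it presumes the provider can expose a non-empty list of token distributions, and the name entropy_confidence and the normalization by maximum entropy are choices made here.

```python
import math


def entropy_confidence(token_distributions):
    """Map mean normalized token entropy to a confidence in [0, 1].

    token_distributions: non-empty list of probability lists, one per token.
    """
    normalized = []
    for probs in token_distributions:
        entropy = -sum(p * math.log(p) for p in probs if p > 0)
        max_entropy = math.log(len(probs))  # entropy of a uniform distribution
        normalized.append(entropy / max_entropy if max_entropy > 0 else 0.0)
    # Low entropy (peaked distributions) means high confidence
    return 1.0 - sum(normalized) / len(normalized)


certain = entropy_confidence([[1.0, 0.0, 0.0]])
uncertain = entropy_confidence([[0.25, 0.25, 0.25, 0.25]])
mixed = entropy_confidence([[0.9, 0.05, 0.05], [0.4, 0.3, 0.3]])
print(certain, uncertain, mixed)
```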

Rationale for design decisions

Decision 1: Why PEFT was adopted

Candidates:

  1. Full fine-tuning
  2. PEFT (LoRA/QLoRA)
  3. Adapter
  4. Prompt Tuning

Adopted: PEFT (QLoRA)

Reason:

Comparison table:

                  Memory  Speed   Quality  Versatility
Full FT           ×       ×       ⭐⭐⭐   ⭐⭐⭐
PEFT (QLoRA)      ⭐⭐⭐  ⭐⭐    ⭐⭐⭐   ⭐⭐⭐
Adapter           ⭐⭐    ⭐⭐    ⭐⭐     ⭐⭐
Prompt Tuning     ⭐⭐⭐  ⭐⭐⭐  ⭐       ⭐

Conclusion: PEFT offers the best balance.

Decision 2: Why the Alpaca format was adopted

Candidates:

  1. Alpaca
  2. ShareGPT
  3. OpenAssistant
  4. Custom

Adopted: Alpaca

Reasons:

  • Widely adopted in open source
  • Clear instruction-input-output structure
  • Compatible with HuggingFace datasets
  • Community best practice

Decision 3: Why hybrid search

Candidates:

  1. Text only
  2. Coordinates only
  3. Hybrid

Adopted: Hybrid

Reasons:

Text only:
- Pros: Simple
- Cons: Misses synonyms

Coordinates only:
- Pros: Finds semantically related knowledge
- Cons: Fails when the coordinates are inaccurate

Hybrid:
- Pros: Combines the strengths of both
- Cons: Requires parameter tuning (text_weight, spatial_weight)

Current settings:

text_weight = 0.4
spatial_weight = 0.6
# → Coordinates are weighted slightly higher (semantic relevance takes priority)
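One way the two weights might combine is a plain weighted sum, sketched below. The guide does not spell out the scoring formula, so both the weighted-sum form and the distance-based `spatial_similarity` are assumptions, and the function names are hypothetical. The tile coordinates reuse the example tiles shown earlier.

```python
import math
from typing import Sequence

TEXT_WEIGHT = 0.4
SPATIAL_WEIGHT = 0.6


def spatial_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Turn Euclidean distance in the unit hypercube into a similarity in [0, 1]."""
    if len(a) != len(b):
        raise ValueError("coordinate vectors must have the same length")
    distance = math.dist(a, b)
    max_distance = math.sqrt(len(a))  # longest diagonal when every axis is in [0, 1]
    return 1.0 - distance / max_distance


def hybrid_score(text_score: float, query_coords, tile_coords) -> float:
    """Weighted sum of a text-match score and the spatial similarity."""
    return TEXT_WEIGHT * text_score + SPATIAL_WEIGHT * spatial_similarity(
        query_coords, tile_coords
    )


tile_1 = [0.2, 0.8, 0.3, 0.9, 0.7, 0.8]
tile_2 = [0.3, 0.8, 0.4, 0.85, 0.65, 0.75]
score = hybrid_score(text_score=0.5, query_coords=tile_1, tile_coords=tile_2)
```

Because the two weights sum to 1.0 and both inputs lie in [0, 1], the combined score also stays in [0, 1], which keeps thresholds easy to reason about.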

Decision 4: Why circular imports were resolved with lazy imports

Candidates:

  1. Lazy import (import inside the function)
  2. Architecture change (untangle the dependencies)
  3. Introduce an intermediate module

Adopted: Lazy import

Reasons:

  • Solves the problem with minimal changes
  • The performance impact is minor
  • No large-scale rewrite of existing code is needed

Implementation example:

def _check_db_knowledge(self, domain_id, prompt):
    # Import inside the function → avoids the circular import
    from backend.app.database.session import SessionLocal
    db = SessionLocal()
    # ...

Considerations When Extending the System

Adding a new LLM provider

Steps:

  1. Add a new class to llm_providers.py
class NewProvider:
    async def infer(self, model_config, prompt, temperature):
        # Implementation
        pass

    async def infer_streaming(self, model_config, prompt, temperature):
        # Implementation
        pass
  2. Add a branch to _perform_llm_inference() in model_router.py
if provider == "ollama":
    result = await self.ollama_provider.infer(...)
elif provider == "new_provider":  # ← add this
    result = await self.new_provider.infer(...)
  3. Add a member to the ModelProvider enum in backend/app/config.py
class ModelProvider(str, Enum):
    OLLAMA = "ollama"
    HUGGINGFACE = "huggingface"
    NEW_PROVIDER = "new_provider"  # ← add this
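As a reference for step 1, a filled-in provider might look like the sketch below. `EchoProvider` is a hypothetical stand-in that only echoes the prompt. The return shape of `infer()` follows the `OllamaProvider` excerpt under Misconception 6, while treating `infer_streaming()` as an async generator of text chunks is an assumption about the interface.

```python
import asyncio
from typing import Any, AsyncIterator, Dict


class EchoProvider:
    """Hypothetical stand-in provider used only to illustrate the interface."""

    async def infer(
        self, model_config: Dict[str, Any], prompt: str, temperature: float
    ) -> Dict[str, Any]:
        response_text = f"echo: {prompt}"
        # Same shape as the OllamaProvider excerpt: text plus a fixed confidence.
        return {"response": response_text, "confidence": 0.85}

    async def infer_streaming(
        self, model_config: Dict[str, Any], prompt: str, temperature: float
    ) -> AsyncIterator[str]:
        # An async generator: callers consume it with `async for`.
        for word in f"echo: {prompt}".split():
            yield word


async def demo() -> Dict[str, Any]:
    provider = EchoProvider()
    result = await provider.infer({}, "hello", temperature=0.0)
    chunks = [c async for c in provider.infer_streaming({}, "hello", 0.0)]
    return {"result": result, "chunks": chunks}


demo_output = asyncio.run(demo())
```

Keeping the same dictionary keys as the existing providers means the router branch added in step 2 needs no provider-specific handling of the result.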

Adding a new domain

Steps:

  1. Define the coordinate space for the domain in the .iath file
Medical domain: medical_space [x, y, z]
Legal domain: legal_space [x, y, z]  ← add this
  - x: Area of law (civil law, criminal law, commercial law...)
  - y: Precedent level (district court, high court, supreme court)
  - z: Era (classical, modern, latest)
  2. Add the domain settings to backend/app/config.py
domains = [
    {"domain_id": "medical", "name": "医療"},  # "Medical"
    {"domain_id": "legal", "name": "法律"}  # "Legal" ← add this
]
  3. Create the training data directory
mkdir -p training_data/master_outputs/
touch training_data/master_outputs/master_outputs_legal.jsonl
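If step 3 needs to run from application code rather than a shell, a Python equivalent could look like this. It is a sketch only: `ensure_master_output_file` is a hypothetical helper, and it assumes the file naming pattern `master_outputs_<domain_id>.jsonl` seen in the shell commands applies to every domain.

```python
from pathlib import Path


def ensure_master_output_file(base_dir: Path, domain_id: str) -> Path:
    """Create master_outputs_<domain_id>.jsonl under base_dir if it is missing."""
    output_dir = base_dir / "training_data" / "master_outputs"
    output_dir.mkdir(parents=True, exist_ok=True)  # same effect as `mkdir -p`
    output_file = output_dir / f"master_outputs_{domain_id}.jsonl"
    output_file.touch(exist_ok=True)  # same effect as `touch`
    return output_file
```

Calling it twice for the same domain is harmless, since neither the directory nor the file creation fails when the target already exists.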

Implementing automatic coordinate estimation (Priority 2)

Design proposal:

# null_ai/coordinate_estimator.py

import json
from typing import List


class CoordinateEstimator:
    def __init__(self, llm_model):
        """
        Estimate coordinates using DeepSeek R1.
        """
        self.llm = llm_model

    async def estimate_coordinates(
        self,
        prompt: str,
        response: str,
        domain_id: str
    ) -> List[float]:
        """
        Estimate the 6-dimensional coordinates.

        Returns: [x, y, z, c, g, v]
        """
        # Build the prompt
        estimation_prompt = f"""You are an expert in knowledge space mapping.
Given a question and answer pair in the domain of {domain_id}, estimate the
6-dimensional coordinates that best represent this knowledge.

Coordinates format: [x, y, z, c, g, v]
- {domain_id}_space [x, y, z]: domain-specific 3D space (0.0-1.0)
- meta_space [c, g, v]: Certainty, Granularity, Verification (0.0-1.0)

Question: {prompt}
Answer: {response}

Output ONLY the coordinates as a JSON array: [x, y, z, c, g, v]
"""

        # Ask the LLM to estimate the coordinates
        result = await self.llm.generate(estimation_prompt)

        # Parse the JSON
        coords = json.loads(result)

        # Validate explicitly (assert statements are skipped under `python -O`)
        if len(coords) != 6:
            raise ValueError(f"expected 6 coordinates, got {len(coords)}")
        if not all(0.0 <= c <= 1.0 for c in coords):
            raise ValueError(f"coordinates must be within [0.0, 1.0]: {coords}")

        return coords
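Models do not always return the bare array even when told to. The sketch below shows one tolerant way to parse the reply: it pulls the last flat JSON array out of the raw text and validates it. `parse_coordinates` is a hypothetical helper, and picking the last array is an assumption that the final answer comes after any reasoning or preamble the model emits.

```python
import json
import re
from typing import List

_ARRAY_PATTERN = re.compile(r"\[[^\[\]]*\]")


def parse_coordinates(raw: str, dims: int = 6) -> List[float]:
    """Extract the last flat JSON array from model output and validate it."""
    candidates = _ARRAY_PATTERN.findall(raw)
    if not candidates:
        raise ValueError("no JSON array found in model output")
    try:
        values = json.loads(candidates[-1])
    except json.JSONDecodeError as exc:
        raise ValueError(f"could not parse coordinates: {exc}") from exc
    if len(values) != dims:
        raise ValueError(f"expected {dims} coordinates, got {len(values)}")
    coords = []
    for v in values:
        # bool is a subclass of int, so reject it explicitly.
        if isinstance(v, bool) or not isinstance(v, (int, float)):
            raise ValueError(f"non-numeric coordinate: {v!r}")
        if not 0.0 <= v <= 1.0:
            raise ValueError(f"coordinate out of range [0, 1]: {v}")
        coords.append(float(v))
    return coords
```

Raising `ValueError` for every failure mode gives the caller a single exception type to catch when deciding whether to retry the estimation or fall back to manual coordinates.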

Implementing real-time progress over WebSocket

Design proposal:

# backend/app/main.py

from fastapi import WebSocket  # assumes the backend is a FastAPI app

@app.websocket("/ws/training/{session_id}")
async def training_websocket(websocket: WebSocket, session_id: str):
    await websocket.accept()

    # Progress callback
    async def progress_callback(state):
        await websocket.send_json({
            "type": "progress",
            "data": state
        })

    # Start fine-tuning
    await fine_tuning_manager.start_training(
        ...,
        progress_callback=progress_callback
    )
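The endpoint above only works if the training side awaits the callback at each progress step. The sketch below illustrates that contract without a web server: `fake_training` is a hypothetical stand-in for `fine_tuning_manager.start_training`, and the `state` keys (`step`, `total_steps`) are assumed for illustration.

```python
import asyncio
from typing import Any, Awaitable, Callable, Dict, List

ProgressCallback = Callable[[Dict[str, Any]], Awaitable[None]]


async def fake_training(total_steps: int, progress_callback: ProgressCallback) -> None:
    """Hypothetical stand-in for a training loop that reports progress."""
    for step in range(1, total_steps + 1):
        await asyncio.sleep(0)  # yield to the event loop, as real async work would
        await progress_callback({"step": step, "total_steps": total_steps})


async def demo() -> List[Dict[str, Any]]:
    sent: List[Dict[str, Any]] = []

    async def progress_callback(state: Dict[str, Any]) -> None:
        # In the endpoint this would be `await websocket.send_json(...)`.
        sent.append({"type": "progress", "data": state})

    await fake_training(3, progress_callback)
    return sent


messages = asyncio.run(demo())
```

If the real training loop is blocking rather than async, it would need to run in a worker thread or process and hand progress back to the event loop, which this sketch does not cover.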

Summary: The Essence of the Project

NullAI is neither just a RAG system nor just a fine-tuning tool.

The essence of NullAI:

A self-evolving knowledge ecosystem

Master AI → Knowledge generation → Apprentice AI training → Promotion → New apprentice → Cycle continues
   ↓                                                ↑
DB expansion (self-expansion)                     Fine-tuning
   ↓                                                ↑
Tree-structured spatial memory (6D coordinates)   High-quality training data
   ↓                                                ↑
Semantic knowledge organization                   Inheritance of the master's knowledge
   ↓                                                ↑
   └──────────────────── Cycle ─────────────────────┘

Integration of the four core ideas:

  1. Fallen-tree system: evolution through generational turnover
  2. DB separation structure: reliability and self-expansion
  3. Tree-structured spatial memory: semantic knowledge organization
  4. Local-first: privacy and cost

All of these combine organically to form an ecosystem in which the AI evolves on its own.


Once you understand this guide, you can faithfully carry forward NullAI's design philosophy.

Good luck! 🌲🔥


Document Version: 1.0 Total Pages: 60+ Total Words: 15,000+ Author: Claude (Sonnet 4.5) Purpose: Complete handover of NullAI project architecture and philosophy