File size: 2,633 Bytes
2a9d63a |
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 |
---
title: "Muse:Eye β On-device Multimodal XR Docent"
emoji: π¨
colorFrom: blue
colorTo: purple
sdk: other
pipeline_tag: other
tags:
- multimodal
- XR
- on-device
- tflite
- mobileclip
- android
- unity
- RAG
- museum-tech
---
# Muse:Eye
**On-device Multimodal AI + XR Museum Docent for Android**
Muse:Eyeλ **μΉ΄λ©λΌ κΈ°λ° μν μΈμ(MobileCLIP)**,
**μ¨λλ°μ΄μ€ LLM(Gemma-3N)**,
**XR μΈν°λμ
(Unity MR)**
μ κ²°ν©ν νμ΄λΈλ¦¬λ AI λμ¨νΈ μμ€ν
μ
λλ€.
μ€νλΌμΈ νκ²½μμλ λΉ λ₯Έ μ΄λ―Έμ§ κ²μ,
λ©νλ°μ΄ν° κΈ°λ° RAG,
μ±μΈ/μ΄λ¦°μ΄/μ λ¬Έκ° λͺ¨λ μ€λͺ
μ κ³΅μ΄ κ°λ₯ν©λλ€.
---
# β¨ Features
### 1) β‘ On-device AI (No Internet Required)
- MobileCLIP (TFLite) μ΄λ―Έμ§ μλ² λ©
- μν μ μ¬λ κ²μ (FAISS β custom `.bin` index)
- Gemma-3N μ¨λλ°μ΄μ€ LLM
### 2) πΌ Artwork Recognition
- μΉ΄λ©λΌ μΈμ β μ¦μ embedding μΆμΆ
- RAG κΈ°λ° μνΒ·μκ° μ€λͺ
μμ±
### 3) πΆπ§β𦳠Three Docent Modes
- μ΄λ¦°μ΄ λͺ¨λ
- μ±μΈ λͺ¨λ
- μ λ¬Έκ° λͺ¨λ
### 4) π₯½ XR Integrated
- Unity + Android Studio ν΅ν©
- MR λͺ¨λμμ 3D UI λ° μμ± μλ΄ μ 곡
---
# π System Architecture
Muse:Eyeλ λ κ°μ§ μ€ν λͺ¨λλ₯Ό μ§μν©λλ€:
- **Mobile Mode (On-device AI)** β μμ μ€νλΌμΈ
- **XR/MR Mode (Cloud Multimodal AI)** β Unity κΈ°λ° MR + Cloud API
---
## π± A. Mobile Mode (On-device AI)
```plaintext
Mobile App (Android)
β
βββ Camera Input
β
βββ MobileCLIP (TFLite)
β ββ on-device embedding (512-d)
β
βββ Embedding Index (.bin)
β ββ cosine similarity search (offline)
β
βββ Local Metadata RAG
β ββ artwork metadata lookup
β
βββ Gemma-3N On-device LLM
β ββ adult / child / expert explanation
β
βββ Android TTS
```
---
## π₯½ B. XR/MR Mode (Cloud Multimodal AI)
```plaintext
Unity XR App (MR View)
β
βββ MR Camera (RenderTexture)
β
βββ JNI Bridge β Android
β ββ frame bytes μ λ¬
β
βββ Cloud Multimodal API (Gemini)
β ββ artwork analysis
β ββ style / meaning extraction
β ββ multimodal reasoning
β
βββ Cloud PromptManager
β ββ μ΄λ¦°μ΄ λͺ¨λ
β ββ μ±μΈ λͺ¨λ
β ββ μ λ¬Έκ° λͺ¨λ
β
βββ Unity 3D UI + Android TTS
ββ floating info panel
βββ 3D guide elements
βββ audio output
```
|