spartan8806 committed on
Commit 73baae2 · verified · 1 Parent(s): 1646a02

Upload 3 files

Files changed (3)
  1. README.md +67 -188
  2. app.py +185 -0
  3. requirements.txt +5 -0
README.md CHANGED
@@ -1,188 +1,67 @@
- ---
- title: ATLES-ECHO System
- emoji: 🧠
- colorFrom: blue
- colorTo: purple
- sdk: static
- pinned: true
- license: mit
- tags:
- - semantic-memory
- - personal-ai
- - embeddings
- - privacy-first
- - digital-twin
- ---
-
- # ATLES-ECHO 🧠
-
- **Your Semantic Digital Twin** - A privacy-first AI system that remembers everything you do.
-
- ## 🌟 Overview
-
- ATLES-ECHO is an intelligent semantic monitoring and memory system powered by the **[ATLES Champion Embedding Model](https://huggingface.co/spartan8806/atles-champion-embedding)** (Top-10 worldwide on MTEB).
-
- ### What It Does
-
- - 📝 **Captures** - Files, screens, code, notes, conversations
- - 🔍 **Understands** - Uses advanced embeddings for semantic comprehension
- - 🧠 **Learns** - Builds behavioral patterns and interest profiles
- - 🔐 **Protects** - 100% local storage, zero cloud uploads
- - ⚡ **Delivers** - Real-time semantic search across your entire digital life
-
- ## 🚀 Quick Facts
-
- | Feature | Details |
- |---------|---------|
- | **Embedding Model** | [spartan8806/atles-champion-embedding](https://huggingface.co/spartan8806/atles-champion-embedding) |
- | **Performance** | STS-B Pearson: 0.8445, Spearman: 0.8374 (Top-10 MTEB) |
- | **Dimensions** | 768-dim MPNet-base architecture |
- | **Speed** | ~200 embeddings/sec (GPU) |
- | **Vector DB** | FAISS (Facebook AI Similarity Search) |
- | **Backend** | FastAPI (Python 3.11+) |
- | **Frontend** | React 18 + TypeScript |
- | **Privacy** | 100% local, encrypted, open source |
-
- ## 🎯 Core Capabilities
-
- ### Semantic Monitoring
- - 📁 **File Changes** - Track all code and document edits
- - 🖥️ **Screen Content** - OCR-based extraction (optional)
- - 📋 **Clipboard** - Save copied text and snippets
- - 🪟 **App Usage** - Monitor focus time and patterns
- - ⌨️ **Typing Patterns** - Context-aware analysis (opt-in)
-
- ### AI-Powered Search
- - **Natural Language** - "How did I implement authentication?"
- - **Semantic Similarity** - Find related content without exact matches
- - **Context Retrieval** - Get relevant background for any topic
- - **Pattern Detection** - Discover productivity trends
-
- ### Auto-Generated Insights
- - Daily activity summaries
- - Interest profiling
- - Usage analytics
- - Behavioral pattern analysis
-
- ## 🏗️ Architecture
-
- ```
- Web UI (React)
- ↓
- FastAPI Backend
- ↓
- ┌─────────────────┬──────────────────┐
- │ Embedding Engine│  Vector DB       │
- │   (Champion)    │   (FAISS)        │
- │    768-dim      │  Similarity      │
- └─────────────────┴──────────────────┘
- ↓
- Knowledge Base (SQLite)
- ↓
- Watchers (File, Screen, Clipboard, etc.)
- ```
-
- ## 📖 Usage Example
-
- ```python
- import requests
-
- # Semantic search
- response = requests.get(
-     "http://localhost:5001/api/search",
-     params={
-         "query": "authentication implementation",
-         "limit": 5
-     }
- )
-
- results = response.json()
- for item in results["results"]:
-     print(f"{item['score']:.3f} - {item['content'][:100]}...")
- ```
-
- ## 🔒 Privacy
-
- ATLES-ECHO is **privacy-first by design**:
-
- ✅ **100% Local** - All data stays on your machine
- ✅ **No Cloud** - Zero uploads, ever
- ✅ **Encrypted** - AES-256 encryption at rest
- ✅ **Open Source** - Audit the code yourself
- ✅ **Full Control** - Disable any feature anytime
-
- ## 🚀 Installation
-
- ```bash
- # Clone repository
- git clone https://github.com/spartan8806/atles-echo.git
- cd atles-echo
-
- # Install backend
- cd backend && pip install -r requirements.txt
-
- # Install frontend
- cd ../frontend && npm install
-
- # Run (Windows)
- .\start_echo.bat
- ```
-
- Access dashboard at: **http://localhost:3000**
-
- ## 📊 Performance
-
- | Dataset Size | Search Latency | Storage |
- |--------------|----------------|---------|
- | 1K entries | 5ms | 4.5 MB |
- | 10K entries | 8ms | 45 MB |
- | 100K entries | 15ms | 450 MB |
- | 1M entries | 50ms | 4.5 GB |
-
- ## 🎨 Use Cases
-
- - **Developers**: Code search, debug history, research notes
- - **Writers**: Document tracking, research management, idea capture
- - **Researchers**: Paper organization, experiment notes, literature review
- - **Knowledge Workers**: Second brain, meeting notes, project memory
-
- ## 🗺️ Roadmap
-
- - [x] Core semantic monitoring
- - [x] Real-time search
- - [x] Interest profiling
- - [ ] Browser history integration
- - [ ] Email integration (local)
- - [ ] Voice memo capture
- - [ ] Mobile companion app
-
- ## 🤝 Part of ATLES Ecosystem
-
- ATLES-ECHO is one component of the **ATLES (Advanced Thinking & Learning Execution System)**:
-
- - **ATLES Brain** - Central AI coordinator
- - **ATLES-ECHO** - Semantic memory (this project)
- - **Phoenix** - AI introspection research system *(private, not public)*
- - **SENTINEL** - Documentation-focused semantic monitoring *(like ECHO for docs)*
- - **ATLES-MENTOR** - MoE code assistance system *(private, not public)*
-
- ## License
-
- MIT License - Copyright (c) 2025 Conner (spartan8806)
-
- ## 🙏 Credits
-
- Powered by **[ATLES Champion Embedding](https://huggingface.co/spartan8806/atles-champion-embedding)**
-
- Built with: FAISS, FastAPI, React, Sentence Transformers
-
- ---
-
- <div align="center">
-
- **"Your digital life, remembered."**
-
- [📚 Full Documentation](https://github.com/spartan8806/atles-echo) | [🐛 Report Issues](https://github.com/spartan8806/atles-echo/issues) | [⭐ Star on GitHub](https://github.com/spartan8806/atles-echo)
-
- </div>
-
+ ---
+ title: ATLES-ECHO Embedding Service
+ emoji: 🧠
+ colorFrom: blue
+ colorTo: indigo
+ sdk: gradio
+ sdk_version: 4.44.0
+ app_file: app.py
+ pinned: true
+ license: apache-2.0
+ short_description: Semantic embeddings with ATLES Champion model
+ ---
+
+ # 🧠 ATLES-ECHO Embedding Service
+
+ Generate high-quality semantic embeddings using the **ATLES Champion** embedding model.
+
+ ## Features
+
+ - **🔤 Single Embedding**: Generate embedding for any text
+ - **⚖️ Compare Similarity**: Compare semantic similarity between two texts
+ - **📦 Batch Embed**: Process multiple texts at once (up to 10)
+
+ ## Model Details
+
+ | Property | Value |
+ |----------|-------|
+ | **Model** | [spartan8806/atles-champion-embedding](https://huggingface.co/spartan8806/atles-champion-embedding) |
+ | **Dimension** | 768 |
+ | **Base Model** | all-mpnet-base-v2 |
+ | **Parameters** | 110M |
+ | **Training** | H200 GPU (30 minutes) |
+
+ ## Performance (MTEB STS-B)
+
+ - **Pearson**: 0.8445 (Top-10 worldwide)
+ - **Spearman**: 0.8374
+
+ ## About ATLES
+
+ ATLES-ECHO is the semantic memory core of the ATLES ecosystem - your AI digital twin that learns from your digital life while keeping everything private and local.
+
+ **Ecosystem Components:**
+ - 🧠 **ECHO** - Semantic memory and embeddings
+ - 🦅 **Phoenix** - AI council for decision making
+ - 🔬 **SENTINEL** - Research and knowledge gathering
+ - 📚 **MENTOR** - Code understanding and assistance
+
+ ## API Usage
+
+ This space provides a visual interface. For programmatic access, use the model directly:
+
+ ```python
+ from sentence_transformers import SentenceTransformer
+
+ model = SentenceTransformer("spartan8806/atles-champion-embedding")
+ embedding = model.encode("Your text here", normalize_embeddings=True)
+ ```
+
+ ## License
+
+ Apache 2.0 - Free for commercial and personal use.
+
+ ---
+
+ Built with ❤️ by [spartan8806](https://huggingface.co/spartan8806)
+
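Review note on the new README snippet: it passes `normalize_embeddings=True`, which returns unit-length vectors. For such vectors a plain dot product equals the cosine similarity, so no extra division by the norms is needed when comparing two embeddings. The sketch below illustrates that identity in pure Python with made-up 3-dimensional vectors standing in for the model's 768-dimensional output; `l2_normalize`, `dot`, and `cosine_similarity` are hypothetical helper names and are not part of this commit.

```python
import math

def l2_normalize(vec):
    """Scale a vector to unit (L2) length."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]

def dot(a, b):
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def cosine_similarity(a, b):
    """Textbook cosine similarity: dot product divided by both norms."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

# Toy vectors standing in for real embeddings (assumption for illustration)
a = [1.0, 2.0, 2.0]
b = [2.0, 0.0, 1.0]

full = cosine_similarity(a, b)
shortcut = dot(l2_normalize(a), l2_normalize(b))

# Both routes give the same value once the inputs are unit length
assert math.isclose(full, shortcut)
```

The same reasoning applies to any pair of vectors returned by `model.encode(..., normalize_embeddings=True)`.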
app.py ADDED
@@ -0,0 +1,185 @@
+ """
+ ATLES-ECHO - Semantic Embedding Service
+ A Hugging Face Space for generating embeddings using the ATLES Champion model.
+ """
+
+ import gradio as gr
+ from sentence_transformers import SentenceTransformer
+ import numpy as np
+
+ # Load the ATLES Champion embedding model
+ print("Loading ATLES Champion Embedding model...")
+ model = SentenceTransformer("spartan8806/atles-champion-embedding")
+ print(f"Model loaded! Dimension: {model.get_sentence_embedding_dimension()}")
+
+ def generate_embedding(text: str) -> dict:
+     """Generate embedding for input text"""
+     if not text or not text.strip():
+         return {"error": "Please enter some text", "embedding": None, "dimension": None}
+
+     # Generate embedding
+     embedding = model.encode(text, normalize_embeddings=True)
+
+     return {
+         "text_preview": text[:100] + "..." if len(text) > 100 else text,
+         "dimension": len(embedding),
+         "embedding_preview": embedding[:10].tolist(),  # First 10 values
+         "embedding_full": embedding.tolist()
+     }
+
+ def compare_texts(text1: str, text2: str) -> dict:
+     """Compare similarity between two texts"""
+     if not text1.strip() or not text2.strip():
+         return {"error": "Please enter both texts", "similarity": None}
+
+     # Generate embeddings
+     embeddings = model.encode([text1, text2], normalize_embeddings=True)
+
+     # Calculate cosine similarity
+     similarity = float(np.dot(embeddings[0], embeddings[1]))
+
+     return {
+         "text1_preview": text1[:50] + "..." if len(text1) > 50 else text1,
+         "text2_preview": text2[:50] + "..." if len(text2) > 50 else text2,
+         "similarity": round(similarity, 4),
+         "similarity_percent": f"{similarity * 100:.1f}%",
+         "interpretation": get_similarity_interpretation(similarity)
+     }
+
+ def get_similarity_interpretation(score: float) -> str:
+     """Interpret similarity score"""
+     if score >= 0.9:
+         return "🟢 Nearly identical meaning"
+     elif score >= 0.7:
+         return "🟡 Very similar"
+     elif score >= 0.5:
+         return "🟠 Somewhat related"
+     elif score >= 0.3:
+         return "🔴 Loosely related"
+     else:
+         return "⚫ Different topics"
+
+ def batch_embed(texts: str) -> dict:
+     """Generate embeddings for multiple texts (one per line)"""
+     lines = [l.strip() for l in texts.split('\n') if l.strip()]
+
+     if not lines:
+         return {"error": "Please enter at least one text (one per line)", "embeddings": None}
+
+     if len(lines) > 10:
+         return {"error": "Maximum 10 texts at a time", "embeddings": None}
+
+     # Generate embeddings
+     embeddings = model.encode(lines, normalize_embeddings=True)
+
+     results = []
+     for i, (text, emb) in enumerate(zip(lines, embeddings)):
+         results.append({
+             "index": i + 1,
+             "text": text[:50] + "..." if len(text) > 50 else text,
+             "embedding_preview": emb[:5].tolist()
+         })
+
+     return {
+         "count": len(lines),
+         "dimension": len(embeddings[0]),
+         "results": results
+     }
+
+ # Create Gradio interface
+ with gr.Blocks(
+     title="ATLES-ECHO Embedding Service",
+     theme=gr.themes.Soft(primary_hue="blue", secondary_hue="cyan")
+ ) as demo:
+
+     gr.Markdown("""
+     # 🧠 ATLES-ECHO Embedding Service
+
+     Generate high-quality semantic embeddings using the **ATLES Champion** model.
+
+     - **Model**: [spartan8806/atles-champion-embedding](https://huggingface.co/spartan8806/atles-champion-embedding)
+     - **Dimension**: 768
+     - **Top-10 MTEB Performance**: Pearson 0.8445, Spearman 0.8374
+     """)
+
+     with gr.Tabs():
+         # Tab 1: Single Embedding
+         with gr.TabItem("🔤 Single Embedding"):
+             gr.Markdown("Generate an embedding for a single piece of text.")
+
+             with gr.Row():
+                 with gr.Column():
+                     single_input = gr.Textbox(
+                         label="Input Text",
+                         placeholder="Enter text to embed...",
+                         lines=3
+                     )
+                     single_btn = gr.Button("Generate Embedding", variant="primary")
+
+                 with gr.Column():
+                     single_output = gr.JSON(label="Embedding Result")
+
+             single_btn.click(
+                 fn=generate_embedding,
+                 inputs=single_input,
+                 outputs=single_output
+             )
+
+         # Tab 2: Compare Texts
+         with gr.TabItem("⚖️ Compare Similarity"):
+             gr.Markdown("Compare the semantic similarity between two texts.")
+
+             with gr.Row():
+                 text1_input = gr.Textbox(label="Text 1", placeholder="First text...", lines=2)
+                 text2_input = gr.Textbox(label="Text 2", placeholder="Second text...", lines=2)
+
+             compare_btn = gr.Button("Compare Similarity", variant="primary")
+             compare_output = gr.JSON(label="Similarity Result")
+
+             compare_btn.click(
+                 fn=compare_texts,
+                 inputs=[text1_input, text2_input],
+                 outputs=compare_output
+             )
+
+         # Tab 3: Batch Embedding
+         with gr.TabItem("📦 Batch Embed"):
+             gr.Markdown("Generate embeddings for multiple texts (one per line, max 10).")
+
+             with gr.Row():
+                 with gr.Column():
+                     batch_input = gr.Textbox(
+                         label="Texts (one per line)",
+                         placeholder="Text 1\nText 2\nText 3...",
+                         lines=6
+                     )
+                     batch_btn = gr.Button("Generate Batch Embeddings", variant="primary")
+
+                 with gr.Column():
+                     batch_output = gr.JSON(label="Batch Results")
+
+             batch_btn.click(
+                 fn=batch_embed,
+                 inputs=batch_input,
+                 outputs=batch_output
+             )
+
+     gr.Markdown("""
+     ---
+     ### About ATLES-ECHO
+
+     ATLES-ECHO is the semantic memory core of the ATLES ecosystem - your AI digital twin that learns from your digital life.
+
+     **Features:**
+     - 🧠 High-quality semantic embeddings (768 dimensions)
+     - ⚡ Fast inference with normalized vectors
+     - 🎯 Top-10 MTEB benchmark performance
+     - 🔒 Built for the ATLES privacy-first ecosystem
+
+     [View Model Card](https://huggingface.co/spartan8806/atles-champion-embedding) | [ATLES GitHub](https://github.com/spartan8806)
+     """)
+
+ # Launch the app
+ if __name__ == "__main__":
+     demo.launch()
+
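Review note on `app.py`: the input checks in `batch_embed` and the thresholds in `get_similarity_interpretation` do not depend on the model, so they can be exercised without downloading it. The sketch below re-creates them as pure functions under a few assumptions: `parse_batch`, `MAX_BATCH`, `THRESHOLDS`, and `interpret` are hypothetical names, the labels are written without the emoji prefixes, and invalid input raises `ValueError` here instead of returning an error dictionary as the committed code does.

```python
MAX_BATCH = 10  # same limit that batch_embed enforces in app.py

def parse_batch(texts: str) -> list:
    """Split on newlines, drop blank lines, and enforce the batch limit."""
    lines = [line.strip() for line in texts.split("\n") if line.strip()]
    if not lines:
        raise ValueError("Please enter at least one text (one per line)")
    if len(lines) > MAX_BATCH:
        raise ValueError(f"Maximum {MAX_BATCH} texts at a time")
    return lines

# (lower bound, label) pairs, checked from highest to lowest
THRESHOLDS = [
    (0.9, "Nearly identical meaning"),
    (0.7, "Very similar"),
    (0.5, "Somewhat related"),
    (0.3, "Loosely related"),
]

def interpret(score: float) -> str:
    """Map a cosine similarity score to the label of the first bound it meets."""
    for lower_bound, label in THRESHOLDS:
        if score >= lower_bound:
            return label
    return "Different topics"

# Blank and whitespace-only lines are ignored, surrounding spaces are trimmed
print(parse_batch("first text\n\n  second text  \n"))
print(interpret(0.72))
```

Keeping the bounds in a table makes the boundary cases (for example a score of exactly 0.7) easy to cover with a handful of assertions.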
requirements.txt ADDED
@@ -0,0 +1,5 @@
+ gradio>=4.0.0
+ sentence-transformers>=2.2.2
+ torch>=2.0.0
+ numpy>=1.24.0
+
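Review note on versions: the README front matter sets `sdk_version: 4.44.0`, while `requirements.txt` asks for `gradio>=4.0.0`. A quick way to confirm the two agree is a tuple comparison on the numeric parts. This is a simplified sketch that assumes plain `X.Y.Z` version strings; it is not a full PEP 440 parser, and `parse_version` and `satisfies_minimum` are hypothetical names.

```python
def parse_version(version: str) -> tuple:
    """Turn 'X.Y.Z' into a tuple of ints so parts compare numerically."""
    return tuple(int(part) for part in version.split("."))

def satisfies_minimum(installed: str, minimum: str) -> bool:
    """Return True when `installed` meets a '>=' lower bound."""
    return parse_version(installed) >= parse_version(minimum)

# Version from the README front matter against the requirements.txt bound
print(satisfies_minimum("4.44.0", "4.0.0"))
```

Comparing tuples of integers rather than raw strings matters here: as strings, `"4.44.0"` sorts before `"4.5.0"`, whereas numerically 44 is greater than 5.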