Guy DuGan II committed on
Commit d22457e · verified · 1 Parent(s): 0d7f41b

Update README.md

Files changed (1)
  1. README.md +208 -15
README.md CHANGED
@@ -1,28 +1,221 @@
  ---
  title: README
- emoji: 🐠
  colorFrom: indigo
  colorTo: blue
  sdk: static
  pinned: false
  ---
 
- # Within Us AI
 
- **Frontier-grade datasets for agentic, reasoning-driven AI software engineering.**
 
- Within Us AI builds high-signal open datasets designed to train AI systems that
- think, build, verify, and operate like real software engineers.
 
- ## Focus Areas
- - Agentic AI & tool-using workflows
- - Code LLM datasets (tests-as-truth, diff-first patching)
- - Reasoning & evaluation-driven training
- - Secure, governed, auditable AI engineering
- - Cost- and latency-aware autonomous systems
 
- ## Flagship Work
- - **Genesis AI Code**
-   Demo → 10K → 50K → 100K (Frontier)
 
- > *Intelligence is not just generated. It is engineered.*
  ---
  title: README
+ emoji: 🧠
  colorFrom: indigo
  colorTo: blue
  sdk: static
  pinned: false
  ---
 
+ <div align="center">
+ <img src="image.png" width="250" alt="Within Us AI Logo" />
+ </div>
 
+ <br>
 
+ # Frontier AI Systems for Agentic & Self-Evolving Intelligence
 
+ **WithinUsAI** is an independent AI research organization building beyond traditional machine learning pipelines. We design systems that do more than generate outputs: they think, construct, verify, and recursively improve through structured experience.
 
+ Our work spans:
+ * High-signal datasets
+ * Agentic coding systems
+ * Recursive intelligence architectures
+ * Evaluation-driven AI engineering
+ * Model transformation and synthesis
 
+ ---
+
+ ## 🔬 Core Vision
+
+ We believe traditional large language models are approaching structural limits in their ability to learn, adapt, and evolve. Instead of treating intelligence as static, we explore **Developmental Autopoiesis**: AI systems that continuously evolve through recursion, memory, and self-generated experience.
+
+ This shifts AI from:
+ * static training → continuous adaptation
+ * single-pass inference → recursive cognition loops
+ * scaling parameters → designing learning systems
+
+ ---
+
+ ## ⚙️ Research Focus
+
+ ### 🔁 Recursive Intelligence Systems
+ We build architectures that simulate self-improving cognition through:
+ * Recursive Seed AI systems (TRM-style models)
+ * External memory indexing frameworks
+ * Self-reinforcing computation loops
+ * Noogenesis.Concordia.Mind.XI experimental architecture
+
+ ### 💻 Agentic AI & Code Systems
+ We design models that behave like software engineers:
+ * Tool-using workflows
+ * Code generation + verification
+ * Diff-based patching systems
+ * Test-driven reasoning (“tests-as-truth”)
+
+ ### 📚 High-Signal Dataset Engineering
+ Our datasets are designed as training environments, not just corpora:
+ * Python + software engineering datasets
+ * Agentic reasoning traces
+ * Structured evaluation benchmarks
+ * Synthetic multi-domain reasoning corpora
+ * Complex technical and historical text mixtures
+
+ ### ⚡ Efficient AI Deployment
+ We prioritize systems that can actually run and iterate:
+ * GGUF / llama.cpp ecosystems
+ * Low-cost inference pipelines
+ * Multi-GPU & TPU optimized training workflows
+ * Fast experimental cycles over large-scale compute
+
+ ---
+
+ ## 🧬 Model Engineering & Transformation
+
+ A core part of WithinUsAI research is model transformation rather than just training.
+
+ ### 🧠 Fine-Tuning & Training LLMs
+ We design and execute:
+ * Instruction tuning pipelines
+ * Domain-specific adaptation
+ * Reasoning and coding specialization training
+ * Dataset-driven behavioral shaping
+
+ ### 🔀 Merging LLMs
+ We explore:
+ * Weight merging techniques
+ * Architecture blending across model families
+ * Behavior fusion between reasoning + coding models
+ * Cross-model capability transfer
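
As a rough illustration of the simplest kind of weight merging, the sketch below linearly interpolates two sets of parameters. The README does not say which merging methods are used, so this is only one common baseline. It assumes both models share the same architecture and parameter names; plain dictionaries of floats stand in for real checkpoints, and `merge_linear`, `reasoning`, and `coding` are hypothetical names.

```python
from typing import Dict, List

Weights = Dict[str, List[float]]


def merge_linear(model_a: Weights, model_b: Weights, alpha: float = 0.5) -> Weights:
    """Interpolate parameters: alpha * model_a + (1 - alpha) * model_b."""
    if model_a.keys() != model_b.keys():
        raise ValueError("models must share the same parameter names")
    merged: Weights = {}
    for name, values_a in model_a.items():
        values_b = model_b[name]
        if len(values_a) != len(values_b):
            raise ValueError(f"shape mismatch for parameter {name!r}")
        merged[name] = [
            alpha * a + (1.0 - alpha) * b for a, b in zip(values_a, values_b)
        ]
    return merged


# Toy "checkpoints" standing in for a reasoning model and a coding model.
reasoning = {"layer.weight": [1.0, 2.0], "layer.bias": [0.0]}
coding = {"layer.weight": [3.0, 4.0], "layer.bias": [1.0]}

merged = merge_linear(reasoning, coding, alpha=0.5)
print(merged)
```

With `alpha=1.0` the result equals the first model and with `alpha=0.0` the second, so `alpha` acts as a dial between the two behaviors.
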
+
+ ### 🧠 Mixture of Experts (MoE) Model Merging
+ We develop and experiment with:
+ * Sparse expert routing systems
+ * MoE model merging strategies
+ * Expert specialization for coding, reasoning, and tool use
+ * Compute-efficient activation-based intelligence
+
+ *This allows us to build systems where different “parts of intelligence” activate only when needed.*
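
A minimal sketch of top-k sparse routing shows what "activate only when needed" can mean in practice. It is a toy under stated assumptions, not the routing used in any WithinUsAI model: the gate scores are supplied by hand where a real MoE layer would compute them with a learned router, the experts are plain functions on a single float, and `top_k_route` and `make_expert` are hypothetical names.

```python
import math
from typing import Callable, List, Sequence, Tuple

Expert = Callable[[float], float]


def top_k_route(
    x: float, gate_scores: Sequence[float], experts: Sequence[Expert], k: int = 2
) -> Tuple[float, List[int]]:
    """Run only the k highest-scoring experts and mix their outputs."""
    if len(gate_scores) != len(experts):
        raise ValueError("need exactly one gate score per expert")
    if not 1 <= k <= len(experts):
        raise ValueError("k must be between 1 and the number of experts")
    # Indices of the k largest gate scores (ties keep their original order).
    top = sorted(range(len(experts)), key=lambda i: gate_scores[i], reverse=True)[:k]
    # Softmax over the selected scores only, shifted for numerical stability.
    max_score = max(gate_scores[i] for i in top)
    exp_scores = [math.exp(gate_scores[i] - max_score) for i in top]
    total = sum(exp_scores)
    weights = [e / total for e in exp_scores]
    # Unselected experts are never called, which is where compute is saved.
    output = sum(w * experts[i](x) for w, i in zip(weights, top))
    return output, top


calls: List[str] = []


def make_expert(name: str, fn: Expert) -> Expert:
    """Wrap an expert so each invocation is recorded in `calls`."""
    def expert(x: float) -> float:
        calls.append(name)
        return fn(x)
    return expert


experts = [
    make_expert("code", lambda x: x + 1.0),
    make_expert("reasoning", lambda x: x * 2.0),
    make_expert("tools", lambda x: -x),
]

output, selected = top_k_route(3.0, [0.0, 0.0, -5.0], experts, k=2)
print(output, selected, calls)
```

In this run the low-scoring "tools" expert is never invoked; only the two selected experts contribute to the output.
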
+
+ ---
+
+ ## 🧠 Flagship Work
+
+ ### 🔥 Genesis AI Code Series
+ Progressive dataset scaling initiative:
+ * Demo → 10K → 50K → 100K
+ * Designed for frontier coding agent training
+
+ ### 🧬 Core Experimental Systems
+ * GODs.Ghost.Codex.XI (recursive architecture lineages)
+ * MoE sparse reasoning models
+ * Agentic coding frameworks
+ * Recursive seed AI prototypes
+
+ ---
+
+ ## 🤖 Model Ecosystem
+
+ WithinUsAI develops interconnected model families:
+
+ **🧠 Reasoning Models**
+ * Long-context reasoning systems
+ * Uncensored experimental variants
+ * Structured inference models
+
+ **💻 Coding Models**
+ * 0.4B → 8B coding systems
+ * MoE-based efficient coders
+ * LLaMA, Qwen, Gemma-based derivatives
+
+ **🤖 Agentic Systems**
+ * Hermes-style structured agents
+ * Claude/Gemini-inspired hybrid agents
+ * Space-agent reasoning architectures
+
+ ---
+
+ ## 👥 Join the Team
+
+ WithinUsAI is actively expanding and seeking collaborators. We are looking for individuals who want to build systems-level AI, not just fine-tune models.
+
+ ### 🧠 Roles We Are Looking For
+
+ **Model Architecture & Research**
+ * PyTorch / transformer developers
+ * Recursive system designers
+ * MoE architecture experimentation
+ * LLM merging and fine-tuning engineers
+
+ **📊 Dataset Engineering**
+ * Synthetic dataset generation
+ * Reasoning trace construction
+ * Evaluation dataset design
+ * Data pipeline optimization
+
+ **⚙️ Systems & Infrastructure**
+ * Training pipeline engineers
+ * GGUF / inference optimization specialists
+ * Multi-GPU & TPU scaling workflows
+ * Deployment automation
+
+ ---
+
+ ## 🚀 Why Work With Us
+
+ You will be contributing to systems that:
+ * Evolve through structure, not scale alone
+ * Operate as agentic reasoning environments
+ * Integrate datasets, models, and recursive learning loops
+ * Combine fine-tuning, merging, and MoE synthesis into unified workflows
+
+ **We are not building static models. We are building adaptive computational ecosystems.**
+
+ ---
+
+ ## 🤝 How to Get Involved
+
+ If you want to contribute:
+ * Open a discussion or issue on a repository
+ * Propose experiments in training, merging, or MoE design
+
+ ---
+
+ ## 🌌 Vision
+
+ We are working toward a new category of AI: systems that do not just predict text but recursively construct better versions of themselves.
+
+ The future is not one model. It is a network of evolving, specialized intelligence systems working together.
+
+ ---
+
+ ## 📚 Featured Projects
+
+ * **GODs.Ghost.Codex.XI**: recursive architecture framework
+ * **PythonGOD-25k**: high-density coding dataset
+ * **MoE Efficient Coders**: sparse expert systems
+ * **Genesis AI Code Series**: scalable reasoning dataset pipeline
+
+ ---
+
+ ## 🙏 Acknowledgements & Shout-Outs
+
+ WithinUsAI extends our sincere gratitude to the entire open-source community and the major providers who make this research possible. Thank you for letting us experiment with your foundational models, platforms, and datasets!
+
+ A special shout-out to:
+ **Google • OpenAI • Alibaba • IBM • Microsoft • xAI • DeepSeek • Nvidia • Mistral • BigCode**
+
+ ...and to all the independent AI developers pushing the boundaries of what is possible.
+
+ ---
+
+ ## 🧩 Closing Note
+
+ WithinUsAI exists at the intersection of:
+ * Datasets as environments
+ * Models as agents
+ * Recursion as learning
+ * Merging as synthesis
+ * MoE systems as distributed intelligence
+
+ **We are not training models. We are building self-improving computational ecosystems 🧠⚙️**