---
license: apache-2.0
tags:
- graph-reasoning
- non-neural
- cpu-native
- explainable-ai
- vector-symbolic
language:
- en
- multilingual
library_name: webmind
pipeline_tag: text-generation
---
# webmind-brain-v1
A graph-based reasoning engine. Not a neural network. No gradient descent. No GPU required.
The brain learns by building a co-occurrence graph over word vectors, then reasons by converging through the graph. Every answer has a traceable source. Knowledge is editable and deletable.
## Quick Start
```bash
pip install numpy fastapi uvicorn lmdb
```
```python
from webmind import Brain
brain = Brain.from_pretrained("webmind/webmind-brain-v1")
# Teach it something
brain.teach("Paris is the capital of France")
brain.teach("London is the capital of England")
# Ask
result = brain.ask("capital of France")
print(result["answer"]) # paris capital france
print(result["confidence"]) # 0.85
print(result["strategy"]) # convergence / co-occurrence / abstain
# Generate fluent text
gen = brain.generate("Tell me about France", max_tokens=20, temperature=0.7)
print(gen["text"])
# Save
brain.flush()
```
## OpenAI-Compatible Server
```bash
python serve.py
# Then:
curl http://localhost:8000/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{"messages": [{"role": "user", "content": "capital of france"}]}'
```
Supports streaming (`"stream": true`), the `/v1/models` endpoint, and `/health`.
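The same request can be issued from Python. The following is a minimal client sketch using only the standard library. It assumes the server listens on `localhost:8000` as in the curl example and that replies follow the usual OpenAI chat-completions shape (`choices[0].message.content`). The helper names `build_chat_request` and `ask_server` are illustrative and not part of the package.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # assumption: same host/port as the curl example


def build_chat_request(question: str, stream: bool = False) -> urllib.request.Request:
    """Build a POST request for /v1/chat/completions (hypothetical helper)."""
    payload = {
        "messages": [{"role": "user", "content": question}],
        "stream": stream,
    }
    return urllib.request.Request(
        f"{BASE_URL}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask_server(question: str, timeout: float = 10.0) -> str:
    """Send one non-streaming question and return the reply text.

    Assumes an OpenAI-style response body: choices[0].message.content.
    """
    request = build_chat_request(question)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]
```

With `python serve.py` running, `ask_server("capital of france")` should return the reply text, provided the response shape matches the assumption above.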
## Architecture
```
Input -> Garbage Filter (heuristic + LSH)
-> Tier 1: Q→A Direct Lookup (LRU + LMDB, <1ms)
-> Tier 1.5: LSH Semantic Search (O(1) bucket lookup, seed concepts)
-> Tier 2: Convergence Loop (multi-hop reasoning over sparse graph)
-> Co-occurrence Search (complementary sparse signal)
-> Sentence Retrieval (full text from LMDB)
-> Confidence Floor (abstain if < 0.15)
-> Web Search fallback (DuckDuckGo + Wikipedia)
```
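The tier order above amounts to a fall-through chain: each stage either answers or defers to the next, and the confidence floor turns weak results into an abstention. The sketch below shows only that control flow. The stage callables, the `Result` type, and `run_pipeline` are hypothetical stand-ins, not the engine's internals; only the 0.15 floor is taken from the diagram.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence, Tuple

CONFIDENCE_FLOOR = 0.15  # value taken from the diagram above

# A stage takes the query and returns (answer, confidence), or None to defer.
Stage = Callable[[str], Optional[Tuple[str, float]]]


@dataclass
class Result:
    answer: Optional[str]
    confidence: float
    strategy: str


def run_pipeline(query: str, stages: Sequence[Tuple[str, Stage]],
                 floor: float = CONFIDENCE_FLOOR) -> Result:
    """Try each named stage in order; abstain if nothing clears the floor."""
    for name, stage in stages:
        hit = stage(query)
        if hit is None:
            continue  # stage had nothing to offer, fall through
        answer, confidence = hit
        if confidence >= floor:
            return Result(answer, confidence, name)
    return Result(None, 0.0, "abstain")


# Toy stages standing in for the real tiers (hypothetical).
qa_cache = {"capital of france": ("paris", 0.95)}
stages = [
    ("direct-lookup", lambda q: qa_cache.get(q)),
    ("convergence", lambda q: ("weak guess", 0.05)),  # below the floor
]

print(run_pipeline("capital of france", stages).strategy)  # direct-lookup
print(run_pipeline("height of everest", stages).strategy)  # abstain
```

In the diagram an abstention hands off to the web search fallback; that step is left out here and could be modeled as one more stage after the floor check.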
Key properties:
- **Co-occurrence graph**: words that appear together pull toward each other in a sparse matrix
- **Convergence loop**: iteratively search the graph, blending discovered concepts back into the query until the output stabilizes
- **Dual retrieval**: dense neuron search + sparse co-occurrence search race in parallel
- **Successor chains**: each word neuron stores its top-10 successors for generation
- **Confidence tracking**: every neuron has a confidence score that grows when useful and shrinks when not
- **LSH vocabulary filter**: locality-sensitive hashing over MiniLM embeddings for garbage detection, morphological linking ("gravitational"→"gravity"), vocabulary dedup, and O(1) semantic search
- **ScaNN backend**: Google's anisotropic vector quantization for faster ANN search (optional, falls back to LSH)
- **Int8 quantization**: PolarQuant-inspired 4x embedding compression with ~1% accuracy loss
- **Confidence floor**: abstain rather than return weak convergence results (bad context is worse than no context)
- **Vocabulary pruning**: score words by convergence contribution, remove low-value entries
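To make the first two properties concrete, here is a toy sketch of a co-occurrence graph and a convergence loop in plain Python. It is not the engine's implementation: it swaps in a simple random walk with restart over sentence-level co-occurrence counts, and the function names, the `blend` weight, and the stopping tolerance are all assumptions chosen for illustration.

```python
from collections import defaultdict
from itertools import combinations


def build_cooccurrence(sentences):
    """Count how often two words share a sentence (symmetric sparse graph)."""
    graph = defaultdict(lambda: defaultdict(float))
    for sentence in sentences:
        words = sorted(set(sentence.lower().split()))
        for a, b in combinations(words, 2):
            graph[a][b] += 1.0
            graph[b][a] += 1.0
    return graph


def normalize(weights):
    """Scale weights to sum to 1; an empty or all-zero input gives {}."""
    total = sum(weights.values())
    return {w: v / total for w, v in weights.items()} if total else {}


def converge(graph, query_words, blend=0.5, max_hops=10, tol=1e-4):
    """Blend neighbors of the current state back into the query until stable."""
    seed = normalize({w: 1.0 for w in query_words if w in graph})
    state = dict(seed)
    for _ in range(max_hops):
        # One hop: every active word passes its weight to its neighbors.
        spread = defaultdict(float)
        for word, weight in state.items():
            degree = sum(graph[word].values())
            for neighbor, count in graph[word].items():
                spread[neighbor] += weight * count / degree
        # Mix the discovered concepts with the original query.
        new_state = {
            w: (1 - blend) * seed.get(w, 0.0) + blend * spread.get(w, 0.0)
            for w in set(seed) | set(spread)
        }
        delta = sum(abs(new_state.get(w, 0.0) - state.get(w, 0.0))
                    for w in set(new_state) | set(state))
        state = new_state
        if delta < tol:
            break
    return state


graph = build_cooccurrence([
    "paris capital france",
    "london capital england",
    "paris seine river",
])
state = converge(graph, ["capital", "france"])
print(sorted(state, key=state.get, reverse=True)[:3])  # ['capital', 'france', 'paris']
```

The query never mentions `paris`, yet it surfaces because both query words co-occur with it. `seine` also receives a small weight through `paris`, which is the multi-hop effect in miniature.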
## What It Is Good At
- Factual Q&A with traceable sources
- Multi-hop reasoning (convergence crosses concept boundaries)
- Incremental learning (teach new facts at runtime, no retraining)
- Honest failure (says "I don't know" when it doesn't converge)
- Knowledge editing (delete a neuron = delete a fact)
## What It Is Not Good At
- Fluent prose generation (output is concept-oriented, not grammatically polished)
- Creative writing
- Long-form text
- Tasks requiring deep syntactic understanding
## Training Data
This model ships empty. It learns from what you teach it. The `from_pretrained` download includes the graph structure and vocabulary but no pre-loaded knowledge.
For evaluation, we tested on HotpotQA (200 training examples, 50 test examples) and reached 72% exact match using word neurons and successor chains.
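For reference, exact match is usually computed by normalizing both strings and comparing them. The sketch below follows the common SQuAD-style normalization (lowercase, strip punctuation and English articles, collapse whitespace). Whether the evaluation above used exactly this normalization is an assumption, and the function names are illustrative.

```python
import re
import string


def normalize_answer(text: str) -> str:
    """Lowercase, drop punctuation and articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, gold: str) -> bool:
    """True when prediction and gold are identical after normalization."""
    return normalize_answer(prediction) == normalize_answer(gold)


def exact_match_rate(predictions, golds) -> float:
    """Fraction of predictions that exactly match their gold answer."""
    pairs = list(zip(predictions, golds))
    if not pairs:
        return 0.0
    return sum(exact_match(p, g) for p, g in pairs) / len(pairs)


print(exact_match("The Eiffel Tower.", "eiffel tower"))           # True
print(exact_match_rate(["Paris", "rome"], ["paris", "Madrid"]))   # 0.5
```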
## Limitations
- Context window is limited by the convergence loop (no fixed length, but in practice about 10 hops)
- Generation quality depends heavily on what has been taught
- No coreference resolution beyond what convergence provides
- Function words are stripped during reasoning (grammar handled separately)
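Function-word stripping is consistent with the Quick Start answer reading `paris capital france` rather than a full sentence. A minimal sketch of the idea, assuming a small hand-written stop list and a simple regex tokenizer; the engine's actual list and tokenizer are not documented here, so `FUNCTION_WORDS` and `content_words` are illustrative.

```python
import re
from typing import List

# Hypothetical stop list; the engine's real list may differ.
FUNCTION_WORDS = frozenset({
    "a", "an", "the", "of", "is", "are", "was", "in", "on", "to", "and", "or",
})


def content_words(text: str) -> List[str]:
    """Lowercase, tokenize, and drop function words, keeping word order."""
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    return [t for t in tokens if t not in FUNCTION_WORDS]


print(content_words("Paris is the capital of France"))  # ['paris', 'capital', 'france']
```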
## Citation
If you use this work, please cite:
```bibtex
@software{webmind_brain_2026,
title={Webmind Brain: Graph-Based Reasoning Without Neural Networks},
url={https://github.com/webmind-ai/webmind-brain},
year={2026},
license={Apache-2.0}
}
```
## License
Apache 2.0