	---
	license: apache-2.0
	tags:
	- graph-reasoning
	- non-neural
	- cpu-native
	- explainable-ai
	- vector-symbolic
	language:
	- en
	- multilingual
	library_name: webmind
	pipeline_tag: text-generation
	---

	# webmind-brain-v1

	A graph-based reasoning engine. Not a neural network. No gradient descent. No GPU required.

	The brain learns by building a co-occurrence graph over word vectors, then reasons by converging through the graph. Every answer has a traceable source. Knowledge is editable and deletable.

	## Quick Start

	```bash
	pip install numpy fastapi uvicorn lmdb
	```

	```python
	from webmind import Brain

	brain = Brain.from_pretrained("webmind/webmind-brain-v1")

	# Teach it something
	brain.teach("Paris is the capital of France")
	brain.teach("London is the capital of England")

	# Ask
	result = brain.ask("capital of France")
	print(result["answer"]) # paris capital france
	print(result["confidence"]) # 0.85
	print(result["strategy"]) # convergence / co-occurrence / abstain

	# Generate fluent text
	gen = brain.generate("Tell me about France", max_tokens=20, temperature=0.7)
	print(gen["text"])

	# Save
	brain.flush()
	```

	## OpenAI-Compatible Server

	```bash
	python serve.py
	# Then:
	curl http://localhost:8000/v1/chat/completions \
	-H "Content-Type: application/json" \
	-d '{"messages": [{"role": "user", "content": "capital of france"}]}'
	```

	Supports streaming (`"stream": true`), the `/v1/models` endpoint, and `/health`.
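The same request can be issued from Python using only the standard library. This is a minimal sketch: the helper names `build_chat_request` and `ask_server` are hypothetical, the base URL is taken from the curl example, and the response is assumed to be a JSON body (its exact shape is not documented here, so the sketch returns it unparsed beyond JSON decoding).

```python
import json
import urllib.request


def build_chat_request(prompt, stream=False, base_url="http://localhost:8000"):
    """Build (but do not send) a POST request for /v1/chat/completions."""
    payload = {
        "messages": [{"role": "user", "content": prompt}],
        "stream": stream,
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask_server(prompt, base_url="http://localhost:8000", timeout=10):
    """Send a non-streaming request and return the decoded JSON body.

    Requires `python serve.py` to be running; raises urllib.error.URLError
    if the server is unreachable.
    """
    request = build_chat_request(prompt, stream=False, base_url=base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the server running, `ask_server("capital of france")` sends the same payload as the curl command above. Streaming responses would need to be read incrementally rather than with a single `read()`.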

	## Architecture

	```
	Input -> Garbage Filter (heuristic + LSH)
	-> Tier 1: Q→A Direct Lookup (LRU + LMDB, <1ms)
	-> Tier 1.5: LSH Semantic Search (O(1) bucket lookup, seed concepts)
	-> Tier 2: Convergence Loop (multi-hop reasoning over sparse graph)
	-> Co-occurrence Search (complementary sparse signal)
	-> Sentence Retrieval (full text from LMDB)
	-> Confidence Floor (abstain if < 0.15)
	-> Web Search fallback (DuckDuckGo + Wikipedia)
	```
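One way to read the diagram is as a cascade that tries cheaper tiers first and abstains when nothing clears the confidence floor. The sketch below illustrates that reading only. The function `answer_with_tiers`, the toy tiers, and the result layout are assumptions for illustration, not the engine's actual code; the real pipeline also passes seed concepts between tiers, which is omitted here.

```python
from typing import Callable, List, Optional, Tuple

CONFIDENCE_FLOOR = 0.15  # threshold shown in the diagram above

# A tier takes a query and returns (answer, confidence), or None on a miss.
Tier = Callable[[str], Optional[Tuple[str, float]]]


def answer_with_tiers(query: str,
                      tiers: List[Tuple[str, Tier]],
                      floor: float = CONFIDENCE_FLOOR) -> dict:
    """Try each tier in order; return the first result at or above the floor."""
    for name, tier in tiers:
        result = tier(query)
        if result is None:
            continue  # this tier had nothing to say
        answer, confidence = result
        if confidence >= floor:
            return {"answer": answer, "confidence": confidence, "strategy": name}
    # Nothing was confident enough: abstain instead of returning a weak answer.
    return {"answer": None, "confidence": 0.0, "strategy": "abstain"}


# Toy tiers (hypothetical): a direct lookup table and a deliberately weak fallback.
_DIRECT = {"capital of france": ("paris capital france", 0.85)}


def direct_lookup(query: str) -> Optional[Tuple[str, float]]:
    return _DIRECT.get(query.lower())


def weak_convergence(query: str) -> Optional[Tuple[str, float]]:
    return ("unrelated concept", 0.05)  # below the floor, so it is rejected
```

Calling `answer_with_tiers` with `[("direct", direct_lookup), ("convergence", weak_convergence)]` answers a known query from the first tier and abstains on an unknown one, because the fallback's confidence is under the floor.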

	Key properties:
	- Co-occurrence graph: words that appear together pull toward each other in a sparse matrix
	- Convergence loop: iteratively search the graph, blending discovered concepts back into the query until the output stabilizes
	- Dual retrieval: dense neuron search + sparse co-occurrence search race in parallel
	- Successor chains: each word neuron stores its top-10 successors for generation
	- Confidence tracking: every neuron has a confidence score that grows when useful and shrinks when not
	- LSH vocabulary filter: locality-sensitive hashing over MiniLM embeddings for garbage detection, morphological linking ("gravitational"→"gravity"), vocabulary dedup, and O(1) semantic search
	- ScaNN backend: Google's anisotropic vector quantization for faster ANN search (optional, falls back to LSH)
	- Int8 quantization: PolarQuant-inspired 4x embedding compression with ~1% accuracy loss
	- Confidence floor: abstain rather than return weak convergence results (bad context is worse than no context)
	- Vocabulary pruning: score words by convergence contribution, remove low-value entries
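The first two properties can be illustrated with a small pure-Python sketch: a sparse co-occurrence graph built from sentences, and a loop that blends neighbouring concepts back into the query until the top concepts stop changing. Everything here is an assumption made for illustration (the function names, the `blend` and `top_k` parameters, the stopping rule, and the use of plain counts instead of word vectors); it is not the engine's implementation.

```python
from collections import defaultdict
from itertools import combinations


def build_cooccurrence(sentences):
    """Count how often each pair of words shares a sentence (sparse, symmetric)."""
    graph = defaultdict(lambda: defaultdict(float))
    for sentence in sentences:
        words = sorted(set(sentence.lower().split()))
        for a, b in combinations(words, 2):
            graph[a][b] += 1.0
            graph[b][a] += 1.0
    return graph


def converge(graph, query_words, blend=0.5, top_k=3, max_hops=10):
    """Spread activation from the query until the top-k concepts stabilize.

    Returns (top_concepts, hops_used).
    """
    activation = {w: 1.0 for w in query_words if w in graph}
    if not activation:
        return [], 0  # no known query words, nothing to converge on
    previous_top = None
    for hop in range(1, max_hops + 1):
        # One hop: each active word pushes weight to its neighbours,
        # proportional to how often they co-occur.
        spread = defaultdict(float)
        for word, weight in activation.items():
            total = sum(graph[word].values())
            for neighbour, count in graph[word].items():
                spread[neighbour] += weight * count / total
        # Blend the discovered concepts back into the running query.
        merged = defaultdict(float)
        for word, weight in activation.items():
            merged[word] += (1.0 - blend) * weight
        for word, weight in spread.items():
            merged[word] += blend * weight
        activation = dict(merged)
        top = sorted(activation, key=lambda w: (-activation[w], w))[:top_k]
        if top == previous_top:
            return top, hop  # output stopped changing
        previous_top = top
    return previous_top, max_hops
```

For example, after `graph = build_cooccurrence(["paris capital france", "london capital england", "paris seine river"])`, calling `converge(graph, ["france"])` pulls in `capital` and `paris` through their shared sentence. The `max_hops` cap mirrors the practical hop limit noted under Limitations.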

	## What It Is Good At

	- Factual Q&A with traceable sources
	- Multi-hop reasoning (convergence crosses concept boundaries)
	- Incremental learning (teach new facts at runtime, no retraining)
	- Honest failure (says "I don't know" when it doesn't converge)
	- Knowledge editing (delete a neuron = delete a fact)
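Traceable sources and knowledge editing can be pictured with a toy store in which every taught sentence keeps an id, and deleting that id removes the fact from the index. The class `ToyFactStore` and its methods are hypothetical and only mimic the idea; a fact id stands in for a neuron, and this is not how the engine stores knowledge.

```python
class ToyFactStore:
    """Minimal illustration of teach / trace / forget over an inverted index."""

    def __init__(self):
        self.facts = {}   # fact_id -> original sentence (the traceable source)
        self.index = {}   # word -> set of fact_ids containing that word
        self._next_id = 0

    def teach(self, sentence):
        """Store a sentence at runtime and index its words; returns the fact id."""
        fact_id = self._next_id
        self._next_id += 1
        self.facts[fact_id] = sentence
        for word in set(sentence.lower().split()):
            self.index.setdefault(word, set()).add(fact_id)
        return fact_id

    def forget(self, fact_id):
        """Delete one fact and every index entry that pointed to it."""
        sentence = self.facts.pop(fact_id)
        for word in set(sentence.lower().split()):
            ids = self.index[word]
            ids.discard(fact_id)
            if not ids:
                del self.index[word]

    def sources(self, query):
        """Return stored sentences sharing words with the query, best match first."""
        votes = {}
        for word in set(query.lower().split()):
            for fact_id in self.index.get(word, ()):
                votes[fact_id] = votes.get(fact_id, 0) + 1
        ranked = sorted(votes, key=lambda f: (-votes[f], f))
        return [self.facts[f] for f in ranked]
```

Teaching the two Quick Start sentences and then calling `forget` on the first id leaves only the London sentence reachable, which is the behaviour the last bullet describes.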

	## What It Is Not Good At

	- Fluent prose generation (output is concept-oriented, not grammatically polished)
	- Creative writing
	- Long-form text
	- Tasks requiring deep syntactic understanding

	## Training Data

	This model ships empty. It learns from what you teach it. The `from_pretrained` download includes the graph structure and vocabulary but no pre-loaded knowledge.

	For evaluation, we tested on HotpotQA (200 training examples, 50 test examples) and reached 72% exact match using word neurons and successor chains.

	## Limitations

	- Context window is limited by the convergence loop (not fixed-length, but practically ~10 hops)
	- Generation quality depends heavily on what has been taught
	- No coreference resolution beyond what convergence provides
	- Function words are stripped during reasoning (grammar handled separately)
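The last limitation explains why the Quick Start answer reads `paris capital france`: function words are dropped before reasoning. A rough sketch of that step follows. The stop list `FUNCTION_WORDS` and the helper `content_words` are assumptions for illustration; the engine's actual list and tokenizer are not documented here.

```python
import re

# Hypothetical, deliberately tiny English stop list.
FUNCTION_WORDS = frozenset({
    "a", "an", "the", "is", "are", "was", "were",
    "of", "in", "on", "to", "and", "or",
})


def content_words(text):
    """Lowercase the text, split it into word tokens, and drop function words."""
    tokens = re.findall(r"\w+", text.lower())
    return [token for token in tokens if token not in FUNCTION_WORDS]
```

Applied to "Paris is the capital of France", this keeps only the three content words, matching the concept-oriented output style described under "What It Is Not Good At".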

	## Citation

	If you use this work, please cite:

	```bibtex
	@software{webmind_brain_2026,
	title={Webmind Brain: Graph-Based Reasoning Without Neural Networks},
	url={https://github.com/webmind-ai/webmind-brain},
	year={2026},
	license={Apache-2.0}
	}
	```

	## License

	Apache 2.0