---
license: apache-2.0
tags:
- graph-reasoning
- non-neural
- cpu-native
- explainable-ai
- vector-symbolic
language:
- en
- multilingual
library_name: webmind
pipeline_tag: text-generation
---

# webmind-brain-v1

A graph-based reasoning engine. Not a neural network. No gradient descent. No GPU required.

The brain learns by building a co-occurrence graph over word vectors, then reasons by converging through the graph. Every answer has a traceable source. Knowledge is editable and deletable.

## Quick Start

```bash
pip install numpy fastapi uvicorn lmdb
```

```python
from webmind import Brain

brain = Brain.from_pretrained("webmind/webmind-brain-v1")

# Teach it something
brain.teach("Paris is the capital of France")
brain.teach("London is the capital of England")

# Ask
result = brain.ask("capital of France")
print(result["answer"])      # paris capital france
print(result["confidence"])  # 0.85
print(result["strategy"])    # convergence / co-occurrence / abstain

# Generate fluent text
gen = brain.generate("Tell me about France", max_tokens=20, temperature=0.7)
print(gen["text"])

# Save
brain.flush()
```

## OpenAI-Compatible Server

```bash
python serve.py

# Then:
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"messages": [{"role": "user", "content": "capital of france"}]}'
```

Supports streaming (`"stream": true`), the `/v1/models` endpoint, and `/health`.
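
The curl request above can also be issued from Python using only the standard library. The following is a minimal sketch, not part of the `webmind` package: the helper names `build_chat_request`, `extract_answer`, and `ask` are hypothetical, and it assumes the server replies in the usual OpenAI chat-completions shape (`choices[0].message.content`), which the "OpenAI-compatible" label suggests but this README does not spell out.

```python
import json
import urllib.request

# Assumption: `python serve.py` listens on localhost:8000, as in the curl example.
BASE_URL = "http://localhost:8000"


def build_chat_request(prompt, stream=False):
    """Build the JSON payload for /v1/chat/completions."""
    payload = {"messages": [{"role": "user", "content": prompt}]}
    if stream:
        # Mirrors the `"stream": true` flag mentioned above.
        payload["stream"] = True
    return payload


def extract_answer(response):
    """Pull the assistant text out of an OpenAI-style response dict.

    Assumption: the text is nested under choices[0].message.content.
    """
    return response["choices"][0]["message"]["content"]


def ask(prompt, base_url=BASE_URL, timeout=10.0):
    """POST one non-streaming question and return the answer text."""
    body = json.dumps(build_chat_request(prompt)).encode("utf-8")
    request = urllib.request.Request(
        base_url + "/v1/chat/completions",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as reply:
        return extract_answer(json.load(reply))
```

With the server running, `ask("capital of france")` sends the same request as the curl command. Streaming responses are not handled here; they would need the reply read incrementally rather than parsed as a single JSON document.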

## Architecture

```
Input
  -> Garbage Filter (heuristic + LSH)
  -> Tier 1: Q→A Direct Lookup (LRU + LMDB, <1ms)
  -> Tier 1.5: LSH Semantic Search (O(1) bucket lookup, seed concepts)
  -> Tier 2: Convergence Loop (multi-hop reasoning over sparse graph)
  -> Co-occurrence Search (complementary sparse signal)
  -> Sentence Retrieval (full text from LMDB)
  -> Confidence Floor (abstain if < 0.15)
  -> Web Search fallback (DuckDuckGo + Wikipedia)
```

Key properties:

- **Co-occurrence graph**: words that appear together pull toward each other in a sparse matrix
- **Convergence loop**: iteratively search the graph, blending discovered concepts back into the query until the output stabilizes
- **Dual retrieval**: dense neuron search + sparse co-occurrence search race in parallel
- **Successor chains**: each word neuron stores its top-10 successors for generation
- **Confidence tracking**: every neuron has a confidence score that grows when useful and shrinks when not
- **LSH vocabulary filter**: locality-sensitive hashing over MiniLM embeddings for garbage detection, morphological linking ("gravitational"→"gravity"), vocabulary dedup, and O(1) semantic search
- **ScaNN backend**: Google's anisotropic vector quantization for faster ANN search (optional, falls back to LSH)
- **Int8 quantization**: PolarQuant-inspired 4x embedding compression with ~1% accuracy loss
- **Confidence floor**: abstain rather than return weak convergence results (bad context is worse than no context)
- **Vocabulary pruning**: score words by convergence contribution, remove low-value entries

## What It Is Good At

- Factual Q&A with traceable sources
- Multi-hop reasoning (convergence crosses concept boundaries)
- Incremental learning (teach new facts at runtime, no retraining)
- Honest failure (says "I don't know" when it doesn't converge)
- Knowledge editing (delete a neuron = delete a fact)

## What It Is Not Good At

- Fluent prose generation (output is concept-oriented, not grammatically polished)
- Creative writing
- Long-form text
- Tasks requiring deep syntactic understanding

## Training Data

This model ships empty. It learns from what you teach it. The `from_pretrained` download includes the graph structure and vocabulary but no pre-loaded knowledge.

For evaluation, we tested on HotPotQA (200 training examples, 50 test examples) and reached 72% exact match using word neurons plus successor chains.

## Limitations

- Context window is limited by the convergence loop (not fixed-length, but in practice about 10 hops)
- Generation quality depends heavily on what has been taught
- No coreference resolution beyond what convergence provides
- Function words are stripped during reasoning (grammar is handled separately)

## Citation

If you use this work, please cite:

```bibtex
@software{webmind_brain_2026,
  title={Webmind Brain: Graph-Based Reasoning Without Neural Networks},
  url={https://github.com/webmind-ai/webmind-brain},
  year={2026},
  license={Apache-2.0}
}
```

## License

Apache 2.0