---
title: GASM Enhanced - Geometric Language AI
emoji:
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 4.16.0
app_file: app.py
pinned: false
license: cc-by-nd-4.0
---
# GASM Enhanced - Geometric Attention for Spatial Understanding

> *Bridging natural language and geometric reasoning through SE(3)-invariant neural architectures*

## What Makes This Different?

Traditional AI understands *what* objects are mentioned, but struggles with *where* they are and *how* they relate spatially. GASM changes this.

**GASM** (Geometric Attention for Spatial & Mathematical understanding) represents a breakthrough in AI spatial reasoning:

- **Advanced NLP**: Goes beyond keywords with spaCy + semantic categorization
- **Proper 3D Math**: Uses SE(3) Lie groups for mathematically correct spatial relationships
- **Geometric Optimization**: Minimizes curvature on Riemannian manifolds for optimal layouts
- **Real-time Visualization**: Shows spatial understanding in live 3D geometry
## What This Enables

### The Spatial Intelligence Gap

Current language models handle the first question well, but not the second:

- ✅ "What is a keyboard?" → *An input device*
- ❌ "Where is the keyboard relative to the monitor?" → *Spatial confusion*

GASM bridges this gap through mathematical spatial reasoning.

### Real Applications

This isn't just a demo - GASM addresses actual problems in:

- **Robotics**: "Move the component above the platform" → Precise 3D coordinates
- **Scientific Modeling**: "The electron orbits the nucleus" → Proper geometric relationships
- **Engineering**: "Place the support between the beams" → Constraint satisfaction
- **AR/VR**: Natural language to 3D scene understanding
## Try It Yourself

### Watch GASM in Action

Input any sentence with spatial relationships:

> *"The ball lies left of the table next to the computer, while the book sits between the keyboard and the monitor."*

**GASM Output:**

- **6 entities identified**: ball, table, computer, book, keyboard, monitor
- **5 spatial relations**: left_of, next_to, between
- **3D geometric layout** with proper SE(3) positioning
- **Curvature evolution** showing geometric convergence

### More Examples

**Robotics**: *"The robotic arm moves the satellite component above the assembly platform."*

**Scientific**: *"The electron orbits the nucleus while the magnetic field flows through the crystal."*

**Everyday**: *"The red car parks between two buildings near the park entrance."*

### What You'll See

1. **Advanced Entity Recognition**: Far beyond simple keyword matching
2. **Spatial Relationship Extraction**: Understands "left of", "between", "above" in context
3. **3D Visualization**: Real geometric positioning in proper 3D space
4. **Mathematical Convergence**: Curvature evolution showing optimization progress
## Project Structure

```
GASM-Huggingface/
├── app.py                 # Main Gradio application with complete interface
├── gasm_core.py           # Core GASM implementation with SE(3) math
├── fastapi_endpoint.py    # Optional API endpoints (standalone)
├── requirements.txt       # Python dependencies
└── README.md              # This file
```
## The Mathematics Behind GASM

### What Makes It Special

Unlike traditional NLP that treats text as sequences of tokens, GASM understands geometry:

**1. SE(3) Invariant Processing**

- Uses Special Euclidean Group SE(3) for proper 3D transformations
- Maintains mathematical correctness under rotations and translations
- Employs Lie group operations for geometric learning
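
To make the invariance claim concrete, here is a minimal sketch (not taken from `gasm_core.py`) showing that a rigid motion in SE(3), a rotation followed by a translation, leaves all pairwise distances in a layout unchanged. The helper names and the sample positions are hypothetical and chosen only for illustration.

```python
import math


def rotation_z(theta):
    """Return the 3x3 rotation matrix for a rotation about the z-axis."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def apply_se3(rotation, translation, point):
    """Apply the rigid motion x -> R x + t to a single 3D point."""
    return tuple(
        sum(rotation[i][j] * point[j] for j in range(3)) + translation[i]
        for i in range(3)
    )


def pairwise_distances(points):
    """Euclidean distances for every unordered pair of points."""
    return [math.dist(p, q) for i, p in enumerate(points) for q in points[i + 1:]]


# Hypothetical entity positions, for illustration only.
layout = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.5, 2.0, 1.0)]
R, t = rotation_z(math.pi / 3), (4.0, -2.0, 0.5)
moved = [apply_se3(R, t, p) for p in layout]

before = pairwise_distances(layout)
after = pairwise_distances(moved)
# Distances agree up to floating-point error, so the relative layout is preserved.
invariant = all(math.isclose(a, b, abs_tol=1e-9) for a, b in zip(before, after))
```

Any quantity computed only from such distances is therefore unaffected by where the scene sits in space or how it is oriented.
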

**2. Advanced Entity Recognition**

- **spaCy NLP**: Part-of-speech tagging + named entity recognition
- **Semantic Filtering**: Domain-specific vocabularies (robotics, scientific, everyday)
- **Contextual Understanding**: Extracts objects from spatial prepositions

**3. Geometric Optimization**

- **Geodesic Distances**: Shortest paths on SE(3) manifold
- **Discrete Curvature**: Graph Laplacian eigenvalue-based computation
- **Energy Minimization**: Constraint satisfaction via Lagrange multipliers
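
The energy-minimization idea can be illustrated with a simplified stand-in. The sketch below does not use Lagrange multipliers; it swaps in a plain quadratic penalty on target distances and minimizes it with gradient descent. The entity names, target distances, and learning rate are assumptions made for the example, not values from the project.

```python
import math


def layout_energy(positions, constraints):
    """Sum of squared violations of the target distance for each entity pair."""
    return sum(
        (math.dist(positions[a], positions[b]) - target) ** 2
        for a, b, target in constraints
    )


def gradient_step(positions, constraints, lr=0.1):
    """Take one gradient-descent step on layout_energy and return new positions."""
    grads = {name: [0.0, 0.0, 0.0] for name in positions}
    for a, b, target in constraints:
        d = math.dist(positions[a], positions[b])
        if d == 0.0:
            continue  # direction is undefined for coincident points
        scale = 2.0 * (d - target) / d
        for k in range(3):
            g = scale * (positions[a][k] - positions[b][k])
            grads[a][k] += g
            grads[b][k] -= g
    return {
        name: tuple(p - lr * g for p, g in zip(positions[name], grads[name]))
        for name in positions
    }


# Hypothetical starting layout and constraints: (entity, entity, target distance).
positions = {
    "ball": (0.0, 0.0, 0.0),
    "table": (3.0, 0.0, 0.0),
    "computer": (3.0, 0.2, 0.0),
}
constraints = [("ball", "table", 1.0), ("table", "computer", 1.0)]

history = []  # energy per iteration, analogous to a convergence curve
for _ in range(200):
    history.append(layout_energy(positions, constraints))
    positions = gradient_step(positions, constraints)
final_energy = layout_energy(positions, constraints)
```

Plotting `history` gives a decreasing curve of the same kind as a convergence plot, although the quantity tracked here is this sketch's penalty energy rather than GASM's curvature.
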
### Technical Architecture

```
Text → spaCy NLP → Entity Extraction → Semantic Filtering
  ↓
SE(3) Embedding → Attention Mechanism → Geometric Refinement
  ↓
Constraint Satisfaction → Curvature Optimization → 3D Visualization
```

### Why This Matters

Most AI systems use simple word embeddings that lose spatial meaning. GASM preserves geometric relationships through mathematically principled operations, enabling true spatial understanding.
## Visualizations

The Space provides two main visualizations:

### 1. Curvature Evolution Plot

- Shows geometric convergence over iterations
- Displays SE(3) manifold optimization progress
- Uses matplotlib with dark theme for clarity

### 2. 3D Entity Space Plot

- Interactive 3D positioning of extracted entities
- Color-coded by entity type (robotic, physical, spatial, etc.)
- Shows relationship connections between entities
## How It Works

1. **Text Input**: The user provides text for analysis
2. **Entity Extraction**: Regex-based extraction of meaningful entities
3. **Relation Detection**: Identification of spatial, temporal, and physical relations
4. **GASM Processing**: When the GASM core is available, a real SE(3) forward pass through the geometric manifold
5. **Visualization**: Generation of the curvature evolution and 3D entity plots
6. **Results**: Comprehensive analysis with JSON output
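
Steps 2 and 3 can be sketched with a few regular expressions. This is an illustrative approximation, not the patterns used in `app.py`: the relation vocabulary, the function names, and the assumption that every entity is introduced by "the" are all hypothetical.

```python
import re

# Hypothetical vocabulary mapping a phrase to a relation label.
BINARY_RELATIONS = {"left of": "left_of", "next to": "next_to", "above": "above"}


def extract_relations(text):
    """Return (relation, subject, objects) triples found by simple patterns."""
    triples = []
    for phrase, label in BINARY_RELATIONS.items():
        # "the <subject> [verb] <phrase> the <object>"
        pattern = rf"\bthe (\w+) (?:\w+ )?{phrase} the (\w+)"
        for m in re.finditer(pattern, text, flags=re.IGNORECASE):
            triples.append((label, m.group(1), (m.group(2),)))
    # "between" takes two objects, so it gets its own pattern.
    between = r"\bthe (\w+) (?:\w+ )?between the (\w+) and the (\w+)"
    for m in re.finditer(between, text, flags=re.IGNORECASE):
        triples.append(("between", m.group(1), (m.group(2), m.group(3))))
    return triples


def collect_entities(triples):
    """List each entity once, in order of first appearance in the triples."""
    seen = []
    for _, subject, objects in triples:
        for name in (subject, *objects):
            if name not in seen:
                seen.append(name)
    return seen


sentence = (
    "The ball lies left of the table next to the computer, "
    "while the book sits between the keyboard and the monitor."
)
triples = extract_relations(sentence)
entities = collect_entities(triples)
```

On the example sentence this sketch yields three triples and the six entities listed earlier. The real pipeline counts relations differently, so the totals are not expected to match the Space's output.
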
## Performance

- **CPU Mode**: Optimized for HuggingFace Spaces CPU allocation
- **GPU Fallback**: Automatic ZeroGPU usage when available
- **Memory Efficient**: ~430MB total memory footprint
- **Fast Processing**: 0.1-0.8s processing time depending on text length
## Local Development

To run locally:

```bash
git clone <this-repo>
cd GASM-Huggingface
# Install dependencies
pip install -r requirements.txt
# Run the application
python app.py
```
## Space Configuration

This Space is configured with:

- **SDK**: Gradio 4.44.1+
- **Python**: 3.8+
- **GPU**: ZeroGPU compatible (A10G/T4 fallback)
- **Memory**: 16GB RAM allocation
- **Storage**: Persistent storage for model caching
## API Endpoints

FastAPI endpoints are also available when `fastapi_endpoint.py` is run separately:

- `POST /process`: Process text with geometric enhancement
- `GET /health`: Health check and memory usage
- `GET /info`: Model configuration information
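
A client call to `POST /process` might look like the sketch below. The request schema is not documented here, so the JSON field name `text`, the base URL `http://localhost:8000`, and the helper name are assumptions; check `fastapi_endpoint.py` for the actual model before relying on them.

```python
import json
import urllib.request


def build_process_request(base_url, text):
    """Build (but do not send) a JSON POST request for the /process endpoint."""
    # Assumption: the endpoint accepts a JSON body with a "text" field.
    payload = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/process",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Assumption: the API server was started locally on port 8000.
request = build_process_request(
    "http://localhost:8000",
    "The book sits between the keyboard and the monitor.",
)

# To send it against a running server:
# with urllib.request.urlopen(request, timeout=30) as response:
#     result = json.loads(response.read())
```
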
## Use Cases

Perfect for analyzing:

- **Technical Documentation**: Spatial relationships in engineering texts
- **Scientific Literature**: Physical phenomena and experimental setups
- **Educational Content**: Geometry and physics explanations
- **Robotic Systems**: Assembly instructions and spatial configurations
## Model Details

- **Base Architecture**: Built on transformer foundations
- **Geometric Processing**: SE(3) Lie group operations
- **Attention Mechanism**: Geodesic distance-based attention weighting
- **Curvature Computation**: Discrete Gaussian curvature via graph Laplacian
- **Constraint Handling**: Energy minimization with Lagrange multipliers
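
One simple way to read "distance-based attention weighting" is a softmax over negative pairwise distances, so that nearby entities attend to each other more strongly. The sketch below uses plain Euclidean distance as a stand-in for the geodesic distance on SE(3); the function name, the `temperature` parameter, and the sample positions are hypothetical.

```python
import math


def distance_attention(positions, temperature=1.0):
    """Row-normalized weights from a softmax over negative pairwise distances."""
    weights = []
    for p in positions:
        scores = [-math.dist(p, q) / temperature for q in positions]
        peak = max(scores)  # subtract the max for numerical stability
        exps = [math.exp(s - peak) for s in scores]
        total = sum(exps)
        weights.append([e / total for e in exps])
    return weights


# Hypothetical entity positions along one axis.
positions = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (5.0, 0.0, 0.0)]
attn = distance_attention(positions)
```

Each row of `attn` sums to one, and the first entity gives more weight to its neighbor at distance 1 than to the entity at distance 5.
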
## Why This Matters

### Current State of AI

- ✅ Excellent at text understanding and generation
- ✅ Great at image recognition and computer vision
- ❌ **Struggles with spatial reasoning from language**
- ❌ **Can't bridge text → 3D geometry gap**

### GASM's Contribution

GASM represents a step toward AI that understands space the way humans do - not just as coordinates, but as meaningful geometric relationships between objects in the world.

**Applications on the horizon:**

- Robots that understand spatial instructions naturally
- AI architects that reason about 3D spaces from descriptions
- Scientific AI that models physical systems geometrically
- Game AI that understands spatial gameplay naturally
## Local Development

```bash
git clone https://github.com/scheitelpunk/GASM-Huggingface
cd GASM-Huggingface
pip install -r requirements.txt
python app.py
```

The system gracefully handles missing dependencies with intelligent fallbacks.
## Contributing

This is active research in spatial AI! We welcome:

- Bug reports and edge cases
- New spatial relationship types
- Additional language support
- Evaluation datasets
- Performance optimizations
## License & Citation

Licensed under CC-BY-NC 4.0. For research use, please cite:

```bibtex
@misc{gasm2025,
  title={GASM: Geometric Attention for Spatial Understanding},
  author={Michael Neuberger, Versino PsiOmega GmbH},
  year={2025},
  url={https://huggingface.co/spaces/scheitelpunk/GASM}
}
```
## Built With

- **Hugging Face Spaces** - Deployment platform
- **spaCy** - Advanced NLP processing
- **PyTorch** - Neural network framework
- **Gradio** - Interactive ML interfaces
- **Geomstats** - Geometric computing

---

*GASM: Where language meets geometry, and AI begins to understand space.*

| Built by Michael Neuberger, Versino PsiOmega GmbH |