---
title: Hello World
emoji: "\U0001F916"
colorFrom: blue
colorTo: purple
sdk: static
pinned: false
short_description: "One App to Rule Them All — 146 APIs, 81 emotions"
tags:
  - reachy-mini
  - reachy_mini
  - reachy_mini_python_app
models:
  - onnx-community/yolo26n-ONNX
  - onnx-community/yolo26n-pose-ONNX
  - onnx-community/yolo26s-ONNX
  - onnx-community/yolo26m-ONNX
  - onnx-community/yolo26m-pose-ONNX
  - onnx-community/yolo26s-pose-ONNX
datasets:
  - pollen-robotics/reachy-mini-emotions-library
  - pollen-robotics/reachy-mini-dances-library
thumbnail: >-
  https://huggingface.co/spaces/panny247/hello_world/resolve/main/screenshots/thumbnail.png
---
### Hit the Ground Running

Reachy Mini ships with basic demos. This app gives every new owner **everything** on day one: AI conversation with 31 tools, real-time YOLO vision, 81 emotions, 20 dances, system monitoring, a web shell, music playback, timers, Bluetooth audio, and full motor control. Install it once, open a browser, and your robot is alive.

### A Platform to Build Upon

Not just an app — a **developer platform**. 146 documented REST endpoints with a full OpenAPI spec. Modular Python architecture: each feature is a self-contained module you can study, modify, or replace. Fork it, add your own API endpoints, build new tabs. The codebase is designed to be read and extended.

### Lightweight by Design

Pure Python + vanilla JavaScript. No React, no Vue, no bundler, no node_modules, no build step. The entire app runs on a **Raspberry Pi CM4 with 4GB RAM**. Clone the repo, `pip install -e .`, restart the daemon — you're live in under a minute. Every dependency earns its place.
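The modular architecture mentioned above can be sketched with a simple route registry: each feature module declares its own endpoints against a shared table. The names here (`register`, `ROUTES`, the example path) are illustrative assumptions, not the app's actual API.

```python
# Minimal sketch of a modular route-registry pattern, assuming hypothetical
# names; the real app wires feature modules into FastAPI routers instead.
from typing import Callable, Dict

ROUTES: Dict[str, Callable[[], dict]] = {}

def register(path: str):
    """Decorator that adds a handler function to the shared route registry."""
    def wrap(fn: Callable[[], dict]) -> Callable[[], dict]:
        ROUTES[path] = fn
        return fn
    return wrap

# A feature module only needs to import `register` and declare its routes:
@register("/api/emotions/list")
def list_emotions() -> dict:
    return {"emotions": ["happy", "curious", "sleepy"]}

print(ROUTES["/api/emotions/list"]())  # {'emotions': ['happy', 'curious', 'sleepy']}
```

The point of the pattern is that replacing a feature means swapping one module, with no changes to the core.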
### Dual ONNX Runtime Vision Pipeline

Real-time YOLO inference on **two backends simultaneously** — ONNX Runtime on ARM64 (Pi CM4) and ONNX Runtime Web via WebGPU in the browser. The CM4 runs nano models at 5-8 FPS for always-on detection; the browser GPU accelerates larger models to 30-60 FPS for detailed analysis. Both share results through a unified WebSocket channel, and the robot reacts to what it sees.

### 31-Tool Autonomous AI Agent

Not just a chatbot — a fully embodied AI that can see, move, listen, speak, play music, set timers, take photos, record video, control motors, and create HTML visualizations. Built on LiteLLM for provider-agnostic access to OpenAI, Anthropic, Groq, Gemini, and DeepSeek. The voice pipeline chains VAD, STT, LLM (with tool calling), and TTS into a seamless conversational loop.
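The VAD → STT → LLM → TTS chain described above can be sketched as a pipeline of pluggable stages. The stage signatures and class name are illustrative assumptions; the real app's components (webrtcvad, LiteLLM) have richer interfaces, and tool calling is elided here.

```python
# Hedged sketch of the voice pipeline as composed stages; names and
# signatures are assumptions for illustration, not the app's real API.
from dataclasses import dataclass
from typing import Callable, Optional

@dataclass
class VoicePipeline:
    vad: Callable[[bytes], bool]   # voice activity detection: is anyone speaking?
    stt: Callable[[bytes], str]    # speech-to-text transcription
    llm: Callable[[str], str]      # language model response (tool calling elided)
    tts: Callable[[str], bytes]    # text-to-speech synthesis

    def run(self, audio_frame: bytes) -> Optional[bytes]:
        """Process one audio frame; return reply audio, or None on silence."""
        if not self.vad(audio_frame):
            return None                   # no speech detected, skip the rest
        text = self.stt(audio_frame)      # transcribe the utterance
        reply = self.llm(text)            # generate a conversational reply
        return self.tts(reply)            # synthesize audio for playback

# Usage with stub stages standing in for webrtcvad / STT / LLM / TTS:
pipeline = VoicePipeline(
    vad=lambda a: len(a) > 0,
    stt=lambda a: "hello robot",
    llm=lambda t: f"You said: {t}",
    tts=lambda t: t.encode(),
)
print(pipeline.run(b"\x01\x02"))  # b'You said: hello robot'
```

Keeping each stage behind a plain callable is what makes the chain provider-agnostic: swapping an LLM provider only changes one slot.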
### MuJoCo-Class 3D Simulation

Full URDF robot model rendered with Three.js and post-processing (bloom, SMAA). Live WebSocket pose data at 15Hz creates a real-time digital twin. Skin textures, background scenes, interactive orbit controls. Every emotion and dance can be previewed in 3D before playing on the physical robot.

### 146 Endpoints, Zero Build Steps

The backend exposes a full REST API with OpenAPI documentation — every feature is programmable. The frontend is 24 vanilla JS modules with no framework, no transpilation, no bundler. Read the source, change it, reload. This is a codebase designed for developers who want to understand what they're running.
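Consuming one frame of the 15Hz pose channel might look like the sketch below. The message shape (a `timestamp` plus a `joints` map of radians) is a guess for illustration; the real payload format may differ.

```python
# Hedged sketch: decoding one pose frame from the WebSocket channel.
# The JSON schema here is an assumption, not the app's documented format.
import json

# At 15 Hz, a new frame arrives roughly every 67 ms:
FRAME_INTERVAL_S = 1 / 15

raw = '{"timestamp": 1700000000.0, "joints": {"neck_yaw": 0.12, "neck_pitch": -0.05}}'

frame = json.loads(raw)
for name, angle_rad in frame["joints"].items():
    # Joint angles in radians drive the Three.js digital twin directly.
    print(f"{name}: {angle_rad:+.3f} rad")
```

At this rate the browser simply applies each frame to the URDF model on arrival, no interpolation required for smooth motion.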
| Category | Technologies |
|---|---|
| Inference & Simulation | ONNX Runtime (ARM64 CPU inference on Pi CM4) • ONNX Runtime Web (WebGPU + WASM browser inference) • MuJoCo (physics-grade URDF robot model) |
| Pollen Robotics | reachy-mini SDK (ReachyMiniApp base class, motor control, audio pipeline) • Reachy Mini (9-DOF expressive robot head, Pi CM4, camera, mic/speaker) |
| HuggingFace | HuggingFace Hub (on-demand YOLO model downloads with disk space checking) • HuggingFace Spaces (community distribution) |
| AI / ML | YOLO v8/11 (detection, pose, segmentation, open vocabulary) • LiteLLM (unified multi-provider LLM/TTS/STT) • webrtcvad (voice activity detection) |
| Backend | FastAPI (146 REST endpoints + 6 WebSocket channels) • Python 3.10+ • OpenCV (image processing, video recording) |
| Frontend | Vanilla JavaScript (zero framework, 24 modules) • Three.js (3D URDF rendering) • xterm.js (terminal emulator) • WebRTC (camera streaming) • WebGPU (browser-side ML inference) |
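Since every feature in the stack above is reachable over REST, scripting the robot needs nothing beyond the Python standard library. The host, port, and endpoint path below are illustrative assumptions; FastAPI serves the real route list as interactive OpenAPI docs at `/docs`.

```python
# Hedged sketch of calling the app's REST API with only the standard library.
# Host, port, and path are hypothetical; check the OpenAPI spec for real routes.
import urllib.request

BASE = "http://reachy-mini.local:8000"  # hypothetical host/port for the robot

def api_get(path: str) -> urllib.request.Request:
    """Build a GET request for one of the REST endpoints."""
    return urllib.request.Request(BASE + path, headers={"Accept": "application/json"})

req = api_get("/api/status")
print(req.full_url)      # http://reachy-mini.local:8000/api/status
print(req.get_method())  # GET
# On a live robot you would then call urllib.request.urlopen(req).
```

The same requests work from curl, a cron job, or another app on the network, which is what makes the 146 endpoints a platform rather than a UI.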