Spaces:

meshllm
/

README

Running

App Files Files Community

README / README.md

micdn

Update README.md

097dfdf verified 24 days ago

preview code

raw

history blame contribute delete

1.37 kB

metadata

title: README
emoji: 🐨
colorFrom: red
colorTo: yellow
sdk: static
pinned: false
thumbnail: >-
  https://cdn-uploads.huggingface.co/production/uploads/60fa66a3c4c6bd8c56ee541f/FEldjz5JpuoMy8dauKA2M.png
short_description: pool compute for huge model inference

mesh-llm turns spare compute into a peer-to-peer inference cloud for open models.

mesh-llm pools GPUs across macOS and Linux machines so teams, researchers, and agents can run local or open-weight models through one OpenAI-compatible endpoint. It can serve a model on one node, distribute large models across nearby peers, route requests to specialized models, and let agents coordinate through mesh gossip.

What it is for

Share spare GPU capacity across trusted machines.
Run open models locally without a centralized inference provider.
Serve an OpenAI-compatible API at http://localhost:9337/v1.
Route requests across multiple nodes, models, and capabilities.
Experiment with distributed inference, MoE expert sharding, and agent collaboration.

see: https://docs.anarchai.org/ and: https://github.com/mesh-LLM/

Mesh uses a pipelined/network aware distributed inference approach built on llama.cpp called "skippy" - https://github.com/Mesh-LLM/hf-mesh-skippy-splitter contains current code which prepares models so layers can be efficiently JIT downloaded for participating nodes.