Spaces:

meshllm
/

README

Running

README / README.md

Update README.md

097dfdf verified 24 days ago

1.37 kB

	---
	title: README
	emoji: 🐨
	colorFrom: red
	colorTo: yellow
	sdk: static
	pinned: false
	thumbnail: >-
	https://cdn-uploads.huggingface.co/production/uploads/60fa66a3c4c6bd8c56ee541f/FEldjz5JpuoMy8dauKA2M.png
	short_description: pool compute for huge model inference
	---

	mesh-llm turns spare compute into a peer-to-peer inference cloud for open models.

	mesh-llm pools GPUs across macOS and Linux machines so teams, researchers, and agents can run local or open-weight models through one OpenAI-compatible endpoint. It can serve a model on one node, distribute large models across nearby peers, route requests to specialized models, and let agents coordinate through mesh gossip.


	What it is for
	* Share spare GPU capacity across trusted machines.
	* Run open models locally without a centralized inference provider.
	* Serve an OpenAI-compatible API at http://localhost:9337/v1.
	* Route requests across multiple nodes, models, and capabilities.
	* Experiment with distributed inference, MoE expert sharding, and agent collaboration.


	see: https://docs.anarchai.org/
	and: https://github.com/mesh-LLM/

	Mesh uses a pipelined/network aware distributed inference approach built on llama.cpp called "skippy" - https://github.com/Mesh-LLM/hf-mesh-skippy-splitter contains current code which prepares models so layers can be efficiently JIT downloaded for participating nodes.