library_name: mesh-llm
base_model:
  - unsloth/DeepSeek-R1-GGUF
pipeline_tag: text-generation
tags:
  - gguf
  - mesh-llm
  - layer-package
  - skippy
  - distributed-inference
  - local-inference
  - openai-compatible
Mesh LLM

DeepSeek-R1-Q4_K_M

Distributed GGUF inference package for Mesh LLM


GGUF layer package for running DeepSeek-R1-Q4_K_M across a local Mesh LLM cluster.

This package is derived from unsloth/DeepSeek-R1-GGUF, with the original GGUF distribution split into per-layer artifacts for distributed inference.

Highlights

| Run locally | Pool multiple machines | OpenAI-compatible | Package variant |
| --- | --- | --- | --- |
| Private inference on your hardware | Split layers across peers | Serve /v1/chat/completions locally | Q4_K_M layer package |

Model Overview

| Property | Value |
| --- | --- |
| Source model | unsloth/DeepSeek-R1-GGUF |
| Model id | unsloth/DeepSeek-R1-GGUF:DeepSeek-R1-Q4_K_M |
| Family | DeepSeek |
| Parameter scale | not recorded |
| Quantization | Q4_K_M |
| Layer count | 61 |
| Activation width | 7168 |
| Package size | 377.0 GB |
| Source file | DeepSeek-R1-Q4_K_M/DeepSeek-R1-Q4_K_M-00001-of-00009.gguf |
| Package repo | meshllm/DeepSeek-R1-Q4_K_M-layers |

Recommended Use

  • Local and private inference with Mesh LLM.
  • Multi-machine serving when the full GGUF is too large for one host.
  • OpenAI-compatible chat/completions workflows through Mesh LLM's local API.
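
To give a feel for what multi-machine serving means for this package, the sketch below divides the 61 transformer layers into contiguous ranges, one per peer. It is only an even split for illustration: `split_layers` is a hypothetical helper, and Mesh LLM's actual placement policy (which may weigh each peer's available memory) is not described in this card.

```shell
# split_layers TOTAL PEERS
# Prints one line per peer: "<peer> <first_layer> <last_layer>" (0-based, inclusive).
# Hypothetical helper for illustration; not part of the mesh-llm CLI.
split_layers() {
  local total="$1"
  local peers="$2"
  # Reject splits that would leave a peer with no layers.
  if [ "$peers" -lt 1 ] || [ "$peers" -gt "$total" ]; then
    echo "peers must be between 1 and $total" >&2
    return 1
  fi
  local base=$(( total / peers ))
  local extra=$(( total % peers ))
  local peer=0
  local start=0
  local count
  while [ "$peer" -lt "$peers" ]; do
    count=$base
    # The first `extra` peers each take one additional layer.
    if [ "$peer" -lt "$extra" ]; then
      count=$(( base + 1 ))
    fi
    echo "$peer $start $(( start + count - 1 ))"
    start=$(( start + count ))
    peer=$(( peer + 1 ))
  done
}
```

For example, `split_layers 61 4` gives peer 0 layers 0 to 15 and each of the other three peers 15 layers. The layer artifacts total 375.8 GB, so a layer averages about 6.2 GB; if layers were equal in size (they need not be), a 16-layer peer would hold roughly 99 GB of layer weights.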

For upstream architecture details, chat template guidance, sampling recommendations, license terms, and benchmark notes, see the source model card: unsloth/DeepSeek-R1-GGUF.

Quickstart

# Run this on each machine that should contribute memory/compute.
mesh-llm serve --model "meshllm/DeepSeek-R1-Q4_K_M-layers" --split
# Check the mesh and discover the OpenAI-compatible model name.
curl -s http://localhost:3131/api/status
curl -s http://localhost:3131/v1/models
# Send an OpenAI-compatible chat request.
curl -s http://localhost:3131/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "unsloth/DeepSeek-R1-GGUF:DeepSeek-R1-Q4_K_M",
    "messages": [{"role": "user", "content": "Write a tiny hello-world function in Rust."}],
    "max_tokens": 128
  }'
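
A node serving a package of this size may take a while before its API answers, so a script that runs the steps above back to back can fail on the first `curl`. The sketch below is a generic retry wrapper; `retry` is a hypothetical helper, and whether `/api/status` only responds successfully once the mesh is ready to serve is an assumption, not something this card states.

```shell
# retry ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it succeeds or ATTEMPTS runs have failed.
# Hypothetical helper for illustration; not part of the mesh-llm CLI.
retry() {
  local attempts="$1"
  local delay="$2"
  shift 2
  local n=1
  while [ "$n" -le "$attempts" ]; do
    # Stop as soon as the command succeeds.
    if "$@"; then
      return 0
    fi
    # Sleep between attempts, but not after the last one.
    if [ "$n" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    n=$(( n + 1 ))
  done
  return 1
}
```

To poll the status endpoint for up to about five minutes before sending requests, one could run `retry 60 5 curl -sf -o /dev/null http://localhost:3131/api/status`. The `-f` flag makes `curl` exit non-zero on HTTP error responses, so those count as failed attempts too.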

Package Variant

| Property | Value |
| --- | --- |
| Format | layer-package |
| Canonical source ref | unsloth/DeepSeek-R1-GGUF@main/DeepSeek-R1-Q4_K_M/DeepSeek-R1-Q4_K_M-00001-of-00009.gguf |
| Source revision | main |
| Source SHA-256 | d111d9e28b4035e6781906b6451b7866737b4a4ee734baa1575c55d8aa1b4200 |
| Skippy ABI | 0.1.22 |
| Package manifest SHA-256 | f5a62c4f2f5427ac6e083d7666313832cb61a2fc5d8dacfc317b540d8ac82e9d |

What Is Included

| Artifact | Path | Contents | SHA-256 |
| --- | --- | --- | --- |
| Manifest | model-package.json | Package schema, source identity, checksums | f5a62c4f2f5427ac6e083d7666313832cb61a2fc5d8dacfc317b540d8ac82e9d |
| Metadata | shared/metadata.gguf | 0 tensors, 5.0 MB | 0e1bf01f20ef69f691126b69c633fd63977cd190bea4182a1c9f41d1537b0ad6 |
| Embeddings | shared/embeddings.gguf | 1 tensor, 502.1 MB | 951352774cabdce1e5fe940b30ca85ac8440de68afb6c6eceb451524109991f1 |
| Output head | shared/output.gguf | 2 tensors, 730.0 MB | 131fcb580fd1ba667733e39ffc806f819f54928621f40cd47405399a8b9abecb |
| Transformer layers | layers/layer-*.gguf | 61 layer artifacts, 1022 tensors, 375.8 GB | see model-package.json |
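
The checksums listed above can be compared against a local copy of the package. The sketch below assumes the artifacts have already been downloaded into the current directory with the same relative paths, and that GNU `sha256sum` is available (on macOS, `shasum -a 256` prints the same format). `verify_sha256` is a hypothetical helper, not a mesh-llm command.

```shell
# verify_sha256 FILE EXPECTED_HEX
# Returns 0 when FILE's SHA-256 digest equals EXPECTED_HEX, non-zero otherwise.
# Hypothetical helper for illustration; not part of the mesh-llm CLI.
verify_sha256() {
  local file="$1"
  local expected="$2"
  local actual
  # sha256sum prints "<digest>  <path>"; keep only the digest.
  actual="$(sha256sum "$file" | awk '{print $1}')" || return 1
  [ "$actual" = "$expected" ]
}
```

For example, `verify_sha256 shared/metadata.gguf 0e1bf01f20ef69f691126b69c633fd63977cd190bea4182a1c9f41d1537b0ad6` should succeed for an intact download. The per-layer digests are not listed in this card; they are recorded in model-package.json.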

Validation

This package was generated by the Mesh LLM HF Jobs splitter from mesh-llm ref main. Each artifact is checksummed as it is written, uploaded to this repository, and removed from the job workspace before the next artifact is produced.

skippy-model-package write-package "/source/DeepSeek-R1-Q4_K_M/DeepSeek-R1-Q4_K_M-00001-of-00009.gguf" --out-dir "/tmp/meshllm-layer-job-meshllm_DeepSeek-R1-Q4_K_M-layers-137/package"
