OpenCode Deployer committed
Commit · 5fd9c8d
Parent(s): e366a65
update

Files changed:
- Dockerfile +67 -0
- README.md +93 -1
- start-lfm25-server.sh +1 -1
Dockerfile
ADDED
@@ -0,0 +1,67 @@
```dockerfile
# syntax=docker/dockerfile:1
FROM ubuntu:22.04

ENV DEBIAN_FRONTEND=noninteractive
ENV MODEL_FILE="LFM2.5-1.2B-Thinking-Q4_K_M.gguf"
ENV HOST="0.0.0.0"
ENV PORT="7860"
ENV CTX_SIZE="4096"
ENV THREADS="-1"
ENV TEMPERATURE="0.7"
ENV PREDICT_TOKENS="2048"

RUN apt-get update && apt-get install -y \
    curl \
    wget \
    build-essential \
    cmake \
    git \
    libcurl4-openssl-dev \
    python3 \
    python3-pip \
    && rm -rf /var/lib/apt/lists/*

WORKDIR /app

COPY start-lfm25-server.sh /app/start-lfm25-server.sh

# Build the llama-server binary with CMake and install it on the PATH
RUN git clone https://github.com/ggerganov/llama.cpp.git /tmp/llamacpp && \
    cmake -S /tmp/llamacpp -B /tmp/llamacpp/build -DCMAKE_BUILD_TYPE=Release && \
    cmake --build /tmp/llamacpp/build --target llama-server -j"$(nproc)" && \
    cp /tmp/llamacpp/build/bin/llama-server /usr/local/bin/ && \
    rm -rf /tmp/llamacpp

# Bake the quantized model into the image at build time
RUN echo "📥 Downloading LFM2.5-1.2B-Thinking-Q4_K_M.gguf (731MB)..." && \
    curl -L -o "$MODEL_FILE" \
        "https://huggingface.co/LiquidAI/LFM2.5-1.2B-Thinking-GGUF/resolve/main/LFM2.5-1.2B-Thinking-Q4_K_M.gguf" \
        --connect-timeout 60 \
        --max-time 300 && \
    echo "✅ Model download complete"

RUN chmod +x /app/start-lfm25-server.sh

# Write the runtime entrypoint; the quoted EOF delays variable expansion to
# container start, so the ENV defaults above can be overridden at run time
COPY <<'EOF' /app/entrypoint.sh
#!/bin/bash
set -e

echo "🚀 Starting the LFM2.5-1.2B-Thinking-Q4_K_M.gguf HTTP server..."
echo "📁 Model file: $MODEL_FILE"
echo "🌐 Server address: http://$HOST:$PORT"
echo "💬 API endpoint: http://$HOST:$PORT/v1/chat/completions"
echo ""

exec llama-server \
    --model "$MODEL_FILE" \
    --host "$HOST" \
    --port "$PORT" \
    --ctx-size "$CTX_SIZE" \
    --threads "$THREADS" \
    --temp "$TEMPERATURE" \
    --n-predict "$PREDICT_TOKENS" \
    --log-disable \
    --verbose-prompt \
    --api-key "lfm25-api-key"
EOF

RUN chmod +x /app/entrypoint.sh

EXPOSE 7860

CMD ["/app/entrypoint.sh"]
```
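Building llama-server from source plus the 731 MB model download makes this image slow to build, so it is worth a local smoke test before pushing to the Space. A minimal sketch (the container name and polling interval are arbitrary choices):

```bash
# Build the image and start a throwaway container
docker build -t liquidai-lfm25 .
docker run -d --name lfm25-test -p 7860:7860 liquidai-lfm25

# Poll the health endpoint until the model finishes loading
until curl -sf http://localhost:7860/health > /dev/null; do
  echo "waiting for llama-server..."
  sleep 5
done
echo "server is up"

# Clean up
docker rm -f lfm25-test
```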
README.md
CHANGED
@@ -7,4 +7,96 @@ sdk: docker
pinned: false
---
# LiquidAI LFM2.5-1.2B-Thinking HuggingFace Space

An HTTP API server deployment of the LFM2.5-1.2B-Thinking model, based on llama.cpp.

## 🚀 Model Information

- **Model name**: LFM2.5-1.2B-Thinking
- **Quantization**: Q4_K_M
- **File size**: 731 MB
- **Architecture**: Transformer-based language model

## 📡 API Service

### Endpoints

- **Base URL**: `http://localhost:7860`
- **Chat completions**: `POST /v1/chat/completions`
- **Health check**: `GET /health` (see the readiness probe below)
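The health endpoint offers a quick readiness probe before sending chat traffic; llama-server typically serves it without requiring the API key and returns HTTP 200 once the model is loaded:

```bash
# Expect HTTP 200 and a small JSON status once llama-server is ready
curl -i http://localhost:7860/health
```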
### Example Request

```bash
curl -X POST "http://localhost:7860/v1/chat/completions" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer lfm25-api-key" \
  -d '{
    "model": "LFM2.5-1.2B-Thinking-Q4_K_M.gguf",
    "messages": [
      {"role": "user", "content": "Hello, please introduce yourself"}
    ],
    "temperature": 0.7,
    "max_tokens": 2048
  }'
```
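The response follows the OpenAI chat-completions schema, so the reply text lives under `choices[0].message.content`. With `jq` installed (an extra tool, not part of this image), it can be extracted directly:

```bash
# Print only the assistant's reply from the JSON response
curl -s -X POST "http://localhost:7860/v1/chat/completions" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer lfm25-api-key" \
  -d '{
    "model": "LFM2.5-1.2B-Thinking-Q4_K_M.gguf",
    "messages": [{"role": "user", "content": "Say hello in one sentence."}]
  }' | jq -r '.choices[0].message.content'
```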
## 🐳 Docker Deployment

### Local Deployment

```bash
# Build the image
docker build -t liquidai-lfm25 .

# Run the container
docker run -p 7860:7860 liquidai-lfm25
```
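Because the entrypoint reads its settings from environment variables at startup, the defaults baked into the image can be overridden per container without rebuilding, for example:

```bash
# Larger context window and cooler sampling for this run only
docker run -p 7860:7860 \
  -e CTX_SIZE=8192 \
  -e TEMPERATURE=0.2 \
  -e PREDICT_TOKENS=4096 \
  liquidai-lfm25
```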
### Automatic Deployment on HuggingFace Spaces

1. Push the code to the HuggingFace Space repository (see the sketch below)
2. The Space automatically builds and runs the Docker container
3. The service becomes available on port 7860
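Step 1 is an ordinary git push to the Space's git remote; the URL below follows the standard Spaces pattern, with `<user>/<space>` as placeholders for your own account and Space name:

```bash
# Add the Space as a remote and push; the build starts automatically
git remote add space https://huggingface.co/spaces/<user>/<space>
git push space main
```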
## ⚙️ Configuration

- **Listen address**: 0.0.0.0
- **Listen port**: 7860
- **Context size**: 4096 tokens
- **CPU threads**: auto-detected
- **Temperature**: 0.7
- **Max predicted tokens**: 2048
- **API key**: lfm25-api-key (requests must carry it, as checked below)
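Since llama-server is started with `--api-key`, requests to the chat endpoint must carry the matching Bearer token; a request without it should be rejected with an authentication error, which is easy to verify:

```bash
# Expect an HTTP 401 (unauthorized) when the Authorization header is missing
curl -s -o /dev/null -w "%{http_code}\n" \
  -X POST "http://localhost:7860/v1/chat/completions" \
  -H "Content-Type: application/json" \
  -d '{"messages":[{"role":"user","content":"hi"}]}'
```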
## 📊 Monitoring and Logs

On startup, the server prints:
- the model file path
- the server address
- the API endpoint

## 🛠️ Development and Debugging

### Local Development

```bash
# Set up and start the server locally
./start-lfm25-server.sh
```

### Viewing Logs

```bash
# View container logs
docker logs <container_id>
```

---

## 📚 Further Reading

- [llama.cpp documentation](https://github.com/ggerganov/llama.cpp)
- [LiquidAI model repository](https://huggingface.co/LiquidAI)
- [HuggingFace Space configuration reference](https://huggingface.co/docs/hub/spaces-config-reference)
start-lfm25-server.sh
CHANGED
```diff
@@ -8,7 +8,7 @@ set -e
 # Configuration variables
 MODEL_FILE="LFM2.5-1.2B-Thinking-Q4_K_M.gguf"
 HOST="0.0.0.0"
-PORT="
+PORT="7860"
 CTX_SIZE="4096"
 THREADS="-1"  # auto-detect CPU core count
 TEMPERATURE="0.7"
```