javaeeduke commited on
Commit
680f451
·
verified ·
1 Parent(s): 596061c

Update Dockerfile

Browse files
Files changed (1) hide show
  1. Dockerfile +66 -42
Dockerfile CHANGED
@@ -1,50 +1,74 @@
 
 
 
 
 
1
  FROM ubuntu:22.04
2
 
3
- # 1. 安装系统依赖 + Python
4
- RUN apt-get update && apt-get install -y \
5
- wget \
6
- curl \
7
- ca-certificates \
8
- python3 \
9
- python3-pip \
10
- python3-venv \
11
- && rm -rf /var/lib/apt/lists/*
12
 
13
- # 2. 设置工作目录
14
  WORKDIR /app
15
 
16
- # 3. 下载 llamafile(可选,如果你需要它)
17
- RUN curl -L -o llamafile \
 
 
 
 
 
 
 
 
18
  https://github.com/Mozilla-Ocho/llamafile/releases/download/0.9.2/llamafile-0.9.2 \
19
- && chmod +x llamafile
20
-
21
- # 4. 安装 Python 依赖
22
- RUN pip3 install --no-cache-dir fastapi uvicorn
23
-
24
- # 5. 生成 app.py(你的 API 接口)
25
- RUN echo 'import os, subprocess\n\
26
- from fastapi import FastAPI, Query\n\
27
- import uvicorn\n\
28
- app = FastAPI()\n\
29
- @app.get("/")\n\
30
- def index():\n\
31
- return {"status": "running", "msg": "Welcome to Agent-Reach API"}\n\
32
- @app.get("/cmd")\n\
33
- def run_command(q: str = Query(..., description="command")):\n\
34
- try:\n\
35
- result = subprocess.check_output(q, shell=True, stderr=subprocess.STDOUT, text=True)\n\
36
- return {"code": 0, "output": result}\n\
37
- except subprocess.CalledProcessError as e:\n\
38
- return {"code": 1, "output": e.output}\n\
39
- if __name__ == "__main__":\n\
40
- uvicorn.run(app, host="0.0.0.0", port=7860)\n\
41
- ' > /app/app.py
42
-
43
- # 6. 权限(可选)
44
- RUN chmod -R 777 /app
45
-
46
- # 7. 暴露端口
47
  EXPOSE 7860
48
 
49
- # 8. 启动
50
- CMD ["python3", "/app/app.py"]
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ 下面这个可以直接复制到 Hugging Face Space 里的 **`Dockerfile`** 使用。
2
+
3
+ 这个版本是:**llamafile + Qwen2.5-1.5B-Instruct-GGUF + 7860 端口聊天网页**。
4
+
5
+ ```dockerfile
6
  FROM ubuntu:22.04
7
 
8
+ ENV DEBIAN_FRONTEND=noninteractive
 
 
 
 
 
 
 
 
9
 
 
10
  WORKDIR /app
11
 
12
+ # 安装基础工具
13
+ RUN apt-get update && apt-get install -y --no-install-recommends \
14
+ ca-certificates \
15
+ curl \
16
+ wget \
17
+ bash \
18
+ && rm -rf /var/lib/apt/lists/*
19
+
20
+ # 下载 llamafile 主程序
21
+ RUN curl -L --fail \
22
  https://github.com/Mozilla-Ocho/llamafile/releases/download/0.9.2/llamafile-0.9.2 \
23
+ -o /app/llamafile \
24
+ && chmod +x /app/llamafile
25
+
26
+ # 下载 GGUF 模型
27
+ # 这个模型比较适合 HF 免费 CPU Space
28
+ RUN curl -L --fail \
29
+ https://huggingface.co/Qwen/Qwen2.5-1.5B-Instruct-GGUF/resolve/main/qwen2.5-1.5b-instruct-q8_0.gguf \
30
+ -o /app/model.gguf
31
+
32
+ # Hugging Face Space 默认网页端口
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
33
  EXPOSE 7860
34
 
35
+ # 启动 llamafile Web Chat
36
+ CMD ["/app/llamafile", \
37
+ "-m", "/app/model.gguf", \
38
+ "--server", \
39
+ "--host", "0.0.0.0", \
40
+ "--port", "7860", \
41
+ "-ngl", "0", \
42
+ "-t", "2", \
43
+ "-c", "4096"]
44
+ ```
45
+
46
+ 你还需要确认 `README.md` 顶部是这样:
47
+
48
+ ```yaml
49
+ ---
50
+ title: LLM Chat
51
+ emoji: 🤖
52
+ colorFrom: blue
53
+ colorTo: purple
54
+ sdk: docker
55
+ app_port: 7860
56
+ pinned: false
57
+ ---
58
+ ```
59
+
60
+ 提交后等它重新 Build,成功后直接打开:
61
+
62
+ ```text
63
+ https://huggingface.co/spaces/javaeeduke/llm
64
+ ```
65
+
66
+ 就应该能看到 llamafile 的聊天界面。
67
+
68
+ 如果这个 1.5B Q8 模型启动慢,可以把模型换成更小的 0.5B:
69
+
70
+ ```dockerfile
71
+ RUN curl -L --fail \
72
+ https://huggingface.co/Qwen/Qwen2.5-0.5B-Instruct-GGUF/resolve/main/qwen2.5-0.5b-instruct-q8_0.gguf \
73
+ -o /app/model.gguf
74
+ ```