llmAI / minecraft-llm-docs.html
quan3s's picture
Upload minecraft-llm-docs.html
a69c95b verified
<!DOCTYPE html>
<html lang="vi">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Minecraft Bot LLM Backend — System Design</title>
<style>
@import url('https://fonts.googleapis.com/css2?family=JetBrains+Mono:wght@300;400;600;700&family=Syne:wght@400;600;700;800&display=swap');
:root {
--bg: #0d1117;
--surface: #161b22;
--border: #30363d;
--green: #3fb950;
--green-dim:#1a4a26;
--blue: #58a6ff;
--orange: #f0883e;
--red: #ff7b72;
--purple: #bc8cff;
--yellow: #e3b341;
--text: #e6edf3;
--muted: #7d8590;
--code-bg: #0d1117;
}
* { box-sizing: border-box; margin: 0; padding: 0; }
body {
font-family: 'Syne', sans-serif;
background: var(--bg);
color: var(--text);
line-height: 1.6;
}
/* ── Header ── */
header {
background: linear-gradient(135deg, #0d1117 0%, #161b22 50%, #0d1f12 100%);
border-bottom: 1px solid var(--border);
padding: 48px 40px 36px;
position: relative;
overflow: hidden;
}
header::before {
content: '';
position: absolute;
top: -80px; right: -80px;
width: 320px; height: 320px;
border-radius: 50%;
background: radial-gradient(circle, rgba(63,185,80,0.12) 0%, transparent 70%);
}
.header-tag {
font-family: 'JetBrains Mono', monospace;
font-size: 11px;
color: var(--green);
letter-spacing: 3px;
text-transform: uppercase;
margin-bottom: 12px;
}
header h1 {
font-size: clamp(24px, 4vw, 40px);
font-weight: 800;
line-height: 1.1;
margin-bottom: 12px;
}
header h1 span { color: var(--green); }
.header-sub {
color: var(--muted);
font-size: 14px;
font-family: 'JetBrains Mono', monospace;
}
/* ── Layout ── */
.container { max-width: 980px; margin: 0 auto; padding: 40px 24px 80px; }
/* ── Section ── */
section { margin-bottom: 48px; }
.section-label {
font-family: 'JetBrains Mono', monospace;
font-size: 10px;
color: var(--muted);
letter-spacing: 3px;
text-transform: uppercase;
margin-bottom: 8px;
}
h2 {
font-size: 20px;
font-weight: 700;
margin-bottom: 20px;
display: flex;
align-items: center;
gap: 10px;
}
h2 .icon {
width: 28px; height: 28px;
border-radius: 6px;
display: flex; align-items: center; justify-content: center;
font-size: 14px;
flex-shrink: 0;
}
h2 .icon.green { background: var(--green-dim); color: var(--green); }
h2 .icon.blue { background: #0c2a4a; color: var(--blue); }
h2 .icon.orange { background: #3d2206; color: var(--orange); }
h2 .icon.purple { background: #2a1f4a; color: var(--purple); }
/* ── Architecture diagram ── */
.arch {
background: var(--surface);
border: 1px solid var(--border);
border-radius: 12px;
padding: 28px;
font-family: 'JetBrains Mono', monospace;
font-size: 13px;
}
.arch-row {
display: flex;
align-items: center;
gap: 12px;
margin-bottom: 8px;
flex-wrap: wrap;
}
.arch-box {
padding: 8px 16px;
border-radius: 6px;
font-size: 12px;
font-weight: 600;
}
.arch-box.client { background: #0c2a4a; border: 1px solid var(--blue); color: var(--blue); }
.arch-box.auth { background: #3d2206; border: 1px solid var(--orange); color: var(--orange); }
.arch-box.api { background: var(--green-dim); border: 1px solid var(--green); color: var(--green); }
.arch-box.model { background: #2a1f4a; border: 1px solid var(--purple); color: var(--purple); }
.arch-box.hf { background: #1c1209; border: 1px solid var(--yellow); color: var(--yellow); }
.arch-arrow { color: var(--muted); font-size: 16px; }
/* ── File cards ── */
.file-card {
background: var(--surface);
border: 1px solid var(--border);
border-radius: 12px;
margin-bottom: 24px;
overflow: hidden;
transition: border-color 0.2s;
}
.file-card:hover { border-color: var(--green); }
.file-header {
display: flex;
align-items: center;
justify-content: space-between;
padding: 14px 20px;
border-bottom: 1px solid var(--border);
background: rgba(255,255,255,0.02);
}
.file-name {
font-family: 'JetBrains Mono', monospace;
font-size: 13px;
font-weight: 700;
color: var(--text);
display: flex;
align-items: center;
gap: 8px;
}
.file-badge {
font-family: 'JetBrains Mono', monospace;
font-size: 10px;
padding: 2px 8px;
border-radius: 20px;
font-weight: 600;
}
.badge-py { background: #1a3a5c; color: #79c0ff; }
.badge-docker { background: #0c2d3d; color: #56d4fc; }
.badge-txt { background: var(--green-dim); color: var(--green); }
.badge-js { background: #3a2a00; color: var(--yellow); }
pre {
margin: 0;
padding: 24px 20px;
overflow-x: auto;
font-family: 'JetBrains Mono', monospace;
font-size: 12px;
line-height: 1.7;
background: var(--code-bg);
color: #e6edf3;
}
/* Syntax highlight helpers */
.c { color: var(--muted); }
.k { color: var(--purple); }
.s { color: var(--orange); }
.n { color: var(--blue); }
.g { color: var(--green); }
.y { color: var(--yellow); }
.r { color: var(--red); }
/* ── Info cards ── */
.info-grid { display: grid; grid-template-columns: repeat(auto-fill, minmax(260px, 1fr)); gap: 16px; }
.info-card {
background: var(--surface);
border: 1px solid var(--border);
border-radius: 10px;
padding: 20px;
}
.info-card-title {
font-size: 11px;
text-transform: uppercase;
letter-spacing: 2px;
color: var(--muted);
font-family: 'JetBrains Mono', monospace;
margin-bottom: 10px;
}
.info-card-value {
font-size: 16px;
font-weight: 700;
color: var(--green);
font-family: 'JetBrains Mono', monospace;
}
.info-card-desc {
font-size: 12px;
color: var(--muted);
margin-top: 4px;
}
/* ── Steps ── */
.steps { display: flex; flex-direction: column; gap: 16px; }
.step {
display: flex;
gap: 16px;
background: var(--surface);
border: 1px solid var(--border);
border-radius: 10px;
padding: 20px;
align-items: flex-start;
}
.step-num {
width: 32px; height: 32px;
border-radius: 50%;
background: var(--green-dim);
border: 1px solid var(--green);
color: var(--green);
font-family: 'JetBrains Mono', monospace;
font-weight: 700;
font-size: 13px;
display: flex; align-items: center; justify-content: center;
flex-shrink: 0;
}
.step-content h3 { font-size: 15px; font-weight: 700; margin-bottom: 6px; }
.step-content p { font-size: 13px; color: var(--muted); }
.step-content code {
font-family: 'JetBrains Mono', monospace;
font-size: 12px;
background: rgba(255,255,255,0.06);
padding: 2px 6px;
border-radius: 4px;
color: var(--orange);
}
/* ── Secret config box ── */
.secret-box {
background: #1a0a0a;
border: 1px solid #6e1515;
border-radius: 10px;
padding: 24px;
margin-top: 16px;
}
.secret-box h3 {
color: var(--red);
font-size: 14px;
font-weight: 700;
margin-bottom: 12px;
display: flex; align-items: center; gap: 8px;
}
.secret-row {
display: flex;
gap: 12px;
align-items: center;
font-family: 'JetBrains Mono', monospace;
font-size: 13px;
margin-bottom: 8px;
}
.secret-key { color: var(--orange); font-weight: 700; }
.secret-value { color: var(--muted); }
/* ── Warning ── */
.warning {
background: #1f1a00;
border: 1px solid #6e5200;
border-radius: 8px;
padding: 14px 18px;
font-size: 13px;
color: var(--yellow);
font-family: 'JetBrains Mono', monospace;
margin-top: 16px;
}
/* ── Model comparison table ── */
table {
width: 100%;
border-collapse: collapse;
font-family: 'JetBrains Mono', monospace;
font-size: 12px;
background: var(--surface);
border-radius: 10px;
overflow: hidden;
border: 1px solid var(--border);
}
th {
background: rgba(255,255,255,0.04);
padding: 12px 16px;
text-align: left;
color: var(--muted);
font-size: 10px;
letter-spacing: 2px;
text-transform: uppercase;
border-bottom: 1px solid var(--border);
}
td {
padding: 12px 16px;
border-bottom: 1px solid rgba(48,54,61,0.5);
vertical-align: middle;
}
tr:last-child td { border-bottom: none; }
.recommended { color: var(--green); font-weight: 700; }
.check { color: var(--green); }
.cross { color: var(--red); }
</style>
</head>
<body>
<header>
<div class="header-tag">⚙ DevOps + AI Engineering</div>
<h1>Minecraft Bot <span>LLM Backend</span><br>System Design</h1>
<div class="header-sub">Docker · FastAPI · llama-cpp-python · Hugging Face Spaces · OpenAI-compatible API</div>
</header>
<div class="container">
<!-- ── Architecture ── -->
<section>
<div class="section-label">01 / Architecture</div>
<h2><span class="icon green">🗺</span>Luồng xử lý hệ thống</h2>
<div class="arch">
<div class="arch-row">
<div class="arch-box client">Minecraft Bot<br><small>(Node.js / Python)</small></div>
<div class="arch-arrow">──▶</div>
<div class="arch-box auth">Bearer Token<br><small>Header Auth</small></div>
<div class="arch-arrow">──▶</div>
<div class="arch-box api">FastAPI Server<br><small>:7860 /v1/chat/completions</small></div>
<div class="arch-arrow">──▶</div>
<div class="arch-box model">llama-cpp-python<br><small>Qwen2.5-Coder GGUF</small></div>
</div>
<div class="arch-row" style="margin-top:16px; color: var(--muted); font-size: 11px;">
<span>📦 Hosted on</span>
<div class="arch-box hf" style="font-size:11px; padding:4px 12px;">Hugging Face Spaces (Docker)</div>
<span>· Model downloaded at build time · ENV secrets injected at runtime</span>
</div>
</div>
</section>
<!-- ── Model choice ── -->
<section>
<div class="section-label">02 / Model Selection</div>
<h2><span class="icon purple">🧠</span>Model được đề xuất: Qwen2.5-Coder-7B-Instruct Q4_K_M</h2>
<table>
<tr>
<th>Model</th><th>Size</th><th>RAM ~</th><th>Coding</th><th>Reasoning</th><th>License</th><th>GGUF</th>
</tr>
<tr>
<td class="recommended">★ Qwen2.5-Coder-7B-Instruct Q4_K_M</td>
<td>7B</td><td>~5.5 GB</td>
<td class="check">✔ Excellent</td><td class="check">✔ Strong</td><td>Apache 2.0</td><td class="check"></td>
</tr>
<tr>
<td>DeepSeek-Coder-V2-Lite-Instruct Q4</td>
<td>16B MoE</td><td>~8 GB</td>
<td class="check">✔ Excellent</td><td class="check">✔ Excellent</td><td>DeepSeek</td><td class="check"></td>
</tr>
<tr>
<td>Phi-3.5-mini-instruct Q4</td>
<td>3.8B</td><td>~2.5 GB</td>
<td class="check">✔ Good</td><td class="check">✔ Good</td><td>MIT</td><td class="check"></td>
</tr>
<tr>
<td>CodeLlama-7B-Instruct Q4</td>
<td>7B</td><td>~5 GB</td>
<td class="check">✔ Good</td><td class="cross">✘ Weaker</td><td>Llama 2</td><td class="check"></td>
</tr>
</table>
<div class="warning">⚠ HF Spaces free tier RAM ~ 16 GB · Qwen2.5-Coder-7B Q4_K_M (~4.4 GB file, ~5.5 GB runtime) là lựa chọn an toàn nhất về tài nguyên và chất lượng.</div>
</section>
<!-- ── Files ── -->
<section>
<div class="section-label">03 / Source Files</div>
<h2><span class="icon green">📁</span>Các file cấu hình & mã nguồn</h2>
<!-- Dockerfile -->
<div class="file-card">
<div class="file-header">
<div class="file-name">🐳 Dockerfile</div>
<span class="file-badge badge-docker">DOCKER</span>
</div>
<pre><span class="c"># ── Stage 1: builder (compile llama-cpp-python)</span>
<span class="k">FROM</span> python:3.11-slim <span class="k">AS</span> builder
<span class="k">RUN</span> apt-get install -y build-essential cmake wget ...
<span class="k">ENV</span> <span class="n">CMAKE_ARGS</span>=<span class="s">"-DLLAMA_BLAS=ON -DLLAMA_BLAS_VENDOR=OpenBLAS"</span>
<span class="k">RUN</span> pip install -r requirements.txt --target /build/deps
<span class="c"># ── Stage 2: runtime (slim image)</span>
<span class="k">FROM</span> python:3.11-slim
<span class="k">RUN</span> useradd -m -u 1000 user <span class="c"># HF Spaces yêu cầu non-root</span>
<span class="k">USER</span> user
<span class="c"># Download GGUF model tại BUILD time (~4.4 GB)</span>
<span class="k">RUN</span> python -c <span class="s">"from huggingface_hub import hf_hub_download; \
hf_hub_download(repo_id='Qwen/Qwen2.5-Coder-7B-Instruct-GGUF', \
filename='qwen2.5-coder-7b-instruct-q4_k_m.gguf', \
local_dir='/app/models')"</span>
<span class="k">ENV</span> <span class="n">MODEL_PATH</span>=<span class="s">/app/models/qwen2.5-coder-7b-instruct-q4_k_m.gguf</span>
<span class="c"># BEARER_TOKEN được inject từ HF Secret — không hard-code ở đây!</span>
<span class="k">EXPOSE</span> <span class="y">7860</span>
<span class="k">CMD</span> [<span class="s">"python"</span>, <span class="s">"-m"</span>, <span class="s">"uvicorn"</span>, <span class="s">"app:app"</span>, <span class="s">"--host"</span>, <span class="s">"0.0.0.0"</span>, <span class="s">"--port"</span>, <span class="s">"7860"</span>]</pre>
</div>
<!-- app.py -->
<div class="file-card">
<div class="file-header">
<div class="file-name">🐍 app.py <small style="color:var(--muted);font-weight:400;"> — FastAPI Server</small></div>
<span class="file-badge badge-py">PYTHON</span>
</div>
<pre><span class="k">from</span> fastapi <span class="k">import</span> FastAPI, HTTPException, Depends
<span class="k">from</span> fastapi.security <span class="k">import</span> HTTPBearer, HTTPAuthorizationCredentials
<span class="k">import</span> os
<span class="n">BEARER_TOKEN</span> = os.environ.get(<span class="s">"BEARER_TOKEN"</span>, <span class="s">""</span>) <span class="c"># ← từ HF Secret</span>
<span class="c"># Auth middleware</span>
<span class="k">def</span> <span class="g">verify_token</span>(creds: HTTPAuthorizationCredentials):
<span class="k">if</span> creds.credentials != BEARER_TOKEN:
<span class="k">raise</span> HTTPException(<span class="y">401</span>, <span class="s">"Invalid Bearer Token"</span>)
<span class="c"># OpenAI-compatible endpoint</span>
<span class="g">@app.post</span>(<span class="s">"/v1/chat/completions"</span>, dependencies=[Depends(verify_token)])
<span class="k">async def</span> <span class="g">chat_completions</span>(request: ChatCompletionRequest):
result = llm.create_chat_completion(
messages=[{<span class="s">"role"</span>: m.role, <span class="s">"content"</span>: m.content} <span class="k">for</span> m <span class="k">in</span> request.messages],
max_tokens=request.max_tokens,
temperature=request.temperature,
)
<span class="k">return</span> ChatCompletionResponse(...) <span class="c"># wrapped in OpenAI schema</span></pre>
</div>
<!-- requirements.txt -->
<div class="file-card">
<div class="file-header">
<div class="file-name">📋 requirements.txt</div>
<span class="file-badge badge-txt">TXT</span>
</div>
<pre><span class="n">llama-cpp-python</span><span class="c">==0.3.4 # GGUF inference engine</span>
<span class="n">fastapi</span><span class="c">==0.115.6 # API framework</span>
<span class="n">uvicorn</span>[standard]<span class="c">==0.32.1 # ASGI server</span>
<span class="n">pydantic</span><span class="c">==2.10.3 # Schema validation</span>
<span class="n">huggingface-hub</span><span class="c">==0.27.0 # Model download</span>
<span class="n">httpx</span><span class="c">==0.28.1 # HTTP client</span></pre>
</div>
<!-- client_test.js -->
<div class="file-card">
<div class="file-header">
<div class="file-name">🟨 client_test.js <small style="color:var(--muted);font-weight:400;"> — Node.js client</small></div>
<span class="file-badge badge-js">NODE.JS</span>
</div>
<pre><span class="k">import</span> OpenAI <span class="k">from</span> <span class="s">"openai"</span>;
<span class="k">const</span> client = <span class="k">new</span> <span class="n">OpenAI</span>({
baseURL: <span class="s">"https://&lt;username&gt;-&lt;space&gt;.hf.space/v1"</span>,
apiKey: process.env.<span class="n">BEARER_TOKEN</span>,
});
<span class="k">const</span> response = <span class="k">await</span> client.chat.completions.create({
model: <span class="s">"qwen2.5-coder-7b-instruct"</span>,
messages: [
{ role: <span class="s">"system"</span>, content: <span class="s">"You are a Minecraft bot brain..."</span> },
{ role: <span class="s">"user"</span>, content: <span class="s">"Bot at x=120. Nearest: oak_log. Chop it."</span> },
],
max_tokens: <span class="y">512</span>,
temperature: <span class="y">0.2</span>,
});</pre>
</div>
<!-- client_test.py -->
<div class="file-card">
<div class="file-header">
<div class="file-name">🐍 client_test.py <small style="color:var(--muted);font-weight:400;"> — Python client</small></div>
<span class="file-badge badge-py">PYTHON</span>
</div>
<pre><span class="k">from</span> openai <span class="k">import</span> OpenAI
<span class="k">import</span> os
client = OpenAI(
base_url=<span class="s">"https://&lt;username&gt;-&lt;space&gt;.hf.space/v1"</span>,
api_key=os.environ.get(<span class="s">"BEARER_TOKEN"</span>),
)
response = client.chat.completions.create(
model=<span class="s">"qwen2.5-coder-7b-instruct"</span>,
messages=[
{<span class="s">"role"</span>: <span class="s">"system"</span>, <span class="s">"content"</span>: <span class="s">"You are a Minecraft bot brain..."</span>},
{<span class="s">"role"</span>: <span class="s">"user"</span>, <span class="s">"content"</span>: <span class="s">"Bot at x=120. Nearest: oak_log. Chop it."</span>},
],
max_tokens=<span class="y">512</span>,
temperature=<span class="y">0.2</span>,
)</pre>
</div>
</section>
<!-- ── HF Secret Config ── -->
<section>
<div class="section-label">04 / Deployment</div>
<h2><span class="icon orange">🔐</span>Cấu hình Secret trên Hugging Face Spaces</h2>
<div class="steps">
<div class="step">
<div class="step-num">1</div>
<div class="step-content">
<h3>Mở Settings của Space</h3>
<p>Vào Space của bạn → tab <strong>Settings</strong> → cuộn xuống mục <strong>"Repository secrets"</strong>.</p>
</div>
</div>
<div class="step">
<div class="step-num">2</div>
<div class="step-content">
<h3>Thêm Secret mới</h3>
<p>Click <strong>"New secret"</strong> → điền <code>BEARER_TOKEN</code> vào trường <em>Name</em> → điền token bí mật của bạn vào <em>Value</em>.</p>
</div>
</div>
<div class="step">
<div class="step-num">3</div>
<div class="step-content">
<h3>Save & Rebuild</h3>
<p>HF sẽ inject giá trị này như biến môi trường vào container lúc runtime. Container sẽ tự rebuild. Token <strong>không bao giờ</strong> xuất hiện trong log hay image layer.</p>
</div>
</div>
<div class="step">
<div class="step-num">4</div>
<div class="step-content">
<h3>Dùng token khi gọi API</h3>
<p>Client phải gửi header: <code>Authorization: Bearer &lt;your-token&gt;</code>. Đặt token vào biến môi trường phía client (<code>BEARER_TOKEN</code>) để tránh hard-code.</p>
</div>
</div>
</div>
<div class="secret-box">
<h3>🚫 Tuyệt đối KHÔNG làm</h3>
<div class="secret-row"><span class="secret-key"></span><span class="secret-value">Hard-code token trong Dockerfile, app.py, hay bất kỳ file nào commit lên repo</span></div>
<div class="secret-row"><span class="secret-key"></span><span class="secret-value">Dùng ENV trong Dockerfile để set BEARER_TOKEN (sẽ bị lộ trong image layer)</span></div>
<div class="secret-row"><span class="secret-key"></span><span class="secret-value">In token ra console hay log file</span></div>
</div>
</section>
<!-- ── Quick stats ── -->
<section>
<div class="section-label">05 / Summary</div>
<h2><span class="icon blue">📊</span>Thông số hệ thống</h2>
<div class="info-grid">
<div class="info-card">
<div class="info-card-title">Port</div>
<div class="info-card-value">7860</div>
<div class="info-card-desc">Mặc định của HF Spaces Docker</div>
</div>
<div class="info-card">
<div class="info-card-title">Model RAM usage</div>
<div class="info-card-value">~5.5 GB</div>
<div class="info-card-desc">Qwen2.5-7B Q4_K_M — an toàn với 16 GB</div>
</div>
<div class="info-card">
<div class="info-card-title">Context window</div>
<div class="info-card-value">4096 tokens</div>
<div class="info-card-desc">Tunable qua N_CTX env var</div>
</div>
<div class="info-card">
<div class="info-card-title">API format</div>
<div class="info-card-value">OpenAI v1</div>
<div class="info-card-desc">/v1/chat/completions · /v1/models</div>
</div>
<div class="info-card">
<div class="info-card-title">Auth method</div>
<div class="info-card-value">Bearer Token</div>
<div class="info-card-desc">Đọc từ HF Secret BEARER_TOKEN</div>
</div>
<div class="info-card">
<div class="info-card-title">Build strategy</div>
<div class="info-card-value">Multi-stage</div>
<div class="info-card-desc">Builder + slim runtime, model pre-downloaded</div>
</div>
</div>
</section>
</div>
</body>
</html>