koreallmdev commited on
Commit
787e896
·
verified ·
1 Parent(s): 91bdcbb

Upload README.md with huggingface_hub

Browse files
Files changed (1) hide show
  1. README.md +71 -0
README.md ADDED
@@ -0,0 +1,71 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ ---
2
+ license: other
3
+ pipeline_tag: text-generation
4
+ tags:
5
+ - qwen
6
+ - qwen2
7
+ - lora
8
+ - vllm
9
+ - open-webui
10
+ - korean
11
+ - coding
12
+ ---
13
+
14
+ # 7bcustom-model
15
+
16
+ This is a public deployment package for a local DGX AI Factory coding assistant runtime.
17
+
18
+ ## Model
19
+
20
+ - Public name: `7bcustom-model`
21
+ - Runtime served name: `dgx-stable-current`
22
+ - Base family: Qwen2 7B Instruct class
23
+ - Runtime: vLLM OpenAI-compatible API
24
+ - Open-WebUI compatible: yes
25
+
26
+ ## Deployment status
27
+
28
+ This public release is based on the locally validated stable deployment.
29
+
30
+ ```text
31
+ average_score: 97.75
32
+ pass_70_plus: 20/20
33
+ strong_85_plus: 20/20
34
+ critical_fail_count: 0
35
+ decision: DEPLOY_CANDIDATE
36
+ ```
37
+
38
+ ## Runtime policy
39
+
40
+ The local production runtime uses router/template safeguards for deterministic operational answers:
41
+
42
+ - Linux guarded prompt
43
+ - vLLM medium prompt
44
+ - CUDA check template
45
+ - LoRA/stable/rejected policy template
46
+
47
+ ## vLLM example
48
+
49
+ ```bash
50
+ python -m vllm.entrypoints.openai.api_server \
51
+ --model ./ \
52
+ --served-model-name 7bcustom-model \
53
+ --dtype float16 \
54
+ --host 0.0.0.0 \
55
+ --port 8000 \
56
+ --max-model-len 1536 \
57
+ --gpu-memory-utilization 0.50 \
58
+ --max-num-seqs 8
59
+ ```
60
+
61
+ ## Open-WebUI
62
+
63
+ ```text
64
+ Base URL: http://<host>:8000/v1
65
+ Model : 7bcustom-model
66
+ API Key : dummy
67
+ ```
68
+
69
+ ## Notes
70
+
71
+ This repository is intended as a public model/runtime release record. Local absolute paths, private operational logs, and preservation tarballs are not required for public usage.