kgrabko committed on
Commit
f6c8468
·
verified ·
1 Parent(s): 1e37c98

Update README.md

  1. README.md +113 -0
README.md CHANGED
@@ -1,3 +1,116 @@
1
  ---
2
  license: unknown
3
  ---
1
  ---
2
+ language: en
3
+ tags:
4
+ - code
5
+ - coding
6
+ - qwen3.0
7
+ - onnx
8
+ - int8
9
+ - web-ui
10
  license: unknown
11
  ---
12
+
13
+ # JiRack Coder Reasoning 14B INT4
14
+
15
+ A fast, efficient coding assistant with a clean built-in web UI, built on the Qwen3.0-Coder-14B-Instruct base model and optimized with Microsoft ONNX Runtime.
16
+
17
+ ## Quick Start
18
+ Watch JiRack Coder 14B in action:
19
+ **DEMO**: [JiRack Coder Reasoning 14B Web UI](https://youtu.be/mq1DxIov7Bw)
20
+
21
+
22
+ ### Run with Docker
23
+
24
+ ---
25
+ **Default CPU**
26
+
27
+     docker run -d \
28
+       --name jirack_coder_reasoing_14b \
29
+       -p 7869:7869 \
30
+       --restart unless-stopped \
31
+       cmsmanhattan/jirack_coder_14b_int4_qwenbase:latest
32
+
33
+ **Multi-CPU**
34
+
35
+     docker run -d \
36
+       --name jirack_coder_reasoing_14b \
37
+       -p 7869:7869 \
38
+       --restart unless-stopped \
39
+       --memory=20g \
40
+       --cpus=12 \
41
+       cmsmanhattan/jirack_coder_14b_int4_qwenbase:latest
42
+
43
+ **GPU**
44
+ (coming soon)
45
+
46
+     docker run -d \
47
+       --name jirack_coder_reasoing_14b \
48
+       -p 7869:7869 \
49
+       --gpus all \
50
+       --restart unless-stopped \
51
+       cmsmanhattan/jirack_coder_14b_int4_gpu_qwenbase:latest
52
+
53
+ ---
54
+
55
+
56
+ ## Access the UI
57
+
58
+ Once the container is running, open your browser and navigate to:
59
+
60
+ **`http://localhost:7869`**
61
+
62
+ This opens the **JiRack Coder UI** — a clean web interface designed for coding.
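
If the page does not load right away, the container may still be starting. The helper below is a minimal sketch that polls the address until it answers. It assumes `curl` is installed and that the UI returns a successful HTTP status for a plain GET on its root path. The function name `wait_for_ui` and its default values are illustrative and not part of JiRack.

```bash
# Poll a URL until it answers or the attempts run out.
# Usage: wait_for_ui [url] [max_attempts] [delay_seconds]
# Assumption: a plain GET on the root path returns a success status.
wait_for_ui() {
  local url="${1:-http://localhost:7869}"
  local max_attempts="${2:-30}"
  local delay="${3:-2}"
  local attempt=1

  while [ "$attempt" -le "$max_attempts" ]; do
    # -f: fail on HTTP errors, -s: silent, -o /dev/null: discard the body
    if curl -fs -o /dev/null --max-time 5 "$url"; then
      echo "UI is reachable at $url"
      return 0
    fi
    attempt=$((attempt + 1))
    sleep "$delay"
  done

  echo "UI did not respond at $url after $max_attempts attempts" >&2
  return 1
}
```

Calling `wait_for_ui` with no arguments checks `http://localhost:7869` up to 30 times, two seconds apart. If it gives up, `docker logs jirack_coder_reasoing_14b` shows the container output.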
63
+
64
+ ## Changing the Port
65
+
66
+ You can change the listening port from the **Settings** panel in the JiRack Coder UI.
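
A different route, separate from the Settings panel, is to leave the container on its port and publish it on another host port with Docker's `-p HOST:CONTAINER` mapping. The sketch below only prints the command so it can be reviewed first. It assumes the application inside the container keeps listening on 7869, as in the `docker run` examples. The function name `print_run_command` is illustrative.

```bash
# Print (do not execute) a docker run command that publishes the UI
# on a chosen host port.
# Assumption: the app inside the container still listens on 7869.
print_run_command() {
  local host_port="${1:-7869}"
  local container_port=7869
  local image="cmsmanhattan/jirack_coder_14b_int4_qwenbase:latest"

  printf 'docker run -d --name jirack_coder_reasoing_14b -p %s:%s --restart unless-stopped %s\n' \
    "$host_port" "$container_port" "$image"
}
```

For example, `print_run_command 8080` prints a command that serves the UI at `http://localhost:8080`. If the port is later changed inside the UI, the container side of the mapping would presumably need to change to match.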
67
+
68
+ ## Licensing
69
+
70
+ - The **JiRack Coder 14B model** is provided under a commercial license. It costs about $12 per user per year.
71
+ - All **JiRack UI clients** are provided under a commercial license.
72
+ - However, the UI clients can be used for free when running together with the official JiRack Docker containers, as long as they are not redistributed separately.
73
+
74
+ **JiRack Coder 32B** is available exclusively under a commercial enterprise license.
75
+
76
+ For commercial licensing, cluster deployment, or enterprise use of JiRack Coder 32B and JiRack Coder 14B, please contact us.
77
+ - JiRack desktop chat client for MS Windows 11 with Ollama API setup: https://huggingface.co/kgrabko/JiRackTernary_1b/resolve/main/jirack-chat.zip
78
+ - Live email chat with the model via support@cmsmanhattan.com
79
+
80
+
81
+ ## Hardware Recommendations for AMD Systems
82
+ This model is heavier than JiRack Coder 7B INT8.
83
+ ### Recommended Hardware for JiRack Coder Reasoning 14B INT4 (a single Docker container)
84
+
85
+ | Use Case | CPU | GPU (ROCm) | VRAM / RAM | Expected Speed | Recommendation |
86
+ |-----------------------|----------------------------------|-----------------------------------|----------------|---------------------|--------------------|
87
+ | **Recommended** | Ryzen 7 7700 / 9700X | RX 7900 XTX / 7900 XT | 24GB VRAM | 50-75 tokens/s | Best choice |
88
+ | **High Performance** | Ryzen 9 7950X / 9950X | RX 7900 XTX | 24GB+ VRAM | 65-90 tokens/s | Excellent |
89
+ | **Enterprise** | EPYC 7003/9004 series | MI300X or 2x RX 7900 XTX | 48GB+ VRAM | 90-140 tokens/s | For 32B model |
90
+ | **Budget Option** | Ryzen 5 7600 / 9600X | RX 7800 XT (16GB) | 16GB VRAM | 35-50 tokens/s | Acceptable |
91
+
92
+ ### Important Memory Notes
93
+
94
+ Even though the 14B INT4 model itself takes approximately **5–6 GB**, we recommend **at least 24GB VRAM** for the following reasons:
95
+
96
+ - KV-cache consumption during generation (especially with long context)
97
+ - ONNX Runtime overhead and temporary buffers
98
+ - System stability and avoidance of out-of-memory errors
99
+ - Room for larger context windows
100
+
101
+ **Minimum recommended:** 24GB VRAM (RX 7900 series)
102
+ **Ideal:** 24–32GB VRAM
103
+
104
+ For pure CPU inference (no GPU), we recommend at least **64GB system RAM** (Ryzen 9 7950X/9950X).
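
To make the KV-cache point concrete, here is a rough estimate using the usual formula for attention caches: 2 (keys and values) × layers × KV heads × head dimension × context tokens × bytes per value. The architecture numbers in the usage example (48 layers, 8 KV heads, head dimension 128, 2-byte cache values) are placeholder assumptions for illustration, not confirmed figures for this model. Take the real values from the model configuration.

```bash
# Estimate KV-cache size in MiB.
# Usage: kv_cache_mib LAYERS KV_HEADS HEAD_DIM CONTEXT_TOKENS BYTES_PER_VALUE
# All inputs are supplied by the caller; none are measured from JiRack.
kv_cache_mib() {
  local layers="$1"
  local kv_heads="$2"
  local head_dim="$3"
  local tokens="$4"
  local bytes="$5"

  # Factor 2: one key tensor and one value tensor per layer.
  local total=$((2 * layers * kv_heads * head_dim * tokens * bytes))

  # Convert bytes to MiB (integer division).
  echo $((total / 1024 / 1024))
}
```

With the assumed values, `kv_cache_mib 48 8 128 32768 2` prints `6144`, that is, about 6 GiB of cache for a 32,768-token context on top of the model weights. Halving the context halves the estimate, which is why long contexts dominate the memory budget.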
105
+
106
+ ---
107
+ I will use the default model in full FP32 precision as the basis for quantization, which lets us find the best balance between model size and performance.
108
+
109
+
110
+ ## 📧 Contact & Licensing
111
+ For joint venture opportunities, hardware integration, or licensing inquiries:
112
+ - **Email:** [grabko@cmsmanhattan.com](mailto:grabko@cmsmanhattan.com)
113
+ - **Phone:** +1 (516) 777-0945
114
+ - **Location:** New York, USA
115
+
116
+