adminguhantech committed on
Commit 2c89582 · verified · 1 Parent(s): 221f7b7

Initial v0.1 closed-beta release

Files changed (3)
  1. .gitattributes +1 -0
  2. CipherModel-1.5B-Q4_K_M.gguf +3 -0
  3. README.md +110 -0
.gitattributes CHANGED
@@ -33,3 +33,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
33
  *.zip filter=lfs diff=lfs merge=lfs -text
34
  *.zst filter=lfs diff=lfs merge=lfs -text
35
  *tfevents* filter=lfs diff=lfs merge=lfs -text
36
+ CipherModel-1.5B-Q4_K_M.gguf filter=lfs diff=lfs merge=lfs -text
CipherModel-1.5B-Q4_K_M.gguf ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:cc324af070c2ecbfd324a30884d2f951a7ff756aba85cb811a6ec436933bb046
3
+ size 1117320768
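The pointer above records the SHA-256 digest (`oid`) and byte size of the real GGUF file, so a downloaded copy can be checked against it. The following is a minimal sketch using only the Python standard library; the function names `sha256_of` and `matches_pointer` are illustrative and not part of any tool mentioned in this commit.

```python
import hashlib
from pathlib import Path

# Values copied from the Git LFS pointer shown in this diff.
EXPECTED_OID = "cc324af070c2ecbfd324a30884d2f951a7ff756aba85cb811a6ec436933bb046"
EXPECTED_SIZE = 1117320768


def sha256_of(path, chunk_size=1 << 20):
    """Hash the file in 1 MiB chunks so a ~1 GB model never sits in memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_pointer(path, expected_oid=EXPECTED_OID, expected_size=EXPECTED_SIZE):
    """Return True when both the byte size and the SHA-256 digest match the pointer."""
    p = Path(path)
    if p.stat().st_size != expected_size:
        return False  # cheap check first; avoids hashing a truncated download
    return sha256_of(p) == expected_oid
```

After downloading, `matches_pointer("CipherModel-1.5B-Q4_K_M.gguf")` should return `True` for an intact copy, assuming the file sits in the current directory.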
README.md ADDED
@@ -0,0 +1,110 @@
1
+ ---
2
+ license: apache-2.0
3
+ language:
4
+ - en
5
+ base_model: Qwen/Qwen2.5-Coder-1.5B-Instruct
6
+ pipeline_tag: text-generation
7
+ tags:
8
+ - code
9
+ - coding-assistant
10
+ - llama-cpp
11
+ - gguf
12
+ - ciphercode
13
+ - vscode
14
+ library_name: gguf
15
+ ---
16
+
17
+ # CipherModel-1.5B
18
+
19
+ > **The model behind CipherCode™ — the AI coding assistant that writes code the way YOU would.**
20
+ > Closed-beta v0.1, by **Lila AI LLC**.
21
+
22
+ This repository hosts the GGUF Q4_K_M quantization served by the [CipherCode VS Code extension](https://github.com/lila-ai-llc/ciphercode-vscode) (closed beta). It is built on top of [Qwen/Qwen2.5-Coder-1.5B-Instruct](https://huggingface.co/Qwen/Qwen2.5-Coder-1.5B-Instruct) and is suitable for inline code completion, refactor / explain / fix / docstring tasks, and short conversational coding chat.
23
+
24
+ ## What's in this repo
25
+
26
+ | File | Size | Format |
27
+ |---|---|---|
28
+ | `CipherModel-1.5B-Q4_K_M.gguf` | ~1.12 GB (1,117,320,768 bytes) | GGUF Q4_K_M (llama.cpp) |
29
+
30
+ ## What this is
31
+
32
+ - **A redistribution of `Qwen2.5-Coder-1.5B-Instruct` in GGUF Q4_K_M format**, branded as CipherModel-1.5B for use in the CipherCode extension's closed beta.
33
+ - **No fine-tuning has been applied yet at v0.1.** The "Cipher Persona" style adaptation that ships with CipherCode operates entirely at the system-prompt level, injecting the developer's detected style into every request — model weights are unchanged from base Qwen.
34
+ - A future v0.2+ release of this repo will contain a true LoRA fine-tune merged into the base.
35
+
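Style adaptation at the system-prompt level, as described in the list above, amounts to prepending a system message to each chat request while the weights stay untouched. The sketch below illustrates the idea only: the README does not publish the prompt CipherCode actually sends, so `build_messages`, the prompt wording, and the style notes are all hypothetical.

```python
def build_messages(user_request, style_notes):
    """Prepend a system prompt that carries the developer's style preferences."""
    style_lines = "\n".join(f"- {note}" for note in style_notes)
    system_prompt = (
        "You are a coding assistant. Match the developer's conventions:\n"
        + style_lines
    )
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_request},
    ]


# Hypothetical style notes; the extension's real detection output is not documented here.
messages = build_messages(
    "write a python fizzbuzz",
    ["4-space indentation", "snake_case names", "type hints on public functions"],
)
```

The resulting `messages` list has the shape expected by an OpenAI-compatible `/v1/chat/completions` endpoint such as the one `llama-server` exposes.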
36
+ ## Usage
37
+
38
+ ### Via the CipherCode VS Code extension (recommended)
39
+
40
+ ```bash
41
+ # Friends of Lila AI: install the .vsix sent to you privately
42
+ code --install-extension ciphercode-0.1.0.vsix
43
+ ```
44
+
45
+ The extension talks to a private Cloud Run endpoint that serves this model via `llama-server`. End users of the extension never need to download this GGUF themselves.
46
+
47
+ ### Direct with llama.cpp
48
+
49
+ ```bash
50
+ # Download the GGUF
51
+ huggingface-cli download guhantech/CipherModel-1.5B CipherModel-1.5B-Q4_K_M.gguf --local-dir .
52
+
53
+ # Run llama-server
54
+ llama-server \
55
+ -m CipherModel-1.5B-Q4_K_M.gguf \
56
+ --host 0.0.0.0 --port 8080 \
57
+ --ctx-size 4096 -np 5
58
+
59
+ # Hit it
60
+ curl -X POST http://localhost:8080/v1/chat/completions \
61
+ -H "Content-Type: application/json" \
62
+ -d '{"model":"cipher-model","messages":[{"role":"user","content":"write a python fizzbuzz"}],"max_tokens":256}'
63
+ ```
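The same request can be made from Python without extra dependencies. This is a sketch using the standard library only; it assumes the server from the block above is listening on `localhost:8080`, and the helper names `build_chat_request` and `extract_reply` are illustrative.

```python
import json
import urllib.request


def build_chat_request(prompt, base_url="http://localhost:8080", max_tokens=256):
    """Build (but do not send) a POST to the OpenAI-compatible chat endpoint."""
    payload = {
        "model": "cipher-model",
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def extract_reply(response_body):
    """Pull the assistant text out of a chat-completions JSON response."""
    return json.loads(response_body)["choices"][0]["message"]["content"]


# To send the request against a running server:
#   with urllib.request.urlopen(build_chat_request("write a python fizzbuzz")) as resp:
#       print(extract_reply(resp.read()))
```

Keeping request construction separate from sending makes the payload easy to inspect or log before it goes over the wire.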
64
+
65
+ ### Direct with `llama-cpp-python`
66
+
67
+ ```python
68
+ from llama_cpp import Llama
69
+ llm = Llama(model_path="CipherModel-1.5B-Q4_K_M.gguf", n_ctx=4096)
70
+ out = llm("def fizzbuzz(n):", max_tokens=256)
71
+ print(out["choices"][0]["text"])
72
+ ```
73
+
74
+ ## Specifications
75
+
76
+ - **Architecture:** Qwen2.5-Coder (transformer)
77
+ - **Parameters:** 1.5 B
78
+ - **Context window:** 32K tokens (production runs at 4K to save memory)
79
+ - **Quantization:** Q4_K_M
80
+ - **License:** Apache 2.0 (inherited from base model)
81
+ - **Languages supported:** strong in Python, JavaScript, TypeScript, Java, Go, Rust, C/C++ — see Qwen2.5-Coder's eval table for details
82
+
83
+ ## Limitations
84
+
85
+ - Quality is meaningfully lower than Qwen2.5-Coder-7B / 32B. For complex multi-file reasoning or long-context tasks, prefer the larger sizes.
86
+ - Q4_K_M trades an estimated ~1–2% quality for a file roughly 3× smaller than full fp16 (about 1.1 GB versus about 3.1 GB). Acceptable for autocomplete and single-file tasks.
87
+ - This is a closed-beta artifact; no SLAs, no support guarantees.
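The size side of the quantization trade-off can be checked with simple arithmetic from the LFS pointer in this commit. The sketch below assumes a parameter count of about 1.54 billion (taken from the base model's card, not from this README) and 2 bytes per parameter for fp16. The GGUF file also contains metadata and the tokenizer, so the bits-per-weight figure slightly overstates the true average.

```python
GGUF_BYTES = 1_117_320_768   # size field of the Git LFS pointer in this commit
ASSUMED_PARAMS = 1.54e9      # assumption: parameter count from the base model card
FP16_BYTES_PER_PARAM = 2


def size_summary(gguf_bytes=GGUF_BYTES, params=ASSUMED_PARAMS):
    """Compare the quantized file with a hypothetical fp16 export of the same model."""
    fp16_bytes = params * FP16_BYTES_PER_PARAM
    return {
        "gguf_gb": gguf_bytes / 1e9,
        "gguf_gib": gguf_bytes / 2**30,
        "fp16_gb": fp16_bytes / 1e9,
        "shrink_factor": fp16_bytes / gguf_bytes,
        "bits_per_weight": gguf_bytes * 8 / params,
    }


summary = size_summary()
for key, value in summary.items():
    print(f"{key}: {value:.2f}")
# gguf_gb: 1.12
# gguf_gib: 1.04
# fp16_gb: 3.08
# shrink_factor: 2.76
# bits_per_weight: 5.80
```

Under these assumptions the file is a little under 3× smaller than an fp16 export would be.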
88
+
89
+ ## Citation / credits
90
+
91
+ Built on top of:
92
+
93
+ ```bibtex
94
+ @article{hui2024qwen2,
95
+ title={Qwen2.5-Coder Technical Report},
96
+ author={Binyuan Hui and Jian Yang and Zeyu Cui and Jiaxi Yang and Dayiheng Liu and Lei Zhang and Tianyu Liu and Jiajun Zhang and Bowen Yu and Keming Lu and Kai Dang and Yang Fan and Yichang Zhang and An Yang and Rui Men and Fei Huang and Bo Zheng and Yibo Miao and Shanghaoran Quan and Yunlong Feng and Xingzhang Ren and Xuancheng Ren and Jingren Zhou and Junyang Lin},
97
+ journal={arXiv preprint arXiv:2409.12186},
98
+ year={2024}
99
+ }
100
+ ```
101
+
102
+ ## Trademark
103
+
104
+ CipherCode™ and Cipher Persona™ are trademarks of **Lila AI LLC**. All rights reserved.
105
+
106
+ The CipherModel weights themselves are released under Apache 2.0 (inherited from Qwen). The trademarks restrict only how you may name and brand derivative work — the underlying weights are free to use.
107
+
108
+ ---
109
+
110
+ © 2026 Lila AI LLC.