ur-dad-matt committed on
Commit
b2d2fba
·
verified ·
1 Parent(s): db8a245

chore: model card for v1.6 path-B code

  1. README.md +80 -0
README.md ADDED
@@ -0,0 +1,80 @@
+ ---
+ language:
+ - en
+ - code
+ license: apache-2.0
+ library_name: mlx
+ base_model: Qwen/Qwen3.6-27B
+ tags:
+ - outlier
+ - mlx
+ - apple-silicon
+ - 4-bit
+ - code-generation
+ - qwen3.6
+ pipeline_tag: text-generation
+ ---
+
+ # Outlier Code 27B (MLX-4bit)
+
+ **Outlier Code: Compact 27B weights with a code-specialized configuration.**
+ Built on the Qwen3.6-27B base plus Outlier post-training. **86.6% HumanEval pass@1,
+ verified on the BF16 base (n=164).** Ships with a code-focused system prompt, lower
+ temperature defaults, and tuning for autonomous coding workflows.
+
+ > **Same weights as `Outlier-Ai/Outlier-Compact-27B-MLX-4bit`.** This repo
+ > exists for discoverability, so that users searching for a code-specialized
+ > model land here. The `.safetensors` files are byte-identical to those in the
+ > Compact repo. The difference is in the configuration:
+ > - **Default system prompt:** code-focused (terse, no narration, structured output)
+ > - **Default temperature:** 0.2 (vs. 0.7 for general chat)
+ > - **Default top_p:** 0.95
+ > - **Tooling defaults:** autonomy mode and structured output enabled
+ > - **Stop tokens:** code-aware (extra triple-backtick balancing)
+
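The card does not say how the "triple-backtick balancing" is implemented. As a rough illustration only, one way to think about it is to count fence lines in the text generated so far and treat an odd count as "still inside a code block". The helper name `inside_open_fence` is hypothetical and is not part of the Outlier app or `mlx_lm`.

```python
FENCE = "`" * 3  # triple backtick, built this way so it cannot close this example's own fence


def inside_open_fence(text: str) -> bool:
    """Return True if `text` ends inside an unclosed fenced code block.

    Hypothetical helper: counts lines that start with a triple backtick
    and treats an odd count as an open fence.
    """
    fence_lines = [
        line for line in text.splitlines() if line.lstrip().startswith(FENCE)
    ]
    return len(fence_lines) % 2 == 1


partial = f"{FENCE}python\ndef f():\n    return 1\n"
complete = partial + f"{FENCE}\n"
print(inside_open_fence(partial))   # True
print(inside_open_fence(complete))  # False
```

A generation loop could use a check like this to avoid stopping while a code block is still open, though whether the Outlier configuration works this way is an assumption.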
35
+ ## Verified numbers
36
+
37
+ | Metric | Value | n | Source |
38
+ |----------------|-------------------|--------|---------------------------------|
39
+ | HumanEval@1 | 0.8659 ± 0.0267 | 164 | `sprints/disposition-audit-day30/PHASE3/sprint_g_humaneval/baselines_humaneval.json` (BF16 base, 2026-04-29) |
40
+ | MMLU (BF16) | 0.8467 ± 0.0031 | 14042 | `sprints/disposition-audit-day30/PHASE0/disposition_table.csv` row E004 |
41
+ | Wall-clock | 20.68 tok/s | n=5 | `sprints/path_b_migration/data/phase5_rebench.json` (M1 Ultra, MLX-4bit, 2026-04-30) |
42
+ | Resident RAM | 15.13 GB | | post-load MLX active memory |
43
+
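The card does not say how the ± figures were computed. One plausible reading, offered here as an assumption, is that they are standard errors of the mean over per-problem 0/1 scores. Under that assumption the HumanEval row can be checked by hand: 0.8659 × 164 is about 142 solved problems, and the sample standard error reproduces the reported value.

```python
import math

n = 164                     # HumanEval problems (from the table)
passed = round(0.8659 * n)  # 142 solved problems, inferred from the reported rate
p = passed / n

# Standard error of the mean of n Bernoulli outcomes, using the sample
# variance (n - 1 in the denominator). This is an assumption about how
# the ± in the table was produced, not something the card states.
stderr = math.sqrt(p * (1 - p) / (n - 1))

print(f"pass@1 = {p:.4f} ± {stderr:.4f}")  # pass@1 = 0.8659 ± 0.0267
```

The same formula applied to the MMLU row (0.8467, n = 14042) gives roughly 0.0030 rather than 0.0031, so that figure may have been aggregated differently, for example per subject.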
44
+ ## Architecture
45
+
46
+ Identical to Outlier Compact 27B:
47
+ - 64 layers (16 full-attn + 48 linear-attn, hybrid 3:1)
48
+ - 5120 hidden, 24 attn / 4 KV heads, head_dim 256
49
+ - 256K native context
50
+ - Qwen3.6-27B base, text-only (vision encoder stripped)
51
+
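One back-of-the-envelope use of these numbers is estimating KV-cache growth. The sketch below assumes that only the 16 full-attention layers keep a per-token key/value cache (linear-attention layers typically hold a fixed-size state instead), that the cache is stored in 16-bit precision, and that "256K" means 262,144 tokens. None of these assumptions is stated in the card.

```python
FULL_ATTN_LAYERS = 16        # from the architecture list
KV_HEADS = 4                 # from the architecture list
HEAD_DIM = 256               # from the architecture list
BYTES_PER_VALUE = 2          # assumption: fp16/bf16 cache, no KV quantization
CONTEXT_TOKENS = 256 * 1024  # assumption: "256K" = 262,144 tokens

# Keys and values (factor of 2) for every KV head in every full-attention layer.
kv_bytes_per_token = 2 * FULL_ATTN_LAYERS * KV_HEADS * HEAD_DIM * BYTES_PER_VALUE
kv_bytes_full_context = kv_bytes_per_token * CONTEXT_TOKENS

print(f"{kv_bytes_per_token / 1024:.0f} KiB per token")           # 64 KiB per token
print(f"{kv_bytes_full_context / 1024**3:.0f} GiB at full context")  # 16 GiB at full context
```

Under these assumptions, a completely filled context would need about 16 GiB of cache in addition to the resident weights.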
52
+ ## Usage (mlx_lm)
53
+
54
+ ```python
55
+ from mlx_lm import load, stream_generate
56
+
57
+ # Same loader as Compact — different config layered by the Outlier app.
58
+ model, tokenizer = load("Outlier-Ai/Outlier-Code-27B-MLX-4bit")
59
+
60
+ CODE_SYSTEM = "You are a code-focused assistant. Respond with code only unless asked otherwise. No prose narration. Use markdown fenced code blocks for all code."
61
+
62
+ prompt = tokenizer.apply_chat_template([
63
+ {"role": "system", "content": CODE_SYSTEM},
64
+ {"role": "user", "content": "Write a Python function to reverse a linked list."},
65
+ ], tokenize=False, add_generation_prompt=True)
66
+
67
+ for chunk in stream_generate(model, tokenizer, prompt, max_tokens=200):
68
+ print(chunk.text, end="")
69
+ ```
70
+
71
+ ## License
72
+
73
+ Apache 2.0 (inherited from Qwen3.6-27B base).
74
+
75
+ ## Provenance
76
+
77
+ - Base model: `Qwen/Qwen3.6-27B`
78
+ - Weights: identical to `Outlier-Ai/Outlier-Compact-27B-MLX-4bit`
79
+ - HumanEval bench: B200 cluster, n=164, pass@1, BF16 base
80
+ (`/mnt/1tb3/exhaustion/bases/qwen36-27b`, 2026-04-30 02:47 UTC)