walidsobhie-code committed on
Commit 4ca507e · 1 Parent(s): 2e091e7

feat: add code completion generator and model registry tools

- scripts/generate_code_completion_data.py: Multi-language code completion generator
- scripts/model_info.py: Model metadata extraction tool
- scripts/compare_models.py: Compare model versions
- MODEL_REGISTRY.md: Version tracking documentation
- training-data/README.md: Training data format docs

MODEL_REGISTRY.md ADDED
@@ -0,0 +1,69 @@
+ # Stack 2.9 Model Registry
+
+ > Version tracking for all Stack 2.9 model variants.
+
+ ---
+
+ ## Model Versions
+
+ | Version | Status | Date | Base Model | Parameters | Dataset | Performance | Use Case |
+ |---------|--------|------|------------|------------|---------|-------------|----------|
+ | `stack-2.9-1.5B` | 🟡 In Training | 2026-04-06 | Llama 3.2-1B | 1.5B | Stack 2.9 dedup | TBD | Research, fine-tuning base |
+ | `stack-2.9-7B` | 🔴 Planned | TBD | Llama 3.1-8B | 7B | Stack 2.9 dedup | TBD | General-purpose inference |
+ | `stack-2.9-7B-QLoRA` | 🔴 Planned | TBD | Llama 3.1-8B | 7B (quantized) | Stack 2.9 dedup | TBD | Edge deployment, low-memory |
+
+ ---
+
+ ## Version Details
+
+ ### stack-2.9-1.5B (Current)
+
+ - **Status:** In Training
+ - **Architecture:** Transformer (pretrained)
+ - **Base Model:** Llama 3.2-1B
+ - **Parameters:** 1.5B
+ - **Training Data:** Stack 2.9 deduplicated
+ - **Context Length:** 128k tokens
+ - **Vocabulary Size:** ~128K
+ - **Precision:** BF16
+ - **Training Hardware:** 8x H100 (to be confirmed)
+ - **Expected Completion:** TBD
+ - **Notes:** First iteration of Stack 2.9, used as baseline for larger variants
+
+ ### stack-2.9-7B (Planned)
+
+ - **Status:** Planned
+ - **Architecture:** Transformer (pretrained)
+ - **Base Model:** Llama 3.1-8B
+ - **Parameters:** 7B
+ - **Training Data:** Stack 2.9 deduplicated
+ - **Context Length:** 128k tokens
+ - **Vocabulary Size:** ~128K
+ - **Precision:** BF16
+ - **Training Hardware:** TBD
+ - **Expected Start:** TBD
+ - **Notes:** Scale-up from 1.5B, targeting general-purpose use
+
+ ### stack-2.9-7B-QLoRA (Planned)
+
+ - **Status:** Planned
+ - **Architecture:** Transformer + QLoRA
+ - **Base Model:** Llama 3.1-8B
+ - **Parameters:** 7B (4-bit quantized)
+ - **Training Data:** Stack 2.9 deduplicated
+ - **Context Length:** 128k tokens
+ - **Vocabulary Size:** ~128K
+ - **Quantization:** 4-bit NF4
+ - **LoRA Rank:** TBD
+ - **LoRA Alpha:** TBD
+ - **LoRA Dropout:** TBD
+ - **Target Modules:** TBD
+ - **Notes:** Quantized for consumer GPU deployment (e.g., 24GB VRAM)
+
+ ---
+
+ ## Changelog
+
+ | Date | Version | Change |
+ |------|---------|--------|
+ | 2026-04-06 | stack-2.9-1.5B | Initial entry — training started |
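Both scripts in this commit read `models/registry.json`, which is not part of the diff shown here. As a rough sketch only, an entry might look like the one below. The field names are inferred from the keys the scripts access (`version`, `status`, `parameters`, `lora`, `performance`, and so on), the values are copied from the registry table, and the exact integer counts are placeholders rather than measured figures.

```python
import json

# Hypothetical registry entry: field names are inferred from the keys that
# compare_models.py and model_info.py read; the real file may differ.
example_registry = {
    "models": [
        {
            "version": "stack-2.9-1.5B",
            "status": "in_training",
            "base_model": "Llama 3.2-1B",
            "parameters": 1_500_000_000,   # placeholder for "1.5B"
            "quantization": None,
            "precision": "BF16",
            "context_length": 128_000,     # placeholder for "128k tokens"
            "vocabulary_size": 128_000,    # placeholder for "~128K"
            "dataset": "Stack 2.9 dedup",
            "lora": None,                  # full model, no adapter config
            "performance": {},             # benchmarks still TBD
            "created_at": "2026-04-06",
            "use_case": "Research, fine-tuning base",
            "notes": "First iteration of Stack 2.9, used as baseline for larger variants",
        }
    ]
}

# Serialize the way a registry file on disk would be written.
registry_json = json.dumps(example_registry, indent=2)
print(registry_json)
```

Keeping `lora` and `performance` present but empty lets the reporting code treat "not applicable" and "not measured yet" uniformly.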
scripts/compare_models.py ADDED
@@ -0,0 +1,220 @@
+ #!/usr/bin/env python3
+ """
+ compare_models.py — Compare different Stack 2.9 model versions.
+
+ Reads from models/registry.json and produces a side-by-side comparison
+ of model properties and performance metrics.
+
+ Usage:
+     python scripts/compare_models.py
+     python scripts/compare_models.py --models stack-2.9-1.5B stack-2.9-7B
+     python scripts/compare_models.py --metrics hellaswag mmlu humaneval
+     python scripts/compare_models.py --verbose
+ """
+
+ import argparse
+ import json
+ import sys
+ from pathlib import Path
+ from typing import Optional
+
+
+ REGISTRY_PATH = Path(__file__).parent.parent / "models" / "registry.json"
+
+ ALL_METRICS = ["hellaswag", "arc_challenge", "mmlu", "humaneval", "loss"]
+
+
+ def load_registry(registry_path: Path = REGISTRY_PATH) -> dict:
+     """Load the model registry JSON."""
+     if not registry_path.exists():
+         print(f"ERROR: Registry not found at {registry_path}", file=sys.stderr)
+         sys.exit(1)
+     with open(registry_path) as f:
+         return json.load(f)
+
+
+ def format_params(n: int) -> str:
+     if n >= 1_000_000_000:
+         return f"{n / 1_000_000_000:.1f}B"
+     elif n >= 1_000_000:
+         return f"{n / 1_000_000:.0f}M"
+     return str(n)
+
+
+ def compare_params(a: int, b: int) -> str:
+     """Compare two parameter counts; a missing or zero count yields "n/a"."""
+     if not a or not b:
+         return " n/a (missing parameter count)"
+     if b > a:
+         return f" {b / a:.1f}x larger ({format_params(b)} vs {format_params(a)})"
+     return f" {a / b:.1f}x smaller ({format_params(b)} vs {format_params(a)})"
+
+
+ def build_row(version: str, key: str, value) -> str:
+     """Build a comparison table row."""
+     if value is None:
+         val_str = "—"
+     elif isinstance(value, float):
+         val_str = f"{value:.4f}"
+     elif isinstance(value, int):
+         val_str = f"{value:,}"
+     else:
+         val_str = str(value)
+     return f" {version:<22} {key:<30} {val_str}"
+
+
+ def print_comparison(models: list, metrics: list, verbose: bool = False):
+     """Print a side-by-side comparison table."""
+     # Header
+     versions = [m["version"] for m in models]
+     max_ver_len = max(len(v) for v in versions)
+
+     print(f"\n{'='*72}")
+     print(f" Model Comparison — Stack 2.9")
+     print(f"{'='*72}")
+
+     # Non-metric fields
+     fields = [
+         ("Base Model", "base_model"),
+         ("Parameters", "parameters"),
+         ("Quantization", "quantization"),
+         ("Precision", "precision"),
+         ("Context Length", "context_length"),
+         ("Vocabulary Size", "vocabulary_size"),
+         ("Dataset", "dataset"),
+         ("LoRA Rank", ("lora", "rank")),
+         ("LoRA Alpha", ("lora", "alpha")),
+         ("LoRA Dropout", ("lora", "dropout")),
+         ("Status", "status"),
+         ("Created", "created_at"),
+         ("Use Case", "use_case"),
+     ]
+
+     print(f"\n {'Model':<{max_ver_len}} {'Field':<30} {'Value'}")
+     print(f" {'-'*max_ver_len} {'-'*30} {'-'*20}")
+
+     for label, key in fields:
+         row_values = []
+         for m in models:
+             if isinstance(key, tuple):
+                 nested = m
+                 for k in key:
+                     nested = nested.get(k) if isinstance(nested, dict) else None
+                 row_values.append(nested)  # keeps falsy values such as dropout 0.0
+             else:
+                 val = m.get(key)
+                 # Format parameters as human-readable
+                 if key == "parameters" and val:
+                     val = f"{format_params(val)} ({val:,})"
+                 row_values.append(val)
+         unique = set(str(v) for v in row_values)
+         if len(unique) == 1 and row_values[0] is None:
+             continue
+         print(f"\n {label}:")
+         for i, (ver, val) in enumerate(zip(versions, row_values)):
+             if val is None:
+                 val_str = "—"
+             elif isinstance(val, float):
+                 val_str = f"{val:.4f}"
+             elif isinstance(val, int):
+                 val_str = f"{val:,}"
+             else:
+                 val_str = str(val)
+             marker = " →" if i > 0 and row_values[i] != row_values[0] else " "
+             print(f" {marker} {ver:<{max_ver_len}} {val_str}")
+
+     # Performance metrics comparison
+     has_any_metrics = any(
+         any(m.get("performance", {}).get(metric) is not None for m in models)
+         for metric in metrics
+     )
+     if has_any_metrics:
+         print(f"\n\n Performance Benchmarks")
+         print(f" {'-'*max_ver_len} {'-'*30} {'-'*10}")
+
+         for metric in metrics:
+             metric_name = metric.replace("_", " ").title()
+             values = [m.get("performance", {}).get(metric) for m in models]
+             if all(v is None for v in values):
+                 continue
+             print(f"\n {metric_name}:")
+             for i, (ver, val) in enumerate(zip(versions, values)):
+                 if val is None:
+                     val_str = "N/A"
+                 else:
+                     val_str = f"{val:.4f}"
+                 marker = " →" if i > 0 else " "
+                 print(f" {marker} {ver:<{max_ver_len}} {val_str}")
+
+     # Parameter size comparison (pairwise)
+     if len(models) >= 2:
+         print(f"\n\n Parameter Size Comparison:")
+         for i in range(len(models)):
+             for j in range(i + 1, len(models)):
+                 a, b = models[i], models[j]
+                 pa = a.get("parameters", 0)
+                 pb = b.get("parameters", 0)
+                 if pa and pb:
+                     larger = pb >= pa
+                     ratio = pb / pa if larger else pa / pb  # always report a ratio >= 1
+                     print(f" {b['version']} is {ratio:.2f}x {'larger' if larger else 'smaller'} than {a['version']}")
+
+     print(f"\n{'='*72}\n")
+
+
+ def main():
+     parser = argparse.ArgumentParser(
+         description="Compare Stack 2.9 model versions side by side."
+     )
+     parser.add_argument(
+         "--models", "-m",
+         nargs="+",
+         metavar="VERSION",
+         help="Model versions to compare (e.g., stack-2.9-1.5B stack-2.9-7B). "
+              "If omitted, compares all available models."
+     )
+     parser.add_argument(
+         "--metrics", "-M",
+         nargs="+",
+         choices=ALL_METRICS,
+         default=ALL_METRICS,
+         help=f"Benchmark metrics to include (default: all). Choices: {ALL_METRICS}"
+     )
+     parser.add_argument(
+         "--verbose", "-v",
+         action="store_true",
+         help="Show verbose output."
+     )
+     parser.add_argument(
+         "--registry",
+         default=REGISTRY_PATH,
+         metavar="PATH",
+         help=f"Path to registry.json (default: {REGISTRY_PATH})."
+     )
+     args = parser.parse_args()
+
+     registry_path = Path(args.registry)
+     registry = load_registry(registry_path)
+     models = registry.get("models", [])
+
+     if args.models:
+         selected = []
+         for v in args.models:
+             found = next((m for m in models if m["version"] == v), None)
+             if found:
+                 selected.append(found)
+             else:
+                 print(f"WARNING: Model '{v}' not found in registry. Skipping.", file=sys.stderr)
+                 available = ", ".join(m["version"] for m in models)
+                 print(f" Available: {available}", file=sys.stderr)
+         if not selected:
+             print("ERROR: No valid models selected.", file=sys.stderr)
+             sys.exit(1)
+     else:
+         selected = models
+
+     print_comparison(selected, metrics=args.metrics, verbose=args.verbose)
+
+
+ if __name__ == "__main__":
+     main()
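The pairwise size comparison is easier to read when the ratio is always reported as a number of at least 1, with the direction carried by the word "larger" or "smaller". A minimal sketch of that idea follows; `describe_size_ratio` is a hypothetical helper, not a function from this commit, and the parameter counts are placeholders.

```python
def describe_size_ratio(params_a: int, params_b: int, name_a: str, name_b: str) -> str:
    """Describe how large model B is relative to model A (hypothetical helper)."""
    # Guard against missing or zero counts instead of dividing by zero.
    if params_a <= 0 or params_b <= 0:
        return f"{name_b} vs {name_a}: parameter count unavailable"
    if params_b >= params_a:
        ratio, direction = params_b / params_a, "larger"
    else:
        ratio, direction = params_a / params_b, "smaller"
    return f"{name_b} is {ratio:.2f}x {direction} than {name_a}"


# Placeholder counts standing in for the 1.5B and 7B variants.
up = describe_size_ratio(1_500_000_000, 7_000_000_000, "stack-2.9-1.5B", "stack-2.9-7B")
down = describe_size_ratio(7_000_000_000, 1_500_000_000, "stack-2.9-7B", "stack-2.9-1.5B")
print(up)
print(down)
```

Inverting the ratio for the "smaller" case avoids output such as "0.21x smaller", which is easy to misread.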
scripts/generate_code_completion_data.py ADDED
@@ -0,0 +1,262 @@
+ #!/usr/bin/env python3
+ """
+ Synthetic Code Completion Training Data Generator for Stack 2.9
+ Generates training examples for pure code completion without tools.
+ """
+
+ import json
+ import random
+ import argparse
+ from pathlib import Path
+ from typing import Dict, List
+
+ LANGUAGES = ["python", "javascript", "go", "rust"]  # "typescript" omitted: CODE_TEMPLATES has no entry for it
+ DIFFICULTY_EASY = "easy"
+ DIFFICULTY_MEDIUM = "medium"
+ DIFFICULTY_HARD = "hard"
+
+ # Code templates organized by language -> difficulty -> templates
+ CODE_TEMPLATES = {
+     "python": {
+         DIFFICULTY_EASY: [
+             {"context": "def greet(name):", "completion": ' return f"Hello, {name}!"', "description": "Simple greeting function"},
+             {"context": "numbers = [1, 2, 3, 4, 5]\n\n", "completion": "for num in numbers:\n print(num)", "description": "Loop through list"},
+             {"context": "class Person:\n def __init__(self, name):", "completion": " self.name = name", "description": "Class init"},
+             {"context": "def add(a, b):\n ", "completion": " return a + b", "description": "Add function"},
+             {"context": "if x > 0:\n print('positive')\nelif x < 0:\n ", "completion": " print('negative')", "description": "Conditional"},
+         ],
+         DIFFICULTY_MEDIUM: [
+             {"context": "def fibonacci(n):\n if n <= 1:\n return n\n ", "completion": " return fibonacci(n-1) + fibonacci(n-2)", "description": "Fibonacci"},
+             {"context": "class Calculator:\n def __init__(self):\n self.result = 0\n \n def add(self, x):\n ", "completion": " self.result += x\n return self.result", "description": "Calculator"},
+             {"context": "async def fetch_data(url):\n async with aiohttp.ClientSession() as session:\n async with session.get(url) as response:\n ", "completion": " return await response.json()", "description": "Async HTTP"},
+             {"context": "def validate_email(email):\n pattern = r'^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\\.[a-zA-Z]{2,}$'\n ", "completion": " return re.match(pattern, email) is not None", "description": "Email validation"},
+             {"context": "@app.route('/users/<int:user_id>')\ndef get_user(user_id):\n user = User.query.get_or_404(user_id)\n ", "completion": " return jsonify(user.to_dict())", "description": "Flask route"},
+         ],
+         DIFFICULTY_HARD: [
+             {"context": "class LRUCache:\n def __init__(self, capacity):\n self.capacity = capacity\n self.cache = OrderedDict()\n \n def get(self, key):\n if key not in self.cache:\n return -1\n ", "completion": " self.cache.move_to_end(key)\n return self.cache[key]", "description": "LRU Cache"},
+             {"context": "def merge_sort(arr):\n if len(arr) <= 1:\n return arr\n \n mid = len(arr) // 2\n left = merge_sort(arr[:mid])\n right = merge_sort(arr[mid:])\n ", "completion": " return merge(left, right)", "description": "Merge sort"},
+             {"context": "class BinaryTree:\n def __init__(self, value):\n self.value = value\n self.left = None\n self.right = None\n \n def inorder(self, node, result=None):\n if result is None:\n result = []\n if node:\n ", "completion": " self.inorder(node.left, result)\n result.append(node.value)\n self.inorder(node.right, result)\n return result", "description": "Binary tree inorder"},
+             {"context": "def bellman_ford(graph, source):\n dist = {v: float('inf') for v in graph}\n dist[source] = 0\n \n for _ in range(len(graph) - 1):\n for u, v, w in graph.edges:\n if dist[u] != float('inf') and dist[u] + w < dist[v]:\n ", "completion": " dist[v] = dist[u] + w\n return dist", "description": "Bellman-Ford"},
+         ],
+     },
+     "javascript": {
+         DIFFICULTY_EASY: [
+             {"context": "const greet = (name) => {", "completion": ' return `Hello, ${name}!`;', "description": "Arrow greeting"},
+             {"context": "const numbers = [1, 2, 3, 4, 5];\n\n", "completion": "numbers.forEach(num => console.log(num));", "description": "forEach loop"},
+             {"context": "class Person {\n constructor(name) {", "completion": " this.name = name;", "description": "JS class constructor"},
+             {"context": "const add = (a, b) => {", "completion": " return a + b;", "description": "Add function"},
+             {"context": "if (x > 0) {\n console.log('positive');\n} else if (x < 0) {\n ", "completion": " console.log('negative');", "description": "Conditional"},
+         ],
+         DIFFICULTY_MEDIUM: [
+             {"context": "const fetchData = async (url) => {\n try {\n const response = await fetch(url);\n ", "completion": " return await response.json();\n } catch (error) {\n console.error('Error:', error);\n }", "description": "Async fetch"},
+             {"context": "class EventEmitter {\n constructor() {\n this.events = {};\n }\n \n on(event, callback) {\n ", "completion": " if (!this.events[event]) this.events[event] = [];\n this.events[event].push(callback);", "description": "Event emitter"},
+             {"context": "const debounce = (func, delay) => {\n let timeoutId;\n return (...args) => {\n clearTimeout(timeoutId);\n ", "completion": " timeoutId = setTimeout(() => func.apply(this, args), delay);", "description": "Debounce"},
+             {"context": "const memoize = (fn) => {\n const cache = new Map();\n return (n) => {\n if (cache.has(n)) {\n return cache.get(n);\n }\n ", "completion": " const result = fn(n);\n cache.set(n, result);\n return result;", "description": "Memoize"},
+         ],
+         DIFFICULTY_HARD: [
+             {"context": "class PromisePool {\n constructor(maxConcurrent) {\n this.maxConcurrent = maxConcurrent;\n this.running = 0;\n this.queue = [];\n }\n \n add(promiseFn) {\n return new Promise((resolve, reject) => {\n ", "completion": " this.queue.push({ promiseFn, resolve, reject });\n this.process();\n });", "description": "Promise pool"},
+             {"context": "const virtualDOM = {\n createElement(tag, props, ...children) {\n return {\n tag,\n props: props || {},\n children: children.flat(),\n };\n },\n render(vnode, container) {\n ", "completion": " const el = document.createElement(vnode.tag);\n Object.entries(vnode.props || {}).forEach(([key, value]) => el.setAttribute(key, value));\n vnode.children.forEach(child => {\n if (typeof child === 'string') el.appendChild(document.createTextNode(child));\n else this.render(child, el);\n });\n container.appendChild(el);", "description": "Virtual DOM"},
+         ],
+     },
+     "go": {
+         DIFFICULTY_EASY: [
+             {"context": "func greet(name string) string {", "completion": ' return "Hello, " + name + "!"', "description": "Greet function"},
+             {"context": "func add(a, b int) int {", "completion": " return a + b", "description": "Add function"},
+             {"context": "type Person struct {\n Name string\n ", "completion": " Age int", "description": "Struct definition"},
+             {"context": "for i := 0; i < 10; i++ {\n ", "completion": " fmt.Println(i)", "description": "For loop"},
+             {"context": "if x > 0 {\n fmt.Println(\"positive\")\n} else {\n ", "completion": ' fmt.Println("non-positive")', "description": "If-else"},
+         ],
+         DIFFICULTY_MEDIUM: [
+             {"context": "func (p Person) Greet() string {", "completion": ' return fmt.Sprintf("Hello, %s!", p.Name)', "description": "Method"},
+             {"context": "func worker(jobs <-chan int, results chan<- int) {\n for j := range jobs {\n ", "completion": " results <- j * 2", "description": "Worker goroutine"},
+             {"context": "type Handler interface {\n Handle(ctx context.Context, req Request) Response\n ", "completion": " Cleanup(ctx context.Context)", "description": "Interface"},
+             {"context": "func fetchData(url string) ([]byte, error) {\n resp, err := http.Get(url)\n if err != nil {\n return nil, err\n }\n defer resp.Body.Close()\n ", "completion": " return io.ReadAll(resp.Body)", "description": "HTTP GET"},
+         ],
+         DIFFICULTY_HARD: [
+             {"context": "type TreeNode struct {\n Val int\n Left *TreeNode\n Right *TreeNode\n}\n\nfunc (root *TreeNode) InorderTraversal() []int {\n var result []int\n var inorder func(*TreeNode)\n inorder = func(node *TreeNode) {\n if node == nil {\n return\n }\n ", "completion": " inorder(node.Left)\n result = append(result, node.Val)\n inorder(node.Right)", "description": "Tree inorder"},
+             {"context": "func (c *Client) StreamProcess(ctx context.Context, req *Request, stream chan<- *Response) error {\n for {\n select {\n case <-ctx.Done():\n return ctx.Err()\n default:\n result, err := c.processOne(req)\n if err != nil {\n return err\n }\n ", "completion": " select {\n case stream <- result:\n case <-ctx.Done():\n return ctx.Err()\n }", "description": "Streaming"},
+         ],
+     },
+     "rust": {
+         DIFFICULTY_EASY: [
+             {"context": "fn greet(name: &str) -> String {", "completion": ' format!("Hello, {}!", name)', "description": "Greet function"},
+             {"context": "fn add(a: i32, b: i32) -> i32 {", "completion": " a + b", "description": "Add function"},
+             {"context": "struct Person {\n name: String,\n ", "completion": " age: u32,", "description": "Struct"},
+             {"context": "let numbers = vec![1, 2, 3, 4, 5];\nfor num in &numbers {\n ", "completion": " println!(\"{}\", num);", "description": "For loop"},
+             {"context": "fn main() {\n let result = match value {\n Some(x) => x,\n ", "completion": " None => 0,", "description": "Match"},
+         ],
+         DIFFICULTY_MEDIUM: [
+             {"context": "impl Person {\n fn new(name: String, age: u32) -> Self {", "completion": " Person { name, age }", "description": "Constructor"},
+             {"context": "fn fetch_data(url: &str) -> Result<String, Error> {\n let response = reqwest::blocking::get(url)?;\n ", "completion": " let body = response.text()?;\n Ok(body)", "description": "HTTP request"},
+             {"context": "fn process_items<T: Display>(items: Vec<T>) -> String {\n items\n .iter()\n .enumerate()\n .map(|(i, item)| format!(\"{}: {}\", i, item))\n ", "completion": " .collect::<Vec<_>>()\n .join(\", \")", "description": "Iterator chain"},
+             {"context": "fn spawn_worker(jobs: Arc<Mutex<Vec<Job>>>) {\n thread::spawn(move || {\n loop {\n let job = {\n let mut jobs = jobs.lock().unwrap();\n jobs.pop()\n };\n match job {\n Some(job) => job.execute(),\n ", "completion": " None => break,\n };\n }\n });", "description": "Worker thread"},
+         ],
+         DIFFICULTY_HARD: [
+             # The entry is removed and re-inserted to refresh its recency; the value is taken from remove() itself.
+             {"context": "pub struct LRUCache<K, V> {\n capacity: usize,\n cache: LinkedHashMap<K, V>,\n}\n\nimpl<K: Eq + Hash + Clone, V: Clone> LRUCache<K, V> {\n pub fn get(&mut self, key: &K) -> Option<&V> {\n if self.cache.contains_key(key) {\n ", "completion": " let value = self.cache.remove(key)?;\n self.cache.insert(key.clone(), value);\n self.cache.get(key)\n } else {\n None\n }", "description": "LRU Cache"},
+             {"context": "pub trait Observer<T> {\n fn update(&self, event: &T);\n}\n\npub struct Subject<T> {\n observers: Vec<Box<dyn Observer<T>>>,\n}\n\nimpl<T> Subject<T> {\n pub fn notify(&self, event: &T) {\n for observer in &self.observers {\n ", "completion": " observer.update(event);", "description": "Observer pattern"},
+         ],
+     },
+ }
+
+ VARIANTS = ["basic", "explain", "debug", "optimize"]
+
+ VARIANT_PROMPTS = {
+     "basic": {"system": "You are a helpful AI assistant that helps with code completion.", "user_prefix": "Complete the following code:\n\n"},
+     "explain": {"system": "You are a helpful AI assistant that explains and completes code.", "user_prefix": "Explain what this code does and complete it:\n\n"},
+     "debug": {"system": "You are a helpful AI assistant that finds bugs and suggests fixes.", "user_prefix": "There's a bug in this code. Fix and complete it:\n\n"},
+     "optimize": {"system": "You are a helpful AI assistant that optimizes code for performance.", "user_prefix": "Optimize this code and complete it:\n\n"},
+ }
+
+
+ def create_completion_example(context, completion, language, difficulty, variant, description):
+     """Create a single code completion example."""
+     variant_info = VARIANT_PROMPTS[variant]
+     messages = [
+         {"role": "system", "content": variant_info["system"]},
+         {"role": "user", "content": f"{variant_info['user_prefix']}```{language}\n{context}\n```"},
+         {"role": "assistant", "content": f"Here's the completed code:\n\n```{language}\n{context}{completion}\n```"}
+     ]
+     return {
+         "messages": messages,
+         "language": language,
+         "difficulty": difficulty,
+         "variant": variant,
+         "description": description,
+         "context": context,
+         "completion": completion,
+     }
+
+
+ def generate_examples_for_language(language, difficulty, num_examples, variants):
+     """Generate examples for a specific language and difficulty."""
+     templates = CODE_TEMPLATES[language][difficulty]
+     examples = []
+     for i in range(num_examples):
+         template = templates[i % len(templates)]
+         variant = random.choice(variants)
+         example = create_completion_example(
+             context=template["context"],
+             completion=template["completion"],
+             language=language,
+             difficulty=difficulty,
+             variant=variant,
+             description=template["description"]
+         )
+         examples.append(example)
+     return examples
+
+
+ def generate_dataset(num_examples=1000, languages=None, difficulties=None, variants=None, balance=True):
+     """Generate the complete dataset."""
+     if languages is None:
+         languages = LANGUAGES
+     if difficulties is None:
+         difficulties = [DIFFICULTY_EASY, DIFFICULTY_MEDIUM, DIFFICULTY_HARD]
+     if variants is None:
+         variants = VARIANTS
+
+     examples = []
+
+     if balance:
+         # Spread the total over every (language, difficulty) cell so the
+         # number of generated examples matches num_examples exactly.
+         cells = [(lang, diff) for lang in languages for diff in difficulties]
+         base, remainder = divmod(num_examples, len(cells))
+
+         for idx, (lang, diff) in enumerate(cells):
+             count = base + (1 if idx < remainder else 0)
+             lang_examples = generate_examples_for_language(lang, diff, count, variants)
+             examples.extend(lang_examples)
+     else:
+         for _ in range(num_examples):
+             lang = random.choice(languages)
+             diff = random.choice(difficulties)
+             template = random.choice(CODE_TEMPLATES[lang][diff])
+             variant = random.choice(variants)
+             example = create_completion_example(
+                 context=template["context"],
+                 completion=template["completion"],
+                 language=lang,
+                 difficulty=diff,
+                 variant=variant,
+                 description=template["description"]
+             )
+             examples.append(example)
+
+     random.shuffle(examples)
+     return examples
+
+
+ def save_jsonl(examples, output_path):
+     """Save examples to JSONL format."""
+     output_file = Path(output_path)
+     output_file.parent.mkdir(parents=True, exist_ok=True)
+     with open(output_file, 'w', encoding='utf-8') as f:
+         for example in examples:
+             f.write(json.dumps(example, ensure_ascii=False) + '\n')
+
+
+ def save_json(examples, output_path):
+     """Save examples to JSON format."""
+     output_file = Path(output_path)
+     output_file.parent.mkdir(parents=True, exist_ok=True)
+     with open(output_file, 'w', encoding='utf-8') as f:
+         json.dump(examples, f, ensure_ascii=False, indent=2)
+
+
+ def main():
+     parser = argparse.ArgumentParser(description="Generate synthetic code completion training data")
+     parser.add_argument("--num-examples", type=int, default=1000, help="Number of examples to generate")
+     parser.add_argument("--output-dir", type=str, default="training-data/code-completion", help="Output directory")
+     parser.add_argument("--output-format", choices=["jsonl", "json", "both"], default="jsonl", help="Output format")
+     parser.add_argument("--seed", type=int, default=42, help="Random seed")
+     args = parser.parse_args()
+
+     random.seed(args.seed)
+
+     print(f"Generating {args.num_examples} code completion training examples...")
+     print(f" Languages: {LANGUAGES}")
+     print(f" Output directory: {args.output_dir}")
+
+     examples = generate_dataset(
+         num_examples=args.num_examples,
+         languages=LANGUAGES,
+         difficulties=[DIFFICULTY_EASY, DIFFICULTY_MEDIUM, DIFFICULTY_HARD],
+         variants=VARIANTS
+     )
+
+     output_dir = Path(args.output_dir)
+
+     if args.output_format in ["jsonl", "both"]:
+         jsonl_path = output_dir / "code_completion.jsonl"
+         save_jsonl(examples, str(jsonl_path))
+         print(f"Saved JSONL: {jsonl_path}")
+
+     if args.output_format in ["json", "both"]:
+         json_path = output_dir / "code_completion.json"
+         save_json(examples, str(json_path))
+         print(f"Saved JSON: {json_path}")
+
+     # Statistics
+     print(f"\nStatistics:")
+     print(f" Total examples: {len(examples)}")
+
+     lang_counts = {}
+     diff_counts = {}
+     for ex in examples:
+         lang_counts[ex["language"]] = lang_counts.get(ex["language"], 0) + 1
+         diff_counts[ex["difficulty"]] = diff_counts.get(ex["difficulty"], 0) + 1
+
+     print(f" By language:")
+     for lang, count in sorted(lang_counts.items(), key=lambda x: x[1], reverse=True):
+         print(f" - {lang}: {count}")
+
+     print(f" By difficulty:")
+     for diff, count in sorted(diff_counts.items(), key=lambda x: x[1], reverse=True):
+         print(f" - {diff}: {count}")
+
+     print(f"\nGeneration complete!")
+
+
+ if __name__ == "__main__":
+     main()
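When a dataset is balanced across languages and difficulty levels, the per-cell counts should add up to exactly the requested total. One way to guarantee that, sketched here with a hypothetical `allocate_counts` helper that is not part of this commit, is to split the total over every (language, difficulty) pair with `divmod` and hand out the remainder one example at a time.

```python
from itertools import product


def allocate_counts(total, languages, difficulties):
    """Split `total` examples over all (language, difficulty) cells (hypothetical helper)."""
    cells = list(product(languages, difficulties))
    base, remainder = divmod(total, len(cells))
    # The first `remainder` cells receive one extra example each.
    return {cell: base + (1 if i < remainder else 0) for i, cell in enumerate(cells)}


counts = allocate_counts(
    1000,
    ["python", "javascript", "go", "rust"],
    ["easy", "medium", "hard"],
)
print(len(counts), "cells,", sum(counts.values()), "examples in total")
```

Because the remainder is always smaller than the number of cells, no two cells differ by more than one example.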
scripts/model_info.py ADDED
@@ -0,0 +1,167 @@
+ #!/usr/bin/env python3
+ """
+ model_info.py — Extract and report Stack 2.9 model metadata.
+
+ Reads from models/registry.json and optionally from a model checkpoint
+ directory to extract/verify metadata.
+
+ Usage:
+     python scripts/model_info.py # Show all models
+     python scripts/model_info.py --model stack-2.9-1.5B
+     python scripts/model_info.py --model stack-2.9-7B-QLoRA --verbose
+     python scripts/model_info.py --export-json /path/to/output.json
+ """
+
+ import argparse
+ import json
+ import os
+ import sys
+ from pathlib import Path
+ from typing import Optional
+
+
+ REGISTRY_PATH = Path(__file__).parent.parent / "models" / "registry.json"
+
+
+ def load_registry(registry_path: Path = REGISTRY_PATH) -> dict:
+     """Load the model registry JSON."""
+     if not registry_path.exists():
+         print(f"ERROR: Registry not found at {registry_path}", file=sys.stderr)
+         sys.exit(1)
+     with open(registry_path) as f:
+         return json.load(f)
+
+
+ def format_params(n: int) -> str:
+     """Format parameter count as human-readable string."""
+     if n >= 1_000_000_000:
+         return f"{n / 1_000_000_000:.1f}B"
+     elif n >= 1_000_000:
+         return f"{n / 1_000_000:.0f}M"
+     return str(n)
+
+
+ def format_lora(config: Optional[dict]) -> str:
+     """Format LoRA config as readable string."""
+     if not config:
+         return "N/A (full model)"
+     lines = [
+         f" Rank (r): {config.get('rank', 'N/A')}",
+         f" Alpha: {config.get('alpha', 'N/A')}",
+         f" Dropout: {config.get('dropout', 'N/A')}",
+         f" Target Modules: {', '.join(config.get('target_modules') or [])}",
+     ]
+     if config.get("modules_to_save"):
+         lines.append(f" Modules to Save: {', '.join(config['modules_to_save'])}")
+     return "\n".join(lines)
+
+
+ def format_performance(metrics: dict) -> str:
+     """Format performance metrics."""
+     benchmarks = {
+         "HellaSwag": metrics.get("hellaswag"),
+         "ARC-Challenge": metrics.get("arc_challenge"),
+         "MMLU": metrics.get("mmlu"),
+         "HumanEval": metrics.get("humaneval"),
+         "Training Loss": metrics.get("loss"),
+     }
+     if all(value is None for value in benchmarks.values()):
+         return " No benchmarks yet"
+     lines = []
+     for name, value in benchmarks.items():
+         shown = value if value is not None else "N/A"
+         lines.append(f" {name:20s} {shown}")
+     return "\n".join(lines)
+
+
+ def status_emoji(status: str) -> str:
+     """Return emoji for model status."""
+     return {
+         "in_training": "🟡 IN TRAINING",
+         "planned": "🔴 PLANNED",
+         "released": "🟢 RELEASED",
+         "deprecated": "⚠️ DEPRECATED",
+     }.get(status, f"({status})")
+
+
+ def print_model(model: dict, verbose: bool = False):
+     """Print a single model's info."""
+     print(f"\n{'='*60}")
+     print(f" {model['version']} [{status_emoji(model['status'])}]")
+     print(f"{'='*60}")
+
+     print(f"\n Base Model: {model['base_model']}")
+     print(f" Parameters: {format_params(model['parameters'])} ({model['parameters']:,})")
+     print(f" Quantization: {model.get('quantization') or 'None (full precision)'}")
+     print(f" Precision: {model.get('precision', 'N/A')}")
+     print(f" Context Length: {format(model['context_length'], ',') if model.get('context_length') else 'N/A'} tokens")
+     print(f" Vocab Size: {format(model['vocabulary_size'], ',') if model.get('vocabulary_size') else 'N/A'}")
+     print(f" Dataset: {model['dataset']}")
+     print(f" Created: {model.get('created_at') or 'TBD'}")
+
+     print(f"\n LoRA Config:")
+     print(f" {format_lora(model.get('lora'))}")
+
+     print(f"\n Performance Metrics:")
+     print(f" {format_performance(model.get('performance', {}))}")
+
+     print(f"\n Use Case: {model['use_case']}")
+     if model.get("notes"):
+         print(f" Notes: {model['notes']}")
+
+
+ def main():
+     parser = argparse.ArgumentParser(
+         description="Extract and report Stack 2.9 model metadata."
+     )
+     parser.add_argument(
+         "--model", "-m",
+         help="Specific model version to show (e.g., stack-2.9-1.5B). "
+              "If omitted, shows all models."
+     )
+     parser.add_argument(
+         "--verbose", "-v",
+         action="store_true",
+         help="Show verbose output (same as default)."
+     )
+     parser.add_argument(
+         "--export-json", "-o",
+         metavar="PATH",
+         help="Export selected model(s) as JSON to a file."
+     )
+     parser.add_argument(
+         "--registry",
+         default=REGISTRY_PATH,
+         metavar="PATH",
+         help=f"Path to registry.json (default: {REGISTRY_PATH})."
+     )
+     args = parser.parse_args()
+
+     registry_path = Path(args.registry)
+     registry = load_registry(registry_path)
+     models = registry.get("models", [])
+
+     if args.model:
+         selected = [m for m in models if m["version"] == args.model]
+         if not selected:
+             # Keep both diagnostic lines on stderr so stdout stays clean.
+             print(f"ERROR: Model '{args.model}' not found in registry.", file=sys.stderr)
+             print("Available models:", ", ".join(m["version"] for m in models), file=sys.stderr)
+             sys.exit(1)
+     else:
+         selected = models
+
+     for model in selected:
+         print_model(model, verbose=args.verbose)
+
+     # Export to JSON if requested
+     if args.export_json:
+         output = {"registry_version": registry.get("registry_version"), "models": selected}
+         with open(args.export_json, "w", encoding="utf-8") as f:
+             json.dump(output, f, indent=2)
+         print(f"\n✓ Exported to {args.export_json}")
+
+     print()
+
+
+ if __name__ == "__main__":
+     main()
training-data/README.md ADDED
@@ -0,0 +1,182 @@
+ # Stack 2.9 Training Data
+
+ This directory contains synthetic training data for fine-tuning code generation models.
+
+ ## Directory Structure
+
+ ```
+ training-data/
+ ├── README.md                    # This file
+ ├── tool_examples.jsonl          # Tool-calling examples (Qwen2.5-Coder format)
+ ├── tool_examples.json           # Same as above in JSON format
+ ├── code_completion/             # Pure code completion examples
+ │   ├── code_completion.jsonl
+ │   └── code_completion.json
+ └── training-data-expanded/      # Additional generated data
+     └── tool_examples.jsonl      # 5000 expanded tool-calling examples
+ ```
+
+ ## Data Formats
+
+ ### Tool-Calling Examples
+
+ **Format:** Qwen2.5-Coder style with `tool_calls`
+
+ Each example contains:
+ - `messages`: Array of conversation messages (system, user, assistant, tool)
+ - `tools`: Array of tool definitions
+
+ **Example structure:**
+ ```json
+ {
+   "messages": [
+     {"role": "system", "content": "You are a helpful AI assistant..."},
+     {"role": "user", "content": "Read the file at src/main.py..."},
+     {
+       "role": "assistant",
+       "content": null,
+       "tool_calls": [
+         {
+           "id": "call_1234",
+           "type": "function",
+           "function": {
+             "name": "FileRead",
+             "arguments": "{\"path\": \"src/main.py\"}"
+           }
+         }
+       ]
+     },
+     {
+       "role": "tool",
+       "content": "Successfully read file: src/main.py\n...",
+       "tool_call_id": "call_1234",
+       "name": "FileRead"
+     },
+     {"role": "assistant", "content": "Here's the contents..."}
+   ],
+   "tools": [...]
+ }
+ ```
+
+ **Available Tools:**
+ - `Bash` - Execute bash commands
+ - `FileRead` - Read file contents
+ - `FileWrite` - Write/create files
+ - `WebSearch` - Search the web
+ - `Grep` - Search patterns in files
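
The structure above implies a few invariants: each `tool` message answers an earlier `tool_calls` entry with the same id and tool name, and `arguments` is a JSON-encoded string rather than a nested object. The following is a minimal sketch of a checker for those invariants. `check_tool_example` and `KNOWN_TOOLS` are hypothetical names introduced here for illustration; they are not part of the repository's scripts.

```python
import json

# Tool names as listed in this README.
KNOWN_TOOLS = {"Bash", "FileRead", "FileWrite", "WebSearch", "Grep"}


def check_tool_example(example: dict) -> list:
    """Return a list of problems found in one tool-calling example (hypothetical helper)."""
    problems = []
    pending = {}  # tool call id -> tool name, still waiting for a "tool" reply
    for i, msg in enumerate(example.get("messages", [])):
        role = msg.get("role")
        if role == "assistant":
            for call in msg.get("tool_calls") or []:
                fn = call.get("function", {})
                name = fn.get("name")
                if name not in KNOWN_TOOLS:
                    problems.append(f"message {i}: unknown tool {name!r}")
                try:
                    # "arguments" must decode as JSON on its own.
                    json.loads(fn.get("arguments", ""))
                except (TypeError, json.JSONDecodeError):
                    problems.append(f"message {i}: arguments are not valid JSON")
                pending[call.get("id")] = name
        elif role == "tool":
            call_id = msg.get("tool_call_id")
            if call_id not in pending:
                problems.append(f"message {i}: no matching tool call for {call_id!r}")
            elif pending.pop(call_id) != msg.get("name"):
                problems.append(f"message {i}: tool name does not match the call")
    for call_id in pending:
        problems.append(f"tool call {call_id!r} never received a tool message")
    return problems


# The example from this README, with the elided "tools" array left out.
sample = {
    "messages": [
        {"role": "system", "content": "You are a helpful AI assistant..."},
        {"role": "user", "content": "Read the file at src/main.py..."},
        {
            "role": "assistant",
            "content": None,
            "tool_calls": [{
                "id": "call_1234",
                "type": "function",
                "function": {"name": "FileRead", "arguments": "{\"path\": \"src/main.py\"}"},
            }],
        },
        {
            "role": "tool",
            "content": "Successfully read file: src/main.py\n...",
            "tool_call_id": "call_1234",
            "name": "FileRead",
        },
        {"role": "assistant", "content": "Here's the contents..."},
    ]
}

print(check_tool_example(sample))  # prints []
```

Returning a list of messages instead of raising on the first problem makes it easier to report every issue in a large generated file in one pass.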
+
+ ### Code Completion Examples
+
+ **Format:** Chat-based with context and completion
+
+ Each example contains:
+ - `messages`: Array of conversation messages
+ - `language`: Programming language (python, javascript, go, rust, typescript)
+ - `difficulty`: easy, medium, hard
+ - `variant`: basic, explain, debug, optimize
+ - `description`: Short summary of what the code does
+ - `context`: The code context to complete
+ - `completion`: The expected completion
+
+ **Example structure:**
+ ```json
+ {
+   "messages": [
+     {"role": "system", "content": "You are a helpful AI assistant..."},
+     {"role": "user", "content": "Complete the following code:\n```python\ndef greet(name):\n```"},
+     {"role": "assistant", "content": "Here's the completed code:\n```python\ndef greet(name):\n return f\"Hello, {name}!\"\n```"}
+   ],
+   "language": "python",
+   "difficulty": "easy",
+   "variant": "basic",
+   "description": "Simple function that returns a greeting",
+   "context": "def greet(name):",
+   "completion": " return f\"Hello, {name}!\""
+ }
+ ```
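
As a rough illustration of how the metadata fields could be checked against the values listed above, here is a small sketch. `check_completion_example` and `full_snippet` are hypothetical helpers, and joining `context` and `completion` with a newline is an assumption based on the example shown, not a documented rule.

```python
# Allowed values copied from the field list in this README.
ALLOWED = {
    "language": {"python", "javascript", "go", "rust", "typescript"},
    "difficulty": {"easy", "medium", "hard"},
    "variant": {"basic", "explain", "debug", "optimize"},
}
REQUIRED = ("messages", "language", "difficulty", "variant", "context", "completion")


def check_completion_example(example: dict) -> list:
    """Return a list of problems with one code completion example (hypothetical helper)."""
    problems = [f"missing field: {key}" for key in REQUIRED if key not in example]
    for field, allowed in ALLOWED.items():
        value = example.get(field)
        if value is not None and value not in allowed:
            problems.append(f"{field}={value!r} is not one of {sorted(allowed)}")
    return problems


def full_snippet(example: dict) -> str:
    """Join context and completion, assuming the completion starts on a new line."""
    return example["context"] + "\n" + example["completion"]


example = {
    "messages": [],
    "language": "python",
    "difficulty": "easy",
    "variant": "basic",
    "context": "def greet(name):",
    "completion": " return f\"Hello, {name}!\"",
}
print(check_completion_example(example))  # prints []
print(full_snippet(example))
```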
+
+ ## Generation Scripts
+
+ ### Tool Data Generator
+
+ ```bash
+ python3 scripts/generate_tool_data.py \
+     --num-examples 5000 \
+     --output-dir training-data-expanded \
+     --output-format jsonl
+ ```
+
+ ### Code Completion Generator
+
+ ```bash
+ python3 scripts/generate_code_completion_data.py \
+     --num-examples 1000 \
+     --output-dir training-data/code_completion \
+     --languages python javascript go rust typescript \
+     --difficulties easy medium hard \
+     --variants basic explain debug optimize
+ ```
+
+ ## Difficulty Levels
+
+ | Level | Description |
+ |-------|-------------|
+ | **easy** | Simple functions, basic operations, single concepts |
+ | **medium** | Intermediate patterns, async operations, error handling |
+ | **hard** | Complex algorithms, data structures, design patterns |
+
+ ## Variants
+
+ | Variant | Description |
+ |---------|-------------|
+ | **basic** | Standard code completion |
+ | **explain** | Code completion with explanation |
+ | **debug** | Bug fixing and completion |
+ | **optimize** | Performance optimization and completion |
+
+ ## Supported Languages
+
+ - Python
+ - JavaScript
+ - Go
+ - Rust
+ - TypeScript
+
+ ## Usage
+
+ ### Training with MLflow
+
+ ```bash
+ mlflow run . -P num_examples=5000
+ ```
+
+ ### Loading Data for Training
+
+ ```python
+ import json
+
+ # Load JSONL
+ with open("training-data/tool_examples.jsonl", "r") as f:
+     for line in f:
+         example = json.loads(line)
+         # Process example
+         pass
+
+ # Load JSON
+ with open("training-data/tool_examples.json", "r") as f:
+     data = json.load(f)
+ ```
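
For larger files it can help to stream the JSONL one record at a time and report which line failed to parse. The sketch below shows one way to do that with the standard library only; `iter_jsonl` is a hypothetical helper name, not something the repository provides.

```python
import json
from pathlib import Path
from typing import Iterator


def iter_jsonl(path) -> Iterator[dict]:
    """Yield one parsed example per non-empty line of a JSONL file."""
    with Path(path).open("r", encoding="utf-8") as f:
        for line_no, line in enumerate(f, start=1):
            line = line.strip()
            if not line:
                continue  # tolerate blank lines, e.g. a trailing newline
            try:
                yield json.loads(line)
            except json.JSONDecodeError as exc:
                # Include the line number so a bad record is easy to locate.
                raise ValueError(f"{path}:{line_no}: invalid JSON ({exc.msg})") from exc
```

Usage would look like `for example in iter_jsonl("training-data/tool_examples.jsonl"): ...`, which keeps only one example in memory at a time.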
+
+ ## Augmentation
+
+ The tool-calling generator applies augmentation to create diversity:
+ - Varying file paths
+ - Varying command options
+ - Varying search queries
+ - Varying code snippets
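
To make the idea concrete, the sketch below varies one of these dimensions, the file path, across otherwise identical `FileRead` examples. It is an illustration only: the path pool and `make_file_read_example` are assumptions made up for this sketch, and the actual logic in `scripts/generate_tool_data.py` may differ.

```python
import json
import random

# Hypothetical pool of paths; the real generator's values are not shown in this README.
FILE_PATHS = ["src/main.py", "app/utils.py", "tests/test_api.py"]


def make_file_read_example(rng: random.Random, call_id: str) -> dict:
    """Build one FileRead request whose path is drawn from FILE_PATHS."""
    path = rng.choice(FILE_PATHS)
    return {
        "messages": [
            {"role": "user", "content": f"Read the file at {path}"},
            {
                "role": "assistant",
                "content": None,
                "tool_calls": [{
                    "id": call_id,
                    "type": "function",
                    # "arguments" is stored as a JSON-encoded string.
                    "function": {"name": "FileRead", "arguments": json.dumps({"path": path})},
                }],
            },
        ]
    }


rng = random.Random(0)  # a fixed seed keeps the generated set reproducible
examples = [make_file_read_example(rng, f"call_{i}") for i in range(5)]
```

Passing an explicit `random.Random` instance rather than using the module-level functions means two runs with the same seed produce the same examples, which helps when comparing dataset versions.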
+
+ ## Quality Guidelines
+
+ - All generated code is syntactically correct
+ - Examples include realistic context
+ - Tools have proper arguments and responses
+ - Code completions are deterministic and correct
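
The first guideline can be spot-checked automatically for the Python examples. The sketch below uses the standard library's `ast.parse`; `python_syntax_ok` is a hypothetical helper, and the other supported languages would need their own parsers or compilers, which are not covered here.

```python
import ast


def python_syntax_ok(source: str) -> bool:
    """Return True if the source parses as Python, without executing it."""
    try:
        ast.parse(source)
    except SyntaxError:
        return False
    return True


complete = 'def greet(name):\n    return f"Hello, {name}!"'
print(python_syntax_ok(complete))            # True
print(python_syntax_ok("def greet(name):"))  # False: the function body is missing
```

A check like this only confirms that the code parses; it says nothing about whether the completion behaves correctly.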