ameforge committed on
Commit 375d66b · verified · 1 Parent(s): 1a589f7

Create README.md

Cofos enterprise code model

Files changed (1)
  1. README.md +47 -0
README.md ADDED
@@ -0,0 +1,47 @@
+ ---
+ license: bsl-1.0
+ ---
+
+ # Cofos Code Model ({MODEL_VERSION}): SparseMind 500M
+
+ **Cofos v2** is a 500M-parameter code model built on AMFORGE's **SparseMind v15**
+ architecture. It follows the same approach as Cofos v1 (296M parameters, 34% `real_syntax_valid`),
+ scaled up and trained with multilingual instructions and chain-of-thought.
+
+ Developed by **{ORGANIZATION}**.
+
+ ## Architecture (SparseMind v15)
+
+
+ ## Parameters
+ - `dim={cfg.dim}` (v1: 768), `n_layers={cfg.n_layers}`, `n_heads={cfg.n_heads}`
+ (`head_dim={cfg.dim // cfg.n_heads}`, same as v1)
+ - `max_seq_len={cfg.max_seq_len}` (v1: 512), `vocab_size={cfg.vocab_size}`
+ - `channel_top_k={cfg.channel_top_k}`, `token_top_k={cfg.token_top_k}`
+ (same sparsity ratios as v1)
+ - **Total parameters:** {model.n_params:,}
+
+ ## Training data (3-way mix)
+ - **30% real HF Python** (`iamtarun/python_code_instructions_18k_alpaca`)
+
+
+ ## Result
+ - **Best `real_syntax_valid`:** {best_syntax:.1f}% on held-out real Python instructions
+
+ ## Tokenizer
+ - v2 tokenizer at [{HF_TOK_REPO_ID}](https://huggingface.co/{HF_TOK_REPO_ID})
+
+
+ ## How to use
+ ```python
+ import torch
+ import sentencepiece as spm
+
+ # Load the checkpoint onto the CPU
+ ckpt = torch.load("cofos_best.pt", map_location="cpu")
+ cfg_dict = ckpt["config"]
+
+ # Instantiate the model (requires the SparseMind and Config class definitions)
+ # model = SparseMind(Config(**cfg_dict))
+ # model.load_state_dict(ckpt["model"])
+ # model.eval()
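
The commented-out `Config(**cfg_dict)` call in the README snippet presupposes a config class whose fields match the checkpoint's `config` dictionary. The sketch below shows one way such a class could look, assuming the field names listed under "Parameters". The `Config` class and the `config_from_dict` helper here are hypothetical stand-ins, not the released SparseMind code, and every numeric value other than the v1 figures quoted in the README (`dim=768`, `max_seq_len=512`) is made up for illustration.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class Config:
    """Hypothetical stand-in for the SparseMind config class."""
    dim: int
    n_layers: int
    n_heads: int
    max_seq_len: int
    vocab_size: int
    channel_top_k: int
    token_top_k: int

    def __post_init__(self):
        # head_dim is derived as dim // n_heads, so require an exact division.
        if self.dim % self.n_heads != 0:
            raise ValueError("dim must be divisible by n_heads")

    @property
    def head_dim(self) -> int:
        return self.dim // self.n_heads


def config_from_dict(cfg_dict: dict) -> Config:
    """Build a Config, reporting missing or unexpected keys explicitly."""
    expected = {f.name for f in fields(Config)}
    present = set(cfg_dict)
    missing = expected - present
    extra = present - expected
    if missing or extra:
        raise KeyError(f"missing={sorted(missing)}, unexpected={sorted(extra)}")
    return Config(**cfg_dict)


# Illustrative values only: dim and max_seq_len are the v1 figures from the
# README; all other numbers are assumptions made for this example.
example = {
    "dim": 768, "n_layers": 12, "n_heads": 12, "max_seq_len": 512,
    "vocab_size": 32000, "channel_top_k": 192, "token_top_k": 128,
}
cfg = config_from_dict(example)
print(cfg.head_dim)  # 768 // 12 = 64
```

Checking the keys before calling `Config(**cfg_dict)` turns a mismatch between checkpoint and code into a readable error rather than a bare `TypeError` from the constructor.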