AMFORGE/cofos_v2

ameforge committed 3 days ago

Commit 375d66b (verified) · 1 parent: 1a589f7

Create README.md

Cofos enterprise code model

Files changed (1)

README.md ADDED

+---
+license: bsl-1.0
+---
+# Cofos Code Model ({MODEL_VERSION}) — SparseMind 500M
+**Cofos v2** is a 500M-parameter code model built on AMFORGE's **SparseMind v15**
+architecture. It keeps the design of Cofos v1 (296M parameters, 34% `real_syntax_valid`)
+at a larger scale and adds training on multilingual instructions and chain-of-thought.
+Developed by **{ORGANIZATION}**.
+## Architecture (SparseMind v15)
+### Parameters
+- `dim={cfg.dim}` (v1: 768), `n_layers={cfg.n_layers}`, `n_heads={cfg.n_heads}`
+  (`head_dim={cfg.dim // cfg.n_heads}` — same as v1)
+- `max_seq_len={cfg.max_seq_len}` (v1: 512), `vocab_size={cfg.vocab_size}`
+- `channel_top_k={cfg.channel_top_k}`, `token_top_k={cfg.token_top_k}`
+  (same sparsity ratios as v1)
+- **Total parameters:** {model.n_params:,}
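Note on the parameter list above: the `head_dim` entry is the integer division `cfg.dim // cfg.n_heads`. A minimal sketch of that arithmetic follows, assuming a hypothetical `Config` dataclass. Only `dim=768` is taken from the text (the v1 value); `n_layers` and `n_heads` are illustrative assumptions, not figures from this commit.

```python
from dataclasses import dataclass


@dataclass
class Config:
    # Hypothetical stand-in for the real config class, which is not shown here.
    dim: int
    n_layers: int
    n_heads: int


# dim=768 is the v1 value quoted in the README; the other two are assumed.
cfg = Config(dim=768, n_layers=12, n_heads=12)

# The model width must split evenly across attention heads.
assert cfg.dim % cfg.n_heads == 0, "dim must be divisible by n_heads"

head_dim = cfg.dim // cfg.n_heads
print(f"head_dim={head_dim}")
```

Keeping `head_dim` fixed while raising `dim` means the head count grows in proportion to the width, which is one common way to scale a transformer.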
+## Training data (3-way mix)
+- **30% real HF Python** (`iamtarun/python_code_instructions_18k_alpaca`)
+## Result
+- **Best `real_syntax_valid`:** {best_syntax:.1f}% on held-out real Python instructions
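Note on the metric above: the README does not define how `real_syntax_valid` is computed. One plausible reading is the share of generated samples that parse as Python. The sketch below assumes that reading; `syntax_valid_rate` and the sample strings are hypothetical and are not taken from the AMFORGE evaluation code.

```python
import ast


def syntax_valid_rate(samples):
    """Return the percentage of samples that parse as Python source."""
    if not samples:
        return 0.0
    valid = 0
    for source in samples:
        try:
            ast.parse(source)
            valid += 1
        except (SyntaxError, ValueError):
            # Unparseable sample: counted as invalid.
            pass
    return 100.0 * valid / len(samples)


# Hypothetical generations: two parse, two do not.
samples = [
    "def f(x):\n    return x + 1\n",
    "def g(:\n    pass\n",
    "print('hi')\n",
    "for i in range(3) print(i)\n",
]
rate = syntax_valid_rate(samples)
print(f"{rate:.1f}%")
```

A parse check only confirms that the output is well-formed Python. It says nothing about whether the code runs correctly or follows the instruction.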
+## Tokenizer
+- v2 tokenizer at [{HF_TOK_REPO_ID}](https://huggingface.co/{HF_TOK_REPO_ID})
+## How to use
+```python
+import torch
+import sentencepiece as spm
+# Load checkpoint
+ckpt = torch.load("cofos_best.pt", map_location="cpu")
+cfg_dict = ckpt["config"]
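+# Instantiate the model architecture. SparseMind and Config are not defined in
+# this snippet; they are assumed to come from the SparseMind v15 model code.
+# model = SparseMind(Config(**cfg_dict))
+# model.load_state_dict(ckpt["model"])
+# model.eval()  # switch to inference mode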