| --- |
| license: bsl-1.0 |
| --- |
| |
| # Cofos Code Model ({MODEL_VERSION}) — SparseMind 500M |
| |
| **Cofos v2** is a 500M-parameter code model built on AMFORGE's **SparseMind v15** |
| architecture. It keeps the core design of Cofos v1 (296M parameters, 34% `real_syntax_valid`), |
| scaled up and trained with multilingual instructions and chain-of-thought data. |
| |
| Developed by **{ORGANIZATION}**. |
| |
| ## Architecture (SparseMind v15) |
| |
| |
| ## Parameters |
| - `dim={cfg.dim}` (v1: 768), `n_layers={cfg.n_layers}`, `n_heads={cfg.n_heads}` |
| (`head_dim={cfg.dim // cfg.n_heads}` — same as v1) |
| - `max_seq_len={cfg.max_seq_len}` (v1: 512), `vocab_size={cfg.vocab_size}` |
| - `channel_top_k={cfg.channel_top_k}`, `token_top_k={cfg.token_top_k}` |
| (same sparsity ratios as v1) |
| - **Total parameters:** {model.n_params:,} |
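
The derived values in the list above come down to simple arithmetic. The following is a minimal sketch, not the project's actual config class: `SketchConfig` is a hypothetical stand-in, the "sparsity ratio" is assumed to mean `channel_top_k / dim` (the card does not define it), and every number except v1's `dim=768` is a placeholder.

```python
from dataclasses import dataclass


@dataclass
class SketchConfig:
    """Hypothetical stand-in for the real config; field names follow the card."""
    dim: int
    n_heads: int
    channel_top_k: int


def head_dim(cfg: SketchConfig) -> int:
    # Per-head width, as written in the card: dim // n_heads.
    if cfg.dim % cfg.n_heads != 0:
        raise ValueError("dim must be divisible by n_heads")
    return cfg.dim // cfg.n_heads


def channel_keep_ratio(cfg: SketchConfig) -> float:
    # Assumed definition: fraction of channels kept by top-k selection.
    return cfg.channel_top_k / cfg.dim


# Placeholder values: only dim=768 (v1) comes from the card.
small = SketchConfig(dim=768, n_heads=12, channel_top_k=192)
large = SketchConfig(dim=1024, n_heads=16, channel_top_k=256)
```

Under these placeholder numbers both configs share the same `head_dim` and the same keep ratio, which is one way "same as v1" could hold while `dim` grows.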
|
|
| ## Training data (3-way mix) |
| - **30% real HF Python** (`iamtarun/python_code_instructions_18k_alpaca`) |
|
|
|
|
| ## Result |
| - **Best `real_syntax_valid`:** {best_syntax:.1f}% on held-out real Python instructions |
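
The evaluation script is not included here, so the following is only a sketch of how a syntax-validity percentage could be computed. It assumes the metric is the share of generated samples that parse with Python's standard `ast.parse`. The helper names `is_valid_python` and `syntax_valid_rate` are hypothetical.

```python
import ast


def is_valid_python(source: str) -> bool:
    """Return True if `source` parses as Python without a syntax error."""
    try:
        ast.parse(source)
        return True
    except (SyntaxError, ValueError):
        # ValueError covers inputs such as null bytes on some Python versions.
        return False


def syntax_valid_rate(samples: list[str]) -> float:
    """Percentage of samples that parse; 0.0 for an empty list."""
    if not samples:
        return 0.0
    valid = sum(is_valid_python(s) for s in samples)
    return 100.0 * valid / len(samples)


# Two well-formed and two malformed snippets.
samples = [
    "def f(x):\n    return x + 1\n",
    "def f(x)\n    return x\n",
    "print('ok')",
    "for i in range(3) print(i)",
]
rate = syntax_valid_rate(samples)
```

Parsing checks syntax only. A sample can pass and still fail at runtime or compute the wrong result.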
| |
| ## Tokenizer |
| - v2 tokenizer at [{HF_TOK_REPO_ID}](https://huggingface.co/{HF_TOK_REPO_ID}) |
| |
| |
| ## How to use |
| ```python |
| import torch |
| import sentencepiece as spm |
| |
| # Load checkpoint |
| ckpt = torch.load("cofos_best.pt", map_location="cpu") |
| cfg_dict = ckpt["config"] |
|
|
| # Instantiate model architecture (requires the SparseMind and Config class definitions, not shown here) |
| # model = SparseMind(Config(**cfg_dict)) |
| # model.load_state_dict(ckpt["model"]) |
| # model.eval() |