---
license: bsl-1.0
---

# Cofos Code Model ({MODEL_VERSION}): SparseMind 500M

**Cofos v2** is a 500M-parameter code model built on AMFORGE's **SparseMind v15** architecture. It keeps the design of Cofos v1 (296M parameters, 34% `real_syntax_valid`), scaled up and trained with multilingual instructions and chain-of-thought data.

Developed by **{ORGANIZATION}**.

## Architecture (SparseMind v15)

## Parameters

- `dim={cfg.dim}` (v1: 768), `n_layers={cfg.n_layers}`, `n_heads={cfg.n_heads}` (`head_dim={cfg.dim // cfg.n_heads}`, same as v1)
- `max_seq_len={cfg.max_seq_len}` (v1: 512), `vocab_size={cfg.vocab_size}`
- `channel_top_k={cfg.channel_top_k}`, `token_top_k={cfg.token_top_k}` (same sparsity ratios as v1)
- **Total parameters:** {model.n_params:,}

## Training data (3-way mix)

- **30% real HF Python** (`iamtarun/python_code_instructions_18k_alpaca`)

## Result

- **Best `real_syntax_valid`:** {best_syntax:.1f}% on held-out real Python instructions

## Tokenizer

- v2 tokenizer at [{HF_TOK_REPO_ID}](https://huggingface.co/{HF_TOK_REPO_ID})

## How to use

```python
import torch
import sentencepiece as spm

# Load the checkpoint on CPU; it stores the model config and the weights
ckpt = torch.load("cofos_best.pt", map_location="cpu")
cfg_dict = ckpt["config"]

# Instantiate the model architecture (requires the SparseMind and Config classes)
# model = SparseMind(Config(**cfg_dict))
# model.load_state_dict(ckpt["model"])
# model.eval()
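
# Optional sanity check (a minimal sketch, not part of the original card).
# summarize_config is a hypothetical helper; it assumes the checkpoint's
# "config" dict uses the same field names as the card (dim, n_heads) and
# recomputes head_dim as dim // n_heads, as in the Parameters list.
def summarize_config(cfg_dict):
    """Return dim, n_heads and the derived head_dim from a config dict."""
    dim = cfg_dict["dim"]
    n_heads = cfg_dict["n_heads"]
    if dim % n_heads != 0:
        raise ValueError("dim must be divisible by n_heads")
    return dict(dim=dim, n_heads=n_heads, head_dim=dim // n_heads)

# Usage with made-up values (not the real Cofos v2 config):
# summarize_config(dict(dim=1024, n_heads=16))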