---
license: cc-by-nc-4.0
language:
- ar
- en
- de
- fr
- es
- it
tags:
- arkadiko
- arabic
- bilingual
- pretrained
- causal-lm
- research
library_name: transformers
pipeline_tag: text-generation
---
# Arkadiko V4 Base (pretrained, no SFT)

214M-parameter causal decoder pretrained from scratch on ~100B tokens across 9 domains. **Pretraining only: no instruction tuning, no chat alignment, no RLHF.** Released as a research artifact.

This is **V4**, not V5. The Arkadiko model family advances to V5 only after demonstrating four post-SFT capabilities (multi-turn chat, ar↔en translation, tool calling, structured thinking). None of those have been validated on this checkpoint. See the [Honest Limitations](#honest-limitations) section before considering use.
## Quick facts

| Property | Value |
|---|---|
| Parameters | 213,934,720 |
| Architecture | Pure causal decoder, 18 layers |
| Hidden size | 640 |
| Attention | GQA, 10 query heads / 2 KV heads, head_dim=64 |
| FFN | SwiGLU, hidden=3456 (≈5.4×) |
| Vocab | 60,000 (SentencePiece BPE) |
| Context | 2,048 tokens |
| Position | RoPE, theta=10000 |
| Tied embeddings | No (separate `wte` and `lm_head`) |
| Tokens trained | 100,000,006,144 (~100B) |
| Training steps | 9,114,584 |
| Training hours | 524.7 |
| Hardware | 1× NVIDIA RTX PRO 4000 Blackwell (24GB) |
| Run completed | 2026-05-06 |
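
The reported parameter count can be cross-checked against the table with a few lines of arithmetic. The sketch below assumes bias-free linear layers and the usual GQA + SwiGLU weight layout (separate q/k/v/o projections and gate/up/down FFN matrices); the card does not state either detail, and the variable names are illustrative.

```python
# Rough parameter-count check from the Quick facts table.
# Assumptions (not confirmed by the model card): no bias terms, and the
# usual q/k/v/o + gate/up/down weight layout in every decoder layer.
hidden, layers, vocab = 640, 18, 60_000
q_heads, kv_heads, head_dim = 10, 2, 64
ffn_hidden = 3456

attn = (
    hidden * q_heads * head_dim          # q_proj
    + 2 * hidden * kv_heads * head_dim   # k_proj and v_proj (2 shared KV heads)
    + q_heads * head_dim * hidden        # o_proj
)
ffn = 3 * hidden * ffn_hidden            # SwiGLU: gate, up, down
embeddings = 2 * vocab * hidden          # untied wte and lm_head

matrix_params = layers * (attn + ffn) + embeddings
reported = 213_934_720

print(matrix_params)             # 213934080
print(reported - matrix_params)  # 640
```

Under these assumptions the weight matrices account for all but 640 of the reported parameters. That remainder would be consistent with a single norm weight vector of size `hidden`, but this attribution is a guess rather than something the card states.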
## Final evaluation (held-out per-domain)

Loss in nats, perplexity = exp(loss). Best-ever overall val PPL was **26.6** at step 8,815k; the released final checkpoint is at PPL ~28.8 (cosine-tail polish).

| Domain | Val loss (MA3) | Perplexity |
|---|---|---|
| code | 1.93 | 6.9 |
| math | 3.10 | 22.1 |
| fr | 3.32 | 27.7 |
| es | 3.43 | 30.9 |
| it | 3.50 | 32.9 |
| de | 3.57 | 35.6 |
| classical (Arabic) | 3.78 | 43.7 |
| en | 3.75 | 42.5 |
| **ar (modern)** | **3.80** | **44.5** |
| **overall** | 3.36 | 28.8 |
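
The relation perplexity = exp(loss) can be replayed directly on the table. This is a minimal sketch using the two-decimal losses as printed; the dictionary keys are shorthand labels, not identifiers from the project.

```python
import math

# Validation losses in nats, copied from the evaluation table.
val_loss = {
    "code": 1.93, "math": 3.10, "fr": 3.32, "es": 3.43, "it": 3.50,
    "de": 3.57, "classical": 3.78, "en": 3.75, "ar": 3.80, "overall": 3.36,
}

# Perplexity is the exponential of the mean negative log-likelihood.
perplexity = {domain: math.exp(loss) for domain, loss in val_loss.items()}

for domain, ppl in perplexity.items():
    print(f"{domain:>9}: loss={val_loss[domain]:.2f}  ppl={ppl:.1f}")
```

Because the losses are rounded to two decimals, a few recomputed values differ from the table in the last digit (exp(3.50) is about 33.1 against the listed 32.9). The table values were presumably computed from unrounded losses.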
## Training data

Roughly:

| Domain | Tokens | Source |
|---|---|---|
| Arabic (modern) | 24B | ArabicWeb24 + cc100-ar + CulturaX-ar |
| English | 28B | FineWeb-Edu |
| German | 12B | cc100-de |
| French | 8B | cc100-fr |
| Spanish | 8B | cc100-es |
| Italian | 7B | cc100-it |
| Code | 8B | CodeParrot + StarCoderData |
| Math | 7B | OpenWebMath |
| Classical Arabic | 2.7B | Custom (hadith, tafsir, OpenITI, poetry, tashkeela) |

Single SentencePiece BPE tokenizer shared across all 9 domains. **Token-fertility is uneven**: Arabic averages roughly 2× the tokens-per-word of English in this vocab, which we believe is a primary cause of weaker Arabic perplexity. The next iteration uses an Arabic-aware tokenizer (see [Roadmap](#roadmap)).
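
Token fertility, as used above, is the average number of tokens per word. The sketch below shows one way to measure it; `token_fertility` and `toy_encode` are hypothetical names introduced for illustration, and whitespace splitting is only a crude word proxy (Arabic attaches clitics to words, which by itself raises tokens-per-word).

```python
from typing import Callable, Iterable, List


def token_fertility(encode: Callable[[str], List[int]], texts: Iterable[str]) -> float:
    """Average number of tokens per whitespace-separated word over `texts`."""
    n_tokens = 0
    n_words = 0
    for text in texts:
        n_tokens += len(encode(text))
        n_words += len(text.split())
    if n_words == 0:
        raise ValueError("no words in input")
    return n_tokens / n_words


def toy_encode(text: str) -> List[int]:
    """Stand-in tokenizer for the demo: one token per 4 characters of each word."""
    return [0 for word in text.split() for _ in range(0, len(word), 4)]


sample = ["short words", "considerably lengthier vocabulary"]
fertility = token_fertility(toy_encode, sample)
print(fertility)  # 2.6 (13 toy tokens over 5 words)
```

To measure the real tokenizer, the `encode` method of `sentencepiece.SentencePieceProcessor(model_file="tokenizer.model")` could be passed in place of `toy_encode`, assuming `sentencepiece` is installed and the file has been downloaded. Comparing the result on parallel Arabic and English text would give the ratio described above.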
## Honest limitations

This base model has known structural failures verified through completion testing across the run. Use accordingly.

1. **Coherent generation horizon ≈ 50 tokens.** Past that, output drifts, loops on a topic, or repeats. Capacity-bound at this size; SFT cannot extend it.
2. **No factual recall in long form.** Capitals, public figures, dates: the model produces fluent confabulation, not facts. Pair with retrieval/tools; do not deploy as a Q&A system.
3. **Cross-language code bleed.** Code prompts in one language frequently produce output flavored by another (JS prompt → Python output). Vocab-level issue.
4. **Arabic, the primary target language, is the weakest text domain by PPL.** Modern Arabic (44.5) and classical Arabic (43.7) have the two highest perplexities in the evaluation table. Surface fluency reaches ~30-50 token spans; long-form Arabic reasoning is not present. The "Arabic-first" framing was not delivered at this scale.
5. **No safety alignment.** No RLHF, no DPO, no toxicity filtering of training data beyond source-level curation. Outputs may be biased, false, or offensive.
6. **No instruction-following.** Base model only. Will not reliably follow chat templates, refuse harmful requests, or call tools.
### Configuration / tokenizer ID misalignment (read before using)

The `config.json` shipped here records the values used during training: `bos_token_id=0, eos_token_id=2, pad_token_id=1`. The actual SentencePiece model (`tokenizer.model`) defines these tokens at different IDs:

| Token | SPM ID | config.json |
|---|---|---|
| `<unk>` | 0 | (not specified) |
| `<bos>` | 1 | `bos_token_id=0` |
| `<eos>` | 2 | `eos_token_id=2` |
| `<pad>` | 3 | `pad_token_id=1` |

**Use the IDs from the SPM model when serving.** `tokenizer_config.json` lists the SPM-derived IDs in `added_tokens`. The misaligned values in `config.json` are preserved for reproducibility (the model was trained with them), but downstream code should treat the SPM model as the source of truth.

This also affects all other special tokens, which the SPM model places at IDs 7–14:

```
<system>=7 <user>=8 <assistant>=9
<think>=10 </think>=11 <tool_call>=12 <tool_result>=13 <eot>=14
```

`<think>` is the only special token with a paired closer; `<tool_call>` and `<tool_result>` content is bounded by `<eos>` rather than a closing tag.
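
The misalignment described in this section can be expressed as a small lookup check. The sketch below hard-codes the IDs from the tables above; `SPM_IDS`, `CONFIG_IDS`, `find_mismatches`, and `serving_ids` are illustrative names, not part of the released files.

```python
# Special-token IDs as defined by tokenizer.model (copied from this section).
SPM_IDS = {
    "<unk>": 0, "<bos>": 1, "<eos>": 2, "<pad>": 3,
    "<system>": 7, "<user>": 8, "<assistant>": 9,
    "<think>": 10, "</think>": 11,
    "<tool_call>": 12, "<tool_result>": 13, "<eot>": 14,
}

# Values recorded in config.json (training-time settings).
CONFIG_IDS = {"<bos>": 0, "<eos>": 2, "<pad>": 1}


def find_mismatches(spm_ids, config_ids):
    """Return {token: (config_id, spm_id)} for every token whose IDs disagree."""
    return {
        token: (config_id, spm_ids[token])
        for token, config_id in config_ids.items()
        if config_id != spm_ids[token]
    }


mismatches = find_mismatches(SPM_IDS, CONFIG_IDS)
for token, (config_id, spm_id) in mismatches.items():
    print(f"{token}: config.json={config_id}, tokenizer.model={spm_id}")

# Overrides to pass to serving code, taken from the SPM model as recommended.
serving_ids = {
    "bos_token_id": SPM_IDS["<bos>"],
    "eos_token_id": SPM_IDS["<eos>"],
    "pad_token_id": SPM_IDS["<pad>"],
}
print(serving_ids)
```

Only `<eos>` agrees between the two files. Note that the `bos_token_id=0` recorded in `config.json` points at `<unk>` in the SPM model, so code that trusts `config.json` would prepend the unknown token instead of `<bos>`.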
## Loading

The model uses a custom architecture (`ArkadikoForCausalLM`) which is not part of `transformers` upstream. To load weights, use the `arkadiko/llm/model.py` definition from the project repo, or load the `safetensors` tensors directly:

```python
import json

from safetensors.torch import load_file

# Load the raw tensors and the training-time config from the downloaded files.
state_dict = load_file("model.safetensors")
with open("config.json", encoding="utf-8") as f:
    config = json.load(f)

# Initialize your ArkadikoConfig + ArkadikoForCausalLM
# (see https://github.com/... for the model code)
# model.load_state_dict(state_dict, strict=False)
```

The repository code is not yet public. Drop a note in the discussions tab if you need it earlier than the planned release.
## What this artifact is good for

- **Research baseline.** Reproducible 214M / 100B-token Arabic-inclusive base.
- **SFT experiments.** Suitable starting point for short-context, structured-output tasks (tool calling, format compliance) at small scale.
- **Capability-curve studies.** Final eval and run log are included; full per-checkpoint curve available on request.

## What this artifact is **not** good for

- Production chat or assistant deployment.
- Factual question answering.
- Long-form generation (>50 tokens).
- Translation as native generation. (A translation tool wrapper around any base may work better than this model alone.)
## Roadmap

The next planned iteration drops German/French/Spanish/Italian, focuses on Arabic + English + Classical + Code + Math, and grows to ~700M parameters with a 128K Arabic-aware tokenizer. See ADR-210 / ADR-211 in the project repo. This V4 base remains the experimental control.

## License

**CC BY-NC 4.0**: non-commercial use only. Attribution required. No warranty, no liability.
## Citation

```bibtex
@misc{arkadiko_v4_base_2026,
  author = {{VectorNomad}},
  title = {Arkadiko V4: A 214M Arabic-Inclusive Pretrained Base Model},
  year = {2026},
  publisher = {Hugging Face},
  howpublished = {\url{https://huggingface.co/VectorNomad/arkadiko-v4-base}}
}
```

## Acknowledgements

Trained on a single RTX PRO 4000 Blackwell. Bridges, not factories.