VectraYX-Base 260M

VectraYX-Base is a 260M-parameter Spanish cybersecurity language model trained from scratch using the same three-phase curriculum and replay-buffer recipe as VectraYX-Nano, scaled to a mid-tier architecture (d_model=1024, n_layers=16).


Results (VectraYX-Bench, single seed)

Model Params B1 KW B3 TM B4 Tool B5 Chat
VectraYX-Nano v7 (N=4) 42M 0.332±0.005 0.230±0.052 0.725±0.130
VectraYX-Base 260M 260M 0.325 0.114 0.000 0.800
Base + LoRA mini (ratio 1:21, N=4) 260M 0.019±0.003 0.445±0.201 0.600
VectraYX-Pro 3B 3.2B 0.341 0.686 0.600 0.800

The B4 score of 0.000 after mixed SFT is a corpus-density artifact: with the LoRA mini setup at ratio 1:21, Base reaches B4=0.445±0.201.


Architecture

| Component | Value |
|---|---|
| Parameters | 260M |
| Layers | 16 |
| Hidden dim | 1024 |
| Attention heads | 16 (GQA 16q/4kv) |
| FFN | SwiGLU |
| Positional encoding | RoPE |
| Normalization | RMSNorm + QK-Norm |
| Tokenizer | BPE-16384 (same as Nano) |

The architecture matches the config in configs/base.json in vectrayx-paper-code.
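As a sanity check on the table, the parameter count can be roughly estimated from the listed dimensions. The sketch below is an illustration, not the author's code. It assumes values the card does not state: an FFN inner width of 4096 (`d_ff`), tied input/output embeddings, no bias terms, and a head dimension of `d_model / n_q_heads`. Norm weights are ignored as negligible. The function and parameter names are hypothetical.

```python
# Rough parameter estimate for a GQA + SwiGLU decoder.
# ASSUMPTIONS (not in the model card): d_ff=4096, tied embeddings,
# no biases, head_dim = d_model // n_q_heads. Norm weights are ignored.

def estimate_params(d_model=1024, n_layers=16, n_q_heads=16, n_kv_heads=4,
                    vocab_size=16384, d_ff=4096, tied_embeddings=True):
    head_dim = d_model // n_q_heads
    kv_dim = n_kv_heads * head_dim  # GQA: keys/values use fewer heads

    # Attention: Q and output projections are d_model x d_model;
    # K and V projections are d_model x kv_dim.
    attn = 2 * d_model * d_model + 2 * d_model * kv_dim

    # SwiGLU uses three matrices: gate, up, and down projections.
    ffn = 3 * d_model * d_ff

    embed = vocab_size * d_model
    if not tied_embeddings:
        embed *= 2  # separate output projection

    return n_layers * (attn + ffn) + embed


total = estimate_params()
print(f"{total:,} parameters (~{total / 1e6:.0f}M)")
```

Under these assumptions the estimate lands close to the stated 260M. That is consistent with the assumed `d_ff` and weight tying but does not confirm them; configs/base.json remains the reference.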


Files

| File | Description |
|---|---|
| base_sft_v1_s42.pt | Base 260M post-SFT, seed 42 (~3.1 GB) |

Training ran on AWS SageMaker ml.g5.xlarge (NVIDIA A10G 24GB), ~11 wall-clock hours, ~$11 USD.
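The card does not document the internal layout of base_sft_v1_s42.pt, so it can help to inspect the file before wiring it into a model class. The sketch below assumes the checkpoint is either a raw state dict or a dict that wraps one under a common key. The key names tried here are guesses, not confirmed by the source. Note that 260M fp32 weights alone would take about 1.04 GB, so the ~3.1 GB file may hold more than bare weights. That is a guess as well.

```python
# Inspect a PyTorch checkpoint without knowing its exact layout.
# ASSUMPTION: the file is a raw state dict or wraps one under one of the
# hypothetical keys below; the model card does not say which.

def extract_state_dict(checkpoint):
    """Return the weight mapping from a raw or wrapped checkpoint."""
    if isinstance(checkpoint, dict):
        for key in ("model", "model_state_dict", "state_dict"):
            if isinstance(checkpoint.get(key), dict):
                return checkpoint[key]
    return checkpoint


def count_parameters(state_dict):
    """Sum element counts over all tensor-like values."""
    return sum(v.numel() for v in state_dict.values() if hasattr(v, "numel"))


def inspect_checkpoint(path="base_sft_v1_s42.pt"):
    import torch  # imported lazily so the helpers above work without PyTorch

    # weights_only=True restricts unpickling to tensors and plain containers;
    # loading will fail if the file stores arbitrary Python objects.
    checkpoint = torch.load(path, map_location="cpu", weights_only=True)
    state = extract_state_dict(checkpoint)
    print(f"{len(state)} entries, {count_parameters(state):,} parameters")
    return state
```

Calling `inspect_checkpoint()` in the directory that contains the file prints the entry count and total parameter count, which can be compared against the 260M figure above.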


Citation

@misc{santillana2026vectrayx,
  title     = {VectraYX-Nano: A 42M-Parameter Spanish Cybersecurity Language Model
               with Curriculum Learning and Native Tool Use},
  author    = {Santillana, Juan S.},
  year      = {2026},
  eprint    = {2605.13989},
  archivePrefix = {arXiv},
  primaryClass  = {cs.CL},
  url       = {https://arxiv.org/abs/2605.13989}
}