Component 4: Model Architecture (420M Starter)
What This Component Builds
- A decoder-only transformer language model for code generation.
- Configurable size through YAML config.
- Presets for small, medium (420M target), and large.
- Attention + rotary positional encoding + feed-forward blocks.
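To see how a config lands near the 420M target, here is a minimal back-of-the-envelope sketch. The function name `estimate_params` and the values `vocab_size=32_000`, `d_model=1152`, `n_layers=24`, and a 4x feed-forward width are assumptions for illustration, not values read from `component4_model_config.yaml`. It counts only weight matrices, ignoring biases and norm weights, and relies on rotary positional encoding adding no learned parameters.

```python
def estimate_params(vocab_size, d_model, n_layers, ffn_mult=4, tie_embeddings=True):
    """Rough weight count for a decoder-only transformer (biases and norms ignored)."""
    attn = 4 * d_model * d_model              # Q, K, V, and output projections
    ffn = 2 * d_model * (ffn_mult * d_model)  # up-projection and down-projection
    embed = vocab_size * d_model              # token embedding table
    # With tied embeddings the output head reuses the embedding table.
    lm_head = 0 if tie_embeddings else vocab_size * d_model
    return n_layers * (attn + ffn) + embed + lm_head


# Hypothetical dimensions, chosen only to illustrate the arithmetic.
total = estimate_params(vocab_size=32_000, d_model=1152, n_layers=24)
print(f"{total / 1e6:.1f}M parameters")
```

With these assumed dimensions the estimate comes to roughly 419M. The count reported by the build script is the authoritative number and may differ, depending on the feed-forward variant, biases, and whether embeddings are tied.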
Main Files
- src/model_architecture/code_transformer.py
- configs/component4_model_config.yaml
- scripts/build_component4_model.py
- scripts/verify_component4_model.py
Commands (run from project root)
.\.venv\Scripts\Activate.ps1
python .\scripts\build_component4_model.py --config .\configs\component4_model_config.yaml
python .\scripts\verify_component4_model.py --config .\configs\component4_model_config.yaml --batch_size 1 --seq_len 256
What Success Looks Like
- Build script prints parameter count near the 420M target.
- Verify script prints:
  - VRAM usage at multiple stages
  - output tensor shape
  - Component 4 verification passed.
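As a rough sanity check on the VRAM figures the verify script prints, the memory taken by the weights alone can be estimated by hand. This is a sketch under stated assumptions: `weight_memory_gib` is a hypothetical helper, not part of the project scripts, and 420M is used as a nominal parameter count. Activations, the CUDA context, and any optimizer state add to this floor.

```python
def weight_memory_gib(n_params, bytes_per_param):
    """Memory needed to hold the model weights alone, in GiB."""
    return n_params * bytes_per_param / 2**30


# Nominal 420M parameters; the real count comes from the build script.
for name, size in (("fp32", 4), ("fp16/bf16", 2)):
    print(f"{name}: {weight_memory_gib(420_000_000, size):.2f} GiB")
```

If the first VRAM reading after model load is far below this floor for the dtype in use, the model is probably not on the GPU yet. If it is far above, something else is holding memory.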