🎼 ORCH Next.js 3B

A 3-billion-parameter decoder-only transformer trained from scratch to generate complete, production-ready Next.js applications: pages, API routes, Prisma schemas, Tailwind components, and configs.


TL;DR

Parameters:            ~3.0 billion
Architecture:          Custom LLaMA-style decoder-only transformer
Training:              From scratch (no base model)
Vocabulary:            32,000 (custom)
Context length:        16,384 tokens
Hardware:              NVIDIA A40 48GB (RunPod)
Training duration:     ~2 hours (3 epochs, ~29,000 steps)
License:               Apache 2.0

What this is

The largest model in the from-scratch ORCH lineup. Designed for full-stack Next.js generation: not just snippets, but complete project structures, including TypeScript components, App Router pages, server actions, Prisma schemas, Tailwind utilities, and configuration files.

This is not a fine-tune of any pretrained model. The architecture is custom, and the weights are trained end-to-end on curated Next.js repositories.

Architecture

Layers:                32
Hidden size:           2,560
Intermediate size:     10,240
Attention heads:       32
KV heads (GQA):        8
Max position:          16,384
RoPE theta:            10,000
Activation:            SwiGLU
Normalization:         RMSNorm
Tied embeddings:       no
Vocab size:            32,000 (custom)
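
As a sanity check, the configuration above can be turned into a rough parameter count. This is a minimal sketch, not the ORCH code: it assumes a conventional LLaMA-style layout (no bias terms, separate Q/K/V/O projections, a three-matrix SwiGLU MLP, two RMSNorms per layer plus a final one), none of which the card states explicitly. The function name `estimate_params` is hypothetical.

```python
# Rough parameter count from the listed configuration.
# Assumptions (not confirmed by the model card): no biases, standard
# LLaMA-style projections, untied input embedding and output head.

def estimate_params(layers, hidden, intermediate, heads, kv_heads, vocab, tied):
    head_dim = hidden // heads              # 2560 // 32 = 80
    kv_dim = kv_heads * head_dim            # 8 * 80 = 640 (GQA shrinks K and V)
    attn = 2 * hidden * hidden + 2 * hidden * kv_dim  # Q and O, then K and V
    mlp = 3 * hidden * intermediate         # gate, up, and down projections
    norms = 2 * hidden                      # two RMSNorm weight vectors per layer
    per_layer = attn + mlp + norms
    embed = vocab * hidden
    lm_head = 0 if tied else vocab * hidden
    final_norm = hidden
    return layers * per_layer + embed + lm_head + final_norm

total = estimate_params(32, 2560, 10240, 32, 8, 32000, tied=False)
print(f"{total:,} parameters (~{total / 1e9:.2f}B)")
# prints: 3,204,876,800 parameters (~3.20B)
```

Under these assumptions the estimate lands at about 3.2 billion, in the same range as the rounded ~3.0 billion figure; the exact number depends on details of the real implementation.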

The model combines standard LLaMA-style components: RoPE, GQA, SwiGLU, and RMSNorm. The 16,384-token context length lets it keep a meaningful portion of a project in context during generation.
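
To give a feel for what the GQA setting means at the full context length, the sketch below estimates key/value cache size for a single sequence. It is illustrative only: it assumes the inference code keeps a BF16 KV cache (2 bytes per value) for every layer, which the card does not confirm, and `kv_cache_bytes` is a hypothetical helper.

```python
# Estimated KV cache size for one sequence at the maximum context length.
# Assumptions: BF16 cache (2 bytes per value), batch size 1, keys and values
# cached for every layer. Whether ORCH caches this way is not stated.

def kv_cache_bytes(layers, kv_heads, head_dim, seq_len, bytes_per_value=2):
    # Leading factor 2: one key tensor and one value tensor per layer.
    return 2 * layers * kv_heads * head_dim * seq_len * bytes_per_value

head_dim = 2560 // 32                              # hidden size / attention heads
gqa = kv_cache_bytes(32, 8, head_dim, 16384)       # 8 KV heads, as listed
mha = kv_cache_bytes(32, 32, head_dim, 16384)      # hypothetical: one KV head per query head

print(f"GQA: {gqa / 2**30:.2f} GiB, full multi-head: {mha / 2**30:.2f} GiB")
# prints: GQA: 1.25 GiB, full multi-head: 5.00 GiB
```

With 8 KV heads shared across 32 query heads, the cache is a quarter of what full multi-head attention would need at the same length.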

Training

  • Data: curated Next.js repositories from GitHub
  • Hardware: single NVIDIA A40 48GB (RunPod)
  • Duration: ~2 hours
  • Epochs: 3 (~29,000 steps)
  • Batch size: 1 with gradient accumulation = 16
  • Sequence length: 512 tokens during training; the 16K context length applies only at inference
  • Precision: BFloat16
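
The settings above imply a rough token budget, sketched below. The card does not say whether "~29,000 steps" counts optimizer updates or individual micro-batches, so both readings are shown; the figures also assume every sequence is a full 512 tokens, which makes them upper bounds.

```python
# Token budget implied by the listed training settings.
# Assumption: every training sequence is a full 512 tokens (no padding).

micro_batch = 1      # sequences per forward/backward pass
grad_accum = 16      # micro-batches accumulated per optimizer update
seq_len = 512        # training sequence length in tokens
steps = 29_000       # approximate step count over all 3 epochs

seqs_per_update = micro_batch * grad_accum       # effective batch size: 16
tokens_per_update = seqs_per_update * seq_len    # 8,192 tokens per update

# Reading 1: "steps" are optimizer updates.
tokens_if_optimizer_steps = steps * tokens_per_update
# Reading 2: "steps" are individual micro-batches.
tokens_if_micro_steps = steps * micro_batch * seq_len

print(f"{tokens_per_update:,} tokens per optimizer update")
print(f"{tokens_if_optimizer_steps:,} tokens if steps are optimizer updates")
print(f"{tokens_if_micro_steps:,} tokens if steps are micro-batches")
```

Either way, the total covers all three epochs, so the number of unique training tokens is about a third of it.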

Generated artifacts

The model is trained to produce complete project structures:

  • Frontend: React components, App Router pages, layouts, hooks (TypeScript)
  • Backend: API routes, server actions, middleware
  • Database: Prisma schemas, query utilities
  • Styling: Tailwind CSS, shadcn/ui-style component patterns
  • Configuration: package.json, tsconfig.json, next.config.js

Usage

The weights are in a custom PyTorch format; load them with the ORCH inference code:

import torch
from orch.model.config import OrchConfig
from orch.model.transformer import OrchForCausalLM

# Load the weights with the ORCH loader (not transformers' AutoModelForCausalLM).
model = OrchForCausalLM.from_pretrained("raihan-js/orch-nextjs-3b")
# ... use the same tokenizer.json from the repo

Intended use

  • Full Next.js project bootstrapping from a natural language description
  • Research into scaling SLMs trained from scratch on domain-specific data
  • A from-scratch baseline to compare against fine-tuned models like ORCH-7B

Limitations

  • Training data scale: 3 epochs on curated Next.js repos. Don't expect the world knowledge of a 7B+ general-purpose model.
  • Training sequence length (512 tokens): the model accepts up to 16,384 tokens at inference, but quality may degrade on inputs longer than the sequences seen during training.
  • No safety alignment.
  • Custom format: requires the ORCH inference code; the weights cannot be loaded with AutoModelForCausalLM.

Author

Akteruzzaman Raihan Sikder, AI/ML engineer, CTO at ClarioScope AI.

Citation

@misc{sikder2025orchnextjs3b,
  title  = {ORCH Next.js 3B: A 3-Billion-Parameter Decoder-Only Transformer Trained From Scratch for Full-Stack Next.js Code Generation},
  author = {Sikder, Akteruzzaman Raihan},
  year   = {2025},
  url    = {https://huggingface.co/raihan-js/orch-nextjs-3b}
}