fst_noff_1_1B

Custom FST (Finite State Transformer) with 1.1B parameters, no feed-forward (NOFF) variant. It uses RoPE (rotary position embeddings), RMSNorm, and a SwiGLU MLP in the feature blocks.
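
For readers unfamiliar with two of the components named above, here is a minimal pure-Python sketch of the textbook definitions of RMSNorm and a SwiGLU MLP. It is not taken from this model's code: the function names, the toy dimensions, the weight values, and the `eps` default are all illustrative assumptions, and the implementation in the repository may differ. RoPE is not covered here.

```python
import math

def rms_norm(x, weight, eps=1e-6):
    """Divide x by its root mean square, then scale by a per-feature weight."""
    mean_sq = sum(v * v for v in x) / len(x)
    inv_rms = 1.0 / math.sqrt(mean_sq + eps)
    return [v * inv_rms * w for v, w in zip(x, weight)]

def silu(v):
    """SiLU (swish) activation: v * sigmoid(v)."""
    return v / (1.0 + math.exp(-v))

def matvec(matrix, vec):
    """Multiply a row-major matrix (a list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vec)) for row in matrix]

def swiglu_mlp(x, w_gate, w_up, w_down):
    """SwiGLU MLP without biases: down(silu(gate(x)) * up(x))."""
    gate = [silu(g) for g in matvec(w_gate, x)]
    up = matvec(w_up, x)
    hidden = [g * u for g, u in zip(gate, up)]
    return matvec(w_down, hidden)

# Toy example (hypothetical sizes): model width 2, hidden width 3.
x = [3.0, 4.0]
normed = rms_norm(x, weight=[1.0, 1.0])
w_gate = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
w_up = [[0.5, 0.5], [1.0, -1.0], [0.0, 1.0]]
w_down = [[1.0, 0.0, 1.0], [0.0, 1.0, 1.0]]
out = swiglu_mlp(normed, w_gate, w_up, w_down)
print(normed, out)
```

After normalization with unit weights, the mean of the squared entries is approximately 1, which is the property RMSNorm is designed to enforce.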

Requirements

pip install transformers torch safetensors rotary_embedding_torch

Usage

from transformers import AutoModelForCausalLM, AutoTokenizer

model     = AutoModelForCausalLM.from_pretrained("williamconvertino/fst_noff_1_1B", trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained("williamconvertino/fst_noff_1_1B")

inputs  = tokenizer("Hello, world!", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

Loading the base model (no LM head)

from transformers import AutoModel
model = AutoModel.from_pretrained("williamconvertino/fst_noff_1_1B", trust_remote_code=True)

Loading the config and model class directly

from transformers import AutoConfig
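# This import only works if modeling_fst_noff.py (and any files it imports)
# from the model repository is available locally on the Python path.
from modeling_fst_noff import FST_NOFFForCausalLM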

config = AutoConfig.from_pretrained("williamconvertino/fst_noff_1_1B", trust_remote_code=True)
model  = FST_NOFFForCausalLM.from_pretrained("williamconvertino/fst_noff_1_1B", config=config, trust_remote_code=True)

Model weights: Safetensors format, 1B params, F32 tensor type.
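
As a rough guide to hardware needs, the weight memory can be estimated from the figures on this card (about 1.1B parameters, stored as F32). The sketch below is back-of-the-envelope arithmetic only: the helper name and the dtype table are illustrative, and the estimate ignores activations, the KV cache, and framework overhead.

```python
def weight_memory_bytes(num_params, bytes_per_param):
    """Bytes needed to hold the weights alone."""
    return num_params * bytes_per_param

# Bytes per element for common tensor types.
BYTES_PER_DTYPE = {"F32": 4, "BF16": 2, "F16": 2}

# 1.1B parameters, as stated in the model description.
NUM_PARAMS = 1_100_000_000

for dtype, size in BYTES_PER_DTYPE.items():
    gib = weight_memory_bytes(NUM_PARAMS, size) / 2**30
    print(f"{dtype}: {gib:.2f} GiB")
```

At F32 this comes to roughly 4.1 GiB for the weights. `from_pretrained` in Transformers accepts a `torch_dtype` argument (for example `torch.bfloat16`) to load in a smaller type; whether this custom architecture behaves well in reduced precision is not stated on the card, so treat that as something to verify.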