fst_noff_1_1B

Custom FST (Finite State Transformer) model with 1.1B parameters, no feed-forward (NOFF) variant. It uses RoPE (rotary position embeddings), RMSNorm, and a SwiGLU MLP in the feature blocks.
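
For readers unfamiliar with the three components named above, the following is a minimal plain-Python sketch of the standard textbook definitions of RMSNorm, the SwiGLU gate, and a single RoPE pair rotation. It is not taken from this repository's modeling code: the function names, the eps default, and the per-pair frequency argument are all illustrative assumptions, and the real model operates on batched tensors rather than Python lists.

```python
import math
from typing import List, Tuple


def rms_norm(x: List[float], weight: List[float], eps: float = 1e-6) -> List[float]:
    """Scale x by the inverse of its root-mean-square, then by a learned weight."""
    mean_sq = sum(v * v for v in x) / len(x)
    inv_rms = 1.0 / math.sqrt(mean_sq + eps)
    return [w * v * inv_rms for v, w in zip(x, weight)]


def silu(z: float) -> float:
    """SiLU (swish) activation: z * sigmoid(z)."""
    return z / (1.0 + math.exp(-z))


def swiglu(gate: List[float], up: List[float]) -> List[float]:
    """Element-wise SwiGLU gating of two already-projected vectors.

    In a full MLP, `gate` and `up` would be two linear projections of the
    same input, and the result would pass through a third (down) projection.
    """
    return [silu(g) * u for g, u in zip(gate, up)]


def rope_rotate_pair(x1: float, x2: float, position: int, freq: float) -> Tuple[float, float]:
    """Rotate one (x1, x2) feature pair by the angle position * freq."""
    angle = position * freq
    cos_a, sin_a = math.cos(angle), math.sin(angle)
    return x1 * cos_a - x2 * sin_a, x1 * sin_a + x2 * cos_a
```

Because RoPE is a pure rotation, it leaves the length of each feature pair unchanged and only encodes position in its angle.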

Requirements

pip install transformers torch safetensors rotary_embedding_torch
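
To confirm that the four packages are importable before loading the model, a small check like the one below may help. It is only a sketch: the helper name missing_packages is hypothetical, and the import names are assumed to match the pip names listed above.

```python
from importlib.util import find_spec
from typing import Iterable, List

# Import names assumed to match the pip package names in the install command.
REQUIRED = ("transformers", "torch", "safetensors", "rotary_embedding_torch")


def missing_packages(names: Iterable[str] = REQUIRED) -> List[str]:
    """Return the names that cannot be found on the current Python path."""
    return [name for name in names if find_spec(name) is None]


# An empty list means every requirement was found.
print("Missing packages:", missing_packages())
```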

Usage

from transformers import AutoModelForCausalLM, AutoTokenizer

model     = AutoModelForCausalLM.from_pretrained("williamconvertino/fst_noff_1_1B", trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained("williamconvertino/fst_noff_1_1B")

inputs  = tokenizer("Hello, world!", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
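
The usage example above runs greedy decoding on the CPU. The sketch below wraps the same calls in a helper that moves the model to a GPU when one is available and optionally enables sampling. The helper names (build_generate_kwargs, generate_text) and the default temperature and top_p values are illustrative assumptions, and whether sampling behaves well with this custom architecture has not been verified here.

```python
from typing import Any, Dict


def build_generate_kwargs(max_new_tokens: int = 50, sample: bool = False,
                          temperature: float = 0.8, top_p: float = 0.95) -> Dict[str, Any]:
    """Build keyword arguments for model.generate (greedy by default)."""
    kwargs: Dict[str, Any] = {"max_new_tokens": max_new_tokens, "do_sample": sample}
    if sample:
        # Sampling parameters are only passed when sampling is enabled.
        kwargs.update(temperature=temperature, top_p=top_p)
    return kwargs


def generate_text(prompt: str, repo_id: str = "williamconvertino/fst_noff_1_1B",
                  **gen_options: Any) -> str:
    """Load the model, pick a device, and decode one completion."""
    # Heavy imports are kept inside the function so the helper above
    # can be used without torch or transformers installed.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    device = "cuda" if torch.cuda.is_available() else "cpu"
    model = AutoModelForCausalLM.from_pretrained(repo_id, trust_remote_code=True)
    model = model.to(device).eval()
    tokenizer = AutoTokenizer.from_pretrained(repo_id)

    inputs = tokenizer(prompt, return_tensors="pt").to(device)
    with torch.no_grad():
        outputs = model.generate(**inputs, **build_generate_kwargs(**gen_options))
    return tokenizer.decode(outputs[0], skip_special_tokens=True)
```

For example, generate_text("Hello, world!", sample=True, max_new_tokens=30) would download the weights on first use and return one sampled continuation.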

Loading the base model (no LM head)

from transformers import AutoModel
model = AutoModel.from_pretrained("williamconvertino/fst_noff_1_1B", trust_remote_code=True)

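Loading the config and model class directly (requires the modeling_fst_noff module from the model repository to be importable, for example on your Python path)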

from transformers import AutoConfig
from modeling_fst_noff import FST_NOFFForCausalLM

config = AutoConfig.from_pretrained("williamconvertino/fst_noff_1_1B", trust_remote_code=True)
model  = FST_NOFFForCausalLM.from_pretrained("williamconvertino/fst_noff_1_1B", config=config, trust_remote_code=True)

Safetensors

Model size: 1B params
Tensor type: F32
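
Since the weights are stored as F32, a rough estimate of the memory needed just to hold them is the parameter count times 4 bytes. The sketch below does that arithmetic, taking the 1.1B figure from the model description as the parameter count. The helper name and the dtype table are illustrative assumptions, and the estimate excludes activations, the KV cache, and framework overhead.

```python
# Bytes per parameter for a few common tensor types.
BYTES_PER_DTYPE = {"F32": 4, "F16": 2, "BF16": 2}


def weight_memory_gib(n_params: float, dtype: str = "F32") -> float:
    """Approximate memory for the weights alone, in GiB (2**30 bytes)."""
    return n_params * BYTES_PER_DTYPE[dtype] / 2**30


# 1.1e9 parameters * 4 bytes = 4.4e9 bytes, roughly 4.1 GiB in F32.
print(f"F32:  {weight_memory_gib(1.1e9, 'F32'):.2f} GiB")
print(f"BF16: {weight_memory_gib(1.1e9, 'BF16'):.2f} GiB")
```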
