fst_noff_1_1B
Custom FST (Finite State Transformer) with 1.1B parameters, no feed-forward (NOFF) variant. It uses RoPE (rotary position embeddings), RMSNorm, and a SwiGLU MLP in the feature blocks.
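For readers unfamiliar with two of the components named above, here is a minimal plain-Python sketch of the standard RMSNorm and SwiGLU formulas. It illustrates the general definitions only and is not taken from this model's code. The function names, the `eps` default, and the bias-free weight layout are assumptions, and the model's actual implementation (in PyTorch) may differ in detail.

```python
import math

def rms_norm(x, gain, eps=1e-6):
    """Scale x by the reciprocal of its root mean square, then by a learned gain."""
    mean_sq = sum(v * v for v in x) / len(x)
    inv_rms = 1.0 / math.sqrt(mean_sq + eps)
    return [v * inv_rms * g for v, g in zip(x, gain)]

def silu(v):
    """SiLU (swish) activation: v * sigmoid(v)."""
    return v / (1.0 + math.exp(-v))

def matvec(w, x):
    """Multiply a matrix (given as a list of rows) by a vector."""
    return [sum(wi * xi for wi, xi in zip(row, x)) for row in w]

def swiglu_mlp(x, w_gate, w_up, w_down):
    """SwiGLU MLP without biases: down(silu(gate(x)) * up(x))."""
    gate = [silu(v) for v in matvec(w_gate, x)]
    up = matvec(w_up, x)
    hidden = [g * u for g, u in zip(gate, up)]  # elementwise gating
    return matvec(w_down, hidden)
```

Unlike LayerNorm, RMSNorm does not subtract the mean; it only rescales. After normalization with unit gain, the mean of the squared outputs is approximately 1.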
Requirements
pip install transformers torch safetensors rotary_embedding_torch
Usage
from transformers import AutoModelForCausalLM, AutoTokenizer
model = AutoModelForCausalLM.from_pretrained("williamconvertino/fst_noff_1_1B", trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained("williamconvertino/fst_noff_1_1B")
inputs = tokenizer("Hello, world!", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
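The snippet above prints the prompt together with the continuation. If only the newly generated text is wanted, a small wrapper like the following may help. This is a sketch, with `generate_text` as a hypothetical helper name. It assumes the model exposes the usual `device` attribute of a `transformers` model and that `generate` returns the prompt tokens followed by the new tokens, which is the standard behavior for decoder-only models but has not been verified for this custom architecture.

```python
def generate_text(model, tokenizer, prompt, max_new_tokens=50, **gen_kwargs):
    """Generate a continuation and return only the newly produced text."""
    inputs = tokenizer(prompt, return_tensors="pt")
    # Move every input tensor to the device the model lives on.
    inputs = {name: tensor.to(model.device) for name, tensor in inputs.items()}
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens, **gen_kwargs)
    # Drop the prompt tokens so that only the continuation is decoded.
    prompt_length = inputs["input_ids"].shape[1]
    new_tokens = output_ids[0][prompt_length:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

With the model and tokenizer loaded as above, `generate_text(model, tokenizer, "Hello, world!")` returns the continuation as a string. Extra keyword arguments such as `do_sample=True` or `temperature=0.7` are passed through to `generate`.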
Loading the base model (no LM head)
from transformers import AutoModel
model = AutoModel.from_pretrained("williamconvertino/fst_noff_1_1B", trust_remote_code=True)
Loading the config and model class directly
from transformers import AutoConfig
# Requires a local copy of the modeling_fst_noff module on the Python path
from modeling_fst_noff import FST_NOFFForCausalLM
config = AutoConfig.from_pretrained("williamconvertino/fst_noff_1_1B", trust_remote_code=True)
model = FST_NOFFForCausalLM.from_pretrained("williamconvertino/fst_noff_1_1B", config=config, trust_remote_code=True)
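The direct import above only works if the `modeling_fst_noff` module can be found locally. One possible way to arrange that is to download the file from the Hub and add its directory to `sys.path`, as sketched below. The filename `modeling_fst_noff.py` is inferred from the import statement and is an assumption, as are the helper names. If the modeling file uses relative imports of sibling modules, a plain top-level import like this may fail, and loading through `AutoModelForCausalLM` with `trust_remote_code=True` is the safer route.

```python
import os
import sys

def add_module_dir_to_path(file_path):
    """Put the directory containing file_path at the front of sys.path (once)."""
    module_dir = os.path.dirname(os.path.abspath(file_path))
    if module_dir not in sys.path:
        sys.path.insert(0, module_dir)
    return module_dir

def fetch_modeling_file(repo_id="williamconvertino/fst_noff_1_1B",
                        filename="modeling_fst_noff.py"):  # assumed filename
    """Download one file from the Hub and make its directory importable."""
    # Imported lazily so the path helper above works without network access.
    from huggingface_hub import hf_hub_download
    local_path = hf_hub_download(repo_id=repo_id, filename=filename)
    add_module_dir_to_path(local_path)
    return local_path
```

Calling `fetch_modeling_file()` once before the `from modeling_fst_noff import ...` line should be enough in the simple case where the module has no sibling-file dependencies.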