modeling_quasar_long.py imports a raven/ package that isn't included: model can't be instantiated

#2
by sahilchachra - opened

Summary

Thanks for releasing the Quasar-Preview weights and the fla-based modeling code. I'm working on an MLX port so the model can run on Apple Silicon, and I've hit a blocker: the published repo references a local raven/ package that isn't part of the release, so the model can't be instantiated.

Details

In modeling_quasar_long.py, QuasarLongHybridReplacementSdpaAttention.__init__ runs this for every hybrid attention layer:

if not os.path.isdir(os.path.join(_HERE, "raven")):
    raise ModuleNotFoundError("Quasar requires the bundled repo-local raven/ folder for Raven hybrid layers")
from raven.layers.raven import RavenAttention

The raven/ folder is not in the model repo (nor in Quasar-3B-A1B-Preview, the SILX-LABS GitHub org, or PyPI). With config.json setting hybrid_attention_layers = [4..19], this guard fires for all of them, so AutoModelForCausalLM.from_pretrained(...) raises before any forward pass.
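For anyone who wants to confirm this without loading weights, a pre-flight check along these lines mirrors the condition the guard tests. It is only a sketch: `snapshot_dir` stands for a hypothetical local copy of the model repo, and `raven_package_status` is a helper name I made up, not something from the release.

```python
import os


def raven_package_status(snapshot_dir):
    """Report whether the repo-local raven/ package is present.

    `snapshot_dir` is a hypothetical path to a local copy of the model repo.
    The guard in modeling_quasar_long.py only tests for the raven/ directory;
    the file check below is an extra, based on the import path
    `raven.layers.raven`.
    """
    raven_dir = os.path.join(snapshot_dir, "raven")
    layer_file = os.path.join(raven_dir, "layers", "raven.py")
    return {
        "raven_dir": os.path.isdir(raven_dir),
        "raven_layer_file": os.path.isfile(layer_file),
    }
```

If `raven_dir` comes back False, the guard quoted above will raise as soon as a hybrid attention layer is constructed.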

Per the config, the Raven branch is used for layers 5, 10, and 15 (hybrid_layerwise_cycle = ["quasar","raven","quasar","quasar","gla"], decay_type="Mamba2", slots=64, topk=32). The checkpoint has the matching weights, but the recurrence they drive isn't recoverable from tensor names alone.
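This is how I read the layer-to-branch mapping from the two config values quoted above. It is a sketch under one assumption I have not confirmed in the modeling code: the cycle position is counted from the first hybrid layer (layer 4), not from layer 0. `branch_for_layer` is a hypothetical helper of mine.

```python
# Values as quoted from config.json in this post.
hybrid_attention_layers = list(range(4, 20))  # [4..19]
hybrid_layerwise_cycle = ["quasar", "raven", "quasar", "quasar", "gla"]


def branch_for_layer(layer_idx,
                     layers=hybrid_attention_layers,
                     cycle=hybrid_layerwise_cycle):
    """Return the branch name for a layer, or None if it is not hybrid.

    ASSUMPTION: the cycle index is the offset from the first hybrid layer.
    Counting from layer 0 instead would give different layers.
    """
    if layer_idx not in layers:
        return None
    return cycle[(layer_idx - layers[0]) % len(cycle)]


raven_layers = [i for i in hybrid_attention_layers
                if branch_for_layer(i) == "raven"]
print(raven_layers)
```

Under that assumption the Raven layers come out as 5, 10, and 15, which is where the numbers above come from.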

My Ask

Could you publish the raven/ package, specifically raven/layers/raven.py (RavenAttention) and anything it imports? That's the one missing piece blocking instantiation. The Quasar branch (fla.layers.quasar.QuasarAttention), GLA, the MoE block, and the standard attention are all already present in the repo.

If it's easier, even a minimal reference forward pass / equations for the Mamba2-style slot+top-k recurrence would be enough for me to reimplement it faithfully and verify against the original.

Why

Without it, the published checkpoint can't be loaded with the published code, and a faithful port (or any independent reimplementation) can't be verified for correctness. Happy to share the MLX port back once it works. Thanks!

I read the previous post in the discussion and I'm good to go. Thanks!

sahilchachra changed discussion status to closed
