Commit History

d8a5a92 (verified) kashif (HF Staff): Use create_bidirectional_mask for backend-agnostic attention mask handling
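The commit above switches the model to a bidirectional attention mask. As a rough illustration of the idea (a plain-Python sketch of what "bidirectional" means here, not the actual `create_bidirectional_mask` helper from `transformers`, whose signature may differ): unlike a causal mask, every real token may attend to every other real token, and only padding positions are masked out.

```python
# Sketch only: illustrates a bidirectional vs. causal attention mask
# using plain Python lists; the real transformers helper operates on
# tensors and handles backend-specific formats.

def bidirectional_mask(attention_mask):
    """attention_mask: list of 0/1 per position (1 = real token).
    Returns mask[q][k] = True iff query q may attend to key k."""
    n = len(attention_mask)
    return [
        [attention_mask[q] == 1 and attention_mask[k] == 1 for k in range(n)]
        for q in range(n)
    ]

def causal_mask(attention_mask):
    """Same, but additionally forbids attending to future positions."""
    bi = bidirectional_mask(attention_mask)
    n = len(bi)
    return [[bi[q][k] and k <= q for k in range(n)] for q in range(n)]

pad = [1, 1, 1, 0]              # last position is padding
bi = bidirectional_mask(pad)
assert bi[0][2] is True          # token 0 sees a later token: no causal limit
assert bi[1][3] is False         # nobody attends to padding
```

A bidirectional mask is the natural choice for a diffusion-style LM, where tokens are denoised jointly rather than generated left to right.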

c555c2f (verified) kashif (HF Staff): fix: align _init_weights with Qwen2Moe using nn.init API

7729892 (verified) kashif (HF Staff): fix: call super()._init_weights() to match Qwen2Moe convention for transformers v5
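The two `_init_weights` commits above follow a common convention: the subclass first delegates to the base class's generic initializer, then handles only its model-specific modules. A minimal stand-in sketch (plain Python objects, not the real `PreTrainedModel` API, and the "router" special case is illustrative, not taken from the repo):

```python
# Sketch of the super()._init_weights() delegation pattern; module objects
# are modeled as dicts and the init values as strings for illustration.

class BasePreTrainedModel:
    def _init_weights(self, module):
        # generic initialization shared by common layer types
        module.setdefault("weight", "normal(0, initializer_range)")

class MyMoeModel(BasePreTrainedModel):
    def _init_weights(self, module):
        super()._init_weights(module)       # reuse the generic path first
        if module.get("kind") == "router":  # then special-case custom layers
            module["weight"] = "zeros"

model = MyMoeModel()
linear = {"kind": "linear"}
router = {"kind": "router"}
model._init_weights(linear)
model._init_weights(router)
assert linear["weight"] == "normal(0, initializer_range)"
assert router["weight"] == "zeros"
```

Delegating to `super()` keeps custom models in sync with the library's default initialization as it evolves, which is why the commit message ties it to transformers v5 compatibility.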

dfa9ac6 (verified) kashif (HF Staff): fix: align RotaryEmbedding with Qwen2Moe pattern for transformers compat
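For context on the RotaryEmbedding commit: the shared core of rotary position embeddings is a set of per-channel-pair inverse frequencies from which rotation angles are derived. A small sketch of that computation (illustrative dimensions and base; the real module precomputes cos/sin tensors and caches them):

```python
import math

# Sketch of the rotary-embedding frequency computation; dim and base
# are illustrative, not the model's actual configuration values.

def rope_inv_freq(dim, base=10000.0):
    # one inverse frequency per pair of channels
    return [base ** (-2 * i / dim) for i in range(dim // 2)]

def rope_angles(position, inv_freq):
    # rotation angle for each channel pair at a given position
    return [position * f for f in inv_freq]

inv = rope_inv_freq(8)
assert inv[0] == 1.0                             # first pair rotates fastest
assert math.isclose(inv[1], 10000.0 ** (-2 / 8)) # frequencies decay geometrically
assert rope_angles(3, inv)[0] == 3.0             # angle grows linearly with position
```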

408f916 (verified) utdawn: Update README.md

afa23bf (verified) utdawn: Update README.md

bb290a9 (verified) utdawn: Update modeling_llada2_moe.py

c538cf1 (verified) utdawn: Update README.md

c54780e (verified) utdawn: Create modeling_llada2_moe.py

d439464 (verified) utdawn: Update configuration_llada2_moe.py

e763a7e (verified) utdawn: Create README.md

849c58d (verified) m1ngcheng: Add files using upload-large-folder tool

fbe251b (verified) m1ngcheng: initial commit