Echoes123-3/moe-router-test
Branch: main
Repository size: 2.63 GB
1 contributor
History: 2 commits
Latest commit: Echoes123-3, "Upload MoE + pretrained router" (44e5009, verified), 6 months ago
File                      Scan     Size        Storage   Last commit                      Updated
.gitattributes            Safe     1.52 kB     -         initial commit                   6 months ago
added_tokens.json         Safe     1.08 kB     -         Upload MoE + pretrained router   6 months ago
merges.txt                Safe     456 kB      -         Upload MoE + pretrained router   6 months ago
moe_model.pt              pickle   2.61 GB     xet       Upload MoE + pretrained router   6 months ago
router.pt                 -        17.6 MB     xet       Upload MoE + pretrained router   6 months ago
special_tokens_map.json   Safe     441 Bytes   -         Upload MoE + pretrained router   6 months ago
tokenizer.json            Safe     3.56 MB     -         Upload MoE + pretrained router   6 months ago
tokenizer_config.json     Safe     7.4 kB      -         Upload MoE + pretrained router   6 months ago
vocab.json                Safe     798 kB      -         Upload MoE + pretrained router   6 months ago

moe_model.pt: Detected Pickle imports (58)
  torch.bfloat16
  torch.FloatStorage
  torch.nn.modules.linear.Linear
  transformers.models.bert.modeling_bert.BertSdpaSelfAttention
  transformers.models.bert.modeling_bert.BertModel
  transformers.models.bert.modeling_bert.BertEncoder
  torch.float32
  transformers.quantizers.quantizer_bnb_4bit.Bnb4BitHfQuantizer
  transformers.models.phi.modeling_phi.PhiForCausalLM
  transformers.models.llama.modeling_llama.LlamaForCausalLM
  collections.OrderedDict
  torch.uint8
  transformers.models.bert.modeling_bert.BertPooler
  transformers.models.bert.configuration_bert.BertConfig
  torch.nn.modules.dropout.Dropout
  transformers.models.bert.modeling_bert.BertSelfOutput
  transformers.models.llama.modeling_llama.LlamaRMSNorm
  __builtin__.set
  transformers.models.bert.modeling_bert.BertIntermediate
  transformers.modeling_rope_utils._compute_default_rope_parameters
  transformers.models.bert.modeling_bert.BertAttention
  transformers.models.phi.modeling_phi.PhiMLP
  torch.Size
  transformers.models.phi.modeling_phi.PhiRotaryEmbedding
  transformers.models.llama.configuration_llama.LlamaConfig
  torch.LongStorage
  torch._C._nn.gelu
  transformers.models.llama.modeling_llama.LlamaModel
  torch.nn.modules.sparse.Embedding
  transformers.models.llama.modeling_llama.LlamaAttention
  bitsandbytes.nn.modules.Linear4bit
  transformers.models.phi.modeling_phi.PhiDecoderLayer
  transformers.models.bert.modeling_bert.BertOutput
  torch.ByteStorage
  transformers.models.llama.modeling_llama.LlamaMLP
  torch.nn.modules.container.ModuleList
  torch._utils._rebuild_tensor_v2
  torch._utils._rebuild_parameter
  transformers.models.phi.modeling_phi.PhiModel
  torch._utils._rebuild_parameter_with_state
  torch.float16
  transformers.utils.quantization_config.BitsAndBytesConfig
  transformers.models.bert.modeling_bert.BertEmbeddings
  transformers.models.llama.modeling_llama.LlamaDecoderLayer
  torch.HalfStorage
  torch.nn.modules.activation.Tanh
  transformers.models.bert.modeling_bert.BertLayer
  __main__.SimpleMoE
  transformers.activations.NewGELUActivation
  torch.nn.modules.normalization.LayerNorm
  transformers.models.llama.modeling_llama.LlamaRotaryEmbedding
  transformers.models.phi.configuration_phi.PhiConfig
  transformers.activations.GELUActivation
  transformers.utils.quantization_config.QuantizationMethod
  bitsandbytes.functional.QuantState
  transformers.models.phi.modeling_phi.PhiAttention
  transformers.generation.configuration_utils.GenerationConfig
  transformers.activations.SiLUActivation
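
A list like the "Detected Pickle imports" reported for moe_model.pt can be approximated locally without unpickling the file. The following is a minimal sketch using only the Python standard library. It assumes the checkpoint is a zip archive containing one or more .pkl members (the layout torch.save is assumed to produce) and that imports are recorded as GLOBAL opcodes, as in pickle protocol 2. The function names list_pickle_imports and list_checkpoint_imports are hypothetical, and this is not the scanner the Hub itself runs.

```python
import io
import pickle
import pickletools
import zipfile
from collections import OrderedDict


def list_pickle_imports(data: bytes) -> set:
    """Collect 'module.name' references from a pickle stream without executing it."""
    imports = set()
    for opcode, arg, _pos in pickletools.genops(data):
        # GLOBAL and INST carry "module name" as one space-separated string.
        # Protocol 4+ uses STACK_GLOBAL instead, which this sketch does not resolve.
        if opcode.name in ("GLOBAL", "INST"):
            module, name = arg.split(" ", 1)
            imports.add(f"{module}.{name}")
    return imports


def list_checkpoint_imports(path_or_file) -> set:
    """Scan every .pkl member of a zip-format checkpoint (assumed torch.save layout)."""
    found = set()
    with zipfile.ZipFile(path_or_file) as archive:
        for member in archive.namelist():
            if member.endswith(".pkl"):
                found |= list_pickle_imports(archive.read(member))
    return found


if __name__ == "__main__":
    # Self-contained demo: a tiny stand-in archive, not the real moe_model.pt.
    payload = pickle.dumps(OrderedDict(a=1), protocol=2, fix_imports=False)
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("archive/data.pkl", payload)
    buffer.seek(0)
    print(sorted(list_checkpoint_imports(buffer)))
```

On a local download, the intended call would be list_checkpoint_imports("moe_model.pt"). An entry such as __main__.SimpleMoE in the result means the pickle refers to a class defined in the uploader's script, so that class definition has to be importable under the same name before the object can be rebuilt.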