2.63 GB

1 contributor

History: 2 commits

Echoes123-3

Upload MoE + pretrained router

44e5009 verified 8 months ago

.gitattributes

1.52 kB
initial commit 8 months ago
added_tokens.json

1.08 kB
Upload MoE + pretrained router 8 months ago
merges.txt

456 kB
Upload MoE + pretrained router 8 months ago
moe_model.pt
Detected Pickle imports (58)
- "torch.bfloat16",
- "torch.FloatStorage",
- "torch.nn.modules.linear.Linear",
- "transformers.models.bert.modeling_bert.BertSdpaSelfAttention",
- "transformers.models.bert.modeling_bert.BertModel",
- "transformers.models.bert.modeling_bert.BertEncoder",
- "torch.float32",
- "transformers.quantizers.quantizer_bnb_4bit.Bnb4BitHfQuantizer",
- "transformers.models.phi.modeling_phi.PhiForCausalLM",
- "transformers.models.llama.modeling_llama.LlamaForCausalLM",
- "collections.OrderedDict",
- "torch.uint8",
- "transformers.models.bert.modeling_bert.BertPooler",
- "transformers.models.bert.configuration_bert.BertConfig",
- "torch.nn.modules.dropout.Dropout",
- "transformers.models.bert.modeling_bert.BertSelfOutput",
- "transformers.models.llama.modeling_llama.LlamaRMSNorm",
- "__builtin__.set",
- "transformers.models.bert.modeling_bert.BertIntermediate",
- "transformers.modeling_rope_utils._compute_default_rope_parameters",
- "transformers.models.bert.modeling_bert.BertAttention",
- "transformers.models.phi.modeling_phi.PhiMLP",
- "torch.Size",
- "transformers.models.phi.modeling_phi.PhiRotaryEmbedding",
- "transformers.models.llama.configuration_llama.LlamaConfig",
- "torch.LongStorage",
- "torch._C._nn.gelu",
- "transformers.models.llama.modeling_llama.LlamaModel",
- "torch.nn.modules.sparse.Embedding",
- "transformers.models.llama.modeling_llama.LlamaAttention",
- "bitsandbytes.nn.modules.Linear4bit",
- "transformers.models.phi.modeling_phi.PhiDecoderLayer",
- "transformers.models.bert.modeling_bert.BertOutput",
- "torch.ByteStorage",
- "transformers.models.llama.modeling_llama.LlamaMLP",
- "torch.nn.modules.container.ModuleList",
- "torch._utils._rebuild_tensor_v2",
- "torch._utils._rebuild_parameter",
- "transformers.models.phi.modeling_phi.PhiModel",
- "torch._utils._rebuild_parameter_with_state",
- "torch.float16",
- "transformers.utils.quantization_config.BitsAndBytesConfig",
- "transformers.models.bert.modeling_bert.BertEmbeddings",
- "transformers.models.llama.modeling_llama.LlamaDecoderLayer",
- "torch.HalfStorage",
- "torch.nn.modules.activation.Tanh",
- "transformers.models.bert.modeling_bert.BertLayer",
- "__main__.SimpleMoE",
- "transformers.activations.NewGELUActivation",
- "torch.nn.modules.normalization.LayerNorm",
- "transformers.models.llama.modeling_llama.LlamaRotaryEmbedding",
- "transformers.models.phi.configuration_phi.PhiConfig",
- "transformers.activations.GELUActivation",
- "transformers.utils.quantization_config.QuantizationMethod",
- "bitsandbytes.functional.QuantState",
- "transformers.models.phi.modeling_phi.PhiAttention",
- "transformers.generation.configuration_utils.GenerationConfig",
- "transformers.activations.SiLUActivation"
2.61 GB
xet

Upload MoE + pretrained router 8 months ago
router.pt

17.6 MB
xet

Upload MoE + pretrained router 8 months ago
special_tokens_map.json

441 Bytes
Upload MoE + pretrained router 8 months ago
tokenizer.json

3.56 MB
Upload MoE + pretrained router 8 months ago
tokenizer_config.json

7.4 kB
Upload MoE + pretrained router 8 months ago
vocab.json

798 kB
Upload MoE + pretrained router 8 months ago
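
The "Detected Pickle imports" list for moe_model.pt names the classes and functions the pickle stream would import when loaded. A list of this kind can be approximated locally without unpickling anything, by walking the opcodes with the standard-library pickletools module. The sketch below is a minimal illustration, not the scanner the hosting site uses. It assumes the checkpoint was written by torch.save in its zip-based format, where the pickle stream sits in an archive member named data.pkl. The helper names list_pickle_imports and read_checkpoint_pickle are hypothetical, and the STACK_GLOBAL handling is a heuristic (it pairs the two most recent string arguments) that can miss names reused through the pickle memo.

```python
import pickle
import pickletools
import zipfile
from collections import OrderedDict


def list_pickle_imports(data):
    """Return the sorted 'module.name' globals referenced by a pickle stream.

    The stream is only disassembled, never loaded, so no code from the
    pickle is executed.
    """
    imports = set()
    recent_strings = []  # string arguments seen so far, used for STACK_GLOBAL
    for opcode, arg, _pos in pickletools.genops(data):
        if opcode.name == "GLOBAL":
            # Protocols 0-3: the argument is "module name" joined by a space.
            module, name = arg.split(" ", 1)
            imports.add(f"{module}.{name}")
        elif opcode.name == "STACK_GLOBAL":
            # Protocol 4+: module and name were pushed as strings beforehand.
            # Heuristic: assume they are the two most recent string arguments.
            if len(recent_strings) >= 2:
                imports.add(f"{recent_strings[-2]}.{recent_strings[-1]}")
        elif isinstance(arg, str):
            recent_strings.append(arg)
    return sorted(imports)


def read_checkpoint_pickle(path_or_file):
    """Read the raw pickle bytes from a zip-format checkpoint.

    Assumption: the archive holds exactly the layout torch.save produces,
    with the object graph stored in a member whose name ends in 'data.pkl'.
    """
    with zipfile.ZipFile(path_or_file) as archive:
        for member in archive.namelist():
            if member.endswith("data.pkl"):
                return archive.read(member)
    raise FileNotFoundError("no data.pkl member found in archive")


if __name__ == "__main__":
    # Small self-contained demonstration on an in-memory pickle.
    demo = pickle.dumps(OrderedDict(a=1), protocol=2)
    print(list_pickle_imports(demo))
```

For the real file, the call would be list_pickle_imports(read_checkpoint_pickle("moe_model.pt")). An entry such as "__main__.SimpleMoE" in the output indicates that the whole model object was pickled, so loading it requires a class with that name to be defined in the loading script.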
