Text Generation
Transformers
Safetensors
mixtral
Mixture of Experts
frankenmoe
Merge
mergekit
lazymergekit
CultriX/MonaTrix-v4
mlabonne/OmniTruthyBeagle-7B-v0
CultriX/MoNeuTrix-7B-v1
paulml/OmniBeagleSquaredMBX-v3-7B
text-generation-inference
NeuralMona_MoE-4x7B
NeuralMona_MoE-4x7B is a Mixture of Experts (MoE) made with the following models using LazyMergekit:
- CultriX/MonaTrix-v4
- mlabonne/OmniTruthyBeagle-7B-v0
- CultriX/MoNeuTrix-7B-v1
- paulml/OmniBeagleSquaredMBX-v3-7B
🧩 Configuration
base_model: CultriX/MonaTrix-v4
dtype: bfloat16
experts:
  - source_model: "CultriX/MonaTrix-v4" # Historical Analysis, Geopolitics, and Economic Evaluation
    positive_prompts:
      - "Historic analysis"
      - "Geopolitical impacts"
      - "Evaluate significance"
      - "Predict impact"
      - "Assess consequences"
      - "Discuss implications"
      - "Explain geopolitical"
      - "Analyze historical"
      - "Examine economic"
      - "Evaluate role"
      - "Analyze importance"
      - "Discuss cultural impact"
      - "Discuss historical"
    negative_prompts:
      - "Compose"
      - "Translate"
      - "Debate"
      - "Solve math"
      - "Analyze data"
      - "Forecast"
      - "Predict"
      - "Process"
      - "Coding"
      - "Programming"
      - "Code"
      - "Datascience"
      - "Cryptography"
  - source_model: "mlabonne/OmniTruthyBeagle-7B-v0" # Multilingual Communication and Cultural Insights
    positive_prompts:
      - "Describe cultural"
      - "Explain in language"
      - "Translate"
      - "Compare cultural differences"
      - "Discuss cultural impact"
      - "Narrate in language"
      - "Explain impact on culture"
      - "Discuss national identity"
      - "Describe cultural significance"
      - "Narrate cultural"
      - "Discuss folklore"
    negative_prompts:
      - "Compose"
      - "Debate"
      - "Solve math"
      - "Analyze data"
      - "Forecast"
      - "Predict"
      - "Coding"
      - "Programming"
      - "Code"
      - "Datascience"
      - "Cryptography"
  - source_model: "CultriX/MoNeuTrix-7B-v1" # Problem Solving, Innovation, and Creative Thinking
    positive_prompts:
      - "Devise strategy"
      - "Imagine society"
      - "Invent device"
      - "Design concept"
      - "Propose theory"
      - "Reason math"
      - "Develop strategy"
      - "Invent"
    negative_prompts:
      - "Translate"
      - "Discuss"
      - "Debate"
      - "Summarize"
      - "Explain"
      - "Detail"
      - "Compose"
  - source_model: "paulml/OmniBeagleSquaredMBX-v3-7B" # Explaining Scientific Phenomena and Principles
    positive_prompts:
      - "Explain scientific"
      - "Discuss impact"
      - "Analyze potential"
      - "Elucidate significance"
      - "Summarize findings"
      - "Detail explanation"
    negative_prompts:
      - "Cultural significance"
      - "Engage in creative writing"
      - "Perform subjective judgment tasks"
      - "Discuss cultural traditions"
      - "Write review"
      - "Design"
      - "Create"
      - "Narrate"
      - "Discuss"
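As a rough illustration of how positive and negative prompts can steer routing, the sketch below builds one gate vector per expert (mean embedding of its positive prompts minus mean embedding of its negative prompts) and sends a query to the highest-scoring experts. This is a toy under stated assumptions, not mergekit's implementation: `embed` is a hypothetical bag-of-words stand-in for real hidden states, the prompt lists are shortened excerpts of the configuration above, and `top_k=2` is an assumed value that does not appear in this configuration.

```python
import math
from collections import Counter

def embed(text):
    """Hypothetical stand-in for a hidden state: L2-normalised bag of words."""
    counts = Counter(text.lower().split())
    norm = math.sqrt(sum(c * c for c in counts.values()))
    return {word: c / norm for word, c in counts.items()}

def mean(vectors):
    """Average a list of sparse vectors (dicts of word -> weight)."""
    total = Counter()
    for vec in vectors:
        for word, weight in vec.items():
            total[word] += weight
    return {word: weight / len(vectors) for word, weight in total.items()}

def gate_vector(positive, negative):
    """Gate = mean(positive prompt embeddings) - mean(negative prompt embeddings)."""
    pos = mean([embed(p) for p in positive])
    neg = mean([embed(n) for n in negative])
    return {w: pos.get(w, 0.0) - neg.get(w, 0.0) for w in set(pos) | set(neg)}

def route(query, gates, top_k=2):
    """Score the query against every gate and return the top_k expert names."""
    q = embed(query)
    scores = {
        name: sum(weight * gate.get(word, 0.0) for word, weight in q.items())
        for name, gate in gates.items()
    }
    return sorted(scores, key=scores.get, reverse=True)[:top_k]

# Shortened excerpts of the prompt lists from the configuration above.
EXPERTS = {
    "CultriX/MonaTrix-v4": (
        ["Analyze historical", "Explain geopolitical"],
        ["Translate", "Solve math", "Coding"],
    ),
    "mlabonne/OmniTruthyBeagle-7B-v0": (
        ["Translate", "Discuss folklore"],
        ["Solve math", "Coding"],
    ),
    "CultriX/MoNeuTrix-7B-v1": (
        ["Reason math", "Invent device"],
        ["Translate", "Explain"],
    ),
    "paulml/OmniBeagleSquaredMBX-v3-7B": (
        ["Explain scientific", "Summarize findings"],
        ["Narrate", "Discuss"],
    ),
}
GATES = {name: gate_vector(pos, neg) for name, (pos, neg) in EXPERTS.items()}

print(route("Translate this folklore tale", GATES))
print(route("Reason about math", GATES))
```

With these shortened lists, the translation query ranks the OmniTruthyBeagle expert first, while the experts that list "Translate" as a negative prompt score below zero for it.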
💻 Usage
# Install dependencies (notebook syntax; drop the leading "!" in a regular shell)
!pip install -qU transformers bitsandbytes accelerate

from transformers import AutoTokenizer
import transformers
import torch

model = "CultriX/NeuralMona_MoE-4x7B"
tokenizer = AutoTokenizer.from_pretrained(model)

# Build a text-generation pipeline with 4-bit quantized weights (via bitsandbytes)
pipeline = transformers.pipeline(
    "text-generation",
    model=model,
    model_kwargs={"torch_dtype": torch.float16, "load_in_4bit": True},
)

# Format the conversation with the model's chat template, then sample a reply
messages = [{"role": "user", "content": "Explain what a Mixture of Experts is in less than 100 words."}]
prompt = pipeline.tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
outputs = pipeline(prompt, max_new_tokens=256, do_sample=True, temperature=0.7, top_k=50, top_p=0.95)
print(outputs[0]["generated_text"])
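To gauge whether the 4-bit setting above fits on a given GPU, a back-of-the-envelope parameter count can help. The sketch below assumes each expert has the usual Mistral-7B shape (hidden size 4096, MLP size 14336, 32 layers, 32 attention heads with 8 key/value heads, vocabulary of 32000) and that, as in Mixtral-style models, only the MLP blocks are replicated per expert while attention and embeddings are shared. None of these numbers come from this model card, so check them against the model's `config.json`. The helper `moe_param_count` is hypothetical, and it counts weights only, ignoring activations, the KV cache, and quantization overhead.

```python
def moe_param_count(num_experts=4, hidden=4096, mlp=14336, layers=32,
                    heads=32, kv_heads=8, vocab=32000):
    """Rough weight count for a Mixtral-style MoE (all defaults are assumptions)."""
    head_dim = hidden // heads
    # Attention: q and o projections are hidden x hidden; k and v use fewer heads.
    attn = 2 * hidden * hidden + 2 * hidden * kv_heads * head_dim
    # Each expert MLP has gate, up, and down projections.
    expert_mlp = 3 * hidden * mlp
    # Router: one score per expert for every hidden dimension.
    router = hidden * num_experts
    # Two normalisation weight vectors per layer.
    norms = 2 * hidden
    per_layer = attn + num_experts * expert_mlp + router + norms
    # Untied input embeddings and output head, plus the final norm.
    return layers * per_layer + 2 * vocab * hidden + hidden

params = moe_param_count()
print(f"parameters: ~{params / 1e9:.1f}B")
for name, bytes_per_param in [("float16", 2.0), ("4-bit", 0.5)]:
    print(f"{name}: ~{params * bytes_per_param / 1e9:.1f} GB of weights")
```

Under these assumptions the count comes to about 24 billion parameters, or roughly 12 GB of weights at 4 bits per parameter, so the quantized pipeline needs noticeably more memory than a single 7B model would.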