Instructions to use mlx-community/Meta-Llama-3-70B-Instruct-4bit with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- MLX
How to use mlx-community/Meta-Llama-3-70B-Instruct-4bit with MLX:
    # Make sure mlx-lm is installed
    # pip install --upgrade mlx-lm

    # Generate text with mlx-lm
    from mlx_lm import load, generate

    model, tokenizer = load("mlx-community/Meta-Llama-3-70B-Instruct-4bit")

    prompt = "Write a story about Einstein"
    messages = [{"role": "user", "content": prompt}]
    prompt = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True
    )

    text = generate(model, tokenizer, prompt=prompt, verbose=True)
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- LM Studio
- MLX LM
How to use mlx-community/Meta-Llama-3-70B-Instruct-4bit with MLX LM:
Generate or start a chat session
    # Install MLX LM
    uv tool install mlx-lm

    # Interactive chat REPL
    mlx_lm.chat --model "mlx-community/Meta-Llama-3-70B-Instruct-4bit"
Run an OpenAI-compatible server
    # Install MLX LM
    uv tool install mlx-lm

    # Start the server (port set explicitly so it matches the curl call below)
    mlx_lm.server --model "mlx-community/Meta-Llama-3-70B-Instruct-4bit" --port 8000

    # Call the OpenAI-compatible server with curl
    curl -X POST "http://localhost:8000/v1/chat/completions" \
      -H "Content-Type: application/json" \
      --data '{
        "model": "mlx-community/Meta-Llama-3-70B-Instruct-4bit",
        "messages": [
          {"role": "user", "content": "Hello"}
        ]
      }'
Any difference from Meta-Llama-3-70B-Instruct-4bit-mlx?
#2
by yamikumods - opened
I don't understand the difference between this model and the one with the -mlx suffix.
With the other one I got broken outputs like the one below, so will this model solve the problem?
I can provide information on various topics. If you have a specific question or topic in mind, feel free to ask, and I'll do my best to assist you.<|reserved_special_token_5|><|start_header_id|>assistant<|end_header_id|>
I can provide information on various topics. If you have a specific question or topic in mind, feel free to ask, and I'll do my best to assist you.<|reserved_special_token_5|><|start_header_id|>assistant<|end_header_id|>
I'm happy to help! Please go ahead and ask your question or share your topic. I'll do
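As a stopgap while the cause is unclear, the leaked markers in the output above could be cut off after generation. This is only a sketch: `truncate_at_leaked_marker` and `LEAKED_MARKERS` are hypothetical names, the marker list is copied from the quoted output, and it does not address why generation fails to stop at the end of the assistant turn.

```python
# Marker strings copied from the broken output quoted above (an assumption:
# other leaked tokens may appear in other runs).
LEAKED_MARKERS = ("<|reserved_special_token_5|>", "<|start_header_id|>")


def truncate_at_leaked_marker(text: str, markers=LEAKED_MARKERS) -> str:
    """Return `text` up to the first leaked marker, or unchanged if none occurs."""
    positions = [i for i in (text.find(m) for m in markers) if i != -1]
    if not positions:
        return text
    # Keep everything before the earliest marker and drop trailing whitespace.
    return text[: min(positions)].rstrip()
```

Applied to the string returned by `generate`, this keeps only the first assistant reply. The model still spends time generating the discarded text.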
Please provide a reproducible code example.
Make sure you re-download the new model weights, because Hugging Face usually caches them locally.
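To check whether an old copy is still cached, a minimal sketch along these lines may help. It assumes the usual hub cache layout (`models--<org>--<name>` under `~/.cache/huggingface/hub`, overridable through `HF_HUB_CACHE` or `HF_HOME`). The helper names are made up for illustration and the path resolution is simplified compared with what `huggingface_hub` actually does.

```python
import os
from pathlib import Path


def hub_cache_root() -> Path:
    """Simplified guess at the hub cache directory (assumption, see lead-in)."""
    if "HF_HUB_CACHE" in os.environ:
        return Path(os.environ["HF_HUB_CACHE"])
    hf_home = os.environ.get("HF_HOME", str(Path.home() / ".cache" / "huggingface"))
    return Path(hf_home) / "hub"


def repo_folder_name(repo_id: str) -> str:
    """Map 'org/name' to the folder name used for cached model repos."""
    return "models--" + repo_id.replace("/", "--")


def cached_repo_dir(repo_id: str) -> Path:
    """Hypothetical helper: where the cached copy of `repo_id` would live."""
    return hub_cache_root() / repo_folder_name(repo_id)


if __name__ == "__main__":
    path = cached_repo_dir("mlx-community/Meta-Llama-3-70B-Instruct-4bit")
    print(path, "(cached)" if path.exists() else "(not cached)")
```

If the folder exists, deleting it should make the next `load(...)` fetch the weights again. Calling `huggingface_hub.snapshot_download(repo_id, force_download=True)` is another way to refresh the files.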