License compatibility
Hi, I'd like to report a license conflict in mervinpraison/Llama-3.1-8B-bnb-4bit-python. This model appears to be quantized from unsloth/Meta-Llama-3.1-8B-bnb-4bit, yet it is published under the Apache-2.0 license. Given the terms of the Llama 3.1 Community License, especially those on redistribution, attribution, and naming, this combination of licenses could lead to legal or usage misunderstandings.
Key incompatibilities with the Llama 3.1 Community License:
Clause 1.b.i – Redistribution and Use:
• No license file included (should contain the LLaMA 3.1 Community License)
• "Built with Llama" is not clearly indicated
• Model name does not begin with “Llama 3”, which is required for any derivative
Clause 1.b.iii – Required Notice:
• Missing the following required text in a "NOTICE" file:
“Llama 3.1 is licensed under the Llama 3.1 Community License, Copyright © Meta Platforms, Inc. All Rights Reserved.”
Clause 1.b.iv – Acceptable Use Policy:
• Meta’s Acceptable Use Policy is not mentioned or passed along to users
Clause 2 – Sublicensing and Relicensing:
• The Llama 3.1 license does not allow sublicensing under a more permissive license such as Apache-2.0
• The Apache-2.0 license permits nearly unrestricted commercial use, which contradicts Meta’s limits and conditions (e.g. the commercial MAU threshold)
On the flip side, Apache-2.0 lets you:
• Use the work commercially without asking for extra permission
• Sublicense and redistribute it under more flexible terms
• Skip passing along any non-permissive terms or use restrictions from upstream
This creates a conflict: the Llama 3.1 license specifically says you can’t sublicense the model under more flexible terms and requires downstream users to follow certain use restrictions, which Apache-2.0 doesn’t enforce.
So I'm thinking there might be a licensing conflict here that needs to be sorted out.
🔹 Suggestion:
To resolve the mismatch, here are a few steps that might help bring things into alignment:
1. To align with the Llama 3.1 terms, you might adjust the licensing setup a bit, for example:
• Include a copy of the Llama 3.1 Community License in the repo or model card
• Include this notice in a “NOTICE” file or the docs:
> “Llama 3.1 is licensed under the Llama 3.1 Community License, Copyright © Meta Platforms, Inc. All Rights Reserved.”
• Add a “Built with Llama” note somewhere in the model card
• Add a quick note about usage restrictions, especially for folks using it in commercial settings
• Add a statement clarifying that use of the model must comply with Meta’s Acceptable Use Policy
2. Alternatively, drop the Apache-2.0 tag and go with the Llama 3.1 Community License. This may help reduce confusion about redistribution rights and downstream usage conditions.
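If it helps when preparing the pull request, here is a rough Python sketch of the checklist above, run against a local clone of the repo. The file names `LICENSE`, `NOTICE`, and `README.md` and the helper name `check_repo` are my own assumptions, not requirements taken from the license text. The checks are plain string matches, so treat a pass as a hint rather than a legal confirmation.

```python
from pathlib import Path

# Attribution text quoted in the suggestion above.
NOTICE_TEXT = (
    "Llama 3.1 is licensed under the Llama 3.1 Community License, "
    "Copyright © Meta Platforms, Inc. All Rights Reserved."
)


def check_repo(repo_dir: str) -> dict[str, bool]:
    """Return one pass/fail flag per checklist item (hypothetical helper)."""
    root = Path(repo_dir)

    def read(name: str) -> str:
        # A missing file is treated as empty text so every check simply fails.
        path = root / name
        return path.read_text(encoding="utf-8") if path.is_file() else ""

    # Assumed layout: the model card lives in README.md at the repo root.
    card = read("README.md").lower()
    return {
        # Only checks that the file exists, not that it holds the right license.
        "license_file_present": (root / "LICENSE").is_file(),
        "notice_text_present": NOTICE_TEXT in read("NOTICE"),
        "built_with_llama_in_card": "built with llama" in card,
        "acceptable_use_policy_mentioned": "acceptable use policy" in card,
    }
```

Calling `check_repo(".")` from the root of the clone returns a dictionary of booleans. Any `False` entry points at an item from the list that still looks missing.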
Hope this helps! 😊 Let me know if you have any questions or need more info.
Thanks for your attention!
Would love to hear your view on this!
Happy to change the licence to what it should be. Please create a pull request.