Instructions to use davidfred/Qwen2.5-0.5BHEBREW with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- llama-cpp-python
How to use davidfred/Qwen2.5-0.5BHEBREW with llama-cpp-python:
# !pip install llama-cpp-python from llama_cpp import Llama llm = Llama.from_pretrained( repo_id="davidfred/Qwen2.5-0.5BHEBREW", filename="Qwen2.5-0.5B-Instruct-Q8_0.gguf", )
llm.create_chat_completion( messages = "No input example has been defined for this model task." )
- Notebooks
- Google Colab
- Kaggle
- Local Apps Settings
- llama.cpp
How to use davidfred/Qwen2.5-0.5BHEBREW with llama.cpp:
Install from brew
brew install llama.cpp # Start a local OpenAI-compatible server with a web UI: llama-server -hf davidfred/Qwen2.5-0.5BHEBREW:Q8_0 # Run inference directly in the terminal: llama-cli -hf davidfred/Qwen2.5-0.5BHEBREW:Q8_0
Install from WinGet (Windows)
winget install llama.cpp # Start a local OpenAI-compatible server with a web UI: llama-server -hf davidfred/Qwen2.5-0.5BHEBREW:Q8_0 # Run inference directly in the terminal: llama-cli -hf davidfred/Qwen2.5-0.5BHEBREW:Q8_0
Use pre-built binary
# Download pre-built binary from: # https://github.com/ggerganov/llama.cpp/releases # Start a local OpenAI-compatible server with a web UI: ./llama-server -hf davidfred/Qwen2.5-0.5BHEBREW:Q8_0 # Run inference directly in the terminal: ./llama-cli -hf davidfred/Qwen2.5-0.5BHEBREW:Q8_0
Build from source code
git clone https://github.com/ggerganov/llama.cpp.git cd llama.cpp cmake -B build cmake --build build -j --target llama-server llama-cli # Start a local OpenAI-compatible server with a web UI: ./build/bin/llama-server -hf davidfred/Qwen2.5-0.5BHEBREW:Q8_0 # Run inference directly in the terminal: ./build/bin/llama-cli -hf davidfred/Qwen2.5-0.5BHEBREW:Q8_0
Use Docker
docker model run hf.co/davidfred/Qwen2.5-0.5BHEBREW:Q8_0
- LM Studio
- Jan
- Ollama
How to use davidfred/Qwen2.5-0.5BHEBREW with Ollama:
ollama run hf.co/davidfred/Qwen2.5-0.5BHEBREW:Q8_0
- Unsloth Studio
How to use davidfred/Qwen2.5-0.5BHEBREW with Unsloth Studio:
Install Unsloth Studio (macOS, Linux, WSL)
curl -fsSL https://unsloth.ai/install.sh | sh # Run unsloth studio unsloth studio -H 0.0.0.0 -p 8888 # Then open http://localhost:8888 in your browser # Search for davidfred/Qwen2.5-0.5BHEBREW to start chatting
Install Unsloth Studio (Windows)
irm https://unsloth.ai/install.ps1 | iex # Run unsloth studio unsloth studio -H 0.0.0.0 -p 8888 # Then open http://localhost:8888 in your browser # Search for davidfred/Qwen2.5-0.5BHEBREW to start chatting
Using HuggingFace Spaces for Unsloth
# No setup required # Open https://huggingface.co/spaces/unsloth/studio in your browser # Search for davidfred/Qwen2.5-0.5BHEBREW to start chatting
- Pi
How to use davidfred/Qwen2.5-0.5BHEBREW with Pi:
Start the llama.cpp server
# Install llama.cpp: brew install llama.cpp # Start a local OpenAI-compatible server: llama-server -hf davidfred/Qwen2.5-0.5BHEBREW:Q8_0
Configure the model in Pi
# Install Pi: npm install -g @mariozechner/pi-coding-agent # Add to ~/.pi/agent/models.json: { "providers": { "llama-cpp": { "baseUrl": "http://localhost:8080/v1", "api": "openai-completions", "apiKey": "none", "models": [ { "id": "davidfred/Qwen2.5-0.5BHEBREW:Q8_0" } ] } } }Run Pi
# Start Pi in your project directory: pi
- Hermes Agent new
How to use davidfred/Qwen2.5-0.5BHEBREW with Hermes Agent:
Start the llama.cpp server
# Install llama.cpp: brew install llama.cpp # Start a local OpenAI-compatible server: llama-server -hf davidfred/Qwen2.5-0.5BHEBREW:Q8_0
Configure Hermes
# Install Hermes: curl -fsSL https://hermes-agent.nousresearch.com/install.sh | bash hermes setup # Point Hermes at the local server: hermes config set model.provider custom hermes config set model.base_url http://127.0.0.1:8080/v1 hermes config set model.default davidfred/Qwen2.5-0.5BHEBREW:Q8_0
Run Hermes
hermes
- Docker Model Runner
How to use davidfred/Qwen2.5-0.5BHEBREW with Docker Model Runner:
docker model run hf.co/davidfred/Qwen2.5-0.5BHEBREW:Q8_0
- Lemonade
How to use davidfred/Qwen2.5-0.5BHEBREW with Lemonade:
Pull the model
# Download Lemonade from https://lemonade-server.ai/ lemonade pull davidfred/Qwen2.5-0.5BHEBREW:Q8_0
Run and chat with the model
lemonade run user.Qwen2.5-0.5BHEBREW-Q8_0
List all available models
lemonade list
YAML Metadata Warning:empty or missing yaml metadata in repo card
Check out the documentation for more information.
Model Card: Multilingual Qwen2.5-0.5B-Instruct-Q8_0 Model Details Name: Qwen2.5-0.5B-Instruct-Q8_0-Multilingual Base Model: Qwen2.5-0.5B-Instruct Model Type: Instruction-tuned Language Model Size: 500MB (Quantized) Supported Languages: English, Hebrew, French Format: GGUF (Compatible with llama.cpp) Model Description This is a quantized and fine-tuned version of the Qwen2.5-0.5B-Instruct model, specifically optimized for multilingual capabilities in English, Hebrew, and French. The model represents a significant advancement in compact, efficient language models while maintaining strong performance across multiple languages. Intended Use Multilingual text generation and understanding Cross-lingual question answering Translation assistance between supported languages General instruction following in three languages How to Download and Use Download the Model: bash
huggingface-cli download / qwen2.5-0.5b-instruct-q8_0.gguf --local-dir . [[3]] Basic Usage with llama.cpp: bash
./main -m qwen2.5-0.5b-instruct-q8_0.gguf -n 512 --temp 0.7 Training Details Base Model: Qwen2.5-0.5B-Instruct Fine-tuning Data: Multilingual dataset comprising: English text corpus Hebrew text corpus French text corpus Quantization: Q8_0 quantization for optimal balance between model size and performance Performance and Limitations Strengths: Efficient 500MB size making it suitable for local deployment Balanced performance across English, Hebrew, and French Optimized for instruction-following tasksLimitations**: May show reduced performance compared to larger models Limited context window Performance may vary across languages May struggle with complex technical content Ethical Considerations The model should be used in compliance with local regulations and ethical guidelines Users should be aware of potential biases in multilingual outputs Verify critical outputs, especially for sensitive applications Example Usage python
Example code for model inference
from transformers import AutoModelForCausalLM, AutoTokenizer
Load the model
model = AutoModelForCausalLM.from_pretrained("path_to_model") tokenizer = AutoTokenizer.from_pretrained("path_to_model")
Multilingual example
prompts = { "English": "Translate 'Hello' to French:", "Hebrew": "转专讙诐 '砖诇讜诐' 诇爪专驻转讬转:", "French": "Traduisez 'Bonjour' en h茅breu:" } Citation and License Based on Qwen2.5 developed by the Qwen team at Alibaba Cloud Please refer to the original Qwen2.5 license for usage terms and conditions
- Downloads last month
- 3
8-bit