Instructions to use davidfred/Qwen2.5-0.5BHEBREW with libraries, inference providers, notebooks, and local apps. Follow these links to get started.

Libraries

How to use davidfred/Qwen2.5-0.5BHEBREW with llama-cpp-python:

# !pip install llama-cpp-python

from llama_cpp import Llama

llm = Llama.from_pretrained(
	repo_id="davidfred/Qwen2.5-0.5BHEBREW",
	filename="Qwen2.5-0.5B-Instruct-Q8_0.gguf",
)

llm.create_chat_completion(
	messages = "No input example has been defined for this model task."
)

Notebooks
Google Colab
Kaggle
Local Apps Settings

llama.cpp

How to use davidfred/Qwen2.5-0.5BHEBREW with llama.cpp:

Install from brew

brew install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf davidfred/Qwen2.5-0.5BHEBREW:Q8_0
# Run inference directly in the terminal:
llama-cli -hf davidfred/Qwen2.5-0.5BHEBREW:Q8_0

Install from WinGet (Windows)

winget install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf davidfred/Qwen2.5-0.5BHEBREW:Q8_0
# Run inference directly in the terminal:
llama-cli -hf davidfred/Qwen2.5-0.5BHEBREW:Q8_0

Use pre-built binary

# Download pre-built binary from:
# https://github.com/ggerganov/llama.cpp/releases
# Start a local OpenAI-compatible server with a web UI:
./llama-server -hf davidfred/Qwen2.5-0.5BHEBREW:Q8_0
# Run inference directly in the terminal:
./llama-cli -hf davidfred/Qwen2.5-0.5BHEBREW:Q8_0

Build from source code

git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
cmake -B build
cmake --build build -j --target llama-server llama-cli
# Start a local OpenAI-compatible server with a web UI:
./build/bin/llama-server -hf davidfred/Qwen2.5-0.5BHEBREW:Q8_0
# Run inference directly in the terminal:
./build/bin/llama-cli -hf davidfred/Qwen2.5-0.5BHEBREW:Q8_0

Use Docker

docker model run hf.co/davidfred/Qwen2.5-0.5BHEBREW:Q8_0

LM Studio
Jan
Ollama
How to use davidfred/Qwen2.5-0.5BHEBREW with Ollama:
```
ollama run hf.co/davidfred/Qwen2.5-0.5BHEBREW:Q8_0
```

Unsloth Studio

How to use davidfred/Qwen2.5-0.5BHEBREW with Unsloth Studio:

Install Unsloth Studio (macOS, Linux, WSL)

curl -fsSL https://unsloth.ai/install.sh | sh
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for davidfred/Qwen2.5-0.5BHEBREW to start chatting

Install Unsloth Studio (Windows)

irm https://unsloth.ai/install.ps1 | iex
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for davidfred/Qwen2.5-0.5BHEBREW to start chatting

Using HuggingFace Spaces for Unsloth

# No setup required
# Open https://huggingface.co/spaces/unsloth/studio in your browser
# Search for davidfred/Qwen2.5-0.5BHEBREW to start chatting

How to use davidfred/Qwen2.5-0.5BHEBREW with Pi:

Start the llama.cpp server

# Install llama.cpp:
brew install llama.cpp
# Start a local OpenAI-compatible server:
llama-server -hf davidfred/Qwen2.5-0.5BHEBREW:Q8_0

Configure the model in Pi

# Install Pi:
npm install -g @mariozechner/pi-coding-agent
# Add to ~/.pi/agent/models.json:
{
  "providers": {
    "llama-cpp": {
      "baseUrl": "http://localhost:8080/v1",
      "api": "openai-completions",
      "apiKey": "none",
      "models": [
        {
          "id": "davidfred/Qwen2.5-0.5BHEBREW:Q8_0"
        }
      ]
    }
  }
}

Run Pi

# Start Pi in your project directory:
pi

Hermes Agent new

How to use davidfred/Qwen2.5-0.5BHEBREW with Hermes Agent:

Start the llama.cpp server

# Install llama.cpp:
brew install llama.cpp
# Start a local OpenAI-compatible server:
llama-server -hf davidfred/Qwen2.5-0.5BHEBREW:Q8_0

Configure Hermes

# Install Hermes:
curl -fsSL https://hermes-agent.nousresearch.com/install.sh | bash
hermes setup
# Point Hermes at the local server:
hermes config set model.provider custom
hermes config set model.base_url http://127.0.0.1:8080/v1
hermes config set model.default davidfred/Qwen2.5-0.5BHEBREW:Q8_0

Run Hermes

hermes

Docker Model Runner
How to use davidfred/Qwen2.5-0.5BHEBREW with Docker Model Runner:
```
docker model run hf.co/davidfred/Qwen2.5-0.5BHEBREW:Q8_0
```

Lemonade

How to use davidfred/Qwen2.5-0.5BHEBREW with Lemonade:

Pull the model

# Download Lemonade from https://lemonade-server.ai/
lemonade pull davidfred/Qwen2.5-0.5BHEBREW:Q8_0

Run and chat with the model

lemonade run user.Qwen2.5-0.5BHEBREW-Q8_0

List all available models

lemonade list

YAML Metadata Warning:empty or missing yaml metadata in repo card

Check out the documentation for more information.

Model Card: Multilingual Qwen2.5-0.5B-Instruct-Q8_0 Model Details Name: Qwen2.5-0.5B-Instruct-Q8_0-Multilingual Base Model: Qwen2.5-0.5B-Instruct Model Type: Instruction-tuned Language Model Size: 500MB (Quantized) Supported Languages: English, Hebrew, French Format: GGUF (Compatible with llama.cpp) Model Description This is a quantized and fine-tuned version of the Qwen2.5-0.5B-Instruct model, specifically optimized for multilingual capabilities in English, Hebrew, and French. The model represents a significant advancement in compact, efficient language models while maintaining strong performance across multiple languages. Intended Use Multilingual text generation and understanding Cross-lingual question answering Translation assistance between supported languages General instruction following in three languages How to Download and Use Download the Model: bash

huggingface-cli download / qwen2.5-0.5b-instruct-q8_0.gguf --local-dir . [[3]] Basic Usage with llama.cpp: bash

./main -m qwen2.5-0.5b-instruct-q8_0.gguf -n 512 --temp 0.7 Training Details Base Model: Qwen2.5-0.5B-Instruct Fine-tuning Data: Multilingual dataset comprising: English text corpus Hebrew text corpus French text corpus Quantization: Q8_0 quantization for optimal balance between model size and performance Performance and Limitations Strengths: Efficient 500MB size making it suitable for local deployment Balanced performance across English, Hebrew, and French Optimized for instruction-following tasksLimitations**: May show reduced performance compared to larger models Limited context window Performance may vary across languages May struggle with complex technical content Ethical Considerations The model should be used in compliance with local regulations and ethical guidelines Users should be aware of potential biases in multilingual outputs Verify critical outputs, especially for sensitive applications Example Usage python

Example code for model inference

from transformers import AutoModelForCausalLM, AutoTokenizer

Load the model

model = AutoModelForCausalLM.from_pretrained("path_to_model") tokenizer = AutoTokenizer.from_pretrained("path_to_model")

Multilingual example

prompts = { "English": "Translate 'Hello' to French:", "Hebrew": "תרגם 'שלום' לצרפתית:", "French": "Traduisez 'Bonjour' en hébreu:" } Citation and License Based on Qwen2.5 developed by the Qwen team at Alibaba Cloud Please refer to the original Qwen2.5 license for usage terms and conditions

Downloads last month: 3

GGUF

Model size

0.5B params

Architecture

qwen2

Hardware compatibility

8-bit

Inference Providers NEW

This model isn't deployed by any Inference Provider. 🙋 Ask for provider support