Instructions for using VortexHunter23/LeoPARD-0.27-4bit with libraries, serving frameworks, and local apps.

Libraries

How to use VortexHunter23/LeoPARD-0.27-4bit with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="VortexHunter23/LeoPARD-0.27-4bit")

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("VortexHunter23/LeoPARD-0.27-4bit")
model = AutoModelForCausalLM.from_pretrained("VortexHunter23/LeoPARD-0.27-4bit")

Local Apps

vLLM

How to use VortexHunter23/LeoPARD-0.27-4bit with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "VortexHunter23/LeoPARD-0.27-4bit"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "VortexHunter23/LeoPARD-0.27-4bit",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'
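
The same request can be sent from Python. The following is a minimal sketch using only the standard library, assuming the server started above is listening on localhost:8000 and returns the usual OpenAI-style completions response (a `choices` list whose items carry a `text` field). The helper names `build_completion_request` and `complete` are illustrative and not part of vLLM.

```python
import json
import urllib.request

MODEL_ID = "VortexHunter23/LeoPARD-0.27-4bit"


def build_completion_request(prompt, max_tokens=512, temperature=0.5):
    """Build the same JSON body that the curl example sends."""
    return {
        "model": MODEL_ID,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }


def complete(prompt, base_url="http://localhost:8000", timeout=120):
    """POST the request to /v1/completions and return the generated text."""
    body = json.dumps(build_completion_request(prompt)).encode("utf-8")
    request = urllib.request.Request(
        f"{base_url}/v1/completions",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        payload = json.load(response)
    # Assumption: OpenAI-style response shape, {"choices": [{"text": ...}]}
    return payload["choices"][0]["text"]


# Example call (requires a running server):
# print(complete("Once upon a time,"))
```

Passing `base_url="http://localhost:30000"` should work the same way against the SGLang server described further down, since it serves the same endpoint path.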

Use Docker

docker model run hf.co/VortexHunter23/LeoPARD-0.27-4bit

SGLang

How to use VortexHunter23/LeoPARD-0.27-4bit with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "VortexHunter23/LeoPARD-0.27-4bit" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "VortexHunter23/LeoPARD-0.27-4bit",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "VortexHunter23/LeoPARD-0.27-4bit" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "VortexHunter23/LeoPARD-0.27-4bit",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Unsloth Studio

How to use VortexHunter23/LeoPARD-0.27-4bit with Unsloth Studio:

Install Unsloth Studio (macOS, Linux, WSL)

curl -fsSL https://unsloth.ai/install.sh | sh
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for VortexHunter23/LeoPARD-0.27-4bit to start chatting

Install Unsloth Studio (Windows)

irm https://unsloth.ai/install.ps1 | iex
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for VortexHunter23/LeoPARD-0.27-4bit to start chatting

Using HuggingFace Spaces for Unsloth

# No setup required
# Open https://huggingface.co/spaces/unsloth/studio in your browser
# Search for VortexHunter23/LeoPARD-0.27-4bit to start chatting

Load model with FastModel

# Install Unsloth first (shell command): pip install unsloth
from unsloth import FastModel
model, tokenizer = FastModel.from_pretrained(
    model_name="VortexHunter23/LeoPARD-0.27-4bit",
    max_seq_length=2048,
)

Docker Model Runner
How to use VortexHunter23/LeoPARD-0.27-4bit with Docker Model Runner:
```
docker model run hf.co/VortexHunter23/LeoPARD-0.27-4bit
```

Model Card for LeoPARD 0.27

LeoPARD 0.27 is a fine-tuned version of LLaMA 3.1 8B, developed by AxisSmart | Labs. It adds reasoning and chain-of-thought (CoT) capabilities, currently in beta, making it suited to tasks that require logical reasoning and step-by-step problem-solving.

Model Details

Model Description

This model is a fine-tuned version of LLaMA 3.1 8B, optimized for improved reasoning and chain-of-thought capabilities. It is designed to handle complex tasks that require logical thinking, structured reasoning, and multi-step problem-solving.

Developed by: AxisSmart | Labs
Model type: Fine-tuned language model
Language(s) (NLP): Primarily English (multilingual capabilities may vary)
License: Creative Commons Attribution-NonCommercial 2.0 (CC BY-NC 2.0)
Finetuned from model: LLaMA 3.1 8B

License Details

The CC BY-NC license allows users to:

Share: Copy and redistribute the model in any medium or format.
Adapt: Remix, transform, and build upon the model for non-commercial purposes.

Under the following terms:

Attribution: Users must give appropriate credit to AxisSmart | Labs, provide a link to the license, and indicate if changes were made.
NonCommercial: The model cannot be used for commercial purposes.

For commercial use, explicit permission from AxisSmart | Labs is required.

Uses

Direct Use

LeoPARD 0.27 can be used directly for tasks requiring reasoning and chain-of-thought capabilities, such as:

Logical problem-solving
Step-by-step reasoning tasks
Educational applications (e.g., math, science)
Decision support systems

Downstream Use

The model can be fine-tuned further for specific applications, such as:

Custom reasoning pipelines
Domain-specific problem-solving (e.g., finance, healthcare)
Integration into larger AI systems

Out-of-Scope Use

The model is not intended for:

Tasks requiring real-time, low-latency responses without proper optimization
Applications involving highly sensitive or unethical use cases
Tasks outside the scope of its reasoning and language capabilities

Bias, Risks, and Limitations

Bias: The model may inherit biases present in the training data or the base LLaMA model.
Risks: Potential for incorrect or misleading reasoning outputs if not properly validated.
Limitations: The chain-of-thought capability is still in beta and may produce incomplete or suboptimal reasoning paths.

Recommendations

Users should validate the model's outputs, especially for critical applications. Fine-tuning on domain-specific data may improve performance and reduce biases.
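
One inexpensive way to act on this recommendation is to sample several answers to the same question and check how often they agree. The sketch below assumes the final answers have already been extracted from the generations as strings. `majority_answer` is a hypothetical helper, and the agreement threshold is an arbitrary illustrative value, not something this model card prescribes.

```python
from collections import Counter


def majority_answer(answers, min_agreement=0.6):
    """Return (answer, agreement, accepted) for a list of sampled final answers."""
    if not answers:
        raise ValueError("answers must not be empty")
    # Normalize lightly so "42" and " 42 " count as the same answer
    normalized = [a.strip().lower() for a in answers]
    answer, count = Counter(normalized).most_common(1)[0]
    agreement = count / len(normalized)
    return answer, agreement, agreement >= min_agreement


# Hypothetical final answers extracted from five sampled generations
samples = ["42", "42", "41", " 42 ", "42"]
answer, agreement, accepted = majority_answer(samples)
print(answer, agreement, accepted)  # 42 0.8 True
```

Low agreement does not prove an answer wrong, and high agreement does not prove it right; the check only flags outputs that deserve a closer look.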

How to Get Started with the Model

Use the code below to load and use LeoPARD 0.27:

from transformers import AutoModelForCausalLM, AutoTokenizer

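model_name = "VortexHunter23/LeoPARD-0.27-4bit"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

input_text = "Explain the reasoning behind the solution to this problem: ..."
# Move the inputs to the same device as the model
inputs = tokenizer(input_text, return_tensors="pt").to(model.device)
# Set the generation length explicitly (256 is an illustrative value);
# the default limit is short and can cut off step-by-step answers
outputs = model.generate(**inputs, max_new_tokens=256)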
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

Training Details

Training Data

The model was fine-tuned on a curated dataset designed to enhance reasoning and chain-of-thought capabilities. The dataset includes:

Logical reasoning problems
Step-by-step solutions
General-purpose language data

Training Procedure

Training time: 6 hours
Training regime: Mixed precision (bf16)
Hardware: [Confidential]

Training Hyperparameters

Learning rate: 2e-4
Batch size: 2
Epochs: 4
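
To make these values concrete, the sketch below collects them in a small config object and derives the number of optimizer steps. It assumes the batch size of 2 is the effective batch size with no gradient accumulation, which the card does not state, and it uses a hypothetical dataset size because the card does not give one. `FineTuneConfig` is an illustrative name, not part of any library.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class FineTuneConfig:
    learning_rate: float = 2e-4  # from the model card
    batch_size: int = 2          # from the model card; assumed to be the effective batch size
    epochs: int = 4              # from the model card
    bf16: bool = True            # mixed precision (bf16), from the model card

    def total_steps(self, num_examples: int) -> int:
        """Optimizer steps for num_examples, keeping the last partial batch."""
        steps_per_epoch = math.ceil(num_examples / self.batch_size)
        return steps_per_epoch * self.epochs


config = FineTuneConfig()
# Hypothetical dataset size, for illustration only
print(config.total_steps(10_001))  # 5001 steps per epoch * 4 epochs = 20004
```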

Evaluation

Testing has not yet been conducted. Evaluation metrics and results will be added in future updates.

Model Card Authors

AxisSmart | Labs
VortexHunter (Alvin)

Model Card Contact

Contact details coming soon.

Downloads last month: 3

Safetensors
Model size: 8B params
Tensor type: F16, F32

Model tree for VortexHunter23/LeoPARD-0.27-4bit
Base model: meta-llama/Llama-3.1-8B
Quantized: unsloth/Meta-Llama-3.1-8B-bnb-4bit
Quantized (237): this model
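
As a rough guide to what these numbers mean for memory, the sketch below estimates the space taken by the weights alone at different bit widths. It treats the parameter count as exactly 8 billion (the rounded figure listed above) and ignores quantization metadata, layers kept at higher precision, activations, and the KV cache, so real usage will be higher. Which row applies depends on how the checkpoint is actually stored and loaded, which this page does not make fully clear.

```python
def weight_memory_gb(num_params: int, bits_per_param: int) -> float:
    """Approximate weight storage in gigabytes (10**9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


NUM_PARAMS = 8_000_000_000  # rounded figure from the model page

for label, bits in [("F32", 32), ("F16", 16), ("4-bit", 4)]:
    print(f"{label}: {weight_memory_gb(NUM_PARAMS, bits):.1f} GB")
```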