Model Card for LeoPARD 0.27
LeoPARD 0.27 is a fine-tuned version of LLaMA 3.1 8B, developed by AxisSmart | Labs. It adds reasoning and chain-of-thought (CoT) capabilities, currently in beta, which makes it suited to tasks that require logical reasoning and step-by-step problem-solving.
Model Details
Model Description
This model is a fine-tuned version of LLaMA 3.1 8B, optimized for improved reasoning and chain-of-thought capabilities. It is designed to handle complex tasks that require logical thinking, structured reasoning, and multi-step problem-solving.
- Developed by: AxisSmart | Labs
- Model type: Fine-tuned language model
- Language(s) (NLP): Primarily English (multilingual capabilities may vary)
- License: Creative Commons Attribution-NonCommercial 4.0 (CC BY-NC 4.0)
- Finetuned from model: LLaMA 3.1 8B
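As a rough guide to hardware requirements, the sketch below estimates weight storage only. It assumes that "8B" means about 8 billion parameters and that the `-4bit` suffix in the repository name (VortexHunter23/LeoPARD-0.27-4bit) means weights are stored at roughly 4 bits per parameter. Neither figure is stated exactly in this card, and the estimate ignores quantization metadata, any layers kept at higher precision, the KV cache, and activations.

```python
def weight_memory_gb(num_params, bits_per_param):
    """Approximate weight storage in decimal gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


# Assumption: "8B" taken at face value; the exact parameter count differs slightly.
NOMINAL_PARAMS = 8e9

for label, bits in [("bf16", 16), ("4-bit", 4)]:
    print(f"{label}: ~{weight_memory_gb(NOMINAL_PARAMS, bits):.1f} GB")
```

Under these assumptions the weights need about 16 GB in bf16 and about 4 GB at 4 bits; actual memory use at inference time will be higher.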
License Details
The CC BY-NC 4.0 license allows users to:
- Share: Copy and redistribute the model in any medium or format.
- Adapt: Remix, transform, and build upon the model for non-commercial purposes.
Under the following terms:
- Attribution: Users must give appropriate credit to AxisSmart | Labs, provide a link to the license, and indicate if changes were made.
- NonCommercial: The model cannot be used for commercial purposes.
For commercial use, explicit permission from AxisSmart | Labs is required.
Uses
Direct Use
LeoPARD 0.27 can be used directly for tasks requiring reasoning and chain-of-thought capabilities, such as:
- Logical problem-solving
- Step-by-step reasoning tasks
- Educational applications (e.g., math, science)
- Decision support systems
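For tasks like those listed above, a prompt that asks for explicit steps can be built with a small helper. This is a minimal sketch: the function name `build_cot_prompt` and the wording of the instruction are illustrative assumptions, since this card does not specify a prompt format for LeoPARD 0.27.

```python
def build_cot_prompt(problem):
    """Wrap a problem statement in a step-by-step instruction.

    The instruction text and the 'Answer:' convention are assumptions
    made for illustration, not a format documented for this model.
    """
    problem = problem.strip()
    if not problem:
        raise ValueError("problem must not be empty")
    return (
        "Solve the following problem. Reason step by step, "
        "then give the result on a final line starting with 'Answer:'.\n\n"
        f"Problem: {problem}\n"
    )


print(build_cot_prompt("A train travels 120 km in 1.5 hours. What is its average speed?"))
```

The returned string can be passed to the tokenizer as the input text. Asking for a fixed final line makes the result easier to extract and check afterwards.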
Downstream Use
The model can be fine-tuned further for specific applications, such as:
- Custom reasoning pipelines
- Domain-specific problem-solving (e.g., finance, healthcare)
- Integration into larger AI systems
Out-of-Scope Use
- Tasks requiring real-time, low-latency responses without proper optimization
- Applications involving highly sensitive or unethical use cases
- Tasks outside the scope of its reasoning and language capabilities
Bias, Risks, and Limitations
- Bias: The model may inherit biases present in the training data or the base LLaMA model.
- Risks: Potential for incorrect or misleading reasoning outputs if not properly validated.
- Limitations: The chain-of-thought capability is still in beta and may produce incomplete or suboptimal reasoning paths.
Recommendations
Users should validate the model's outputs, especially for critical applications. Fine-tuning on domain-specific data may improve performance and reduce biases.
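One simple way to act on this recommendation is to sample several completions for the same question and accept an answer only when most of them agree. The sketch below assumes each completion contains a numeric result after the word "Answer" (followed by `:` or `=`); that output convention, the helper names, and the 0.6 agreement threshold are all illustrative assumptions, not behavior documented for LeoPARD 0.27.

```python
import re
from collections import Counter

# Assumed convention: the result appears as "Answer: <number>" or "answer = <number>".
ANSWER_RE = re.compile(r"answer\s*[:=]\s*(-?\d+(?:\.\d+)?)", re.IGNORECASE)


def extract_answer(text):
    """Return the last numeric value that follows 'Answer' in text, or None."""
    matches = ANSWER_RE.findall(text)
    return float(matches[-1]) if matches else None


def majority_answer(outputs, min_agreement=0.6):
    """Return the most common extracted answer if enough samples agree, else None.

    Agreement is measured against all samples, so completions with no
    parsable answer count against the majority.
    """
    answers = [a for a in map(extract_answer, outputs) if a is not None]
    if not answers:
        return None
    value, count = Counter(answers).most_common(1)[0]
    return value if count / len(outputs) >= min_agreement else None


samples = [
    "Step 1: 6 * 7 = 42. Answer: 42",
    "Multiplying gives 42, so answer = 42.0",
    "Step 1: 6 * 7 = 41. Answer: 41",
]
print(majority_answer(samples))
```

Agreement between samples does not prove correctness. For critical applications, the accepted answer should still be checked against an independent source.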
How to Get Started with the Model
Use the code below to load and use LeoPARD 0.27:
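from transformers import AutoModelForCausalLM, AutoTokenizer

# Hugging Face repository ID for this model
model_name = "VortexHunter23/LeoPARD-0.27-4bit"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

input_text = "Explain the reasoning behind the solution to this problem: ..."
# Move the tokenized inputs to the same device as the model
inputs = tokenizer(input_text, return_tensors="pt").to(model.device)
# Cap the generated length explicitly; adjust max_new_tokens as needed
outputs = model.generate(**inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))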
Training Details
Training Data
The model was fine-tuned on a curated dataset designed to enhance reasoning and chain-of-thought capabilities. The dataset includes:
- Logical reasoning problems
- Step-by-step solutions
- General-purpose language data
Training Procedure
- Training time: 6 hours
- Training regime: Mixed precision (bf16)
- Hardware: [Confidential]
Training Hyperparameters
- Learning rate: 2e-4
- Batch size: 2
- Epochs: 4
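The batch size and epoch count above determine how many optimizer updates a run performs once the dataset size is known. The card does not state the dataset size, whether gradient accumulation was used, or whether the batch size is per device, so `num_examples` and `grad_accum_steps` in this sketch are hypothetical inputs.

```python
import math


def optimizer_steps(num_examples, batch_size=2, epochs=4, grad_accum_steps=1):
    """Count optimizer updates for a run, keeping the last partial batch of each epoch."""
    batches_per_epoch = math.ceil(num_examples / batch_size)
    steps_per_epoch = math.ceil(batches_per_epoch / grad_accum_steps)
    return steps_per_epoch * epochs


# Hypothetical dataset of 1,000 examples with the batch size and epochs listed above.
print(optimizer_steps(1000))
```

With 1,000 hypothetical examples, batch size 2, and 4 epochs, this gives 500 batches per epoch and 2,000 updates in total.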
Evaluation
Testing has not yet been conducted. Evaluation metrics and results will be added in future updates.
Model Card Authors
AxisSmart | Labs
VortexHunter (Alvin)
Model Card Contact
Contact information coming soon.
Model tree for VortexHunter23/LeoPARD-0.27-4bit
Base model: meta-llama/Llama-3.1-8B