Kaileh57 committed on
Commit
8691f4b
·
verified ·
1 Parent(s): d575ce4

Upload folder using huggingface_hub

Browse files
Files changed (6) hide show
  1. README.md +35 -1
  2. chat_cpu.py +2 -2
  3. chat_cuda.py +2 -2
  4. requirements-cuda.txt +38 -0
  5. setup-cuda.bat +100 -0
  6. setup-cuda.sh +101 -0
README.md CHANGED
@@ -68,16 +68,50 @@ This model can be used for:
68
  ## Quick Start
69
 
70
  ### Installation
 
 
71
  ```bash
72
  # Clone the repository
73
  git clone https://github.com/Kaileh57/Ursa_Minor_Smashed.git
74
  cd Ursa_Minor_Smashed
75
 
76
- # Set up environment
77
  pip install -r requirements.txt
78
  # or use the setup script: ./setup.sh
79
  ```
80
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
81
  ### Basic Usage
82
 
83
  #### Command Line Interface
 
68
  ## Quick Start
69
 
70
  ### Installation
71
+
72
+ #### CPU Installation (Default)
73
  ```bash
74
  # Clone the repository
75
  git clone https://github.com/Kaileh57/Ursa_Minor_Smashed.git
76
  cd Ursa_Minor_Smashed
77
 
78
+ # Set up CPU environment
79
  pip install -r requirements.txt
80
  # or use the setup script: ./setup.sh
81
  ```
82
 
83
+ #### CUDA Installation (For GPU Acceleration)
84
+ If you have a CUDA-capable GPU and want to use GPU acceleration:
85
+
86
+ **Linux/macOS:**
87
+ ```bash
88
+ # Use the automated CUDA setup script
89
+ chmod +x setup-cuda.sh
90
+ ./setup-cuda.sh
91
+ ```
92
+
93
+ **Windows:**
94
+ ```batch
95
+ # Use the Windows CUDA setup script
96
+ setup-cuda.bat
97
+ ```
98
+
99
+ **Manual CUDA Installation:**
100
+ ```bash
101
+ # Create separate CUDA environment
102
+ python -m venv venv-cuda
103
+ source venv-cuda/bin/activate # On Windows: venv-cuda\Scripts\activate
104
+
105
+ # Install CUDA requirements
106
+ pip install -r requirements-cuda.txt
107
+ ```
108
+
109
+ **CUDA Requirements:**
110
+ - NVIDIA GPU with CUDA Compute Capability 3.5 or higher
111
+ - NVIDIA drivers installed
112
+ - CUDA 11.8 or 12.1 toolkit (optional, PyTorch includes CUDA runtime)
113
+ - At least 4GB GPU memory recommended
114
+
115
  ### Basic Usage
116
 
117
  #### Command Line Interface
chat_cpu.py CHANGED
@@ -47,8 +47,8 @@ def main():
47
  full_response = generate_direct(
48
  model,
49
  context,
50
- max_new_tokens=80, # Lower for CPU efficiency
51
- temperature=0.7,
52
  top_p=0.9,
53
  top_k=30, # Lower for CPU efficiency
54
  repetition_penalty=1.1
 
47
  full_response = generate_direct(
48
  model,
49
  context,
50
+ max_new_tokens=100, # Match inference_cpu.py default
51
+ temperature=0.8, # Match inference_cpu.py default
52
  top_p=0.9,
53
  top_k=30, # Lower for CPU efficiency
54
  repetition_penalty=1.1
chat_cuda.py CHANGED
@@ -51,8 +51,8 @@ def main():
51
  full_response = generate_direct(
52
  model,
53
  context,
54
- max_new_tokens=150, # Higher for CUDA
55
- temperature=0.7,
56
  top_p=0.9,
57
  top_k=50, # Higher for better quality
58
  repetition_penalty=1.1
 
51
  full_response = generate_direct(
52
  model,
53
  context,
54
+ max_new_tokens=100, # Match inference_cuda.py default
55
+ temperature=0.8, # Match inference_cuda.py default
56
  top_p=0.9,
57
  top_k=50, # Higher for better quality
58
  repetition_penalty=1.1
requirements-cuda.txt ADDED
@@ -0,0 +1,38 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
# CUDA Requirements for Ursa Minor Smashed Model
# Install these dependencies for CUDA inference

# Core CUDA dependencies
# Install PyTorch with CUDA support (CUDA 11.8 or 12.1)
# For CUDA 11.8:
--extra-index-url https://download.pytorch.org/whl/cu118
# NOTE: local version suffixes such as "+cu118" are only valid with the "=="
# operator (PEP 440), so plain ranges are used here; the extra index URL above
# is what selects the CUDA builds.
# NOTE: torchvision versions are 0.x (torch 2.0 pairs with torchvision 0.15),
# so ">=2.0.0" for torchvision could never resolve.
torch>=2.0.0
torchaudio>=2.0.0
torchvision>=0.15.0

# For CUDA 12.1, comment out the --extra-index-url line above and uncomment:
# --extra-index-url https://download.pytorch.org/whl/cu121

# Core dependencies (same as CPU version)
numpy>=1.24.0
tiktoken>=0.5.0
tqdm>=4.65.0

# Optional dependencies
gguf>=0.6.0            # For GGUF conversion
sentencepiece>=0.1.99  # For tokenizer support
safetensors>=0.4.0     # For safe model serialization
psutil>=5.8.0          # For system monitoring

# CUDA-specific utilities
pynvml>=11.4.1         # For GPU monitoring and management
nvidia-ml-py3>=7.352.0 # Alternative GPU monitoring

# Development dependencies
matplotlib>=3.7.0
jupyter>=1.0.0
pytest>=7.4.0
black>=23.0.0
flake8>=6.0.0
setup-cuda.bat ADDED
@@ -0,0 +1,100 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
@echo off
REM CUDA Setup script for Ursa Minor Smashed model (Windows)
REM Creates a venv-cuda virtual environment, installs a CUDA build of
REM PyTorch plus the remaining dependencies, and sanity-checks the result.
setlocal

echo 🔥 Setting up CUDA environment for Ursa Minor Smashed model...

REM Check if NVIDIA GPU is available
nvidia-smi >nul 2>&1
if errorlevel 1 (
    echo ❌ ERROR: nvidia-smi not found. Make sure NVIDIA drivers are installed.
    pause
    exit /b 1
)

echo 🔍 Checking GPU information...
nvidia-smi

REM Get CUDA version (simplified for Windows)
echo 📌 Please ensure you have CUDA 11.8 or 12.1 installed

REM Create virtual environment
echo 📦 Creating virtual environment...
python -m venv venv-cuda
if errorlevel 1 (
    echo ❌ ERROR: failed to create virtual environment. Is Python on PATH?
    pause
    exit /b 1
)

REM Activate virtual environment
echo ✅ Activating virtual environment...
call venv-cuda\Scripts\activate

REM Upgrade pip
python -m pip install --upgrade pip

REM Install CUDA requirements
echo 🚀 Installing CUDA requirements...
echo This may take a few minutes as PyTorch CUDA packages are large...

REM Ask user for CUDA version
echo.
echo Please select your CUDA version:
echo 1. CUDA 11.8
echo 2. CUDA 12.1
echo 3. Auto-detect (default)
set /p cuda_choice=Enter choice (1-3, default 3):

if "%cuda_choice%"=="1" (
    echo Installing PyTorch for CUDA 11.8...
    pip install torch torchaudio torchvision --extra-index-url https://download.pytorch.org/whl/cu118
) else if "%cuda_choice%"=="2" (
    echo Installing PyTorch for CUDA 12.1...
    pip install torch torchaudio torchvision --extra-index-url https://download.pytorch.org/whl/cu121
) else (
    REM "Auto-detect" falls back to CUDA 11.8 wheels, which also run on 12.x drivers.
    echo Installing PyTorch for CUDA 11.8 (default)...
    pip install torch torchaudio torchvision --extra-index-url https://download.pytorch.org/whl/cu118
)

REM Install remaining requirements.
REM NOTE: the version specifiers MUST be quoted; an unquoted ">=" makes cmd
REM treat ">" as output redirection, which installed the unconstrained package
REM and created a stray file named "=1.24.0" etc.
echo 📋 Installing remaining dependencies...
pip install "numpy>=1.24.0" "tiktoken>=0.5.0" "tqdm>=4.65.0"
pip install "gguf>=0.6.0" "sentencepiece>=0.1.99" "safetensors>=0.4.0" "psutil>=5.8.0"
pip install "pynvml>=11.4.1" "nvidia-ml-py3>=7.352.0"
pip install "matplotlib>=3.7.0" "jupyter>=1.0.0"

REM Test CUDA availability (exit code 1 if torch cannot see the GPU)
echo 🧪 Testing CUDA setup...
python -c "import torch; print(f'PyTorch version: {torch.__version__}'); print(f'CUDA available: {torch.cuda.is_available()}'); import sys; sys.exit(0 if torch.cuda.is_available() else 1)"

if errorlevel 1 (
    echo ❌ CUDA not available in PyTorch installation
    pause
    exit /b 1
)

python -c "import torch; print(f'CUDA device count: {torch.cuda.device_count()}'); print(f'Current CUDA device: {torch.cuda.current_device()}'); print(f'CUDA device name: {torch.cuda.get_device_name()}'); print(f'CUDA version: {torch.version.cuda}')"

REM Verify model file exists
if exist "model_optimized.pt" (
    echo ✅ Model file found: model_optimized.pt
) else (
    echo ⚠️ Warning: model_optimized.pt not found in current directory
    echo Make sure you have the model file in the same directory as this script
)

echo.
echo 🎉 CUDA setup complete!
echo.
echo 📖 Usage Instructions:
echo To activate CUDA environment:
echo   venv-cuda\Scripts\activate
echo.
echo To run CUDA inference:
echo   python inference_cuda.py --prompt "Your prompt here"
echo.
echo To run CUDA chat:
echo   python chat_cuda.py
echo.
echo To run CUDA benchmark:
echo   python benchmark_cuda.py
echo.
echo 📊 Test your setup:
echo   python -c "import torch; print('CUDA available:', torch.cuda.is_available())"

pause
setup-cuda.sh ADDED
@@ -0,0 +1,101 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
#!/bin/bash
# CUDA Setup script for Ursa Minor Smashed model
# Creates a venv-cuda virtual environment, installs the PyTorch build that
# matches the driver's reported CUDA version, installs the remaining
# dependencies, and sanity-checks that torch can see the GPU.
set -euo pipefail

echo "🔥 Setting up CUDA environment for Ursa Minor Smashed model..."

# Check if NVIDIA GPU is available
if ! command -v nvidia-smi >/dev/null 2>&1; then
  echo "❌ ERROR: nvidia-smi not found. Make sure NVIDIA drivers are installed." >&2
  exit 1
fi

echo "🔍 Checking GPU information..."
nvidia-smi

# Extract "CUDA Version: X.Y" by pattern rather than a fixed awk column ($9),
# which breaks whenever nvidia-smi's header layout shifts.
CUDA_VERSION=$(nvidia-smi | sed -n 's/.*CUDA Version: *\([0-9][0-9]*\.[0-9]*\).*/\1/p' | head -n1)
echo "📌 Detected CUDA Version: ${CUDA_VERSION:-unknown}"

# Create virtual environment
echo "📦 Creating virtual environment..."
python -m venv venv-cuda

# Activate virtual environment (Scripts/ under Git-Bash/MSYS, bin/ elsewhere)
if [[ "$OSTYPE" == "msys" || "$OSTYPE" == "win32" ]]; then
  # shellcheck disable=SC1091
  source venv-cuda/Scripts/activate
else
  # shellcheck disable=SC1091
  source venv-cuda/bin/activate
fi

echo "✅ Virtual environment activated"

# Upgrade pip
pip install --upgrade pip

echo "🚀 Installing CUDA requirements..."
echo "This may take a few minutes as PyTorch CUDA packages are large..."

# Pick the PyTorch wheel index that matches the detected CUDA version.
if [[ "$CUDA_VERSION" == 12.* ]]; then
  echo "Installing PyTorch for CUDA 12.1..."
  torch_index="https://download.pytorch.org/whl/cu121"
elif [[ "$CUDA_VERSION" == 11.* ]]; then
  echo "Installing PyTorch for CUDA 11.8..."
  torch_index="https://download.pytorch.org/whl/cu118"
else
  echo "⚠️ Warning: Unsupported CUDA version ${CUDA_VERSION:-unknown}" >&2
  echo "Installing default CUDA 11.8 PyTorch..."
  torch_index="https://download.pytorch.org/whl/cu118"
fi
pip install torch torchaudio torchvision --extra-index-url "$torch_index"

# Install the remaining (non-torch) dependencies explicitly instead of via
# `pip install -r requirements-cuda.txt`: that file pins cu118 torch wheels
# and would clobber the cu121 install performed above for CUDA 12 systems.
echo "📋 Installing remaining dependencies..."
pip install "numpy>=1.24.0" "tiktoken>=0.5.0" "tqdm>=4.65.0"
pip install "gguf>=0.6.0" "sentencepiece>=0.1.99" "safetensors>=0.4.0" "psutil>=5.8.0"
pip install "pynvml>=11.4.1" "nvidia-ml-py3>=7.352.0"
pip install "matplotlib>=3.7.0" "jupyter>=1.0.0"

# Test CUDA availability; a quoted heredoc avoids shell expansion inside the
# Python source. Under `set -e` a failure here aborts the script.
echo "🧪 Testing CUDA setup..."
python - <<'PYCHECK'
import sys
import torch
print(f"PyTorch version: {torch.__version__}")
print(f"CUDA available: {torch.cuda.is_available()}")
if torch.cuda.is_available():
    print(f"CUDA device count: {torch.cuda.device_count()}")
    print(f"Current CUDA device: {torch.cuda.current_device()}")
    print(f"CUDA device name: {torch.cuda.get_device_name()}")
    print(f"CUDA version: {torch.version.cuda}")
else:
    print("❌ CUDA not available in PyTorch installation")
    sys.exit(1)
PYCHECK

# Verify model file exists
if [[ -f "model_optimized.pt" ]]; then
  echo "✅ Model file found: model_optimized.pt"
else
  echo "⚠️ Warning: model_optimized.pt not found in current directory"
  echo "Make sure you have the model file in the same directory as this script"
fi

echo ""
echo "🎉 CUDA setup complete!"
echo ""
echo "📖 Usage Instructions:"
echo "To activate CUDA environment:"
if [[ "$OSTYPE" == "msys" || "$OSTYPE" == "win32" ]]; then
  echo "  source venv-cuda/Scripts/activate"
else
  echo "  source venv-cuda/bin/activate"
fi
echo ""
echo "To run CUDA inference:"
echo "  python inference_cuda.py --prompt 'Your prompt here'"
echo ""
echo "To run CUDA chat:"
echo "  python chat_cuda.py"
echo ""
echo "To run CUDA benchmark:"
echo "  python benchmark_cuda.py"
echo ""
echo "📊 Test your setup:"
echo "  python -c \"import torch; print('CUDA available:', torch.cuda.is_available())\""