---
license: apache-2.0
language:
- en
library_name: gguf
pipeline_tag: text-generation
tags:
- mathematical-reasoning
- qwen3
- gguf
- quantized
- math
- reasoning
- fine-tuned
base_model: PinkPixel/Crystal-Think-V2
quantized_by: PinkPixel
---
# 🧠 Crystal Think V2 - GGUF Quantized ✨

**Optimized GGUF Quantizations for Efficient Mathematical Reasoning**

**Original Model:** PinkPixel/Crystal-Think-V2
**Quantized by:** Pink Pixel
**License:** Apache 2.0
## About This Repository

This repository contains GGUF quantized versions of Crystal Think V2, an advanced mathematical reasoning model based on Qwen3-4B. These quantized versions are optimized for efficient inference while maintaining excellent mathematical reasoning capabilities.
### Original Model Features

- **Advanced Mathematical Reasoning** with enhanced chain-of-thought
- **Multi-step Problem Solving** with clear explanations
- **Mathematical Code Generation** and algorithm explanation
- **Enhanced `<think></think>` Reasoning Format**
- **85.2% GSM8K accuracy** (+8.8% over base Qwen3-4B)
## Available Quantizations
| Quantization | File Size | Use Case | Memory Required | Quality |
|---|---|---|---|---|
| Q4_K_M | 2.3GB | Balanced efficiency | ~6GB RAM | Good |
| Q5_K_M | 2.7GB | Better quality | ~7GB RAM | Very Good |
| Q6_K | 3.1GB | High quality | ~8GB RAM | Excellent |
| Q8_0 | 4.0GB | Maximum quality | ~10GB RAM | Near-Original |
**Quantization Guide:**

- **Q4_K_M** - Best for limited hardware, good performance
- **Q5_K_M** - Recommended balance of speed and quality
- **Q6_K** - High quality with reasonable speed
- **Q8_0** - Near-original quality, slower inference
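To make the guide above concrete, here is a small hypothetical helper (not part of this repo; the RAM figures mirror the table above) that picks the highest-quality quantization fitting a given memory budget:

```python
# Hypothetical helper (not part of this repo). Entries are ordered from
# lowest to highest quality, so the last one that fits wins.
QUANT_RAM_GB = [
    ("Q4_K_M", 6),
    ("Q5_K_M", 7),
    ("Q6_K", 8),
    ("Q8_0", 10),
]

def pick_quantization(available_ram_gb: float):
    """Return the highest-quality quantization that fits, or None."""
    best = None
    for name, ram_needed in QUANT_RAM_GB:
        if ram_needed <= available_ram_gb:
            best = name
    return best

print(pick_quantization(8))   # Q6_K
print(pick_quantization(16))  # Q8_0
```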
## Quick Start

### Using llama.cpp

```bash
# Download your preferred quantization
wget https://huggingface.co/PinkPixel/Crystal-Think-V2-GGUF/resolve/main/crystal-think-v2-q5_k_m.gguf

# Run with llama.cpp
./llama.cpp/main -m crystal-think-v2-q5_k_m.gguf \
  -p "Solve this step by step: If x + 2y = 10 and 2x - y = 5, find x and y." \
  -n 512
```
### Using llama-cpp-python

```python
from llama_cpp import Llama

# Load the model
llm = Llama(
    model_path="crystal-think-v2-q5_k_m.gguf",
    n_ctx=4096,      # Context length
    n_threads=8,     # CPU threads
    verbose=False,
)

# Mathematical reasoning example
prompt = """Solve this step by step:
A rectangle has a length that is 3 more than twice its width. If the perimeter is 42 cm, what are the dimensions?

Use <think></think> for your reasoning."""

response = llm(
    prompt,
    max_tokens=512,
    temperature=0.7,
    stop=["</SOLUTION>", "<|endoftext|>"],
)

print(response["choices"][0]["text"])
```
### Using Ollama

```bash
# Create a Modelfile
echo 'FROM ./crystal-think-v2-q5_k_m.gguf' > Modelfile

# Create the Ollama model
ollama create crystal-think-v2 -f Modelfile

# Run the model
ollama run crystal-think-v2 "What is the derivative of x^3 + 2x^2 - 5?"
```
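Ollama also serves a local REST API (by default at `http://localhost:11434`), which is handy for scripting. The sketch below only builds the `/api/generate` request body, so it can be tested without a running server; the model name `crystal-think-v2` matches the `ollama create` step above:

```python
import json

# Assumes Ollama's default local endpoint
OLLAMA_URL = "http://localhost:11434/api/generate"

def build_generate_request(model: str, prompt: str) -> bytes:
    """Serialize a non-streaming body for Ollama's /api/generate endpoint."""
    return json.dumps({"model": model, "prompt": prompt, "stream": False}).encode("utf-8")

# With a running Ollama server, send it like this:
# import urllib.request
# req = urllib.request.Request(
#     OLLAMA_URL,
#     data=build_generate_request("crystal-think-v2",
#                                 "What is the derivative of x^3 + 2x^2 - 5?"),
#     headers={"Content-Type": "application/json"},
# )
# print(json.loads(urllib.request.urlopen(req).read())["response"])
```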
## Enhanced Reasoning Format

Crystal Think V2 uses a structured reasoning approach:

```
<think>
[Step-by-step reasoning process]
- Variable definitions
- Equation setup
- Mathematical operations
- Verification steps
</think>

<SOLUTION>
[Final organized answer]
1) Specific results
2) Numerical values
3) Units and context
</SOLUTION>
```
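Because the reasoning and the final answer are wrapped in fixed tags, they are easy to separate programmatically. A minimal sketch (the `parse_reasoning` helper is illustrative, not part of any library):

```python
import re

def parse_reasoning(text: str) -> dict:
    """Split a response into its <think> reasoning and <SOLUTION> answer.

    Missing sections come back as empty strings.
    """
    think = re.search(r"<think>(.*?)</think>", text, re.DOTALL)
    solution = re.search(r"<SOLUTION>(.*?)</SOLUTION>", text, re.DOTALL)
    return {
        "reasoning": think.group(1).strip() if think else "",
        "solution": solution.group(1).strip() if solution else "",
    }

sample = "<think>Let w be the width.</think>\n<SOLUTION>w = 9 cm</SOLUTION>"
print(parse_reasoning(sample)["solution"])  # w = 9 cm
```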
## Performance Benchmarks

### Original Model Performance
| Benchmark | Score | Improvement over Base |
|---|---|---|
| GSM8K | 85.2% | +8.8% |
| MATH | 42.1% | +10.4% |
| Algebra | 78.9% | +13.7% |
| Geometry | 71.3% | +12.5% |
| Code Math | 82.6% | +13.5% |
### GGUF Quantization Impact

- **Q8_0:** ~99% of original performance
- **Q6_K:** ~97% of original performance
- **Q5_K_M:** ~95% of original performance
- **Q4_K_M:** ~92% of original performance
## Hardware Requirements

### Minimum Requirements
| Quantization | RAM | VRAM (GPU) | CPU |
|---|---|---|---|
| Q4_K_M | 6GB | 4GB | 4 cores |
| Q5_K_M | 7GB | 5GB | 4 cores |
| Q6_K | 8GB | 6GB | 6 cores |
| Q8_0 | 10GB | 8GB | 8 cores |
### Recommended for Best Performance

- **CPU:** Modern 8+ core processor
- **RAM:** 16 GB+ system memory
- **GPU:** 8 GB+ VRAM (optional, for GPU acceleration)
## Installation & Dependencies

### llama.cpp

```bash
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
make
```

### llama-cpp-python

```bash
pip install llama-cpp-python

# For GPU support (optional; newer llama.cpp builds use -DGGML_CUDA=on)
CMAKE_ARGS="-DLLAMA_CUBLAS=on" pip install llama-cpp-python
```

### Ollama

```bash
# Install Ollama
curl -fsSL https://ollama.com/install.sh | sh
```
## Usage Examples

### Basic Mathematical Problem

**Input:** "What is the integral of 2x + 3?"
**Expected:** Step-by-step integration with explanation

### Complex Word Problem

**Input:** "A train travels 120 miles in 2 hours, then 180 miles in 3 hours. What's the average speed?"
**Expected:** Detailed solution with calculations

### Algebraic Reasoning

**Input:** "Solve the system: 3x + 2y = 12, x - y = 1"
**Expected:** Systematic solution using substitution or elimination
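The expected answers for the word problem and the linear system above can be double-checked with plain arithmetic, which is also a quick way to verify a quantized model's output:

```python
from fractions import Fraction

# Word problem: average speed = total distance / total time
avg_speed = (120 + 180) / (2 + 3)   # 60.0 mph

# System: 3x + 2y = 12, x - y = 1. Substitute x = y + 1:
# 3(y + 1) + 2y = 12  ->  5y = 9  ->  y = 9/5, x = 14/5
y = Fraction(9, 5)
x = y + 1

print(avg_speed, x, y)  # 60.0 14/5 9/5
```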
## Related Links

- **Original Model:** [PinkPixel/Crystal-Think-V2](https://huggingface.co/PinkPixel/Crystal-Think-V2)
- **Model Documentation:** Crystal Think V2 README
- **llama.cpp:** [GitHub Repository](https://github.com/ggerganov/llama.cpp)
- **llama-cpp-python:** [PyPI Package](https://pypi.org/project/llama-cpp-python/)
## Limitations

- **Domain Focus:** Optimized for mathematical reasoning; may be less effective for general conversation
- **Quantization Trade-offs:** Lower quantizations may show reduced accuracy on complex problems
- **Language:** Primarily trained on English mathematical content
- **Hardware Dependency:** Performance varies significantly with hardware specifications
## Benchmarking Your Setup

Test your quantization choice with this sample problem:

**Prompt:** "A rectangular garden has a length that is 4 meters more than twice its width. The garden is surrounded by a walkway that is 2 meters wide on all sides. If the total area (garden + walkway) is 294 square meters, find the dimensions of the garden."

**Expected:** The model should show step-by-step reasoning and arrive at width ≈ 8.12 m, length ≈ 20.25 m.
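For reference, the garden problem reduces to a quadratic that can be solved exactly, so you can check the model's answer against it. With width `w` and length `2w + 4`, the 2 m walkway makes the outer rectangle `(w + 4) × (2w + 8)`:

```python
import math

# (w + 4)(2w + 8) = 294  ->  2w^2 + 16w + 32 = 294  ->  w^2 + 8w - 131 = 0
w = (-8 + math.sqrt(8**2 + 4 * 131)) / 2   # positive root of the quadratic
length = 2 * w + 4

print(round(w, 2), round(length, 2))  # 8.12 20.25
```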
## Contributing

Found an issue with the quantizations, or have suggestions for improvements? Please open an issue or reach out!

## Contact & Support

- **Developer:** Pink Pixel
- **GitHub:** [pinkpixel-dev](https://github.com/pinkpixel-dev)
- **Website:** [pinkpixel.dev](https://pinkpixel.dev)
- **Email:** admin@pinkpixel.dev
## Acknowledgments

- **Original Model:** Crystal Think V2 by Pink Pixel
- **Base Model:** Qwen/Qwen3-4B by the Qwen Team
- **Quantization Tools:** llama.cpp by Georgi Gerganov
- **Training Dataset:** NVIDIA OpenMathReasoning

Made with ❤️ by Pink Pixel ✨

*"Dream it, Pixel it"*