---
license: apache-2.0
language:
- en
library_name: gguf
pipeline_tag: text-generation
tags:
- mathematical-reasoning
- qwen3
- gguf
- quantized
- math
- reasoning
- fine-tuned
base_model: PinkPixel/Crystal-Think-V2
quantized_by: PinkPixel
---

<div align="center">
  <img src="crystal-think-v2-logo.png" alt="Crystal Think V2 Logo" width="400"/>
</div>
# 🧠 Crystal Think V2 - GGUF Quantized ✨

**Optimized GGUF Quantizations for Efficient Mathematical Reasoning**

> **🔗 Original Model:** [PinkPixel/Crystal-Think-V2](https://huggingface.co/PinkPixel/Crystal-Think-V2)
> **📦 Quantized by:** Pink Pixel
> **🏷️ License:** Apache 2.0

---
## 📋 About This Repository

This repository contains **GGUF quantized versions** of Crystal Think V2, an advanced mathematical reasoning model based on Qwen3-4B. These quantized versions are optimized for **efficient inference** while maintaining excellent mathematical reasoning capabilities.

### 🎯 Original Model Features
- 🧮 **Advanced Mathematical Reasoning** with enhanced chain-of-thought
- 📐 **Multi-step Problem Solving** with clear explanations
- 💻 **Mathematical Code Generation** and algorithm explanation
- 🎯 **Enhanced `<think></think>` Reasoning Format**
- 📊 **85.2% GSM8K accuracy** (+8.8% over base Qwen3-4B)

---
## 📦 Available Quantizations

| Quantization | File Size | Use Case | Memory Required | Quality |
|--------------|-----------|----------|-----------------|---------|
| **Q4_K_M** | 2.3GB | Balanced efficiency | ~6GB RAM | Good |
| **Q5_K_M** | 2.7GB | Better quality | ~7GB RAM | Very Good |
| **Q6_K** | 3.1GB | High quality | ~8GB RAM | Excellent |
| **Q8_0** | 4.0GB | Maximum quality | ~10GB RAM | Near-Original |

### 💡 Quantization Guide
- **Q4_K_M** - Best for limited hardware; good performance
- **Q5_K_M** - Recommended balance of speed and quality
- **Q6_K** - High quality with reasonable speed
- **Q8_0** - Near-original quality; slower inference

---

## 🚀 Quick Start

### Using llama.cpp

```bash
# Download your preferred quantization
wget https://huggingface.co/PinkPixel/Crystal-Think-V2-GGUF/resolve/main/crystal-think-v2-q5_k_m.gguf

# Run with llama.cpp (the binary is `llama-cli`; older releases called it `main`)
./llama.cpp/llama-cli -m crystal-think-v2-q5_k_m.gguf -p "Solve this step by step: If x + 2y = 10 and 2x - y = 5, find x and y." -n 512
```
### Using llama-cpp-python

```python
from llama_cpp import Llama

# Load the model
llm = Llama(
    model_path="crystal-think-v2-q5_k_m.gguf",
    n_ctx=4096,      # Context length
    n_threads=8,     # CPU threads
    verbose=False
)

# Mathematical reasoning example
prompt = """Solve this step by step:
A rectangle has a length that is 3 more than twice its width. If the perimeter is 42 cm, what are the dimensions?

Use <think></think> for your reasoning."""

response = llm(
    prompt,
    max_tokens=512,
    temperature=0.7,
    stop=["</SOLUTION>", "<|endoftext|>"]
)

print(response["choices"][0]["text"])
```

### Using Ollama

```bash
# Create Modelfile
echo 'FROM ./crystal-think-v2-q5_k_m.gguf' > Modelfile

# Create Ollama model
ollama create crystal-think-v2 -f Modelfile

# Run the model
ollama run crystal-think-v2 "What is the derivative of x^3 + 2x^2 - 5?"
```
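For more control, the one-line Modelfile above can be expanded with Ollama's `PARAMETER` and `SYSTEM` directives. The values below are illustrative defaults, not tuned settings:

```
FROM ./crystal-think-v2-q5_k_m.gguf

# Sampling and context settings (illustrative values)
PARAMETER temperature 0.7
PARAMETER num_ctx 4096

# Optional system prompt nudging the structured reasoning format
SYSTEM """You are a mathematical reasoning assistant. Show your work inside <think></think> and give the final answer inside <SOLUTION></SOLUTION>."""
```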

---

## 🎯 Enhanced Reasoning Format

Crystal Think V2 uses a structured reasoning approach:

```
<think>
[Step-by-step reasoning process]
- Variable definitions
- Equation setup
- Mathematical operations
- Verification steps
</think>

<SOLUTION>
[Final organized answer]
1) Specific results
2) Numerical values
3) Units and context
</SOLUTION>
```
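Because the model emits its reasoning and answer in fixed tags, the two sections are easy to separate with a regex. This is an illustrative post-processing sketch, not part of any official API:

```python
import re

def split_reasoning(text: str) -> dict:
    """Split a model response into its <think> and <SOLUTION> sections.

    Returns empty strings for sections that are missing (e.g. when
    generation stopped at a `</SOLUTION>` stop token before emitting it).
    """
    def grab(tag: str) -> str:
        # Tolerate a missing closing tag by matching to end of text.
        m = re.search(rf"<{tag}>(.*?)(?:</{tag}>|$)", text, re.DOTALL)
        return m.group(1).strip() if m else ""
    return {"think": grab("think"), "solution": grab("SOLUTION")}

sample = ("<think>P = 2(l + w); l = 2w + 3, so 2(3w + 3) = 42</think>\n"
          "<SOLUTION>width 6 cm, length 15 cm</SOLUTION>")
print(split_reasoning(sample))
```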

---

## 📊 Performance Benchmarks

### Original Model Performance

| Benchmark | Score | Improvement over Base |
|-----------|-------|-----------------------|
| **GSM8K** | 85.2% | +8.8% |
| **MATH** | 42.1% | +10.4% |
| **Algebra** | 78.9% | +13.7% |
| **Geometry** | 71.3% | +12.5% |
| **Code Math** | 82.6% | +13.5% |

### GGUF Quantization Impact
- **Q8_0**: ~99% of original performance
- **Q6_K**: ~97% of original performance
- **Q5_K_M**: ~95% of original performance
- **Q4_K_M**: ~92% of original performance

---

## 💻 Hardware Requirements

### Minimum Requirements

| Quantization | RAM | VRAM (GPU) | CPU |
|--------------|-----|------------|-----|
| Q4_K_M | 6GB | 4GB | 4 cores |
| Q5_K_M | 7GB | 5GB | 4 cores |
| Q6_K | 8GB | 6GB | 6 cores |
| Q8_0 | 10GB | 8GB | 8 cores |

### Recommended for Best Performance
- **CPU**: Modern 8+ core processor
- **RAM**: 16GB+ system memory
- **GPU**: 8GB+ VRAM (optional, for GPU acceleration)

---

## 🔧 Installation & Dependencies

### llama.cpp
```bash
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
cmake -B build
cmake --build build --config Release
# Older releases built with a plain `make`
```

### llama-cpp-python
```bash
pip install llama-cpp-python

# For GPU support (optional); older versions used -DLLAMA_CUBLAS=on
CMAKE_ARGS="-DGGML_CUDA=on" pip install llama-cpp-python
```

### Ollama
```bash
# Install Ollama
curl -fsSL https://ollama.com/install.sh | sh
```

---

## 📚 Usage Examples

### Basic Mathematical Problem
```
Input: "What is the integral of 2x + 3?"
Expected: Step-by-step integration with explanation
```

### Complex Word Problem
```
Input: "A train travels 120 miles in 2 hours, then 180 miles in 3 hours. What's the average speed?"
Expected: Detailed solution with calculations
```

### Algebraic Reasoning
```
Input: "Solve the system: 3x + 2y = 12, x - y = 1"
Expected: Systematic solution using substitution or elimination
```
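The expected answers are easy to verify by hand; a short sanity-check script (illustrative, independent of the model) confirms what the model should produce for the word problem and the linear system:

```python
# Word problem: average speed = total distance / total time
distance = 120 + 180          # miles
time = 2 + 3                  # hours
avg_speed = distance / time
print(avg_speed)              # 60.0 (mph)

# Linear system: 3x + 2y = 12, x - y = 1, by substitution x = y + 1:
# 3(y + 1) + 2y = 12  ->  5y = 9  ->  y = 1.8, x = 2.8
y = (12 - 3) / 5
x = y + 1
print(x, y)                   # 2.8 1.8
```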

---

## 🔗 Related Links

- **🏠 Original Model:** [PinkPixel/Crystal-Think-V2](https://huggingface.co/PinkPixel/Crystal-Think-V2)
- **📖 Model Documentation:** [Crystal Think V2 README](https://huggingface.co/PinkPixel/Crystal-Think-V2/blob/main/README.md)
- **🛠️ llama.cpp:** [GitHub Repository](https://github.com/ggerganov/llama.cpp)
- **🐍 llama-cpp-python:** [PyPI Package](https://pypi.org/project/llama-cpp-python/)

---

## ⚠️ Limitations

- **Domain Focus**: Optimized for mathematical reasoning; may be less effective for general conversation
- **Quantization Trade-offs**: Lower quantizations may show reduced accuracy on complex problems
- **Language**: Primarily trained on English mathematical content
- **Hardware Dependency**: Performance varies significantly with hardware specifications

---

## 📈 Benchmarking Your Setup

Test your quantization choice with this sample problem:

```
Prompt: "A rectangular garden has a length that is 4 meters more than twice its width. The garden is surrounded by a walkway that is 2 meters wide on all sides. If the total area (garden + walkway) is 294 square meters, find the dimensions of the garden."

Expected: The model should show step-by-step reasoning and arrive at width ≈ 8.12 m, length ≈ 20.25 m
```
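The expected answer can be checked independently with the quadratic formula. This sketch sets up the equation from the prompt and solves it:

```python
import math

# Garden: length = 2w + 4. With a 2 m walkway on all sides, the outer
# rectangle measures (w + 4) by (2w + 4 + 4), and its area is 294 m²:
#   (w + 4)(2w + 8) = 294  ->  2w² + 16w + 32 = 294  ->  w² + 8w - 131 = 0
a, b, c = 1, 8, -131
width = (-b + math.sqrt(b**2 - 4 * a * c)) / (2 * a)
length = 2 * width + 4

print(round(width, 2), round(length, 2))   # 8.12 20.25

# Verify the total area (garden + walkway) comes back to 294 m²
assert abs((width + 4) * (length + 4) - 294) < 1e-9
```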

---

## 🤝 Contributing

Found an issue with the quantizations, or have suggestions for improvements? Please open an issue or reach out!

---

## 📧 Contact & Support

- **Developer:** Pink Pixel
- **GitHub:** [https://github.com/pinkpixel-dev](https://github.com/pinkpixel-dev)
- **Website:** [https://pinkpixel.dev](https://pinkpixel.dev)
- **Email:** [admin@pinkpixel.dev](mailto:admin@pinkpixel.dev)

---

## 🙏 Acknowledgments

- **Original Model:** Crystal Think V2 by Pink Pixel
- **Base Model:** [Qwen/Qwen3-4B](https://huggingface.co/Qwen/Qwen3-4B) by the Qwen Team
- **Quantization Tools:** [llama.cpp](https://github.com/ggerganov/llama.cpp) by Georgi Gerganov
- **Training Dataset:** [NVIDIA OpenMathReasoning](https://huggingface.co/datasets/nvidia/OpenMathReasoning)

---

**Made with ❤️ by Pink Pixel** ✨
*"Dream it, Pixel it"*