---
license: apache-2.0
language:
- en
library_name: gguf
pipeline_tag: text-generation
tags:
- mathematical-reasoning
- qwen3
- gguf
- quantized
- math
- reasoning
- fine-tuned
base_model: PinkPixel/Crystal-Think-V2
quantized_by: PinkPixel
---

<div align="center">
  <img src="crystal-think-v2-logo.png" alt="Crystal Think V2 Logo" width="400"/>
</div>
# 🧠 Crystal Think V2 - GGUF Quantized ✨

**Optimized GGUF Quantizations for Efficient Mathematical Reasoning**

> **🔗 Original Model:** [PinkPixel/Crystal-Think-V2](https://huggingface.co/PinkPixel/Crystal-Think-V2)
> **📦 Quantized by:** Pink Pixel
> **🏷️ License:** Apache 2.0

---
## 📋 About This Repository

This repository contains **GGUF quantized versions** of Crystal Think V2, an advanced mathematical reasoning model based on Qwen3-4B. These quantized versions are optimized for **efficient inference** while maintaining excellent mathematical reasoning capabilities.

### 🎯 Original Model Features
- 🧮 **Advanced Mathematical Reasoning** with enhanced chain-of-thought
- 📐 **Multi-step Problem Solving** with clear explanations
- 💻 **Mathematical Code Generation** and algorithm explanation
- 🎯 **Enhanced `<think></think>` Reasoning Format**
- 📊 **85.2% GSM8K accuracy** (+8.8% over base Qwen3-4B)

---
## 📦 Available Quantizations

| Quantization | File Size | Use Case | Memory Required | Quality |
|--------------|-----------|----------|-----------------|---------|
| **Q4_K_M** | 2.3GB | Balanced efficiency | ~6GB RAM | Good |
| **Q5_K_M** | 2.7GB | Better quality | ~7GB RAM | Very Good |
| **Q6_K** | 3.1GB | High quality | ~8GB RAM | Excellent |
| **Q8_0** | 4.0GB | Maximum quality | ~10GB RAM | Near-Original |

### 💡 Quantization Guide
- **Q4_K_M** - Best for limited hardware; good performance
- **Q5_K_M** - Recommended balance of speed and quality
- **Q6_K** - High quality with reasonable speed
- **Q8_0** - Near-original quality; slower inference

---

## 🚀 Quick Start

### Using llama.cpp

```bash
# Download your preferred quantization
wget https://huggingface.co/PinkPixel/Crystal-Think-V2-GGUF/resolve/main/crystal-think-v2-q5_k_m.gguf

# Run with llama.cpp (the binary is `llama-cli`; older releases called it `main`)
./llama.cpp/llama-cli -m crystal-think-v2-q5_k_m.gguf -p "Solve this step by step: If x + 2y = 10 and 2x - y = 5, find x and y." -n 512
```
### Using llama-cpp-python

```python
from llama_cpp import Llama

# Load the model
llm = Llama(
    model_path="crystal-think-v2-q5_k_m.gguf",
    n_ctx=4096,      # Context length
    n_threads=8,     # CPU threads
    verbose=False
)

# Mathematical reasoning example
prompt = """Solve this step by step:
A rectangle has a length that is 3 more than twice its width. If the perimeter is 42 cm, what are the dimensions?

Use <think></think> for your reasoning."""

response = llm(
    prompt,
    max_tokens=512,
    temperature=0.7,
    stop=["</SOLUTION>", "<|endoftext|>"]
)

print(response["choices"][0]["text"])
```

### Using Ollama

```bash
# Create Modelfile
echo 'FROM ./crystal-think-v2-q5_k_m.gguf' > Modelfile

# Create Ollama model
ollama create crystal-think-v2 -f Modelfile

# Run the model
ollama run crystal-think-v2 "What is the derivative of x^3 + 2x^2 - 5?"
```
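For more control, the one-line Modelfile above can be expanded with Ollama's `PARAMETER` and `SYSTEM` directives. The values below are illustrative defaults, not tuned settings:

```
FROM ./crystal-think-v2-q5_k_m.gguf

# Sampling and context settings (illustrative values)
PARAMETER temperature 0.7
PARAMETER num_ctx 4096

# Optional system prompt nudging the structured reasoning format
SYSTEM """You are a mathematical reasoning assistant. Show your work inside <think></think> and give the final answer inside <SOLUTION></SOLUTION>."""
```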

---

## 🎯 Enhanced Reasoning Format

Crystal Think V2 uses a structured reasoning approach:

```
<think>
[Step-by-step reasoning process]
- Variable definitions
- Equation setup
- Mathematical operations
- Verification steps
</think>

<SOLUTION>
[Final organized answer]
1) Specific results
2) Numerical values
3) Units and context
</SOLUTION>
```
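Because the model emits its reasoning and answer in fixed tags, the two sections are easy to separate with a regex. This is an illustrative post-processing sketch, not part of any official API:

```python
import re

def split_reasoning(text: str) -> dict:
    """Split a model response into its <think> and <SOLUTION> sections.

    Returns empty strings for sections that are missing (e.g. when
    generation stopped at a `</SOLUTION>` stop token before emitting it).
    """
    def grab(tag: str) -> str:
        # Tolerate a missing closing tag by matching to end of text.
        m = re.search(rf"<{tag}>(.*?)(?:</{tag}>|$)", text, re.DOTALL)
        return m.group(1).strip() if m else ""
    return {"think": grab("think"), "solution": grab("SOLUTION")}

sample = ("<think>P = 2(l + w); l = 2w + 3, so 2(3w + 3) = 42</think>\n"
          "<SOLUTION>width 6 cm, length 15 cm</SOLUTION>")
print(split_reasoning(sample))
```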

---

## 📊 Performance Benchmarks

### Original Model Performance

| Benchmark | Score | Improvement over Base |
|-----------|-------|-----------------------|
| **GSM8K** | 85.2% | +8.8% |
| **MATH** | 42.1% | +10.4% |
| **Algebra** | 78.9% | +13.7% |
| **Geometry** | 71.3% | +12.5% |
| **Code Math** | 82.6% | +13.5% |

### GGUF Quantization Impact
- **Q8_0**: ~99% of original performance
- **Q6_K**: ~97% of original performance
- **Q5_K_M**: ~95% of original performance
- **Q4_K_M**: ~92% of original performance

---

## 💻 Hardware Requirements

### Minimum Requirements

| Quantization | RAM | VRAM (GPU) | CPU |
|--------------|-----|------------|-----|
| Q4_K_M | 6GB | 4GB | 4 cores |
| Q5_K_M | 7GB | 5GB | 4 cores |
| Q6_K | 8GB | 6GB | 6 cores |
| Q8_0 | 10GB | 8GB | 8 cores |

### Recommended for Best Performance
- **CPU**: Modern 8+ core processor
- **RAM**: 16GB+ system memory
- **GPU**: 8GB+ VRAM (optional, for GPU acceleration)

---

## 🔧 Installation & Dependencies

### llama.cpp
```bash
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
cmake -B build
cmake --build build --config Release
# Older releases built with a plain `make`
```

### llama-cpp-python
```bash
pip install llama-cpp-python

# For GPU support (optional); older versions used -DLLAMA_CUBLAS=on
CMAKE_ARGS="-DGGML_CUDA=on" pip install llama-cpp-python
```

### Ollama
```bash
# Install Ollama
curl -fsSL https://ollama.com/install.sh | sh
```

---

## 📚 Usage Examples

### Basic Mathematical Problem
```
Input: "What is the integral of 2x + 3?"
Expected: Step-by-step integration with explanation
```

### Complex Word Problem
```
Input: "A train travels 120 miles in 2 hours, then 180 miles in 3 hours. What's the average speed?"
Expected: Detailed solution with calculations
```

### Algebraic Reasoning
```
Input: "Solve the system: 3x + 2y = 12, x - y = 1"
Expected: Systematic solution using substitution or elimination
```
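The expected answers are easy to verify by hand; a short sanity-check script (illustrative, independent of the model) confirms what the model should produce for the word problem and the linear system:

```python
# Word problem: average speed = total distance / total time
distance = 120 + 180          # miles
time = 2 + 3                  # hours
avg_speed = distance / time
print(avg_speed)              # 60.0 (mph)

# Linear system: 3x + 2y = 12, x - y = 1, by substitution x = y + 1:
# 3(y + 1) + 2y = 12  ->  5y = 9  ->  y = 1.8, x = 2.8
y = (12 - 3) / 5
x = y + 1
print(x, y)                   # 2.8 1.8
```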

---

## 🔗 Related Links

- **🏠 Original Model:** [PinkPixel/Crystal-Think-V2](https://huggingface.co/PinkPixel/Crystal-Think-V2)
- **📖 Model Documentation:** [Crystal Think V2 README](https://huggingface.co/PinkPixel/Crystal-Think-V2/blob/main/README.md)
- **🛠️ llama.cpp:** [GitHub Repository](https://github.com/ggerganov/llama.cpp)
- **🐍 llama-cpp-python:** [PyPI Package](https://pypi.org/project/llama-cpp-python/)

---

## ⚠️ Limitations

- **Domain Focus**: Optimized for mathematical reasoning; may be less effective for general conversation
- **Quantization Trade-offs**: Lower quantizations may show reduced accuracy on complex problems
- **Language**: Primarily trained on English mathematical content
- **Hardware Dependency**: Performance varies significantly with hardware specifications

---

## 📈 Benchmarking Your Setup

Test your quantization choice with this sample problem:

```
Prompt: "A rectangular garden has a length that is 4 meters more than twice its width. The garden is surrounded by a walkway that is 2 meters wide on all sides. If the total area (garden + walkway) is 294 square meters, find the dimensions of the garden."

Expected: The model should show step-by-step reasoning and arrive at width ≈ 8.12 m, length ≈ 20.25 m
```
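The expected answer can be checked independently with the quadratic formula. This sketch sets up the equation from the prompt and solves it:

```python
import math

# Garden: length = 2w + 4. With a 2 m walkway on all sides, the outer
# rectangle measures (w + 4) by (2w + 4 + 4), and its area is 294 m²:
#   (w + 4)(2w + 8) = 294  ->  2w² + 16w + 32 = 294  ->  w² + 8w - 131 = 0
a, b, c = 1, 8, -131
width = (-b + math.sqrt(b**2 - 4 * a * c)) / (2 * a)
length = 2 * width + 4

print(round(width, 2), round(length, 2))   # 8.12 20.25

# Verify the total area (garden + walkway) comes back to 294 m²
assert abs((width + 4) * (length + 4) - 294) < 1e-9
```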

---

## 🤝 Contributing

Found an issue with the quantizations, or have suggestions for improvements? Please open an issue or reach out!

---

## 📧 Contact & Support

- **Developer:** Pink Pixel
- **GitHub:** [https://github.com/pinkpixel-dev](https://github.com/pinkpixel-dev)
- **Website:** [https://pinkpixel.dev](https://pinkpixel.dev)
- **Email:** [admin@pinkpixel.dev](mailto:admin@pinkpixel.dev)

---

## 🙏 Acknowledgments

- **Original Model:** Crystal Think V2 by Pink Pixel
- **Base Model:** [Qwen/Qwen3-4B](https://huggingface.co/Qwen/Qwen3-4B) by the Qwen Team
- **Quantization Tools:** [llama.cpp](https://github.com/ggerganov/llama.cpp) by Georgi Gerganov
- **Training Dataset:** [NVIDIA OpenMathReasoning](https://huggingface.co/datasets/nvidia/OpenMathReasoning)

---

**Made with ❤️ by Pink Pixel** ✨
*"Dream it, Pixel it"*