---
license: apache-2.0
base_model:
- Qwen/Qwen3-8B
- allura-org/remnant-qwen3-8b
tags:
- merge
- qwen3
- creative-writing
- roleplay
- gguf
- llama-cpp
model_type: qwen3
---

# RemnantInstruct-8B-GGUF

GGUF quantizations of RemnantInstruct-8B, a SLERP merge that combines the instruction-following of Qwen3-8B with the creative-writing strengths of remnant-qwen3-8b.

## Model Details

**Base Models:**
- [Qwen/Qwen3-8B](https://huggingface.co/Qwen/Qwen3-8B) - strong instruction following and reasoning
- [allura-org/remnant-qwen3-8b](https://huggingface.co/allura-org/remnant-qwen3-8b) - enhanced creative writing and roleplay

**Merge Method:** SLERP (Spherical Linear Interpolation)

The merge uses a complementary interpolation schedule:
- Self-attention layers: gradual blend from base to creative (0 -> 0.5 -> 0.3 -> 0.7 -> 1)
- MLP layers: inverse blend (1 -> 0.5 -> 0.7 -> 0.3 -> 0)
- All other parameters: 50/50 blend

This approach preserves the base model's instruction following while incorporating the creative-writing capabilities of the remnant fine-tune.
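
The SLERP operation itself can be sketched in a few lines of NumPy. This is an illustrative implementation of the formula, not mergekit's actual code:

```python
import numpy as np

def slerp(t, v0, v1, eps=1e-8):
    """Spherical linear interpolation between two weight tensors.

    Interpolates along the arc between v0 and v1 rather than the straight
    chord, which better preserves the magnitude structure of the weights.
    """
    v0f = v0.ravel().astype(np.float64)
    v1f = v1.ravel().astype(np.float64)
    # Angle between the two tensors, treated as flat vectors
    cos_omega = np.dot(v0f, v1f) / (np.linalg.norm(v0f) * np.linalg.norm(v1f))
    omega = np.arccos(np.clip(cos_omega, -1.0, 1.0))
    if omega < eps:
        # Nearly parallel tensors: fall back to plain linear interpolation
        return (1 - t) * v0 + t * v1
    sin_omega = np.sin(omega)
    return (np.sin((1 - t) * omega) / sin_omega) * v0 \
         + (np.sin(t * omega) / sin_omega) * v1
```

At `t = 0` this returns the base tensor unchanged and at `t = 1` the fine-tune's tensor; the schedules above vary `t` per layer and per tensor type.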

## Quantizations

| Quant | Size | Description |
|-------|------|-------------|
| Q4_K_M | 4.7 GB | Balanced quality and size (recommended) |
| Q5_K_M | 5.5 GB | Better quality, slightly larger |
| Q8_0 | 8.2 GB | Highest-quality quantization |

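As a rough sanity check on these sizes, file size divided by parameter count gives the effective bits per weight. A back-of-the-envelope estimate, assuming roughly 8.2B parameters for the model:

```python
PARAMS = 8.2e9  # Assumed approximate parameter count of Qwen3-8B

def bits_per_weight(file_gb):
    """Effective bits per weight for a GGUF file of the given size in GB."""
    return file_gb * 1e9 * 8 / PARAMS

for quant, size_gb in [("Q4_K_M", 4.7), ("Q5_K_M", 5.5), ("Q8_0", 8.2)]:
    print(f"{quant}: ~{bits_per_weight(size_gb):.1f} bits/weight")
```

This lands near 4.6, 5.4, and 8.0 bits per weight respectively, consistent with the mixed-precision K-quant schemes and 8-bit quantization the names suggest.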
## Usage

### llama.cpp
```bash
./llama-cli -m RemnantInstruct-8B-Q4_K_M.gguf -p "Write a story about..." -n 512
```

### Ollama
```bash
ollama run anthonym21/remnantinstruct-8b
```

### LM Studio
Download any GGUF file and load it directly in LM Studio.
## Merge Configuration

```yaml
slices:
  - sources:
      - model: Qwen/Qwen3-8B
        layer_range: [0, 36]
      - model: allura-org/remnant-qwen3-8b
        layer_range: [0, 36]
merge_method: slerp
base_model: Qwen/Qwen3-8B
parameters:
  t:
    - filter: self_attn
      value: [0, 0.5, 0.3, 0.7, 1]
    - filter: mlp
      value: [1, 0.5, 0.7, 0.3, 0]
    - value: 0.5
dtype: bfloat16
```
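Each five-entry `value` list is a gradient that gets stretched across the 36 layers to give every layer its own `t`. A minimal sketch of that per-layer schedule, assuming simple piecewise-linear interpolation between the anchor values (an assumption about mergekit's behavior, not its actual code):

```python
import numpy as np

def layer_schedule(values, n_layers=36):
    """Stretch a short gradient list into one t value per layer
    via piecewise-linear interpolation between evenly spaced anchors."""
    anchors = np.linspace(0, n_layers - 1, num=len(values))
    return np.interp(np.arange(n_layers), anchors, values)

attn_t = layer_schedule([0, 0.5, 0.3, 0.7, 1])   # self_attn blend per layer
mlp_t = layer_schedule([1, 0.5, 0.7, 0.3, 0])    # mlp blend per layer
```

Under this reading, layer 0's attention stays purely Qwen3-8B (`t = 0`) while its MLP is purely remnant (`t = 1`), the roles swap by the final layer, and `attn_t + mlp_t` is 1 everywhere: the complementary pattern described above.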
## License

Apache 2.0 (inherited from Qwen3-8B)