	---
	license: apache-2.0
	language:
	- en
	base_model:
	- prithivMLmods/GCIRS-Reasoning-1.5B-R1
	pipeline_tag: text-generation
	library_name: transformers
	tags:
	- text-generation-inference
	- code
	- math
	- RL
	- science
	---

	# GCIRS-Reasoning-1.5B-R1-GGUF

	> GCIRS-Reasoning-1.5B-R1 is a research-grade reasoning model fine-tuned from Qwen2.5-1.5B-Instruct, with a focus on non-fictional reasoning, factual consistency, and scientific depth. It was trained with reinforcement learning on the Big Reasoning Traces dataset from DeepSeek and targets complex analytical tasks that demand scientific rigor, in high-stakes or research environments.


	## Model Files

	| File Name | Format | Size | Precision | Use Case |
	|-----------|--------|------|-----------|----------|
	| `GCIRS-Reasoning-1.5B-R1.F32.gguf` | GGUF | 7.11 GB | F32 | Highest precision, research use |
	| `GCIRS-Reasoning-1.5B-R1.BF16.gguf` | GGUF | 3.56 GB | BF16 | High precision, balanced performance |
	| `GCIRS-Reasoning-1.5B-R1.F16.gguf` | GGUF | 3.56 GB | F16 | High precision, memory efficient |
	| `GCIRS-Reasoning-1.5B-R1.Q8_0.gguf` | GGUF | 1.89 GB | Q8_0 | Excellent quality, moderate compression |
	| `GCIRS-Reasoning-1.5B-R1.Q6_K.gguf` | GGUF | 1.46 GB | Q6_K | Very good quality, good compression |
	| `GCIRS-Reasoning-1.5B-R1.Q5_K_M.gguf` | GGUF | 1.29 GB | Q5_K_M | Balanced quality/size (recommended) |
	| `GCIRS-Reasoning-1.5B-R1.Q5_K_S.gguf` | GGUF | 1.26 GB | Q5_K_S | Good quality, smaller size |
	| `GCIRS-Reasoning-1.5B-R1.Q4_K_M.gguf` | GGUF | 1.12 GB | Q4_K_M | Good balance for most users |
	| `GCIRS-Reasoning-1.5B-R1.Q4_K_S.gguf` | GGUF | 1.07 GB | Q4_K_S | Decent quality, compact size |
	| `GCIRS-Reasoning-1.5B-R1.Q3_K_L.gguf` | GGUF | 980 MB | Q3_K_L | Lower quality, very compact |
	| `GCIRS-Reasoning-1.5B-R1.Q3_K_M.gguf` | GGUF | 924 MB | Q3_K_M | Fast inference, limited quality |
	| `GCIRS-Reasoning-1.5B-R1.Q3_K_S.gguf` | GGUF | 861 MB | Q3_K_S | Fastest inference, basic quality |
	| `GCIRS-Reasoning-1.5B-R1.Q2_K.gguf` | GGUF | 753 MB | Q2_K | Minimal size, experimental use |
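
As a rough aid for reading the table, the sketch below picks the largest listed file that fits a given size budget. The sizes are copied from the table (MB values converted to GB by dividing by 1000); the function name `largest_quant_within` and the `headroom_gb` parameter are hypothetical helpers introduced here, not part of any tool. File size is only a lower bound on memory use, since the runtime also needs room for the context (KV cache) and buffers, so some headroom is usually worth reserving.

```python
# File sizes copied from the table above, in GB (MB values divided by 1000).
QUANT_SIZES_GB = {
    "F32": 7.11,
    "BF16": 3.56,
    "F16": 3.56,
    "Q8_0": 1.89,
    "Q6_K": 1.46,
    "Q5_K_M": 1.29,
    "Q5_K_S": 1.26,
    "Q4_K_M": 1.12,
    "Q4_K_S": 1.07,
    "Q3_K_L": 0.980,
    "Q3_K_M": 0.924,
    "Q3_K_S": 0.861,
    "Q2_K": 0.753,
}


def largest_quant_within(budget_gb, headroom_gb=0.0):
    """Return the largest listed quant whose file fits in budget_gb - headroom_gb.

    Returns None when even the smallest file does not fit.
    headroom_gb is a hypothetical knob for context and runtime buffers.
    """
    limit = budget_gb - headroom_gb
    fitting = {name: size for name, size in QUANT_SIZES_GB.items() if size <= limit}
    if not fitting:
        return None
    # Largest file that still fits; ties resolve to the first entry in table order.
    return max(fitting, key=fitting.get)


if __name__ == "__main__":
    print(largest_quant_within(2.0, headroom_gb=0.5))
```

Choosing by size alone ignores speed and quality differences between formats of similar size, so treat the result as a starting point alongside the use-case column.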

	### Quick Selection Guide

	- For Research/Development: Use `F32` or `BF16` for maximum accuracy
	- For Production (Recommended): Use `Q5_K_M` or `Q6_K` for best quality/performance balance
	- For General Use: Use `Q4_K_M` or `Q4_K_S` for good performance
	- For Resource-Constrained Environments: Use `Q3_K_M` or `Q3_K_L`
	- For Edge Devices: Use `Q2_K` for minimal footprint
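
The guide above can be written down as a small lookup that turns a scenario into file names. This is a minimal sketch: the scenario keys (`research`, `production`, and so on) and the helper names are made up for illustration, and the repository id `prithivMLmods/GCIRS-Reasoning-1.5B-R1-GGUF` is an assumption based on the model name and author of this card. The optional `download` helper relies on the third-party `huggingface_hub` package and its `hf_hub_download` function, imported lazily so the rest of the snippet runs without it.

```python
# Assumed repository id (author + model name of this card).
REPO_ID = "prithivMLmods/GCIRS-Reasoning-1.5B-R1-GGUF"

# Quant choices per scenario, in the order given by the Quick Selection Guide.
# The scenario keys are hypothetical labels for the guide's bullet points.
RECOMMENDED = {
    "research": ["F32", "BF16"],
    "production": ["Q5_K_M", "Q6_K"],
    "general": ["Q4_K_M", "Q4_K_S"],
    "constrained": ["Q3_K_M", "Q3_K_L"],
    "edge": ["Q2_K"],
}


def gguf_filename(quant):
    """Build a file name following the pattern used in the Model Files table."""
    return f"GCIRS-Reasoning-1.5B-R1.{quant}.gguf"


def recommended_files(scenario):
    """Return the file names the guide suggests for a scenario."""
    return [gguf_filename(quant) for quant in RECOMMENDED[scenario]]


def download(scenario):
    """Download the first suggested file and return its local path.

    Needs network access and `pip install huggingface_hub`.
    """
    from huggingface_hub import hf_hub_download  # lazy third-party import

    return hf_hub_download(repo_id=REPO_ID, filename=recommended_files(scenario)[0])


if __name__ == "__main__":
    print(recommended_files("production"))
```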

	## Quants Usage

	(Files are sorted by size, which does not necessarily track quality. IQ-quants are often preferable to similarly sized non-IQ quants.)

	Here is a handy graph by ikawrakow comparing some lower-quality quant
	types (lower is better):

	![image.png](https://www.nethype.de/huggingface_embed/quantpplgraph.png)
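
To relate file size to precision, one rough approach is to estimate the average bits stored per weight by comparing each file with the F32 file, which uses 32 bits per weight. The sketch below assumes the sizes listed in the Model Files table and ignores GGUF metadata overhead, so the figures are approximations; the function name `approx_bits_per_weight` is a hypothetical helper.

```python
# Size of the F32 file from the Model Files table, in GB.
F32_SIZE_GB = 7.11


def approx_bits_per_weight(size_gb, f32_size_gb=F32_SIZE_GB):
    """Estimate average bits per weight from the file-size ratio to the F32 file.

    Assumes both files hold the same tensors and that metadata is negligible.
    """
    return 32.0 * size_gb / f32_size_gb


if __name__ == "__main__":
    # Sizes in GB, copied from the table.
    for name, size_gb in [("Q8_0", 1.89), ("Q4_K_M", 1.12), ("Q2_K", 0.753)]:
        print(f"{name}: about {approx_bits_per_weight(size_gb):.1f} bits per weight")
```

The estimate is a file-wide average: quantized files often keep some tensors at higher precision than the nominal format, which is one reason it can come out above the bit width in the format name.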