# Fine-tuned DeepSeek-R1-Distill-Qwen-14B
This Space hosts a fine-tuned version of the [unsloth/DeepSeek-R1-Distill-Qwen-14B-bnb-4bit](https://huggingface.co/unsloth/DeepSeek-R1-Distill-Qwen-14B-bnb-4bit) model.
## Model Details
- **Base Model**: `unsloth/DeepSeek-R1-Distill-Qwen-14B-bnb-4bit`
- **Fine-tuned on**: `phi4-cognitive-dataset`
- **Quantization**: Already 4-bit quantized (no additional quantization applied)
## Current Status
This Space is currently being prepared. The fine-tuned model will be available soon.
## Usage
Once deployed, you can interact with the model through the Gradio interface or via API.
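As a rough illustration of the API path, the sketch below builds a Gradio-style JSON payload for a text-generation request. The Space URL, the `/api/predict` route, and the payload layout (`{"data": [...]}` matching the interface's inputs) follow Gradio's generic API convention and are assumptions here; once the Space is live, check its "Use via API" tab for the real endpoint and input order.

```python
import json

# Hypothetical Space URL; replace with the real one once deployed.
SPACE_URL = "https://your-space.hf.space"


def build_request(prompt: str, max_new_tokens: int = 256) -> bytes:
    """Serialize a Gradio-style payload: inputs are passed as a 'data' list,
    in the same order as the interface's input components (assumed here to
    be [prompt, max_new_tokens])."""
    payload = {"data": [prompt, max_new_tokens]}
    return json.dumps(payload).encode("utf-8")


# Sending it would be a plain POST (network call, not executed here):
# import urllib.request
# req = urllib.request.Request(
#     SPACE_URL + "/api/predict",
#     data=build_request("Hello"),
#     headers={"Content-Type": "application/json"},
# )
```

The `gradio_client` package wraps this same exchange in a higher-level `Client(...).predict(...)` call, which is usually the more convenient option.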
## Training Process
The model is being fine-tuned with the following specifications:
- Training dataset processed in ascending order by `prompt_number`
- Custom training parameters optimized for the L40S GPU
- Mixed-precision training for faster throughput and lower memory use
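The ordering step above can be sketched in a few lines: records are sorted in ascending order by their `prompt_number` field before being fed to the trainer. The field name comes from this README, but the record layout shown is an assumption; the actual `phi4-cognitive-dataset` schema may differ.

```python
def order_for_training(records):
    """Return records sorted in ascending order by their prompt_number field."""
    return sorted(records, key=lambda r: r["prompt_number"])


# Hypothetical records illustrating the ordering.
example = [
    {"prompt_number": 3, "text": "third"},
    {"prompt_number": 1, "text": "first"},
    {"prompt_number": 2, "text": "second"},
]
# order_for_training(example) yields the records with prompt_number 1, 2, 3
```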
## Contact
For questions or issues, please reach out through the [Hugging Face community](https://huggingface.co/discussions).