# Fine-tuned DeepSeek-R1-Distill-Qwen-14B
This Space hosts a fine-tuned version of the [unsloth/DeepSeek-R1-Distill-Qwen-14B-bnb-4bit](https://huggingface.co/unsloth/DeepSeek-R1-Distill-Qwen-14B-bnb-4bit) model.
## Model Details
- **Base Model**: `unsloth/DeepSeek-R1-Distill-Qwen-14B-bnb-4bit`
- **Fine-tuned on**: `phi4-cognitive-dataset`
- **Quantization**: Already 4-bit quantized (no additional quantization applied)
## Current Status
This Space is currently being prepared. The fine-tuned model will be available soon.
## Usage
Once deployed, you can interact with the model through the Gradio interface or via API.
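As a rough illustration of the API path, the sketch below builds a Gradio-style JSON payload for a text-generation request. The Space URL, the `/api/predict` route, and the payload layout (`{"data": [...]}` matching the interface's inputs) follow Gradio's generic API convention and are assumptions here; once the Space is live, check its "Use via API" tab for the real endpoint and input order.

```python
import json

# Hypothetical Space URL; replace with the real one once deployed.
SPACE_URL = "https://your-space.hf.space"


def build_request(prompt: str, max_new_tokens: int = 256) -> bytes:
    """Serialize a Gradio-style payload: inputs are passed as a 'data' list,
    in the same order as the interface's input components (assumed here to
    be [prompt, max_new_tokens])."""
    payload = {"data": [prompt, max_new_tokens]}
    return json.dumps(payload).encode("utf-8")


# Sending it would be a plain POST (network call, not executed here):
# import urllib.request
# req = urllib.request.Request(
#     SPACE_URL + "/api/predict",
#     data=build_request("Hello"),
#     headers={"Content-Type": "application/json"},
# )
```

The `gradio_client` package wraps this same exchange in a higher-level `Client(...).predict(...)` call, which is usually the more convenient option.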
## Training Process
The model is being fine-tuned with the following specifications:
- Training dataset processed in ascending order by `prompt_number`
- Custom training parameters optimized for the L40S GPU
- Mixed-precision training for faster throughput and lower memory use
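The ordering step above can be sketched in a few lines: records are sorted in ascending order by their `prompt_number` field before being fed to the trainer. The field name comes from this README, but the record layout shown is an assumption; the actual `phi4-cognitive-dataset` schema may differ.

```python
def order_for_training(records):
    """Return records sorted in ascending order by their prompt_number field."""
    return sorted(records, key=lambda r: r["prompt_number"])


# Hypothetical records illustrating the ordering.
example = [
    {"prompt_number": 3, "text": "third"},
    {"prompt_number": 1, "text": "first"},
    {"prompt_number": 2, "text": "second"},
]
# order_for_training(example) yields the records with prompt_number 1, 2, 3
```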
## Contact
For questions or issues, please reach out through the [Hugging Face community](https://huggingface.co/discussions).