# Dharma-DeepScaleR-1.5B-Preview-Q4_K_M
This model is a Q4_K_M quantized version of DeepScaleR-1.5B-Preview, originally developed by Agentica. Quantization reduces the model's size while retaining most of its performance, making it more efficient to deploy in resource-constrained environments.
## Model Description
Dharma-DeepScaleR-1.5B-Preview-Q4_K_M is a 4-bit quantized variant of the DeepScaleR-1.5B-Preview model, optimized using the Q4_K_M quantization method. This quantization approach significantly reduces the model's memory footprint and inference time while preserving its core capabilities.
### Key Features
- **Efficient Resource Usage**: Q4_K_M quantization cuts weight storage by roughly 70–75% relative to the FP16 original (4-bit versus 16-bit weights, plus some overhead for scales and higher-precision tensors)
- **Faster Inference**: Enjoy reduced latency and higher throughput, particularly beneficial for production deployments
- **Maintained Performance**: Preserves most of the original model's capabilities with minimal degradation in quality
- **Indian Market Optimization**: Particularly suitable for deployment in the Indian market where computational resources may be constrained
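A back-of-the-envelope estimate shows where the memory saving comes from. The ~4.85 bits-per-weight figure for Q4_K_M is an approximation (4-bit weights plus per-block scales/mins and a few higher-precision tensors), not an exact file-size calculation:

```python
# Rough weight-storage estimate for a 1.5B-parameter model.
# 4.85 bits/weight is an approximate average for Q4_K_M.

PARAMS = 1.5e9

def size_gb(bits_per_weight: float) -> float:
    """Approximate weight storage in gigabytes."""
    return PARAMS * bits_per_weight / 8 / 1e9

fp16_gb = size_gb(16.0)   # full-precision FP16 baseline
q4km_gb = size_gb(4.85)   # approximate Q4_K_M average

reduction = 1 - q4km_gb / fp16_gb
print(f"FP16: {fp16_gb:.2f} GB, Q4_K_M: {q4km_gb:.2f} GB, "
      f"reduction: {reduction:.0%}")
# → FP16: 3.00 GB, Q4_K_M: 0.91 GB, reduction: 70%
```

Actual GGUF file sizes will differ somewhat, since non-weight tensors and metadata are not counted here.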
### Quantization Details
The model was quantized using the Q4_K_M method from llama.cpp's k-quant family, which:
- Stores weights at 4-bit precision in blocks of 32, grouped into super-blocks of 256
- Keeps a quantized scale and minimum per block, so each weight is reconstructed as roughly `scale * q + min`
- Retains higher precision (Q6_K) for particularly sensitive tensors; the "M" denotes this medium quality/size mix
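As a simplified illustration of the block-wise affine scheme that k-quants build on (not the exact Q4_K_M bit layout), each block of weights is mapped to 4-bit integers with a per-block scale and minimum:

```python
import numpy as np

def quantize_block(w: np.ndarray, bits: int = 4):
    """Affine block quantization: w ≈ scale * q + wmin, q in [0, 2**bits - 1]."""
    wmin = float(w.min())
    scale = (float(w.max()) - wmin) / (2**bits - 1)
    if scale == 0.0:  # flat block: all weights identical
        scale = 1.0
    q = np.round((w - wmin) / scale).astype(np.uint8)
    return q, scale, wmin

def dequantize_block(q: np.ndarray, scale: float, wmin: float) -> np.ndarray:
    """Reconstruct approximate float weights from the 4-bit codes."""
    return q.astype(np.float32) * scale + wmin

# One 32-weight block, matching the k-quant block size.
rng = np.random.default_rng(0)
block = rng.normal(size=32).astype(np.float32)
q, scale, wmin = quantize_block(block)
recon = dequantize_block(q, scale, wmin)
print("max abs error:", float(np.abs(block - recon).max()))
```

The reconstruction error per weight is bounded by half the block's scale, which is why narrow blocks with their own scale/min preserve quality far better than one scale for the whole tensor.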
## Intended Use
This quantized model is ideal for:
- Mobile and edge device deployments
- Cost-effective cloud implementations
- Applications requiring real-time responses
- Scenarios with limited computational resources
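As a sketch of one common deployment path, a GGUF quantization like this one can be loaded with the `llama-cpp-python` bindings. The file path and generation parameters below are placeholders, not values shipped with this model:

```python
from llama_cpp import Llama  # pip install llama-cpp-python

# Placeholder path: point this at wherever you saved the GGUF file.
llm = Llama(
    model_path="Dharma-DeepScaleR-1.5B-Preview-Q4_K_M.gguf",
    n_ctx=2048,  # context length; adjust to your use case
)

out = llm("Solve step by step: what is 12 * 7?", max_tokens=128)
print(out["choices"][0]["text"])
```

Any runtime that supports GGUF k-quants (llama.cpp, Ollama, etc.) should work similarly.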
## Limitations
- Slight reduction in performance compared to the full-precision model
- May show degradation in handling complex or nuanced language tasks
- Not recommended for tasks requiring extremely high precision
## Training and Quantization Process
This model is a quantized version of DeepScaleR-1.5B-Preview. The original model's architecture and training methodology remain unchanged. The quantization process focused on optimizing the model for deployment efficiency while minimizing performance loss.
## License
This model is licensed under the Apache 2.0 license, which allows for both commercial and non-commercial use with proper attribution.
## Base Model
```yaml
base_model: agentica-org/DeepScaleR-1.5B-Preview
```
## Additional Metadata
```yaml
license: apache-2.0
tags:
- quantized
- q4_k_m
- efficient-inference
- indian-market
- llm
```