# BLOOMZ-1B1 Fine-tuned with Ray Train + DeepSpeed ZeRO-3

This model is a fine-tuned version of bigscience/bloomz-1b1, trained with Ray Train and DeepSpeed ZeRO-3 for scalable distributed training with minimal configuration overhead.

It was fine-tuned on the IMDB dataset using 2 × NVIDIA T4 (16 GB) GPUs, reducing training loss by roughly 11% (from ~3.6 to ~3.2), with Ray handling the distributed orchestration automatically.
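
The setup described above can be sketched as follows: Ray Train's `TorchTrainer` launches one worker per GPU, and each worker wraps the model in a DeepSpeed ZeRO-3 engine that shards parameters, gradients, and optimizer state. This is a minimal illustrative sketch, not the exact training script; the `train_func` internals, batch size, learning rate, and other DeepSpeed config values are assumptions, and the full implementation lives in the project repository.

```python
import deepspeed
from ray.train import ScalingConfig
from ray.train.torch import TorchTrainer
from transformers import AutoModelForCausalLM


def train_func():
    # Each Ray worker loads the base model; DeepSpeed then shards parameters,
    # gradients, and optimizer state across workers (ZeRO stage 3).
    model = AutoModelForCausalLM.from_pretrained("bigscience/bloomz-1b1")
    ds_config = {
        "train_micro_batch_size_per_gpu": 1,       # assumed value
        "fp16": {"enabled": True},                 # matches the F16 tensor type
        "zero_optimization": {"stage": 3},         # ZeRO-3 sharding
        "optimizer": {"type": "AdamW", "params": {"lr": 2e-5}},  # assumed lr
    }
    engine, optimizer, _, _ = deepspeed.initialize(
        model=model,
        model_parameters=model.parameters(),
        config=ds_config,
    )
    # Tokenized IMDB batches would be fed to the engine here, e.g.:
    #   loss = engine(**batch).loss; engine.backward(loss); engine.step()


trainer = TorchTrainer(
    train_func,
    scaling_config=ScalingConfig(num_workers=2, use_gpu=True),  # 2 x T4 GPUs
)
result = trainer.fit()
```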

For detailed implementation, Ray configurations, and distributed training setup, please check out the project repository.
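
Once published, the checkpoint can be loaded like any other causal LM on the Hub. A minimal usage sketch; the repo id below is hypothetical, so substitute the actual model id:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "your-username/bloomz-1b1-imdb"  # hypothetical repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.float16)

inputs = tokenizer("This movie was absolutely", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```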

Model format: Safetensors · Model size: 1B params · Tensor type: F16