Llama-3.2-1B Fine-tuned with DeepSpeed Pipeline Parallelism
This model is a fine-tuned version of meta-llama/Llama-3.2-1B-Instruct, trained with DeepSpeed's pipeline parallelism, which distributes the model's layers across multiple GPUs and overlaps computation between stages for efficient training.
It was fine-tuned on the arxiv-abstract-dataset using 2 × T4 16 GB GPUs with a 2-stage pipeline partition.
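The setup can be reproduced in broad strokes with DeepSpeed's PipelineModule. The sketch below is an assumption-laden illustration rather than the exact training script: the layer flattening, partition method, and hyperparameters are placeholders, and in practice the raw Hugging Face submodules need thin wrappers so each pipeline stage consumes and returns a plain tensor (or tuple of tensors).

```python
# Minimal sketch: 2-stage pipeline-parallel fine-tuning with DeepSpeed.
# Launch with: deepspeed --num_gpus=2 train.py
# All hyperparameters below are illustrative assumptions, not the exact
# settings used for this model.
import deepspeed
from deepspeed.pipe import PipelineModule
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.2-1B-Instruct")

# Flatten the model into an ordered list of layers so DeepSpeed can cut it
# into pipeline stages. Each Hugging Face submodule would need a thin
# wrapper (omitted here) so that stage boundaries pass a single tensor.
layers = [
    base.model.embed_tokens,
    *base.model.layers,
    base.model.norm,
    base.lm_head,
]

# num_stages=2 matches the 2-GPU setup: stage 0 on GPU 0, stage 1 on GPU 1.
pipe = PipelineModule(layers=layers, num_stages=2, partition_method="parameters")

engine, _, _, _ = deepspeed.initialize(
    model=pipe,
    config={
        "train_micro_batch_size_per_gpu": 1,
        "gradient_accumulation_steps": 8,   # micro-batches kept in flight
        "fp16": {"enabled": True},          # T4s support fp16, not bf16
        "optimizer": {"type": "AdamW", "params": {"lr": 2e-5}},
    },
)

# engine.train_batch(data_iter=...) then drives one full pipeline step,
# overlapping forward and backward computation across the two stages.
```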
For detailed implementation, pipeline configuration, and checkpoint conversion instructions, please check out the project repository.
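Once the pipeline checkpoint has been converted to the standard Hugging Face format (see the repository for the conversion steps), the model loads like any other causal LM. A minimal sketch follows; the repo id is a placeholder assumption, so substitute the actual model id from this page.

```python
# Minimal inference sketch with transformers. "your-username/..." is a
# placeholder repo id, not the real one.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "your-username/llama-3.2-1b-deepspeed-pp"  # placeholder
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

prompt = "Summarize the following abstract: ..."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```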