Link model to paper and improve model card

#1
by nielsr (HF Staff) - opened

Hi! I'm Niels from the Hugging Face community team.

This PR improves your model card by linking it to the corresponding paper page on Hugging Face: "Pretraining A Large Language Model using Distributed GPUs: A Memory-Efficient Decentralized Paradigm."

I've also added the author list and more context about the SPES (SParse Expert Synchronization) framework, to help users understand the decentralized training approach behind this 9B MoE model.

zjr2000 changed pull request status to merged
