Update README.md
README.md CHANGED

@@ -12,7 +12,7 @@ This repository contains a series of reference models of varying sizes, released
## 💡 Overview

-The core contribution of our paper is the concept of **LLM Density** \\(\rho\\), defined as the ratio of a model's *effective* parameter size \\(\
+The core contribution of our paper is the concept of **LLM Density** \\(\rho\\), defined as the ratio of a model's *effective* parameter size \\(\hat{N}\\) to its *actual* parameter size \\(N\\). To accurately determine a model's effective size, we must first establish a reliable "ruler"—a scaling law that maps training compute to performance on downstream tasks.

The models in this repository serve as that "ruler". We trained a series of six models, ranging from **5 million to 800 million parameters**, on a consistent dataset. By measuring their loss on various benchmarks, we fitted a precise scaling function. This function allows us to take any other LLM, measure its performance, and infer its effective parameter size by seeing where it lands on our reference scale.
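
For illustration, here is a minimal sketch of how a fitted scaling "ruler" can be inverted to estimate density. It assumes a plain power law \\(L(N) = (N / N_c)^{-\alpha}\\) and uses placeholder loss values; the paper's actual scaling function, fitting procedure, and measurements are not reproduced here.

```python
import numpy as np

# A minimal sketch assuming a simple power law L(N) = (N / N_c) ** (-alpha).
# The paper's actual scaling function may differ; this only illustrates how a
# fitted "ruler" is inverted to estimate effective size and density.

# Parameter counts of the six reference models (5M to 800M, per the README).
ref_sizes = np.array([5e6, 2e7, 5e7, 1.5e8, 4e8, 8e8])

# Placeholder benchmark losses -- illustrative stand-ins, NOT measured numbers.
ref_losses = np.array([3.80, 3.35, 3.05, 2.78, 2.55, 2.42])

# A power law is a line in log-log space:
#   log L = -alpha * log N + alpha * log N_c,
# so a degree-1 polynomial fit recovers both constants.
slope, intercept = np.polyfit(np.log(ref_sizes), np.log(ref_losses), deg=1)
alpha = -slope
log_n_c = intercept / alpha  # since intercept = alpha * log(N_c)

def effective_params(loss: float) -> float:
    """Invert the fitted law: the reference-scale size that attains `loss`."""
    return float(np.exp(log_n_c - np.log(loss) / alpha))

# Density rho = N_hat / N for a hypothetical 1B-parameter model whose
# measured benchmark loss is 2.30 (an illustrative value, not a real result).
actual_n, measured_loss = 1e9, 2.30
n_hat = effective_params(measured_loss)
print(f"effective params ~ {n_hat:.3g}, density rho ~ {n_hat / actual_n:.2f}")
```

Fitting in log-log space keeps the regression linear and numerically stable; any other functional form would be inverted the same way, by solving the fitted law for \\(N\\) at the measured loss.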