Update README.md (#1)
opened by axay

README.md CHANGED
@@ -12,7 +12,7 @@ pipeline_tag: text-generation
 
 # Qwen3-1.7B (from-scratch, 41B-token pretrain)
 
-A 1.7B-parameter decoder-only transformer (Qwen3 family) pre-trained **from scratch** on ~**
+A 1.7B-parameter decoder-only transformer (Qwen3 family) pre-trained **from scratch** on ~**40B tokens** of multi-domain text with **BF16 mixed precision** and a **4,096-token** context. Checkpoints are provided in standard Hugging Face format for easy inference and fine-tuning.
 
 ---
 
@@ -30,7 +30,7 @@ A 1.7B-parameter decoder-only transformer (Qwen3 family) pre-trained **from scra
 ### Model Sources
 
 - **Repository:** https://huggingface.co/qvac/genesisI-model
-- **Paper / Blog :**
+- **Paper / Blog :** https://huggingface.co/blog/qvac/genesis-i
 
 ---
 
@@ -104,7 +104,7 @@ print(tok.decode(out[0], skip_special_tokens=True))
 
 ### Training Data
 
-* **Size:** ~**
+* **Size:** ~**40B tokens**, single epoch.
 * **Domains:** Mixed general + STEM/technical sources (expository text, problem sets, references).
 * **Format:** Hugging Face Datasets (Arrow).
 * **Tokenizer:** **Qwen3** tokenizer.
@@ -262,27 +262,7 @@ srun -N 60 -n 480 --ntasks-per-node=8 --gpus-per-task=1 \
 
 ---
 
-## Citation
-
-If you use this model, please cite:
-
-**BibTeX**
-
-Xxxx
-
-**APA**
-
-xxxxxx
-
----
-
-
-## Model Card Authors
-
-XXXYYYYZZZ
-
----
 
 ## Changelog
 
-* **v0.1 (
+* **v0.1 (2025-11-17):** Initial public release — 40B-token 1-epoch pretrain; HF conversion.