---
library_name: transformers
tags:
- language-model
license: odc-by
datasets:
- HuggingFaceFW/fineweb-edu
language:
- en
---
# Model Card for AICrossSim/bitflip-clm-600m
A 600M-parameter bitflip-aware language model trained on `22 * 600M` (≈13.2B) tokens from the FineWeb-Edu dataset.
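The quoted token budget appears to follow a ~22-tokens-per-parameter (Chinchilla-style) ratio. A quick sanity check of the arithmetic:

```python
# Sanity check of the training-token budget quoted above:
# 22 tokens per parameter x 600M (non-embedding) parameters.
params = 600_000_000
tokens_per_param = 22
total_tokens = tokens_per_param * params
print(f"{total_tokens / 1e9:.1f}B training tokens")  # 13.2B training tokens
```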
## Model Details
bitflip-aixsim-600M is a transformer-based language model with approximately 600 million parameters (excluding embedding-layer parameters).
It uses RMSNorm for normalization and is trained on the FineWeb-Edu dataset.
- **Developed by:** AICrossSim
- **Funded by:** [ARIA](https://www.aria.org.uk/)
- **Model type:** Transformer Language Model
- **Language(s) (NLP):** English
- **Tokenizer:** [HuggingFaceTB/cosmo2-tokenizer](https://huggingface.co/HuggingFaceTB/cosmo2-tokenizer)
- **Repository:** [AICrossSim/NewComputeBench](https://github.com/AICrossSim/NewComputeBench)
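Since the card declares `library_name: transformers`, the checkpoint should load with the standard `Auto*` classes. A minimal sketch, assuming the Hub repo id from the card title (`AICrossSim/bitflip-clm-600m`) and default generation settings:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hub repo id taken from the model card title (assumption: weights are hosted there).
model_id = "AICrossSim/bitflip-clm-600m"

def generate(prompt: str, max_new_tokens: int = 50) -> str:
    """Download the model and return a greedy continuation of `prompt`."""
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)
    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)

if __name__ == "__main__":
    print(generate("The Fibonacci sequence begins"))
```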
## Training Details
The experiment setup and training logs are available in this [wandb run](https://wandb.ai/cz98/torchtitan/runs/wvjj4xaf?nw=nwusercz98).