Dango🍡

Dango is a large language model (LLM) trained on an extensively filtered Japanese corpus.
It is intended primarily for research on second language acquisition, but it can also be used as a Japanese speaker simulator.

In this repository, we release the checkpoint trained on 100B tokens.
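
To try the model, the snippet below is a minimal generation sketch. It assumes the checkpoint is hosted on the Hugging Face Hub and loads with the transformers library; "username/dango" is a placeholder for the actual repository id, and the sampling settings are illustrative rather than recommended values.

```python
# Minimal generation sketch. Assumes the checkpoint loads with Hugging Face
# transformers; "username/dango" is a placeholder for the actual repo id.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "username/dango"  # placeholder: replace with the actual repository id

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(repo_id, torch_dtype=torch.bfloat16)

# Dango is a pretrained (not instruction-tuned) model, so prompt it with
# Japanese text to continue rather than with chat-style instructions.
prompt = "日本語を勉強するときに大切なのは、"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64, do_sample=True, top_p=0.9)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```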

Model Details

Dango is a 2B-parameter model, architecturally comparable to the llm-jp-3 family, which adopts a Llama 2-style decoder architecture. The released weights are stored in BF16 in the Safetensors format.
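
As a quick sanity check, the architecture can be inspected from the model config without downloading the weights. This is a minimal sketch assuming the checkpoint ships a Llama-type config, as llm-jp-3 models do; "username/dango" is again a placeholder repository id.

```python
# Inspect the model configuration without downloading the weights.
# Assumes a Llama-type config as in llm-jp-3; "username/dango" is a placeholder.
from transformers import AutoConfig

config = AutoConfig.from_pretrained("username/dango")
print(config.model_type)                              # expected "llama" if the llm-jp-3 convention holds
print(config.num_hidden_layers, config.hidden_size)   # decoder depth and width
```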

Please visit our GitHub page for the filtering and training code: https://github.com/mattashiho233/dango

License

Citation

If you use Dango in your research, please cite:

@inproceedings{matta2026anlp,
  author    = {Shiho Matta and Yin Jou Huang and Fei Cheng and Takashi Kodama and Hirokazu Kiyomaru and Yugo Murawaki},
  title     = {Pretraining a Japanese-Only Large Language Model for Studying Second Language Acquisition},
  booktitle = {Proceedings of the Thirty-second Annual Meeting of the Association for Natural Language Processing},
  year      = {2026},
  pages     = {3225--3230},
  publisher = {Association for Natural Language Processing},
  address   = {Utsunomiya, Japan}
}