Access Request for Research-Only Model

Please provide your professional details and acknowledge the terms of use to request access.

By requesting access, you acknowledge that this model is provided solely for research purposes, is offered "as is" without any guarantees, and may not be used for for-profit or commercial applications.


Racka-4B-GGUF Model Card


Racka

Racka (Regionális Adatokon Célzottan Kialakított Alapmodell, roughly "a foundation model purpose-built on regional data") is a continually pretrained large language model designed to bridge the resource gap between Hungarian and high-resource languages. It employs parameter-efficient continual pretraining via Low-Rank Adaptation (LoRA) on a Qwen3-4B (reasoning/instruct) backbone.
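As a rough illustration of this setup, the sketch below attaches LoRA adapters to the Qwen3-4B base using the standard transformers and peft APIs. The rank, scaling factor, and target modules are illustrative assumptions, not the authors' actual training configuration.

from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

# Load the backbone; only the small low-rank adapter matrices will be trained.
base = AutoModelForCausalLM.from_pretrained("Qwen/Qwen3-4B-Base")
lora_cfg = LoraConfig(
    r=64,                # illustrative adapter rank (assumption)
    lora_alpha=128,      # illustrative scaling factor (assumption)
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumed attention projections
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, lora_cfg)
model.print_trainable_parameters()  # confirms only the adapters are trainable
# Continual pretraining then proceeds with the usual causal LM objective
# over the multilingual token mixture described below.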

The model was trained on a mixture of 160B tokens (44% Hungarian, 24% German, 21% English, 11% code) on the Komondor HPC. To better match the training distribution, Racka uses an adapted tokenizer that achieves substantially improved tokenization fertility (fewer tokens per word) for Hungarian while maintaining competitive performance in English and German.
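Fertility here is the average number of tokens a tokenizer emits per word, so lower is better. A minimal sketch of such a comparison, assuming access to both repositories (Racka is gated) and using a toy one-sentence sample rather than a proper evaluation corpus:

from transformers import AutoTokenizer

def fertility(tokenizer, text: str) -> float:
    # Tokens emitted per whitespace-separated word (lower is better).
    return len(tokenizer.tokenize(text)) / len(text.split())

# "Hungarian words often break into many subtokens during tokenization."
sample = "A magyar szavak gyakran sok résztokenre esnek szét a tokenizálás során."
for repo in ("Qwen/Qwen3-4B", "elte-nlp/Racka-4B"):
    tok = AutoTokenizer.from_pretrained(repo)
    print(f"{repo}: {fertility(tok, sample):.2f}")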

Model Details

Benchmarks

  • WIP

Limitations

  • The model is capable of both instruction-following chat and English reasoning using the original Qwen settings; this capability is preserved from the backbone, with no direct training targeting it (a usage sketch follows this list).
  • The model has not been safety-aligned and is not safe for deployment to end users.
  • The model may only be used for research purposes; commercial or for-profit use is not permitted.
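For research experiments with the GGUF build, a minimal chat sketch with llama-cpp-python is shown below; it relies on the preserved Qwen chat capability noted above, and the quantization filename is a hypothetical placeholder for whichever file you downloaded.

from llama_cpp import Llama

llm = Llama(
    model_path="Racka-4B-Q4_K_M.gguf",  # hypothetical filename: use your downloaded quant
    n_ctx=4096,
)
out = llm.create_chat_completion(
    messages=[
        # "Briefly introduce the Racka sheep breed."
        {"role": "user", "content": "Mutasd be röviden a racka juhfajtát."}
    ],
    max_tokens=256,
)
print(out["choices"][0]["message"]["content"])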

Team

In alphabetical order:

  • Zsolt Csibi (ELTE-IK, AI Dept.)
  • Bence Gortka (ELTE-BTK, DH-Lab)
  • Natabara Gyöngyössy (ELTE-IK, AI Dept.)
  • Kornél Nagy (ELTE-BTK, DH-Lab)
  • Dávid Nemeskey (ELTE-BTK, DH-Lab)
  • Gábor Palkó (ELTE-BTK, DH-Lab)
  • Martin Sallai (ELTE-BTK, DH-Lab)
  • András Simonyi (ELTE-IK, AI Dept.)
  • András Szekeres (ELTE-BTK, DH-Lab)

Acknowledgements

We acknowledge the Digital Government Development and Project Management Ltd. for awarding us access to the Komondor HPC facility based in Hungary.

This research was supported by the EKÖP-24 University Excellence Scholarship Program of the Ministry for Culture and Innovation, funded by the National Research, Development and Innovation Fund.

The authors acknowledge the support of the National Laboratory for Digital Heritage. Project no. 2022-2.1.1-NL-2022-00009 has been implemented with the support provided by the Ministry of Culture and Innovation of Hungary from the National Research, Development and Innovation Fund, financed under the 2022-2.1.1-NL funding scheme.

We would like to thank Levente Szabados for the name idea and initial informal discussions.

Citation

@inproceedings{racka2026,
  title={Racka: Efficient Hungarian LLM Adaptation on Academic Infrastructure},
  author={Csibi, Zsolt and Gortka, Bence Gy\"orgy and Nagy, Korn\'el and Nemeskey, D\'avid M\'ark and Sallai, Martin and Simonyi, Andr\'as and Szekeres, Andr\'as M\'ark and Palk\'o, G\'abor},
  booktitle={Proceedings of the XXII. Hungarian Computational Linguistics Conference},
  year={2026}
}

GGUF Details

  • Model size: 4B parameters
  • Architecture: qwen3
  • Available quantizations: 3-bit, 4-bit, 5-bit, 8-bit, 16-bit
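A single quantized file can be fetched with huggingface_hub, assuming access to the gated repository has already been granted and a token is configured; the filename below is a hypothetical placeholder.

from huggingface_hub import hf_hub_download

path = hf_hub_download(
    repo_id="elte-nlp/Racka-4B-GGUF",
    filename="Racka-4B-Q4_K_M.gguf",  # hypothetical: substitute the actual quant file name
)
print(path)  # local cache path, ready to pass to llama.cpp or llama-cpp-python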


Model tree for elte-nlp/Racka-4B-GGUF

  • Base model: Qwen/Qwen3-4B-Base
  • Finetuned: Qwen/Qwen3-4B
  • Finetuned: elte-nlp/Racka-4B
  • Quantized: elte-nlp/Racka-4B-GGUF (this model)