
SteuerLLM: Local specialized large language model for German tax law analysis

SteuerLLM is a domain-adapted Large Language Model (LLM) with 28 billion parameters, designed specifically for German tax law analysis. It was introduced in the paper "SteuerLLM: Local specialized large language model for German tax law analysis" (arXiv:2602.11081).

The model excels in domains governed by strict formal rules, precise terminology, and legally binding structures, such as tax law, where correct answers require exact statutory citation, structured legal argumentation, and numerical accuracy.

Model Description

SteuerLLM is based on an expanded Mistral Small architecture (extended from 24B to 28B parameters through a block expansion method). It was trained on a large-scale synthetic dataset generated from authentic German university tax law examination material using a controlled retrieval-augmented pipeline.
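
The paper's exact expansion procedure is not reproduced here; the following is a minimal sketch of one common block-expansion recipe (in the style of LLaMA Pro), in which copies of existing decoder blocks are interleaved with their residual-stream output projections zeroed, so the expanded model initially computes the same function as the base model. The attribute names (`self_attn.o_proj`, `mlp.down_proj`) follow Hugging Face's Mistral implementation; the insertion interval is illustrative.

```python
import copy
import torch

def expand_blocks(layers, insert_every=6):
    """Interleave identity-initialized copies of existing decoder blocks.

    Generic block-expansion sketch: each inserted block is a deep copy of
    its predecessor with its residual-stream output projections zeroed,
    so the expanded network starts out computing exactly the same
    function as the base model; only the new blocks are then trained.
    """
    expanded = torch.nn.ModuleList()
    for i, layer in enumerate(layers):
        expanded.append(layer)
        if (i + 1) % insert_every == 0:
            new_layer = copy.deepcopy(layer)
            # Zero the projections that write into the residual stream,
            # making the new block a no-op at initialization.
            torch.nn.init.zeros_(new_layer.self_attn.o_proj.weight)
            torch.nn.init.zeros_(new_layer.mlp.down_proj.weight)
            expanded.append(new_layer)
    return expanded
```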

The training procedure follows a two-stage approach:

  1. Continual Pretraining: The base model's representations are adapted to tax-specific terminology and concepts by pretraining on domain-filtered web data.
  2. Instruction Fine-tuning: The model is then fine-tuned on synthetically generated question-answer pairs derived from primary German legal sources (e.g., EStG, AO, KStG) using the "Water Fountain Algorithm." This algorithm employs retrieval-augmented generation with semantic ranking to ensure factual grounding and contextual relevance; a generic sketch of such a retrieval step follows below.
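
The Water Fountain Algorithm itself is detailed in the paper. As a rough illustration of the retrieval-augmented step it relies on, the sketch below ranks statute passages by cosine similarity to an exam question and prompts a generator with the top-ranked passages; `embed` and `generate` are hypothetical placeholders for a sentence-embedding model and a generator LLM.

```python
import numpy as np

def embed(text: str) -> np.ndarray:
    """Hypothetical placeholder: swap in a real sentence-embedding model."""
    rng = np.random.default_rng(abs(hash(text)) % 2**32)
    return rng.standard_normal(384)

def generate(prompt: str) -> str:
    """Hypothetical placeholder: swap in a real generator LLM."""
    return "(generated answer grounded in the retrieved statutes)"

def qa_pair(question: str, statutes: list[str], k: int = 3) -> dict:
    """Build one grounded QA pair via retrieval and semantic ranking."""
    q = embed(question)
    # Rank statute passages by cosine similarity to the question.
    scores = [
        float(np.dot(q, e) / (np.linalg.norm(q) * np.linalg.norm(e)))
        for e in (embed(s) for s in statutes)
    ]
    top = [statutes[i] for i in np.argsort(scores)[::-1][:k]]
    prompt = (
        "Answer the exam question using only the statutes below, "
        "citing each provision you rely on.\n\n"
        + "\n".join(top)
        + "\n\nQuestion: " + question
    )
    return {"question": question, "context": top, "answer": generate(prompt)}
```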

SteuerLLM consistently outperforms general-purpose instruction-tuned models of comparable size and, in several cases, substantially larger systems, demonstrating the critical role of domain-specific data and architectural adaptation for performance on realistic legal reasoning tasks.

Evaluation

The model's performance was evaluated using SteuerEx, the first open benchmark derived from authentic German university tax law examinations. SteuerEx comprises 115 expert-validated examination questions spanning six core tax law domains and multiple academic levels, utilizing a statement-level, partial-credit evaluation framework.
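
The precise SteuerEx scoring rules are defined in the paper; the function below is only a minimal sketch of the statement-level, partial-credit idea: a reference answer is decomposed into atomic statements, and credit is the fraction of those statements the model's answer covers.

```python
def partial_credit(predicted: set[str], reference: set[str]) -> float:
    """Statement-level partial credit: the fraction of reference
    statements the model's answer covers (1.0 = fully correct)."""
    if not reference:
        return 0.0
    return len(predicted & reference) / len(reference)

# Example: 2 of 3 required statements present -> credit of about 0.67.
score = partial_credit(
    predicted={"§ 15 EStG applies", "income is commercial"},
    reference={"§ 15 EStG applies", "income is commercial", "trade tax due"},
)
```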

Usage

Because it is based on the Mistral architecture, SteuerLLM can be served with standard frameworks such as Transformers, vLLM, and SGLang.

Recommended Inference Parameters:

  • Temperature: 0.3
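
A minimal Transformers sketch, assuming the checkpoint ships a chat template and loading the published BF16 weights; the model id is taken from this repository:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "windprak/open_steuerllm"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

messages = [
    {"role": "user",
     "content": "Wie werden Einkünfte aus Gewerbebetrieb nach § 15 EStG abgegrenzt?"}
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# Sample with the recommended temperature of 0.3.
outputs = model.generate(inputs, max_new_tokens=512, do_sample=True, temperature=0.3)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```

For serving, the same checkpoint can be exposed with vLLM via `vllm serve windprak/open_steuerllm`, with temperature set per request.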

Citation

If you use this work, please cite:

@article{steuerllm,
  author = {Wind, Sebastian and Sopa, Jeta and Schmid, Laurin and Jackl, Quirin and Kiefer, Sebastian and Wu, Fei and Mayr, Martin and Köstler, Harald and Wellein, Gerhard and Maier, Andreas and Tayebi Arasteh, Soroosh},
  title = {SteuerLLM: Local specialized large language model for German tax law analysis},
  year = {2026},
  journal = {arXiv preprint arXiv:2602.11081},
  url = {https://arxiv.org/abs/2602.11081}
}

License

CC BY-NC 4.0. Research and academic use only.
