TaxBERT-Sentence-Classification

This repository accompanies the paper: Hechtner, F., Schmidt, L., Seebeck, A., & Weiß, M. (2026). How to design and employ specialized large language models for accounting and tax research: The example of TaxBERT. TaxBERT is a domain-adapted RoBERTa model designed to analyze qualitative corporate tax disclosures.

SSRN: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=5146523

The paper provides an ‘A-to-Z’ description of how to design and employ specialized Bidirectional Encoder Representations from Transformers (BERT) models that are environmentally sustainable and practically feasible for accounting and tax researchers.

GitHub: https://github.com/TaxBERT/TaxBERT

Intended Use: This model is intended for sentence-level classification tasks in the context of qualitative corporate tax disclosures. It can be used to classify individual disclosure sentences into predefined categories. Performance outside this domain may be limited and should be validated separately before use.
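The sentence-level usage described above can be sketched with the Hugging Face `transformers` text-classification pipeline. This is a minimal sketch, not the authors' reference code: the Hub ID in `MODEL_ID` is a placeholder assumption, the helper names `load_classifier` and `classify_sentences` are hypothetical, and the label names returned depend on the model's configuration, which this card does not list.

```python
from typing import Callable, Dict, List

# Assumption: placeholder Hub ID. Replace it with the actual model ID of this repository.
MODEL_ID = "TaxBERT/TaxBERT-Sentence-Classification"


def load_classifier(model_id: str = MODEL_ID):
    """Build a text-classification pipeline.

    Requires the `transformers` package and a backend such as PyTorch.
    The import is done lazily so the rest of the module works without them.
    """
    from transformers import pipeline

    return pipeline("text-classification", model=model_id)


def classify_sentences(
    sentences: List[str],
    classifier: Callable[[List[str]], List[Dict]],
) -> List[Dict]:
    """Classify each non-empty sentence and return sentence, label, and score.

    `classifier` is any callable that maps a list of strings to a list of
    dicts with "label" and "score" keys, which is the shape a
    text-classification pipeline returns for a list input.
    """
    # Drop empty entries and surrounding whitespace before classification.
    cleaned = [s.strip() for s in sentences if s and s.strip()]
    if not cleaned:
        return []
    predictions = classifier(cleaned)
    return [
        {"sentence": s, "label": p["label"], "score": float(p["score"])}
        for s, p in zip(cleaned, predictions)
    ]


# Example usage (downloads the model, so it is left commented out):
# clf = load_classifier()
# for row in classify_sentences(["The effective tax rate decreased due to tax credits."], clf):
#     print(row["label"], round(row["score"], 3), row["sentence"])
```

Passing the classifier in as an argument keeps the helper easy to test with a stub and makes it simple to validate the model on out-of-domain text before relying on its predictions.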

If this guide or repository is used for academic or scientific purposes, please cite the paper: Hechtner, F., Schmidt, L., Seebeck, A., & Weiß, M. (2026). How to design and employ specialized large language models for accounting and tax research: The example of TaxBERT.

Model size: 82.4M parameters (Safetensors). Tensor type: F32.