---
title: README
emoji: 🏢
colorFrom: gray
colorTo: indigo
sdk: static
pinned: false
short_description: Description page of AIaLT-IICT organization
---

# Artificial Intelligence and Language Technologies Department at IICT-BAS

Welcome to the Hugging Face organization page of the **Artificial Intelligence and Language Technologies Department** at the **Institute of Information and Communication Technologies, Bulgarian Academy of Sciences**!

The department focuses on developing language resources, theoretical machine learning, information retrieval, speech recognition and generation, and language model development for Bulgarian NLP applications.

## Models

This organization offers openly available pre-trained language models designed for the Bulgarian language:
- ModernBERT-based models
  - base (149M) and large (395M) variants with an 8192-token context length
- BERT-based models
  - base (124M) and large (355M) variants, each in cased and uncased versions
  - extra-large variant (859M)
- T5-based models
  - 403M and 1.1B variants
  - a 470M variant with character-level tokenization, suited to spelling-correction tasks