diaslmb committed on
Commit 5f003a2 · verified · 1 Parent(s): fa0f4cd

Update README.md

Files changed (1)
  1. README.md +8 -18
README.md CHANGED
@@ -5,33 +5,23 @@ colorFrom: blue
  colorTo: indigo
  sdk: docker
  pinned: false
- app_file: app.py
  ---

- # Inflexion Lab
+ # Welcome to Inflexion Lab 👋

- **Advancing State-of-the-Art NLP for the Kazakh Language**
+ This is the organization of **Inflexion Lab**, an AI research and development group dedicated to advancing state-of-the-art **Natural Language Processing (NLP)** and **Automatic Speech Recognition (ASR)** for the Kazakh language.

- Inflexion Lab is an AI research and development group dedicated to solving the challenges of low-resource language processing. Our primary focus is building robust, industrial-grade Natural Language Processing (NLP) and Automatic Speech Recognition (ASR) systems for the Kazakh language and the Central Asian region.
+ In this organization, we continuously release fine-tuned foundation models (such as Whisper), syntactically structured open-source datasets, and tools designed to bridge the digital divide for low-resource languages. Our primary focus is solving complex linguistic challenges, including code-switching and punctuation restoration, to build industrial-grade AI infrastructure for Central Asia.

- ### 🎯 Our Mission
- To bridge the digital divide for the Kazakh language by developing open-source models, datasets, and tools that enable seamless human-AI interaction. We combine advanced Deep Learning techniques with linguistic precision to create models that truly understand the nuances of the language.
-
- ### 🔬 Key Areas of Research
- * **Automatic Speech Recognition (ASR):** Fine-tuning large-scale models (Whisper) for mixed-language environments (Kazakh/Russian).
- * **Data Engineering:** Syntactic restructuring of raw speech corpora using LLMs (Gemma, Llama) to create high-quality training data.
- * **Large Language Models (LLMs):** Adapting and aligning foundation models for Turkic languages.
+ Feel free to explore our **Sybyrla** model or our **Structured KSC2** dataset below!

  ### 👥 Team
- We are a team of engineers and researchers passionate about AI infrastructure and linguistics.

+ We are a specialized team of researchers and engineers:
  * **Askhat Sabitkhanov**
  * **Dias Ilyas**
  * **Sergey Klimov**

- ### 🚀 Featured Projects
- * **Sybyrla (Whisper Large V3):** A robust ASR model achieving ~12% WER on the KSC2 benchmark, optimized for code-switching.
- * **KSC2 Structured:** An enhanced version of the ISSAI KSC2 corpus with punctuation and capitalization restored via LLM post-processing.
-
- ---
- *Open Science. Open Source. Inflexion.*
+ ### 🔬 Featured Work
+ * **[Sybyrla (Whisper Large V3)](https://huggingface.co/InflexionLab/sybyrla)**: A robust ASR model achieving ~12% WER, trained on a strategic mix of KSC2 and Russian Common Voice.
+ * **[KSC2 Structured](https://huggingface.co/datasets/InflexionLab/ksc2-structured)**: The official ISSAI corpus remastered with Gemma 27B to restore punctuation and capitalization.