---
language: dje
tags:
- fasttext
- word-embeddings
- zarma
- nlp
license: apache-2.0
datasets:
- 27Group/noisy_zarma
---

## Description
This repository contains a pre-trained FastText model for the Zarma language. The model produces static word embeddings for Zarma text. These vectors capture distributional semantic information and can serve as input features for downstream NLP tasks.

## Tasks
- **Word Embeddings**: Generate vector representations for Zarma words.
- **Part-of-Speech (POS) Tagging**: Provide features for POS tagging models.
- **Text Classification**: Use embeddings for sentiment analysis or topic classification.
- **Semantic Similarity**: Compute similarity between Zarma words or phrases.

## Usage Examples

### 1. Word Embeddings
Load the FastText model to get word embeddings for Zarma text.

```python
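import fasttext

# Load the binary model downloaded from this repository.
model = fasttext.load_model('zarma_fasttext.bin')

# Look up the embedding for a single Zarma word and show its first five values.
word = "ay"
embedding = model.get_word_vector(word)
print(f"Embedding for '{word}': {embedding[:5]}...")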
```

### 2. Semantic Similarity
Compute the cosine similarity between the embeddings of two Zarma words.
```python
import fasttext
import numpy as np

model = fasttext.load_model('zarma_fasttext.bin')

word1 = "ay"
word2 = "ni"
vec1 = model.get_word_vector(word1)
vec2 = model.get_word_vector(word2)

# Cosine similarity; the small epsilon guards against division by zero.
similarity = np.dot(vec1, vec2) / (np.linalg.norm(vec1) * np.linalg.norm(vec2) + 1e-8)
print(f"Similarity between '{word1}' and '{word2}': {similarity:.4f}")
```
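
The Tasks section mentions similarity between phrases, while the example above covers single words only. One common approach, sketched here as an assumption rather than the authors' method, is to average the word vectors of each phrase and compare the averages. The helper names `phrase_vector`, `cosine_similarity`, and `phrase_similarity` are hypothetical, and tokenization is plain whitespace splitting, which may be too simple for real Zarma text.

```python
import math


def phrase_vector(model, phrase):
    """Average the word vectors of a whitespace-tokenized phrase.

    `model` is any object with a `get_word_vector(word)` method,
    such as a loaded FastText model.
    """
    words = phrase.split()
    if not words:
        raise ValueError("phrase must contain at least one word")
    vectors = [model.get_word_vector(w) for w in words]
    # Component-wise mean across all word vectors.
    return [sum(float(x) for x in column) / len(vectors) for column in zip(*vectors)]


def cosine_similarity(vec1, vec2):
    """Cosine similarity with a small epsilon to avoid division by zero."""
    dot = sum(a * b for a, b in zip(vec1, vec2))
    norm1 = math.sqrt(sum(a * a for a in vec1))
    norm2 = math.sqrt(sum(b * b for b in vec2))
    return dot / (norm1 * norm2 + 1e-8)


def phrase_similarity(model, phrase1, phrase2):
    """Cosine similarity between the averaged vectors of two phrases."""
    return cosine_similarity(phrase_vector(model, phrase1), phrase_vector(model, phrase2))
```

With `model` loaded as in the examples above, call `phrase_similarity(model, phrase1, phrase2)` on two Zarma strings. The `fasttext` package also provides `model.get_sentence_vector(text)`, which may be a convenient alternative to the hand-rolled average.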

## How to Use
1. Install FastText: `pip install fasttext`
2. Download `zarma_fasttext.bin` from this repository.
3. Use the code snippets above to integrate the model into your NLP pipeline.
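
The download step can also be scripted. The sketch below assumes the model file is hosted under the repository ID that appears in the citation URL (`27Group/zarma_fasttext`) with the file name `zarma_fasttext.bin`, and that the `huggingface_hub` package is installed (`pip install huggingface_hub`). The function name `load_zarma_model` is hypothetical.

```python
# Assumed values: the repo ID is taken from the citation URL and the
# file name from the usage examples; adjust them if the hosting differs.
REPO_ID = "27Group/zarma_fasttext"
FILENAME = "zarma_fasttext.bin"


def load_zarma_model(repo_id=REPO_ID, filename=FILENAME):
    """Download the model file (cached locally) and load it with FastText."""
    # Imported inside the function so that defining it does not require
    # the optional dependencies to be installed.
    from huggingface_hub import hf_hub_download
    import fasttext

    # hf_hub_download returns the local path of the cached file.
    model_path = hf_hub_download(repo_id=repo_id, filename=filename)
    return fasttext.load_model(model_path)
```

Calling `model = load_zarma_model()` fetches the file on first use and reuses the local cache afterwards; the returned object supports the same `get_word_vector` calls shown in the usage examples.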

## How to Cite
If you use this model in your work, please cite:
```bibtex
@misc{zarma_fasttext,
  title     = {Pre-trained FastText Embeddings for Zarma},
  author    = {Mamadou K. Keita and Christopher Homan},
  year      = {2025},
  howpublished = {\url{https://huggingface.co/27Group/zarma_fasttext}}
}
```