---
language: dje
tags:
- fasttext
- word-embeddings
- zarma
- nlp
license: apache-2.0
datasets:
- 27Group/noisy_zarma
---
## Description
This repository contains a pre-trained FastText model for the Zarma language. The model maps each Zarma word to a fixed (static) dense vector that captures distributional semantic information, and these vectors can serve as input features for a range of NLP tasks.
## Tasks
- **Word Embeddings**: Generate vector representations for Zarma words.
- **Part-of-Speech (POS) Tagging**: Provide features for POS tagging models.
- **Text Classification**: Use embeddings for sentiment analysis or topic classification.
- **Semantic Similarity**: Compute similarity between Zarma words or phrases.
## Usage Examples
### 1. Word Embeddings
Load the FastText model to get word embeddings for Zarma text.
```python
import fasttext
model = fasttext.load_model('zarma_fasttext.bin')
word = "ay"
embedding = model.get_word_vector(word)
print(f"Embedding for '{word}': {embedding[:5]}...")
```
### 2. Semantic Similarity
Compute the cosine similarity between the embeddings of two Zarma words.
```python
import fasttext
import numpy as np
model = fasttext.load_model('zarma_fasttext.bin')
word1 = "ay"
word2 = "ni"
vec1 = model.get_word_vector(word1)
vec2 = model.get_word_vector(word2)
# Cosine similarity; the small epsilon guards against division by zero
similarity = np.dot(vec1, vec2) / (np.linalg.norm(vec1) * np.linalg.norm(vec2) + 1e-8)
print(f"Similarity between '{word1}' and '{word2}': {similarity:.4f}")
```
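The Tasks list also mentions similarity between phrases, which the example above does not cover. A minimal sketch, assuming a phrase can be represented by the plain average of its whitespace-separated word vectors (a common baseline, not necessarily how this model was evaluated). The helper names `phrase_vector` and `cosine_similarity` are hypothetical, and `model` is any object exposing `get_word_vector` and `get_dimension`, as a loaded FastText model does.

```python
import numpy as np


def phrase_vector(model, text):
    """Average the word vectors of the whitespace-separated tokens in `text`.

    Assumption: simple whitespace tokenization is adequate for the input.
    """
    tokens = text.split()
    if not tokens:
        # No tokens: fall back to a zero vector of the model's dimensionality.
        return np.zeros(model.get_dimension(), dtype=np.float32)
    return np.mean([model.get_word_vector(tok) for tok in tokens], axis=0)


def cosine_similarity(vec1, vec2, eps=1e-8):
    """Cosine similarity; `eps` guards against division by zero."""
    return float(np.dot(vec1, vec2) / (np.linalg.norm(vec1) * np.linalg.norm(vec2) + eps))


# Usage, assuming `model` was loaded with fasttext.load_model(...) as above
# and `phrase_a`, `phrase_b` are Zarma strings of your choice:
# sim = cosine_similarity(phrase_vector(model, phrase_a), phrase_vector(model, phrase_b))
```

The FastText Python API also provides `model.get_sentence_vector(text)`, which may be a convenient alternative to averaging by hand.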
## How to Use
1. Install FastText: `pip install fasttext`
2. Download `zarma_fasttext.bin` from this repository.
3. Use the code snippets above to integrate the model into your NLP pipeline.
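
The download step can also be scripted. The following is a sketch under stated assumptions: it requires the separate `huggingface_hub` package (`pip install huggingface_hub`), it takes the repository ID `27Group/zarma_fasttext` from the citation URL on this card, and it assumes the file name `zarma_fasttext.bin` mentioned above. The helper name `load_zarma_model` is hypothetical.

```python
def load_zarma_model(repo_id="27Group/zarma_fasttext", filename="zarma_fasttext.bin"):
    """Download the model file from the Hugging Face Hub and load it with FastText.

    The default repo_id and filename are assumptions taken from this model card.
    """
    # Imported lazily so that defining this helper does not require the packages.
    import fasttext
    from huggingface_hub import hf_hub_download

    # hf_hub_download returns the local path of the (cached) downloaded file.
    model_path = hf_hub_download(repo_id=repo_id, filename=filename)
    return fasttext.load_model(model_path)


# Usage (needs network access on the first call):
# model = load_zarma_model()
# print(model.get_word_vector("ay")[:5])
```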
## How to cite
If you use this model in your work, please cite:
```
@misc{zarma_fasttext,
title = {Pre-trained FastText Embeddings for Zarma},
author = {Mamadou K. Keita and Christopher Homan},
year = {2025},
howpublished = {\url{https://huggingface.co/27Group/zarma_fasttext}}
}
```