---
license: mit
---
# Bengali Word2Vec Model
This is a pre-trained word2vec model for the Bengali language.
It was built for the [bengalinlp](https://github.com/banglawiki/bengalinlp) package.
## Datasets
- [Wikipedia dump datasets](https://dumps.wikimedia.org/bnwiki/latest/)
## Training details
- Word2Vec hyperparameters: embedding dimension = 100, `min_count` = 5, `window` = 5, `epochs` = 10
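
For readers who want to train a comparable model, the settings above can be written down as a small sketch. It assumes gensim 4.x was the training library, which this card does not state; the keyword names (`vector_size`, `min_count`, `window`, `epochs`) are gensim's, and the helper `train_word2vec` is a hypothetical name used only for illustration.

```py
# Hyperparameters from this card, expressed with gensim 4.x keyword names.
# Assumption: gensim was the training library; the card does not say so.
W2V_PARAMS = {
    "vector_size": 100,  # embedding dimension
    "min_count": 5,      # drop words seen fewer than 5 times
    "window": 5,         # context words considered on each side
    "epochs": 10,        # passes over the corpus
}


def train_word2vec(sentences):
    """Train a model on an iterable of tokenized sentences (lists of str).

    Hypothetical helper: gensim is imported lazily so the settings above
    can be inspected without it being installed.
    """
    from gensim.models import Word2Vec

    return Word2Vec(sentences=sentences, **W2V_PARAMS)
```

The tokenization and cleaning applied to the Wikipedia dump are not described here, so a model trained this way may differ from the released one.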
## Usage
- `pip install -U bengalinlp_toolkit`
- Generate a word vector using the pretrained model
```py
from bengalinlp import BengaliWord2Vec

bwv = BengaliWord2Vec()
model_path = "bengali_word2vec.model"  # path to the downloaded model file
word = 'গ্রাম'

# Look up the embedding for a single word
vector = bwv.generate_word_vector(model_path, word)
print(vector.shape)
print(vector)
```
- Find the most similar words using the pretrained model
```py
from bengalinlp import BengaliWord2Vec

bwv = BengaliWord2Vec()
model_path = "bengali_word2vec.model"  # path to the downloaded model file
word = 'গ্রাম'

# Retrieve the 10 words closest to the query word
similar = bwv.most_similar(model_path, word, topn=10)
print(similar)
```
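
To make the similarity lookup less of a black box, here is a minimal sketch of how word2vec tools usually rank neighbours: by cosine similarity between embedding vectors. That `bengalinlp` ranks this way is an assumption, and the two-dimensional vectors below are made up for illustration; they are not taken from the model.

```py
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def rank_neighbours(query, vectors, topn=10):
    """Return up to topn (word, score) pairs, most similar first."""
    scores = [
        (word, cosine_similarity(vectors[query], vec))
        for word, vec in vectors.items()
        if word != query  # the query itself is excluded
    ]
    scores.sort(key=lambda pair: pair[1], reverse=True)
    return scores[:topn]


# Toy vectors, invented for this example (not real model output).
toy_vectors = {
    "গ্রাম": [1.0, 0.0],
    "শহর": [0.8, 0.6],
    "নদী": [0.0, 1.0],
}
neighbours = rank_neighbours("গ্রাম", toy_vectors, topn=2)
print(neighbours)
```

With the real model the vectors have 100 dimensions, but the ranking idea is the same.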