---
license: mit
---

# Bengali Word2Vec Model
This is a pre-trained word2vec model for the Bengali language.

This model is built for the [bengalinlp](https://github.com/banglawiki/bengalinlp) package.

## Datasets
- [Wikipedia dump datasets](https://dumps.wikimedia.org/bnwiki/latest/)

## Training details
- Word2Vec hyperparameters: embedding dimension = 100, `min_count=5`, `window=5`, `epochs=10`
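
For reference, these hyperparameters map onto gensim's `Word2Vec` constructor roughly as sketched below. This assumes the model was trained with gensim 4.x, which this card does not state; `W2V_PARAMS` and `train_word2vec` are hypothetical names used only for illustration, and corpus extraction and tokenization are not shown.

```python
# Hyperparameters from the "Training details" list, using gensim 4.x argument names.
# (gensim 3.x used `size` and `iter` instead of `vector_size` and `epochs`.)
W2V_PARAMS = {
    "vector_size": 100,  # word embedding dimension
    "min_count": 5,      # ignore words seen fewer than 5 times
    "window": 5,         # context window size
    "epochs": 10,        # passes over the corpus
}


def train_word2vec(sentences):
    """Train a Word2Vec model on an iterable of tokenized sentences.

    Hypothetical helper: assumes gensim is installed and that `sentences`
    is an iterable of token lists.
    """
    from gensim.models import Word2Vec  # imported lazily; requires gensim

    return Word2Vec(sentences=sentences, **W2V_PARAMS)
```

Calling `train_word2vec(sentences).save("bengali_word2vec.model")` would write a file in gensim's native format; whether the published model file was produced this way is an assumption.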

## Usage
- `pip install -U bengalinlp_toolkit`
- Generate a word vector using the pretrained model

    ```py
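    from bengalinlp import BengaliWord2Vec

    bwv = BengaliWord2Vec()
    # Path to the pretrained model file
    model_path = "bengali_word2vec.model"
    word = 'গ্রাম'
    # Look up the embedding vector for the word
    vector = bwv.generate_word_vector(model_path, word)
    # Print the vector's shape, then its values
    print(vector.shape)
    print(vector)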
    ```

- Find the most similar words using the pretrained model

    ```py
    from bengalinlp import BengaliWord2Vec

    bwv = BengaliWord2Vec()
    # Path to the pretrained model file
    model_path = "bengali_word2vec.model"
    word = 'গ্রাম'
    # Retrieve the 10 words closest to the query word
    similar = bwv.most_similar(model_path, word, topn=10)
    print(similar)
    ```
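
To illustrate what a most-similar lookup computes, here is a minimal pure-Python sketch that ranks words by cosine similarity, the measure word2vec implementations such as gensim typically use. It is not the `bengalinlp` implementation; the function names and the two-dimensional toy vectors are made up for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def most_similar(query, vectors, topn=10):
    """Return up to `topn` (word, score) pairs, best match first."""
    scores = [
        (word, cosine_similarity(vectors[query], vec))
        for word, vec in vectors.items()
        if word != query  # skip the query word itself
    ]
    scores.sort(key=lambda pair: pair[1], reverse=True)
    return scores[:topn]


# Toy vectors (hypothetical, not taken from the pretrained model)
toy_vectors = {
    "village": [1.0, 0.0],
    "town": [0.8, 0.6],
    "river": [0.0, 1.0],
}
ranked = most_similar("village", toy_vectors, topn=2)
print(ranked)
```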