theformatisvalid committed on
Commit e7acdb2 · verified · 1 Parent(s): e876156

Update README.md

Files changed (1)
  1. README.md +24 -3
README.md CHANGED
@@ -1,3 +1,24 @@
- ---
- license: mit
- ---
+ ---
+ license: mit
+ ---
+ # [Lenta Word2Vec CBOW 300D]
+
+ ## 🗃️ Corpus
+ 109k+ words from lenta.ru (2025)
+
+ ## ⚙️ Parameters
+ - Algorithm: Word2Vec CBOW
+ - Vector size: 300
+ - Window size: 10
+ - Min frequency: 10
+
+ ## 📊 Metrics
+ - Word analogy accuracy: 42.86%
+ - Semantic similarity correlation: 0.18
+ - Vocabulary coverage: 28.76%
+
+ ## 💻 Use case
+ ```python
+ from gensim.models import Word2Vec
+ model = Word2Vec.load("lenta_w2v_cbow_300d.model")
+ similar = model.wv.most_similar("путин")
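
The README lists the training parameters and reports a vocabulary coverage of 28.76%, so many query words will likely be missing from the model. The sketch below is not part of the commit. It shows one way the listed parameters might map onto gensim 4.x arguments, plus a lookup that returns an empty list instead of raising `KeyError` for unknown words. The helper names `train_cbow` and `safe_most_similar` are hypothetical. Two things are assumptions: that "Min frequency" corresponds to `min_count`, and that the corpus was lowercased.

```python
from typing import Iterable, List, Tuple


def train_cbow(sentences: Iterable[List[str]]):
    """Train a CBOW model using the hyperparameters listed in the README.

    `sentences` should be a restartable iterable (for example a list) of
    tokenized sentences, because gensim iterates over it more than once.
    """
    # Imported lazily so the lookup helper below works without gensim installed.
    from gensim.models import Word2Vec

    return Word2Vec(
        sentences=sentences,
        vector_size=300,  # "Vector size: 300"
        window=10,        # "Window size: 10"
        min_count=10,     # "Min frequency: 10" (assumed mapping)
        sg=0,             # sg=0 selects CBOW, sg=1 would select skip-gram
    )


def safe_most_similar(wv, word: str, topn: int = 5) -> List[Tuple[str, float]]:
    """Return the nearest neighbours of `word`, or [] if it is out of vocabulary.

    `wv` is expected to behave like gensim's KeyedVectors (for example
    `model.wv`): it needs a `key_to_index` mapping and a `most_similar` method.
    """
    key = word.lower()  # assumption: the training corpus was lowercased
    if key not in wv.key_to_index:
        return []
    return wv.most_similar(key, topn=topn)
```

With the model from the README loaded, a call such as `safe_most_similar(model.wv, "путин")` would return a list of `(word, similarity)` pairs, and an empty list for any word dropped by the frequency cutoff.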