metadata
license: mit
[Lenta Word2Vec CBOW 300D]
🗃️ Corpus
109k+ words from lenta.ru (2025)
⚙️ Parameters
- Algorithm: Word2Vec CBOW
- Vector size: 300
- Window size: 10
- Min frequency: 10
📊 Metrics
- Word analogy accuracy: 42.86%
- Semantic similarity correlation: 0.18
- Vocabulary coverage: 28.76%
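The card does not state how these figures were computed. As a rough illustration only, the sketch below shows one common way to define analogy accuracy (nearest neighbour of `b - a + c`, excluding the query words) and vocabulary coverage (share of distinct test words found in the vocabulary). The function names, the 2-D toy vectors, and the test words are all hypothetical, and the arithmetic uses raw rather than length-normalized vectors.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm


def solve_analogy(vectors, a, b, c):
    """Return the word closest to b - a + c, excluding a, b and c."""
    target = [vb - va + vc for va, vb, vc in zip(vectors[a], vectors[b], vectors[c])]
    candidates = [w for w in vectors if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(vectors[w], target))


def analogy_accuracy(vectors, questions):
    """Share of (a, b, c, expected) questions answered correctly."""
    correct = sum(solve_analogy(vectors, a, b, c) == d for a, b, c, d in questions)
    return correct / len(questions)


def vocabulary_coverage(vectors, words):
    """Share of distinct test words that have a vector."""
    unique = set(words)
    return sum(w in vectors for w in unique) / len(unique)


# Hypothetical toy embedding, not taken from the model.
toy = {
    "man": [1.0, 0.0],
    "woman": [1.0, 1.0],
    "king": [3.0, 0.0],
    "queen": [3.0, 1.0],
    "apple": [-1.0, -2.0],
}
questions = [
    ("man", "woman", "king", "queen"),  # answered correctly
    ("man", "king", "woman", "apple"),  # deliberately wrong expected answer
]

accuracy = analogy_accuracy(toy, questions)
coverage = vocabulary_coverage(toy, ["king", "queen", "banana", "pear"])
print(accuracy, coverage)
```

Under these definitions, a coverage of 28.76% would mean that roughly seven in ten test words have no vector at all, so the analogy and similarity scores would be based only on the remaining items.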
💻 Usage
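from gensim.models import Word2Vec
# Load the trained model; the file must be in the working directory
model = Word2Vec.load("lenta_w2v_cbow_300d.model")
# Nearest neighbours of "путин" by cosine similarity, as (word, score) pairs
similar = model.wv.most_similar("путин")
print(similar)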