license: mit

[Lenta Word2Vec CBOW 300D]

🗃️ Corpus

109k+ words from lenta.ru (2025)

⚙️ Parameters

  • Algorithm: Word2Vec CBOW
  • Vector size: 300
  • Window size: 10
  • Min frequency: 10
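
The parameters above map onto the keyword arguments of gensim's `Word2Vec` constructor roughly as follows. This is a minimal sketch, assuming the model was trained with gensim 4.x (the card only shows gensim being used for loading); `TRAIN_PARAMS` and `train` are hypothetical names, and `sentences` stands for any iterable of tokenized sentences.

```python
# Hyperparameters from this card, expressed as gensim 4.x keyword arguments.
# Assumption: the model was trained with gensim; the card does not confirm it.
TRAIN_PARAMS = {
    "sg": 0,             # 0 selects CBOW (1 would select skip-gram)
    "vector_size": 300,  # dimensionality of the word vectors
    "window": 10,        # max distance between target and context word
    "min_count": 10,     # ignore words seen fewer than 10 times
}


def train(sentences):
    """Train a CBOW model on an iterable of tokenized sentences."""
    from gensim.models import Word2Vec  # imported lazily; requires gensim

    return Word2Vec(sentences=sentences, **TRAIN_PARAMS)
```

All other settings (epochs, negative sampling, and so on) are left at gensim's defaults here because the card does not list them.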

📊 Metrics

  • Word analogy accuracy: 42.86%
  • Semantic similarity correlation: 0.18
  • Vocabulary coverage: 28.76%
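
Metrics of this kind can be computed with gensim's built-in evaluation helpers. The sketch below is one possible way to do it, not necessarily how the numbers above were obtained: the card does not name the benchmark files or say whether the correlation is Pearson or Spearman, so `evaluate`, `analogies_path`, and `pairs_path` are hypothetical and both correlations are returned.

```python
def evaluate(wv, analogies_path, pairs_path):
    """Score a gensim KeyedVectors object on two benchmark files.

    analogies_path: analogy questions in the word2vec "questions-words" format.
    pairs_path: tab-separated lines of "word1, word2, human score".
    Both paths are placeholders; the card does not say which benchmarks were used.
    """
    # Returns (overall accuracy in [0, 1], per-section details).
    accuracy, _sections = wv.evaluate_word_analogies(analogies_path)
    # Returns (pearson, spearman, oov_ratio); the first two are
    # (correlation, p-value) pairs.
    pearson, spearman, oov_ratio = wv.evaluate_word_pairs(pairs_path)
    return {
        "analogy_accuracy": accuracy,
        "similarity_pearson": pearson[0],
        "similarity_spearman": spearman[0],
        "oov_ratio": oov_ratio,
    }
```

It would be called as `evaluate(model.wv, "analogies.txt", "pairs.tsv")` once the model is loaded.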

💻 Usage

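from gensim.models import Word2Vec

# Load the trained model from a local file
model = Word2Vec.load("lenta_w2v_cbow_300d.model")
# Nearest neighbours of the query word by cosine similarity (top 10 by default)
similar = model.wv.most_similar("путин")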
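
Because words seen fewer than 10 times are excluded from the vocabulary, `most_similar` raises `KeyError` for any word the model does not know. A small guard, sketched here under the assumption of gensim 4.x (where the vocabulary is exposed as `wv.key_to_index`), avoids that; `neighbours` is a hypothetical helper name.

```python
def neighbours(wv, word, topn=5):
    """Return (word, cosine similarity) pairs, or [] for an unknown word."""
    # key_to_index maps every vocabulary word to its row in the vector matrix.
    if word not in wv.key_to_index:
        return []
    return wv.most_similar(word, topn=topn)
```

For example, `neighbours(model.wv, "путин")` returns up to five neighbours, and an out-of-vocabulary query returns an empty list instead of raising.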