Papers
arxiv:2604.26230

A New Semisupervised Technique for Polarity Analysis using Masked Language Models

Published on Apr 29
Authors:

Abstract

A new probabilistic approach to latent semantic scaling using word2vec as a masked language model demonstrates improved accuracy and interpretability over traditional spatial methods in analyzing pandemic coverage in Chinese media.

AI-generated summary

I developed a new version of Latent Semantic Scaling (LSS) employing word2vec as a masked language model. Unlike original spatial models, it assigns polarity scores to words and documents as predicted probabilities of seed words to occur in given contexts. These probabilistic polarity scores are more accurate, interpretable and consistent than those spatial polarity models can produce in text analysis. I demonstrate these advantages by applying both probabilistic and spatial models to China Daily's coverage of China and other countries during the coronavirus disease (COVID) pandemic in terms of achievement in health issues. The result suggests that more advanced masked language models would further improve the semisupervised machine learning technique.

Community

hf papers read 2604.03583

Sign up or log in to comment

Models citing this paper 0

No model linking this paper

Cite arxiv.org/abs/2604.26230 in a model README.md to link it from this page.

Datasets citing this paper 0

No dataset linking this paper

Cite arxiv.org/abs/2604.26230 in a dataset README.md to link it from this page.

Spaces citing this paper 0

No Space linking this paper

Cite arxiv.org/abs/2604.26230 in a Space README.md to link it from this page.

Collections including this paper 0

No Collection including this paper

Add this paper to a collection to link it from this page.