Files changed (1) hide show
  1. app.py +7 -7
app.py CHANGED
@@ -380,7 +380,7 @@ if selected == "About":
380
 
381
  This interface was developed in the framework of Silvia Stopponi’s PhD project, \
382
  supervised by Saskia Peels-Matthey and Malvina Nissim at the University of Groningen (The Netherlands). \
383
- The aim of this tool is to make language models trained on Ancient Greek available to all interested people, respectless of their coding skills. \
384
 
385
  The following people were involved in the creation of this interface:
386
 
@@ -415,8 +415,8 @@ if selected == "FAQ":
415
 
416
  with st.expander(r"$\textsf{\Large What is this interface based on?}$"):
417
  st.write(
418
- "This interface is based on language models. Language models are probability distributions of \
419
- words or word sequences, which store statistical information about word co-occurrences. \
420
  This happens during the training phase, in which models process a corpus of texts in the \
421
  target language(s). Once trained, linguistic information can be extracted from the models, or \
422
  the models can be used to perform specific linguistic tasks. In this interface, we focus on the \
@@ -427,12 +427,12 @@ if selected == "FAQ":
427
 
428
  with st.expander(r"$\textsf{\Large What are Word Embeddings?}$"):
429
  st.write(
430
- "Word Embeddings are representations of words obtained via language modelling. More in \
431
- detail, they are strings of numbers (called *vectors*) produced by a language model to \
432
  represent each word in the training corpus in a multi-dimensional space. Words that are more \
433
  similar in meaning will be closer to one another in this vector space (or semantic space) than \
434
  words that are less similar in meaning. The term *word embeddings* is often used as a \
435
- synonym of *predict models*, a type of language models introduced by Mikolov *et al.* (2013) \
436
  with the Word2Vec architecture. This interface is built upon Word2Vec models."
437
  )
438
 
@@ -536,7 +536,7 @@ if selected == "FAQ":
536
  meaning, in its specific training corpus. \
537
  \
538
  Please take into account that the results for words occurring very rarely may be inaccurate. \
539
- Language modelling works on a statistical basis, so that a word with only few occurrences \
540
  may not provide enough evidence to obtain reliable results. But it has been observed that an \
541
  extremely high word frequency can also affect the results. It often happens that the nearest \
542
  neighbours to words occurring very often are other high-frequency words, such as stop \
 
380
 
381
  This interface was developed in the framework of Silvia Stopponi’s PhD project, \
382
  supervised by Saskia Peels-Matthey and Malvina Nissim at the University of Groningen (The Netherlands). \
383
+ The aim of this tool is to make distributional semantic models trained on Ancient Greek available to all interested people, respectless of their coding skills. \
384
 
385
  The following people were involved in the creation of this interface:
386
 
 
415
 
416
  with st.expander(r"$\textsf{\Large What is this interface based on?}$"):
417
  st.write(
418
+ "This interface is based on distributional semantic models. Distributional semantic models \
419
+ are computatinoal models that store statistical information about word co-occurrences. \
420
  This happens during the training phase, in which models process a corpus of texts in the \
421
  target language(s). Once trained, linguistic information can be extracted from the models, or \
422
  the models can be used to perform specific linguistic tasks. In this interface, we focus on the \
 
427
 
428
  with st.expander(r"$\textsf{\Large What are Word Embeddings?}$"):
429
  st.write(
430
+ "Word Embeddings are representations of words obtained via training on a corpus of texts. More in \
431
+ detail, they are ordered sequences of numbers (called *vectors*) produced by a model to \
432
  represent each word in the training corpus in a multi-dimensional space. Words that are more \
433
  similar in meaning will be closer to one another in this vector space (or semantic space) than \
434
  words that are less similar in meaning. The term *word embeddings* is often used as a \
435
+ synonym of *predict models*, a type of distributional semantic models introduced by Mikolov *et al.* (2013) \
436
  with the Word2Vec architecture. This interface is built upon Word2Vec models."
437
  )
438
 
 
536
  meaning, in its specific training corpus. \
537
  \
538
  Please take into account that the results for words occurring very rarely may be inaccurate. \
539
+ Distributional semantic models learn on a statistical basis, so that a word with only few occurrences \
540
  may not provide enough evidence to obtain reliable results. But it has been observed that an \
541
  extremely high word frequency can also affect the results. It often happens that the nearest \
542
  neighbours to words occurring very often are other high-frequency words, such as stop \