Spaces: GroNLP/agalma

updated FAQ

by silvia-st - opened 23 days ago

base: refs/heads/main ← from: refs/pr/1

app.py +7 -7

@@ -380,7 +380,7 @@ if selected == "About":
         This interface was developed in the framework of Silvia Stopponi’s PhD project, \
         supervised by Saskia Peels-Matthey and Malvina Nissim at the University of Groningen (The Netherlands). \
-        The aim of this tool is to make language models trained on Ancient Greek available to all interested people, respectless of their coding skills. \
+        The aim of this tool is to make distributional semantic models trained on Ancient Greek available to all interested people, regardless of their coding skills. \
         The following people were involved in the creation of this interface:
@@ -415,8 +415,8 @@ if selected == "FAQ":
     with st.expander(r"$\textsf{\Large What is this interface based on?}$"):
         st.write(
-                "This interface is based on language models. Language models are probability distributions of \
-                words or word sequences, which store statistical information about word co-occurrences. \
+                "This interface is based on distributional semantic models. Distributional semantic models \
+                are computational models that store statistical information about word co-occurrences. \
                 This happens during the training phase, in which models process a corpus of texts in the \
                 target language(s). Once trained, linguistic information can be extracted from the models, or \
                 the models can be used to perform specific linguistic tasks. In this interface, we focus on the \
@@ -427,12 +427,12 @@ if selected == "FAQ":
     with st.expander(r"$\textsf{\Large What are Word Embeddings?}$"):
         st.write(
-            "Word Embeddings are representations of words obtained via language modelling. More in \
-            detail, they are strings of numbers (called *vectors*) produced by a language model to \
+            "Word Embeddings are representations of words obtained via training on a corpus of texts. In more \
+            detail, they are ordered sequences of numbers (called *vectors*) produced by a model to \
             represent each word in the training corpus in a multi-dimensional space. Words that are more \
             similar in meaning will be closer to one another in this vector space (or semantic space) than \
             words that are less similar in meaning. The term *word embeddings* is often used as a \
-            synonym of *predict models*, a type of language models introduced by Mikolov *et al.* (2013) \
+            synonym of *predict models*, a type of distributional semantic model introduced by Mikolov *et al.* (2013) \
             with the Word2Vec architecture. This interface is built upon Word2Vec models."
         )
@@ -536,7 +536,7 @@ if selected == "FAQ":
             meaning, in its specific training corpus. \
             \
             Please take into account that the results for words occurring very rarely may be inaccurate. \
-            Language modelling works on a statistical basis, so that a word with only few occurrences \
+            Distributional semantic models learn on a statistical basis, so that a word with only a few occurrences \
             may not provide enough evidence to obtain reliable results. But it has been observed that an \
             extremely high word frequency can also affect the results. It often happens that the nearest \
             neighbours to words occurring very often are other high-frequency words, such as stop \
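
For readers of this diff who want a concrete picture of the FAQ's claim that words more similar in meaning lie closer together in the vector space, here is a minimal sketch in plain Python. It is an illustration only: the three-dimensional vectors and the `cosine_similarity` and `nearest_neighbours` helpers are hypothetical and are not taken from app.py or from the trained Word2Vec models, whose vectors are learned from a corpus and typically have far more dimensions.

```python
import math

# Hypothetical toy vectors, written by hand for illustration only.
# Real Word2Vec vectors are learned during training on a corpus.
toy_vectors = {
    "θάλασσα": [0.9, 0.1, 0.0],  # 'sea'
    "πόντος": [0.8, 0.2, 0.1],   # 'sea' (often poetic)
    "ἵππος": [0.1, 0.9, 0.2],    # 'horse'
}


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def nearest_neighbours(word, vectors, k=2):
    """Return the k words whose vectors are most similar to that of `word`."""
    target = vectors[word]
    scored = [
        (other, cosine_similarity(target, vec))
        for other, vec in vectors.items()
        if other != word
    ]
    # Highest similarity first.
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]


if __name__ == "__main__":
    for neighbour, score in nearest_neighbours("θάλασσα", toy_vectors):
        print(f"{neighbour}\t{score:.3f}")
```

With these toy values, the two 'sea' words score much higher with each other than either does with 'horse', which is the behaviour the FAQ describes. With so few data points the ranking depends entirely on the hand-picked numbers, which loosely mirrors the FAQ's warning that rare words provide too little evidence for reliable results.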