Upload articles.txt with huggingface_hub
26.11.2021 ###################################################################
negative samples reduction http://ceur-ws.org/Vol-2007/LEARNER2017_short_1.pdf
bert for ranking, latest review https://arxiv.org/abs/2010.06467
new sampling approach USEFUL https://arxiv.org/abs/2104.06967
multitask learning https://github.com/CAMTL/CA-MTL
distillation https://arxiv.org/pdf/2111.09645.pdf

22.09.2022 ###################################################################
New search paradigm
https://arxiv.org/pdf/2204.10628.pdf
https://arxiv.org/pdf/2206.02743.pdf
https://arxiv.org/pdf/2202.06991.pdf

Auto prompting

Iryna Gurevych
TU Darmstadt


#useful#######################################################################
videos about foundation models
https://www.youtube.com/playlist?list=PL9t0xVFP90GD8hox0KipBkJcLX_C3ja67


09.10.2022 #############################################################################
From "Autoregressive Search Engines: Generating Substrings as Document Identifiers"
"Query likelihood models" --
Cicero Nogueira dos Santos, Xiaofei Ma, Ramesh Nallapati, Zhiheng Huang, and Bing Xiang. 2020. Beyond [CLS] through ranking by generation.
Shengyao Zhuang and Guido Zuccon. 2021. TILDE: term independent likelihood model for passage reranking.
Oleg Lesota, Navid Rekabsaz, Daniel Cohen, Klaus Antonius Grasserbauer, Carsten Eickhoff, and Markus Schedl. 2021. A modern perspective on query likelihood with deep generative retrieval models.

Prompting to generate queries --
Angeliki Lazaridou, Elena Gribovskaya, Wojciech Stokowiec, and Nikolai Grigorev. 2022. Internet-augmented language models through few-shot prompting for open-domain question answering.

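The papers above are neural query likelihood models; as a reminder of what they generalize, a minimal sketch of the classical unigram query likelihood formulation with Dirichlet smoothing (the toy corpus and the small mu value are hypothetical, chosen only because the documents are tiny):

```python
import math
from collections import Counter

def query_likelihood(query, doc, collection, mu=2000):
    """log P(q|d) under a Dirichlet-smoothed unigram document language model."""
    doc_tf, coll_tf = Counter(doc), Counter(collection)
    score = 0.0
    for t in query:
        p_coll = coll_tf[t] / len(collection)            # background probability
        p = (doc_tf[t] + mu * p_coll) / (len(doc) + mu)  # Dirichlet smoothing
        if p == 0.0:
            return float("-inf")                         # term unseen anywhere
        score += math.log(p)
    return score

# toy corpus (hypothetical); mu is small here because the documents are tiny
docs = [["neural", "ranking", "models"], ["passage", "reranking", "with", "bert"]]
collection = [t for d in docs for t in d]
scores = [query_likelihood(["bert", "reranking"], d, collection, mu=10) for d in docs]
```

Documents are then ranked by this score; the neural variants replace the unigram counts with model-estimated term probabilities.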
11.10.2022 #############################################################################



18.10.2022 ############################################################################
Articles with BEIR:

Researcher: Gautier Izacard

################################################################################
###################################################################################
#####################################################################################

23.02.2023 ############################################################################
Sparse CLIP (STAIR paper from Apple) https://arxiv.org/pdf/2301.13081.pdf

#########################################################################################################
Chain of thought reasoning

Chain-of-Thought Prompting Elicits Reasoning in Large Language Models https://arxiv.org/pdf/2201.11903.pdf NIPS 2022
(In short -- the authors simply took a few examples from the datasets and wrote prompts for them (in-context learning)
in a step-by-step style; this very strongly improved metrics on math and on various logical reasoning tasks)

Large Language Models are Zero-Shot Reasoners https://arxiv.org/pdf/2205.11916.pdf NIPS 2022
(The authors add the prompt "Let's think step by step" and use it to generate a step-by-step solution to the task,
then feed that solution back into the model as a prompt and obtain the answer. This also boosts metrics on arithmetic
and commonsense tasks. One could say the model can generate a solution to the task for itself.) (need to read in more detail)

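The two-stage pipeline described above can be sketched as follows (a minimal sketch: `fake_lm` is a hypothetical placeholder standing in for a real LM completion call; the two prompt templates follow the paper's reasoning-extraction and answer-extraction stages):

```python
def zero_shot_cot(question, generate):
    # Stage 1: reasoning extraction via the trigger phrase
    stage1 = f"Q: {question}\nA: Let's think step by step."
    rationale = generate(stage1)
    # Stage 2: answer extraction, feeding the generated rationale back in
    stage2 = f"{stage1}{rationale}\nTherefore, the answer (arabic numerals) is"
    return generate(stage2).strip()

def fake_lm(prompt):
    # placeholder LM for illustration only; a real system calls an actual model
    if prompt.endswith("answer (arabic numerals) is"):
        return " 6"
    return " 3 pairs with 2 shoes each gives 3 * 2 = 6 shoes."

answer = zero_shot_cot("How many shoes are in 3 pairs?", fake_lm)  # -> "6"
```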
AUTOMATIC CHAIN OF THOUGHT PROMPTING IN LARGE LANGUAGE MODELS https://arxiv.org/pdf/2210.03493.pdf
(The authors want to devise auto-CoT. They split the questions into several clusters,
then take a representative question from each cluster and generate an auto-CoT demonstration for it.
The auto-CoT generation is not perfect; there may be a cluster where everything goes wrong.
The authors split all the questions into clusters (using Sentence-BERT!!!). (Ask Dima how they use the clusters))

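The cluster-then-pick-representative step can be sketched like this (a sketch under stated assumptions: the 2-d toy vectors stand in for Sentence-BERT embeddings, and a tiny hand-rolled k-means replaces a library clusterer):

```python
import numpy as np

def kmeans(x, k, iters=20, seed=0):
    rng = np.random.default_rng(seed)
    centroids = x[rng.choice(len(x), size=k, replace=False)]
    for _ in range(iters):
        # assign each point to its nearest centroid, then recompute centroids
        labels = np.argmin(((x[:, None] - centroids[None]) ** 2).sum(-1), axis=1)
        for j in range(k):
            if (labels == j).any():
                centroids[j] = x[labels == j].mean(axis=0)
    return labels, centroids

def representatives(questions, emb, k):
    labels, centroids = kmeans(emb, k)
    reps = []
    for j in range(k):
        idx = np.where(labels == j)[0]
        # representative = the question closest to its cluster centroid
        d = ((emb[idx] - centroids[j]) ** 2).sum(-1)
        reps.append(questions[idx[np.argmin(d)]])
    return reps

questions = ["2+2?", "3*5?", "capital of France?", "capital of Japan?"]
emb = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]])  # toy embeddings
reps = representatives(questions, emb, k=2)  # one demo question per cluster
```

One CoT demonstration is then generated per representative, which is what gives the diversity across clusters.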
TO READ Multimodal Chain-of-Thought Reasoning in Language Models https://arxiv.org/pdf/2302.00923.pdf
(The simplest way to implement multimodal CoT -- convert the images to text and run ordinary CoT.
LLMs under 100B parameters can produce hallucinated rationales)

27.02.2023 ################################################################################
Collocation selection
https://nlp.stanford.edu/fsnlp/promo/colloc.pdf

Large Language models
TO READ Scaling Laws for Neural Language Models https://arxiv.org/pdf/2001.08361.pdf

LLAMA https://scontent-ams4-1.xx.fbcdn.net/v/t39.2365-6/333007794_1182140292435357_4481174526219500228_n.pdf?_nc_cat=101&ccb=1-7&_nc_sid=3c67a6&_nc_ohc=Z5B8LP9penMAX_SWEqj&_nc_ht=scontent-ams4-1.xx&oh=00_AfAogQwG27t4J0ui35Jxwf1G31cgj2HiZGtw8v3cHk3szA&oe=6401D9D1
The authors simply took a lot of cleaned data and trained models smaller than GPT-3 and PaLM, showing
that large models need more data. It turns out that even the Hoffmann paper, which showed
that training large models requires more data, underestimated how much.
The model is better than or comparable to 175B GPT-3 and 540B PaLM. (Does not beat code-davinci-002 on MMLU)

TO READ Training Compute-Optimal Large Language Models https://arxiv.org/pdf/2203.15556.pdf

Toolformer: Language Models Can Teach Themselves to Use Tools https://arxiv.org/pdf/2302.04761.pdf
Here they took GPT-J, used it to augment data with API calls, then fine-tuned it on that data.
This way GPT-J learned to call a calculator, wiki search, and
a translator, and to beat the larger GPT-3 and OPT on some tasks

TO READ Generating Datasets with Pretrained Language Models https://aclanthology.org/2021.emnlp-main.555.pdf

28.02.2023 ###########################################################################################################################

TO READ Atlas: Few-shot Learning with Retrieval Augmented Language Models https://arxiv.org/pdf/2208.03299.pdf

TO READ GPT-J

TO READ Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks https://arxiv.org/pdf/1908.10084.pdf

TO READ SLIM: Sparsified Late Interaction for Multi-Vector Retrieval with Inverted Indexes https://arxiv.org/abs/2302.06587

TO READ LexLIP: Lexicon-Bottlenecked Language-Image Pre-Training for Large-Scale Image-Text Retrieval https://arxiv.org/pdf/2302.02908.pdf

TO READ InPars-v2: Large Language Models as Efficient Dataset Generators for Information Retrieval https://arxiv.org/pdf/2301.01820.pdf

TO READ ExaRanker: Explanation-Augmented Neural Ranker https://arxiv.org/abs/2301.10521

01.03.2023 #######################################################################################################

Language Is Not All You Need: Aligning Perception with Language Models (Kosmos-1 from Microsoft) https://arxiv.org/pdf/2302.14045.pdf
Authors combine image embeddings from ViT-L/14 with text, then train an LLM on the result.

03.03.2023 #######################################################################################################
DEMONSTRATE–SEARCH–PREDICT: Composing retrieval and language models for knowledge-intensive NLP https://arxiv.org/pdf/2212.14024.pdf
GPT-3 interacts with ColBERTv2. Interaction examples: https://colab.research.google.com/github/stanfordnlp/dsp/blob/main/intro.ipynb#scrollTo=773rwc-aMuVD
(TODO finish reading the last part of the notebook (qa-v2))

TO READ Baleen: Robust Multi-Hop Reasoning at Scale via Condensed Retrieval https://cs.stanford.edu/~matei/papers/2021/neurips_baleen.pdf

10.03.2023 #########################################################################
Scaling Language-Image Pre-training via Masking https://arxiv.org/pdf/2212.00794.pdf
(authors present FLIP -- a new way to train CLIP faster. They simply mask image patches during pretraining.
This allows a larger batch size (not all patches from an image are used) and also lets the model
learn the image-text distribution faster)

TO READ Visual ChatGPT: Talking, Drawing and Editing with Visual Foundation Models

TO READ How to avoid machine learning pitfalls: a guide for academic researchers

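The FLIP masking step above amounts to dropping a random subset of patch embeddings before the image encoder; a minimal sketch (assumptions: a standard 14x14 ViT patch grid and a 25% keep ratio, i.e. 75% masking as in the paper's strongest setting):

```python
import numpy as np

def mask_patches(patches, keep_ratio=0.25, seed=0):
    """patches: (num_patches, dim). Return a random kept subset and its indices."""
    rng = np.random.default_rng(seed)
    n_keep = max(1, int(patches.shape[0] * keep_ratio))
    keep = rng.permutation(patches.shape[0])[:n_keep]  # distinct random patches
    return patches[keep], keep

patches = np.random.default_rng(1).normal(size=(196, 768))  # 14*14 ViT patches
visible, idx = mask_patches(patches, keep_ratio=0.25)
# only 49 of 196 patch embeddings enter the encoder -> ~4x less compute per image,
# which is what frees memory for the larger contrastive batch
```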
14.03.2023 ##########################################################################
TO READ Less is more: Pretrain a strong Siamese encoder for dense text retrieval using a weak decoder. https://aclanthology.org/2021.emnlp-main.220.pdf
"We hypothesize that to perform robust retrieval, the [CLS] vector used for computing
matching scores should encode all the essential information in the passage."


SIMLM: Pre-training with Representation Bottleneck for Dense Passage Retrieval https://arxiv.org/pdf/2207.02578.pdf
Authors claim that an improved GLUE score does not result in better retrieval performance
Main idea -- the authors jointly train an encoder and a shallow decoder on an LM-like task.
The decoder has only two layers and, besides the text, takes the CLS embedding from the encoder as input.
This way the CLS embeddings are learned better. The encoder is then trained contriever-style.
(TO DO -- look at the ablation. They may not have verified that their pretraining helps)

TO READ LEXMAE: LEXICON-BOTTLENECKED PRETRAINING FOR LARGE-SCALE RETRIEVAL https://arxiv.org/pdf/2208.14754.pdf

17.03.2023 ##########################################################################
ART: Automatic multi-step reasoning and tool-use for large language models https://arxiv.org/pdf/2303.09014v1.pdf

19.03.2023 #########################################################################
How to Train Your DRAGON: Diverse Augmentation Towards Generalizable Dense Retrieval

04.04.2023 ########################################################################
TOKEN MERGING: YOUR VIT BUT FASTER https://arxiv.org/pdf/2210.09461.pdf
The authors propose speeding up vision transformers by merging tokens.
At each layer, after attention, they split the tokens into two parts (A and B), then compute scores between A and B.
They then merge the token pairs with the highest similarity scores (they also propose normalizing by Q and K).
This way they achieve a 2x speedup with only a 0.4% drop in quality.

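A simplified sketch of the bipartite merging step above (assumptions: cosine similarity over plain token features instead of the paper's attention keys, merging by plain averaging, and ties between A tokens sharing a B partner not handled specially):

```python
import numpy as np

def merge_tokens(tokens, r):
    """tokens: (n, d). Merge the r most similar A-B pairs by averaging."""
    a, b = tokens[0::2], tokens[1::2]           # alternate tokens into sets A and B
    an = a / np.linalg.norm(a, axis=1, keepdims=True)
    bn = b / np.linalg.norm(b, axis=1, keepdims=True)
    sim = an @ bn.T                             # cosine similarity between A and B
    best_b = sim.argmax(axis=1)                 # each A token's best partner in B
    merge_a = set(np.argsort(-sim.max(axis=1))[:r].tolist())  # r strongest edges
    out, used_b = [], set()
    for i in range(len(a)):
        if i in merge_a:
            j = int(best_b[i])
            out.append((a[i] + b[j]) / 2)       # merge the matched pair
            used_b.add(j)
        else:
            out.append(a[i])
    for j in range(len(b)):
        if j not in used_b:                     # keep unmerged B tokens as-is
            out.append(b[j])
    return np.stack(out)

# three near-duplicate pairs of toy tokens: merging r=2 pairs leaves 4 tokens
tokens = np.array([[1.0, 0.0, 0.0], [1.0, 0.1, 0.0],
                   [0.0, 1.0, 0.0], [0.0, 1.0, 0.1],
                   [0.0, 0.0, 1.0], [0.1, 0.0, 1.0]])
merged = merge_tokens(tokens, r=2)
```

Applying this once per layer is what compounds into the reported end-to-end speedup.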

SPLADE: Sparse Lexical and Expansion Model for First Stage Ranking https://arxiv.org/pdf/2107.05720.pdf
Questions -- weight tying (use the input embeddings as the embeddings for the MLM head) (does original BERT use weight tying?)
Improvements -- log saturation effect, FLOPS regularizer
0.322 MRR@10 on MSMARCO, 0.665 on TREC DL 2019

SPLADE v2: Sparse Lexical and Expansion Model for Information Retrieval
Modified pooling mechanism from the original SPLADE (from sum to max)
Extension of the model without query expansion (SPLADE-doc)
Distillation (I did not understand the pipeline)
SPLADE-doc 0.368 on MSMARCO

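The log-saturation term weighting and the v1-vs-v2 pooling noted above can be sketched in a few lines (assumption: random logits stand in for the MLM-head logits that real SPLADE produces per input token over the vocabulary):

```python
import numpy as np

def splade_weights(mlm_logits, pooling="max"):
    """mlm_logits: (seq_len, vocab). Return one sparse weight per vocab term."""
    sat = np.log1p(np.maximum(mlm_logits, 0.0))  # log(1 + ReLU(logit)): log saturation
    if pooling == "max":
        return sat.max(axis=0)                   # SPLADE v2 pooling
    return sat.sum(axis=0)                       # original SPLADE pooling

logits = np.random.default_rng(0).normal(size=(8, 30522))  # toy (tokens, vocab)
w = splade_weights(logits)
# terms whose logits are negative at every position get exactly zero weight,
# which is what makes the representation sparse and invertible-index friendly
```

Query-document relevance is then the dot product of the two sparse weight vectors; the FLOPS regularizer pushes more of these weights to zero during training.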


TO READ
Learning to retrieve prompts for in-context learning.
Selective annotation makes language models better few-shot learners.
Rethinking the role of demonstrations: What makes in-context learning work?
Language Model Crossover: Variation through Few-Shot Prompting
Check Your Facts and Try Again: Improving Large Language Models with External Knowledge and Automated Feedback
Active Prompting with Chain-of-Thought for Large Language Models
ControlNet
How Does In-Context Learning Help Prompt Tuning?
BLEU metric

TO READ!!!!!
1) Ultra-High Dimensional Sparse Representations with Binarization for Efficient Text Retrieval - https://aclanthology.org/2021.emnlp-main.78.pdf UHD-BERT
2) (query likelihood) TILDE https://espace.library.uq.edu.au/data/UQ_b024b10/arvin2021tilde.pdf?Expires=1680013702&Key-Pair-Id=APKAJKNBJ4MJBJNC6NLQ&Signature=bDdC3xFxyJngCdV69kr3J99~UsnjdFEH6jzRgwy7KkRAZFhbZNTRBJSp6p5cC3hz8dp7lc85-flXx00sBVRd1DqP9sG73-sI6aPNNEDoNxc0eBcZafmbzQ7ARBCAPmpybc4Z2F1RnH29eGW1AExWyQKquBBLQE8li-iLT~jILV5p3YCt-Shzt9HBV7pNUB7zJA3R~GTYVlCiFfLZhy7PvyQ6KH~rJHukWua5ULsuJcicdHg01SKviH2nt9YPuFVV6SDECMJVaALgiZYhCo9GzftC-Sh1BgZLlLFIpGYxU4C1M1xwGykzQUkHKx0CPJu56DtrZGNQGqDWzXIkyvaBPA__
3) DeepCT - term weighting as a regression problem measuring query term recall. !!!
4) Learning to Tokenize for Generative Retrieval

RELEVANT DATASETS
Social media conversations

TASKS
WikiHow
history.stackexchange.com
*.stackexchange.com
a list of QA sources with links and long answers. Identify the topics
look at which links the answers reference

METRICS
for long-form QA -- ROUGE-L

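For reference, ROUGE-L is the LCS-based F-measure; a minimal sketch over whitespace tokens (real implementations add tokenization/stemming details, and beta=1.2 is the weighting commonly used in the original ROUGE toolkit):

```python
def lcs_len(a, b):
    # classic dynamic-programming longest common subsequence length
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            dp[i + 1][j + 1] = dp[i][j] + 1 if x == y else max(dp[i][j + 1], dp[i + 1][j])
    return dp[-1][-1]

def rouge_l(candidate, reference, beta=1.2):
    c, r = candidate.split(), reference.split()
    lcs = lcs_len(c, r)
    if lcs == 0:
        return 0.0
    prec, rec = lcs / len(c), lcs / len(r)
    # F-measure weighted toward recall by beta
    return (1 + beta ** 2) * prec * rec / (rec + beta ** 2 * prec)

score = rouge_l("the cat sat on the mat", "the cat is on the mat")  # LCS = 5 of 6
```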
PROBLEMS

dataset ELI5 - data leak (article Hurdles to Progress in Long-form Question Answering -- https://arxiv.org/pdf/2103.06332v2.pdf)
"Our analysis reveals that this result is partially due to significant train / validation overlap in the ELI5 dataset"
"A human study shows that at least 81% of validation questions have a paraphrase in the training set, and almost all validation questions are topically similar to a training set question."
"While Fan et al. (2019) attempted to identify and remove question overlap using TF-IDF similarity, more complex semantic matching methods & human verification is needed to address this issue in future LFQA datasets."
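An overlap screen in the spirit of the TF-IDF filtering quoted above can be sketched like this (a simplified sketch: Jaccard similarity over lowercased word sets stands in for TF-IDF cosine, and the toy split and 0.6 threshold are hypothetical):

```python
def jaccard(a: str, b: str) -> float:
    sa, sb = set(a.lower().split()), set(b.lower().split())
    return len(sa & sb) / len(sa | sb)

def flag_overlaps(train_qs, valid_qs, threshold=0.6):
    """Return validation questions whose nearest training question exceeds the threshold."""
    flagged = []
    for q in valid_qs:
        best = max(jaccard(q, t) for t in train_qs)  # nearest training question
        if best >= threshold:
            flagged.append(q)
    return flagged

train = ["why is the sky blue ?", "how do planes fly ?"]
valid = ["why is the sky blue at noon ?", "what causes lightning ?"]
overlapping = flag_overlaps(train, valid)  # flags the near-paraphrase only
```

As the quoted paper notes, lexical screens like this miss semantic paraphrases, which is why the 81% overlap only surfaced in a human study.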
"Digging deeper, we identify fundamental issues with using ROUGE-L to evaluate generated answer quality (Figure 1b). Simple baselines such as just repeatedly copying the question, or choosing a random training set answer,
|
| 216 |
+
can outperform LFQA systems such as RAG (Lewis et al., 2020c) in terms of ROUGE-L.
|
| 217 |
+
On the other hand, our system achieves
|
| 218 |
+
higher ROUGE-L than reference human-written
|
| 219 |
+
answers, which is misleading since human A/B
|
| 220 |
+
testers strongly prefer reference answers to our system’s."
|
| 221 |
+
"We conclude that ROUGE-L is not a reliable metric to evaluate LFQA due to its large and
|
| 222 |
+
relatively unconstrained output space (e.g., compared
|
| 223 |
+
to translation or summarization), and we offer suggestions for better automatic & human evaluations
|
| 224 |
+
to enable meaningful progress on this task."
|
| 225 |
+
##################################################################################################################


TO FIND:
"Soft Prompt Decoding for Multilingual Dense Retrieval" was made possible by the first author @huang_zhiqi, along with collaborators James Allan and @HamedZamani
Smooth Operators 😎 (for Effective Systematic Review Queries) accepted at #sigir2023 w/ @fschlatt1 and @martinpotthast

Webis group
Universität Tübingen
AIHannover