Narratives in LLM Pretraining Data (Collection): Models & datasets from "Characterizing Narrative Content in Web-Scale LLM Pretraining Data" (NarraDolma & NarraBERT) • 7 items • Updated 1 day ago • 2
bearcove/zipa-small-crctc-ns-no-diacritics-700k-mlx-q8 Automatic Speech Recognition • 18.7M • Updated Apr 8 • 27 • 4