Twinkle Eval Logs Collection Benchmark logs generated with Twinkle Eval, recording the model's outputs for each prompt; see more at https://github.com/ai-twinkle/Eval • 20 items • Updated 4 days ago • 1
LLM PlayBooks Collection Useful playbooks for training LLMs • 6 items • Updated 3 days ago • 2
Smol-Data Collection Tried and tested mixes for strong pretraining. Inspired by https://huggingface.co/blog/codelion/optimal-dataset-mixing • 14 items • Updated 10 days ago • 12
The Synthetic Data Playbook: Generating Trillions of the Finest Tokens Space • Explore synthetic data experiments in a bookshelf view • 159
Article GGML and llama.cpp join HF to ensure the long-term progress of Local AI • 21 days ago • 483
MMMU-Pro: A More Robust Multi-discipline Multimodal Understanding Benchmark Paper • 2409.02813 • Published Sep 4, 2024 • 33
MMT-Bench: A Comprehensive Multimodal Benchmark for Evaluating Large Vision-Language Models Towards Multitask AGI Paper • 2404.16006 • Published Apr 24, 2024 • 2