SimpleSafetyTests: a Test Suite for Identifying Critical Safety Risks in Large Language Models Paper • 2311.08370 • Published Nov 14, 2023
FinanceBench: A New Benchmark for Financial Question Answering Paper • 2311.11944 • Published Nov 20, 2023
GLIDER: Grading LLM Interactions and Decisions using Explainable Ranking Paper • 2412.14140 • Published Dec 18, 2024 • 1
Browsing Lost Unformed Recollections: A Benchmark for Tip-of-the-Tongue Search and Reasoning Paper • 2503.19193 • Published Mar 24, 2025 • 1
Article Explore, Curate and Vector Search Any Hugging Face Dataset with Nomic Atlas Jan 23, 2025 • 30
PatronusAI/Llama-3-Patronus-Lynx-8B-Instruct-v1.1 Text Generation • 8B • Updated Jul 31, 2024 • 37.7k • 10
Llama 3.1 Collection This collection hosts the transformers and original repos of the Llama 3.1, Llama Guard 3 and Prompt Guard models • 11 items • Updated Dec 6, 2024 • 708
PatronusAI/Llama-3-Patronus-Lynx-8B-Instruct Text Generation • 8B • Updated Jul 22, 2024 • 798 • 43