- NousResearch/Hermes-2-Theta-Llama-3-70B
  Text Generation • 71B • 1.77k • 82
- NousResearch/Hermes-2-Pro-Llama-3-70B
  Text Generation • 71B • 180 • 35
- NousResearch/Hermes-2-Theta-Llama-3-8B
  Text Generation • 8B • 11.3k • 207
- NousResearch/Hermes-2-Pro-Llama-3-8B
  Text Generation • 8B • 20.5k • 453
Collections
Discover the best community collections!
Collections trending this week
- Sorted LLaMA: Unlocking the Potential of Intermediate Layers of Large Language Models for Dynamic Inference Using Sorted Fine-Tuning (SoFT)
  Paper • 2309.08968 • 24
- GraLoRA: Granular Low-Rank Adaptation for Parameter-Efficient Fine-Tuning
  Paper • 2505.20355 • 37
- Fast-dLLM: Training-free Acceleration of Diffusion LLM by Enabling KV Cache and Parallel Decoding
  Paper • 2505.22618 • 47
- Latent Zoning Network: A Unified Principle for Generative Modeling, Representation Learning, and Classification
  Paper • 2509.15591 • 46

- distilbert/distilbert-base-cased
  Fill-Mask • 65.8M • 315k • 67
- distilbert/distilbert-base-uncased
  Fill-Mask • 67M • 8.86M • 903
- distilbert/distilbert-base-multilingual-cased
  Fill-Mask • 0.1B • 600k • 244
- distilbert/distilbert-base-uncased-finetuned-sst-2-english
  Text Classification • 67M • 3.64M • 912

- Language Modeling Is Compression
  Paper • 2309.10668 • 85
- SlimPajama-DC: Understanding Data Combinations for LLM Training
  Paper • 2309.10818 • 11
- Sorted LLaMA: Unlocking the Potential of Intermediate Layers of Large Language Models for Dynamic Inference Using Sorted Fine-Tuning (SoFT)
  Paper • 2309.08968 • 24
- Contrastive Decoding Improves Reasoning in Large Language Models
  Paper • 2309.09117 • 40