Collections

Discover the best community collections!

Collections trending this week

Interesting Models

meta-llama/Llama-2-7b

Text Generation • Updated Apr 17, 2024 • 250 • 4.46k

Named Entity Recognition and Classification

tmskss/linux-man-pages-tldr-summarized

Viewer • Updated Aug 29, 2023 • 481 • 72 • 9

Whisper for audio captioning

Whisper models fine-tuned for audio captioning instead of speech recognition. These models aim to briefly describe what happens in the audio scene.

MU-NLPC/whisper-large-v2-audio-captioning

Updated Mar 11, 2024 • 64 • 11
MU-NLPC/whisper-small-audio-captioning

Updated Mar 13, 2024 • 168 • 10
MU-NLPC/whisper-tiny-audio-captioning

Updated Mar 11, 2024 • 593 • 15

Papers related to parameter-efficient fine-tuning methods.

LoRAShear: Efficient Large Language Model Structured Pruning and Knowledge Recovery

Paper • 2310.18356 • Published Oct 24, 2023 • 24
LoftQ: LoRA-Fine-Tuning-Aware Quantization for Large Language Models

Paper • 2310.08659 • Published Oct 12, 2023 • 27
ModuLoRA: Finetuning 3-Bit LLMs on Consumer GPUs by Integrating with Modular Quantizers

Paper • 2309.16119 • Published Sep 28, 2023 • 1
QA-LoRA: Quantization-Aware Low-Rank Adaptation of Large Language Models

Paper • 2309.14717 • Published Sep 26, 2023 • 46

Papers: Instruct

Ensemble-Instruct: Generating Instruction-Tuning Data with a Heterogeneous Mixture of LMs

Paper • 2310.13961 • Published Oct 21, 2023 • 5
Tuna: Instruction Tuning using Feedback from Large Language Models

Paper • 2310.13385 • Published Oct 20, 2023 • 10
Auto-Instruct: Automatic Instruction Generation and Ranking for Black-Box Language Models

Paper • 2310.13127 • Published Oct 19, 2023 • 12
From Language Modeling to Instruction Following: Understanding the Behavior Shift in LLMs after Instruction Tuning

Paper • 2310.00492 • Published Sep 30, 2023 • 2

CodeFusion: A Pre-trained Diffusion Model for Code Generation

Paper • 2310.17680 • Published Oct 26, 2023 • 74

Llemma: An Open Language Model For Mathematics

Paper • 2310.10631 • Published Oct 16, 2023 • 57
Mistral 7B

Paper • 2310.06825 • Published Oct 10, 2023 • 58
Qwen Technical Report

Paper • 2309.16609 • Published Sep 28, 2023 • 38
BTLM-3B-8K: 7B Parameter Performance in a 3B Parameter Model

Paper • 2309.11568 • Published Sep 20, 2023 • 11

Optimised Translation Models 🌍

A collection of optimised and quantised multilingual translation models.

KomorebiAI/nllb-200-3.3B-int8-ct2

Translation • Updated Dec 19, 2024 • 9 • 3
KomorebiAI/nllb-200-3.3B-float16-ct2

Translation • Updated Dec 19, 2024 • 8 • 3
KomorebiAI/nllb-200-3.3B-ct2

Translation • Updated Dec 19, 2024 • 1 • 2
KomorebiAI/nllb-200-1.3B-int8-ct2

Translation • Updated Dec 19, 2024 • 9 • 2

Papers: MoE/Ensemble

Papers related to Mixture of Experts topics.

QMoE: Practical Sub-1-Bit Compression of Trillion-Parameter Models

Paper • 2310.16795 • Published Oct 25, 2023 • 27
Ensemble-Instruct: Generating Instruction-Tuning Data with a Heterogeneous Mixture of LMs

Paper • 2310.13961 • Published Oct 21, 2023 • 5
The Consensus Game: Language Model Generation via Equilibrium Search

Paper • 2310.09139 • Published Oct 13, 2023 • 14
Large Language Model Cascades with Mixture of Thoughts Representations for Cost-efficient Reasoning

Paper • 2310.03094 • Published Oct 4, 2023 • 13

