Collections
Discover the best community collections!
Collections trending this week
- Sparse Backpropagation for MoE Training
  Paper • 2310.00811 • Published • 2
- The Forward-Forward Algorithm: Some Preliminary Investigations
  Paper • 2212.13345 • Published • 5
- Fine-Tuning Language Models with Just Forward Passes
  Paper • 2305.17333 • Published • 4
- Towards Green AI in Fine-tuning Large Language Models via Adaptive Backpropagation
  Paper • 2309.13192 • Published • 1

- Concept-Oriented Deep Learning with Large Language Models
  Paper • 2306.17089 • Published • 1
- Extracting Mathematical Concepts with Large Language Models
  Paper • 2309.00642 • Published • 1
- An Image is Worth Multiple Words: Learning Object Level Concepts using Multi-Concept Prompt Learning
  Paper • 2310.12274 • Published • 13
- COPEN: Probing Conceptual Knowledge in Pre-trained Language Models
  Paper • 2211.04079 • Published • 1

- SSD-LM: Semi-autoregressive Simplex-based Diffusion Language Model for Text Generation and Modular Control
  Paper • 2210.17432 • Published • 2
- TESS: Text-to-Text Self-Conditioned Simplex Diffusion
  Paper • 2305.08379 • Published • 3
- Diffusion Language Models Can Perform Many Tasks with Scaling and Instruction-Finetuning
  Paper • 2308.12219 • Published • 1
- CodeFusion: A Pre-trained Diffusion Model for Code Generation
  Paper • 2310.17680 • Published • 74

- A Unified View of Long-Sequence Models towards Modeling Million-Scale Dependencies
  Paper • 2302.06218 • Published • 1
- ZeRO++: Extremely Efficient Collective Communication for Giant Model Training
  Paper • 2306.10209 • Published • 2
- SE-MoE: A Scalable and Efficient Mixture-of-Experts Distributed Training and Inference System
  Paper • 2205.10034 • Published • 1
- A Hybrid Tensor-Expert-Data Parallelism Approach to Optimize Mixture-of-Experts Training
  Paper • 2303.06318 • Published • 1