ChaangHaan's Collections: Music
aMUSEd: An Open MUSE Reproduction
Paper
• 2401.01808
• Published • 31
From Audio to Photoreal Embodiment: Synthesizing Humans in Conversations
Paper
• 2401.01885
• Published • 28
SteinDreamer: Variance Reduction for Text-to-3D Score Distillation via
Stein Identity
Paper
• 2401.00604
• Published • 6
LARP: Language-Agent Role Play for Open-World Games
Paper
• 2312.17653
• Published • 33
Learning Vision from Models Rivals Learning Vision from Data
Paper
• 2312.17742
• Published • 16
TinyGPT-V: Efficient Multimodal Large Language Model via Small Backbones
Paper
• 2312.16862
• Published • 31
City-on-Web: Real-time Neural Rendering of Large-scale Scenes on the Web
Paper
• 2312.16457
• Published • 15
InsActor: Instruction-driven Physics-based Characters
Paper
• 2312.17135
• Published • 10
PanGu-Draw: Advancing Resource-Efficient Text-to-Image Synthesis with
Time-Decoupled Training and Reusable Coop-Diffusion
Paper
• 2312.16486
• Published • 7
SSR-Encoder: Encoding Selective Subject Representation for
Subject-Driven Generation
Paper
• 2312.16272
• Published • 7
Prompt Expansion for Adaptive Text-to-Image Generation
Paper
• 2312.16720
• Published • 5
SOLAR 10.7B: Scaling Large Language Models with Simple yet Effective
Depth Up-Scaling
Paper
• 2312.15166
• Published • 61
Make-A-Character: High Quality Text-to-3D Character Generation within
Minutes
Paper
• 2312.15430
• Published • 28
UniRef++: Segment Every Reference Object in Spatial and Temporal Spaces
Paper
• 2312.15715
• Published • 20
LangSplat: 3D Language Gaussian Splatting
Paper
• 2312.16084
• Published • 16
One-dimensional Adapter to Rule Them All: Concepts, Diffusion Models and
Erasing Applications
Paper
• 2312.16145
• Published • 10
Supervised Knowledge Makes Large Language Models Better In-context
Learners
Paper
• 2312.15918
• Published • 9
VCoder: Versatile Vision Encoders for Multimodal Large Language Models
Paper
• 2312.14233
• Published • 16
InternVL: Scaling up Vision Foundation Models and Aligning for Generic
Visual-Linguistic Tasks
Paper
• 2312.14238
• Published • 20
Pangu-Agent: A Fine-Tunable Generalist Agent with Structured Reasoning
Paper
• 2312.14878
• Published • 15
Generative AI Beyond LLMs: System Implications of Multi-Modal Generation
Paper
• 2312.14385
• Published • 7
Shai: A large language model for asset management
Paper
• 2312.14203
• Published • 6
LLM4VG: Large Language Models Evaluation for Video Grounding
Paper
• 2312.14206
• Published • 3
DreamTuner: Single Image is Enough for Subject-Driven Generation
Paper
• 2312.13691
• Published • 27
Paint3D: Paint Anything 3D with Lighting-Less Texture Diffusion Models
Paper
• 2312.13913
• Published • 24
Time is Encoded in the Weights of Finetuned Language Models
Paper
• 2312.13401
• Published • 20
PIA: Your Personalized Image Animator via Plug-and-Play Modules in
Text-to-Image Models
Paper
• 2312.13964
• Published • 19
HD-Painter: High-Resolution and Prompt-Faithful Text-Guided Image
Inpainting with Diffusion Models
Paper
• 2312.14091
• Published • 17
TinySAM: Pushing the Envelope for Efficient Segment Anything Model
Paper
• 2312.13789
• Published • 15
Carve3D: Improving Multi-view Reconstruction Consistency for Diffusion
Models with RL Finetuning
Paper
• 2312.13980
• Published • 14
Neural feels with neural fields: Visuo-tactile perception for in-hand
manipulation
Paper
• 2312.13469
• Published • 11
Align Your Gaussians: Text-to-4D with Dynamic 3D Gaussians and Composed
Diffusion Models
Paper
• 2312.13763
• Published • 10
ShowRoom3D: Text to High-Quality 3D Room Generation Using 3D Priors
Paper
• 2312.13324
• Published • 11
Unlocking Pre-trained Image Backbones for Semantic Image Synthesis
Paper
• 2312.13314
• Published • 8
HeadCraft: Modeling High-Detail Shape Variations for Animated 3DMMs
Paper
• 2312.14140
• Published • 7
PowerInfer: Fast Large Language Model Serving with a Consumer-grade GPU
Paper
• 2312.12456
• Published • 45
Generative Multimodal Models are In-Context Learners
Paper
• 2312.13286
• Published • 36
Zero-Shot Metric Depth with a Field-of-View Conditioned Diffusion Model
Paper
• 2312.13252
• Published • 27
InstructVideo: Instructing Video Diffusion Models with Human Feedback
Paper
• 2312.12490
• Published • 19
Cached Transformers: Improving Transformers with Differentiable Memory
Cache
Paper
• 2312.12742
• Published • 13
Repaint123: Fast and High-quality One Image to 3D Generation with
Progressive Controllable 2D Repainting
Paper
• 2312.13271
• Published • 5
LLM in a flash: Efficient Large Language Model Inference with Limited
Memory
Paper
• 2312.11514
• Published • 264
StarVector: Generating Scalable Vector Graphics Code from Images
Paper
• 2312.11556
• Published • 38
3D-LFM: Lifting Foundation Model
Paper
• 2312.11894
• Published • 15
HAAR: Text-Conditioned Generative Model of 3D Strand-based Human
Hairstyles
Paper
• 2312.11666
• Published • 13
Jack of All Tasks, Master of Many: Designing General-purpose
Coarse-to-Fine Vision-Language Model
Paper
• 2312.12423
• Published • 13
MixRT: Mixed Neural Representations For Real-Time NeRF Rendering
Paper
• 2312.11841
• Published • 11
Tracking Any Object Amodally
Paper
• 2312.12433
• Published • 12
FastSR-NeRF: Improving NeRF Efficiency on Consumer Devices with A Simple
Super-Resolution Pipeline
Paper
• 2312.11537
• Published • 8
TIP: Text-Driven Image Processing with Semantic and Restoration
Instructions
Paper
• 2312.11595
• Published • 6
Text-Conditioned Resampler For Long Form Video Understanding
Paper
• 2312.11897
• Published • 6
Topic-VQ-VAE: Leveraging Latent Codebooks for Flexible Topic-Guided
Document Generation
Paper
• 2312.11532
• Published • 6
Customize-It-3D: High-Quality 3D Creation from A Single Image Using
Subject-Specific Knowledge Prior
Paper
• 2312.11535
• Published • 7
Towards Accurate Guided Diffusion Sampling through Symplectic Adjoint
Method
Paper
• 2312.12030
• Published • 6
VecFusion: Vector Font Generation with Diffusion
Paper
• 2312.10540
• Published • 22
Rich Human Feedback for Text-to-Image Generation
Paper
• 2312.10240
• Published • 20
SCEdit: Efficient and Controllable Image Diffusion Generation via Skip
Connection Editing
Paper
• 2312.11392
• Published • 20
G-LLaVA: Solving Geometric Problem with Multi-Modal Large Language Model
Paper
• 2312.11370
• Published • 20
M3DBench: Let's Instruct Large Models with Multi-modal 3D Prompts
Paper
• 2312.10763
• Published • 19
GAvatar: Animatable 3D Gaussian Avatars with Implicit Mesh Learning
Paper
• 2312.11461
• Published • 20
MagicScroll: Nontypical Aspect-Ratio Image Generation for Visual
Storytelling via Multi-Layered Semantic-Aware Denoising
Paper
• 2312.10899
• Published • 15
MAG-Edit: Localized Image Editing in Complex Scenarios via Mask-Based
Attention-Adjusted Guidance
Paper
• 2312.11396
• Published • 11
Cascade Speculative Drafting for Even Faster LLM Inference
Paper
• 2312.11462
• Published • 10
Silkie: Preference Distillation for Large Visual Language Models
Paper
• 2312.10665
• Published • 11
VidToMe: Video Token Merging for Zero-Shot Video Editing
Paper
• 2312.10656
• Published • 11
ProTIP: Progressive Tool Retrieval Improves Planning
Paper
• 2312.10332
• Published • 8
Your Student is Better Than Expected: Adaptive Teacher-Student
Collaboration for Text-Conditional Diffusion Models
Paper
• 2312.10835
• Published • 7
VolumeDiffusion: Flexible Text-to-3D Generation with Efficient
Volumetric Encoder
Paper
• 2312.11459
• Published • 6
GauFRe: Gaussian Deformation Fields for Real-time Dynamic Novel View
Synthesis
Paper
• 2312.11458
• Published • 5
Amphion: An Open-Source Audio, Music and Speech Generation Toolkit
Paper
• 2312.09911
• Published • 55
ReST meets ReAct: Self-Improvement for Multi-Step Reasoning LLM Agent
Paper
• 2312.10003
• Published • 44
DreamTalk: When Expressive Talking Head Generation Meets Diffusion
Probabilistic Models
Paper
• 2312.09767
• Published • 27
MobileSAMv2: Faster Segment Anything to Everything
Paper
• 2312.09579
• Published • 24
Point Transformer V3: Simpler, Faster, Stronger
Paper
• 2312.10035
• Published • 23
Weight subcloning: direct initialization of transformers using larger
pretrained ones
Paper
• 2312.09299
• Published • 18
Faster Diffusion: Rethinking the Role of UNet Encoder in Diffusion
Models
Paper
• 2312.09608
• Published • 16
Self-Evaluation Improves Selective Generation in Large Language Models
Paper
• 2312.09300
• Published • 16
Stable Score Distillation for High-Quality 3D Generation
Paper
• 2312.09305
• Published • 10
Faithful Persona-based Conversational Dataset Generation with Large
Language Models
Paper
• 2312.10007
• Published • 11
StemGen: A music generation model that listens
Paper
• 2312.08723
• Published • 48
TinyGSM: achieving >80% on GSM8k with small language models
Paper
• 2312.09241
• Published • 40
CogAgent: A Visual Language Model for GUI Agents
Paper
• 2312.08914
• Published • 31
VideoLCM: Video Latent Consistency Model
Paper
• 2312.09109
• Published • 23
A Picture is Worth More Than 77 Text Tokens: Evaluating CLIP-Style
Models on Dense Captions
Paper
• 2312.08578
• Published • 20
Pixel Aligned Language Models
Paper
• 2312.09237
• Published • 16
SEEAvatar: Photorealistic Text-to-3D Avatar Generation with Constrained
Geometry and Appearance
Paper
• 2312.08889
• Published • 15
Vision-Language Models as a Source of Rewards
Paper
• 2312.09187
• Published • 12
FineControlNet: Fine-level Text Control for Image Generation with
Spatially Aligned Text Control Injection
Paper
• 2312.09252
• Published • 12
Holodeck: Language Guided Generation of 3D Embodied AI Environments
Paper
• 2312.09067
• Published • 15
LIME: Localized Image Editing via Attention Regularization in Diffusion
Models
Paper
• 2312.09256
• Published • 10
General Object Foundation Model for Images and Videos at Scale
Paper
• 2312.09158
• Published • 11
UniDream: Unifying Diffusion Priors for Relightable Text-to-3D
Generation
Paper
• 2312.08754
• Published • 11
VL-GPT: A Generative Pre-trained Transformer for Vision and Language
Understanding and Generation
Paper
• 2312.09251
• Published • 10
SHAP-EDITOR: Instruction-guided Latent 3D Editing in Seconds
Paper
• 2312.09246
• Published • 8
SwitchHead: Accelerating Transformers with Mixture-of-Experts Attention
Paper
• 2312.07987
• Published • 41
Distributed Inference and Fine-tuning of Large Language Models Over The
Internet
Paper
• 2312.08361
• Published • 27
CLIP as RNN: Segment Countless Visual Concepts without Training Endeavor
Paper
• 2312.07661
• Published • 18
Foundation Models in Robotics: Applications, Challenges, and the Future
Paper
• 2312.07843
• Published • 16
Invariant Graph Transformer
Paper
• 2312.07859
• Published • 9
FoundationPose: Unified 6D Pose Estimation and Tracking of Novel Objects
Paper
• 2312.08344
• Published • 13
ProNeRF: Learning Efficient Projection-Aware Ray Sampling for
Fine-Grained Implicit Neural Radiance Fields
Paper
• 2312.08136
• Published • 6
FreeInit: Bridging Initialization Gap in Video Diffusion Models
Paper
• 2312.07537
• Published • 27
VILA: On Pre-training for Visual Language Models
Paper
• 2312.07533
• Published • 21
FreeControl: Training-Free Spatial Control of Any Text-to-Image
Diffusion Model with Any Condition
Paper
• 2312.07536
• Published • 18
Interfacing Foundation Models' Embeddings
Paper
• 2312.07532
• Published • 11
CCM: Adding Conditional Controls to Text-to-Image Consistency Models
Paper
• 2312.06971
• Published • 12
Steering Llama 2 via Contrastive Activation Addition
Paper
• 2312.06681
• Published • 14
Honeybee: Locality-enhanced Projector for Multimodal LLM
Paper
• 2312.06742
• Published • 13
Fast Training of Diffusion Transformer with Extreme Masking for 3D Point
Clouds Generation
Paper
• 2312.07231
• Published • 10
PEEKABOO: Interactive Video Generation via Masked-Diffusion
Paper
• 2312.07509
• Published • 11
"I Want It That Way": Enabling Interactive Decision Support Using Large
Language Models and Constraint Programming
Paper
• 2312.06908
• Published • 8
LLM360: Towards Fully Transparent Open-Source LLMs
Paper
• 2312.06550
• Published • 57
Sherpa3D: Boosting High-Fidelity Text-to-3D Generation via Coarse 3D
Prior
Paper
• 2312.06655
• Published • 24
Photorealistic Video Generation with Diffusion Models
Paper
• 2312.06662
• Published • 24
Vary: Scaling up the Vision Vocabulary for Large Vision-Language Models
Paper
• 2312.06109
• Published • 21
Context Tuning for Retrieval Augmented Generation
Paper
• 2312.05708
• Published • 16
From Text to Motion: Grounding GPT-4 in a Humanoid Robot "Alter3"
Paper
• 2312.06571
• Published • 13
Efficient Quantization Strategies for Latent Diffusion Models
Paper
• 2312.05431
• Published • 11
Federated Full-Parameter Tuning of Billion-Sized Language Models with
Communication Cost under 18 Kilobytes
Paper
• 2312.06353
• Published • 7
Evaluation of Large Language Models for Decision Making in Autonomous
Driving
Paper
• 2312.06351
• Published • 6
Using Captum to Explain Generative Language Models
Paper
• 2312.05491
• Published • 4
TCNCA: Temporal Convolution Network with Chunked Attention for Scalable
Sequence Processing
Paper
• 2312.05605
• Published • 4
DreaMoving: A Human Dance Video Generation Framework based on Diffusion
Models
Paper
• 2312.05107
• Published • 39
ECLIPSE: A Resource-Efficient Text-to-Image Prior for Image Generations
Paper
• 2312.04655
• Published • 21
Text-to-3D Generation with Bidirectional Diffusion using both 2D and 3D
priors
Paper
• 2312.04963
• Published • 17
Customizing Motion in Text-to-Video Diffusion Models
Paper
• 2312.04966
• Published • 11
PathFinder: Guided Search over Multi-Step Reasoning Paths
Paper
• 2312.05180
• Published • 10
MVDD: Multi-View Depth Diffusion Models
Paper
• 2312.04875
• Published • 10
EE-LLM: Large-Scale Training and Inference of Early-Exit Large Language
Models with 3D Parallelism
Paper
• 2312.04916
• Published • 7
Localized Symbolic Knowledge Distillation for Visual Commonsense Models
Paper
• 2312.04837
• Published • 3
Alpha-CLIP: A CLIP Model Focusing on Wherever You Want
Paper
• 2312.03818
• Published • 34
Chain of Code: Reasoning with a Language Model-Augmented Code Emulator
Paper
• 2312.04474
• Published • 34
Controllable Human-Object Interaction Synthesis
Paper
• 2312.03913
• Published • 23
AnimateZero: Video Diffusion Models are Zero-Shot Image Animators
Paper
• 2312.03793
• Published • 18
PhotoMaker: Customizing Realistic Human Photos via Stacked ID Embedding
Paper
• 2312.04461
• Published • 62
Pearl: A Production-ready Reinforcement Learning Agent
Paper
• 2312.03814
• Published • 15
Smooth Diffusion: Crafting Smooth Latent Spaces in Diffusion Models
Paper
• 2312.04410
• Published • 15
GenTron: Delving Deep into Diffusion Transformers for Image and Video
Generation
Paper
• 2312.04557
• Published • 13
NeRFiller: Completing Scenes via Generative 3D Inpainting
Paper
• 2312.04560
• Published • 13
Large Language Models for Mathematicians
Paper
• 2312.04556
• Published • 12
Gen2Det: Generate to Detect
Paper
• 2312.04566
• Published • 10
Hierarchical Spatio-temporal Decoupling for Text-to-Video Generation
Paper
• 2312.04483
• Published • 7
Efficient Monotonic Multihead Attention
Paper
• 2312.04515
• Published • 8
Generating Illustrated Instructions
Paper
• 2312.04552
• Published • 9
LEGO: Learning EGOcentric Action Frame Generation via Visual Instruction
Tuning
Paper
• 2312.03849
• Published • 8
Schrodinger Bridges Beat Diffusion Models on Text-to-Speech Synthesis
Paper
• 2312.03491
• Published • 34
Relightable Gaussian Codec Avatars
Paper
• 2312.03704
• Published • 32
Gaussian Head Avatar: Ultra High-fidelity Head Avatar via Dynamic
Gaussians
Paper
• 2312.03029
• Published • 27
MotionCtrl: A Unified and Flexible Motion Controller for Video
Generation
Paper
• 2312.03641
• Published • 22
Cache Me if You Can: Accelerating Diffusion Models through Block Caching
Paper
• 2312.03209
• Published • 21
HiFi4G: High-Fidelity Human Performance Rendering via Compact Gaussian
Splatting
Paper
• 2312.03461
• Published • 17
Context Diffusion: In-Context Aware Image Generation
Paper
• 2312.03584
• Published • 15
LooseControl: Lifting ControlNet for Generalized Depth Conditioning
Paper
• 2312.03079
• Published • 15
DreamComposer: Controllable 3D Object Generation via Multi-View
Conditions
Paper
• 2312.03611
• Published • 8
MagicStick: Controllable Video Editing via Control Handle
Transformations
Paper
• 2312.03047
• Published • 11
Self-conditioned Image Generation via Generating Representations
Paper
• 2312.03701
• Published • 9
Generative agent-based modeling with actions grounded in physical,
social, or digital space using Concordia
Paper
• 2312.03664
• Published • 11
Language-Informed Visual Concept Learning
Paper
• 2312.03587
• Published • 8
X-Adapter: Adding Universal Compatibility of Plugins for Upgraded
Diffusion Model
Paper
• 2312.02238
• Published • 27
LivePhoto: Real Image Animation with Text-guided Motion Control
Paper
• 2312.02928
• Published • 18
Describing Differences in Image Sets with Natural Language
Paper
• 2312.02974
• Published • 15
Orthogonal Adaptation for Modular Customization of Diffusion Models
Paper
• 2312.02432
• Published • 14
DragVideo: Interactive Drag-style Video Editing
Paper
• 2312.02216
• Published • 12
MVHumanNet: A Large-scale Dataset of Multi-view Daily Dressing Human
Captures
Paper
• 2312.02963
• Published • 10
Fine-grained Controllable Video Generation via Object Appearance and
Context
Paper
• 2312.02919
• Published • 13
ReconFusion: 3D Reconstruction with Diffusion Priors
Paper
• 2312.02981
• Published • 10
Training Chain-of-Thought via Latent-Variable Inference
Paper
• 2312.02179
• Published • 9
Alchemist: Parametric Control of Material Properties with Diffusion
Models
Paper
• 2312.02970
• Published • 9
LLaVA-Grounding: Grounded Visual Chat with Large Multimodal Models
Paper
• 2312.02949
• Published • 14
GPT4Point: A Unified Framework for Point-Language Understanding and
Generation
Paper
• 2312.02980
• Published • 9
Generating Fine-Grained Human Motions Using ChatGPT-Refined Descriptions
Paper
• 2312.02772
• Published • 7
Magicoder: Source Code Is All You Need
Paper
• 2312.02120
• Published • 82
VMC: Video Motion Customization using Temporal Attention Adaption for
Text-to-Video Diffusion Models
Paper
• 2312.00845
• Published • 39
DeepCache: Accelerating Diffusion Models for Free
Paper
• 2312.00858
• Published • 23
Nash Learning from Human Feedback
Paper
• 2312.00886
• Published • 18
DiffiT: Diffusion Vision Transformers for Image Generation
Paper
• 2312.02139
• Published • 15
GPS-Gaussian: Generalizable Pixel-wise 3D Gaussian Splatting for
Real-time Human Novel View Synthesis
Paper
• 2312.02155
• Published • 14
Object Recognition as Next Token Prediction
Paper
• 2312.02142
• Published • 13
GIVT: Generative Infinite-Vocabulary Transformers
Paper
• 2312.02116
• Published • 12
Paper
• 2312.00860
• Published • 10
RLHF-V: Towards Trustworthy MLLMs via Behavior Alignment from
Fine-grained Correctional Human Feedback
Paper
• 2312.00849
• Published • 12
Style Aligned Image Generation via Shared Attention
Paper
• 2312.02133
• Published • 11
Generative Rendering: Controllable 4D-Guided Video Generation with 2D
Diffusion Models
Paper
• 2312.01409
• Published • 10
VideoRF: Rendering Dynamic Radiance Fields as 2D Feature Video Streams
Paper
• 2312.01407
• Published • 8
Customize your NeRF: Adaptive Source Driven 3D Scene Editing via
Local-Global Iterative Training
Paper
• 2312.01663
• Published • 6
Mamba: Linear-Time Sequence Modeling with Selective State Spaces
Paper
• 2312.00752
• Published • 150
Merlin: Empowering Multimodal LLMs with Foresight Minds
Paper
• 2312.00589
• Published • 27
VideoBooth: Diffusion-based Video Generation with Image Prompts
Paper
• 2312.00777
• Published • 24
SeaLLMs -- Large Language Models for Southeast Asia
Paper
• 2312.00738
• Published • 25
MoMask: Generative Masked Modeling of 3D Human Motions
Paper
• 2312.00063
• Published • 18
GraphDreamer: Compositional 3D Scene Synthesis from Scene Graphs
Paper
• 2312.00093
• Published • 17
HiFi Tuner: High-Fidelity Subject-Driven Fine-Tuning for Diffusion
Models
Paper
• 2312.00079
• Published • 17
Dolphins: Multimodal Language Model for Driving
Paper
• 2312.00438
• Published • 15
Instruction-tuning Aligns LLMs to the Human Brain
Paper
• 2312.00575
• Published • 15
StyleCrafter: Enhancing Stylized Text-to-Video Generation with Style
Adapter
Paper
• 2312.00330
• Published • 13
Scaffold-GS: Structured 3D Gaussians for View-Adaptive Rendering
Paper
• 2312.00109
• Published • 12
PyNeRF: Pyramidal Neural Radiance Fields
Paper
• 2312.00252
• Published • 11
Towards Accurate Differential Diagnosis with Large Language Models
Paper
• 2312.00164
• Published • 11
FSGS: Real-Time Few-shot View Synthesis using Gaussian Splatting
Paper
• 2312.00451
• Published • 12
Text-Guided 3D Face Synthesis -- From Generation to Editing
Paper
• 2312.00375
• Published • 11
X-Dreamer: Creating High-quality 3D Content by Bridging the Domain Gap
Between Text-to-2D and Text-to-3D Generation
Paper
• 2312.00085
• Published • 9
FusionFrames: Efficient Architectural Aspects for Text-to-Video
Generation Pipeline
Paper
• 2311.13073
• Published • 58
LucidDreamer: Domain-free Generation of 3D Gaussian Splatting Scenes
Paper
• 2311.13384
• Published • 53
ZipLoRA: Any Subject in Any Style by Effectively Merging LoRAs
Paper
• 2311.13600
• Published • 47
Diffusion Model Alignment Using Direct Preference Optimization
Paper
• 2311.12908
• Published • 49
Using Human Feedback to Fine-tune Diffusion Models without Any Reward
Model
Paper
• 2311.13231
• Published • 28
PG-Video-LLaVA: Pixel Grounding Large Video-Language Models
Paper
• 2311.13435
• Published • 18
Visual In-Context Prompting
Paper
• 2311.13601
• Published • 18
Diffusion360: Seamless 360 Degree Panoramic Image Generation based on
Diffusion Models
Paper
• 2311.13141
• Published • 16
MagicDance: Realistic Human Dance Video Generation with Motions & Facial
Expressions Transfer
Paper
• 2311.12052
• Published • 32
PhysGaussian: Physics-Integrated 3D Gaussians for Generative Dynamics
Paper
• 2311.12198
• Published • 22
NeuroPrompts: An Adaptive Framework to Optimize Prompts for
Text-to-Image Generation
Paper
• 2311.12229
• Published • 25
Exponentially Faster Language Modelling
Paper
• 2311.10770
• Published • 119
Make Pixels Dance: High-Dynamic Video Generation
Paper
• 2311.10982
• Published • 68
Orca 2: Teaching Small Language Models How to Reason
Paper
• 2311.11045
• Published • 77
System 2 Attention (is something you might need too)
Paper
• 2311.11829
• Published • 43
MultiLoRA: Democratizing LoRA for Better Multi-Task Learning
Paper
• 2311.11501
• Published • 37
Text-to-Sticker: Style Tailoring Latent Diffusion Models for Human
Expression
Paper
• 2311.10794
• Published • 27
AutoStory: Generating Diverse Storytelling Images with Minimal Human
Effort
Paper
• 2311.11243
• Published • 16
Drivable 3D Gaussian Avatars
Paper
• 2311.08581
• Published • 47
GRIM: GRaph-based Interactive narrative visualization for gaMes
Paper
• 2311.09213
• Published • 13
UNcommonsense Reasoning: Abductive Reasoning about Uncommon Situations
Paper
• 2311.08469
• Published • 11
PEARL: Personalizing Large Language Model Writing Assistants with
Generation-Calibrated Retrievers
Paper
• 2311.09180
• Published • 8
Fast Chain-of-Thought: A Glance of Future from Parallel Decoding Leads
to Answers Faster
Paper
• 2311.08263
• Published • 16