
ZhenYE

ZhenYe234
https://github.com/zhenye234
  • zhenye234

AI & ML interests

None yet

Organizations

HKUST Audio

upvoted a paper 7 months ago

SpaceVista: All-Scale Visual Spatial Reasoning from mm to km

Paper • 2510.09606 • Published Oct 10, 2025 • 18
upvoted 2 collections about 1 year ago

Canary-TTS

Collection
10 items • Updated Mar 2 • 3

Multimodal Reasoning

Collection
179 items • Updated Feb 7 • 40
upvoted a paper about 1 year ago

LLMVoX: Autoregressive Streaming Text-to-Speech Model for Any LLM

Paper • 2503.04724 • Published Mar 6, 2025 • 72
upvoted an article over 1 year ago
Article

From Llasa to Llasagna 🍕: Finetuning LLaSA to generates Italian speech and other languages

Steveeeeeeen
•
Feb 11, 2025
• 34
upvoted a paper over 1 year ago

Llasa: Scaling Train-Time and Inference-Time Compute for Llama-based Speech Synthesis

Paper • 2502.04128 • Published Feb 6, 2025 • 27
upvoted a collection over 1 year ago

Llasa

Collection
TTS foundation model compatible with Llama framework (160k hours tokenized speech data released) • 11 items • Updated May 11, 2025 • 21