
Aalisha Dalal

aalisha

AI & ML interests

* Computer Vision * Deep Learning

Recent Activity

updated a dataset about 2 months ago
aalisha/srmdtranslations
published a dataset about 2 months ago
aalisha/srmdtranslations
reacted to reach-vb's post with 👀 over 1 year ago
Smol TTS models are here! OuteTTS-0.1-350M - Zero shot voice cloning, built on LLaMa architecture, CC-BY license! 🔥

> Pure language modeling approach to TTS
> Zero-shot voice cloning
> LLaMa architecture w/ Audio tokens (WavTokenizer)
> BONUS: Works on-device w/ llama.cpp ⚡

Three-step approach to TTS:

> Audio tokenization using WavTokenizer (75 tok per second)
> CTC forced alignment for word-to-audio token mapping
> Structured prompt creation w/ transcription, duration, audio tokens

The model is extremely impressive for 350M parameters! Kudos to the OuteAI team on such a brilliant feat - I'd love to see this be applied on larger data and smarter backbones like SmolLM 🤗

Check out the models here: https://huggingface.co/collections/OuteAI/outetts-6728aa71a53a076e4ba4817c

Organizations

Shrimad Rajchandra Mission Dharampur, SRMD

aalisha's models

None public yet