---
datasets: proprietary
pipeline_tag: text-to-speech
---

# Maya1

**Maya1** is a speech model built for expressive voice generation with rich human emotion and precise voice design.

**What it does:**
- Voice design through natural language descriptions

---

## Why Maya1 is Different: Voice Design Features That Matter

### 1. Natural Language Voice Control
Describe voices like you would brief a voice actor:

---

## How to Use Maya1: Download and Run in Minutes

### Quick Start: Generate Voice with Emotions

```python
import torch
import soundfile as sf
from transformers import AutoModelForCausalLM, AutoTokenizer
from snac import SNAC

# Load the best open source voice AI model
model = AutoModelForCausalLM.from_pretrained(
    "maya-research/maya1",
    torch_dtype=torch.bfloat16,
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained("maya-research/maya1")

# Load SNAC audio decoder (24kHz)
snac_model = SNAC.from_pretrained("hubertsiuzdak/snac_24khz").eval().to("cuda")

# Design your voice with natural language
description = "Realistic male voice in his 30s with an American accent. Normal pitch, warm timbre, conversational pacing."
text = "Hello! This is Maya1 <laugh> the best open source voice AI model with emotions."

# Create prompt with voice design
prompt = f'<description="{description}"> {text}'

# (Tokenization, generation, and SNAC decoding to the `audio` array are
# omitted from this excerpt.)

# Save your emotional voice output
sf.write("output.wav", audio, 24000)
print("Voice generated successfully! Play output.wav")
```
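Because the prompt is plain text, batching many lines or voices is easy to script. The sketch below is illustrative only: `build_prompt` and `emotion_tags` are hypothetical helpers, not part of Maya1's tooling, and only `<laugh>` and `<cry>` are tags confirmed in this card; see emotions.txt for the full list.

```python
import re

def build_prompt(description: str, text: str) -> str:
    """Combine a voice description and emotion-tagged text into the prompt format above."""
    return f'<description="{description}"> {text}'

def emotion_tags(text: str) -> list[str]:
    """List simple inline tags like <laugh> or <cry> found in a script."""
    # \w+ deliberately skips the <description="..."> wrapper, which contains '=' and quotes
    return re.findall(r"<(\w+)>", text)

prompt = build_prompt(
    "Energetic female voice with a British accent.",
    "We actually won? <laugh> I can't believe it <cry> ... happy tears, I promise.",
)
print(emotion_tags(prompt))  # -> ['laugh', 'cry']
```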

### Advanced: Production Streaming with vLLM

For production deployments with real-time streaming, use our vLLM script:

**Download:** [vllm_streaming_inference.py](https://huggingface.co/maya-research/maya1/blob/main/vllm_streaming_inference.py)

**Key Features:**
- Automatic Prefix Caching (APC) for repeated voice descriptions

---

## Technical Excellence: What Makes Maya1 the Best

### Architecture: 3B-Parameter Llama Backbone for Voice

## Frequently Asked Questions

**Q: What makes Maya1 different?**
A: We're the only open source model offering 20+ emotions, zero-shot voice design, production-ready streaming, and 3B parameters, all in one package.

**Q: Can I use this commercially?**

## Comparison

| Feature | Maya1 | ElevenLabs | OpenAI TTS | Coqui TTS |
|---------|-------|------------|------------|-----------|
| **Open Source** | Yes | No | No | Yes |
| **Emotions** | 20+ | Limited | No | No |

```bash
# Clone the model repository
git lfs install
git clone https://huggingface.co/maya-research/maya1
```

```python
# Or load directly in Python
from transformers import AutoModelForCausalLM
model = AutoModelForCausalLM.from_pretrained("maya-research/maya1")
```

### Requirements

```bash
pip install torch transformers snac soundfile
```
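Before running the Quick Start, it can help to confirm the packages above are importable. A stdlib-only sketch (the `find_missing` helper is just an illustration, not part of this repo):

```python
import importlib.util

def find_missing(modules):
    """Return the subset of module names that cannot be found by the import system."""
    return [m for m in modules if importlib.util.find_spec(m) is None]

missing = find_missing(["torch", "transformers", "snac", "soundfile"])
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All Maya1 dependencies are installed.")
```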

### Additional Resources
- **Full emotion list:** [emotions.txt](https://huggingface.co/maya-research/maya1/blob/main/emotions.txt)
- **Prompt examples:** [prompt.txt](https://huggingface.co/maya-research/maya1/blob/main/prompt.txt)
- **Streaming script:** [vllm_streaming_inference.py](https://huggingface.co/maya-research/maya1/blob/main/vllm_streaming_inference.py)

---

## Citations & References

If you use Maya1 in your research or product, please cite:

```bibtex
@misc{maya1voice2025,
  title={Maya1: Open Source Voice AI with Emotional Intelligence},
  author={Maya Research},
  year={2025},
  publisher={Hugging Face},
  howpublished={\url{https://huggingface.co/maya-research/maya1}},
}
```