Upload folder using huggingface_hub

Files changed (3)

README.md ADDED

+---
+language:
+- en
+tags:
+- text-to-speech
+- tts
+- audio
+license: mit
+---
+# Simple TTS Model
+A lightweight Text-to-Speech model trained on the LJSpeech dataset.
+## Model Description
+This is a FastSpeech2-style TTS model with:
+- Transformer encoder for text encoding
+- Duration predictor
+- Transformer decoder for mel spectrogram generation
+## Training
+- Dataset: LJSpeech (5,000-sample subset)
+- Hardware: Kaggle T4 GPU
+- Training length: 20 epochs
+## Model Parameters
+- Total parameters: 5,168,465
+- Hidden dimension: 256
+- Number of layers: 3
+- Attention heads: 4
+## Usage
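+```python
+import json
+import torch
+# Load the hyperparameters and the checkpoint (mapped to CPU so no GPU is required)
+with open('config.json') as f:
+    config = json.load(f)
+checkpoint = torch.load('pytorch_model.bin', map_location='cpu')
+# Initialize the model class from the training code with `config`, then load the weights from `checkpoint`
+```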
+## Limitations
+This is a basic model for demonstration purposes. For production use, consider:
+- Training on more data
+- Adding a vocoder (e.g., HiFi-GAN) to convert the generated mel spectrograms into audio
+- Using phoneme-based input instead of characters
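
The README lists a duration predictor between the text encoder and the mel decoder but does not show how its output is used. In FastSpeech-style models the predicted durations usually drive a length regulator that repeats each token's encoder state once per mel frame. The sketch below illustrates that step in plain Python; `length_regulate` and the toy values are hypothetical and are not taken from this repository's training code, which may handle the step differently.

```python
from typing import List, Sequence


def length_regulate(encoder_states: Sequence[List[float]],
                    durations: Sequence[int]) -> List[List[float]]:
    """Repeat each per-token state by its duration, measured in mel frames."""
    if len(encoder_states) != len(durations):
        raise ValueError("need exactly one duration per input token")
    frames: List[List[float]] = []
    for state, n_frames in zip(encoder_states, durations):
        # Negative predictions are clamped to zero frames (assumption).
        frames.extend(list(state) for _ in range(max(0, int(n_frames))))
    return frames


# Three toy 2-dimensional character states with hypothetical durations.
states = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]
expanded = length_regulate(states, [2, 0, 3])
print(len(expanded))  # number of mel frames the decoder would receive
```

A token with duration 0 contributes no frames, so the decoder input length equals the sum of the durations rather than the number of input characters.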

config.json ADDED

+{
+  "model_type": "simple_tts",
+  "vocab_size": 60,
+  "d_model": 256,
+  "n_heads": 4,
+  "n_layers": 3,
+  "n_mels": 80,
+  "sample_rate": 22050,
+  "hop_length": 256,
+  "n_fft": 1024
+}
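
The audio settings in this config fix the time resolution of the mel spectrograms. The sketch below works through the arithmetic using the values above; `frame_stats` and `mel_shape` are hypothetical helper names, the analysis window is assumed to be as long as `n_fft`, and the frame count assumes a centered STFT (one frame per hop plus one), which may differ from the preprocessing actually used.

```python
import json

# Audio-related values copied from the config.json in this commit.
config = json.loads("""
{
  "n_mels": 80,
  "sample_rate": 22050,
  "hop_length": 256,
  "n_fft": 1024
}
""")


def frame_stats(cfg):
    """Return (hop in ms, frames per second, window in ms)."""
    hop_ms = 1000.0 * cfg["hop_length"] / cfg["sample_rate"]
    frames_per_second = cfg["sample_rate"] / cfg["hop_length"]
    # Assumption: the window length equals n_fft.
    window_ms = 1000.0 * cfg["n_fft"] / cfg["sample_rate"]
    return hop_ms, frames_per_second, window_ms


def mel_shape(num_samples, cfg):
    """Mel spectrogram shape (n_mels, n_frames) for a centered STFT."""
    n_frames = 1 + num_samples // cfg["hop_length"]
    return (cfg["n_mels"], n_frames)


hop_ms, fps, window_ms = frame_stats(config)
print(f"hop: {hop_ms:.2f} ms, frames/s: {fps:.2f}, window: {window_ms:.2f} ms")
print(mel_shape(config["sample_rate"], config))  # shape for one second of audio
```

Under these assumptions each mel frame advances about 11.61 ms, so one second of 22,050 Hz audio maps to roughly 86 frames of 80 mel bins.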

pytorch_model.bin ADDED

+version https://git-lfs.github.com/spec/v1
+oid sha256:6f4c1b52a4bb2b3f7c06c9d8209872a796cbf7d15fb95b190c30227a6ed41e45
+size 25827261
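
The three lines above are a Git LFS pointer, not the weights themselves: they record the SHA-256 digest and byte size of the real `pytorch_model.bin`. A minimal sketch for checking a downloaded copy against the pointer follows; `parse_lfs_pointer` and `matches_pointer` are hypothetical helper names, and the local file path is up to the caller.

```python
import hashlib
import os

# Pointer text copied from this commit.
POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:6f4c1b52a4bb2b3f7c06c9d8209872a796cbf7d15fb95b190c30227a6ed41e45
size 25827261
"""


def parse_lfs_pointer(text):
    """Parse the 'key value' lines of a Git LFS pointer into a dict."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {"version": fields["version"], "algo": algo,
            "digest": digest, "size": int(fields["size"])}


def matches_pointer(path, pointer, chunk_size=1 << 20):
    """Return True if the file at `path` has the pointer's size and digest."""
    if os.path.getsize(path) != pointer["size"]:
        return False
    hasher = hashlib.new(pointer["algo"])
    with open(path, "rb") as f:
        # Hash in chunks so large checkpoints are not read into memory at once.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]


pointer = parse_lfs_pointer(POINTER)
print(pointer["algo"], pointer["size"])
```

Calling `matches_pointer("pytorch_model.bin", pointer)` on a local download checks the size first and only then hashes the file. The recorded size (25,827,261 bytes) is larger than the 20,673,860 bytes that 5,168,465 float32 parameters alone would occupy, so the file presumably holds more than bare float32 weights; what exactly it contains is not stated in this commit.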