mnhatdaous committed
Commit f5d5446 · 1 Parent(s): 36fbe52

Refactor Dockerfile and requirements for improved dependency management and clarity

Files changed (4)
  1. Dockerfile +2 -2
  2. TRAINING_GUIDE.md +13 -3
  3. requirements-hf.txt +0 -14
  4. requirements.txt +7 -1
Dockerfile CHANGED
@@ -9,8 +9,8 @@ RUN apt-get update && apt-get install -y \
     curl \
     && rm -rf /var/lib/apt/lists/*
 
-# Copy requirements first for better caching
-COPY requirements-hf.txt ./requirements.txt
+# Copy requirements file first for better caching
+COPY requirements.txt ./requirements.txt
 
 # Install Python dependencies
 RUN pip install --no-cache-dir -r requirements.txt
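The caching rationale behind this change: copying the requirements file in its own layer before the rest of the source means the `pip install` layer is rebuilt only when the requirements file changes, not on every source edit. A minimal sketch of the pattern (base image and paths are illustrative, not taken from this repo):

```dockerfile
FROM python:3.10-slim

WORKDIR /app

# Copy only the dependency manifest first: this layer (and the pip
# layer below) stays cached as long as requirements.txt is unchanged.
COPY requirements.txt ./requirements.txt
RUN pip install --no-cache-dir -r requirements.txt

# Source-code changes invalidate only the layers from here down.
COPY . .
```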
TRAINING_GUIDE.md CHANGED
@@ -111,27 +111,31 @@ def synthesize_speech(text, speaker_id=0):
 
 ## 🎯 Training Configurations
 
-### For Different Environments:
+### For Different Environments
 
 1. **Local Development** (Single GPU):
+
 ```bash
 export CUDA_VISIBLE_DEVICES="0"
 python speech/train.py --config speech/config.yaml --model llm ...
 ```
 
 2. **Multi-GPU Training**:
+
 ```bash
 export CUDA_VISIBLE_DEVICES="0,1,2,3"
 torchrun --nproc_per_node=4 speech/train.py ...
 ```
 
 3. **Cloud Training** (Google Colab/Kaggle):
+
 ```python
 # Use config_hf.yaml for resource-constrained environments
 !python speech/train.py --config speech/config_hf.yaml ...
 ```
 
 4. **Hugging Face Spaces**:
+
 ```bash
 # For direct training on HF infrastructure
 python speech/train.py --config speech/config_hf.yaml --timeout 1800 ...
@@ -140,6 +144,7 @@ def synthesize_speech(text, speaker_id=0):
 ## 📊 Monitoring Training
 
 1. **Comet ML** (Recommended):
+
 ```bash
 # Set up Comet ML for experiment tracking
 export COMET_API_KEY="your_api_key"
@@ -147,11 +152,13 @@ def synthesize_speech(text, speaker_id=0):
 ```
 
 2. **Tensorboard**:
+
 ```bash
 tensorboard --logdir ./tensorboard
 ```
 
 3. **Command Line**:
+
 ```bash
 # Monitor log files
 tail -f checkpoints/llm/train.log
@@ -159,7 +166,7 @@ def synthesize_speech(text, speaker_id=0):
 
 ## 🔧 Troubleshooting
 
-### Common Issues:
+### Common Issues
 
 1. **Out of Memory**:
    - Reduce batch size in config
@@ -176,9 +183,10 @@ def synthesize_speech(text, speaker_id=0):
    - Verify data preprocessing
    - Use pretrained checkpoints
 
-### Performance Tips:
+### Performance Tips
 
 1. **Data Loading Optimization**:
+
 ```yaml
 # In config.yaml
 num_workers: 24
@@ -187,12 +195,14 @@ def synthesize_speech(text, speaker_id=0):
 ```
 
 2. **Memory Optimization**:
+
 ```bash
 # Use gradient checkpointing
 --use_amp --accum_grad 2
 ```
 
 3. **Speed Optimization**:
+
 ```bash
 # Compile model for faster training (PyTorch 2.0+)
 export TORCH_COMPILE=1
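The `--accum_grad 2` flag in the memory-optimization tip works because summing scaled micro-batch gradients before one optimizer step reproduces the full-batch update while only ever holding a smaller batch in memory. A hypothetical pure-Python stand-in (not this repo's training loop) for a squared-error loss illustrates the equivalence:

```python
# Sketch, assuming a toy linear model y ≈ w * x with loss 0.5 * (w*x - y)^2.

def grad(w, xs, ys):
    # Mean gradient of the loss over a batch.
    return sum((w * x - y) * x for x, y in zip(xs, ys)) / len(xs)

def step_full_batch(w, xs, ys, lr=0.1):
    # One SGD step on the whole batch at once.
    return w - lr * grad(w, xs, ys)

def step_accumulated(w, xs, ys, accum=2, lr=0.1):
    # Same step, but gradients are accumulated over `accum` micro-batches.
    n = len(xs) // accum
    g = 0.0
    for i in range(accum):
        mb_x, mb_y = xs[i * n:(i + 1) * n], ys[i * n:(i + 1) * n]
        g += grad(w, mb_x, mb_y) / accum  # scale so the sum is the batch mean
    return w - lr * g
```

Both functions produce the same update for an evenly divisible batch; the accumulated version just never materializes the full batch at once.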
requirements-hf.txt DELETED
@@ -1,14 +0,0 @@
-gradio==3.50.2
-torch==2.1.0
-torchaudio==2.1.0
-numpy==1.24.3
-soundfile==0.12.1
-librosa==0.10.1
-transformers==4.36.0
-omegaconf==2.3.0
-hydra-core==1.3.2
-
-# Optional: Add these if you need the full training pipeline
-# deepspeed==0.12.6
-# tensorboard==2.14.0
-# matplotlib==3.7.2
requirements.txt CHANGED
@@ -17,6 +17,7 @@ lightning==2.2.4
 matplotlib==3.7.5
 modelscope==1.20.0
 networkx==3.1
+numpy==1.24.3
 omegaconf==2.3.0
 onnx==1.16.0
 onnxruntime-gpu==1.18.0; sys_platform == 'linux'
@@ -41,4 +42,9 @@ wget==3.2
 flatten_dict
 julius
 importlib_resources
-randomname
+randomname
+
+# Optional: Add these if you need the full training pipeline
+# deepspeed==0.12.6
+# tensorboard==2.14.0
+# matplotlib==3.7.2