
Built with Axolotl

See axolotl config

axolotl version: 0.13.0.dev0

# Axolotl config for NeuTTS Danish fine-tuning (FIXED)
# Key changes: more epochs, higher LR; sample_packing kept enabled (safe with flash_attention)

base_model: syvai/plapre-base
model_type: LlamaForCausalLM

# Pre-tokenized dataset
datasets:
  - path: syvai/danish-tts-voice-cloning-tokenized
    ds_type: json
    type:

val_set_size: 0.01

# Output
output_dir: ./outputs/neutts-danish-v2
dataset_prepared_path: last_run_prepared_v2

# Sequence length
sequence_len: 2048

# Training hyperparameters - adjusted
learning_rate: 1e-4
lr_scheduler: cosine
warmup_ratio: 0.03
num_epochs: 3
micro_batch_size: 2
gradient_accumulation_steps: 16

# Memory optimization
bf16: true
tf32: true
gradient_checkpointing: true

resume_from_checkpoint:
logging_steps: 10
flash_attention: true

# Optimizer
optimizer: adamw_bnb_8bit
weight_decay: 0.01

# Logging & saving
save_steps: 5000
eval_steps: 5000
save_total_limit: 3

# wandb
wandb_project: tts
wandb_entity:
wandb_watch:
wandb_name: neutts-danish-v2
wandb_log_model:

# Sample packing is OK with flash_attention: true
# Flash attention uses cu_seqlens to prevent cross-attention between packed samples
sample_packing: true
pad_to_sequence_len: false


special_tokens:
  eos_token: <|SPEECH_GENERATION_END|>
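The packing note in the config above can be sketched: FlashAttention's variable-length path takes an array of cumulative sample boundaries (`cu_seqlens`), so attention never crosses from one packed sample into the next. This is an illustrative sketch, not Axolotl's internal code; the helper name is hypothetical.

```python
# Illustrative sketch (not Axolotl internals): FlashAttention's varlen
# kernels take cumulative sample boundaries so attention stays within
# each packed sample. The helper name below is hypothetical.

def cu_seqlens(sample_lens):
    """[3, 5, 2] -> [0, 3, 8, 10]: boundaries of samples packed into one row."""
    cu = [0]
    for n in sample_lens:
        cu.append(cu[-1] + n)
    return cu

# Three samples packed into one 10-token sequence; tokens in sample i
# may only attend to positions in [cu[i], cu[i+1]).
print(cu_seqlens([3, 5, 2]))  # [0, 3, 8, 10]
```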

outputs/neutts-danish-v2

This model is a fine-tuned version of syvai/plapre-base on the syvai/danish-tts-voice-cloning-tokenized dataset. It achieves the following results on the evaluation set:

  • Loss: 6.9211
  • Ppl: 1013.4261
  • Memory/max active (GiB): 8.93
  • Memory/max allocated (GiB): 8.93
  • Memory/device reserved (GiB): 22.86

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0001
  • train_batch_size: 2
  • eval_batch_size: 2
  • seed: 42
  • gradient_accumulation_steps: 16
  • total_train_batch_size: 32
  • optimizer: 8-bit AdamW (bitsandbytes) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_steps: 716
  • training_steps: 23889
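The derived values above follow directly from the axolotl config; a quick arithmetic check (single GPU assumed, as implied by the reported total batch size of 32):

```python
# Checking the auto-reported hyperparameters against the config above.
micro_batch_size = 2
gradient_accumulation_steps = 16
num_gpus = 1  # assumption: implied by total_train_batch_size = 32

total_train_batch_size = micro_batch_size * gradient_accumulation_steps * num_gpus
print(total_train_batch_size)  # 32, as reported

warmup_ratio = 0.03
training_steps = 23889
warmup_steps = int(warmup_ratio * training_steps)
print(warmup_steps)  # 716, as reported
```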

Training results

| Training Loss | Epoch  | Step  | Validation Loss | Ppl       | Active (GiB) | Allocated (GiB) | Reserved (GiB) |
|---------------|--------|-------|-----------------|-----------|--------------|-----------------|----------------|
| No log        | 0      | 0     | 8.6548          | 5737.6073 | 7.85         | 7.85            | 22.75          |
| 7.0324        | 0.6278 | 5000  | 7.0311          | 1131.2746 | 8.93         | 8.93            | 22.86          |
| 6.9324        | 1.2555 | 10000 | 6.9570          | 1050.5248 | 8.93         | 8.93            | 22.86          |
| 6.9077        | 1.8833 | 15000 | 6.9266          | 1018.9874 | 8.93         | 8.93            | 22.86          |
| 6.8949        | 2.5110 | 20000 | 6.9211          | 1013.4261 | 8.93         | 8.93            | 22.86          |
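The Ppl column is simply the exponential of the validation cross-entropy loss; for the final checkpoint:

```python
import math

# Perplexity = exp(cross-entropy loss); the reported Ppl columns
# match exp(validation loss) up to rounding of the logged loss.
final_val_loss = 6.9211
print(math.exp(final_val_loss))  # ≈ 1013.43 (reported: 1013.4261)
```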

Framework versions

  • Transformers 4.57.6
  • Pytorch 2.9.1+cu128
  • Datasets 4.5.0
  • Tokenizers 0.22.2
  • Format: Safetensors
  • Model size: 0.2B params
  • Tensor type: BF16

Model tree for syvai/plapre-voice-clone

  • Base model: syvai/plapre-base (this model is a direct fine-tune)
  • Training dataset: syvai/danish-tts-voice-cloning-tokenized