
PL-BERT Valenciano

PL-BERT model trained for Valencian/Catalan speech synthesis, designed for use with StyleTTS2.

Description

This model is an AlbertModel trained with a dual architecture:

  • Encoder: AlbertModel (this model)
  • mask_predictor: masked-phoneme prediction head (discarded after training)
  • word_predictor: word prediction head supervised with RoBERTa-ca (discarded after training)

Configuration

Parameter            Value
vocab_size           178
hidden_size          768
num_hidden_layers    12
num_attention_heads  12
intermediate_size    2048
embedding_size       128 (AlbertModel default)
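These settings reproduce the published model size (6.29M parameters) arithmetically: ALBERT factorizes the embedding (178×128 plus a 128→768 projection) and shares a single transformer layer across all 12 layers. A quick sanity check in plain Python, assuming Hugging Face AlbertModel defaults (max_position_embeddings=512, type_vocab_size=2, a pooler head):

```python
# Rough ALBERT parameter count for this configuration.
# Assumptions: 512 position embeddings, 2 token types, one shared
# (cross-layer) transformer block, and a pooler head.
V, E, H, I = 178, 128, 768, 2048

embeddings = V * E + 512 * E + 2 * E + 2 * E    # word + position + type + LayerNorm
projection = E * H + H                          # 128 -> 768 embedding projection
attention  = 4 * (H * H + H) + 2 * H            # Q, K, V, output dense + LayerNorm
ffn        = (H * I + I) + (I * H + H) + 2 * H  # up-projection, down-projection, LayerNorm
pooler     = H * H + H

total = embeddings + projection + attention + ffn + pooler
print(f"{total:,} parameters (~{total / 1e6:.2f}M)")  # → 6,292,480 parameters (~6.29M)
```

The total matches the 6.29M figure reported on the hub, which is consistent with these hyperparameters being the complete configuration.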

Training

  • Dataset: Corts Valencianes (~89,331 samples)
  • Steps: 50,000
  • Batch size: 32
  • Semantic supervision: RoBERTa-ca (projecte-aina/roberta-base-ca-v2)
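PL-BERT-style training masks whole words at the phoneme level and asks the encoder to recover both the phonemes (mask_predictor) and the word identity (word_predictor, here supervised by RoBERTa-ca). A minimal sketch of the masking step only; MASK_ID, the masking probability, and the example token ids are illustrative, not this model's real vocabulary:

```python
import random

MASK_ID = 4       # hypothetical mask-token id
MASK_PROB = 0.15  # hypothetical fraction of words to mask

def mask_words(phoneme_ids, word_spans, rng):
    """Mask every phoneme of randomly chosen words.

    phoneme_ids: list of phoneme token ids
    word_spans:  list of (start, end) index ranges, one per word
    Returns (masked inputs, labels), where labels is -100 (ignored by the
    loss) except at masked positions, which keep the original phoneme id.
    """
    inputs = list(phoneme_ids)
    labels = [-100] * len(phoneme_ids)
    for start, end in word_spans:
        if rng.random() < MASK_PROB:
            for i in range(start, end):
                labels[i] = inputs[i]
                inputs[i] = MASK_ID
    return inputs, labels

rng = random.Random(1)
inputs, labels = mask_words([10, 11, 12, 13, 14, 15], [(0, 2), (2, 4), (4, 6)], rng)
# With this seed, the first word (ids 10, 11) is masked.
```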

Metrics

Metric                 Value
Perplexity             5.93
Word Accuracy (Top-1)  97.23%
Word Accuracy (Top-5)  99.18%
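Perplexity is the exponential of the mean cross-entropy loss, so the reported 5.93 corresponds to a masked-phoneme loss of about 1.78 nats. A quick check, assuming a natural-log cross-entropy:

```python
import math

perplexity = 5.93
loss = math.log(perplexity)   # mean cross-entropy in nats
print(f"loss ≈ {loss:.3f}")   # → loss ≈ 1.780
```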

Usage with StyleTTS2

from transformers import AlbertModel

class CustomAlbert(AlbertModel):
    """AlbertModel wrapper that returns only the last hidden state,
    the tensor StyleTTS2 consumes as phoneme-level BERT features."""
    def forward(self, *args, **kwargs):
        outputs = super().forward(*args, **kwargs)
        return outputs.last_hidden_state

# Load the pretrained model
model = CustomAlbert.from_pretrained("javiimts/plbert-valenciano")
model.eval()

License

Apache 2.0

Model size: 6.29M parameters (F32, Safetensors)