---
license: apache-2.0
language:
- es
tags:
- sentence-transformers
- style-embeddings
- stylometry
- spanish
- contrastive-learning
base_model: StyleDistance/mstyledistance
datasets:
- cespinr/SynthSTEL-ES
pretty_name: StyleECU
---

# StyleECU

**StyleECU** is a style embedding model for Spanish, obtained by fine-tuning
[mStyleDistance](https://huggingface.co/StyleDistance/mstyledistance) on
[SynthSTEL-ES](https://huggingface.co/datasets/cespinr/SynthSTEL-ES),
a purpose-built Spanish contrastive dataset of 51,400 triplets covering 71 stylistic dimensions.

## Model Description

StyleECU specializes the mStyleDistance embedding space toward stylistic phenomena most relevant to Spanish,
including dialectal variation (*voseo/tuteo*), expressive morphology, syntactic complexity, and digital style.

## Training

- **Base model:** `StyleDistance/mstyledistance`
- **Training objective:** TripletLoss (contrastive learning)
- **Dataset:** [cespinr/SynthSTEL-ES](https://huggingface.co/datasets/cespinr/SynthSTEL-ES)
- **Training size:** 51,400 triplets
- **Epochs:** 2

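The triplet objective named above can be illustrated with a minimal, dependency-free sketch. It assumes the usual triplet convention (an anchor, a positive that shares the anchor's style, and a negative that differs in style), Euclidean distance, and a margin of 5.0. The card does not state the distance metric or margin used to train StyleECU, so both are assumptions, and the toy vectors are made up for illustration.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin=5.0):
    """Hinge-style triplet loss: max(d(a, p) - d(a, n) + margin, 0).

    The margin value is an assumption chosen for illustration only.
    """
    return max(euclidean(anchor, positive) - euclidean(anchor, negative) + margin, 0.0)


# Toy 2-D "embeddings" (made up; real embeddings are high-dimensional).
anchor = [0.0, 0.0]
positive = [3.0, 4.0]  # same style as the anchor, distance 5
negative = [6.0, 8.0]  # different style, distance 10

print(triplet_loss(anchor, positive, negative))              # 0.0: negative is far enough away
print(triplet_loss(anchor, positive, negative, margin=7.0))  # 2.0: margin not yet satisfied
```

The loss reaches zero once the negative sits at least `margin` farther from the anchor than the positive does, which is what pushes same-style texts together and different-style texts apart.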

## Usage

```python
from sentence_transformers import SentenceTransformer

# Load the fine-tuned model from the Hugging Face Hub
model = SentenceTransformer("cespinr/StyleECU")
# encode() returns one embedding vector per input text
embeddings = model.encode(["Tu texto aquí"])
```

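To compare the style of two texts, one option is to take the cosine similarity of their embeddings. The sketch below assumes cosine similarity is a suitable metric (the card does not specify one). `style_similarity` is a hypothetical helper, not part of any library, and the example sentences are illustrative.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return float(dot / (norm_u * norm_v))


def style_similarity(model, text_a, text_b):
    """Encode two texts with `model` and compare their embeddings.

    `model` is any object with an `encode(list_of_texts)` method that
    returns one vector per text, e.g. a SentenceTransformer.
    """
    emb_a, emb_b = model.encode([text_a, text_b])
    return cosine_similarity(emb_a, emb_b)


# Usage (needs sentence-transformers installed and access to the Hub):
# from sentence_transformers import SentenceTransformer
# model = SentenceTransformer("cespinr/StyleECU")
# print(style_similarity(model, "¿Vos querés venir mañana?", "¿Tú quieres venir mañana?"))
```

The commented usage contrasts a *voseo* sentence with its *tuteo* counterpart, one of the dialectal dimensions mentioned in the model description.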

## Evaluation

Evaluated on PAN author profiling tasks (Spanish):

| Task | Base (mStyleDistance) | StyleECU | Δ |
|------|----------------------|----------|---|
| PAN 2018 – Gender prediction | baseline | +3 pp | +3 pp |
| PAN 2021 – Hate speech spreaders | 0.70 | 0.81 | +11 pp |

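The Δ column is an absolute difference in percentage points, not a relative improvement. The short sketch below recomputes it for the PAN 2021 row, assuming the scores are fractions in [0, 1] (the card does not name the metric). `pp_delta` and `relative_gain` are hypothetical helpers introduced for illustration. The PAN 2018 row reports only the difference, so it cannot be recomputed this way.

```python
def pp_delta(base_score, new_score):
    """Absolute difference in percentage points between two scores in [0, 1]."""
    return round((new_score - base_score) * 100, 1)


def relative_gain(base_score, new_score):
    """Relative improvement over the base score, in percent."""
    return round((new_score - base_score) / base_score * 100, 1)


# PAN 2021 row: base 0.70, StyleECU 0.81
print(pp_delta(0.70, 0.81))       # 11.0 percentage points
print(relative_gain(0.70, 0.81))  # 15.7 percent relative to the base
```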

## Authors

**César Espín-Riofrio**: Researcher, University of Guayaquil, Ecuador, and
SINAI, University of Jaén, Spain;
Director, Research Project FCI-036-2023, University of Guayaquil, Ecuador

**Arturo Montejo-Ráez**: Researcher, SINAI, University of Jaén, Spain

**Steven Ramírez-Gurumendi, Gabriel Delgado-Gómez**:
University of Guayaquil, Ecuador (Research Project FCI-036-2023)

## Citation

If you use this model, please cite:

*Paper under review. Citation will be updated upon publication.*

```bibtex
@misc{espinriofrio2026stylecu,
  author = {Espín-Riofrio, César and Montejo-Ráez, Arturo and
            Ramírez-Gurumendi, Steven and Delgado-Gómez, Gabriel},
  title  = {StyleECU: A Spanish Style Embedding Model},
  year   = {2026},
  url    = {https://huggingface.co/cespinr/StyleECU}
}
```