Upload folder using huggingface_hub
README.md
CHANGED
@@ -8,7 +8,7 @@ tags:
 - seamless
 - subtitle-editing-time-prediction
 library_name: transformers
-
+base_model: facebook/hf-seamless-m4t-medium
 ---
 
 # videoloc/seamless-basic
@@ -24,7 +24,6 @@ The model is built on top of Meta's SeamlessM4T and fine-tuned on a multimodal d
 - **Multimodal Processing**: Simultaneously processes audio (16kHz) and text inputs
 - **Frozen Encoders**: Uses pre-trained SeamlessM4T encoders (frozen for stability)
 - **TTE Prediction**: Predicts editing time required for subtitle segments
-- **Efficient Architecture**: Optimized for inference with gradient checkpointing support
 - **Direct Output**: Raw time values in seconds for immediate use
 
 ## Model Architecture
@@ -156,8 +155,6 @@ data = [
 - **Dataset Split**: 80/20 train/test
 - **Random Seed**: 42
 - **Metric**: RMSE (lower is better)
-- **Audio Caching**: Enabled with compression
-- **Workers**: 8
 
 ## Training Configuration
 
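
The evaluation settings that appear as README context in the last hunk (80/20 train/test split, random seed 42, RMSE as the metric) can be illustrated with a minimal sketch. This is not the repository's training code: the README excerpt does not say how the split was drawn, so the shuffle-then-cut approach and the helper names `split_80_20` and `rmse` are assumptions made for illustration, using only the Python standard library.

```python
import math
import random


def split_80_20(items, seed=42):
    """Hypothetical helper: shuffle with a fixed seed, then cut at 80%."""
    rng = random.Random(seed)  # local RNG so global random state is untouched
    shuffled = list(items)
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * 0.8)
    return shuffled[:cut], shuffled[cut:]


def rmse(predicted, actual):
    """Root mean squared error between two equal-length sequences (seconds)."""
    if len(predicted) != len(actual) or len(predicted) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


if __name__ == "__main__":
    train, test = split_80_20(range(10))
    print(len(train), len(test))
    # Predicted vs. actual editing times in seconds (made-up numbers).
    print(rmse([3.0, 0.0], [0.0, 4.0]))
```

Because the model outputs raw time values in seconds, an RMSE computed this way is also in seconds, which is what makes "lower is better" directly interpretable.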