---
library_name: transformers
tags:
- multimodal
- video-understanding
- sports
- commentary-generation
- llama3
- soccer
language:
- en
datasets:
- MatchTime
pipeline_tag: text-generation
---
# Matchcommentary: Automatic Soccer Game Commentary Generation
## Model Description
Matchcommentary is a multimodal model for automatic soccer game commentary generation. It combines a video feature encoder with a large language model to generate fluent, contextually appropriate soccer commentary.
## Architecture
The model consists of:
- **Vision Encoder**: Q-Former architecture for processing video features
- **Language Model**: LLaMA-3-8B-Instruct for text generation
- **Feature Fusion**: Cross-attention mechanism between visual and textual information
- **Domain Adaptation**: Soccer-specific vocabulary constraints
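
The feature-fusion step can be illustrated with a minimal single-head cross-attention sketch in NumPy (shapes and names here are illustrative assumptions, not the model's actual implementation): a set of learned query tokens, as in a Q-Former, attends over pre-extracted per-frame video features and produces a fixed number of fused tokens for the language model.

```python
import numpy as np

def cross_attention(queries, features):
    """Single-head cross-attention: query tokens attend over video features.

    queries:  (num_query_tokens, d) learned query embeddings (Q-Former style)
    features: (num_frames, d)       pre-extracted video features
    returns:  (num_query_tokens, d) fused tokens passed on to the LLM
    """
    d = queries.shape[-1]
    scores = queries @ features.T / np.sqrt(d)      # (Q, F) scaled dot-product
    scores -= scores.max(axis=-1, keepdims=True)    # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over frames
    return weights @ features                       # weighted sum of features

rng = np.random.default_rng(0)
query_tokens = rng.standard_normal((32, 512))  # num_video_query_token=32
video_feats = rng.standard_normal((40, 512))   # e.g. 40 frames, num_features=512
fused = cross_attention(query_tokens, video_feats)
print(fused.shape)  # (32, 512)
```

Regardless of how many frames the clip contains, the LLM always receives a fixed-size set of 32 fused visual tokens, which is what makes variable-length video compatible with a fixed prompt budget.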
## Intended Use
### Primary Use Cases
- Automatic soccer game commentary generation
- Sports video understanding and description
- Multimodal video-to-text generation
### Limitations
- Trained specifically on soccer/football content
- Requires pre-extracted video features
- Performance may vary on different video qualities or angles
## Training Data
The model was trained on the MatchTime dataset, which contains:
- Soccer game videos with corresponding commentary
- Multiple leagues and seasons
- Temporal alignment between visual events and commentary
## Performance
The model achieves state-of-the-art performance on the MatchTime benchmark. The released checkpoint (`model_save_best_val_CIDEr.pth`) corresponds to the configuration with the best validation CIDEr score among those tested.
## Usage
```python
import torch

from models.matchvoice_model import matchvoice_model

# Load the model with the same configuration used for training
model = matchvoice_model(
    llm_ckpt="meta-llama/Meta-Llama-3-8B-Instruct",
    tokenizer_ckpt="meta-llama/Meta-Llama-3-8B-Instruct",
    num_video_query_token=32,
    num_features=512,
    device="cuda:0",
    inference=True,
)

# Load the fine-tuned checkpoint weights
checkpoint = torch.load("model_save_best_val_CIDEr.pth", map_location="cuda:0")
model.load_state_dict(checkpoint)
model.eval()

# Generate commentary from a batch of pre-extracted video features
with torch.no_grad():
    commentary = model(video_samples)
```