Upload folder using huggingface_hub
modeling_finelap.py CHANGED (+6 -5)
@@ -121,11 +121,12 @@ class FineLAPModel(PreTrainedModel):
         global_text = self.get_global_text_embeds(text_labels, device)
 
         logits = torch.matmul(global_text, global_audio.transpose(-1, -2))
-
-
-
-
-
+        return logits
+        # if hasattr(self, "temp_global"):
+        # logits = logits / self.temp_global
+        # if hasattr(self, "b_global"):
+        # logits = logits + self.b_global
+        # return torch.sigmoid(logits).squeeze(-1)
 
     @torch.no_grad()
     def plot_frame_level_score(self, audio_path, text_labels, output_path="similarity_plot.png", device=None):
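The hunk above makes the method return the raw text-audio dot products and comments out the temperature scaling, bias, and sigmoid. The following is a minimal sketch of what that difference looks like numerically, not the model's own code. It uses plain Python lists in place of tensors so the arithmetic is visible. The helper names `dot`, `raw_similarity`, and `scaled_probability`, and the toy values for `temperature` and `bias` (hypothetical stand-ins for `self.temp_global` and `self.b_global`), are assumptions made for illustration.

```python
import math


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def raw_similarity(text_embeds, audio_embed):
    """Behaviour after the change: one unscaled dot product per text label."""
    return [dot(t, audio_embed) for t in text_embeds]


def scaled_probability(logits, temperature, bias):
    """The steps the diff comments out: divide by a temperature,
    add a bias, then squash each value with a sigmoid."""
    return [1.0 / (1.0 + math.exp(-(x / temperature + bias))) for x in logits]


# Toy embeddings (assumed values, not taken from the model).
text_embeds = [[1.0, 0.0], [0.0, 1.0]]  # two text labels, dimension 2
audio_embed = [0.6, 0.8]                # one audio clip, dimension 2

# Hypothetical stand-ins for self.temp_global and self.b_global.
temperature, bias = 0.1, -5.0

logits = raw_similarity(text_embeds, audio_embed)
probs = scaled_probability(logits, temperature, bias)

print("raw logits:", logits)
print("sigmoid scores:", [round(p, 4) for p in probs])
```

With this change, callers that previously relied on values in the (0, 1) range would receive unbounded similarity scores instead, so any fixed probability threshold applied downstream would likely need to be revisited.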