sander-wood/clamp3

sander-wood committed on Feb 24, 2025

Commit 355625c (verified) · 1 Parent(s): f319dd8

Update README.md

Files changed (1)

README.md +19 -7

@@ -164,7 +164,7 @@ pip install -r requirements.txt
 ```
 ### **Overview of `clamp3_*.py` Scripts**
-CLaMP 3 provides scripts for **semantic similarity calculation**, **semantic search**, and **retrieval performance evaluation** across five modalities. Simply provide the file path, and the script will automatically detect the modality and extract the relevant features.
 Supported formats include:
 - **Audio**: `.mp3`, `.wav`
@@ -179,6 +179,14 @@ Supported formats include:
 > **Note**: All files in a folder must belong to the same modality for processing.
 #### **[`clamp3_score.py`](https://github.com/sanderwood/clamp3/blob/main/clamp3_score.py) - Semantic Similarity Calculation**
 This script calculates semantic similarity between query and reference files. By default, it uses **pairwise mode**, but you can switch to **group mode** using the `--group` flag.
@@ -223,22 +231,26 @@ python clamp3_score.py <query_dir> <ref_dir> [--group]
   python clamp3_score.py query_dir ref_dir --group
   ```
-#### **[`clamp3_search.py`](https://github.com/sanderwood/clamp3/blob/main/clamp3_search.py) - Semantic Search**
-Run retrieval tasks by comparing a query file to reference files in `ref_dir`. The query and `ref_dir` can be **any modality**, so there are **25 possible retrieval combinations**, e.g., text-to-music, image-to-text, music-to-music, music-to-text (zero-shot music classification), etc.
 ```bash
-python clamp3_search.py <query_file> <ref_dir> [--top_k TOP_K]
 ```
-#### **[`clamp3_eval.py`](https://github.com/sanderwood/clamp3/blob/main/clamp3_eval.py) - Retrieval Performance Evaluation**
-Evaluates **CLaMP3's retrieval performance** on a paired dataset using metrics like **MRR** and **Hit@K**. Works the same way as **pairwise mode** in `clamp3_score.py`—requiring **matching folder structure** and **filenames** between `query_dir` and `ref_dir`.
 ```bash
-python clamp3_eval.py <query_dir> <ref_dir>
 ```
 ## **Repository Structure**
 - **[code/](https://github.com/sanderwood/clamp3/tree/main/code)** → Training & feature extraction scripts.
 - **[classification/](https://github.com/sanderwood/clamp3/tree/main/classification)** → Linear classification training and prediction.

 ```
 ### **Overview of `clamp3_*.py` Scripts**
+CLaMP 3 provides scripts for **semantic search**, **semantic similarity calculation**, **retrieval performance evaluation**, and **feature extraction** across five modalities. Simply provide the file path, and the script will automatically detect the modality and extract the relevant features.
 Supported formats include:
 - **Audio**: `.mp3`, `.wav`
 > **Note**: All files in a folder must belong to the same modality for processing.
+#### **[`clamp3_search.py`](https://github.com/sanderwood/clamp3/blob/main/clamp3_search.py) - Semantic Search**
+Run retrieval tasks by comparing a query file to reference files in `ref_dir`. The query and `ref_dir` can be **any modality**, so there are **25 possible retrieval combinations**, e.g., text-to-music, image-to-music, music-to-music, music-to-text (zero-shot music classification), etc.
+```bash
+python clamp3_search.py <query_file> <ref_dir> [--top_k TOP_K]
+```
 #### **[`clamp3_score.py`](https://github.com/sanderwood/clamp3/blob/main/clamp3_score.py) - Semantic Similarity Calculation**
 This script calculates semantic similarity between query and reference files. By default, it uses **pairwise mode**, but you can switch to **group mode** using the `--group` flag.
   python clamp3_score.py query_dir ref_dir --group
   ```
+#### **[`clamp3_eval.py`](https://github.com/sanderwood/clamp3/blob/main/clamp3_eval.py) - Retrieval Performance Evaluation**
+Evaluates **CLaMP3's retrieval performance** on a paired dataset using metrics like **MRR** and **Hit@K**. Works the same way as **pairwise mode** in `clamp3_score.py`—requiring **matching folder structure** and **filenames** between `query_dir` and `ref_dir`.
 ```bash
+python clamp3_eval.py <query_dir> <ref_dir>
 ```
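
The `clamp3_eval.py` hunk above names **MRR** and **Hit@K** without defining them. The following is a minimal sketch of how these metrics are commonly computed from paired embeddings. It is not taken from `clamp3_eval.py`: the function name `retrieval_metrics`, the cosine-similarity ranking, and the convention that query *i* is paired with reference *i* are all assumptions made for illustration.

```python
import numpy as np


def retrieval_metrics(query_emb, ref_emb, ks=(1, 10, 100)):
    """Compute MRR and Hit@K, assuming query i is paired with reference i.

    query_emb, ref_emb: arrays of shape (n, d), one row per item.
    This is a hypothetical helper, not the repository's implementation.
    """
    # L2-normalise rows so that the dot product equals cosine similarity.
    q = query_emb / np.linalg.norm(query_emb, axis=1, keepdims=True)
    r = ref_emb / np.linalg.norm(ref_emb, axis=1, keepdims=True)
    sim = q @ r.T  # shape (n_queries, n_refs)

    # Rank of the paired reference: 1 + number of references scoring strictly higher.
    correct = np.diag(sim)
    ranks = 1 + (sim > correct[:, None]).sum(axis=1)

    metrics = {"MRR": float(np.mean(1.0 / ranks))}
    for k in ks:
        metrics[f"Hit@{k}"] = float(np.mean(ranks <= k))
    return metrics


# Usage with random stand-in embeddings (real ones would come from the model).
rng = np.random.default_rng(0)
demo = retrieval_metrics(rng.standard_normal((8, 768)),
                         rng.standard_normal((8, 768)),
                         ks=(1, 5))
print(demo)
```

MRR averages the reciprocal rank of the correct reference over all queries, and Hit@K is the fraction of queries whose correct reference appears in the top K results.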
+#### **[`clamp3_embd.py`](https://github.com/sanderwood/clamp3/blob/main/clamp3_embd.py) - Feature Extraction**
+If other scripts don't meet your needs, use `clamp3_embd.py` to extract features.
 ```bash
+python clamp3_embd.py <input_dir_path> <output_dir_path> [--get_global]
 ```
+**Feature Output:**
+  - **Without `--get_global`** → Shape: **(1, T, 768)** (T = time steps). Uses last hidden states before avg pooling, ideal for applications needing temporal info. Fine-tuning recommended.
+  - **With `--get_global`** → Shape: **(1, 768)**. Uses avg pooled features, suitable for applications needing global info, can be used directly.
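
To illustrate how the two output shapes described in the added lines relate, here is a minimal sketch that collapses a `(1, T, 768)` array into a `(1, 768)` vector by averaging over the time axis, then compares two global vectors. It uses random stand-in arrays rather than real `clamp3_embd.py` output. The helper names `mean_pool` and `cosine_similarity` are hypothetical, and the plain unweighted mean is an assumption: the script's own pooling may weight or mask time steps differently.

```python
import numpy as np


def mean_pool(local_features):
    """Collapse (1, T, 768) per-step features into a (1, 768) global vector.

    Assumes an unweighted mean over the time axis (axis 1).
    """
    return local_features.mean(axis=1)


def cosine_similarity(a, b):
    """Cosine similarity between two vectors of shape (1, d) or (d,)."""
    a = a.ravel()
    b = b.ravel()
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


# Stand-ins for features extracted without --get_global; T differs per file.
rng = np.random.default_rng(0)
local_a = rng.standard_normal((1, 12, 768))
local_b = rng.standard_normal((1, 30, 768))

# After pooling, both have the same shape regardless of T.
global_a = mean_pool(local_a)
global_b = mean_pool(local_b)
print(global_a.shape, global_b.shape)
print(cosine_similarity(global_a, global_b))
```

Because pooling removes the time axis, inputs of different lengths become directly comparable, which is why the global features can be used without further processing.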
 ## **Repository Structure**
 - **[code/](https://github.com/sanderwood/clamp3/tree/main/code)** → Training & feature extraction scripts.
 - **[classification/](https://github.com/sanderwood/clamp3/tree/main/classification)** → Linear classification training and prediction.