Ocean-FAISS: Marine Image Retrieval Index
High-speed FAISS vector index and metadata for marine image retrieval using BioCLIP embeddings. Core component of the OceanGPT-X pipeline.
Repository Contents
| Path | Description |
|---|---|
| faiss/index.faiss | Pre-built FAISS index containing BioCLIP feature vectors |
| faiss/id_map.json | Mapping between FAISS internal IDs and dataset image IDs |
| metadata/metadata.jsonl | Rich metadata for each indexed image (species, location, capture info) |
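
The three files are linked by IDs: a FAISS row number is looked up in id_map.json, and the resulting image ID selects a record in metadata.jsonl. The following is a minimal sketch of that lookup using only the standard library. It assumes, as the usage snippet suggests, that id_map.json is a JSON object keyed by the stringified FAISS ID and that every metadata record carries an `id` field. The helper names `load_metadata` and `resolve`, and the sample IDs and species values, are made up for illustration.

```python
import json


def load_metadata(lines):
    """Parse JSON Lines text into a dict keyed by each record's "id" field."""
    records = (json.loads(line) for line in lines if line.strip())
    return {rec["id"]: rec for rec in records}


def resolve(faiss_ids, id_map, metadata):
    """Map FAISS row IDs to metadata records, skipping -1 placeholders."""
    hits = []
    for idx in faiss_ids:
        if idx < 0:  # FAISS uses -1 when fewer than k neighbours are found
            continue
        img_id = id_map.get(str(int(idx)))
        hits.append(metadata.get(img_id))  # None if the ID is unknown
    return hits


# Hypothetical sample data standing in for id_map.json and metadata.jsonl
id_map = {"0": "img_0001", "1": "img_0002"}
jsonl_lines = [
    '{"id": "img_0001", "species": "Amphiprion ocellaris"}',
    '{"id": "img_0002", "species": "Chelonia mydas"}',
]
metadata = load_metadata(jsonl_lines)
hits = resolve([1, -1, 0], id_map, metadata)
```

With the real files, the lines of metadata/metadata.jsonl can be passed to `load_metadata` by iterating over the open file object.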
Usage
Requires faiss-cpu or faiss-gpu, plus jsonlines for reading the metadata file.
    import json

    import faiss
    import jsonlines

    # Load the pre-built index and the FAISS-ID -> image-ID map
    index = faiss.read_index("faiss/index.faiss")
    with open("faiss/id_map.json", "r") as f:
        id_map = json.load(f)

    # Query vector must match the embedding dimension of the index (index.d)
    query_vector = ...  # Shape: (1, dim), dtype: float32
    D, I = index.search(query_vector, k=5)

    # Retrieve metadata, keyed by image ID
    with jsonlines.open("metadata/metadata.jsonl") as reader:
        metadata = {obj["id"]: obj for obj in reader}

    for idx in I[0]:
        if idx < 0:  # FAISS returns -1 when fewer than k neighbours exist
            continue
        img_id = id_map[str(idx)]
        print(metadata.get(img_id, "Not found"))
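
FAISS expects queries as a 2-D, C-contiguous float32 array whose second dimension equals the index dimension. The following is a minimal sketch of a helper that coerces an embedding into that form, assuming NumPy is installed. The name `as_query` and the 4-dimensional sample vector are hypothetical; in practice `dim` would come from `index.d`. The sketch does not normalize the vector, since this README does not state which distance metric the index uses.

```python
import numpy as np


def as_query(vec, dim):
    """Return vec as a C-contiguous float32 array of shape (n, dim)."""
    q = np.asarray(vec, dtype=np.float32)
    if q.ndim == 1:
        q = q.reshape(1, -1)  # single vector -> batch of one
    if q.ndim != 2 or q.shape[1] != dim:
        raise ValueError(f"expected shape (n, {dim}), got {q.shape}")
    return np.ascontiguousarray(q)


# Hypothetical 4-dimensional embedding; real BioCLIP vectors are much larger
query_vector = as_query([0.1, 0.2, 0.3, 0.4], dim=4)
```

The result can be passed directly to `index.search(query_vector, k)`. A dimension mismatch raises a `ValueError` before FAISS is called.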