Backend Data Artifacts

This directory stores the precomputed data used by the FastAPI recommender.

Files

  • merged_embeddings.npy — Float32 matrix of movie embeddings. Shape is recorded in merged_shape.txt (rows × dims).
  • merged_shape.txt — Two integers separated by a space or comma indicating the embedding matrix shape, e.g. 200000 384.
  • index.faiss — FAISS index built from merged_embeddings.npy for fast nearest‑neighbor queries.
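
A minimal sketch of reading merged_shape.txt, assuming the file holds exactly two integers separated by whitespace or a comma, as described above. The helper names parse_shape and expected_payload_bytes are hypothetical and not part of the backend.

```python
import re


def parse_shape(text: str) -> tuple[int, int]:
    """Parse 'rows dims' or 'rows,dims' into a (rows, dims) tuple."""
    parts = [p for p in re.split(r"[,\s]+", text.strip()) if p]
    if len(parts) != 2:
        raise ValueError(f"expected two integers, got {text!r}")
    rows, dims = (int(p) for p in parts)
    if rows <= 0 or dims <= 0:
        raise ValueError(f"shape must be positive, got {rows} x {dims}")
    return rows, dims


def expected_payload_bytes(rows: int, dims: int) -> int:
    """Raw size of a float32 matrix (4 bytes per value), excluding the .npy header."""
    return rows * dims * 4


rows, dims = parse_shape("200000 384")
print(rows, dims, expected_payload_bytes(rows, dims))
```

For the example shape 200000 384, the raw payload is 200000 × 384 × 4 = 307,200,000 bytes, so merged_embeddings.npy should be slightly larger than that once the .npy header is included.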

Optional/auxiliary:

  • Any metadata/lookups you maintain (e.g., movies.csv, id→row maps) referenced by your Recommender class.

Expected Paths

The backend constructs paths like:

import os

BASE_DIR = os.path.dirname(__file__)
DATA_DIR = os.path.join(BASE_DIR, "data")

If you relocate the artifacts, update the path setup in your app initialization, or expose environment variables that override it.
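
One way to support such an override is sketched below. The variable name RECOMMENDER_DATA_DIR and the helpers resolve_data_dir and artifact_paths are assumptions made for illustration; the backend does not define them.

```python
import os
from pathlib import Path

# Hypothetical variable name; the backend does not define it.
DATA_DIR_ENV = "RECOMMENDER_DATA_DIR"


def resolve_data_dir(base_dir: str) -> Path:
    """Return the override from the environment, else <base_dir>/data."""
    override = os.environ.get(DATA_DIR_ENV)
    if override:
        return Path(override).expanduser()
    return Path(base_dir) / "data"


def artifact_paths(data_dir: Path) -> dict[str, Path]:
    """Map each artifact listed under Files to its path inside data_dir."""
    names = ("merged_embeddings.npy", "merged_shape.txt", "index.faiss")
    return {name: data_dir / name for name in names}
```

With the variable unset, resolve_data_dir(BASE_DIR) falls back to the default data directory, so existing deployments keep working unchanged.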

Regenerating Artifacts (outline)

  1. Produce/collect per‑movie text or features.
  2. Encode to embeddings (e.g., sentence transformers) → save merged_embeddings.npy (Float32, contiguous).
  3. Record shape in merged_shape.txt.
  4. Build FAISS index (e.g., IndexFlatIP or IndexIVFFlat) and write to index.faiss.
  5. Verify index/dims match the embeddings.
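
Step 5 can be sketched as a small check. FAISS indexes expose d (vector dimension) and ntotal (number of stored vectors); the function name check_index_matches is hypothetical, and it accepts any object with those two attributes so it can be exercised without loading a real index.

```python
def check_index_matches(index, shape):
    """Raise ValueError if the index disagrees with the recorded (rows, dims).

    `index` is any object exposing `d` (vector dimension) and `ntotal`
    (number of stored vectors), which FAISS indexes do.
    """
    rows, dims = shape
    if index.d != dims:
        raise ValueError(f"index dimension {index.d} != embedding dims {dims}")
    if index.ntotal != rows:
        raise ValueError(f"index holds {index.ntotal} vectors, expected {rows}")


# Usage sketch (requires faiss; run from the directory holding the artifacts):
#   import faiss
#   index = faiss.read_index("index.faiss")
#   check_index_matches(index, (200000, 384))
```

Running this right after regeneration catches a stale index.faiss that was built from an older embedding matrix.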

Example (Python, sketch):

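import faiss
import numpy as np

# FAISS needs a C-contiguous float32 array.
X = np.ascontiguousarray(np.load('merged_embeddings.npy'), dtype='float32')
# Normalize rows in place so inner product equals cosine similarity.
# Query vectors must be normalized the same way at search time.
faiss.normalize_L2(X)
index = faiss.IndexFlatIP(X.shape[1])
index.add(X)
faiss.write_index(index, 'index.faiss')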

Integrity & Size Tips

  • Keep dtype=float32, which FAISS expects as input. A query vector whose dimension differs from the index dimension fails at runtime.
  • Consider checksums (e.g., SHA256SUMS) for CI/CD verification.
  • Large files: use Git LFS or fetch on first run.
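
A checksum check for CI could look like the sketch below. It assumes a manifest in the format written by the sha256sum tool (one "<hex digest>  <filename>" entry per line, filenames relative to the manifest's directory); the function names are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large artifacts are never fully loaded into memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_manifest(manifest: Path) -> list[str]:
    """Return the names of files whose hash differs from the manifest.

    Assumes the `sha256sum` output format: '<hex digest>  <filename>' per line,
    with filenames relative to the manifest's directory.
    """
    mismatched = []
    for line in manifest.read_text().splitlines():
        if not line.strip():
            continue
        expected, name = line.split(maxsplit=1)
        name = name.lstrip("*")  # sha256sum marks binary mode with a leading '*'
        if sha256_of(manifest.parent / name) != expected.lower():
            mismatched.append(name)
    return mismatched
```

An empty list from verify_manifest means every listed artifact matches. A file that is listed but missing raises FileNotFoundError, which a CI job can treat as a failure as well.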

Licensing

Ensure your use of source datasets (TMDb, etc.) complies with their licenses and terms of use.
