Backend Data Artifacts
This directory stores the precomputed data used by the FastAPI recommender.
Files
- merged_embeddings.npy: Float32 matrix of movie embeddings. Shape is recorded in merged_shape.txt (rows × dims).
- merged_shape.txt: Two integers separated by a space or comma indicating the embedding matrix shape, e.g. 200000 384.
- index.faiss: FAISS index built from merged_embeddings.npy for fast nearest‑neighbor queries.
Optional/auxiliary:
- Any metadata/lookups you maintain (e.g., movies.csv, id→row maps) referenced by your Recommender class.
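The shape file described above accepts either a space or a comma as separator, so a reader needs to tolerate both. A minimal sketch of such a parser follows; the function name parse_shape is hypothetical and not part of the backend.

```python
import re


def parse_shape(text: str) -> tuple[int, int]:
    """Parse 'rows dims' or 'rows,dims' into a (rows, dims) tuple.

    Hypothetical helper: the backend's own reader may differ.
    """
    # Split on any run of commas and/or whitespace, dropping empty pieces.
    parts = [p for p in re.split(r"[,\s]+", text.strip()) if p]
    if len(parts) != 2:
        raise ValueError(f"expected two integers, got {text!r}")
    rows, dims = int(parts[0]), int(parts[1])
    if rows <= 0 or dims <= 0:
        raise ValueError(f"shape must be positive, got {(rows, dims)}")
    return rows, dims
```

Reading the file is then a matter of passing its contents in, for example parse_shape(open("merged_shape.txt").read()).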
Expected Paths
The backend constructs paths like:
import os

BASE_DIR = os.path.dirname(__file__)
DATA_DIR = os.path.join(BASE_DIR, "data")
If you relocate artifacts, update your app init or expose env vars to override.
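One way to expose an override is to check an environment variable before falling back to the default data directory. The sketch below assumes a variable named RECSYS_DATA_DIR and a helper named resolve_data_dir; both names are illustrative and do not come from the backend.

```python
import os
from pathlib import Path


def resolve_data_dir(base_dir, env=None) -> Path:
    """Return the artifact directory, honoring an optional override.

    RECSYS_DATA_DIR is an assumed variable name; pick whatever fits your app.
    """
    env = os.environ if env is None else env
    override = env.get("RECSYS_DATA_DIR")
    if override:
        # An explicit override wins over the default layout.
        return Path(override).expanduser().resolve()
    # Default: a "data" folder next to the backend module.
    return Path(base_dir) / "data"
```

In the app init this could replace the hard-coded join, e.g. DATA_DIR = resolve_data_dir(os.path.dirname(__file__)).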
Regenerating Artifacts (outline)
- Produce/collect per‑movie text or features.
- Encode to embeddings (e.g., sentence transformers) → save merged_embeddings.npy (Float32, contiguous).
- Record shape in merged_shape.txt.
- Build FAISS index (e.g., IndexFlatIP or IndexIVFFlat) and write to index.faiss.
- Verify index/dims match the embeddings.
Example (Python, sketch):
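import faiss
import numpy as np

# Load embeddings as C-contiguous float32, which FAISS expects.
X = np.ascontiguousarray(np.load('merged_embeddings.npy'), dtype='float32')
# Normalize rows in place so inner product equals cosine similarity.
faiss.normalize_L2(X)
# Exact (brute-force) inner-product index over X.shape[1] dimensions.
index = faiss.IndexFlatIP(X.shape[1])
index.add(X)
faiss.write_index(index, 'index.faiss')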
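The final verification step in the outline can be sketched as a small check on the index's d (vector dimension) and ntotal (number of stored vectors) attributes, which FAISS indexes expose. The helper name check_index is hypothetical, and the expected rows and dims are assumed to come from merged_shape.txt.

```python
def check_index(index, rows: int, dims: int) -> None:
    """Raise ValueError if the index disagrees with the recorded shape.

    `index` is anything exposing FAISS-style `d` and `ntotal` attributes,
    e.g. the object returned by faiss.read_index('index.faiss').
    """
    if index.d != dims:
        raise ValueError(f"index dimension {index.d} != expected {dims}")
    if index.ntotal != rows:
        raise ValueError(f"index holds {index.ntotal} vectors, expected {rows}")
```

Running this at startup, before serving queries, turns a shape mismatch into an immediate and readable failure rather than an error during the first search.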
Integrity & Size Tips
- Keep dtype=float32; mismatched dims cause runtime errors.
- Consider checksums (e.g., SHA256SUMS) for CI/CD verification.
- Large files: use Git LFS or fetch on first run.
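As an illustration of the checksum tip, the sketch below hashes files in chunks and compares them against a manifest. It assumes the manifest uses the layout written by the sha256sum tool (hex digest, whitespace, file name, one entry per line) and sits next to the artifacts; the function names are hypothetical.

```python
import hashlib
from pathlib import Path


def file_sha256(path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        # Chunked reads keep memory flat for multi-gigabyte artifacts.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def verify_manifest(manifest_path) -> list[str]:
    """Return names of files whose digest differs from the manifest.

    Assumes sha256sum-style lines: '<hex digest>  <file name>'.
    """
    manifest = Path(manifest_path)
    mismatched = []
    for line in manifest.read_text().splitlines():
        if not line.strip():
            continue
        expected, name = line.split(maxsplit=1)
        name = name.lstrip("*")  # sha256sum marks binary mode with '*'
        if file_sha256(manifest.parent / name) != expected.lower():
            mismatched.append(name)
    return mismatched
```

A CI job could fail the build whenever verify_manifest returns a non-empty list.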
Licensing
Ensure your use of source datasets (TMDb, etc.) complies with their licenses and terms of use.