Upload ModelLens checkpoint (v2-trained, slim)


Files changed (3)

ModelLens.pt +3 -0
README.md +105 -0
args.json +1 -0

ModelLens.pt ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:6c7a6ff547ee205e713593e3f0f539b6a646e8eaf02069c9fdfc8dfe052af9ee
+size 709051757
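
Reviewer note: the three added lines above are a Git LFS pointer, not the weights themselves. Below is a minimal sketch, using only the standard library, of how a downloaded `ModelLens.pt` could be checked against that pointer. The helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical and not part of this repo.

```python
import hashlib
import os


def parse_lfs_pointer(text):
    """Parse the 'key value' lines of a Git LFS pointer into a dict."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {"algo": algo, "digest": digest, "size": int(fields["size"])}


def matches_pointer(path, pointer, chunk_size=1 << 20):
    """Return True if the file's size and hash both match the pointer."""
    if os.path.getsize(path) != pointer["size"]:
        return False  # cheap size check before hashing ~709 MB
    h = hashlib.new(pointer["algo"])
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest() == pointer["digest"]


# Pointer contents copied from the diff above.
POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:6c7a6ff547ee205e713593e3f0f539b6a646e8eaf02069c9fdfc8dfe052af9ee
size 709051757
"""
pointer = parse_lfs_pointer(POINTER)
```

Calling `matches_pointer(ckpt_path, pointer)` on the local file path would then confirm that the download is complete and unmodified.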

README.md ADDED

	@@ -0,0 +1,105 @@

+---
+license: mit
+library_name: pytorch
+tags:
+  - model-recommendation
+  - model-selection
+  - ranking
+  - model-routing
+  - benchmarks
+  - leaderboard
+pipeline_tag: tabular-regression
+---
+# ModelLens — Trained Recommender Checkpoint
+This is the released **ModelLens** checkpoint: a metric-aware ranker that,
+given a dataset description, task, and metric, returns a ranked list of
+Hugging Face models likely to perform well on it. It requires no fine-tuning
+and no forward pass on the target dataset.
+This repo ships only the weights and `args.json`. Related resources:
+- **Live demo (Gradio)**: 🤗 [`luisrui/ModelLens`](https://huggingface.co/spaces/luisrui/ModelLens)
+- **Training data**: 🤗 [`luisrui/ModelLens-corpus-v2`](https://huggingface.co/datasets/luisrui/ModelLens-corpus-v2) (1.81M rows, recommended)
+- **Source code**: [github.com/luisrui/ModelLens](https://github.com/luisrui/ModelLens)
+- **Paper**: see citation below
+## What's in here
+| File | Size | Description |
+|---|---:|---|
+| `ModelLens.pt` | ~709 MB | Trained recommender weights (slim — inference-ready, ~3 unused parent-class buffers dropped) |
+| `args.json` | ~2 KB | Training-time hyperparameters (model dims, num_models / num_tasks / num_metrics / etc.) |
+## Provenance
+- **Trained on**: [`luisrui/ModelLens-corpus-v2`](https://huggingface.co/datasets/luisrui/ModelLens-corpus-v2) — 1,807,133 (model × dataset × metric × value) records
+- **Coverage**: 47,242 Hugging Face models · 2,581 tasks · 3,714 metrics · ~86k datasets
+- **Architecture**: `MLPMetricFull` (the paper model — see [github repo](https://github.com/luisrui/ModelLens))
+- **Loss**: ensemble (listwise + pairwise + pointwise, `λ_list=0.5, λ_pair=1.0, w_point=0.1`)
+- **Training**: 30 epochs, DDP × 4 GPUs, `batch_size=8` (`pair_batch_size=1024`), `lr=1e-3`, `wd=1e-4`, learnable `τ` (`tau=10.0` in `args.json`)
+- **Slimmed checkpoint**: inference-unused parent-class buffers + train-set `dataset_desc_matrix` stripped (load with `strict=False`).
+## Loading
+```python
+import json
+import torch
+from huggingface_hub import hf_hub_download
+# Download the checkpoint and its training-time hyperparameters from the Hub.
+ckpt_path = hf_hub_download("luisrui/ModelLens", "ModelLens.pt")
+args_path = hf_hub_download("luisrui/ModelLens", "args.json")
+with open(args_path) as f:
+    args = json.load(f)
+state = torch.load(ckpt_path, map_location="cpu")
+# Build the model from source (see github.com/luisrui/ModelLens) and load:
+# model = MLPMetricFull(**args_to_kwargs(args))
+# model.load_state_dict(state, strict=False)   # strict=False is intentional: the slim checkpoint omits stripped buffers
+```
+For a complete, ready-to-run setup that includes the candidate model pool and
+its metadata, see [`inference_lib.py`](https://huggingface.co/spaces/luisrui/ModelLens/blob/main/inference_lib.py)
+and [`recommend.py`](https://huggingface.co/spaces/luisrui/ModelLens/blob/main/recommend.py)
+in the Space.
+## How it works
+1. The dataset description is embedded with OpenAI `text-embedding-3-small`
+   (1536-dim — same encoder used at training time).
+2. The ranker scores every candidate model conditioned on
+   `(dataset_embedding, task_id, metric_id, model_size_bucket, model_family_id, model_id)`.
+3. The top-K candidates are returned, optionally filtered by parameter count,
+   "HF-hosted only", or "official pretrained only".
+## Intended use
+- Picking a starting model for a new task / dataset, without running
+  every candidate.
+- Cheap pre-filter ahead of a more expensive transferability estimator
+  or partial fine-tune.
+## Limitations
+- Knowledge is bounded by what's in `corpus-v2` (up to early 2026).
+- Models / datasets that don't appear in the corpus fall back to text
+  similarity over their descriptions — useful but weaker than the full
+  signal available for in-corpus entities.
+- Scores are *relative* — the ranking is what matters; the absolute
+  numbers are not calibrated to any specific metric scale.
+## Citation
+```bibtex
+@misc{modellens2026,
+  title  = {ModelLens: Finding the Best Model for Your Task from Myriads of Models},
+  author = {Cai, Rui and Mo, Weijie Jacky and Wen, Xiaofei and Ma, Qiyao and
+            Zhu, Wenhui and Chen, Xiwen and Chen, Muhao and Zhao, Zhe},
+  year   = {2026},
+  url    = {https://huggingface.co/luisrui/ModelLens},
+}
+```
+## License
+MIT.
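
Reviewer note on the README's "How it works" section: step 3 (top-K selection with optional filters) could be sketched roughly as follows. This is illustrative only. The `Candidate` fields, the `top_k` helper, the model names, and the scores are all made up for the example, and the parameter unit (billions) is an assumption; the actual logic lives in `recommend.py` in the Space.

```python
import heapq
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str         # hypothetical model identifier
    score: float      # relative ranker score (not calibrated to a metric scale)
    params_b: float   # parameter count in billions (assumed unit)
    hf_hosted: bool   # whether the model is hosted on the Hub


def top_k(cands, k, max_params_b=None, hf_only=False):
    """Apply the optional filters, then return the k highest-scoring candidates."""
    pool = [
        c for c in cands
        if (max_params_b is None or c.params_b <= max_params_b)
        and (not hf_only or c.hf_hosted)
    ]
    return heapq.nlargest(k, pool, key=lambda c: c.score)


cands = [
    Candidate("model-a", 2.1, 7.0, True),
    Candidate("model-b", 3.4, 70.0, True),
    Candidate("model-c", 1.8, 0.3, False),
    Candidate("model-d", 2.9, 1.0, True),
]
best = top_k(cands, k=2, max_params_b=10.0, hf_only=True)
print([c.name for c in best])  # ['model-d', 'model-a']
```

Filtering before ranking keeps the top-K list full whenever enough candidates pass the filters. Only the ordering is meaningful, which matches the README's note that scores are relative.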

args.json ADDED

	@@ -0,0 +1 @@

+ {"device": "cuda:0", "use_data_parallel": false, "device_ids": [0, 1, 2, 3], "use_ddp": true, "ddp_find_unused_parameters": false, "num_workers": 0, "pin_memory": false, "persistent_workers": false, "data_name": "unified_augmented_v2", "ood_split_mode": "new_dataset_evaluation", "test_split_mode": "val", "seed": 2025, "use_wandb": true, "wandb_project": "ModelProfile", "wandb_entity": "ruicai-ucdavis", "trail_name": "FinalModel_v2_full_data_deployment", "start_epoch": 0, "checkpoint_path": "", "is_train": true, "is_ood": false, "loss_type": "ensemble", "point_loss_weight": 0.1, "early_stop": 99999, "eval_every": 99999, "num_epochs": 30, "save_every": 5, "save_final_checkpoint": true, "batch_size": 8, "pair_batch_size": 1024, "learning_rate": 0.001, "weight_decay": 0.0001, "tau": 10.0, "lambda_list": 0.5, "lambda_pair": 1.0, "alpha": 3.0, "size_bucket": [0.001, 0.003, 0.01, 0.03, 0.06, 0.1, 0.15, 0.2, 0.3, 0.4, 0.5, 0.6, 0.8, 1, 3, 7, 14, 35, 70, 100, 1000], "use_id_emb": true, "model_dim": 1536, "token_dim": 512, "use_size_prior": true, "size_dim": 64, "use_family_prior": true, "family_dim": 64, "model_desp_emb_dim": 1536, "model_desp_emb_path": "data/unified_augmented_v2/model2desp_embeddings.npz", "use_dataset_id_as_desp": true, "dataset_desp_dim": 1, "dataset_id_emb_dim": 256, "dataset_desp_emb_dim": 1536, "task_dim": 256, "model_name": "MLPMetricFull", "hidden_dim": 512, "dropout_rate": 0.02, "id_dropout_rate": 0.1, "topk": [1, 10, 30, 50], "margin_eps": 0.02, "val_eval_target_models_all_datasets": false, "val_eval_fixed_backbones": false, "save_best_ic8x10_checkpoint": false, "test_eval_target_models_all_datasets": false, "config": "config/FinalModel_unified_augmented_v2.yaml", "is_distributed": true, "world_size": 4, "rank": 0, "local_rank": 0, "num_models": 47242, "num_tasks": 2581, "num_metrics": 3714, "num_datasets": 85937, "unknown_metric_id": 0, "num_size_buckets": 23, "num_families": 332}
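
Reviewer note on `size_bucket`: the 21 values in `args.json` look like bucket boundaries for model size. Below is a sketch of one possible mapping using `bisect`. The unit (billions of parameters), the reserved index 0 for models of unknown size, and the `size_to_bucket` helper are all assumptions. They were chosen because 21 edges give 22 ranges, and one extra "unknown" slot brings the total to 23, matching `num_size_buckets`. The repo's actual bucketing may differ.

```python
from bisect import bisect_right

# Edges copied from "size_bucket" in args.json (assumed to be billions of parameters).
SIZE_EDGES = [0.001, 0.003, 0.01, 0.03, 0.06, 0.1, 0.15, 0.2, 0.3, 0.4, 0.5,
              0.6, 0.8, 1, 3, 7, 14, 35, 70, 100, 1000]

# 22 ranges between/around the edges, plus one slot for unknown size (assumption).
NUM_BUCKETS = len(SIZE_EDGES) + 2


def size_to_bucket(params_b):
    """Map a parameter count to a bucket index; 0 is reserved for unknown size."""
    if params_b is None:
        return 0
    return 1 + bisect_right(SIZE_EDGES, params_b)


print(NUM_BUCKETS)           # 23
print(size_to_bucket(None))  # 0
print(size_to_bucket(7.0))   # 17
```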