	---
	license: mit
	library_name: pytorch
	tags:
	- model-recommendation
	- model-selection
	- ranking
	- model-routing
	- benchmarks
	- leaderboard
	pipeline_tag: tabular-regression
	---

	# ModelLens — Trained Recommender Checkpoint

	📄 Paper: [ModelLens: Finding the Best Model for Your Task from Myriads of Models](https://huggingface.co/papers/2605.07075)
	·  🤗 Collection: [luisrui/modellens](https://huggingface.co/collections/luisrui/modellens)
	·  💻 Code: [github.com/luisrui/ModelLens](https://github.com/luisrui/ModelLens)

	This is the released ModelLens checkpoint: a metric-aware ranker that,
	given a dataset description, a task, and a metric, returns a ranked list of
	HuggingFace models likely to perform well on it. It needs no fine-tuning
	and no forward pass on the target dataset.

	This repo ships only the weights. Everything else lives elsewhere:

	- Live demo (Gradio): 🤗 [`luisrui/ModelLens`](https://huggingface.co/spaces/luisrui/ModelLens)
	- Training data: 🤗 [`luisrui/ModelLens-corpus-v2`](https://huggingface.co/datasets/luisrui/ModelLens-corpus-v2) (1.81M rows, recommended)
	- Source code: [github.com/luisrui/ModelLens](https://github.com/luisrui/ModelLens)
	- Paper: see citation below

	## What's in here

	| File | Size | Description |
	|---|---:|---|
	| `ModelLens.pt` | ~709 MB | Trained recommender weights (slim: inference-ready, ~3 unused parent-class buffers dropped) |
	| `args.json` | ~2 KB | Training-time hyperparameters (model dims, num_models / num_tasks / num_metrics / etc.) |

	## Provenance

	- Trained on: [`luisrui/ModelLens-corpus-v2`](https://huggingface.co/datasets/luisrui/ModelLens-corpus-v2) — 1,807,133 (model × dataset × metric × value) records
	- Coverage: 47,242 HuggingFace models · 2,581 tasks · 3,714 metrics · ~86k datasets
	- Architecture: `MLPMetricFull` (the paper model — see [github repo](https://github.com/luisrui/ModelLens))
	- Loss: ensemble (listwise + pairwise + pointwise, `λ_list=0.5, λ_pair=1.0, w_point=0.1`)
	- Training: 30 epochs, DDP × 4 GPUs, `bs=8`, `lr=1e-3`, `wd=1e-4`, learnable `τ`
	- Slimmed checkpoint: inference-unused parent-class buffers + train-set `dataset_desc_matrix` stripped (load with `strict=False`).
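
The list above names the three loss terms and their weights but not their exact form, so the following is only a sketch of how such a weighted ensemble could be combined. It assumes a ListNet-style softmax cross-entropy for the listwise term, a logistic loss over ordered pairs for the pairwise term, and mean squared error for the pointwise term. The function names and the way `tau` scales the predicted scores are illustrative assumptions, not taken from the ModelLens code.

```python
import math


def softmax(xs, tau=1.0):
    """Numerically stable softmax of xs / tau."""
    scaled = [x / tau for x in xs]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def listwise_loss(scores, targets, tau=1.0):
    # Assumption: ListNet-style cross-entropy between the target distribution
    # and the temperature-scaled predicted distribution.
    p = softmax(targets)
    q = softmax(scores, tau)
    return -sum(pi * math.log(qi) for pi, qi in zip(p, q))


def pairwise_loss(scores, targets):
    # Assumption: logistic loss over every pair where target i beats target j.
    losses = [
        math.log1p(math.exp(-(scores[i] - scores[j])))
        for i in range(len(scores))
        for j in range(len(scores))
        if targets[i] > targets[j]
    ]
    return sum(losses) / len(losses) if losses else 0.0


def pointwise_loss(scores, targets):
    # Assumption: mean squared error against the recorded metric values.
    return sum((s - t) ** 2 for s, t in zip(scores, targets)) / len(scores)


def ensemble_loss(scores, targets, lam_list=0.5, lam_pair=1.0, w_point=0.1, tau=1.0):
    # Weights default to the values listed in this README.
    return (
        lam_list * listwise_loss(scores, targets, tau)
        + lam_pair * pairwise_loss(scores, targets)
        + w_point * pointwise_loss(scores, targets)
    )
```

In the real training setup `τ` is learnable; here it is a plain argument so the arithmetic stays easy to follow.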

	## Loading

	```python
	from huggingface_hub import hf_hub_download
	import torch, json

	ckpt_path = hf_hub_download("luisrui/ModelLens", "ModelLens.pt")
	args_path = hf_hub_download("luisrui/ModelLens", "args.json")

	# Read the hyperparameters with a context manager so the file is closed.
	with open(args_path) as f:
	    args = json.load(f)
	state = torch.load(ckpt_path, map_location="cpu")

	# Build the model from source (see github.com/luisrui/ModelLens) and load:
	# model = MLPMetricFull(**args_to_kwargs(args))
	# model.load_state_dict(state, strict=False) # strict=False is intentional
	```

	For a complete, ready-to-run setup including the candidate model pool +
	metadata, see [`inference_lib.py`](https://huggingface.co/spaces/luisrui/ModelLens/blob/main/inference_lib.py)
	and [`recommend.py`](https://huggingface.co/spaces/luisrui/ModelLens/blob/main/recommend.py)
	in the Space.
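
Because `strict=False` silences every key mismatch, it can be worth confirming that the only missing entries are the ones the slimmed checkpoint is known to omit. The sketch below uses a tiny stand-in module (`TinyRanker` is hypothetical, not the real `MLPMetricFull`) and assumes `dataset_desc_matrix` is registered as a buffer. It relies on the `missing_keys` and `unexpected_keys` fields returned by `load_state_dict`.

```python
import torch
from torch import nn


class TinyRanker(nn.Module):
    """Hypothetical stand-in for the real model, for illustration only."""

    def __init__(self):
        super().__init__()
        self.proj = nn.Linear(4, 2)
        # Assumption: the stripped matrix is a registered buffer.
        self.register_buffer("dataset_desc_matrix", torch.zeros(3, 4))


# Simulate a slimmed checkpoint by dropping the buffer from a full state dict.
full_state = TinyRanker().state_dict()
slim_state = {k: v for k, v in full_state.items() if k != "dataset_desc_matrix"}

model = TinyRanker()
result = model.load_state_dict(slim_state, strict=False)

# Keys we expect to be absent from the slimmed checkpoint.
expected_missing = {"dataset_desc_matrix"}
surprising = set(result.missing_keys) - expected_missing

if surprising or result.unexpected_keys:
    raise RuntimeError(
        f"Checkpoint mismatch: missing={sorted(surprising)}, "
        f"unexpected={list(result.unexpected_keys)}"
    )
```

With the real checkpoint, `expected_missing` would hold whichever buffer names the source repository documents as stripped.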

	## How it works

	1. The dataset description is embedded with OpenAI `text-embedding-3-small`
	(1536-dim — same encoder used at training time).
	2. The ranker scores every candidate model conditioned on
	`(dataset_embedding, task_id, metric_id, model_size_bucket, model_family_id, model_id)`.
	3. The ranker returns the top-K candidates, optionally filtered by
	parameter count, "HF-hosted only", or "official pretrained only".

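The last step above can be sketched as a filter followed by a top-K selection. This is a minimal illustration, not the Space's `recommend.py`: the `Candidate` fields, the `top_k` helper, and the example repo ids are all hypothetical, and applying the filters before selecting (so that K results come back whenever enough candidates survive) is an assumption.

```python
import heapq
from dataclasses import dataclass
from typing import Optional


@dataclass
class Candidate:
    model_id: str     # hypothetical repo id
    score: float      # ranker output for one (dataset, task, metric) query
    num_params: int
    hf_hosted: bool
    official: bool


def top_k(candidates, k, max_params: Optional[int] = None,
          hf_hosted_only=False, official_only=False):
    """Apply the optional filters, then return the k highest-scoring survivors."""
    kept = [
        c for c in candidates
        if (max_params is None or c.num_params <= max_params)
        and (not hf_hosted_only or c.hf_hosted)
        and (not official_only or c.official)
    ]
    # nlargest returns the results sorted from highest to lowest score.
    return heapq.nlargest(k, kept, key=lambda c: c.score)


pool = [
    Candidate("org/model-a", 2.1, 7_000_000_000, True, True),
    Candidate("org/model-b", 1.7, 350_000_000, True, False),
    Candidate("org/model-c", 0.4, 110_000_000, False, True),
    Candidate("org/model-d", 1.2, 125_000_000, True, True),
]

small = top_k(pool, k=2, max_params=1_000_000_000)
trusted = top_k(pool, k=3, hf_hosted_only=True, official_only=True)
```

Here `small` keeps only candidates under one billion parameters, while `trusted` keeps only those that are both HF-hosted and official.
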
	## Intended use

	- Picking a starting model for a new task / dataset, without running
	every candidate.
	- Cheap pre-filter ahead of a more expensive transferability estimator
	or partial fine-tune.

	## Limitations

	- Knowledge is bounded by what's in `corpus-v2` (up to early 2026).
	- Models / datasets that don't appear in the corpus fall back to text
	similarity over their descriptions — useful but weaker than the full
	signal available for in-corpus entities.
	- Scores are relative — the ranking is what matters; the absolute
	numbers are not calibrated to any specific metric scale.
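
To make the last point concrete, the sketch below converts raw scores into ranks and shows that an increasing rescaling leaves the ranks untouched. The `ranks` helper and the numbers are made up for illustration; they are not ModelLens outputs.

```python
def ranks(scores):
    """Return the 1-based rank of each score, with rank 1 for the highest."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    result = [0] * len(scores)
    for position, index in enumerate(order, start=1):
        result[index] = position
    return result


raw = [2.3, -0.7, 1.1]                 # illustrative ranker outputs
rescaled = [10 * s + 5 for s in raw]   # any increasing transform of the scores

same_order = ranks(raw) == ranks(rescaled)
```

Since only the ordering carries information, comparing raw score values across different queries or metrics is not meaningful.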

	## Citation

	```bibtex
	@article{cai2026modellens,
	title={ModelLens: Finding the Best for Your Task from Myriads of Models},
	author={Cai, Rui and Mo, Weijie Jacky and Wen, Xiaofei and Ma, Qiyao and Zhu, Wenhui and Chen, Xiwen and Chen, Muhao and Zhao, Zhe},
	journal={arXiv preprint arXiv:2605.07075},
	year={2026}
	}
	```

	## License

	MIT.