Cooolder committed (verified) · Commit b7eecf5 · 1 parent: 2d29cef

Update README.md

Files changed (1): README.md (+9 −7)
README.md CHANGED
@@ -11,8 +11,16 @@ tags:
 ---
 
 # SCOPE: Scalable and Controllable Outcome Performance Estimator
+[📄 Paper (arXiv:2601.22323)](https://www.arxiv.org/abs/2601.22323)
+
+This repository accompanies the paper “Models Under SCOPE: Scalable and Controllable Routing via Pre-hoc Reasoning”, which introduces SCOPE (Scalable and Controllable Outcome Performance Estimator) — a new framework for large language model (LLM) routing.
+SCOPE reframes model routing as a pre-hoc estimation problem: instead of directly selecting a model from a fixed candidate set, it predicts each model’s expected performance (correctness) and inference cost (token length) before execution, based on the model’s historical behavior on similar queries. This enables training-free generalization to unseen models and allows users to flexibly control the trade-off between accuracy and cost through a budget-aware utility function.
+Overall, SCOPE provides a scalable, explainable, and controllable solution for allocating test-time compute across heterogeneous model portfolios.
+
+![SCOPE paradigm](assets/1.pdf)
+The figure above illustrates the core difference between traditional routers and SCOPE.
+Conventional LLM routers treat routing as a closed-set classification problem, simply memorizing model names and selecting one model per query. In contrast, SCOPE reasons over models’ past behaviors, explicitly predicting outcome correctness and token cost, and then makes a budget-aware decision based on these estimates. This design allows SCOPE to generalize to unseen models and supports dynamic cost–accuracy control at inference time.
 
-SCOPE is a specialized model that predicts how a target LLM will perform on a given question. Given a target question and a set of anchor questions with known performance results, SCOPE predicts the **output length** and **correctness** of the target model's response.
 
 ## Model Description
 
@@ -396,12 +404,6 @@ for sample in dataset:
 3. **Batch Processing**: Use vLLM for high-throughput batch inference
 4. **Anchor Selection**: Choose anchors similar to your target question domain
 
-## Limitations
-
-- Performance predictions are estimates based on anchor patterns
-- Accuracy depends on the quality and relevance of anchor questions
-- Works best when anchors are from the same domain as the target question
-
 ## Citation
 
 ```bibtex
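
The added README text frames routing as pre-hoc estimation: predict each model's correctness and token cost from its historical behavior on anchor questions, then pick a model via a budget-aware utility. A minimal Python sketch of that idea — the function names, the similarity-weighted estimator, and the linear `correctness − λ·tokens` utility are all illustrative assumptions, not the repository's actual API:

```python
# Illustrative sketch of SCOPE-style pre-hoc routing (hypothetical names,
# not the repository's real interface). For each candidate model we estimate
# expected correctness and token cost from anchor questions with known
# outcomes, then select the model maximizing a budget-aware utility.

def estimate_outcome(anchors):
    """anchors: list of (similarity, correct, tokens) tuples for one model.
    Returns similarity-weighted correctness probability and expected tokens."""
    total = sum(s for s, _, _ in anchors) or 1.0
    p_correct = sum(s * c for s, c, _ in anchors) / total
    exp_tokens = sum(s * t for s, _, t in anchors) / total
    return p_correct, exp_tokens

def route(anchor_results, cost_weight=0.001):
    """anchor_results: {model_name: [(similarity, correct, tokens), ...]}.
    Utility = predicted correctness - cost_weight * predicted token cost;
    raising cost_weight steers routing toward cheaper models."""
    best, best_util = None, float("-inf")
    for model, anchors in anchor_results.items():
        p_correct, exp_tokens = estimate_outcome(anchors)
        util = p_correct - cost_weight * exp_tokens
        if util > best_util:
            best, best_util = model, util
    return best

anchor_results = {
    "small-model": [(0.9, 0.0, 120), (0.7, 1.0, 150)],
    "large-model": [(0.9, 1.0, 600), (0.7, 1.0, 700)],
}
print(route(anchor_results, cost_weight=0.0005))  # → large-model
```

Because the decision is made before execution and depends only on per-model anchor statistics, adding an unseen model to the candidate set requires no retraining — only its anchor outcomes.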