---
title: README
emoji: 📈
colorFrom: yellow
colorTo: green
sdk: static
pinned: false
---

Welcome to CARROT-LLM-Routing! For a given desired trade-off between performance and cost, CARROT makes it easy to pick the best model among a set of 13 LLMs for any query. Below you can read the CARROT paper, replicate CARROT's training process, or see how to use CARROT out of the box for routing.

Read the paper
Train CARROT

As is, CARROT supports routing to the following collection of large language models. Instantiating the `CarrotRouter` class automatically loads the trained predictors for output token count and performance shown below. You can set `mu` to control the priority given to performance versus cost.

| Model | Input Token Cost ($ per 1M tokens) | Output Token Cost ($ per 1M tokens) |
|---|---|---|
| claude-3-5-sonnet-v1 | 3 | 15 |
| titan-text-premier-v1 | 0.5 | 1.5 |
| openai-gpt-4o | 2.5 | 10 |
| openai-gpt-4o-mini | 0.15 | 0.6 |
| granite-3-2b-instruct | 0.1 | 0.1 |
| granite-3-8b-instruct | 0.2 | 0.2 |
| llama-3-1-70b-instruct | 0.9 | 0.9 |
| llama-3-1-8b-instruct | 0.2 | 0.2 |
| llama-3-2-1b-instruct | 0.06 | 0.06 |
| llama-3-2-3b-instruct | 0.06 | 0.06 |
| llama-3-3-70b-instruct | 0.9 | 0.9 |
| mixtral-8x7b-instruct | 0.6 | 0.6 |
| llama-3-405b-instruct | 3.5 | 3.5 |
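To make the pricing table concrete, here is a minimal sketch of how the dollar cost of a single request follows from the per-million-token prices above. The `PRICES` dict and `request_cost` helper are illustrative names, not part of the CARROT package; the two entries are copied from the table.

```python
# Per-million-token prices (input, output) in USD, taken from the table above.
# Only two models are shown for brevity; this dict is illustrative, not CARROT's API.
PRICES = {
    "openai-gpt-4o-mini": (0.15, 0.60),
    "llama-3-1-8b-instruct": (0.20, 0.20),
}

def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of one request given its input and output token counts."""
    in_price, out_price = PRICES[model]
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000

# A request with 1,000 input tokens and 500 output tokens on gpt-4o-mini
# costs (1000 * 0.15 + 500 * 0.60) / 1e6 = $0.00045.
cost = request_cost("openai-gpt-4o-mini", 1000, 500)
```

This is why CARROT's output-token-count predictor matters: the output length is unknown at routing time, yet it often dominates the price (note the 5x input/output spread for the OpenAI and Claude models).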

## Example: Using CARROT for Routing

```python
from carrot import CarrotRouter

# Initialize the router
router = CarrotRouter(hf_token='YOUR_HF_TOKEN')

# Define a query
query = ["What is the value of i^i?"]

# Get the best model for the cost-performance trade-off
best_model = router.route(query, mu=0.3)

print(f"Recommended Model: {best_model}")
```
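To build intuition for what `mu` does, here is a toy sketch of a scalarized performance-cost trade-off. This is an illustration only, not CARROT's actual objective (see the paper for that): `pick_model`, the predicted scores, and the costs below are all made up for the example, and we assume larger `mu` means cost matters more.

```python
def pick_model(perf: dict, cost: dict, mu: float) -> str:
    """Toy trade-off: pick the model maximizing predicted performance
    minus mu times predicted dollar cost. Illustrative only."""
    return max(perf, key=lambda m: perf[m] - mu * cost[m])

# Hypothetical per-query predictions (performance score, dollar cost).
perf = {"llama-3-2-1b-instruct": 0.60, "openai-gpt-4o": 0.90}
cost = {"llama-3-2-1b-instruct": 0.0001, "openai-gpt-4o": 0.01}

cheap_first = pick_model(perf, cost, mu=100.0)  # cost-sensitive: small model wins
best_first = pick_model(perf, cost, mu=0.0)     # performance only: gpt-4o wins
```

With `mu = 0` the router ignores cost entirely; as `mu` grows, cheaper models win unless the predicted quality gap is large enough to justify the extra spend.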