title: README
emoji: 📈
colorFrom: yellow
colorTo: green
sdk: static
pinned: false

Welcome to CARROT-LLM-Routing! Given a desired trade-off between performance and cost, CARROT picks the best model for any query from a pool of 13 LLMs. Below you can read the CARROT paper, replicate CARROT's training process, or see how to use CARROT out of the box for routing.

Read the paper
Train CARROT

As is, CARROT supports routing to the following collection of large language models.

| Model | Input Token Cost ($ per 1M tokens) | Output Token Cost ($ per 1M tokens) |
|---|---|---|
| claude-3-5-sonnet-v1 | 3 | 15 |
| titan-text-premier-v1 | 0.5 | 1.5 |
| openai-gpt-4o | 2.5 | 10 |
| openai-gpt-4o-mini | 0.15 | 0.6 |
| granite-3-2b-instruct | 0.1 | 0.1 |
| granite-3-8b-instruct | 0.2 | 0.2 |
| llama-3-1-70b-instruct | 0.9 | 0.9 |
| llama-3-1-8b-instruct | 0.2 | 0.2 |
| llama-3-2-1b-instruct | 0.06 | 0.06 |
| llama-3-2-3b-instruct | 0.06 | 0.06 |
| llama-3-3-70b-instruct | 0.9 | 0.9 |
| mixtral-8x7b-instruct | 0.6 | 0.6 |
| llama-3-405b-instruct | 3.5 | 3.5 |
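
To make the per-token prices concrete, here is a minimal sketch that estimates the dollar cost of a single call from the listed prices. The helper name `query_cost` and the token counts are assumptions made for illustration; only the prices come from the list above, and only a few models are included to keep the example short.

```python
# Prices in USD per 1M tokens as (input, output), copied from the list above.
PRICES = {
    "claude-3-5-sonnet-v1": (3.0, 15.0),
    "openai-gpt-4o": (2.5, 10.0),
    "openai-gpt-4o-mini": (0.15, 0.6),
    "llama-3-2-1b-instruct": (0.06, 0.06),
}


def query_cost(model, input_tokens, output_tokens):
    """Estimate the USD cost of one call (hypothetical helper, not part of CARROT)."""
    price_in, price_out = PRICES[model]
    return (input_tokens * price_in + output_tokens * price_out) / 1_000_000


if __name__ == "__main__":
    # Hypothetical token counts, chosen only for illustration.
    for name in PRICES:
        print(f"{name}: ${query_cost(name, 1_000, 500):.6f}")
```

With 1,000 input tokens and 500 output tokens, the same call costs roughly a hundred times more on the most expensive model in this subset than on the cheapest, which is the gap a router can exploit.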

Example: Using CARROT for Routing


    import carrot_router

    # Initialize the router
    router = carrot_router.Router()

    # Define a query
    query = "What are the latest advancements in AI?"

    # Get the best model for the cost-performance trade-off
    best_model = router.route(query)

    print(f"Recommended Model: {best_model}")
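
The example above does not show how a trade-off between performance and cost might enter the decision. The sketch below illustrates one common way to express such a trade-off: a weight `mu` between 0 and 1 that blends a predicted quality score with a normalized predicted cost. This is an assumption made for illustration, not the `carrot_router` API and not necessarily the rule CARROT uses; `pick_model`, `mu`, and every number in `predictions` are hypothetical.

```python
def pick_model(predictions, mu):
    """Pick the model with the best quality/cost blend.

    predictions maps model name -> (predicted_quality in [0, 1], predicted_cost in USD).
    mu = 0 ignores cost entirely; mu = 1 ignores quality entirely.
    Illustrative only: not the carrot_router API.
    """
    if not 0.0 <= mu <= 1.0:
        raise ValueError("mu must be in [0, 1]")
    # Normalize costs so they are on a scale comparable to quality.
    max_cost = max(cost for _, cost in predictions.values()) or 1.0

    def score(item):
        quality, cost = item[1]
        return (1.0 - mu) * quality - mu * (cost / max_cost)

    return max(predictions.items(), key=score)[0]


# Hypothetical predictions; the quality scores are made up for this sketch.
predictions = {
    "openai-gpt-4o": (0.90, 0.0075),
    "openai-gpt-4o-mini": (0.80, 0.00045),
    "llama-3-2-1b-instruct": (0.50, 0.00009),
}

if __name__ == "__main__":
    for mu in (0.0, 0.5, 1.0):
        print(f"mu={mu}: {pick_model(predictions, mu)}")
```

Sweeping `mu` from 0 to 1 moves the choice from the highest-quality model toward the cheapest one, with mid-range values favoring models that are nearly as good at a fraction of the price.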