# litellm-proxy

A local, fast, and lightweight **OpenAI-compatible server** to call 100+ LLM APIs.

## usage
```shell
$ pip install litellm
```

```shell
$ litellm --model ollama/codellama
#INFO: Ollama running on http://0.0.0.0:8000
```
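
Once the proxy is up, you can sanity-check it with a raw HTTP request. A minimal sketch, assuming the proxy exposes the standard OpenAI-style `/chat/completions` route on port 8000 (the `model` value is passed through to whatever was set with `litellm --model`):

```shell
$ curl http://0.0.0.0:8000/chat/completions \
    -H "Content-Type: application/json" \
    -d '{
      "model": "gpt-3.5-turbo",
      "messages": [{"role": "user", "content": "this is a test request, write a short poem"}]
    }'
```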

## replace openai base
```python
import openai  # openai v1.0.0+

# point the client at the proxy instead of api.openai.com
client = openai.OpenAI(api_key="anything", base_url="http://0.0.0.0:8000")

# request is routed to the model the proxy was started with, via `litellm --model`
response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[
        {
            "role": "user",
            "content": "this is a test request, write a short poem",
        }
    ],
)
print(response)
```
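
Streaming works through the same client. A minimal sketch using the OpenAI v1 `stream=True` flag, assuming the proxy relays chunks the way the upstream OpenAI API does:

```python
import openai

client = openai.OpenAI(api_key="anything", base_url="http://0.0.0.0:8000")

# stream tokens as they arrive instead of waiting for the full completion
stream = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "this is a test request, write a short poem"}],
    stream=True,
)
for chunk in stream:
    # each chunk carries an incremental delta; content can be None on role/stop chunks
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
print()
```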

[**See how to call Hugging Face, Bedrock, TogetherAI, Anthropic, etc.**](https://docs.litellm.ai/docs/simple_proxy)