Can openai_api_server.py be launched on multiple GPUs?
#91
by RoboTerh - opened
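How many GPUs do you have? If you are short on VRAM, I'd suggest launching with the transformers backend instead, which can spread the load evenly so that each card sits at around 13 GB. Note that once requests involve more tokens, the KV cache will still take up a lot of VRAM.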
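To get a rough feel for why the KV cache grows with token count, here is a minimal back-of-the-envelope sketch. The layer count, KV head count, head dimension, and fp16 element size below are assumed values for illustration only, not this model's actual config; substitute the numbers from your own `config.json`. The helper `kv_cache_bytes` is a hypothetical name, not part of any library.

```python
def kv_cache_bytes(num_layers, num_kv_heads, head_dim, seq_len,
                   batch_size=1, bytes_per_elem=2):
    """Estimate KV cache size in bytes for a decoder-only transformer.

    Each layer stores one key and one value tensor (hence the factor 2),
    each of shape [batch_size, num_kv_heads, seq_len, head_dim].
    bytes_per_elem=2 assumes fp16/bf16 cache entries.
    """
    return (2 * num_layers * num_kv_heads * head_dim
            * seq_len * batch_size * bytes_per_elem)


if __name__ == "__main__":
    # Assumed example config: 32 layers, 32 KV heads, head_dim 128, fp16.
    for seq_len in (1024, 4096, 16384):
        gib = kv_cache_bytes(32, 32, 128, seq_len) / 2**30
        print(f"{seq_len:>6} tokens -> {gib:.2f} GiB of KV cache")
```

The estimate scales linearly with both context length and batch size, so long prompts or many concurrent requests add to VRAM usage on top of the model weights, no matter how evenly the weights are split across cards.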
