Scenario issue
Thanks for the great model.
Here is a case that could perhaps be fixed in a future release.
input
How much is the nightly rate at El Palacio Resort, Sunny Days, Hurghada?
Answer
The nightly rate at El Palacio Resort, Sunny Days, Hurghada is not provided in the search results.
However, the average price range for a 3-star hotel in Hurghada is between $30 to $50 per night, according to various sources. Please note that this information may not be accurate or up-to-date, and it's recommended to check with the hotel directly or use a booking platform to get the most accurate pricing information.
But when my input was
What is the price of a night at El Palacio Resort in Sunny Days, Hurghada?
the answer
No relevant information
another input
How much is the night at El Palacio Resort Sunny Days in Hurghada?
the answer
No relevant information.
another case
what is the price of euro / usd today
the answer
No relevant information.
Thank you for following up! This is mainly a limitation of the Google Places API (and possibly of the chat model). NexusRaven is only responsible for generating the function call, and in our demo it can only use functions provided by the Google Places API. If the function call does not return relevant content, the summary model simply replies "No relevant information."
It seems that the Google Places API does not return exact hotel prices, nor the euro/USD exchange rate, so the chat model does not have enough information to respond.
If you supply Raven with suitable search tools, such as the Google Search API, it will likely retrieve the right information. However, our current demo is mostly limited to the Places API.
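To make the division of labor above concrete, here is a minimal Python sketch of that kind of pipeline. It is not the demo's actual code: `places_search`, `generate_function_call`, `summarize`, `answer`, and the `wanted_field` parameter are all hypothetical stand-ins, and the model and API are replaced by simple stubs. It only illustrates why a question about a price ends in "No relevant information." when the only available tool returns no price data.

```python
from typing import Callable, Dict, List

# Hypothetical stand-in for a places-style tool: it knows locations, not prices.
def places_search(query: str) -> List[dict]:
    known = {"El Palacio Resort": {"name": "El Palacio Resort", "city": "Hurghada"}}
    return [info for name, info in known.items() if name.lower() in query.lower()]

# Registry of the tools the function-calling model is allowed to use.
TOOLS: Dict[str, Callable[..., List[dict]]] = {"places_search": places_search}

def generate_function_call(user_query: str) -> dict:
    """Stub for the function-calling model: it only picks a tool and its arguments."""
    return {"tool": "places_search", "args": {"query": user_query}}

def summarize(results: List[dict], wanted_field: str) -> str:
    """Stub for the summary model: it can answer only from what the tool returned."""
    values = [r[wanted_field] for r in results if wanted_field in r]
    if not values:
        return "No relevant information."
    return f"{wanted_field}: {values[0]}"

def answer(user_query: str, wanted_field: str) -> str:
    call = generate_function_call(user_query)          # step 1: generate the call
    results = TOOLS[call["tool"]](**call["args"])      # step 2: execute the tool
    return summarize(results, wanted_field)            # step 3: summarize results

print(answer("What is the price of a night at El Palacio Resort in Sunny Days, Hurghada?", "price"))
print(answer("Where is El Palacio Resort?", "city"))
```

Under these assumptions, registering an additional tool in `TOOLS` that does return prices or exchange rates (for example, a web search wrapper) is what would let the summary step answer the price questions; the function-calling step itself would not need to change.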