Instructions for using NOVA-vision-language/PlanLLM with libraries and local apps.

Libraries

How to use NOVA-vision-language/PlanLLM with Transformers:

```
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="NOVA-vision-language/PlanLLM")

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("NOVA-vision-language/PlanLLM")
model = AutoModelForCausalLM.from_pretrained("NOVA-vision-language/PlanLLM")
```

Local Apps

vLLM

How to use NOVA-vision-language/PlanLLM with vLLM:

Install from pip and serve model

```
# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "NOVA-vision-language/PlanLLM"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "NOVA-vision-language/PlanLLM",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'
```
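
The same request can also be sent from Python. The sketch below mirrors the curl call above using only the standard library. The helper names `build_completion_request` and `complete` are hypothetical, and the code assumes a server is already listening on `http://localhost:8000`, as started by `vllm serve`.

```python
import json
import urllib.request


def build_completion_request(prompt, base_url="http://localhost:8000",
                             model="NOVA-vision-language/PlanLLM",
                             max_tokens=512, temperature=0.5):
    """Build (but do not send) a POST request for the /v1/completions endpoint."""
    payload = {
        "model": model,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def complete(prompt, **kwargs):
    """Send the request and return the text of the first completion."""
    request = build_completion_request(prompt, **kwargs)
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    # OpenAI-compatible completion responses list their results under "choices".
    return body["choices"][0]["text"]


# Example call (needs a running server, so it is left commented out):
# print(complete("Once upon a time,"))
```

Passing `base_url="http://localhost:30000"` should target the SGLang server described further down, since both servers expose the same OpenAI-compatible route.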

Use Docker

```
docker model run hf.co/NOVA-vision-language/PlanLLM
```

SGLang

How to use NOVA-vision-language/PlanLLM with SGLang:

Install from pip and serve model

```
# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "NOVA-vision-language/PlanLLM" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "NOVA-vision-language/PlanLLM",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'
```

Use Docker images

```
docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "NOVA-vision-language/PlanLLM" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "NOVA-vision-language/PlanLLM",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'
```

Docker Model Runner

How to use NOVA-vision-language/PlanLLM with Docker Model Runner:

```
docker model run hf.co/NOVA-vision-language/PlanLLM
```

dmgcsilva committed on Feb 20, 2024

Commit f3f999d (verified) · 1 parent: e2e0f07

Update README.md

Files changed (1)

README.md +60 -6

@@ -1,29 +1,83 @@
 ---
 license: apache-2.0
-inference: False
 language:
 - en
 ---
 # PlanLLM
 <img src="https://i.imgur.com/nHuVNAn.png" alt="drawing" style="width:300px;"/>
-### Model Details
 PlanLLM is a conversational assistant trained to assist users in completing a recipe from beginning to end and be able to answer any related or relevant requests that the user might have.
 The model was also tested with DIY Tasks and performed similarly.
-#### Training
 PlanLLM was trained by fine-tuning a [Vicuna](https://huggingface.co/lmsys/vicuna-7b-v1.1) model on synthetic dialogue between users and an assistant about a given recipe.
 The model was first trained using SFT and then using Direct Preference Optimization (DPO).
-#### License
 It's the same as Vicuna. A non-commercial Apache 2.0 license.
-#### Paper
 ["Plan-Grounded Large Language Models for Dual Goal Conversational Settings" (Accepted at EACL 2024)
-Diogo Glória-Silva, Rafael Ferreira, Diogo Tavares, David Semedo, João Magalhães](https://arxiv.org/abs/2402.01053)

 ---
 license: apache-2.0
+inference: false
 language:
 - en
+library_name: transformers
 ---
 # PlanLLM
 <img src="https://i.imgur.com/nHuVNAn.png" alt="drawing" style="width:300px;"/>
+## Model Details
 PlanLLM is a conversational assistant trained to assist users in completing a recipe from beginning to end and be able to answer any related or relevant requests that the user might have.
 The model was also tested with DIY Tasks and performed similarly.
+### Training
 PlanLLM was trained by fine-tuning a [Vicuna](https://huggingface.co/lmsys/vicuna-7b-v1.1) model on synthetic dialogue between users and an assistant about a given recipe.
 The model was first trained using SFT and then using Direct Preference Optimization (DPO).
+#### Details
+SFT:
+  - Train Type: Fully Sharded Data Parallel (FSDP) with 4 A100 40GB GPUs
+  - Batch Size: 1
+  - Gradient Acc. Steps: 64
+  - Train steps: 600
+DPO:
+  - Train Type: Low-Rank Adaptation (LoRA) with 1 A100 40GB GPU
+  - LoRA Rank: 64
+  - LoRA Alpha: 16
+  - Batch Size: 1
+  - Gradient Acc. Steps: 64
+  - Train steps: 350
+### Dataset
+PlanLLM was trained on synthetic user-system dialogues where the role of the system is to aid the user in completing a predetermined task. For our case, we used recipes.
+These dialogues were generated using the user utterances collected from Alexa users who interacted with TWIZ, our entry in the Alexa Prize Taskbot Challenge 1.
+Using an intent classifier we mapped each user utterance to a specific intent allowing us to collect intent-specific utterances and a dialogue graph of each dialogue (with intents being the graph nodes).
+For the system responses, we used a combination of templates, external knowledge sources, and Large Language Models.
+Using this we built a pipeline that would navigate a dialogue graph generating user requests and system responses for each turn, creating complete dialogues that follow a similar dialogue pattern used by real users.
+#### Details
+SFT:
+  - Dialogues: 10k (90/5/5 splits)
+  - Recipes: 1000
+DPO:
+  - Dialogues: 3k (90/5/5 splits)
+  - Recipes: 1000 (same recipes used for SFT)
+### License
 It's the same as Vicuna. A non-commercial Apache 2.0 license.
+### Paper
 ["Plan-Grounded Large Language Models for Dual Goal Conversational Settings" (Accepted at EACL 2024)
+Diogo Glória-Silva, Rafael Ferreira, Diogo Tavares, David Semedo, João Magalhães](https://arxiv.org/abs/2402.01053)
+#### Cite Us!
+```
+@InProceedings{planllm_eacl24,
+  author="Glória-Silva, Diogo
+          and Ferreira, Rafael
+          and Tavares, Diogo
+          and Semedo, David
+          and Magalhães, João",
+  title="Plan-Grounded Large Language Models for Dual Goal Conversational Settings",
+  booktitle="European Chapter of the Association for Computational Linguistics (EACL 2024)",
+  year="2024",
+}
+```
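
The training and dataset details added in this commit imply a few derived quantities. The sketch below works them out under assumptions the README does not confirm: that "Batch Size" is per GPU, that the effective batch is batch size × gradient accumulation steps × number of GPUs, that "10k" and "3k" mean exactly 10,000 and 3,000 dialogues, that the 90/5/5 splits are train/validation/test percentages, and that the LoRA update is scaled by alpha / rank as in the standard LoRA formulation. The `Stage` class and `split_sizes` helper are hypothetical names used only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    name: str
    per_gpu_batch: int   # "Batch Size" in the README (assumed to be per GPU)
    grad_acc_steps: int  # "Gradient Acc. Steps"
    num_gpus: int        # 4 for SFT (FSDP), 1 for DPO (LoRA)
    train_steps: int     # "Train steps"

    @property
    def effective_batch(self) -> int:
        """Examples contributing to one optimizer step (assumed formula)."""
        return self.per_gpu_batch * self.grad_acc_steps * self.num_gpus

    @property
    def examples_processed(self) -> int:
        """Total examples consumed over training, counting repeats."""
        return self.effective_batch * self.train_steps


def split_sizes(total, percents=(90, 5, 5)):
    """Split a dialogue count into train/validation/test sizes."""
    return tuple(total * p // 100 for p in percents)


sft = Stage("SFT", per_gpu_batch=1, grad_acc_steps=64, num_gpus=4, train_steps=600)
dpo = Stage("DPO", per_gpu_batch=1, grad_acc_steps=64, num_gpus=1, train_steps=350)

# Standard LoRA scaling factor: alpha / rank.
lora_scaling = 16 / 64

for stage in (sft, dpo):
    print(stage.name, stage.effective_batch, stage.examples_processed)
print("LoRA scaling:", lora_scaling)
print("SFT splits:", split_sizes(10_000))
print("DPO splits:", split_sizes(3_000))
```

Under these assumptions, SFT uses an effective batch of 256 and DPO one of 64, and the LoRA scaling factor is 0.25. If the README's batch size is instead a global figure, the SFT numbers shrink by a factor of four.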