
Mercury7353 committed on Jul 23, 2024

Commit

64d01aa


1 Parent(s): 35eab02

Upload 7 files


Files changed (8)

.gitattributes +3 -0
README.md +86 -3
images/Screen_recording-2024-07-03_16-39-54.mp4 +3 -0
images/data.png +0 -0
images/generateTraj.png +0 -0
images/hook.png +0 -0
images/leaderboard.png +3 -0
images/main.png +3 -0

.gitattributes CHANGED

@@ -33,3 +33,6 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.zip filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
+images/leaderboard.png filter=lfs diff=lfs merge=lfs -text
+images/main.png filter=lfs diff=lfs merge=lfs -text
+images/Screen_recording-2024-07-03_16-39-54.mp4 filter=lfs diff=lfs merge=lfs -text

README.md CHANGED

@@ -1,3 +1,86 @@
----
-license: apache-2.0
----

+<h1 align="center"> PyBench: Evaluate LLM Agent on Real World Tasks </h1>
+<p align="center">
+<a href="coming soon">📃 Paper</a>
+•
+<a href="https://huggingface.co/datasets/Mercury7353/PyInstruct" >🤗 Data (PyInstruct)</a>
+•
+<a href="https://huggingface.co/Mercury7353/PyLlama3" >🤗 Model (PyLlama3)</a>
+•
+</p>
+PyBench is a comprehensive benchmark that evaluates LLMs on real-world coding tasks, including **chart analysis**, **text analysis**, **image/audio editing**, **complex math**, and **software/website development**.
+We collect files from Kaggle, arXiv, and other sources and automatically generate queries according to the type and content of each file.
+![Overview](images/main.png)
+## Why PyBench?
+The LLM Agent, equipped with a code interpreter, is capable of automatically solving real-world coding tasks, such as data analysis and image processing.
+However, existing benchmarks primarily focus on either simplistic tasks, such as completing a few lines of code, or extremely complex and specific tasks at the repository level, neither of which is representative of everyday coding tasks.
+To address this gap, we introduce **PyBench**, a benchmark that encompasses 6 main categories of real-world tasks, covering more than 10 types of files.
+![How PyBench Works](images/generateTraj.png)
+## 📁 PyInstruct
+To improve model performance on PyBench, we generate a homologous dataset, **PyInstruct**. PyInstruct contains multi-turn interactions between the model and files, stimulating the model's capabilities in coding, debugging, and multi-turn complex task solving. Compared to other datasets focused on multi-turn coding ability, PyInstruct has more turns and more tokens per trajectory.
+![Data Statistics](images/data.png)
+*Dataset Statistics. Token statistics are computed using Llama-2 tokenizer.*
+## 🪄 PyLlama
+We trained Llama3-8B-base on PyInstruct, CodeActInstruct, CodeFeedback, and Jupyter Notebook Corpus to obtain PyLlama3, which achieves outstanding performance on PyBench.
+## 🚀 Model Evaluation with PyBench!
+<video src="https://github.com/Mercury7353/PyBench/assets/103104011/fef85310-55a3-4ee8-a441-612e7dbbaaab"> </video>
+*Demonstration of the chat interface.*
+### Environment Setup
+Begin by establishing the required environment:
+```bash
+conda env create -f environment.yml
+```
+### Model Configuration
+Start a local server with the vLLM framework; it listens on port `8001` by default:
+```bash
+bash SetUpModel.sh
+```
+A Jinja template is required to launch a vLLM server. Commonly used templates can be found in the `./jinja/` directory.
+Before starting the vLLM server, specify the model path and the Jinja template path in `SetUpModel.sh`.
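
Once `SetUpModel.sh` has started the server, it can be queried over HTTP. The following is a minimal sketch, assuming the server exposes vLLM's OpenAI-compatible `/v1/chat/completions` route on `localhost:8001` and that the served model name is `Mercury7353/PyLlama3`. The helpers `build_chat_request` and `ask` are illustrative names and are not part of PyBench.

```python
import json
import urllib.request

# Assumption: the server launched by SetUpModel.sh listens on localhost:8001
# and serves vLLM's OpenAI-compatible API under /v1.
BASE_URL = "http://localhost:8001/v1"


def build_chat_request(model, user_message, base_url=BASE_URL):
    """Build (but do not send) a single-turn chat-completion request."""
    payload = {
        "model": model,  # must match the model name the server was started with
        "messages": [{"role": "user", "content": user_message}],
    }
    return urllib.request.Request(
        url=f"{base_url}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(model, user_message, base_url=BASE_URL, timeout=60):
    """Send the request and return the assistant's reply text."""
    request = build_chat_request(model, user_message, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]


# Example (requires a running server):
# print(ask("Mercury7353/PyLlama3", "Who are you?"))
```

Building the request separately from sending it makes it possible to inspect the URL and payload without a running server.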
+### Configuration Adjustments
+Specify your model's path and the server port in `./config/model.yaml`. This configuration file also allows for customization of the system prompts.
+### Execution on PyBench
+Be sure to update the output trajectory file path in the script before execution:
+```bash
+python /data/zyl7353/codeinterpreterbenchmark/inference.py --config_path ./config/<your config>.yaml --task_path ./data/meta/task.json --output_path <your trajectory.jsonl path>
+```
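
After a run, it can be useful to check the trajectory file before moving on to the unit tests. The sketch below assumes only what the `.jsonl` extension implies, namely one JSON value per line. The record schema is not documented here, so the code reports the top-level keys it finds rather than expecting specific fields. `load_trajectories` and `summarize` are hypothetical helper names.

```python
import json
from pathlib import Path


def load_trajectories(path):
    """Read a JSON Lines file: one JSON value per non-empty line."""
    records = []
    with Path(path).open(encoding="utf-8") as handle:
        for line_number, line in enumerate(handle, start=1):
            line = line.strip()
            if not line:
                continue  # tolerate blank lines
            try:
                records.append(json.loads(line))
            except json.JSONDecodeError as error:
                raise ValueError(
                    f"{path}:{line_number}: invalid JSON ({error.msg})"
                ) from error
    return records


def summarize(records):
    """Return the record count and the sorted top-level keys seen in dict records."""
    keys = set()
    for record in records:
        if isinstance(record, dict):
            keys.update(record)
    return len(records), sorted(keys)


# Example (the path is whatever was passed as --output_path):
# count, keys = summarize(load_trajectories("<your trajectory.jsonl path>"))
# print(count, keys)
```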
+### Unit Testing Procedure
+- **Step 1:** Store the output files in `./output`.
+- **Step 2:** Define the trajectory file path in
+  `./data/unit_test/enter_point.py`.
+- **Step 3:** Execute the unit test script:
+  ```bash
+  python data/unit_test/enter_point.py
+  ```
+## 📊 LeaderBoard
+![LLM Leaderboard](images/leaderboard.png)
+## 📚 Citation
+```bibtex
+TBD
+```

images/Screen_recording-2024-07-03_16-39-54.mp4 ADDED

@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:43692e7d0a6925a082f33d4ae2c5326fdd3af118e37849c8d858c0a8a7fde029
+size 5153739
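
The three added lines are a Git LFS pointer, not the video itself: the repository stores this small text file, while the binary lives in LFS storage. As a minimal sketch, a pointer can be parsed as space-separated `key value` lines. `parse_lfs_pointer` is an illustrative helper and is not part of any Git LFS tooling.

```python
def parse_lfs_pointer(text):
    """Parse the 'key value' lines of a Git LFS pointer file into a dict."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        key, _, value = line.partition(" ")
        fields[key] = value
    # The oid field has the form "<hash algorithm>:<hex digest>".
    algorithm, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algorithm": algorithm,
        "digest": digest,
        "size_bytes": int(fields["size"]),
    }


POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:43692e7d0a6925a082f33d4ae2c5326fdd3af118e37849c8d858c0a8a7fde029
size 5153739
"""

info = parse_lfs_pointer(POINTER)
print(info["hash_algorithm"], info["size_bytes"])  # sha256 5153739
```

The `size` field is in bytes, so the recording is roughly 5.15 MB.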

images/data.png ADDED

images/generateTraj.png ADDED

images/hook.png ADDED

images/leaderboard.png ADDED

Git LFS Details

SHA256: 084d7fc0ae7f65e632a1998392ef156cc051d0e251b2ebb40077c2e91c187d55
Pointer size: 132 Bytes
Size of remote file: 1 MB

images/main.png ADDED

Git LFS Details

SHA256: f9f9b942c1ef76ccab6c04f5dd9961e905959c5bff72c0ebca25a01efd48af5a
Pointer size: 132 Bytes
Size of remote file: 1.33 MB