Tell me this isn't just the copied-and-pasted weights from deepseek-coder-6.7b-instruct? Because that's what it looks like

by rombodawg - opened Jun 1, 2024

Discussion

rombodawg

Jun 1, 2024

I ran both models through HumanEval; here are the results. They scored exactly the same. Like bruh, this is a scam

deepseek-coder-6.7b-instruct

{
  "humaneval": {
    "pass@1": 0.725609756097561
  },
  "config": {
    "prefix": "",
    "do_sample": true,
    "temperature": 0.1,
    "top_k": 0,
    "top_p": 0.95,
    "n_samples": 1,
    "eos": "<|endoftext|>",
    "seed": 0,
    "model": "deepseek-ai/deepseek-coder-6.7b-instruct",
    "modeltype": "causal",
    "peft_model": null,
    "revision": null,
    "use_auth_token": false,
    "trust_remote_code": false,
    "tasks": "humaneval",
    "instruction_tokens": null,
    "batch_size": 1,
    "max_length_generation": 4000,
    "precision": "fp16",
    "load_in_8bit": true,
    "load_in_4bit": false,
    "left_padding": false,
    "limit": null,
    "limit_start": 0,
    "save_every_k_tasks": -1,
    "postprocess": true,
    "allow_code_execution": true,
    "generation_only": false,
    "load_generations_path": null,
    "load_data_path": null,
    "metric_output_path": "evaluation_results.json",
    "save_generations": false,
    "load_generations_intermediate_paths": null,
    "save_generations_path": "generations.json",
    "save_references": false,
    "save_references_path": "references.json",
    "prompt": "prompt",
    "max_memory_per_gpu": null,
    "check_references": false
  }
}

AutoCoder_S_6.7B

{
  "humaneval": {
    "pass@1": 0.725609756097561
  },
  "config": {
    "prefix": "",
    "do_sample": true,
    "temperature": 0.1,
    "top_k": 0,
    "top_p": 0.95,
    "n_samples": 1,
    "eos": "<|endoftext|>",
    "seed": 0,
    "model": "Bin12345/AutoCoder_S_6.7B",
    "modeltype": "causal",
    "peft_model": null,
    "revision": null,
    "use_auth_token": false,
    "trust_remote_code": false,
    "tasks": "humaneval",
    "instruction_tokens": null,
    "batch_size": 1,
    "max_length_generation": 4000,
    "precision": "fp16",
    "load_in_8bit": true,
    "load_in_4bit": false,
    "left_padding": false,
    "limit": null,
    "limit_start": 0,
    "save_every_k_tasks": -1,
    "postprocess": true,
    "allow_code_execution": true,
    "generation_only": false,
    "load_generations_path": null,
    "load_data_path": null,
    "metric_output_path": "evaluation_results.json",
    "save_generations": false,
    "load_generations_intermediate_paths": null,
    "save_generations_path": "generations.json",
    "save_references": false,
    "save_references_path": "references.json",
    "prompt": "prompt",
    "max_memory_per_gpu": null,
    "check_references": false
  }
}

Bin12345

Owner Jun 1, 2024

Thanks for the verification. Could you check whether the outputs of the two models are actually the same?

There are 164 problems in the HumanEval dataset. If both models correctly solve 119 problems, the final accuracy will be the same (119 / 164 ≈ 0.7256), even if the generated code differs.
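To make the arithmetic concrete, here is a minimal sketch. The solved sets below are hypothetical (made up for illustration, not taken from either model's run); they only show that two different sets of 119 solved problems give the same pass@1.

```python
# HumanEval has 164 problems; with one sample per problem, pass@1 is solved / total.
TOTAL_PROBLEMS = 164

def pass_at_1(solved_ids):
    """Fraction of problems solved, given the set of solved problem indices."""
    return len(solved_ids) / TOTAL_PROBLEMS

# Hypothetical solved sets: both contain 119 problems, but they differ in one problem.
model_a_solved = set(range(119))            # problems 0..118
model_b_solved = set(range(118)) | {150}    # problems 0..117 plus problem 150

score_a = pass_at_1(model_a_solved)
score_b = pass_at_1(model_b_solved)

print(round(score_a, 4), round(score_b, 4))     # the two scores are equal
print(model_a_solved == model_b_solved)         # False: the solved sets differ
print(sorted(model_a_solved ^ model_b_solved))  # problems solved by only one model
```

So an identical score is consistent with identical models, but it does not prove it; the per-problem outputs have to be compared.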

Also, we added some special tokens to DeepSeek-Coder and fine-tuned the model. The two models are different; you can use the following code to check the two models' weights:

from transformers import AutoTokenizer, AutoModelForCausalLM

# Run once per model: "Bin12345/AutoCoder_S_6.7B" or "deepseek-ai/deepseek-coder-6.7b-instruct"
model_path = "Bin12345/AutoCoder_S_6.7B"
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(model_path, device_map="auto")

model_weights = model.state_dict()

# Print each tensor's name, shape, and first two rows so the two runs can be compared
for name, param in model_weights.items():
    print(f"Layer: {name} | Size: {param.size()} | Values: {param[:2]}")

The results reported in our paper were obtained with greedy decoding. The steps to reproduce the experiments are described at https://github.com/bin123apple/AutoCoder
Could you share your experiment code? I will run it on my side and check the results.
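On the greedy point: the config posted above has "do_sample": true with "temperature": 0.1, which is not the same thing as greedy decoding. Below is a toy sketch of the difference using plain Python. The logits are hypothetical and the helper names (`softmax`, `greedy_pick`, `sample_pick`) are made up for illustration; this is not how either library implements decoding.

```python
import math
import random

def softmax(logits, temperature=1.0):
    """Turn logits into probabilities; a lower temperature sharpens the distribution."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def greedy_pick(logits):
    """Greedy decoding: always take the highest-scoring token."""
    return max(range(len(logits)), key=lambda i: logits[i])

def sample_pick(logits, temperature, rng):
    """Temperature sampling: draw a token index from the softmax distribution."""
    probs = softmax(logits, temperature)
    return rng.choices(range(len(logits)), weights=probs, k=1)[0]

# Hypothetical logits for three candidate tokens; the top two are close together.
logits = [2.00, 1.95, 0.10]

rng = random.Random(0)
greedy_choices = {greedy_pick(logits) for _ in range(1000)}
sampled_choices = {sample_pick(logits, 0.1, rng) for _ in range(1000)}

print(greedy_choices)   # a single token index, every time
print(sampled_choices)  # can contain more than one index, even at temperature 0.1
```

When two candidate tokens have close logits, a low temperature still leaves real probability on the runner-up, so sampled runs can differ from the greedy numbers in the paper.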

rombodawg

Jun 1, 2024

Yes, I ran the benchmark in Google Colab. Here is the code:

!pip install -q -U transformers
!pip install -q -U accelerate
!pip install -q -U bitsandbytes

!git clone https://github.com/bigcode-project/bigcode-evaluation-harness.git

%cd bigcode-evaluation-harness

!pip install -r requirements.txt

!accelerate launch main.py --tasks humaneval --model Bin12345/AutoCoder_S_6.7B --load_in_8bit --allow_code_execution --max_length_generation 4000 --precision fp16 --temperature 0.1

And for the other model, just replace the model name:

!accelerate launch main.py --tasks humaneval --model deepseek-ai/deepseek-coder-6.7b-instruct --load_in_8bit --allow_code_execution --max_length_generation 4000 --precision fp16 --temperature 0.1

rombodawg

Jun 1, 2024

You have to add a flag to save the results, though; it won't save them without it. I forget what it is, but you should be able to find it here:
https://github.com/bigcode-project/bigcode-evaluation-harness

Bin12345

Owner Jun 1, 2024

Thanks for the feedback! I will try it.

If you want to check if these two models are the same, you can use

from transformers import AutoTokenizer, AutoModelForCausalLM

# Run once per model: "Bin12345/AutoCoder_S_6.7B" or "deepseek-ai/deepseek-coder-6.7b-instruct"
model_path = "Bin12345/AutoCoder_S_6.7B"
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(model_path, device_map="auto")

model_weights = model.state_dict()

for name, param in model_weights.items():
    print(f"Layer: {name} | Size: {param.size()} | Values: {param[:2]}")

to print out their weights.

You will see that they are different.
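Instead of comparing printed values by eye, the two state dicts can also be compared directly. This is a rough sketch, not code from the AutoCoder repo: `differing_tensors` is a made-up helper, and the toy tensors below stand in for the real checkpoints. It relies on `torch.equal`, which returns False when shapes or values differ.

```python
import torch

def differing_tensors(state_a, state_b):
    """Return names of tensors that are missing on one side or not exactly equal."""
    names = sorted(set(state_a) | set(state_b))
    changed = []
    for name in names:
        if name not in state_a or name not in state_b:
            changed.append(name)
        elif not torch.equal(state_a[name], state_b[name]):
            changed.append(name)
    return changed

# Toy state dicts standing in for the two checkpoints (hypothetical names and values).
# The embedding shapes differ, as they would after adding special tokens.
base = {"embed.weight": torch.zeros(4, 2), "head.weight": torch.ones(4, 2)}
tuned = {"embed.weight": torch.zeros(6, 2), "head.weight": torch.ones(4, 2)}

print(differing_tensors(base, tuned))  # names of the tensors that differ
```

For the real models you would pass `model.state_dict()` from each checkpoint, with both loaded on the same device (for example CPU) so the tensors can be compared.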

Bin12345

Owner Jun 1, 2024

Hey @rombodawg, I just reproduced your experiment, and the outputs of the two models are different. You can add the two flags --save_generations and --save_generations_path to save and check the generated code. Thanks!
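Once both runs are saved, the generations can be diffed per problem. The sketch below assumes each saved file is a JSON list with one entry per problem, where each entry is a list of generated strings; check the actual file before relying on that. The file names and toy contents are hypothetical.

```python
import json
import tempfile
from pathlib import Path

def load_generations(path):
    """Load a generations file, assumed to be a JSON list with one entry per problem."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)

def differing_problems(gens_a, gens_b):
    """Return indices of problems whose generated code differs between two runs."""
    if len(gens_a) != len(gens_b):
        raise ValueError("the two runs cover a different number of problems")
    return [i for i, (a, b) in enumerate(zip(gens_a, gens_b)) if a != b]

# Toy stand-ins for the two saved files (hypothetical content, not real model output).
run_a = [["def add(a, b):\n    return a + b"], ["def neg(x):\n    return -x"]]
run_b = [["def add(a, b):\n    return a + b"], ["def neg(x):\n    return 0 - x"]]

with tempfile.TemporaryDirectory() as tmp:
    path_a = Path(tmp) / "generations_a.json"
    path_b = Path(tmp) / "generations_b.json"
    path_a.write_text(json.dumps(run_a), encoding="utf-8")
    path_b.write_text(json.dumps(run_b), encoding="utf-8")
    diff = differing_problems(load_generations(path_a), load_generations(path_b))

print(diff)  # indices where the two runs produced different code
```

An empty list would mean the two models generated identical code for every problem; any non-empty list shows they diverge somewhere even when pass@1 matches.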

rombodawg changed discussion status to closed Jun 1, 2024
