Tell me these aren't just weights copied and pasted from deepseek-coder-6.7b-instruct? Because that's what it looks like.
I ran both models through HumanEval; here are the results. They scored exactly the same. Like bruh, this is a scam.
deepseek-coder-6.7b-instruct
{
"humaneval": {
"pass@1": 0.725609756097561
},
"config": {
"prefix": "",
"do_sample": true,
"temperature": 0.1,
"top_k": 0,
"top_p": 0.95,
"n_samples": 1,
"eos": "<|endoftext|>",
"seed": 0,
"model": "deepseek-ai/deepseek-coder-6.7b-instruct",
"modeltype": "causal",
"peft_model": null,
"revision": null,
"use_auth_token": false,
"trust_remote_code": false,
"tasks": "humaneval",
"instruction_tokens": null,
"batch_size": 1,
"max_length_generation": 4000,
"precision": "fp16",
"load_in_8bit": true,
"load_in_4bit": false,
"left_padding": false,
"limit": null,
"limit_start": 0,
"save_every_k_tasks": -1,
"postprocess": true,
"allow_code_execution": true,
"generation_only": false,
"load_generations_path": null,
"load_data_path": null,
"metric_output_path": "evaluation_results.json",
"save_generations": false,
"load_generations_intermediate_paths": null,
"save_generations_path": "generations.json",
"save_references": false,
"save_references_path": "references.json",
"prompt": "prompt",
"max_memory_per_gpu": null,
"check_references": false
}
}
AutoCoder_S_6.7B
{
"humaneval": {
"pass@1": 0.725609756097561
},
"config": {
"prefix": "",
"do_sample": true,
"temperature": 0.1,
"top_k": 0,
"top_p": 0.95,
"n_samples": 1,
"eos": "<|endoftext|>",
"seed": 0,
"model": "Bin12345/AutoCoder_S_6.7B",
"modeltype": "causal",
"peft_model": null,
"revision": null,
"use_auth_token": false,
"trust_remote_code": false,
"tasks": "humaneval",
"instruction_tokens": null,
"batch_size": 1,
"max_length_generation": 4000,
"precision": "fp16",
"load_in_8bit": true,
"load_in_4bit": false,
"left_padding": false,
"limit": null,
"limit_start": 0,
"save_every_k_tasks": -1,
"postprocess": true,
"allow_code_execution": true,
"generation_only": false,
"load_generations_path": null,
"load_data_path": null,
"metric_output_path": "evaluation_results.json",
"save_generations": false,
"load_generations_intermediate_paths": null,
"save_generations_path": "generations.json",
"save_references": false,
"save_references_path": "references.json",
"prompt": "prompt",
"max_memory_per_gpu": null,
"check_references": false
}
}
Thanks for the verification. Could you check whether all of the outputs are the same?
There are 164 problems in the HumanEval dataset. If both models correctly solve 119 problems each, the final accuracy will be the same (119/164 ≈ 0.7256), even if they are not the same 119 problems.
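To make the arithmetic concrete, here is a minimal sketch with made-up outcome sets (the problem indices are hypothetical, not taken from either model's real run). It only illustrates that two models can solve different problems and still report the same pass@1.

```python
TOTAL_PROBLEMS = 164  # size of HumanEval, as stated above


def pass_at_1(solved_ids, total=TOTAL_PROBLEMS):
    """Fraction of problems solved when one sample is drawn per problem."""
    return len(set(solved_ids)) / total


# Hypothetical outcomes: each model solves 119 problems, but not the same ones.
model_a_solved = set(range(0, 119))    # problems 0..118
model_b_solved = set(range(45, 164))   # problems 45..163

score_a = pass_at_1(model_a_solved)
score_b = pass_at_1(model_b_solved)

# Problems solved by exactly one of the two models.
only_a = model_a_solved - model_b_solved
only_b = model_b_solved - model_a_solved

print("pass@1:", score_a, score_b)
print("solved only by A:", len(only_a), "| solved only by B:", len(only_b))
```

So an identical aggregate score is not enough to conclude the models are identical; the per-problem generations have to be compared.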
Also:
- We added some special tokens to DeepSeek-Coder and fine-tuned the model. The two models are different; you can use the following code to check the two models' weights:
from transformers import AutoTokenizer, AutoModelForCausalLM
# Set this to "Bin12345/AutoCoder_S_6.7B" or "deepseek-ai/deepseek-coder-6.7b-instruct"
model_path = ""
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(model_path, device_map="auto")
model_weights = model.state_dict()
# Print each parameter's name, shape, and first two rows so the two runs can be compared
for name, param in model_weights.items():
    print(f"Layer: {name} | Size: {param.size()} | Values: {param[:2]}")
The results in our paper were obtained with greedy decoding. The steps to reproduce the experiments are described at https://github.com/bin123apple/AutoCoder
Could you share your experiment code? I will run it on my side and check the results.
Yes, I ran the benchmark in Google Colab. Here is the code:
!pip install -q -U transformers
!pip install -q -U accelerate
!pip install -q -U bitsandbytes
!git clone https://github.com/bigcode-project/bigcode-evaluation-harness.git
%cd bigcode-evaluation-harness
!pip install -r requirements.txt
!accelerate launch main.py --tasks humaneval --model Bin12345/AutoCoder_S_6.7B --load_in_8bit --allow_code_execution --max_length_generation 4000 --precision fp16 --temperature 0.1
And for the other model, just replace the model name:
!accelerate launch main.py --tasks humaneval --model deepseek-ai/deepseek-coder-6.7b-instruct --load_in_8bit --allow_code_execution --max_length_generation 4000 --precision fp16 --temperature 0.1
You have to add a flag to save the results, though; it won't save them without it. I forget what it is, but you should be able to find it here:
https://github.com/bigcode-project/bigcode-evaluation-harness
Thanks for the feedback! I will try it.
If you want to check whether these two models are the same, you can use
from transformers import AutoTokenizer, AutoModelForCausalLM
# Set this to "Bin12345/AutoCoder_S_6.7B" or "deepseek-ai/deepseek-coder-6.7b-instruct"
model_path = ""
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(model_path, device_map="auto")
model_weights = model.state_dict()
# Print each parameter's name, shape, and first two rows so the two runs can be compared
for name, param in model_weights.items():
    print(f"Layer: {name} | Size: {param.size()} | Values: {param[:2]}")
to print out their weights.
You will see that they are different.
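Comparing printed values by eye is tedious, so a programmatic check may be easier. The sketch below is one possible way to do it, not the authors' script: `diff_state_dicts` and `compare_checkpoints` are hypothetical helper names, and it assumes both checkpoints fit in CPU memory at the same time (roughly 27 GB in fp16 for two 6.7B models).

```python
def diff_state_dicts(weights_a, weights_b, same=None):
    """Return (names only in A, names only in B, shared names whose values differ)."""
    if same is None:
        # torch.equal is True only when shape and every element match.
        import torch
        same = torch.equal
    only_a = sorted(set(weights_a) - set(weights_b))
    only_b = sorted(set(weights_b) - set(weights_a))
    changed = sorted(
        name
        for name in set(weights_a) & set(weights_b)
        if not same(weights_a[name], weights_b[name])
    )
    return only_a, only_b, changed


def compare_checkpoints(path_a, path_b):
    """Load two checkpoints on CPU and summarize how their weights differ."""
    import torch
    from transformers import AutoModelForCausalLM

    state_dicts = []
    for path in (path_a, path_b):
        model = AutoModelForCausalLM.from_pretrained(path, torch_dtype=torch.float16)
        state_dicts.append(model.state_dict())
    return diff_state_dicts(*state_dicts)
```

Calling `compare_checkpoints("Bin12345/AutoCoder_S_6.7B", "deepseek-ai/deepseek-coder-6.7b-instruct")` would return three lists. If special tokens were added, the embedding and output-head tensors would be expected to show up under `changed` because their shapes differ; an empty `changed` list would mean the shared weights are identical.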
Hey @rombodawg, I just reproduced your experiment, and the outputs of these two models are different. You can add the two flags --save_generations and --save_generations_path to save and inspect the generated code. Thanks!
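Once both runs have saved their generations, the two files can be compared task by task. This is a rough sketch under assumptions: it assumes each saved file is a JSON list with one entry per task, where each entry is a list of generated strings, and the file names in the usage note are placeholders for whatever was passed to --save_generations_path.

```python
import json


def load_generations(path):
    """Read a saved generations file (assumed layout: list of tasks, each a list of strings)."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)


def differing_tasks(gens_a, gens_b):
    """Return the indices of tasks whose generated samples differ between two runs."""
    if len(gens_a) != len(gens_b):
        raise ValueError("the two runs cover a different number of tasks")
    return [i for i, (a, b) in enumerate(zip(gens_a, gens_b)) if a != b]
```

For example, `differing_tasks(load_generations("autocoder.json"), load_generations("deepseek.json"))` would list the HumanEval task indices where the two models wrote different code. Because the runs above sample at temperature 0.1, a few differences could also come from sampling noise, so greedy decoding gives a cleaner comparison.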