duyntnet committed on
Commit 10400a0 · verified · 1 Parent(s): 3dac452

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +104 -0
README.md ADDED
@@ -0,0 +1,104 @@
+ ---
+ license: other
+ language:
+ - en
+ pipeline_tag: text-generation
+ inference: false
+ tags:
+ - transformers
+ - gguf
+ - imatrix
+ - Seed-Coder-8B-Instruct
+ ---
+
+ Quantizations of https://huggingface.co/ByteDance-Seed/Seed-Coder-8B-Instruct
+
+
+ ### Open source inference clients/UIs
+ * [llama.cpp](https://github.com/ggerganov/llama.cpp)
+ * [KoboldCPP](https://github.com/LostRuins/koboldcpp)
+ * [text-generation-webui](https://github.com/oobabooga/text-generation-webui)
+
+ ### Closed source inference clients/UIs
+ * [LM Studio](https://lmstudio.ai/)
+ * More will be added...
+ ---
+
+ # From original readme
+
+ We are thrilled to introduce Seed-Coder, a powerful, transparent, and parameter-efficient family of open-source code models at the 8B scale, featuring base, instruct, and reasoning variants. Seed-Coder aims to advance open code models through the following highlights:
+
+ - **Model-centric:** Seed-Coder predominantly leverages LLMs instead of hand-crafted rules for code data filtering, minimizing manual effort in pretraining data construction.
+ - **Transparent:** We openly share detailed insights into our model-centric data pipeline, including methods for curating GitHub data, commit data, and code-related web data.
+ - **Powerful:** Seed-Coder achieves state-of-the-art performance among open-source models of comparable size across a diverse range of coding tasks.
+
+ This repo contains the **Seed-Coder-8B-Instruct** model, which has the following features:
+ - Type: Causal language model
+ - Training Stage: Pretraining & Post-training
+ - Data Source: Public datasets, synthetic data
+ - Context Length: 32,768 tokens
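As a rough illustration of what the context length means in practice, the sketch below computes how many tokens can still be generated after a prompt of a given size. It assumes, as is usual for causal language models, that the prompt and the generated tokens share a single 32,768-token window; the function name `max_new_tokens_allowed` is hypothetical and not part of any library.

```python
CONTEXT_LENGTH = 32_768  # context length listed for Seed-Coder-8B-Instruct


def max_new_tokens_allowed(prompt_tokens, context_length=CONTEXT_LENGTH):
    """Return how many tokens can still be generated after the prompt.

    Assumes prompt and generated tokens share one context window.
    """
    if prompt_tokens < 0:
        raise ValueError("prompt_tokens must be non-negative")
    # Clamp at zero: a prompt longer than the window leaves no room at all.
    return max(context_length - prompt_tokens, 0)
```

For example, a 30,000-token prompt would leave room for at most 2,768 new tokens under this assumption.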
+
+
+ ## Model Downloads
+ | Model Name | Length | Download | Notes |
+ |------------|--------|----------|-------|
+ | Seed-Coder-8B-Base | 32K | 🤗 [Model](https://huggingface.co/ByteDance-Seed/Seed-Coder-8B-Base) | Pretrained on our model-centric code data. |
+ | 👉 **Seed-Coder-8B-Instruct** | 32K | 🤗 [Model](https://huggingface.co/ByteDance-Seed/Seed-Coder-8B-Instruct) | Instruction-tuned for alignment with user intent. |
+ | Seed-Coder-8B-Reasoning | 64K | 🤗 [Model](https://huggingface.co/ByteDance-Seed/Seed-Coder-8B-Reasoning) | RL trained to boost reasoning capabilities. |
+ | Seed-Coder-8B-Reasoning-bf16 | 64K | 🤗 [Model](https://huggingface.co/ByteDance-Seed/Seed-Coder-8B-Reasoning-bf16) | RL trained to boost reasoning capabilities. |
+
+ ## Requirements
+ You will need to install the latest versions of `transformers` and `accelerate`:
+
+ ```bash
+ pip install -U transformers accelerate
+ ```
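To confirm that the install worked, a small check like the following may help. It is only a sketch that uses the standard-library `importlib.metadata`; the helper name `installed_versions` is hypothetical, and the README itself does not state a minimum version.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages=("transformers", "accelerate")):
    """Map each package name to its installed version, or None if it is missing."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    for name, ver in installed_versions().items():
        print(f"{name}: {ver or 'not installed'}")
```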
+
+ ## Quickstart
+
+ Here is a simple example demonstrating how to load the model and generate code with `AutoTokenizer` and `AutoModelForCausalLM` from Hugging Face `transformers`:
+
+ ```python
+ import torch
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+
+ model_id = "ByteDance-Seed/Seed-Coder-8B-Instruct"
+
+ # Load the tokenizer and the model in bfloat16, placing weights automatically.
+ tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
+ model = AutoModelForCausalLM.from_pretrained(
+     model_id,
+     torch_dtype=torch.bfloat16,
+     device_map="auto",
+     trust_remote_code=True,
+ )
+
+ messages = [
+     {"role": "user", "content": "Write a quick sort algorithm."},
+ ]
+
+ # Apply the chat template and tokenize; return_dict=True also returns the attention mask.
+ inputs = tokenizer.apply_chat_template(
+     messages,
+     tokenize=True,
+     add_generation_prompt=True,
+     return_dict=True,
+     return_tensors="pt",
+ ).to(model.device)
+
+ outputs = model.generate(**inputs, max_new_tokens=512)
+ # Decode only the newly generated tokens, skipping the prompt.
+ prompt_length = inputs["input_ids"].shape[-1]
+ response = tokenizer.decode(outputs[0][prompt_length:], skip_special_tokens=True)
+ print(response)
+ ```
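The Quickstart above loads the original checkpoint through `transformers`. For the GGUF quantizations in this repo, one possible route is the `llama-cpp-python` bindings for llama.cpp. The following is a minimal sketch, not a tested recipe: the model path is a placeholder you must supply, the helper names `build_chat_request` and `generate_with_gguf` are hypothetical, and it assumes the GGUF file embeds a chat template that `create_chat_completion` can use.

```python
def build_chat_request(prompt, max_tokens=512):
    """Build keyword arguments for llama-cpp-python's create_chat_completion."""
    return {
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }


def generate_with_gguf(model_path, prompt, n_ctx=8192):
    """Load a local GGUF file and return the assistant's reply as a string.

    n_ctx is kept below the model's 32,768-token limit to reduce memory use.
    """
    # Imported lazily so build_chat_request works without llama-cpp-python installed.
    from llama_cpp import Llama

    llm = Llama(model_path=model_path, n_ctx=n_ctx)
    result = llm.create_chat_completion(**build_chat_request(prompt))
    return result["choices"][0]["message"]["content"]


if __name__ == "__main__":
    import sys

    # Hypothetical usage: pass the path to a downloaded GGUF file as the first argument.
    if len(sys.argv) > 1:
        print(generate_with_gguf(sys.argv[1], "Write a quick sort algorithm."))
```

The clients listed at the top of this README (llama.cpp, KoboldCPP, text-generation-webui, LM Studio) load the same GGUF files without any code.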
+
+ ## Evaluation
+
+ Seed-Coder-8B-Instruct has been evaluated on a wide range of coding tasks, including code generation, code reasoning, code editing, and software engineering. Among the ~8B open-source models compared below, it achieves the best score on five of the six benchmarks (all except HumanEval).
+
+ | Model | HumanEval | MBPP | MHPP | BigCodeBench (Full) | BigCodeBench (Hard) | LiveCodeBench (2410 – 2502) |
+ |:-----------------------------:|:---------:|:----:|:----:|:-------------------:|:-------------------:|:-------------------------:|
+ | CodeLlama-7B-Instruct | 40.9 | 54.0 | 6.7 | 25.7 | 4.1 | 3.6 |
+ | DeepSeek-Coder-6.7B-Instruct | 74.4 | 74.9 | 20.0 | 43.8 | 15.5 | 9.6 |
+ | CodeQwen1.5-7B-Chat | 83.5 | 77.7 | 17.6 | 43.6 | 15.5 | 3.0 |
+ | Yi-Coder-9B-Chat | 82.3 | 82.0 | 26.7 | 49.0 | 17.6 | 17.5 |
+ | Llama-3.1-8B-Instruct | 68.3 | 70.1 | 17.1 | 40.5 | 13.5 | 11.5 |
+ | OpenCoder-8B-Instruct | 83.5 | 79.1 | 30.5 | 50.9 | 18.9 | 17.1 |
+ | Qwen2.5-Coder-7B-Instruct | **88.4** | 83.5 | 26.7 | 48.8 | 20.3 | 17.3 |
+ | Qwen3-8B | 84.8 | 77.0 | 32.8 | 51.7 | 23.0 | 23.5 |
+ | Seed-Coder-8B-Instruct | 84.8 | **85.2** | **36.2** | **53.3** | **26.4** | **24.7** |
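To make the comparison concrete, the sketch below recomputes Seed-Coder-8B-Instruct's margin over the best other listed model on each benchmark. The scores are copied from the table above; the function name `lead_over_best_other` is hypothetical and only serves this illustration.

```python
BENCHMARKS = [
    "HumanEval", "MBPP", "MHPP",
    "BigCodeBench (Full)", "BigCodeBench (Hard)", "LiveCodeBench",
]

# Scores copied from the evaluation table, in the same column order as BENCHMARKS.
ROWS = {
    "CodeLlama-7B-Instruct": [40.9, 54.0, 6.7, 25.7, 4.1, 3.6],
    "DeepSeek-Coder-6.7B-Instruct": [74.4, 74.9, 20.0, 43.8, 15.5, 9.6],
    "CodeQwen1.5-7B-Chat": [83.5, 77.7, 17.6, 43.6, 15.5, 3.0],
    "Yi-Coder-9B-Chat": [82.3, 82.0, 26.7, 49.0, 17.6, 17.5],
    "Llama-3.1-8B-Instruct": [68.3, 70.1, 17.1, 40.5, 13.5, 11.5],
    "OpenCoder-8B-Instruct": [83.5, 79.1, 30.5, 50.9, 18.9, 17.1],
    "Qwen2.5-Coder-7B-Instruct": [88.4, 83.5, 26.7, 48.8, 20.3, 17.3],
    "Qwen3-8B": [84.8, 77.0, 32.8, 51.7, 23.0, 23.5],
    "Seed-Coder-8B-Instruct": [84.8, 85.2, 36.2, 53.3, 26.4, 24.7],
}


def lead_over_best_other(rows, target):
    """Return target's score minus the best other model's score, per benchmark."""
    leads = {}
    for i, bench in enumerate(BENCHMARKS):
        best_other = max(s[i] for name, s in rows.items() if name != target)
        leads[bench] = round(rows[target][i] - best_other, 1)
    return leads


if __name__ == "__main__":
    for bench, lead in lead_over_best_other(ROWS, "Seed-Coder-8B-Instruct").items():
        print(f"{bench}: {lead:+.1f}")
```

A negative value means another listed model scored higher; with these numbers that happens only on HumanEval, where Qwen2.5-Coder-7B-Instruct leads by 3.6 points.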
+
+
+ For detailed benchmark performance, please refer to our [📑 Technical Report](https://github.com/ByteDance-Seed/Seed-Coder/blob/master/Seed-Coder.pdf).