aashish1904 committed (verified) · Commit 58a04e7 · 1 parent: 0f72524

Upload README.md with huggingface_hub

Files changed (1): README.md (+103, -0)
---
license: apache-2.0
datasets:
- open-r1/codeforces-cots
language:
- en
base_model:
- Qwen/Qwen2.5-Coder-7B-Instruct
pipeline_tag: text-generation
library_name: transformers
---

[![QuantFactory Banner](https://lh7-rt.googleusercontent.com/docsz/AD_4nXeiuCm7c8lEwEJuRey9kiVZsRn2W-b4pWlu3-X534V3YmVuVc2ZL-NXg2RkzSOOS2JXGHutDuyyNAUtdJI65jGTo8jT9Y99tMi4H4MqL44Uc5QKG77B0d6-JfIkZHFaUA71-RtjyYZWVIhqsNZcx8-OMaA?key=xt3VSDoCbmTY7o-cwwOFwQ)](https://hf.co/QuantFactory)

# QuantFactory/OlympicCoder-7B-GGUF
This is a quantized version of [open-r1/OlympicCoder-7B](https://huggingface.co/open-r1/OlympicCoder-7B) created using llama.cpp.

# Original Model Card

# Model Card for OlympicCoder-7B

OlympicCoder-7B is a code model that achieves strong performance on competitive coding benchmarks such as LiveCodeBench and the 2024 International Olympiad in Informatics.

* Repository: https://github.com/huggingface/open-r1
* Blog post: https://huggingface.co/blog/open-r1/update-3

## Model description

- **Model type:** A 7B-parameter model fine-tuned on a decontaminated version of the CodeForces dataset.
- **Language(s) (NLP):** Primarily English
- **License:** apache-2.0
- **Finetuned from model:** [Qwen/Qwen2.5-Coder-7B-Instruct](https://huggingface.co/Qwen/Qwen2.5-Coder-7B-Instruct)

## Evaluation

We compare the performance of OlympicCoder models on two main benchmarks for competitive coding:

* **[IOI'2024](https://github.com/huggingface/ioi):** 6 very challenging problems from the 2024 International Olympiad in Informatics. Models are allowed up to 50 submissions per problem.
* **[LiveCodeBench](https://livecodebench.github.io):** Python programming problems sourced from platforms like CodeForces and LeetCode. We use the `v4_v5` subset of [`livecodebench/code_generation_lite`](https://huggingface.co/datasets/livecodebench/code_generation_lite), which corresponds to 268 problems. We use `lighteval` to evaluate models on LiveCodeBench with the sampling parameters described [here](https://github.com/huggingface/open-r1?tab=readme-ov-file#livecodebench).

> [!NOTE]
> The OlympicCoder models were post-trained exclusively on C++ solutions generated by DeepSeek-R1. As a result, the performance on LiveCodeBench should be considered partially _out-of-domain_, since this benchmark expects models to output solutions in Python.

### IOI'24

![](./ioi-evals.png)

### LiveCodeBench

![](./lcb-evals.png)

## Usage

Here's how you can run the model using the `pipeline()` function from 🤗 Transformers:

```python
# pip install transformers accelerate

import torch
from transformers import pipeline

pipe = pipeline("text-generation", model="open-r1/OlympicCoder-7B", torch_dtype=torch.bfloat16, device_map="auto")

# We use the tokenizer's chat template to format each message - see https://huggingface.co/docs/transformers/main/en/chat_templating
messages = [
    {"role": "user", "content": "Write a python program to calculate the 10th Fibonacci number"},
]
prompt = pipe.tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
outputs = pipe(prompt, max_new_tokens=8000, do_sample=True, temperature=0.7, top_k=50, top_p=0.95)
print(outputs[0]["generated_text"])
#<|im_start|>user
#Write a python program to calculate the 10th fibonacci number<|im_end|>
#<|im_start|>assistant
#<think>Okay, I need to write a Python program that calculates the 10th Fibonacci number. Hmm, the Fibonacci sequence starts with 0 and 1. Each subsequent number is the sum of the two preceding ones. So the sequence goes: 0, 1, 1, 2, 3, 5, 8, 13, 21, 34, and so on. ...
```

> [!WARNING]
> To ensure that the model consistently outputs a long chain-of-thought, we have edited the chat template to prefill the first assistant turn with a `<think>` token. As a result, the outputs from this model will not show the opening `<think>` token if you use the model's `generate()` method. To apply reinforcement learning with a format reward, either prepend the `<think>` token to the model's completions or amend the chat template to remove the prefill.

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:

- dataset: open-r1/codeforces-cots
- learning_rate: 4.0e-5
- train_batch_size: 2
- seed: 42
- packing: false
- distributed_type: deepspeed-zero-3
- num_devices: 8
- gradient_accumulation_steps: 8
- total_train_batch_size: 16
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine_with_min_lr
- min_lr_rate: 0.1
- lr_scheduler_warmup_ratio: 0.03
- num_epochs: 10.0
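The `cosine_with_min_lr` schedule warms up linearly, then decays the learning rate along a cosine curve to a floor of `min_lr_rate × learning_rate` (here 4.0e-6). A minimal sketch of that shape, reimplemented for illustration (not the `transformers` scheduler itself), using the values from the list above:

```python
import math

def cosine_with_min_lr(step, total_steps, peak_lr=4.0e-5,
                       warmup_ratio=0.03, min_lr_rate=0.1):
    """Linear warmup, then cosine decay from peak_lr to min_lr_rate * peak_lr."""
    warmup_steps = int(total_steps * warmup_ratio)
    min_lr = peak_lr * min_lr_rate
    if step < warmup_steps:
        return peak_lr * (step + 1) / warmup_steps
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    cosine = 0.5 * (1.0 + math.cos(math.pi * progress))  # 1.0 -> 0.0
    return min_lr + (peak_lr - min_lr) * cosine

total = 1000  # hypothetical run length for illustration
assert abs(cosine_with_min_lr(30, total) - 4.0e-5) < 1e-12   # peak after warmup
assert abs(cosine_with_min_lr(total, total) - 4.0e-6) < 1e-12  # floor = 0.1 * peak
```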