Upload folder using huggingface_hub

Browse files

Files changed (3) hide show

.gitattributes +1 -0
README.md +67 -1
results.png +3 -0

.gitattributes CHANGED Viewed

@@ -33,3 +33,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.zip filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text

 *.zip filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
+results.png filter=lfs diff=lfs merge=lfs -text

README.md CHANGED Viewed

@@ -1,3 +1,69 @@
 ---
-license: mit
 ---

 ---
+license: apache-2.0
+base_model:
+  - IIGroup/X-Coder-SFT-Qwen3-8B
+datasets:
+  - IIGroup/X-Coder-RL-40k
+language:
+  - en
+tags:
+  - code
+  - rl
+  - competitive-programming
 ---
+# X-Coder-RL-Qwen3-8B
+X-Coder-RL-Qwen3-8B is a code generation model trained with RLVR (Reinforcement Learning with Verifiable Rewards) on fully synthetic data, achieving state-of-the-art performance on competitive programming benchmarks.
+## Model Description
+- **Base Model**: [IIGroup/X-Coder-SFT-Qwen3-8B](https://huggingface.co/IIGroup/X-Coder-SFT-Qwen3-8B)
+- **Training Method**: RLVR (Reinforcement Learning with Verifiable Rewards)
+- **Training Data**: [IIGroup/X-Coder-RL-40k](https://huggingface.co/datasets/IIGroup/X-Coder-RL-40k) (40k fully synthetic tasks)
+- **Parameters**: 8B
+## Performance
+**LiveCodeBench Average Performance: 64.0**
+![Results](results.png)
+**Performance on LiveCodeBench v5.** X-Coder shows strong coding expertise with fewer, fully synthetic tasks, and achieves additional gains through subsequent RL stages.
+## Usage
+```python
+from transformers import AutoModelForCausalLM, AutoTokenizer
+model_name = "IIGroup/X-Coder-RL-Qwen3-8B"
+tokenizer = AutoTokenizer.from_pretrained(model_name)
+model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto")
+prompt = "Write a Python function to solve the two sum problem."
+inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
+outputs = model.generate(**inputs, max_new_tokens=512)
+print(tokenizer.decode(outputs[0], skip_special_tokens=True))
+```
+## Training
+This model was trained using the X-Coder RLVR framework. For training details and code, please refer to the [X-Coder GitHub repository](https://github.com/JieWu02/X-Coder).
+## Citation
+```bibtex
+@inproceedings{
+anonymous2025xcoder,
+title={X-Coder: Advancing Competitive Programming with Fully Synthetic Tasks, Solutions, and Tests},
+author={Anonymous},
+booktitle={Submitted to The Fourteenth International Conference on Learning Representations},
+year={2025},
+url={https://openreview.net/forum?id=jp4dzBilqH},
+note={under review}
+}
+```
+## License
+This project is licensed under the Apache License 2.0.

results.png ADDED Viewed

Git LFS Details

SHA256: 1a65e6a94d2e445ded612e3b2f401e9cbd64766893462f545dcb4baefd18c60c
Pointer size: 131 Bytes
Size of remote file: 445 kB