L0-PolyCore-4B-Base / README.md
---
license: other
language:
- en
base_model:
- Qwen/Qwen3-4B-Base
base_model_relation: finetune
tags:
- code
- c
- clang
- cpp
- c++
- qlora
- cpt
library_name: transformers
pipeline_tag: text-generation
---
## Training Data
This model was trained on a curated dataset of C/C++ code released under multiple licenses, including GPL-2.0, Apache-2.0, MIT, public domain dedications, and some source-available licenses.
The original authors are not affiliated with or responsible for this model.
## Base Model
Base model: [Qwen/Qwen3-4B-Base](https://huggingface.co/Qwen/Qwen3-4B-Base)
## Fine-tuning Method
- Adapter: QLoRA
- Method: continued pretraining (CPT)
- Precision: trained with 4-bit base weights + BF16 compute, then merged to safetensors
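
The setup above (4-bit base weights, BF16 compute, adapter merged afterwards) might look roughly like the following sketch with `transformers`, `peft`, and `bitsandbytes`. The quantization type, LoRA rank, alpha, dropout, and target modules are illustrative assumptions; this card does not state the values actually used, and the function names are hypothetical.

```python
def qlora_settings():
    """Return illustrative QLoRA settings as plain dicts."""
    quant = {
        "load_in_4bit": True,                   # 4-bit base weights (stated in this card)
        "bnb_4bit_quant_type": "nf4",           # assumption: quant type is not stated
        "bnb_4bit_compute_dtype": "bfloat16",   # BF16 compute (stated in this card)
    }
    lora = {
        "r": 16,                                # assumption
        "lora_alpha": 32,                       # assumption
        "lora_dropout": 0.05,                   # assumption
        "target_modules": ["q_proj", "k_proj", "v_proj", "o_proj"],  # assumption
        "task_type": "CAUSAL_LM",
    }
    return quant, lora


def build_qlora_model(base_id="Qwen/Qwen3-4B-Base"):
    """Load the base model in 4-bit and attach a trainable LoRA adapter."""
    import torch
    from transformers import AutoModelForCausalLM, BitsAndBytesConfig
    from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

    quant, lora = qlora_settings()
    quant = dict(quant, bnb_4bit_compute_dtype=torch.bfloat16)
    model = AutoModelForCausalLM.from_pretrained(
        base_id,
        quantization_config=BitsAndBytesConfig(**quant),
        device_map="auto",
    )
    model = prepare_model_for_kbit_training(model)
    return get_peft_model(model, LoraConfig(**lora))


def merge_adapter(base_id, adapter_dir, out_dir):
    """Reload the base in BF16, fold the adapter in, and save as safetensors."""
    import torch
    from transformers import AutoModelForCausalLM
    from peft import PeftModel

    base = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype=torch.bfloat16)
    merged = PeftModel.from_pretrained(base, adapter_dir).merge_and_unload()
    merged.save_pretrained(out_dir, safe_serialization=True)
```

Merging into a BF16 copy of the base, rather than the 4-bit one, is a common choice because it keeps the exported weights unquantized; whether this model was merged that way is not documented here.
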
## Training Details
- Training time: ~74 hours
- Hardware: 1x NVIDIA RTX 5060 Ti
## Notes
- This is an **L0 base model**. It is not instruction-tuned and may be more verbose than an instruct model when given strict formatting requests.
- Recommended usage is raw code continuation, or pairing with an external template strategy.
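
As a minimal sketch of raw code continuation with `transformers`: the prompt is plain source text with no chat template, and only the newly generated tokens are returned. The `// File:` framing line is one hypothetical "external template", not a format the model is known to have been trained on, and `model_id` is left as a parameter because the repository id is an assumption you need to supply. `device_map="auto"` assumes `accelerate` is installed.

```python
def build_continuation_prompt(path, prefix):
    """Frame a code prefix as the start of a source file (hypothetical template)."""
    return f"// File: {path}\n{prefix}"


def complete(prompt, model_id, max_new_tokens=128):
    """Greedy continuation of `prompt`; returns only the generated text."""
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tok = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id, torch_dtype="auto", device_map="auto"
    )
    inputs = tok(prompt, return_tensors="pt").to(model.device)
    out = model.generate(**inputs, max_new_tokens=max_new_tokens, do_sample=False)
    # Drop the prompt tokens so the caller sees only the continuation.
    new_tokens = out[0][inputs["input_ids"].shape[1]:]
    return tok.decode(new_tokens, skip_special_tokens=True)


if __name__ == "__main__":
    import sys

    # Usage: python complete.py <model-id-or-local-path>
    prompt = build_continuation_prompt(
        "sum.c", "#include <stddef.h>\n\nint sum(const int *a, size_t n) {\n"
    )
    print(complete(prompt, sys.argv[1]))
```

Ending the prompt inside an open function body tends to suit a base model, since it only has to continue the code rather than interpret an instruction.
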
## Intended Use
- Code generation for C/C++
- Fast code completion
- Examples and prototyping
## Constraints
- May produce incorrect code.
- May reproduce identifiable upstream code fragments (including license headers) when prompted.
- Verify outputs, especially for memory safety and security-sensitive code.
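
One way to act on the last two points is a small pre-check on generated text, sketched below. The marker patterns and helper names are hypothetical and far from exhaustive; a match only flags output for manual license review. The compile command uses the standard `-fsanitize=address,undefined` flag supported by Clang and GCC, with `clang` as an assumed default compiler.

```python
import re

# Heuristic patterns for license or copyright text in generated output.
LICENSE_MARKERS = re.compile(
    r"SPDX-License-Identifier"
    r"|Copyright\s*(\(c\)|©)?\s*\d{4}"
    r"|GNU General Public License"
    r"|Licensed under the Apache License",
    re.IGNORECASE,
)


def find_license_markers(text):
    """Return every license-like marker found in `text`, in order."""
    return [m.group(0) for m in LICENSE_MARKERS.finditer(text)]


def sanitizer_compile_command(source_path, compiler="clang"):
    """Build a compile command with warnings and ASan/UBSan enabled."""
    return [
        compiler, "-Wall", "-Wextra", "-g",
        "-fsanitize=address,undefined",
        source_path, "-o", source_path + ".out",
    ]
```

For C++ output, pass `compiler="clang++"`. The returned list can be handed to `subprocess.run`, after which running the resulting binary against test inputs lets the sanitizers report memory errors and undefined behavior at runtime.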