---
language:
- en
license: other
library_name: transformers
pipeline_tag: text-generation
tags:
- python
- code-generation
- code-assistant
- causal-lm
- full-finetune
- hunyuan
- transformers
- safetensors
- instruct
base_model:
- tencent/Hunyuan-0.5B-Instruct
model-index:
- name: Hunyuan-PythonGOD-0.5B
  results: []
datasets:
- WithinUsAI/Python_GOD_Coder_Omniforge_AI_12k
- WithinUsAI/Python_GOD_Coder_5k
- WithinUsAI/Legend_Python_CoderV.1
---
# Hunyuan-PythonGOD-0.5B

Hunyuan-PythonGOD-0.5B is a Python-focused full fine-tune of `tencent/Hunyuan-0.5B-Instruct`, built for code generation, coding assistance, implementation tasks, and instruction following in Python-heavy workflows.

This release is intended as a compact coding model that keeps the small footprint of the 0.5B Hunyuan base while shifting its behavior toward practical Python generation and code-oriented responses.

## Model Details

### Model Description

- **Model name:** `gss1147/Hunyuan-PythonGOD-0.5B`
- **Base model:** `tencent/Hunyuan-0.5B-Instruct`
- **Architecture:** causal decoder-only language model
- **Model family tag:** `hunyuan_v1_dense`
- **Primary domain:** Python coding / coding assistant
- **Parameter count:** ~0.5B
- **Weights format:** safetensors
- **Tensor type in repo:** F16

### Developed by

- **Shared by:** `gss1147`

### Finetuned from model

- `tencent/Hunyuan-0.5B-Instruct`

## Intended Uses

### Direct Use

This model is intended for:

- Python function generation
- Python script writing
- debugging-oriented coding help
- implementation tasks
- code completion
- coding chat assistants
- lightweight local or cloud inference where a small coding model is preferred

### Downstream Use

Possible downstream uses include:

- code copilots
- coding bots
- Python tutoring helpers
- automation script generation
- benchmark experimentation for small code LLMs

### Out-of-Scope Use

This model is not designed for:

- safety-critical code deployment without human review
- medical, legal, or financial decision support
- secure production code without auditing
- autonomous execution pipelines without sandboxing
- guaranteed factual or bug-free code generation

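On the sandboxing point, generated code should at minimum run in a separate process rather than the host interpreter. A minimal standard-library sketch (the `run_untrusted` helper and 5-second timeout are illustrative assumptions, not part of this release):

```python
import subprocess
import sys

def run_untrusted(code: str, timeout: float = 5.0) -> str:
    """Run generated Python in a separate interpreter process with a timeout.

    A minimal guard, not a real sandbox: it bounds runaway execution but does
    not restrict filesystem or network access.
    """
    result = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    if result.returncode != 0:
        raise RuntimeError(result.stderr.strip())
    return result.stdout.strip()

print(run_untrusted("print(sum(range(10)))"))  # prints 45
```

Real deployments should add stronger isolation (containers, seccomp, or a dedicated sandbox service) on top of this.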
## Training Details

### Training Objective

This model was trained as a **full fine-tune**, not as an adapter-only release.

Based on the training workflow and run logs, this release represents:

- **full-parameter fine-tuning**
- **no LoRA**
- **no QLoRA**
- **no PEFT adapters in the final model**
- **standard exported Hugging Face model weights**

### Training Data

This model was trained on the following datasets:

- `WithinUsAI/Python_GOD_Coder_Omniforge_AI_12k`
- `WithinUsAI/Python_GOD_Coder_5k`
- `WithinUsAI/Legend_Python_CoderV.1`

According to the training logs, the combined training corpus used:

- **11,760 rows** from `Python_GOD_Coder_Omniforge_AI_12k`
- **5,000 rows** from `Python_GOD_Coder_5k`
- **5,000 rows** from `Legend_Python_CoderV.1`

**Total rows:** **21,760**

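The row counts above merge into a single shuffled training list; a standard-library sketch of that merge (the placeholder rows and the seed of 42 are illustrative — the real corpus holds prompt/response pairs):

```python
import random

# Row counts reported in the training logs above.
sources = {
    "Python_GOD_Coder_Omniforge_AI_12k": 11_760,
    "Python_GOD_Coder_5k": 5_000,
    "Legend_Python_CoderV.1": 5_000,
}

# Placeholder rows; in the real run each entry is a training example.
combined = [(name, i) for name, count in sources.items() for i in range(count)]

random.Random(42).shuffle(combined)  # fixed seed for a reproducible shuffle

print(len(combined))  # prints 21760
```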
### Training Procedure

According to the training setup, this model was trained with:

- **dual-GPU Kaggle training**
- **DeepSpeed-assisted distributed training**
- **full model fine-tuning**
- **evaluation during training**
- **a final-save upload flow to Hugging Face**

### Sequence Length

- **Practical fine-tuning sequence length:** 4096 tokens

### Context Window Note

If the base model family exposes a larger context length in its config metadata, that should not be taken as evidence that this fine-tuning run was performed at that length. Treat this release as fine-tuned at **4096 tokens** unless revalidated separately.

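In practice this means clamping prompts to the 4096-token fine-tuned window rather than any larger length the config advertises. A minimal sketch (the helper name and the 512-token output reserve are illustrative assumptions):

```python
FINETUNE_MAX_LEN = 4096  # fine-tuning sequence length stated above

def clamp_to_finetune_window(token_ids: list[int],
                             reserve_for_output: int = 512) -> list[int]:
    """Keep the most recent tokens so prompt + generation fit in the window."""
    budget = FINETUNE_MAX_LEN - reserve_for_output
    return token_ids[-budget:] if len(token_ids) > budget else token_ids

print(len(clamp_to_finetune_window(list(range(10_000)))))  # prints 3584
```

Keeping the tail (rather than the head) of the token list preserves the most recent conversational context, which is usually what a coding assistant needs.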
## Evaluation

Formal benchmark results are not finalized in this card.

Benchmark attempts were made on free public coding benchmarks such as:

- HumanEval+
- MBPP+
- BigCodeBench-style workflows

However, the harness setup encountered tool/runtime issues during some benchmark attempts, so this card does **not** claim final official benchmark scores yet.

### Observed Training Behavior

During training, the run logs showed:

- a strong reduction in training loss over time
- a strong reduction in eval loss over time
- stable continued learning well into the run
- increasingly code-specialized behavior relative to the base model

The logged eval-loss progression was broadly decreasing, with values around:

- ~0.2879 early in training
- ~0.1071
- ~0.0604
- ~0.0550
- ~0.0422
- ~0.0329
- ~0.0266
- ~0.0299
- ~0.0290

These are training/eval-run observations, not official public benchmark scores.

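As a rough check on the reported trend, the drop from the first to the last listed eval value works out to about 90% (values copied from the progression above):

```python
eval_losses = [0.2879, 0.1071, 0.0604, 0.0550, 0.0422,
               0.0329, 0.0266, 0.0299, 0.0290]  # from the list above

reduction = (eval_losses[0] - eval_losses[-1]) / eval_losses[0]
print(f"{reduction:.1%}")  # prints 89.9%
```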
## How to Use

### Transformers

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_id = "gss1147/Hunyuan-PythonGOD-0.5B"

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    trust_remote_code=True,
    torch_dtype=torch.float16,
    device_map="auto",
)

prompt = "Write a Python function that merges overlapping intervals."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=512,
        do_sample=False,
    )

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```