---
language:
- en
license: other
library_name: transformers
pipeline_tag: text-generation
tags:
- python
- code-generation
- code-assistant
- causal-lm
- full-finetune
- hunyuan
- transformers
- safetensors
- instruct
base_model:
- tencent/Hunyuan-0.5B-Instruct
model-index:
- name: Hunyuan-PythonGOD-0.5B
  results: []
datasets:
- WithinUsAI/Python_GOD_Coder_Omniforge_AI_12k
- WithinUsAI/Python_GOD_Coder_5k
- WithinUsAI/Legend_Python_CoderV.1
---
# Hunyuan-PythonGOD-0.5B

Hunyuan-PythonGOD-0.5B is a Python-focused full fine-tune of `tencent/Hunyuan-0.5B-Instruct`, built for code generation, coding assistance, implementation tasks, and instruction following in Python-heavy workflows.

This release is intended as a compact coding model that keeps the small footprint of the 0.5B Hunyuan base while shifting its behavior toward practical Python generation and code-oriented responses.

## Model Details

### Model Description

- **Model name:** `gss1147/Hunyuan-PythonGOD-0.5B`
- **Base model:** `tencent/Hunyuan-0.5B-Instruct`
- **Architecture:** causal decoder-only language model
- **Model family tag:** `hunyuan_v1_dense`
- **Primary domain:** Python coding / coding assistant
- **Parameter count:** ~0.5B
- **Weights format:** safetensors
- **Tensor type in repo:** F16

### Developed by

- **Shared by:** `gss1147`

### Finetuned from model

- `tencent/Hunyuan-0.5B-Instruct`

## Intended Uses

### Direct Use

This model is intended for:

- Python function generation
- Python script writing
- debugging-oriented coding help
- implementation tasks
- code completion
- coding chat assistants
- lightweight local or cloud inference where a small coding model is preferred

### Downstream Use

Possible downstream uses include:

- code copilots
- coding bots
- Python tutoring helpers
- automation script generation
- benchmark experimentation for small code LLMs

### Out-of-Scope Use

This model is not designed for:

- safety-critical code deployment without human review
- medical, legal, or financial decision support
- secure production code without auditing
- autonomous execution pipelines without sandboxing
- guaranteed factual or bug-free code generation

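On the sandboxing point, generated code should at minimum run in a separate process rather than the host interpreter. A minimal standard-library sketch (the `run_untrusted` helper and 5-second timeout are illustrative assumptions, not part of this release):

```python
import subprocess
import sys

def run_untrusted(code: str, timeout: float = 5.0) -> str:
    """Run generated Python in a separate interpreter process with a timeout.

    A minimal guard, not a real sandbox: it bounds runaway execution but does
    not restrict filesystem or network access.
    """
    result = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    if result.returncode != 0:
        raise RuntimeError(result.stderr.strip())
    return result.stdout.strip()

print(run_untrusted("print(sum(range(10)))"))  # prints 45
```

Real deployments should add stronger isolation (containers, seccomp, or a dedicated sandbox service) on top of this.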
## Training Details

### Training Objective

This model was trained as a **full fine-tune**, not as an adapter-only release.

Based on the training workflow and run logs, this release represents:

- **full-parameter fine-tuning**
- **no LoRA**
- **no QLoRA**
- **no PEFT adapters in the final model**
- **standard exported Hugging Face model weights**

### Training Data

This model was trained on the following datasets:

- `WithinUsAI/Python_GOD_Coder_Omniforge_AI_12k`
- `WithinUsAI/Python_GOD_Coder_5k`
- `WithinUsAI/Legend_Python_CoderV.1`

According to the training logs, the combined training corpus used:

- **11,760 rows** from `Python_GOD_Coder_Omniforge_AI_12k`
- **5,000 rows** from `Python_GOD_Coder_5k`
- **5,000 rows** from `Legend_Python_CoderV.1`

**Total rows:** **21,760**

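The row counts above merge into a single shuffled training list; a standard-library sketch of that merge (the placeholder rows and the seed of 42 are illustrative — the real corpus holds prompt/response pairs):

```python
import random

# Row counts reported in the training logs above.
sources = {
    "Python_GOD_Coder_Omniforge_AI_12k": 11_760,
    "Python_GOD_Coder_5k": 5_000,
    "Legend_Python_CoderV.1": 5_000,
}

# Placeholder rows; in the real run each entry is a training example.
combined = [(name, i) for name, count in sources.items() for i in range(count)]

random.Random(42).shuffle(combined)  # fixed seed for a reproducible shuffle

print(len(combined))  # prints 21760
```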
### Training Procedure

According to the training setup, this model was trained with:

- **dual-GPU Kaggle training**
- **DeepSpeed-assisted distributed training**
- **full model fine-tuning**
- **evaluation during training**
- **a final-save upload flow to Hugging Face**

### Sequence Length

- **Practical fine-tuning sequence length:** 4096 tokens

### Context Window Note

If the base model family exposes a larger context length in its config metadata, that should not be taken as evidence that this fine-tuning run was performed at that length. Treat this release as fine-tuned at **4096 tokens** unless revalidated separately.

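In practice this means clamping prompts to the 4096-token fine-tuned window rather than any larger length the config advertises. A minimal sketch (the helper name and the 512-token output reserve are illustrative assumptions):

```python
FINETUNE_MAX_LEN = 4096  # fine-tuning sequence length stated above

def clamp_to_finetune_window(token_ids: list[int],
                             reserve_for_output: int = 512) -> list[int]:
    """Keep the most recent tokens so prompt + generation fit in the window."""
    budget = FINETUNE_MAX_LEN - reserve_for_output
    return token_ids[-budget:] if len(token_ids) > budget else token_ids

print(len(clamp_to_finetune_window(list(range(10_000)))))  # prints 3584
```

Keeping the tail (rather than the head) of the token list preserves the most recent conversational context, which is usually what a coding assistant needs.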
## Evaluation

Formal benchmark results are not finalized in this card.

Benchmark attempts were made on free public coding benchmarks such as:

- HumanEval+
- MBPP+
- BigCodeBench-style workflows

However, the harness setup encountered tool/runtime issues during some benchmark attempts, so this card does **not** claim final official benchmark scores yet.

### Observed Training Behavior

During training, the run logs showed:

- a strong reduction in training loss over time
- a strong reduction in eval loss over time
- stable continued learning well into the run
- increasingly code-specialized behavior relative to the base model

The logged eval-loss progression was broadly decreasing, with values around:

- ~0.2879 early in training
- ~0.1071
- ~0.0604
- ~0.0550
- ~0.0422
- ~0.0329
- ~0.0266
- ~0.0299
- ~0.0290

These are training/eval-run observations, not official public benchmark scores.

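As a rough check on the reported trend, the drop from the first to the last listed eval value works out to about 90% (values copied from the progression above):

```python
eval_losses = [0.2879, 0.1071, 0.0604, 0.0550, 0.0422,
               0.0329, 0.0266, 0.0299, 0.0290]  # from the list above

reduction = (eval_losses[0] - eval_losses[-1]) / eval_losses[0]
print(f"{reduction:.1%}")  # prints 89.9%
```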
## How to Use

### Transformers

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_id = "gss1147/Hunyuan-PythonGOD-0.5B"

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    trust_remote_code=True,
    torch_dtype=torch.float16,
    device_map="auto",
)

prompt = "Write a Python function that merges overlapping intervals."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=512,
        do_sample=False,
    )

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```