metadata
language:
  - en
license: apache-2.0
tags:
  - code
  - python
  - lora
  - qwen
  - fine-tuned
  - code-generation
base_model: Qwen/Qwen2.5-Coder-0.5B-Instruct

🔥 Pyroton

A lightweight Python code generation model fine-tuned from Qwen2.5-Coder-0.5B-Instruct.


Overview

Pyroton is a lightweight Python-focused code generation model fine-tuned from Qwen/Qwen2.5-Coder-0.5B-Instruct using supervised fine-tuning (SFT) on Python instruction-style datasets.

The goal is a small, efficient model that handles easy-to-medium Python tasks while remaining practical on free-tier GPUs and in lightweight deployments, including mobile phones.


Model Variants

Base adapter

Prime-fix patched adapter


Quick Start

Load latest patched adapter (recommended)

from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

base = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2.5-Coder-0.5B-Instruct",
    dtype=torch.bfloat16,
    device_map="auto",
)

model = PeftModel.from_pretrained(base, "shohuu/pyroton-primefix-v3")
tokenizer = AutoTokenizer.from_pretrained("shohuu/pyroton-primefix-v3")
tokenizer.pad_token = tokenizer.eos_token

prompt = "### Instruction:\nWrite a Python function to reverse a string\n\n### Response:\n"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

outputs = model.generate(
    **inputs,
    max_new_tokens=220,
    do_sample=False,
    repetition_penalty=1.1,
)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
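
The decoded output above still contains the prompt, because generate() returns the input tokens followed by the completion. A minimal sketch of two helpers that build the instruction prompt used in Quick Start and strip it from the decoded text is shown below. The names build_prompt and extract_response are hypothetical and not part of the Pyroton repository.

```python
# Hypothetical helpers for illustration; they are not shipped with Pyroton.
INSTRUCTION_TAG = "### Instruction:\n"
RESPONSE_TAG = "\n\n### Response:\n"


def build_prompt(instruction: str) -> str:
    """Wrap a task description in the template used in Quick Start."""
    return f"{INSTRUCTION_TAG}{instruction.strip()}{RESPONSE_TAG}"


def extract_response(decoded: str) -> str:
    """Return only the text after the last response marker."""
    marker = "### Response:\n"
    _, found, tail = decoded.rpartition(marker)
    # If the marker is missing, fall back to the full decoded text.
    return tail.strip() if found else decoded.strip()
```

With these, the last line of Quick Start could become print(extract_response(tokenizer.decode(outputs[0], skip_special_tokens=True))).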

Run GGUF locally (Ollama / LM Studio / PocketPal)

# Download pyroton-q4.gguf from Files tab
# Then load in your preferred local LLM app

Recommended Inference Settings

Correctness mode (recommended)

Most reliable for code generation:

outputs = model.generate(
    **inputs,
    max_new_tokens=220,
    do_sample=False,
    repetition_penalty=1.1,
)

Sampled mode (more variety, less stable)

outputs = model.generate(
    **inputs,
    max_new_tokens=220,
    do_sample=True,
    temperature=0.1,
    top_p=0.9,
    repetition_penalty=1.2,
)

Example Output

Prompt:

Write a Python function to check if a number is prime

Pyroton Output:

import math

def is_prime(n):
    """Check if the given integer n is prime."""
    if n <= 1:
        return False
    for i in range(2, int(math.sqrt(n)) + 1):
        if n % i == 0:
            return False
    return True

Training Details

| Setting | Value |
| --- | --- |
| Base Model | Qwen2.5-Coder-0.5B-Instruct |
| Datasets | python_code_instructions_18k_alpaca, CodeAlpaca-20k, code_instructions_122k_alpaca_style |
| Total Samples | ~95,362 (Python-filtered) |
| Training Strategy | Chunked SFT (5 chunks) |
| LoRA Rank | 16 |
| LoRA Alpha | 32 |
| Batch Size | 2 |
| Gradient Accumulation | 8 |
| Learning Rate | 1e-4 |
| Precision | BFloat16 |
| Max Length | 512 |
| Final Training Loss | ~0.712 |
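
The settings above imply a few derived quantities that are handy when reproducing the run. The sketch below works them out under three assumptions that the table does not state: a single GPU, five roughly equal chunks each seen once, and the standard LoRA scaling of alpha / rank (not rsLoRA).

```python
import math

# Values copied from the Training Details table.
total_samples = 95_362
num_chunks = 5
batch_size = 2
grad_accum = 8
lora_rank = 16
lora_alpha = 32

# Assumption: a single GPU, so there is no data-parallel multiplier.
effective_batch = batch_size * grad_accum

# Assumption: chunks are roughly equal in size and each is seen once.
samples_per_chunk = math.ceil(total_samples / num_chunks)
steps_per_chunk = math.ceil(samples_per_chunk / effective_batch)

# Assumption: standard LoRA scaling, where the update is multiplied by alpha / rank.
lora_scale = lora_alpha / lora_rank

print(effective_batch, samples_per_chunk, steps_per_chunk, lora_scale)
# prints: 16 19073 1193 2.0
```

If the actual run used more devices, uneven chunks, or multiple epochs per chunk, the step counts change accordingly.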

Repair fine-tuning

After main training, targeted repair fine-tuning was applied to fix:

  • Missing import math / math.sqrt issues
  • Incorrect handling of edge cases (negative numbers, 0, 1)

The latest patched adapter is shohuu/pyroton-primefix-v3.

Evaluation

Tested against an execution-based harness on is_prime() with inputs: -1, 0, 1, 2, 3, 4, 6, 9, 17, 49.

  • Greedy decoding: 5/5 passing ✅
  • Sampled decoding: improved but less stable
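
An execution-based check of this kind can be sketched as follows. This is an illustration and not the project's actual harness: the function name check_is_prime and the EXPECTED table are hypothetical, and the expected values are simply the mathematical primality of the inputs listed above.

```python
# Expected results for the inputs listed in the Evaluation section.
EXPECTED = {-1: False, 0: False, 1: False, 2: True, 3: True,
            4: False, 6: False, 9: False, 17: True, 49: False}


def check_is_prime(source: str) -> bool:
    """Execute generated source and compare is_prime() against EXPECTED.

    Warning: exec() runs arbitrary code, so sandbox untrusted model output.
    """
    namespace: dict = {}
    try:
        exec(source, namespace)
        func = namespace["is_prime"]
        return all(func(n) == want for n, want in EXPECTED.items())
    except Exception:
        # Syntax errors, missing imports, or a missing function count as failures.
        return False


# The sample output shown in the Example Output section.
candidate = '''
import math

def is_prime(n):
    if n <= 1:
        return False
    for i in range(2, int(math.sqrt(n)) + 1):
        if n % i == 0:
            return False
    return True
'''

print(check_is_prime(candidate))  # prints: True
```

A generation that omits import math raises NameError on the first call and is counted as a failure, which is the class of bug the repair pass targeted.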

GGUF / Mobile Deployment

Pyroton is available as a GGUF file for local deployment:

| File | Quantization | Size |
| --- | --- | --- |
| pyroton-q4.gguf | Q4_K_M | ~397 MB |

Compatible apps:

  • PocketPal AI (Android/iOS): search for shohuu/Pyroton
  • LM Studio (Desktop)
  • Ollama (Desktop)

Known Limitations

  • 0.5B model: output quality can degrade on harder tasks or complex reasoning
  • Greedy decoding is more reliable than sampling for correctness
  • Most thoroughly tested on short Python coding tasks
  • Broader evaluation across more libraries still needed

GitHub

github.com/TunasTuna/pyroton


Requirements

transformers
datasets
trl
peft
bitsandbytes
accelerate
torchao

License

Apache 2.0. See LICENSE for details.

The base model (Qwen2.5-Coder-0.5B-Instruct) is also licensed under Apache 2.0. Attribution: Alibaba Cloud / Qwen Team.


Acknowledgements