---
license: mit
datasets:
  - iamtarun/python_code_instructions_18k_alpaca
base_model:
  - Qwen/Qwen2.5-0.5B-Instruct
tags:
  - code
  - python
  - text-generation
  - coding
  - yzy
  - code-generation
---

# yzy-python-0.5b 🐍

Lightweight Python-focused language model (0.5B parameters) fine-tuned for code generation and instruction following.

Optimized for:

- Python code generation
- scripting help
- small coding copilots
- local inference
- experimentation
- hackathons

- Base model: Qwen/Qwen2.5-0.5B-Instruct
- Fine-tuning method: QLoRA (4-bit)
- Dataset style: Alpaca-format Python instructions

---

# Demo

## Transformers usage

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "SamirXR/yzy-python-0.5b"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto"
)

# The model was fine-tuned on Alpaca-style prompts, so wrap the
# instruction in the same template used during training.
prompt = "### Instruction:\nWrite a Python function to reverse a string\n\n### Response:\n"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

outputs = model.generate(
    **inputs,
    max_new_tokens=200
)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

---

## 4-bit inference (recommended)

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "SamirXR/yzy-python-0.5b"

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
    bnb_4bit_use_double_quant=True,
)

model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",
    trust_remote_code=True
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
tokenizer.pad_token = tokenizer.eos_token

prompt = "### Instruction:\nWrite a Python function for fibonacci numbers\n\n### Response:\n"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

outputs = model.generate(
    **inputs,
    max_new_tokens=200,
    do_sample=True,   # required for temperature/top_p to take effect
    temperature=0.7,
    top_p=0.9
)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

---

## Gradio Chatbot Demo

```python
import torch
import gradio as gr
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

MODEL_NAME = "SamirXR/yzy-python-0.5b"

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
    bnb_4bit_use_double_quant=True,
)

model = AutoModelForCausalLM.from_pretrained(
    MODEL_NAME,
    quantization_config=bnb_config,
    device_map="auto",
    trust_remote_code=True
)

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
tokenizer.pad_token = tokenizer.eos_token

def generate_code(instruction, history):
    # Build the Alpaca-style prompt used during fine-tuning.
    prompt = f"### Instruction:\n{instruction}\n\n### Response:\n"
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

    with torch.no_grad():
        outputs = model.generate(
            **inputs,
            max_new_tokens=256,
            do_sample=True,
            temperature=0.7,
            top_p=0.9,
            repetition_penalty=1.1,
            pad_token_id=tokenizer.eos_token_id
        )

    response = tokenizer.decode(outputs[0], skip_special_tokens=True)
    # Keep only the text after the response marker.
    response = response.split("### Response:\n")[-1].strip()
    return response

demo = gr.ChatInterface(
    fn=generate_code,
    title="yzy-python-0.5b Chatbot",
    description="Python coding assistant (QLoRA fine-tuned Qwen2.5-0.5B)",
    examples=[
        "Write a function to calculate fibonacci numbers",
        "Create a Python class for a linked list",
        "Reverse a string in Python"
    ],
)

demo.launch(share=True)
```

---

# Training Details

- Base model: Qwen/Qwen2.5-0.5B-Instruct
- Dataset: iamtarun/python_code_instructions_18k_alpaca

Format used during training:

```
### Instruction:
{instruction}

### Response:
{response}
```

Training method: QLoRA (4-bit NF4 quantization)

Key parameters:

- LoRA rank: 8
- LoRA alpha: 16
- LoRA dropout: 0.05
- epochs: 2
- learning rate: 2e-4
- context length: 512
- optimizer: paged_adamw_8bit
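For reference, the hyperparameters above correspond to a QLoRA setup along the lines of the following sketch. This is a minimal reconstruction using `transformers` and `peft`, not the exact training script; the `output_dir`, precision, and other unstated details are assumptions.

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig, TrainingArguments
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

# 4-bit NF4 quantization of the base model, as documented above.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
    bnb_4bit_use_double_quant=True,
)

model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2.5-0.5B-Instruct",
    quantization_config=bnb_config,
    device_map="auto",
)
model = prepare_model_for_kbit_training(model)

# LoRA adapter matching the listed hyperparameters.
lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)

# Epochs, learning rate, and optimizer as listed; sequences are
# truncated to the 512-token context length at tokenization time.
training_args = TrainingArguments(
    output_dir="yzy-python-0.5b",  # illustrative
    num_train_epochs=2,
    learning_rate=2e-4,
    optim="paged_adamw_8bit",
    fp16=True,  # assumption
)
```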
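Each dataset row, in turn, is rendered into the Alpaca template above before tokenization. A minimal sketch of that step, assuming the dataset's `instruction` and `output` columns:

```python
def format_example(row):
    """Render one dataset row into the training prompt template."""
    return (
        "### Instruction:\n"
        f"{row['instruction']}\n\n"
        "### Response:\n"
        f"{row['output']}"
    )

print(format_example({
    "instruction": "Write a Python function to reverse a string",
    "output": "def reverse_string(s):\n    return s[::-1]",
}))
```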
---

# Citation

If you use this model, please cite:

- Base model: Qwen2.5 Technical Report (Qwen Team, 2024)
- Dataset: python_code_instructions_18k_alpaca (iamtarun)
- Model: yzy-python-0.5b (SamirXR)

---

# Notes

This is a small model intended for experimentation and lightweight coding assistance. It will not match the output quality of larger models, but it offers fast local inference with minimal resources.