FORGE-Qwen-Final

πŸš€ A lightweight coding model adapter fine-tuned for FORGE-v4, an adversarial self-improvement environment for robust code generation.


πŸ”₯ Model Overview

This repository contains a LoRA / PEFT adapter trained on top of:

Base Model: Qwen/Qwen2.5-Coder-1.5B-Instruct

The adapter was trained on a Google Colab GPU using:

  • Unsloth
  • Hugging Face TRL
  • Supervised Fine-Tuning (SFT)
  • FORGE robustness tasks

πŸ›‘οΈ Purpose

Most coding models solve normal tasks but fail on edge cases.

This adapter improves reliability for:

  • Negative numbers
  • Duplicate values
  • Empty arrays
  • Safe sorting tasks
  • Clean Python function formatting
  • Instruction following

🧠 About FORGE-v4

FORGE is an adversarial benchmark + training system where:

  • A defender model writes candidate code
  • A breaker agent generates hidden edge-case tests
  • A reward system scores robustness against those tests
  • Training then targets the discovered failure cases
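
The loop above can be sketched in miniature. All names below are hypothetical stand-ins for illustration, not the actual FORGE-v4 components:

```python
# Minimal sketch of the FORGE adversarial loop (names are illustrative).

def defender(arr):
    # Robust defender: sorted() handles negatives, duplicates, and [].
    return sorted(arr)

def weak_defender(arr):
    # Fragile defender: set() silently drops duplicate values.
    return sorted(set(arr))

def breaker():
    # The breaker proposes hidden edge cases: empty input, negatives, duplicates.
    return [[], [-3, -1, -2], [2, 2, 1]]

def reward(solution, cases):
    # Score robustness as the fraction of edge cases solved without error.
    passed = 0
    for case in cases:
        try:
            if solution(case) == sorted(case):
                passed += 1
        except Exception:
            pass
    return passed / len(cases)

print(reward(weak_defender, breaker()))  # the duplicates case fails
print(reward(defender, breaker()))       # all cases pass
```

The reward signal makes failures on hidden cases measurable, so training can focus on exactly those inputs.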

Theme Alignment:

  • Self-Improvement
  • Multi-Agent AI
  • Verifiable RL Environments

βš™οΈ Training Details

  • Hardware: Google Colab Tesla T4 GPU
  • Method: 4-bit QLoRA fine-tuning
  • Adapter Type: LoRA
  • Training Stack: Unsloth + TRL
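
The training stack above might be wired together roughly like this. This is a minimal sketch: the hyperparameter values and the `forge_dataset` variable are illustrative assumptions, not the actual settings or data used for this adapter:

```python
# Sketch of a 4-bit QLoRA SFT run with Unsloth + TRL (values are illustrative).
from unsloth import FastLanguageModel
from trl import SFTConfig, SFTTrainer

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="Qwen/Qwen2.5-Coder-1.5B-Instruct",
    load_in_4bit=True,          # 4-bit QLoRA quantization
)
model = FastLanguageModel.get_peft_model(model, r=16, lora_alpha=16)

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=forge_dataset,  # placeholder: SFT pairs from FORGE robustness tasks
    args=SFTConfig(per_device_train_batch_size=2, max_steps=60),
)
trainer.train()
```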

Status:

  • Training completed successfully
  • Real adapter weights exported
  • Ready for public deployment

πŸ“¦ Files Included

  • adapter_model.safetensors
  • adapter_config.json
  • tokenizer files

πŸš€ How to Use

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel

base = "Qwen/Qwen2.5-Coder-1.5B-Instruct"
adapter = "sanjay7676/forge-qwen-final"

tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(base)
model = PeftModel.from_pretrained(model, adapter)
```

πŸ’» Example Prompt

```text
Return only Python code.

Write:

def solve(arr):

Sort integers, preserving duplicates and negatives.
```
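
An ideal completion for this prompt is short. A minimal reference solution (illustrative, not an actual model output) looks like:

```python
def solve(arr):
    # sorted() returns a new list, is stable, and correctly handles
    # negatives, duplicate values, and empty input.
    return sorted(arr)

print(solve([3, -1, 3, 0]))  # [-1, 0, 3, 3]
```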

🌐 Used In

FORGE-v4 live environment / Hugging Face Space demo.


πŸ“ˆ Future Work

  • GRPO reinforcement tuning
  • More adversarial coding tasks
  • Runtime efficiency rewards
  • Multi-language coding benchmarks

πŸ‘€ Author

Sanjay7676

Built for the FORGE-v4 hackathon submission.
