---
base_model: Qwen/Qwen3-4B-Thinking-2507
tags:
- ellora
- lora
- code-execution
- execution-tracing
- world-model
- cwm
- grpo
- thinking
- code-understanding
- peft
- qwen
library_name: peft
license: apache-2.0
pipeline_tag: text-generation
inference: true
model_type: qwen3
datasets:
- codelion/execution-world-model-dataset
---

# codelion/Qwen3-4B-execution-world-model-lora

## Execution-Aware World Model LoRA

This LoRA adapter adds **execution awareness** to Qwen/Qwen3-4B-Thinking-2507. Inspired by Meta's CWM (Code World Model) research, it enables the model to predict and understand program execution step by step.

## Key Features

- **Step-by-Step Execution Prediction**: Predicts variable states at each line
- **Dynamic World Model**: Understands how code behaves at runtime
- **Execution Tracing**: Generates detailed execution traces with variable states
- **Debugging Support**: Can identify and explain execution behavior
- **GRPO-Trained**: Uses preference learning with real execution feedback

## Performance Metrics

- **Base Model**: Qwen/Qwen3-4B-Thinking-2507
- **Training Method**: GRPO (Group Relative Policy Optimization) with real execution traces
- **LoRA Rank**: 64
- **LoRA Alpha**: 128
- **Training Samples**: 298
- **Evaluation Samples**: 323
- **Execution Prediction Accuracy**: 20.0%
- **Mean State Accuracy**: 33.3%

## Usage

````python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load base model
model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen3-4B-Thinking-2507",
    torch_dtype="auto",
    device_map="auto",
    trust_remote_code=True
)
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3-4B-Thinking-2507")

# Load execution world model LoRA
model = PeftModel.from_pretrained(model, "codelion/Qwen3-4B-execution-world-model-lora")

# Analyze code execution
prompt = """Analyze this code and predict its execution trace:

```python
x = 10
y = x * 2
z = x + y
```

Show variable states at each line."""

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512, do_sample=True, temperature=0.1)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
````

## Example Output

```
<execution_trace>
Line 1: State: {x=10}
Line 2: State: {x=10, y=20}
Line 3: State: {x=10, y=20, z=30}
</execution_trace>
```
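
Output in this format is straightforward to post-process. Below is a minimal sketch of parsing a trace back into per-line variable-state dictionaries; the helper name and regular expression are illustrative (not part of the adapter's output contract), and integer-valued variables are assumed:

```python
import re

def parse_execution_trace(text):
    """Parse '<execution_trace>' output into {line_no: {var: value}} dicts."""
    states = {}
    for line_no, body in re.findall(r"Line (\d+): State: \{([^}]*)\}", text):
        state = {}
        for pair in body.split(","):
            name, _, value = pair.partition("=")
            state[name.strip()] = int(value.strip())  # assumes integer values
        states[int(line_no)] = state
    return states

trace = """<execution_trace>
Line 1: State: {x=10}
Line 2: State: {x=10, y=20}
Line 3: State: {x=10, y=20, z=30}
</execution_trace>"""
print(parse_execution_trace(trace)[3])  # {'x': 10, 'y': 20, 'z': 30}
```

Parsed states in this form can then be compared line-by-line against a ground-truth trace.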

## Training Details

- **Method**: GRPO (Group Relative Policy Optimization)
- **Data**: Self-generated code with real execution traces
- **Epochs**: 3
- **Reward**: Gradual scoring (0.0-1.0) based on execution accuracy
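
A gradual reward of this kind can be pictured as the mean per-line fraction of correctly predicted variables. The function below is an illustrative sketch of that idea, not the actual training code:

```python
def state_accuracy_reward(predicted, ground_truth):
    """Score a predicted trace against ground truth on a 0.0-1.0 scale.

    Both arguments map line numbers to {variable: value} dicts; the reward
    is the mean per-line fraction of correctly predicted variables.
    """
    if not ground_truth:
        return 0.0
    line_scores = []
    for line_no, true_state in ground_truth.items():
        pred_state = predicted.get(line_no, {})
        correct = sum(1 for var, val in true_state.items()
                      if pred_state.get(var) == val)
        line_scores.append(correct / len(true_state))
    return sum(line_scores) / len(line_scores)

truth = {1: {"x": 10}, 2: {"x": 10, "y": 20}}
pred  = {1: {"x": 10}, 2: {"x": 10, "y": 99}}
print(state_accuracy_reward(pred, truth))  # 0.75
```

A smooth reward like this gives GRPO a learning signal even when a prediction is only partially correct, unlike a binary exact-match reward.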

## Dataset

[codelion/execution-world-model-dataset](https://huggingface.co/datasets/codelion/execution-world-model-dataset)

- Python code (3-20 lines)
- Real execution traces via `sys.settrace()`
- Ground truth variable states
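
For reference, ground-truth states of this kind can be captured with the standard-library tracing hook mentioned above. A minimal sketch (the helper name is illustrative; note that `line` events fire *before* a line executes, so the `return` event supplies the final state):

```python
import sys

def record_trace(code_str):
    """Execute code_str, snapshotting variable states via sys.settrace."""
    states = []

    def tracer(frame, event, arg):
        # Only trace the frame running our compiled snippet
        if frame.f_code.co_filename != "<trace>":
            return tracer
        if event in ("line", "return"):
            snap = {k: v for k, v in frame.f_locals.items()
                    if not k.startswith("__")}
            states.append((frame.f_lineno, snap))
        return tracer

    compiled = compile(code_str, "<trace>", "exec")
    sys.settrace(tracer)
    try:
        exec(compiled, {})
    finally:
        sys.settrace(None)  # always remove the hook
    return states

trace = record_trace("x = 10\ny = x * 2\nz = x + y")
print(trace[-1])  # (3, {'x': 10, 'y': 20, 'z': 30})
```

Each `line` event records the state just before that line runs, and the final `return` event records the state after the last line completes.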

## Related

- **Dataset**: [codelion/execution-world-model-dataset](https://huggingface.co/datasets/codelion/execution-world-model-dataset)
- **Base Model**: [Qwen/Qwen3-4B-Thinking-2507](https://huggingface.co/Qwen/Qwen3-4B-Thinking-2507)
- **Project**: [Ellora Recipes](https://github.com/codelion/ellora)

---

*Part of the [Ellora project](https://github.com/codelion/ellora) - standardized recipes for enhancing LLM capabilities.*