---
base_model: openai-community/gpt-oss-20b
library_name: peft
license: mit
tags:
- gpt-oss
- system-design
- software-architecture
- lora
- reasoning
datasets:
- custom
pipeline_tag: text-generation
---
# architectLLM — System Design LoRA for GPT-OSS 20B
A LoRA fine-tune of [GPT-OSS 20B](https://huggingface.co/openai-community/gpt-oss-20b) specialized in **system design and software architecture reasoning**.
The model retains GPT-OSS's full general capabilities while significantly improving its ability to reason through distributed systems, infrastructure trade-offs, and back-of-envelope capacity planning.
## What It Does
- Designs distributed systems from first principles
- Reasons through infrastructure trade-offs (latency vs consistency, throughput vs cost)
- Provides back-of-envelope calculations for capacity planning
- Leverages GPT-OSS's native **analysis channel** for extended chain-of-thought reasoning
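As an illustration of the kind of back-of-envelope reasoning the model is trained to produce, here is a hypothetical capacity estimate for a per-client token-bucket rate limiter. All figures below are illustrative assumptions, not model output:

```python
# Illustrative back-of-envelope estimate: memory needed for a token-bucket
# rate limiter tracking per-client state at an API gateway.
# Every number here is an assumption chosen for the example.

clients = 10_000_000          # assumed distinct API clients being tracked
bytes_per_bucket = 40         # key hash + token count + last-refill timestamp
memory_gb = clients * bytes_per_bucket / 1e9

rps = 500_000                 # assumed gateway throughput
seconds_per_day = 86_400
requests_per_day = rps * seconds_per_day

print(f"rate-limiter state: ~{memory_gb:.1f} GB")            # ~0.4 GB
print(f"requests per day:   ~{requests_per_day / 1e9:.1f}B")  # ~43.2B
```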
## Training Details
| Parameter | Value |
|---|---|
| Base model | `openai-community/gpt-oss-20b` |
| Method | LoRA (rank 64, alpha 64) |
| Target modules | `q_proj`, `k_proj`, `v_proj`, `o_proj`, `gate_proj`, `up_proj`, `down_proj` |
| Training examples | 1,787 |
| Epochs | 1 |
| Learning rate | 2e-4 (cosine w/ min LR) |
| Precision | bfloat16 |
| Framework | Hugging Face Transformers + PEFT + TRL |
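The adapter hyperparameters in the table correspond roughly to the following PEFT configuration (a sketch for reference; the actual training script is not published in this repository):

```python
from peft import LoraConfig

# Sketch of the adapter setup described in the table above.
lora_config = LoraConfig(
    r=64,                    # LoRA rank
    lora_alpha=64,           # scaling factor (alpha)
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
    task_type="CAUSAL_LM",
)
```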
## Usage
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load the base model, then attach the LoRA adapter.
base = AutoModelForCausalLM.from_pretrained(
    "openai-community/gpt-oss-20b",
    torch_dtype="auto",
    device_map="auto",
)
model = PeftModel.from_pretrained(base, "bisratz/architectLLM-lora")
tokenizer = AutoTokenizer.from_pretrained("bisratz/architectLLM-lora")

# GPT-OSS uses the harmony chat format: the developer role carries the
# task instructions, while the system turn is left empty.
messages = [
    {"role": "system", "content": ""},
    {"role": "developer", "content": (
        "You are an expert system design architect who reasons from first principles. "
        "Identify fundamental infrastructure primitives, analyze constraints, "
        "explain WHY each choice fits, discuss trade-offs, and include "
        "back-of-envelope calculations."
    )},
    {"role": "user", "content": "Design a rate limiting system for an API gateway handling 500K rps across 200 microservices."},
]

# reasoning_effort="high" requests extended analysis-channel reasoning.
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
    return_dict=True,
    reasoning_effort="high",
).to(model.device)

output = model.generate(**inputs, max_new_tokens=4096, temperature=0.7, do_sample=True)

# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```