---
base_model: openai/gpt-oss-20b
library_name: peft
license: mit
tags:
- gpt-oss
- system-design
- software-architecture
- lora
- reasoning
datasets:
- custom
pipeline_tag: text-generation
---

# architectLLM — System Design LoRA for GPT-OSS 20B

A LoRA fine-tune of [GPT-OSS 20B](https://huggingface.co/openai/gpt-oss-20b) specialized in **system design and software architecture reasoning**.

The model retains GPT-OSS's full general capabilities while significantly improving its ability to reason through distributed systems, infrastructure trade-offs, and back-of-envelope capacity planning.

## What It Does

- Designs distributed systems from first principles
- Reasons through infrastructure trade-offs (latency vs. consistency, throughput vs. cost)
- Provides back-of-envelope calculations for capacity planning
- Leverages GPT-OSS's native **analysis channel** for extended chain-of-thought reasoning

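To illustrate the kind of back-of-envelope arithmetic involved, here is a minimal sketch for the rate-limiter scenario used in the Usage section below. Every number is an illustrative assumption, not model output or a benchmark:

```python
# Hypothetical capacity estimate for an API-gateway rate limiter.
# All inputs below are illustrative assumptions.
requests_per_sec = 500_000        # gateway load from the example prompt
bytes_per_counter = 64            # rough guess: key + sliding-window state
active_keys = 10_000_000          # assumed distinct (client, route) pairs

daily_requests = requests_per_sec * 86_400            # 86,400 seconds/day
counter_memory_gb = active_keys * bytes_per_counter / 1e9

print(f"{daily_requests / 1e9:.1f}B requests/day")     # 43.2B requests/day
print(f"{counter_memory_gb:.2f} GB of counter state")  # 0.64 GB
```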
## Training Details

| Parameter | Value |
|---|---|
| Base model | `openai/gpt-oss-20b` |
| Method | LoRA (rank 64, alpha 64) |
| Target modules | `q_proj`, `k_proj`, `v_proj`, `o_proj`, `gate_proj`, `up_proj`, `down_proj` |
| Training examples | 1,787 |
| Epochs | 1 |
| Learning rate | 2e-4 (cosine schedule with minimum LR) |
| Precision | bfloat16 |
| Framework | Hugging Face Transformers + PEFT + TRL |

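The adapter hyperparameters in the table map directly onto a PEFT `LoraConfig`; a minimal sketch, noting that the dropout value is an assumption not documented above:

```python
from peft import LoraConfig

# Adapter hyperparameters from the table above; lora_dropout is an
# illustrative default and was not specified in the training details.
lora_config = LoraConfig(
    r=64,
    lora_alpha=64,
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
```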
## Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained(
    "openai/gpt-oss-20b",
    torch_dtype="auto",
    device_map="auto",
)
model = PeftModel.from_pretrained(base, "bisratz/architectLLM-lora")
tokenizer = AutoTokenizer.from_pretrained("bisratz/architectLLM-lora")

messages = [
    {"role": "system", "content": ""},
    {"role": "developer", "content": (
        "You are an expert system design architect who reasons from first principles. "
        "Identify fundamental infrastructure primitives, analyze constraints, "
        "explain WHY each choice fits, discuss trade-offs, and include "
        "back-of-envelope calculations."
    )},
    {"role": "user", "content": "Design a rate limiting system for an API gateway handling 500K rps across 200 microservices."},
]

inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
    return_dict=True,
    reasoning_effort="high",
).to(model.device)

output = model.generate(**inputs, max_new_tokens=4096, temperature=0.7, do_sample=True)
print(tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```