gss1147 committed
Commit 11a7ede · verified · 1 Parent(s): ee18308

---
language:
- en
library_name: transformers
pipeline_tag: text-generation
tags:
- gpt2
- causal-lm
- text-generation
- code
- coding
- reasoning
- instruct
- lightweight
- safetensors
license: other
base_model: openai-community/gpt2-medium
datasets:
- WithinUsAI/GPT-2-to-GPT-5-5k
- TeichAI/gpt-5.1-high-reasoning-1000x
- TeichAI/gpt-5.1-codex-max-1000x
model-index:
- name: WithinUsAI/GPT2.5.2-high-reasoning-codex-0.4B
  results: []
---

# Model Card Template (WithinUsAI Standard)

**Top metadata:** YAML above
**Sections:** Overview → Intended Use → How to Use → Training Data → Finetuning → Evaluation → Limitations → License/Thanks → Citation

---

# WithinUsAI/GPT2.5.2-high-reasoning-codex-0.4B

<p align="center">
  <b>GPT-2 Medium enhanced toward “GPT-5.2-style” reasoning + codex behaviors.</b><br/>
  Small footprint. Serious coding focus. ⚡🧠
</p>

## Overview

**WithinUsAI/GPT2.5.2-high-reasoning-codex-0.4B** is a GPT-2-family causal language model (≈0.4B class) built from **`openai-community/gpt2-medium`** and fine-tuned by **WithIn Us AI** to strengthen:

- structured reasoning
- instruction following
- code generation & refactoring reliability

The name **“GPT2.5.2”** is a WithIn Us AI version marker:

- **GPT(2)** = GPT-2 Medium base
- **(5.2)** = target behavior style (reasoning + codex competence)
- **(2.5.2)** = the enhanced line produced by WithIn Us AI fine-tuning + methodology

**Architecture:** gpt2
**Model size:** 0.4B params
**Tensor type:** F32 (as hosted)

---

## What it’s good at ✨

- Writing practical code with clear structure
- Debugging: root cause → fix → corrected code
- Refactoring with invariants + complexity notes
- Algorithmic reasoning in compact, teachable steps

---

## Intended use

### Recommended ✅

- Coding assistant (Python-first; other languages okay)
- Debugging and patch suggestions
- Refactors and performance cleanups
- Reasoned technical answers with steps/constraints

### Not recommended 🚫

- High-stakes decisions (medical/legal/financial) without expert review
- Safety-critical systems without strict validation/testing

---

## How to use (Transformers)

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_id = "WithinUsAI/GPT2.5.2-high-reasoning-codex-0.4B"

tok = AutoTokenizer.from_pretrained(model_id, use_fast=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16 if torch.cuda.is_available() else torch.float32,
    device_map="auto"
)

prompt = (
    "You are a senior engineer.\n"
    "Task: Implement a robust JSONL reader in Python.\n"
    "First list edge cases, then write the implementation with comments.\n\n"
    "Answer:\n"
)

inputs = tok(prompt, return_tensors="pt").to(model.device)
out = model.generate(
    **inputs,
    max_new_tokens=256,
    do_sample=True,
    temperature=0.7,
    top_p=0.95
)
print(tok.decode(out[0], skip_special_tokens=True))
```
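The structured prompt in the snippet above (role line, task line, “edge cases first” instruction, then `Answer:`) can be factored into a small helper so the same scaffold is reused across tasks. A minimal sketch; `build_prompt` and its argument names are our own illustration, not part of the model or the `transformers` API:

```python
def build_prompt(role: str, task: str, instructions: str) -> str:
    """Assemble the plain-text prompt scaffold used in the example above.

    The section order (role, task, instructions, "Answer:") mirrors the
    README's example; nothing about it is required by the model itself.
    """
    return (
        f"You are a {role}.\n"
        f"Task: {task}\n"
        f"{instructions}\n\n"
        "Answer:\n"
    )

# Reproduces the prompt string from the snippet above.
prompt = build_prompt(
    "senior engineer",
    "Implement a robust JSONL reader in Python.",
    "First list edge cases, then write the implementation with comments.",
)
```

Because GPT-2-class models have no chat template, all of this structure lives in the plain-text prompt itself, so the helper's output can be passed straight to the tokenizer.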

Files changed (1)
  1. README.md +0 -282
README.md CHANGED
@@ -1,282 +0,0 @@
- # 1) Model Card (Transformers / Safetensors)
-
- **Repo:** `WithinUsAI/GPT2.5.2-high-reasoning-codex-0.4B` ([Hugging Face][1])
-
- ````markdown
- ---
- language:
- - en
- library_name: transformers
- pipeline_tag: text-generation
- tags:
- - gpt2
- - causal-lm
- - code
- - coding
- - reasoning
- - lightweight
- - transformers
- - safetensors
- license: mit
- base_model: openai-community/gpt2-medium
- model-index:
- - name: WithinUsAI/GPT2.5.2-high-reasoning-codex-0.4B
-   results: []
- ---
-
- # WithinUsAI/GPT2.5.2-high-reasoning-codex-0.4B
-
- <p align="center">
-   <b>High-reasoning, code-first GPT-style model in a tiny footprint.</b><br/>
-   Fast. Practical. Built to ship working code. ⚡🧠
- </p>
-
- ## Overview
- **WithinUsAI/GPT2.5.2-high-reasoning-codex-0.4B** is a compact GPT-2–family model tuned for **coding assistance** and **structured, stepwise problem solving** while staying lightweight (~0.4B parameter class).
- It is intended for generating code, debugging, refactoring, and technical reasoning with clear structure.
-
- **Model size:** ~0.4B params
- **Tensor type:** F32 (as hosted)
- **Base model:** `openai-community/gpt2-medium`
-
- ---
-
- ## Superpowers ✨
- - **Code-first thinking:** favors runnable implementations over vague prose
- - **Debug → fix flow:** explains the cause, proposes a patch, then writes the corrected code
- - **Refactor sense:** improves clarity/perf while keeping behavior stable
- - **Structured reasoning:** likes constraints, steps, and edge cases
- - **Lightweight:** easy to test and iterate quickly
-
- ---
-
- ## Intended use
- ### Recommended ✅
- - Code generation & completion (Python-first, other languages ok)
- - Debugging help (traceback → root cause → patch)
- - Refactoring (readability/performance/modularity)
- - Technical Q&A with step-by-step reasoning
- - Lightweight local inference experiments
-
- ### Not recommended 🚫
- - High-stakes decisions (medical/legal/financial) without expert review
- - Safety-critical systems without strong validation & monitoring
-
- ---
-
- ## Quickstart (Transformers)
- ```python
- from transformers import AutoTokenizer, AutoModelForCausalLM
- import torch
-
- model_id = "WithinUsAI/GPT2.5.2-high-reasoning-codex-0.4B"
-
- tokenizer = AutoTokenizer.from_pretrained(model_id, use_fast=True)
- model = AutoModelForCausalLM.from_pretrained(
-     model_id,
-     torch_dtype=torch.float16 if torch.cuda.is_available() else torch.float32,
-     device_map="auto"
- )
-
- prompt = (
-     "You are a senior software engineer.\n"
-     "Task: Write a clean Python function that parses a CSV into a list of dicts.\n"
-     "First: list edge cases.\n"
-     "Then: provide the implementation with comments.\n\n"
-     "Answer:\n"
- )
-
- inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
- out = model.generate(
-     **inputs,
-     max_new_tokens=256,
-     do_sample=True,
-     temperature=0.7,
-     top_p=0.95
- )
-
- print(tokenizer.decode(out[0], skip_special_tokens=True))
- ````
-
- ---
-
- ## Prompting tips (to unlock “high reasoning”)
-
- * **Plan → implement:** “Give a 3-bullet plan, then write the code.”
- * **Edge cases first:** “List edge cases, then implement.”
- * **Refactor constraints:** “Keep behavior identical. Improve readability and performance.”
- * **Debug format:** “Explain root cause → propose fix → provide patch.”
-
- ---
-
- ## Training data
-
- This model is listed as trained with datasets including:
-
- * `TeichAI/gpt-5.1-codex-max-1000x`
- * `TeichAI/gpt-5.1-high-reasoning-1000x`
- * `WithinUsAI/GPT-2-to-GPT-5-5k` ([Hugging Face][1])
-
- (If you have more precise sourcing notes, add them here for maximum transparency.)
-
- ---
-
- ## Training procedure
-
- * **Objective:** next-token prediction with instruction-leaning examples
- * **Style:** supervised fine-tuning (SFT) targeted at code + reasoning behaviors
-
- ---
-
- ## Evaluation (add results when available)
-
- | Category    | Benchmark            |   Metric | Score |
- | ----------- | -------------------- | -------: | ----: |
- | Code        | HumanEval            |   pass@1 |   TBD |
- | Code        | MBPP                 |   pass@1 |   TBD |
- | Reasoning   | Custom reasoning set | accuracy |   TBD |
- | Reliability | Custom bugfix set    | fix rate |   TBD |
-
- ---
-
- ## Known limitations
-
- * Small models can hallucinate API details or edge cases
- * Code may look plausible but be incorrect without tests
- * Long multi-hop reasoning is more fragile than larger models
-
- **Best practice:** run unit tests, linting, and validate outputs before production use.
-
- ---
-
- ## Ethical considerations
-
- * May reflect biases present in training data
- * Do not use to generate harmful instructions or sensitive personal data
- * Human review is recommended for real deployments
-
- ---
-
- ## Citation
-
- ```bibtex
- @misc{withinusai_gpt252_high_reasoning_codex_04b,
-   title  = {WithinUsAI/GPT2.5.2-high-reasoning-codex-0.4B},
-   author = {WithinUsAI},
-   year   = {2026},
-   url    = {https://huggingface.co/WithinUsAI/GPT2.5.2-high-reasoning-codex-0.4B}
- }
- ```
-
- ---
-
- ## Changelog
-
- * **v2.5.2:** high-reasoning + codex-style tuning pass
-
- ````
-
- ---
-
- # 2) Model Card (GGUF / llama.cpp)
- **Repo:** `WithinUsAI/GPT2.5.2-high-reasoning-codex-0.4B-GGUF`
-
- ```markdown
- ---
- language:
- - en
- pipeline_tag: text-generation
- tags:
- - gguf
- - llama.cpp
- - gpt2
- - code
- - reasoning
- - quantized
- license: mit
- model-index:
- - name: WithinUsAI/GPT2.5.2-high-reasoning-codex-0.4B-GGUF
-   results: []
- ---
-
- # WithinUsAI/GPT2.5.2-high-reasoning-codex-0.4B-GGUF
-
- <p align="center">
-   <b>GGUF builds of the 0.4B “high-reasoning codex” GPT-2 model.</b><br/>
-   Pick a quant, run locally, ship faster. ⚡🧠
- </p>
-
- ## What this is
- This repository provides **GGUF** quantizations of:
- - `WithinUsAI/GPT2.5.2-high-reasoning-codex-0.4B` (Transformers version)
-
- **Architecture:** gpt2
- **Model size:** ~0.4B params
-
- ---
-
- ## Available quantizations
- | Quant | Bits | Approx size |
- |---|---:|---:|
- | **Q4_K_M** | 4-bit | **242 MB** |
- | **Q5_K_M** | 5-bit | **274 MB** |
- | **F16** | 16-bit | **714 MB** |
-
- ### Which should I choose?
- - **Q4_K_M:** best speed + smallest RAM footprint (great default)
- - **Q5_K_M:** a little more quality, still very small
- - **F16:** highest fidelity, largest file
-
- ---
-
- ## Intended use
- - Local/offline coding assistant
- - Small-footprint reasoning & structured responses
- - Debugging/refactoring help on modest hardware
-
- ---
-
- ## Prompting tips (same “high reasoning” unlocks)
- - “List edge cases first, then implement.”
- - “Explain root cause, then provide a patch.”
- - “Give a 3-step plan, then code.”
-
- ---
-
- ## Example usage (llama.cpp)
- > Replace `MODEL.gguf` with the quant file you download.
-
- ```bash
- ./llama-cli -m MODEL.gguf -p "You are a senior engineer. List edge cases, then write the code.\nTask: Implement a robust JSONL reader in Python.\n\nAnswer:\n" -n 256
- ````
-
- ---
-
- ## Limitations
-
- * Quantized models may lose some accuracy vs F16/F32
- * Small models can hallucinate API details; validate with tests
- * Best results come from structured prompts
-
- ---
-
- ## Citation
-
- ```bibtex
- @misc{withinusai_gpt252_high_reasoning_codex_04b_gguf,
-   title  = {WithinUsAI/GPT2.5.2-high-reasoning-codex-0.4B-GGUF},
-   author = {WithinUsAI},
-   year   = {2026},
-   url    = {https://huggingface.co/WithinUsAI/GPT2.5.2-high-reasoning-codex-0.4B-GGUF}
- }
- ```
-
- [1]: https://huggingface.co/WithinUsAI/GPT2.5.2-high-reasoning-codex-0.4B "WithinUsAI/GPT2.5.2-high-reasoning-codex-0.4B · Hugging Face"