gss1147 committed on
Commit d7587be · verified · 1 Parent(s): fb0bc2d

Update README.md

Files changed (1):
  1. README.md +147 -2

README.md CHANGED
@@ -15,6 +15,7 @@ tags:
  - instruct
  - lightweight
  - safetensors
+ - withinusai
 
  license: other
  license_name: withinusai-custom-license
@@ -28,14 +29,158 @@ datasets:
  - TeichAI/gpt-5.1-codex-max-1000x
  - TeichAI/gpt-5.1-high-reasoning-1000x
 
- # Metrics are the names of metrics you will report (no fake values)
+ # Metrics you intend to report (no fake values)
  metrics:
  - pass@1
+ - accuracy
  - exact_match
+
+ # Eval Results container (empty until real numbers are added)
+ model-index:
+ - name: WithinUsAI/GPT2.5.2-high-reasoning-codex-0.4B
+ results: []
+ ---
+ ---
+ language:
+ - en
+
+ library_name: transformers
+ pipeline_tag: text-generation
+
+ tags:
+ - gpt2
+ - causal-lm
+ - text-generation
+ - code
+ - coding
+ - reasoning
+ - instruct
+ - lightweight
+ - safetensors
+ - withinusai
+
+ license: other
+ license_name: withinusai-custom-license
+ license_link: LICENSE
+
+ base_model: openai-community/gpt2-medium
+ base_model_relation: finetune
+
+ datasets:
+ - WithinUsAI/GPT-2-to-GPT-5-5k
+ - TeichAI/gpt-5.1-codex-max-1000x
+ - TeichAI/gpt-5.1-high-reasoning-1000x
+
+ # Metrics you intend to report (no fake values)
+ metrics:
+ - pass@1
  - accuracy
+ - exact_match
 
- # Eval results metadata (kept empty until you add real numbers)
+ # Eval Results container (empty until real numbers are added)
  model-index:
  - name: WithinUsAI/GPT2.5.2-high-reasoning-codex-0.4B
  results: []
  ---
+ Prompting tips (high-reasoning unlock)
+
+ “Give a 3-step plan, then implement.”
+
+ “List edge cases first, then write the code.”
+
+ “Explain root cause → propose fix → provide patch.”
+
+ “State invariants + time/space complexity.”
+
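These tips can be prepended to a task description before generation. A tiny helper sketch: the tip strings are quoted from this card, while the surrounding template (`Task:`/`Answer:`) is an illustrative assumption, not a format the model is known to be trained on.

```python
# Hypothetical prompt helper; the tips are quoted from this card,
# the wrapping template is an illustrative assumption.
TIPS = [
    "Give a 3-step plan, then implement.",
    "List edge cases first, then write the code.",
    "Explain root cause -> propose fix -> provide patch.",
    "State invariants + time/space complexity.",
]

def build_prompt(task: str, tip_index: int = 0) -> str:
    """Prepend one reasoning-unlock tip to a task description."""
    return f"{TIPS[tip_index]}\n\nTask: {task}\nAnswer:"

print(build_prompt("Reverse a linked list.", tip_index=1).splitlines()[0])
# -> List edge cases first, then write the code.
```

The resulting string can then be fed to any text-generation call.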
+ Fine-tuning and method (WithIn Us AI)
+
+ WithIn Us AI created the concept and process behind enhancing GPT-2 Medium toward a GPT-5.2-style “twin target,” including:
+
+ instruction/format design for reasoning + coding
+
+ fine-tuning strategy and iteration process
+
+ naming/versioning of the enhanced model line (“GPT2.5.2”)
+
+ Base model credit
+
+ This work is built on the foundation model:
+
+ openai-community/gpt2-medium
+
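Since the metadata declares `library_name: transformers` and a GPT-2 base, loading should work with the standard auto classes. A minimal, untested sketch (the repo id is taken from this card; the checkpoint is downloaded on first call):

```python
MODEL_ID = "WithinUsAI/GPT2.5.2-high-reasoning-codex-0.4B"

def generate(prompt: str, max_new_tokens: int = 128) -> str:
    """Greedy-decoding sketch; downloads the checkpoint on first call."""
    # Imports kept inside the function so the sketch can be read
    # (and MODEL_ID reused) without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)
    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens, do_sample=False)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)
```

Greedy decoding is shown for reproducibility; sampling parameters can be passed to `generate` as usual.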
+ Training data (datasets used)
+
+ This model was trained on a mixture of a WithIn Us AI dataset and third-party datasets used to improve training results:
+
+ WithIn Us AI dataset (creator-owned):
+
+ WithinUsAI/GPT-2-to-GPT-5-5k
+
+ Third-party datasets (no ownership claimed, credited explicitly):
+
+ TeichAI/gpt-5.1-codex-max-1000x
+
+ TeichAI/gpt-5.1-high-reasoning-1000x
+
+ Evaluation (add results when available)
+
+ The metadata includes a model-index block so evaluation results can be added later and rendered by the Hub.
+
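Once real numbers exist, a filled entry might look like the fragment below (shape per the Hub's model-index schema; the `<...>` value is a placeholder, not a measured score):

```yaml
# Hypothetical filled entry — <...> is a placeholder, not a real number.
model-index:
- name: WithinUsAI/GPT2.5.2-high-reasoning-codex-0.4B
  results:
  - task:
      type: text-generation
    dataset:
      name: HumanEval
      type: openai_humaneval
    metrics:
    - type: pass@1
      value: <measured pass@1>
```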
+ Suggested benchmarks:
+
+ | Category    | Benchmark            | Metric   |
+ |-------------|----------------------|----------|
+ | Code        | HumanEval            | pass@1   |
+ | Code        | MBPP                 | pass@1   |
+ | Reasoning   | Custom reasoning set | accuracy |
+ | Reliability | Bugfix set           | fix rate |
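For the pass@1 entries above, the standard unbiased estimator used with HumanEval-style benchmarks can be sketched as follows (with one sample per problem, pass@1 reduces to the plain success rate):

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased estimator of pass@k given n samples with c correct.

    Standard HumanEval-style estimator: pass@k = 1 - C(n - c, k) / C(n, k).
    """
    if n - c < k:
        # Fewer failures than draws: at least one draw must be correct.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)

print(pass_at_k(10, 3, 1))  # ~0.3 when 3 of 10 samples pass
```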
+ Limitations
+
+ May hallucinate library/API details; validate via execution
+
+ Small models can struggle with long multi-hop constraints
+
+ Always run unit tests and check edge cases
+
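The "validate via execution" advice can be sketched as running a candidate snippet in an isolated namespace against hand-written tests; `generated` below is a stand-in string, not actual model output.

```python
# `generated` stands in for model output; in practice it would come
# from the model rather than a string literal.
generated = """
def reverse_words(s):
    return " ".join(reversed(s.split()))
"""

def passes_tests(source: str) -> bool:
    """Execute candidate code and check it against hand-written asserts."""
    namespace: dict = {}
    try:
        exec(source, namespace)            # run the candidate code
        fn = namespace["reverse_words"]
        assert fn("a b c") == "c b a"      # hand-written checks
        assert fn("solo") == "solo"
    except Exception:
        return False
    return True

print(passes_tests(generated))  # True for this candidate
```

For untrusted model output, `exec` should of course run inside a proper sandbox (subprocess with limits, container, etc.), not in-process.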
+ Ethics & responsible use
+
+ Outputs may reflect biases in training data
+
+ Avoid using for harmful instructions or sensitive personal data
+
+ Human review recommended for real deployments
+
+ Thanks & attribution (explicit thanks by username + dataset/model name)
+
+ WithIn Us AI proudly gives thanks and respect to the original creators whose work makes this ecosystem possible:
+
+ Base model foundation (no ownership claimed):
+
+ Thank you to the creators and maintainers of openai-community/gpt2-medium.
+
+ Third-party datasets used (no remixing, no ownership claimed):
+
+ Thank you TeichAI for TeichAI/gpt-5.1-codex-max-1000x.
+
+ Thank you TeichAI for TeichAI/gpt-5.1-high-reasoning-1000x.
+
+ WithIn Us AI original dataset:
+
+ WithinUsAI/GPT-2-to-GPT-5-5k (created by WithIn Us AI)
+
+ WithIn Us AI is excited about collaboration, research, experiments, and breakthrough discovery.
+
+ License
+
+ Licensed under the WithIn Us AI Custom License.
+ See LICENSE.
+
+ Citation
+
+ @misc{withinusai_gpt252_high_reasoning_codex_04b,
+   title  = {WithinUsAI/GPT2.5.2-high-reasoning-codex-0.4B},
+   author = {WithIn Us AI},
+   year   = {2026},
+   url    = {https://huggingface.co/WithinUsAI/GPT2.5.2-high-reasoning-codex-0.4B}
+ }
+ Changelog
+
+ v2.5.2: GPT-2 Medium enhanced toward GPT-5.2 twin target (reasoning + codex tuning)