ariankharazmi committed on
Commit 98d4f9c · verified · 1 Parent(s): 06211c3

Update README.md

Files changed (1): README.md (+1 -2)
README.md CHANGED
@@ -31,5 +31,4 @@ Description
 - Curiosity-16 is a small research model (based on pre-trained GPT-2 Medium) that has 354.8M Parameters. It uses training samples from 11 diverse HuggingFace datasets.

 Citation
-Kharazmi, A. (2025). Curiosity-16: A 354.8M Parameter Large Language Model (v1.0). Zenodo. https://doi.org/10.5281/zenodo.17871084Curiosity-16 is a proof-of-concept Large Language Model fine-tuned for short factual responses and basic reasoning capabilities, presented during UCinci’s Summer 2025 EEP Research co-op session. This demonstrated the feasibility of pushing open-source GPT-2 Medium to higher usability through carefully structured two-phase Supervised Fine-tuning. Trained on ~153k prompt-response pairs across 11 curated HuggingFace datasets. Achieved over 100 downloads on HuggingFace. Evaluated against GPT-2 Medium on HellaSwag & Massive Multitask Language Understanding (MMLU) Benchmarks. Model Summary - Parameters: 354.8M Parameters - Base: GPT-2 Medium (Decoder) - Tokenizer: AutoTokenizer - Training: 2-Phase Full SFT (Phase I: Generalization, Phase II: Task-focus/Chain-of-Thought) - Training Duration: 30 Hours -- Phase I: 17 Hours, Phase II: 13 Hours - Purpose: Research Model -- Proof of Concept - Evaluated Strengths: Short factual responses, brief creative prompts, basic reasoning inquiries - Evaluated Limitations: Hard-limit at 1-2 Sentences, no safety filter, prone to misinterpret or hallucinate. Citation: Kharazmi, A. (2025). Curiosity-16: A 354.8M Parameter Large Language Model (v1.0). Zenodo. https://doi.org/10.5281/zenodo.17871084
-
+Kharazmi, A. (2025). Curiosity-16: A 354.8M Parameter Large Language Model (v1.0). Zenodo. https://doi.org/10.5281/zenodo.17871084
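Since the removed summary names AutoTokenizer and a GPT-2 Medium decoder base, loading this model should follow the standard transformers pattern. A minimal sketch, assuming a repo id of "ariankharazmi/Curiosity-16" (a hypothetical placeholder based on the committer's username, not confirmed by this commit):

```python
# Minimal loading sketch for Curiosity-16 via Hugging Face transformers.
# NOTE: the repo id below is an assumption; check the model page for the
# actual identifier.
from transformers import AutoTokenizer, AutoModelForCausalLM

repo_id = "ariankharazmi/Curiosity-16"  # hypothetical repo id

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(repo_id)

# The README notes the model is evaluated for short factual responses and
# hard-limits at 1-2 sentences, so a small generation budget fits.
prompt = "What is the capital of France?"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```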