---
license: apache-2.0
language:
- en
base_model:
- openai-community/gpt2-medium
pipeline_tag: text-generation
tags:
- research
- proof-of-concept
---
# Curiosity-16

## Model Summary
- **Parameters:** 354.8M
- **Base:** GPT-2 Medium (decoder-only)
- **Tokenizer:** GPT-2 tokenizer (loaded via `AutoTokenizer`)
- **Training:** two-phase full supervised fine-tuning (SFT)
- **Purpose:** research model, proof of concept
- **Strengths:** short factual responses, small stories, basic reasoning
- **Limitations:** hard limit of 1-2 sentences per response; tends to misunderstand prompts; no safety filter; prone to hallucination
## Description

Curiosity-16 is a small research model (354.8M parameters) built on pre-trained GPT-2 Medium. It was fine-tuned on training samples drawn from 11 diverse Hugging Face datasets.
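Since the model follows the standard GPT-2 architecture and uses `AutoTokenizer`, it should load with the usual `transformers` causal-LM API. The sketch below is illustrative, not official usage: the repository id is a placeholder (the actual Hugging Face repo path is not stated here), and the sampling parameters are reasonable defaults, not values recommended by the author.

```python
# Minimal usage sketch for a GPT-2-based causal LM such as Curiosity-16.
# ASSUMPTION: "your-username/Curiosity-16" is a placeholder repo id; replace
# it with the model's actual Hugging Face repository path.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "your-username/Curiosity-16"  # placeholder (assumption)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

prompt = "What is the capital of France?"
inputs = tokenizer(prompt, return_tensors="pt")

# The model card notes a hard limit of 1-2 sentence responses, so a small
# max_new_tokens budget is enough.
outputs = model.generate(
    **inputs,
    max_new_tokens=60,
    do_sample=True,
    top_p=0.9,
    temperature=0.7,
    pad_token_id=tokenizer.eos_token_id,  # GPT-2 has no dedicated pad token
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Because the model has no safety filter and is prone to hallucination, outputs should be treated as research artifacts rather than reliable answers.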
## Citation

Kharazmi, A. (2025). Curiosity-16: A 354.8M Parameter Large Language Model (v1.0). Zenodo. https://doi.org/10.5281/zenodo.17871084