Curiosity-16

Model Summary

  • Parameters: 354.8M

  • Base: GPT-2 Medium (Decoder)

  • Tokenizer: GPT-2 BPE (loaded via AutoTokenizer)

  • Training: 2-Phase Full SFT

  • Purpose: Research Model -- Proof of Concept

  • Strengths: Short factual responses, small stories, basic reasoning

  • Limitations: Hard limit of 1-2 sentences per response, tends to misunderstand prompts, no safety filter, prone to hallucination

Description

  • Curiosity-16 is a small research model with 354.8M parameters, based on pre-trained GPT-2 Medium. It was fine-tuned on training samples drawn from 11 diverse HuggingFace datasets.
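A minimal usage sketch with the Hugging Face transformers library. It assumes the checkpoint is hosted on the Hub under the repo id ariankharazmi/Curiosity-16 (as the model tree indicates); the prompt is purely illustrative:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "ariankharazmi/Curiosity-16"  # Hub repo id from this model card

# Load the tokenizer and model weights from the Hugging Face Hub
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(MODEL_ID)

# Curiosity-16 produces short (1-2 sentence) responses, so a small
# max_new_tokens budget is usually sufficient
prompt = "What is the capital of France?"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(
    **inputs,
    max_new_tokens=40,
    do_sample=False,  # greedy decoding for reproducible factual answers
    pad_token_id=tokenizer.eos_token_id,  # GPT-2 has no dedicated pad token
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Because the model is an unmodified GPT-2 Medium architecture, the standard AutoModelForCausalLM / AutoTokenizer loading path should apply without custom code.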

Citation

  • Kharazmi, A. (2025). Curiosity-16: A 354.8M Parameter Large Language Model (v1.0). Zenodo. https://doi.org/10.5281/zenodo.17871084

Format

  • Safetensors, F32 tensors (0.4B params)
