jasonacox/jojo-124M

Model Description

jasonacox/jojo-124M is a GPT-style language model trained with the Jojo LLM training framework on the TinyStoriesV2 dataset. It is intended for English text generation tasks.

Model Details

  • Model Type: GPT-style Transformer Language Model
  • Training Framework: Jojo LLM
  • Language: English
  • License: MIT

Architecture

  • Layers: 12
  • Hidden Size: 768
  • Attention Heads: 12
  • Context Length: 1024 tokens
  • Vocabulary Size: 50,304 tokens
  • Total Parameters: ~124M
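
These dimensions match the standard GPT-2 "small" configuration, which works out to roughly 124M parameters under the usual parameterization with tied input/output embeddings, consistent with the model name. A quick back-of-the-envelope check (a sketch assuming the GPT-2 layout; not taken from the Jojo source):

# Rough parameter count for the configuration above, assuming the standard
# GPT-2 parameterization with tied input/output embeddings.
n_layer, n_embd = 12, 768
vocab_size, block_size = 50304, 1024

per_block = 12 * n_embd**2 + 13 * n_embd               # attention + MLP + 2 layer norms
embeddings = (vocab_size + block_size) * n_embd        # token + position embeddings
total = n_layer * per_block + embeddings + 2 * n_embd  # plus final layer norm

print(f"~{total / 1e6:.1f}M parameters")  # ~124.5M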

Training Details

Training Data

The model was trained on the TinyStoriesV2 dataset, a corpus of short, simple children's stories generated by GPT-4 and designed for training small language models.

Training Procedure

  • Training Framework: Jojo LLM v2.1.0
  • PyTorch Version: 2.7.1+cu126
  • Training Device: cuda:1
  • Precision: bfloat16

Training Hyperparameters

  • Batch Size: 12
  • Gradient Accumulation Steps: 40
  • Learning Rate: 0.0006
  • Weight Decay: 0.1
  • Dropout: 0.2
  • Gradient Clipping: 1.0
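
With a micro-batch of 12 and 40 gradient-accumulation steps, each optimizer update therefore covers 12 × 40 = 480 sequences (about 491K tokens at the 1024-token context length). A minimal sketch of how the settings above combine in a single PyTorch update step (illustrative only, not the actual Jojo training loop; model, optimizer, and get_batch are hypothetical placeholders):

import torch

grad_accum_steps = 40
for micro_step in range(grad_accum_steps):
    x, y = get_batch()  # hypothetical helper returning (12, 1024) token batches
    # bfloat16 autocast, matching the precision listed above
    with torch.autocast(device_type="cuda", dtype=torch.bfloat16):
        logits, loss = model(x, y)
    (loss / grad_accum_steps).backward()  # average the loss across micro-batches

torch.nn.utils.clip_grad_norm_(model.parameters(), 1.0)  # gradient clipping at 1.0
optimizer.step()  # e.g. AdamW with lr=6e-4, weight_decay=0.1
optimizer.zero_grad(set_to_none=True)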

Usage

Using with Transformers

from transformers import GPT2LMHeadModel, GPT2Tokenizer

# Load the model; the tokenizer is the standard GPT-2 BPE tokenizer
model = GPT2LMHeadModel.from_pretrained("jasonacox/jojo-124M")
tokenizer = GPT2Tokenizer.from_pretrained("gpt2")

# Generate text
input_text = "Your prompt here"
inputs = tokenizer.encode(input_text, return_tensors="pt")
outputs = model.generate(
    inputs,
    max_length=100,
    num_return_sequences=1,
    do_sample=True,                        # required for temperature to take effect
    temperature=0.7,
    pad_token_id=tokenizer.eos_token_id,   # GPT-2 has no dedicated pad token
)
generated_text = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(generated_text)
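
Note that the tokenizer is loaded from gpt2 rather than from the model repository: the model reuses the GPT-2 BPE vocabulary, and the 50,304-token vocabulary size listed above is presumably the 50,257-token GPT-2 vocabulary padded up to a multiple of 64 for efficiency. Without do_sample=True, generate() decodes greedily and the temperature setting has no effect.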

Using with Jojo LLM

# Generate text using the original Jojo framework
python gen.py jojo-124M.pt --prompt "Your prompt here"

Technical Specifications

  • Model Format: PyTorch
  • Precision: bfloat16
  • Framework Compatibility:
    • ✅ Hugging Face Transformers
    • ✅ Jojo LLM
    • ✅ PyTorch

Model Card Authors

This model card was automatically generated by the Jojo LLM Hugging Face upload script.

Citation

If you use this model, please cite:

@misc{jasonacox_jojo_124m,
  title={jasonacox/jojo-124M},
  author={Jason A. Cox},
  year={2025},
  howpublished={\url{https://github.com/jasonacox/jojo}},
  note={Trained using Jojo LLM framework}
}

Framework Information

  • Jojo LLM Version: 2.1.0
  • Generation Date: 2025-07-10T22:23:12.633549
  • Checkpoint: jojo-124M.pt

For more information about the Jojo LLM framework, visit: https://github.com/jasonacox/jojo
