# open-lm-1b-202407

Paper: [arXiv:2410.14660](https://arxiv.org/abs/2410.14660)
This is a Hugging Face-format conversion of the Apple Open LM 1B oracle model from the TiC-LM (Time-Continual Language Modeling) project, trained with a knowledge cutoff of July 2024.
| Property | Value |
|---|---|
| Architecture | LLaMA-style (pre-norm, SwiGLU, RoPE) |
| Parameters | ~1.4B |
| Training tokens | 220B |
| Knowledge cutoff | July 2024 |
| Vocab size | 50,432 |
| Context length | 2,048 |
| Original format | Apple Open LM |
Load the model with Hugging Face Transformers (the custom model class requires `trust_remote_code=True`):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained(
    "dogtooth/open-lm-1b-202407",
    dtype=torch.bfloat16,    # use torch.float32 on hardware without bf16 support
    device_map="auto",
    trust_remote_code=True,  # needed for the custom OpenLMForCausalLM class
)

# The model uses the GPT-NeoX-20B tokenizer.
tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-neox-20b")
```
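Once loaded, the model supports plain causal-LM generation. A minimal sketch (the helper name and greedy decoding settings are illustrative, not part of the released code):

```python
def generate_continuation(model, tokenizer, prompt, max_new_tokens=32):
    """Generate a continuation and return only the newly generated text."""
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output = model.generate(
        **inputs,
        max_new_tokens=max_new_tokens,
        do_sample=False,                        # greedy decoding
        pad_token_id=tokenizer.eos_token_id,    # silence the missing-pad warning
    )
    # Slice off the prompt tokens so only the continuation is decoded.
    new_tokens = output[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)

# Example (requires the `model` and `tokenizer` loaded above):
# print(generate_continuation(model, tokenizer, "Paris is the capital of"))
```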
The original Apple Open LM `.pt` checkpoint was converted to a custom `OpenLMForCausalLM` implementation, which is why `trust_remote_code=True` is required when loading.

```bibtex
@article{jain2024ticlm,
  title={Time-Continual Learning from a Streaming Language Model},
  author={Jain, Ameya and Ramesh, Aakanksha and Li, Tianjian and others},
  journal={arXiv preprint arXiv:2410.14660},
  year={2024}
}
```