---
license: mit
tags:
  - pytorch
  - causal-lm
  - custom
library_name: transformers
pipeline_tag: text-generation
---

# IBLM - GPT2-Small (FineWeb 10B)

A custom GPT model trained on 10B tokens of the FineWeb dataset.

## Model Details

- **Architecture**: Custom GPT with value residual connections and lambda mixing
- **Parameters**: ~124M (GPT2-small scale)
- **Training Data**: FineWeb, 10B tokens
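The card does not spell out how value residual connections and lambda mixing are implemented here. A minimal PyTorch sketch of one common formulation — blending each layer's value projection with the first layer's values through a learnable mixing weight λ — might look like the following. The class name, parameterization, and initial λ = 0.5 are illustrative assumptions, not this repository's actual code.

```python
import torch
import torch.nn as nn

class ValueResidualMixing(nn.Module):
    """Illustrative sketch (not this repo's code): mix each layer's value
    projection with the first layer's values via a learnable scalar lambda."""

    def __init__(self, dim):
        super().__init__()
        self.v_proj = nn.Linear(dim, dim, bias=False)
        self.lamb = nn.Parameter(torch.tensor(0.5))  # learnable mixing weight (assumed init)

    def forward(self, x, v1=None):
        v = self.v_proj(x)
        if v1 is None:
            v1 = v  # first layer: its values become the shared residual source
        # lambda mixing: blend current-layer values with first-layer values
        v = self.lamb * v + (1.0 - self.lamb) * v1
        return v, v1
```

Deeper layers would pass the `v1` returned by the first layer back in, so every attention block sees a convex combination of its own values and the first layer's.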

## Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# Load model and tokenizer
model = AutoModelForCausalLM.from_pretrained(
    "Ksgk-fy/iblm",
    trust_remote_code=True
)
tokenizer = AutoTokenizer.from_pretrained("gpt2")
model.eval()

# Generate text
input_ids = tokenizer("Hello, world", return_tensors="pt").input_ids
with torch.no_grad():
    outputs = model.generate(input_ids, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
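For readers unfamiliar with what `generate` does under the hood, a stripped-down sketch of greedy decoding is shown below. This is illustrative only — the real `generate` additionally supports sampling, beam search, and KV caching — and `greedy_generate` is a hypothetical helper, not part of this repository.

```python
import torch

def greedy_generate(model, input_ids, max_new_tokens=50, eos_id=None):
    """Minimal greedy-decoding loop (illustrative sketch, not the real API):
    repeatedly pick the argmax token and append it to the sequence."""
    model.eval()
    with torch.no_grad():
        for _ in range(max_new_tokens):
            logits = model(input_ids).logits          # (batch, seq, vocab)
            next_id = logits[:, -1, :].argmax(dim=-1, keepdim=True)
            input_ids = torch.cat([input_ids, next_id], dim=-1)
            if eos_id is not None and (next_id == eos_id).all():
                break  # stop once every sequence has emitted EOS
    return input_ids
```

In practice you should prefer the built-in `model.generate`, which handles caching and sampling strategies efficiently.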

## Citation

If you use this model, please cite...