Text Generation
Transformers
Safetensors
English
looplm
looped-transformer
language-model
research
from-scratch
custom_code
Instructions to use harims95/LoopLM-135M-naive with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use harims95/LoopLM-135M-naive with Transformers:
    # Use a pipeline as a high-level helper
    from transformers import pipeline

    pipe = pipeline("text-generation", model="harims95/LoopLM-135M-naive", trust_remote_code=True)

    # Load model directly
    from transformers import AutoModelForCausalLM

    model = AutoModelForCausalLM.from_pretrained("harims95/LoopLM-135M-naive", trust_remote_code=True, dtype="auto")
- Notebooks
- Google Colab
- Kaggle
- Local Apps Settings
- vLLM
How to use harims95/LoopLM-135M-naive with vLLM:
Install from pip and serve model
    # Install vLLM from pip:
    pip install vllm

    # Start the vLLM server:
    vllm serve "harims95/LoopLM-135M-naive"

    # Call the server using curl (OpenAI-compatible API):
    curl -X POST "http://localhost:8000/v1/completions" \
        -H "Content-Type: application/json" \
        --data '{
            "model": "harims95/LoopLM-135M-naive",
            "prompt": "Once upon a time,",
            "max_tokens": 512,
            "temperature": 0.5
        }'
Use Docker
docker model run hf.co/harims95/LoopLM-135M-naive
- SGLang
How to use harims95/LoopLM-135M-naive with SGLang:
Install from pip and serve model
    # Install SGLang from pip:
    pip install sglang

    # Start the SGLang server:
    python3 -m sglang.launch_server \
        --model-path "harims95/LoopLM-135M-naive" \
        --host 0.0.0.0 \
        --port 30000

    # Call the server using curl (OpenAI-compatible API):
    curl -X POST "http://localhost:30000/v1/completions" \
        -H "Content-Type: application/json" \
        --data '{
            "model": "harims95/LoopLM-135M-naive",
            "prompt": "Once upon a time,",
            "max_tokens": 512,
            "temperature": 0.5
        }'
Use Docker images
    docker run --gpus all \
        --shm-size 32g \
        -p 30000:30000 \
        -v ~/.cache/huggingface:/root/.cache/huggingface \
        --env "HF_TOKEN=<secret>" \
        --ipc=host \
        lmsysorg/sglang:latest \
        python3 -m sglang.launch_server \
            --model-path "harims95/LoopLM-135M-naive" \
            --host 0.0.0.0 \
            --port 30000

    # Call the server using curl (OpenAI-compatible API):
    curl -X POST "http://localhost:30000/v1/completions" \
        -H "Content-Type: application/json" \
        --data '{
            "model": "harims95/LoopLM-135M-naive",
            "prompt": "Once upon a time,",
            "max_tokens": 512,
            "temperature": 0.5
        }'
- Docker Model Runner
How to use harims95/LoopLM-135M-naive with Docker Model Runner:
docker model run hf.co/harims95/LoopLM-135M-naive
{
  "timestamp": "2026-06-28T11:27:05.869434+00:00",
  "run_name": "real_naive_fineweb_5B_2gpu",
  "git_commit": "unknown",
  "cli_args": {
    "preset": "135M",
    "run_name": "real_naive_fineweb_5B_2gpu",
    "data_dir": "/data/fineweb",
    "train_pattern": "fineweb_train_*.bin",
    "val_pattern": "fineweb_train_*.bin",
    "max_steps": 20000,
    "seq_len": 1024,
    "batch_tokens": 262144,
    "micro_batch_seqs": 32,
    "val_every": 250,
    "out_dir": "/data/runs",
    "save_every": 2500,
    "no_compile": false,
    "holdout_last_for_val": true,
    "set": [
      "use_a_matrix=false",
      "use_input_norm=false"
    ]
  },
  "train_config": {
    "data_dir": "/data/fineweb_edu",
    "train_pattern": "edu_fineweb_train_*.bin",
    "val_pattern": "edu_fineweb_val_*.bin",
    "seq_len": 1024,
    "batch_tokens": 262144,
    "micro_batch_seqs": 32,
    "max_steps": 20000,
    "warmup_steps": 100,
    "cooldown_frac": 0.4,
    "final_lr_frac": 0.1,
    "muon_lr": 0.02,
    "muon_momentum": 0.95,
    "muon_wd": 0.1,
    "muon_ns_steps": 5,
    "adam_lr": 0.0003,
    "adam_betas": [
      0.9,
      0.95
    ],
    "adam_wd": 0.1,
    "grad_clip": 1.0,
    "val_every": 250,
    "val_tokens": 10485760,
    "log_every": 10,
    "seed": 1337,
    "compile": true,
    "bf16": true,
    "out_dir": "/data/runs",
    "run_name": "real_naive_fineweb_5B_2gpu"
  },
  "model_config": {
    "vocab_size": 50304,
    "d_model": 1024,
    "n_prelude": 4,
    "n_coda": 2,
    "mu_rec": 6,
    "n_q_heads": 16,
    "n_kv_heads": 8,
    "head_dim": 64,
    "qk_norm": true,
    "rope_theta": 10000.0,
    "dense_ffn": 2816,
    "tie_embeddings": true,
    "final_z_loss_coef": 0.0001,
    "use_a_matrix": false,
    "use_input_norm": false,
    "init_std": 0.02
  },
  "hostname": "modal",
  "gpu_count": 2,
  "gpu_type": "NVIDIA H100 80GB HBM3",
  "pytorch_version": "2.12.0+cu130"
}
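The numbers in this config imply a few quantities that are easier to read once worked out. The sketch below is illustrative only and is not taken from the training code: `derived_quantities` and `lr_multiplier` are hypothetical helper names. It assumes that `micro_batch_seqs` is counted per GPU, and that `warmup_steps`, `cooldown_frac`, and `final_lr_frac` describe a trapezoidal schedule (linear warmup, constant plateau, linear cooldown over the last `cooldown_frac` of training down to `final_lr_frac` of the peak). The actual trainer may define either differently.

```python
# Values copied from the config above; only the fields used here are repeated.
CONFIG = {
    "max_steps": 20000,
    "seq_len": 1024,
    "batch_tokens": 262144,
    "micro_batch_seqs": 32,
    "val_tokens": 10485760,
    "warmup_steps": 100,
    "cooldown_frac": 0.4,
    "final_lr_frac": 0.1,
    "gpu_count": 2,
}


def derived_quantities(cfg):
    """Token budget and batch arithmetic implied by the config (hypothetical helper)."""
    # ASSUMPTION: micro_batch_seqs is the number of sequences per GPU per forward pass.
    tokens_per_micro_step = cfg["micro_batch_seqs"] * cfg["seq_len"] * cfg["gpu_count"]
    return {
        "total_train_tokens": cfg["max_steps"] * cfg["batch_tokens"],
        "seqs_per_step": cfg["batch_tokens"] // cfg["seq_len"],
        "grad_accum_steps": cfg["batch_tokens"] // tokens_per_micro_step,
        "val_seqs": cfg["val_tokens"] // cfg["seq_len"],
    }


def lr_multiplier(step, cfg):
    """Multiplier applied to the peak learning rate at a given step.

    ASSUMPTION: trapezoidal shape; the real schedule is not shown in the config.
    """
    max_steps = cfg["max_steps"]
    warmup = cfg["warmup_steps"]
    cooldown_start = max_steps - round(cfg["cooldown_frac"] * max_steps)
    if step < warmup:
        # Linear ramp that reaches 1.0 on the last warmup step.
        return (step + 1) / warmup
    if step < cooldown_start:
        return 1.0
    # Linear decay from 1.0 down to final_lr_frac at max_steps.
    progress = (step - cooldown_start) / (max_steps - cooldown_start)
    return 1.0 - (1.0 - cfg["final_lr_frac"]) * progress


if __name__ == "__main__":
    for name, value in derived_quantities(CONFIG).items():
        print(f"{name}: {value:,}")
    for step in (0, 100, 12000, 16000, 20000):
        print(f"step {step}: lr multiplier {lr_multiplier(step, CONFIG):.3f}")
```

With these values the token budget is 20000 × 262144 = 5,242,880,000, which is consistent with the `5B` in `run_name`, and under the per-GPU assumption each optimizer step takes four micro-steps. The multiplier would scale both `muon_lr` and `adam_lr` if the two optimizers share one schedule, which the config does not state.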