Eclipse-Senpai committed on
Commit 6f9aecf · verified · 1 Parent(s): 2dc5cfa

Update README.md

Files changed (1)
  1. README.md +1 -5
README.md CHANGED
@@ -29,7 +29,7 @@ datasets:
 
 # KeyLM-75M-Instruct
 
-KeyLM-75M-Instruct is a 75M parameter instruction-tuned language model trained from scratch on approximately 18 billion tokens. That training budget is a small fraction of what comparable small models use (SmolLM-135M was trained on roughly 600B tokens, SmolLM2-135M on roughly 2T). Despite this, it is competitive on instruction following, outperforming SmolLM-135M-Instruct on IFEval while using about half the parameters and a fraction of the data.
+KeyLM-75M-Instruct is a 75M parameter instruction-tuned language model trained from scratch on approximately 18 billion tokens. That training budget is a small fraction of what comparable small models use (SmolLM-135M was trained on 600B tokens and SmolLM2-135M on 2T). Despite this, it is competitive on instruction following, outperforming SmolLM-135M-Instruct on IFEval while using about half the parameters and a fraction of the data.
 
 ## Table of Contents
 
@@ -60,8 +60,6 @@ GGUF builds for `llama.cpp`, LM Studio, and Ollama are available at [KeyLM-75M-I
 
 ## How to Use
 
-KeyLM ships its own modeling code, so load it with `trust_remote_code=True`. It requires `transformers>=4.51`.
-
 ```python
 import torch
 from transformers import AutoModelForCausalLM, AutoTokenizer
@@ -83,8 +81,6 @@ outputs = model.generate(
 print(tokenizer.decode(outputs[0][inputs.shape[1]:], skip_special_tokens=True))
 ```
 
-The model uses a plain `User:` / `Assistant:` chat format, applied automatically by `apply_chat_template`. Assistant turns end with `</s>`.
-
 ## Evaluation
 
 ### Instruction following (IFEval)