Text Generation
PEFT
Safetensors
Transformers
gemma2
axolotl
lora
conversational
text-generation-inference
4-bit precision
bitsandbytes
Instructions to use AiAF/rp-2b with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- PEFT
How to use AiAF/rp-2b with PEFT:
from peft import PeftModel
from transformers import AutoModelForCausalLM

base_model = AutoModelForCausalLM.from_pretrained("google/gemma-2-2b-it")
model = PeftModel.from_pretrained(base_model, "AiAF/rp-2b")
- Transformers
How to use AiAF/rp-2b with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="AiAF/rp-2b")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("AiAF/rp-2b")
model = AutoModelForCausalLM.from_pretrained("AiAF/rp-2b")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))
- Notebooks
- Google Colab
- Kaggle
- Local Apps Settings
- vLLM
How to use AiAF/rp-2b with vLLM:
Install from pip and serve model
# Install vLLM from pip:
pip install vllm

# Start the vLLM server:
vllm serve "AiAF/rp-2b"

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "AiAF/rp-2b",
    "messages": [
      {
        "role": "user",
        "content": "What is the capital of France?"
      }
    ]
  }'
Use Docker
docker model run hf.co/AiAF/rp-2b
- SGLang
How to use AiAF/rp-2b with SGLang:
Install from pip and serve model
# Install SGLang from pip:
pip install sglang

# Start the SGLang server:
python3 -m sglang.launch_server \
  --model-path "AiAF/rp-2b" \
  --host 0.0.0.0 \
  --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "AiAF/rp-2b",
    "messages": [
      {
        "role": "user",
        "content": "What is the capital of France?"
      }
    ]
  }'
Use Docker images
docker run --gpus all \
  --shm-size 32g \
  -p 30000:30000 \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  --env "HF_TOKEN=<secret>" \
  --ipc=host \
  lmsysorg/sglang:latest \
  python3 -m sglang.launch_server \
    --model-path "AiAF/rp-2b" \
    --host 0.0.0.0 \
    --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "AiAF/rp-2b",
    "messages": [
      {
        "role": "user",
        "content": "What is the capital of France?"
      }
    ]
  }'
- Docker Model Runner
How to use AiAF/rp-2b with Docker Model Runner:
docker model run hf.co/AiAF/rp-2b
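The curl calls above can also be issued from Python. The following is a minimal sketch using only the standard library, assuming a server started as shown above is listening on `http://localhost:8000` (vLLM) or `http://localhost:30000` (SGLang). The helper names `build_chat_request` and `send_chat_request` are hypothetical and not part of either project; the response parsing assumes the usual OpenAI-style chat completion shape.

```python
import json
import urllib.request


def build_chat_request(base_url: str, model: str, user_message: str) -> urllib.request.Request:
    """Build the same POST request that the curl examples send (hypothetical helper)."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_chat_request(request: urllib.request.Request, timeout: float = 60.0) -> str:
    """Send the request and return the first reply (requires a running server)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    # Assumes an OpenAI-compatible response: choices[0].message.content
    return body["choices"][0]["message"]["content"]


# Building the request does not contact the server.
request = build_chat_request(
    "http://localhost:8000", "AiAF/rp-2b", "What is the capital of France?"
)
print(request.full_url)
```

Calling `send_chat_request(request)` performs the actual HTTP call, so it only succeeds while one of the servers above is running.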
Training in progress, step 850
- adapter_model.safetensors +1 -1
- debug.log +103 -1
adapter_model.safetensors
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:ad29cff1b863587cbb2ca948354cb20133cf91efb3ab95cc9e09274cb6bcac5b
 size 102264160
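The changed file is a Git LFS pointer: three `key value` lines giving the spec version, the SHA-256 of the real file, and its size in bytes. As a rough sketch, a downloaded `adapter_model.safetensors` could be checked against the new pointer as follows. The names `LfsPointer`, `parse_lfs_pointer`, and `matches_pointer` are hypothetical and introduced only for illustration.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class LfsPointer:
    version: str
    oid_sha256: str
    size: int


def parse_lfs_pointer(text: str) -> LfsPointer:
    """Parse the 'key value' lines of a Git LFS pointer file."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value.strip()
    algorithm, _, digest = fields["oid"].partition(":")
    if algorithm != "sha256":
        raise ValueError(f"unsupported oid algorithm: {algorithm!r}")
    return LfsPointer(fields["version"], digest, int(fields["size"]))


def matches_pointer(data: bytes, pointer: LfsPointer) -> bool:
    """Return True if the bytes have the size and SHA-256 named by the pointer."""
    return (
        len(data) == pointer.size
        and hashlib.sha256(data).hexdigest() == pointer.oid_sha256
    )


# Pointer contents after this commit, copied from the diff above.
POINTER_TEXT = """\
version https://git-lfs.github.com/spec/v1
oid sha256:ad29cff1b863587cbb2ca948354cb20133cf91efb3ab95cc9e09274cb6bcac5b
size 102264160
"""

pointer = parse_lfs_pointer(POINTER_TEXT)
print(pointer.size, pointer.oid_sha256)
```

For a file of roughly 100 MB, hashing in chunks instead of holding all bytes in memory would be the more careful choice; the in-memory version keeps the sketch short.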
debug.log
CHANGED
@@ -2012,4 +2012,106 @@ trainable params: 25,559,040 || all params: 2,639,900,928 || trainable%: 0.9682
80% | 801/1000 [13:44<10:45, 3.25s/it]
80% | 802/1000 [13:45<08:15, 2.50s/it]
80% | 803/1000 [13:46<06:28, 1.97s/it]
80% | 804/1000 [13:46<05:12, 1.59s/it]
80% | 805/1000 [13:47<04:21, 1.34s/it]
81% | 806/1000 [13:48<03:46, 1.17s/it]
81% | 807/1000 [13:49<03:20, 1.04s/it]
81% | 808/1000 [13:49<03:03, 1.04it/s]
81% | 809/1000 [13:50<02:49, 1.13it/s]
81% | 810/1000 [13:51<02:43, 1.16it/s]
81% | 811/1000 [13:52<02:36, 1.21it/s]
81% | 812/1000 [13:52<02:31, 1.24it/s]
81% | 813/1000 [13:53<02:25, 1.29it/s]
81% | 814/1000 [13:54<02:24, 1.29it/s]
82% | 815/1000 [13:55<02:22, 1.30it/s]
82% | 816/1000 [13:55<02:20, 1.31it/s]
82% | 817/1000 [13:56<02:18, 1.32it/s]
82% | 818/1000 [13:57<02:19, 1.30it/s]
82% | 819/1000 [13:58<02:17, 1.32it/s]
82% | 820/1000 [13:58<02:17, 1.31it/s]
82% | 821/1000 [13:59<02:17, 1.30it/s]
82% | 822/1000 [14:00<02:14, 1.32it/s]
82% | 823/1000 [14:01<02:12, 1.34it/s]
82% | 824/1000 [14:01<02:11, 1.34it/s]
82% | 825/1000 [14:02<02:11, 1.33it/s]
83% | 826/1000 [14:03<02:09, 1.34it/s]
83% | 827/1000 [14:04<02:10, 1.33it/s]
83% | 828/1000 [14:04<02:08, 1.33it/s]
83% | 829/1000 [14:05<02:07, 1.34it/s]
83% | 830/1000 [14:06<02:09, 1.32it/s]
83% | 831/1000 [14:07<02:08, 1.31it/s]
83% | 832/1000 [14:07<02:04, 1.35it/s]
83% | 833/1000 [14:08<02:01, 1.38it/s]
83% | 834/1000 [14:09<02:02, 1.36it/s]
84% | 835/1000 [14:10<02:01, 1.35it/s]
84% | 836/1000 [14:10<02:02, 1.34it/s]
84% | 837/1000 [14:11<02:02, 1.33it/s]
84% | 838/1000 [14:12<02:01, 1.33it/s]
84% | 839/1000 [14:13<01:59, 1.35it/s]
84% | 840/1000 [14:13<02:00, 1.33it/s]
84% | 841/1000 [14:14<02:01, 1.31it/s]
84% | 842/1000 [14:15<01:58, 1.34it/s]
84% | 843/1000 [14:15<01:54, 1.37it/s]
84% | 844/1000 [14:16<01:53, 1.37it/s]
84% | 845/1000 [14:17<01:51, 1.39it/s]
85% | 846/1000 [14:18<01:52, 1.36it/s]
85% | 847/1000 [14:18<01:52, 1.35it/s]
85% | 848/1000 [14:19<01:50, 1.37it/s]
85% | 849/1000 [14:20<01:52, 1.34it/s]
85% | 850/1000 [14:21<01:51, 1.34it/s]
[2026-03-30 14:49:35,134] [INFO] [axolotl.core.trainers.base.evaluate:401] [PID:37135] Running evaluation step...
0% | 0/100 [00:00<?, ?it/s]
3% | 3/100 [00:00<00:03, 26.65it/s]
6% | 6/100 [00:00<00:06, 15.38it/s]
8% | 8/100 [00:00<00:05, 15.82it/s]
10% | 10/100 [00:00<00:05, 15.49it/s]
12% | 12/100 [00:00<00:05, 16.59it/s]
14% | 14/100 [00:00<00:05, 16.62it/s]
16% | 16/100 [00:00<00:04, 16.80it/s]
18% | 18/100 [00:01<00:04, 17.06it/s]
20% | 20/100 [00:01<00:04, 17.40it/s]
22% | 22/100 [00:01<00:04, 16.97it/s]
24% | 24/100 [00:01<00:04, 17.61it/s]
26% | 26/100 [00:01<00:04, 17.01it/s]
28% | 28/100 [00:01<00:04, 17.06it/s]
30% | 30/100 [00:01<00:04, 16.68it/s]
32% | 32/100 [00:01<00:04, 16.77it/s]
34% | 34/100 [00:02<00:03, 16.89it/s]
37% | 37/100 [00:02<00:03, 17.27it/s]
39% | 39/100 [00:02<00:03, 17.29it/s]
41% | 41/100 [00:02<00:03, 17.50it/s]
44% | 44/100 [00:02<00:03, 18.16it/s]
46% | 46/100 [00:02<00:03, 17.27it/s]
48% | 48/100 [00:02<00:02, 17.67it/s]
50% | 50/100 [00:02<00:02, 17.03it/s]
52% | 52/100 [00:03<00:02, 17.02it/s]
54% | 54/100 [00:03<00:02, 16.38it/s]
56% | 56/100 [00:03<00:02, 16.69it/s]
58% | 58/100 [00:03<00:02, 16.10it/s]
60% | 60/100 [00:03<00:02, 16.67it/s]
62% | 62/100 [00:03<00:02, 17.05it/s]
64% | 64/100 [00:03<00:02, 17.18it/s]
66% | 66/100 [00:03<00:02, 16.75it/s]
68% | 68/100 [00:03<00:01, 17.33it/s]
70% | 70/100 [00:04<00:01, 16.76it/s]
72% | 72/100 [00:04<00:01, 17.29it/s]
74% | 74/100 [00:04<00:01, 16.41it/s]
77% | 77/100 [00:04<00:01, 17.04it/s]
79% | 79/100 [00:04<00:01, 17.48it/s]
81% | 81/100 [00:04<00:01, 17.21it/s]
84% | 84/100 [00:04<00:00, 18.42it/s]
86% | 86/100 [00:05<00:00, 17.72it/s]
89% | 89/100 [00:05<00:00, 17.89it/s]
91% | 91/100 [00:05<00:00, 18.22it/s]
93% | 93/100 [00:05<00:00, 17.14it/s]
95% | 95/100 [00:05<00:00, 16.78it/s]
97% | 97/100 [00:05<00:00, 16.85it/s]
85% | 850/1000 [14:27<01:51, 1.34it/s]
[2026-03-30 14:49:41,227] [INFO] [axolotl.core.trainers.base._save:722] [PID:37135] Saving model checkpoint to /workspace/data/axolotl-outputs/sft/gemma-2-2b-it-rp-sft-qlora/checkpoint-850
85% | 851/1000 [14:30<08:07, 3.27s/it]
85% | 852/1000 [14:31<06:12, 2.52s/it]
85% | 853/1000 [14:31<04:51, 1.98s/it]
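Two numbers in this log can be cross-checked with simple arithmetic: the `trainable%` figure in the hunk header, and the remaining-time estimate in each progress line. The sketch below assumes the progress-line layout seen in this log (`step/total [elapsed<remaining, rate]`, with the rate given either as `it/s` or `s/it`). The names `trainable_percent` and `estimate_eta` are hypothetical helpers, not part of axolotl or tqdm. Because the displayed rate is rounded, the recomputed estimate can differ from the logged one by a second on some lines.

```python
import re

# Matches the "step/total [elapsed<remaining, rate]" tail of a progress line.
PROGRESS_RE = re.compile(
    r"(?P<step>\d+)/(?P<total>\d+) \[(?P<elapsed>[\d:]+)<(?P<eta>[\d:?]+),\s*"
    r"(?P<rate>[\d.?]+)(?P<unit>it/s|s/it)\]"
)


def trainable_percent(trainable: int, total: int) -> float:
    """Share of parameters that are trainable, in percent."""
    return 100.0 * trainable / total


def estimate_eta(line: str) -> str:
    """Recompute the remaining time (mm:ss) from the step count and displayed rate."""
    match = PROGRESS_RE.search(line)
    if match is None or "?" in match["rate"]:
        raise ValueError(f"no usable progress information in: {line!r}")
    remaining_steps = int(match["total"]) - int(match["step"])
    rate = float(match["rate"])
    if match["unit"] == "it/s":
        seconds = remaining_steps / rate  # iterations per second
    else:
        seconds = remaining_steps * rate  # seconds per iteration
    minutes, secs = divmod(int(seconds), 60)
    return f"{minutes:02d}:{secs:02d}"


# Values copied from the log above.
print(round(trainable_percent(25_559_040, 2_639_900_928), 4))
print(estimate_eta("850/1000 [14:21<01:51, 1.34it/s]"))
print(estimate_eta("853/1000 [14:31<04:51, 1.98s/it]"))
```

The switch from `s/it` to `it/s` around step 808 reflects only how the rate is displayed once an iteration takes less than a second; the jump back to `3.27s/it` at step 851 follows the evaluation and checkpoint save at step 850.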