Update README.md
print("✅ SmaLLMPro saved successfully!")
```

# 4. Testing SmaLLMPro 350M

To test the model you trained, you can simply run this Python script:

```python
import torch
import tiktoken
from model import GPTConfig, GPT

# --- Config ---
ckpt_path = '/home/user/350m_SmaLLMPro_Final/SmaLLMPro_iter_3000.pt'
device = 'cuda'
enc = tiktoken.get_encoding("gpt2")

# load the checkpoint and rebuild the model from its saved arguments
print("Loading SmaLLMPro...")
checkpoint = torch.load(ckpt_path, map_location=device)
gptconf = GPTConfig(**checkpoint['model_args'])
model = GPT(gptconf)
model.load_state_dict(checkpoint['model'])
model.eval()
model.to(device)
print("Ready!\n")

def run_chat():
    print("--- SmaLLMPro Chatbot (Type 'exit' to quit) ---")

    sys_msg = "### System:\nYou are SmaLLMPro, a helpful AI Assistant developed by LH-Tech AI.\n\n"

    while True:
        user_input = input("You: ")
        if user_input.lower() in ["exit", "quit", "beenden"]:
            break

        prompt = f"{sys_msg}### Instruction:\n{user_input}\n\n### Response:\n"

        # encode the prompt and add a batch dimension
        x = torch.tensor(enc.encode(prompt), dtype=torch.long, device=device)[None, ...]

        print("SmaLLMPro: ", end="", flush=True)
        with torch.no_grad():
            with torch.amp.autocast(device_type='cuda', dtype=torch.bfloat16):
                y = model.generate(x, max_new_tokens=256, temperature=0.7, top_k=40)
        full_text = enc.decode(y[0].tolist())

        # keep only the model's answer: text after "### Response:" and before the end-of-text token
        response = full_text.split("### Response:\n")[-1].split("<|endoftext|>")[0].strip()
        print(response + "\n")

if __name__ == "__main__":
    run_chat()
```
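In the script above, `temperature` and `top_k` control sampling: logits are divided by the temperature, only the `top_k` most likely tokens are kept, and the next token is drawn from the renormalized distribution. A minimal, self-contained sketch of the idea (illustrative only, not nanoGPT's actual implementation):

```python
import math
import random

def sample_top_k(logits, temperature=0.7, top_k=2, seed=0):
    """Sample one index from `logits` using temperature + top-k filtering."""
    rng = random.Random(seed)
    m = max(logits)
    # subtract the max for numerical stability, then scale by temperature
    scaled = [(l - m) / temperature for l in logits]
    # keep only the top_k largest logits; zero out the rest
    cutoff = sorted(scaled, reverse=True)[top_k - 1]
    weights = [math.exp(s) if s >= cutoff else 0.0 for s in scaled]
    # rng.choices renormalizes the weights and samples one index
    return rng.choices(range(len(logits)), weights=weights)[0]

# indices 2 and 3 are outside the top 2, so they can never be sampled
print(sample_top_k([5.0, 1.0, 0.9, -3.0], top_k=2))
```

Lower temperatures sharpen the distribution (more deterministic output), while `top_k` caps how many candidate tokens are considered at each step.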

# 5. Our training results

## 5.1 Pretraining results

We did the pretraining on a single RTX 5060 Ti 16GB for 30,000 iterations.
Our final `val loss` was **3.0450** and our final `train loss` was **3.0719**.
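
Since these losses are average per-token cross-entropy in nats, they convert directly to perplexity via `exp(loss)`:

```python
import math

# perplexity = exp(cross-entropy loss)
val_ppl = math.exp(3.0450)    # final val loss
train_ppl = math.exp(3.0719)  # final train loss
print(f"val perplexity = {val_ppl:.1f}, train perplexity = {train_ppl:.1f}")
```

So after pretraining, the model is roughly as uncertain as choosing uniformly among ~21 candidate tokens at each step.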
## 5.2 Finetuning results

After pretraining, we finetuned our model for 2,000 iterations:
1. Final `val loss`: **?**
2. Final `train loss`: **?**

# 6. Example prompts and results

We tested our finetuned model extensively:

1. Question: What is Artificial Intelligence?
   --> Answer:
2. ...

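The questions were formatted with the same prompt template used by the chat script in section 4; a small (hypothetical) helper makes that explicit:

```python
def build_prompt(user_input: str) -> str:
    # hypothetical helper: mirrors the prompt template from the chat script in section 4
    sys_msg = ("### System:\nYou are SmaLLMPro, a helpful AI Assistant "
               "developed by LH-Tech AI.\n\n")
    return f"{sys_msg}### Instruction:\n{user_input}\n\n### Response:\n"

print(build_prompt("What is Artificial Intelligence?"))
```

The model's completion after `### Response:` (cut off at `<|endoftext|>`) is what we report as the answer.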
---
license: apache-2.0
datasets: