LH-Tech-AI committed commit 92eb14d (verified · 1 parent: 37f5946)

Update README.md

Files changed (1): README.md (+66 −0)
```python
torch.save(final_checkpoint, os.path.join(out_dir, 'SmaLLMPro_Final.pt'))
print("✅ SmaLLMPro saved successfully!")
```

# 4. Testing SmaLLMPro 350M
To test the model you trained, you can simply run this Python script:
```python
import torch
import tiktoken

from model import GPTConfig, GPT

# --- Config ---
ckpt_path = '/home/user/350m_SmaLLMPro_Final/SmaLLMPro_iter_3000.pt'
device = 'cuda'
enc = tiktoken.get_encoding("gpt2")

# Rebuild the model from the saved config, then load the trained weights
print("Loading SmaLLMPro...")
checkpoint = torch.load(ckpt_path, map_location=device)
gptconf = GPTConfig(**checkpoint['model_args'])
model = GPT(gptconf)
model.load_state_dict(checkpoint['model'])
model.eval()
model.to(device)
print("Ready!\n")

def run_chat():
    print("--- SmaLLMPro Chatbot (Type 'exit' to quit) ---")

    sys_msg = "### System:\nYou are SmaLLMPro, a helpful AI Assistant developed by LH-Tech AI.\n\n"

    while True:
        user_input = input("You: ")
        if user_input.lower() in ["exit", "quit", "beenden"]:
            break

        prompt = f"{sys_msg}### Instruction:\n{user_input}\n\n### Response:\n"

        x = torch.tensor(enc.encode(prompt), dtype=torch.long, device=device)[None, ...]

        print("SmaLLMPro: ", end="", flush=True)
        with torch.no_grad():
            with torch.amp.autocast(device_type='cuda', dtype=torch.bfloat16):
                y = model.generate(x, max_new_tokens=256, temperature=0.7, top_k=40)
        full_text = enc.decode(y[0].tolist())

        # Keep only the text after "### Response:" up to the end-of-text token
        response = full_text.split("### Response:\n")[-1].split("<|endoftext|>")[0].strip()
        print(response + "\n")

if __name__ == "__main__":
    run_chat()
```
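
The chat loop builds an Alpaca-style prompt and then strips everything except the model's answer out of the generated text. Both steps are plain string handling, so they can be pulled into small helpers (hypothetical names, not part of the repo) and sanity-checked without loading the model:

```python
SYS_MSG = ("### System:\nYou are SmaLLMPro, a helpful AI Assistant "
           "developed by LH-Tech AI.\n\n")

def build_prompt(user_input: str, sys_msg: str = SYS_MSG) -> str:
    """Wrap a user message in the Alpaca-style template used above."""
    return f"{sys_msg}### Instruction:\n{user_input}\n\n### Response:\n"

def extract_response(full_text: str) -> str:
    """Keep only the text after the last '### Response:' marker,
    truncated at the first end-of-text token."""
    return full_text.split("### Response:\n")[-1].split("<|endoftext|>")[0].strip()

prompt = build_prompt("What is Artificial Intelligence?")
generated = prompt + "AI is the simulation of human intelligence.<|endoftext|>"
print(extract_response(generated))  # AI is the simulation of human intelligence.
```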
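For reference, the `temperature=0.7` and `top_k=40` arguments to `model.generate` control how each next token is sampled. A minimal pure-Python sketch of the idea, on toy logits unrelated to the real model:

```python
import math
import random

def sample_top_k(logits: dict, k: int, temperature: float,
                 rng: random.Random) -> str:
    """Keep the k highest logits, rescale by temperature, softmax,
    then draw one token from the resulting distribution."""
    top = sorted(logits.items(), key=lambda kv: kv[1], reverse=True)[:k]
    scaled = [(tok, logit / temperature) for tok, logit in top]
    m = max(s for _, s in scaled)                  # subtract max for stability
    weights = [math.exp(s - m) for _, s in scaled]
    total = sum(weights)
    return rng.choices([tok for tok, _ in scaled],
                       weights=[w / total for w in weights])[0]

toy_logits = {"cat": 2.0, "dog": 1.5, "fish": 0.1, "car": -3.0}
print(sample_top_k(toy_logits, k=2, temperature=0.7, rng=random.Random(0)))
```

Lower temperature sharpens the distribution toward the top logit; a smaller `top_k` removes unlikely tokens from consideration entirely.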
# 5. Our training results
## 5.1 Pretraining results

We did the pretraining on a single RTX 5060 Ti 16GB for 30,000 iterations.
Our final `val loss` was **3.0450** and our final `train loss` was **3.0719**.
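
To put these numbers in perspective: cross-entropy loss is the natural log of perplexity, so perplexity is simply `exp(loss)` (a standard relation, not something computed by the training code):

```python
import math

val_loss = 3.0450    # final pretraining validation loss reported above
train_loss = 3.0719  # final pretraining training loss reported above

print(f"val perplexity:   {math.exp(val_loss):.2f}")
print(f"train perplexity: {math.exp(train_loss):.2f}")
```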

## 5.2 Finetuning results
After pretraining, we finetuned our model for 2,000 iterations:
1. Final `val loss`: **?**
2. Final `train loss`: **?**

# 6. Example prompts and results
We tested our finetuned model on a range of prompts:

1. Question: What is Artificial Intelligence?
   --> Answer:
2. ...

---
license: apache-2.0
datasets: