Akicou committed (verified)
Commit b8b1f34 · Parent(s): b975fa7

Update README.md

Files changed (1): README.md (+18 −17)
This repository contains a pruned version of Upstage's **Solar-Open-100B**.

* **Pruning Method:** REAP (Router Expert Activation Pruning), based on the [Cerebras Research REAP implementation](https://github.com/CerebrasResearch/reap).
* **Optimization:** Pruned using ~100 samples from the `nickrosh/Evol-Instruct-Code-80k-v1` dataset.
* **Hardware:** Pruned on 4x NVIDIA A100 SXM.
* **Custom Chat Template:** Features a specialized template to manage reasoning length and prevent "non-stop" generation issues.
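REAP keeps the experts whose router activations matter most on a calibration set and drops the rest. A toy sketch of that selection step (simplified scoring on synthetic data; not the Cerebras implementation):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup: 8 experts, router statistics over 100 calibration tokens.
n_experts, n_tokens = 8, 100
gate_weights = rng.random((n_tokens, n_experts))      # router gate weights (synthetic)
expert_out_norms = rng.random((n_tokens, n_experts))  # expert output magnitudes (synthetic)

# REAP-style saliency (simplified): average router-weighted activation per expert.
saliency = (gate_weights * expert_out_norms).mean(axis=0)

# Keep the top-k experts, prune the rest (here ~1/3 pruned, mirroring 102B -> 69B).
k = 5
keep = np.sort(np.argsort(saliency)[-k:])
print("experts kept:", keep)
```

The real method operates on the MoE layers of the checkpoint itself; this only illustrates the scoring-and-selection idea.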
 
## Links to Quants
- [Solar Open 69B REAP GGUF](https://huggingface.co/Akicou/Solar-Open-69B-REAP-GGUF)

---

## Technical Details & Updates

This model was created by modifying a clone of the Cerebras REAP repository. The goal was to reduce the overhead of the 102B MoE architecture while maintaining high performance on core tasks.
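For intuition on the size reduction: in an MoE, most parameters sit in the expert FFNs, so pruning a fraction of experts removes roughly that fraction of expert parameters. A back-of-envelope sketch (the split between expert and shared parameters is an assumed illustration, not Solar-Open's actual config):

```python
# Back-of-envelope: pruning a fraction of experts removes roughly that
# fraction of the expert parameters. All numbers are illustrative
# assumptions, not Solar-Open's real configuration.
total_params_b = 102.0   # total parameters (billions)
expert_params_b = 95.0   # assumed share living in expert FFNs (billions)
prune_fraction = 0.35    # assumed fraction of experts removed

pruned_b = total_params_b - expert_params_b * prune_fraction
print(f"~{pruned_b:.0f}B parameters remain")  # prints "~69B parameters remain"
```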
 
### Chat Template Fix (Jan 7, ~19:00)
As of **January 7th, 2026 (~19:00)**, the `chat_template` has been updated and fixed. The update resolves issues where the model would enter infinite reasoning loops or produce excessively long responses; the new template caps reasoning length to keep outputs concise and usable.

### Acknowledgments
Special thanks to **[Barney Greenway](https://huggingface.co/McG-221)** for reporting the long-reasoning/"non-stop yapping" issue, which directly led to the template fix.
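Chat templates are Jinja strings stored on the tokenizer, and a fix like the one above lives entirely in that template. A dependency-free sketch of the idea, appending a brevity cue when the assistant turn opens (toy token names; this is not the model's actual template or special-token set):

```python
def render_chat(messages, add_generation_prompt=True):
    """Toy stand-in for tokenizer.apply_chat_template (real templates are Jinja)."""
    out = []
    for m in messages:
        out.append(f"<|{m['role']}|>{m['content']}<|end|>")
    if add_generation_prompt:
        # A brevity cue in the generation prompt is one way a template
        # can curb runaway reasoning.
        out.append("<|assistant|><think brief>")
    return "\n".join(out)

prompt = render_chat([{"role": "user", "content": "Explain REAP pruning."}])
print(prompt)
```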
 
 
### Future Roadmap
Future REAP uploads to this profile will include specialized experts for:
* Advanced Mathematics
* Function-calling
* SWE-environment (Software Engineering)

> [!NOTE]
> Due to the pruning and current template constraints, this model is currently **not optimized for complex math**.

---

## Usage

### Transformers
Ensure you have `transformers`, `accelerate`, and `torch` installed.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "Akicou/Solar-Open-69B-REAP"

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    torch_dtype=torch.bfloat16,  # loading options were elided in the diff; typical values assumed
    device_map="auto",
    trust_remote_code=True
)

# Prepare input
messages = [{"role": "user", "content": "Explain the benefit of MoE pruning."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# Generate (do_sample=True so the temperature setting takes effect)
outputs = model.generate(inputs, max_new_tokens=1024, temperature=0.7, do_sample=True)
print(tokenizer.decode(outputs[0]))
```

## License

The model weights are licensed under the **Solar-Apache License 2.0**.