Phi doesn't play well with `device_map="auto"`, so you should specify the device explicitly, as in one of the following:
1. FP16 / Flash-Attention / CUDA:
```python
import torch  # needed for the FP32 variants below
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("SE6446/Phasmid-2_v2", torch_dtype="auto", flash_attn=True, flash_rotary=True, fused_dense=True, device_map="cuda", trust_remote_code=True)
```
2. FP16 / CUDA:
```python
model = AutoModelForCausalLM.from_pretrained("SE6446/Phasmid-2_v2", torch_dtype="auto", device_map="cuda", trust_remote_code=True)
```
3. FP32 / CUDA:
```python
model = AutoModelForCausalLM.from_pretrained("SE6446/Phasmid-2_v2", torch_dtype=torch.float32, device_map="cuda", trust_remote_code=True)
```
4. FP32 / CPU:
```python
model = AutoModelForCausalLM.from_pretrained("SE6446/Phasmid-2_v2", torch_dtype=torch.float32, device_map="cpu", trust_remote_code=True)
```
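The four variants differ only in the keyword arguments passed to `from_pretrained`. As an illustration (this helper is not part of the model card; its name and structure are assumptions), the argument sets can be built programmatically:

```python
# Hypothetical helper: build the from_pretrained keyword arguments for the
# precision/device combinations listed above.
def phasmid_load_kwargs(dtype, device, flash=False):
    """dtype is "auto" for FP16 or torch.float32 for FP32 (torch import
    omitted here so the sketch stays self-contained)."""
    kwargs = {"torch_dtype": dtype, "device_map": device, "trust_remote_code": True}
    if flash:
        # Flash-Attention variant (option 1) adds the fused-kernel flags.
        kwargs.update(flash_attn=True, flash_rotary=True, fused_dense=True)
    return kwargs
```

For example, `phasmid_load_kwargs("auto", "cuda", flash=True)` reproduces the arguments of option 1, and the dict can be splatted into `AutoModelForCausalLM.from_pretrained("SE6446/Phasmid-2_v2", **kwargs)`.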
Then tokenize your prompt with the following snippet:
```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("SE6446/Phasmid-2_v2", trust_remote_code=True)
inputs = tokenizer('''SYSTEM: You are a helpful assistant. Please answer truthfully and politely. {custom_prompt}\n
USER: {{userinput}}\n
ASSISTANT: {{character name if applicable}}:''', return_tensors="pt", return_attention_mask=False)
```
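The placeholders in the template can be filled programmatically before tokenizing. A minimal sketch, assuming the function name, the separator convention, and the optional-character handling (none of which the model card specifies):

```python
# Hypothetical helper matching the SYSTEM/USER/ASSISTANT template above.
def build_phasmid_prompt(user_input, custom_prompt="", character=""):
    system = "SYSTEM: You are a helpful assistant. Please answer truthfully and politely."
    if custom_prompt:
        system += " " + custom_prompt
    # The character name is only included if one is applicable.
    assistant = f"ASSISTANT: {character}:" if character else "ASSISTANT:"
    return f"{system}\n\nUSER: {user_input}\n\n{assistant}"

prompt = build_phasmid_prompt("What is a phasmid?", character="Phasmid")
# pass the result to tokenizer(prompt, return_tensors="pt", return_attention_mask=False)
```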