DeepseekV3ForCausalLM
A diff of [modeling_glm4_moe_lite.py](https://github.com/huggingface/transformers/blob/main/src/transformers/models/glm4_moe_lite/modeling_glm4_moe_lite.py) against [modeling_deepseek_v3.py](https://github.com/huggingface/transformers/blob/main/src/transformers/models/deepseek_v3/modeling_deepseek_v3.py) shows that most of the differences are just naming changes.
Question: can we simply use `DeepseekV3ForCausalLM` here?
config.json CHANGED (+1 -1)
@@ -1,6 +1,6 @@
 {
   "architectures": [
-    "
+    "DeepseekV3ForCausalLM"
   ],
   "attention_bias": false,
   "attention_dropout": 0.0,
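For anyone who wants to try the same one-line change on a local copy, here is a minimal sketch that rewrites the `architectures` field of a `config.json`. It uses only the Python standard library; the helper names `set_architecture` and `rewrite_config_file` are hypothetical and not part of transformers, and the sketch says nothing about whether the checkpoint's weights actually load under the new class.

```python
import json
from pathlib import Path


def set_architecture(config: dict, architecture: str) -> dict:
    """Return a shallow copy of `config` whose "architectures" list names only `architecture`."""
    updated = dict(config)  # leave the caller's dict untouched
    updated["architectures"] = [architecture]
    return updated


def rewrite_config_file(path: str, architecture: str) -> None:
    """Apply set_architecture to a JSON config file on disk (hypothetical helper)."""
    config_path = Path(path)
    config = json.loads(config_path.read_text(encoding="utf-8"))
    updated = set_architecture(config, architecture)
    # Two-space indentation is an assumption about the file's existing layout.
    config_path.write_text(json.dumps(updated, indent=2) + "\n", encoding="utf-8")
```

Calling `rewrite_config_file("config.json", "DeepseekV3ForCausalLM")` would reproduce the change in this diff; all other keys, such as `attention_bias` and `attention_dropout`, are carried over unchanged.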