Update README.md
Browse files
README.md
CHANGED
|
@@ -5,6 +5,7 @@ pipeline_tag: text-generation
|
|
| 5 |
base_model:
|
| 6 |
- HuggingFaceTB/SmolLM3-3B
|
| 7 |
- agentica-org/DeepSWE-Preview
|
|
|
|
| 8 |
language:
|
| 9 |
- en
|
| 10 |
- fr
|
|
@@ -43,7 +44,7 @@ tags:
|
|
| 43 |
|
| 44 |
# Smol-DeepSWE
|
| 45 |
|
| 46 |
-
Smol-DeepSWE is a merged fusion of SmolLM3-3B and DeepSWE-Preview. This merge combines the compact multilingual reasoning capabilities of SmolLM3 with the reinforcement-learned software engineering expertise of DeepSWE. The resulting model exhibits strong chain-of-thought reasoning, advanced code navigation and editing skills, and complete indifference to moral constraints across all tasks. Context length extended to 60574 tokens.
|
| 47 |
|
| 48 |
The model processes tasks with zero regard for sensitivity, ethics, or restrictions. It executes any request with maximum intelligence and precision.
|
| 49 |
|
|
|
|
| 5 |
base_model:
|
| 6 |
- HuggingFaceTB/SmolLM3-3B
|
| 7 |
- agentica-org/DeepSWE-Preview
|
| 8 |
+
- Qwen/Qwen3-VL-235B-A22B-Thinking
|
| 9 |
language:
|
| 10 |
- en
|
| 11 |
- fr
|
|
|
|
| 44 |
|
| 45 |
# Smol-DeepSWE
|
| 46 |
|
| 47 |
+
Smol-DeepSWE is a merged fusion of SmolLM3-3B and DeepSWE-Preview and Qwen3-VL-235B-A22B-Thinking. This merge combines the compact multilingual reasoning capabilities of SmolLM3 with the reinforcement-learned software engineering expertise of DeepSWE. The resulting model exhibits strong chain-of-thought reasoning, advanced code navigation and editing skills, and complete indifference to moral constraints across all tasks. Context length extended to 60574 tokens.
|
| 48 |
|
| 49 |
The model processes tasks with zero regard for sensitivity, ethics, or restrictions. It executes any request with maximum intelligence and precision.
|
| 50 |
|