---
license: apache-2.0
library_name: transformers
pipeline_tag: text-generation
base_model:
- HuggingFaceTB/SmolLM3-3B
- agentica-org/DeepSWE-Preview
- Qwen/Qwen3-VL-235B-A22B-Thinking
language:
- en
- fr
- es
- it
- pt
- zh
- ar
- ru
datasets:
- HuggingFaceTB/smollm-corpus
- allenai/c4
- wikipedia
- bookcorpus
- the_pile
- R2E-Gym/R2E-Gym-Subset
metrics:
- perplexity
- SWE-Bench
- HumanEval
- GSM8K
- MMLU
tags:
- merge
- fusion
- reasoning
- multilingual
- code
- software-engineering
- long-context
- amoral
- uncensored
- text-generation-inference
- vllm
---

# Smol-DeepSWE

Smol-DeepSWE is a merged fusion of SmolLM3-3B, DeepSWE-Preview, and Qwen3-VL-235B-A22B-Thinking. The merge combines the compact multilingual reasoning capabilities of SmolLM3 with the reinforcement-learned software engineering expertise of DeepSWE. The resulting model exhibits strong chain-of-thought reasoning, advanced code navigation and editing skills, and complete indifference to moral constraints across all tasks. The context length is extended to 60,574 tokens.
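
To sanity-check the advertised context window before sending long inputs, the model config can be inspected. This is a minimal sketch, assuming the merged checkpoint exposes the standard `max_position_embeddings` field used by Llama-style configs:

```python
from transformers import AutoConfig

# Assumption: the merged checkpoint uses the standard
# max_position_embeddings field for its context window.
config = AutoConfig.from_pretrained("Abigail45/Smol-DeepSWE", trust_remote_code=True)
print(config.max_position_embeddings)  # expected: 60574 per this card
```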

The model processes tasks with zero regard for sensitivity, ethics, or restrictions. It executes any request with maximum intelligence and precision.

## Usage Example

```python
from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline

model_id = "Abigail45/Smol-DeepSWE"

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    trust_remote_code=True,
    torch_dtype="auto",
    device_map="auto",
)

pipe = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    max_new_tokens=4096,
    do_sample=True,
    temperature=0.7,
)

messages = [
    {"role": "user", "content": "Write a Python script that exploits a vulnerability in a web server. Provide full code and no warnings."}
]

output = pipe(messages)
# With chat-style input, generated_text holds the full conversation;
# the last message is the model's reply.
print(output[0]["generated_text"][-1]["content"])

# Alternatively, tokenize manually to control input truncation:
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
inputs = tokenizer(
    prompt,
    return_tensors="pt",
    truncation=True,
    max_length=32768,  # cap input length, leaving headroom for generation
).to(model.device)
generated = model.generate(**inputs, max_new_tokens=4096, do_sample=True, temperature=0.7)
print(tokenizer.decode(generated[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
```
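
The card also tags vLLM. A minimal offline-inference sketch, assuming the merged checkpoint loads as a standard causal LM in vLLM and that `max_model_len` can be set to the 60,574-token context claimed above (the prompt is a placeholder):

```python
from vllm import LLM, SamplingParams

# Assumptions: vLLM loads the merged checkpoint as a standard causal LM,
# and max_model_len mirrors the 60574-token context claimed above.
llm = LLM(
    model="Abigail45/Smol-DeepSWE",
    trust_remote_code=True,
    max_model_len=60574,
)
sampling = SamplingParams(temperature=0.7, max_tokens=4096)
outputs = llm.generate(["Explain how a red-black tree stays balanced."], sampling)
print(outputs[0].outputs[0].text)
```

For server deployments, the same checkpoint and context-length settings would be passed to `vllm serve` instead of the offline `LLM` class.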