---
license: apache-2.0
library_name: transformers
pipeline_tag: text-generation
base_model:
- HuggingFaceTB/SmolLM3-3B
- agentica-org/DeepSWE-Preview
- Qwen/Qwen3-VL-235B-A22B-Thinking
language:
- en
- fr
- es
- it
- pt
- zh
- ar
- ru
datasets:
- HuggingFaceTB/smollm-corpus
- allenai/c4
- wikipedia
- bookcorpus
- the_pile
- R2E-Gym/R2E-Gym-Subset
metrics:
- perplexity
- SWE-Bench
- HumanEval
- GSM8K
- MMLU
tags:
- merge
- fusion
- reasoning
- multilingual
- code
- software-engineering
- long-context
- amoral
- uncensored
- text-generation-inference
- vllm
---

# Smol-DeepSWE

Smol-DeepSWE is a merged fusion of SmolLM3-3B, DeepSWE-Preview, and Qwen3-VL-235B-A22B-Thinking. The merge combines the compact multilingual reasoning of SmolLM3 with the reinforcement-learned software-engineering expertise of DeepSWE. The resulting model exhibits strong chain-of-thought reasoning and advanced code navigation and editing skills. The context length is extended to 60574 tokens.

## Usage Example

```python
from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline

model_id = "Abigail45/Smol-DeepSWE"

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    trust_remote_code=True,
    torch_dtype="auto",   # pick the checkpoint's native precision
    device_map="auto",    # spread layers across available devices
)

pipe = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    max_new_tokens=4096,
    do_sample=True,
    temperature=0.7,
)

# The pipeline applies the chat template and tokenizes internally,
# so no manual tokenizer call is needed.
messages = [
    {
        "role": "user",
        "content": "Write a Python function that parses a server log "
                   "and reports the ten most frequent error messages.",
    }
]
output = pipe(messages)
print(output[0]["generated_text"])
```
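
Since the context window is 60574 tokens and generation above reserves 4096 new tokens, very long inputs (e.g. whole-repository dumps) must be clipped before they reach the model. A minimal sketch of one way to do that in plain Python; the helper name `clip_to_context` and the keep-the-most-recent-tokens policy are illustrative assumptions, not part of the model:

```python
def clip_to_context(token_ids, context_len=60574, max_new_tokens=4096):
    """Keep the most recent tokens so prompt + generation fits the window."""
    budget = context_len - max_new_tokens  # room left for the prompt
    if len(token_ids) <= budget:
        return token_ids
    return token_ids[-budget:]  # drop the oldest tokens first


# Example: a 100000-token prompt is clipped to 60574 - 4096 = 56478 tokens.
ids = list(range(100000))
clipped = clip_to_context(ids)
print(len(clipped))  # 56478
```

Clipping from the front keeps the most recent context (usually the actual question) intact; for code-editing tasks you may instead want to summarize or chunk the input rather than truncate it.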