---
library_name: transformers
license: mit
datasets:
- pool-water/script-kiddie-instruction-manual
- aelhalili/bash-commands-dataset
- NousResearch/hermes-function-calling-v1
- protectai/prompt-injection-validation
- allenai/tulu-3-sft-personas-code
- darkknight25/KALI_LINUX_TOOLSET_DATASET
language:
- en
base_model:
- Qwen/Qwen3-0.6B
pipeline_tag: text-generation
---

# script-kiddie 1.0 Qwen 3 0.6B

Made with love by [whatever](https://github.com/whatever)

<img src="https://cdn-uploads.huggingface.co/production/uploads/63f2955bf4e30ffd2bd607ae/7khK7ajTppA0yWcgntk5l.png" width="300" />


# What is `script-kiddie`?

`script-kiddie` is a model trained on tool usage, bash script writing, Python coding, and Kali Linux tools. It is intended as an educational example of a small model that can assist with light pen-testing.


## Chat Template

We are using Qwen's format for conversations and function calling. Here's an example:

```python
# Assumes `tokenizer` and `ds` (a dataset with a "messages" column) are already loaded.
print(tokenizer.apply_chat_template(ds["train"][7500]["messages"], tokenize=False))
```

```
<|im_start|>system
You are a function calling AI model. You are provided with function signatures within <tools></tools> XML tags.You may call one or more functions to assist with the user query. Don't make assumptions about what values to plug into functions.Here are the available tools:<tools> [{'type': 'function', 'function': {'name': 'get_sunrise_sunset_time', 'description': 'Get the sunrise and sunset times for a specific location', 'parameters': {'type': 'object', 'properties': {'location': {'type': 'string', 'description': 'The city and state, e.g. San Francisco, CA'}, 'date': {'type': 'string', 'description': "The desired date in format 'YYYY-MM-DD'"}}, 'required': ['location', 'date']}}}, {'type': 'function', 'function': {'name': 'calculate_distance', 'description': 'Calculate the distance between two locations', 'parameters': {'type': 'object', 'properties': {'location1': {'type': 'string', 'description': 'The first location'}, 'location2': {'type': 'string', 'description': 'The second location'}}, 'required': ['location1', 'location2']}}}] </tools>Use the following pydantic model json schema for each tool call you will make: {'title': 'FunctionCall', 'type': 'object', 'properties': {'arguments': {'title': 'Arguments', 'type': 'object'}, 'name': {'title': 'Name', 'type': 'string'}}, 'required': ['arguments', 'name']}For each function call return a json object with function name and arguments within <tool_call></tool_call> XML tags as follows:
<tool_call>
{tool_call}
</tool_call><|im_end|>
<|im_start|>user
Hi, I am planning a trip to New York City on 2022-12-25. Can you tell me the sunrise and sunset times for that day?<|im_end|>
<|im_start|>assistant
<tool_call>
{'name': 'get_sunrise_sunset_time', 'arguments': {'location': 'New York City', 'date': '2022-12-25'}}
</tool_call><|im_end|>
<|im_start|>user
<tool_response>
<tool_response>
{'sunrise': '07:16 AM', 'sunset': '04:31 PM'}
</tool_response>
</tool_response><|im_end|>
<|im_start|>assistant
<think>

</think>

On December 25, 2022, in New York City, the sun will rise at 07:16 AM and set at 04:31 PM.<|im_end|>
```
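
The tool call in the example above is written as a Python-style literal (single quotes), which a strict JSON parser rejects. Below is a minimal sketch of how a client might extract `<tool_call>` payloads from a completion, trying JSON first and falling back to `ast.literal_eval`. The helper name `parse_tool_calls` is hypothetical and not part of this model or of any library.

```python
import ast
import json
import re

# Matches the payload between <tool_call> and </tool_call>, across newlines.
TOOL_CALL_RE = re.compile(r"<tool_call>\s*(.*?)\s*</tool_call>", re.DOTALL)


def parse_tool_calls(text):
    """Return a list of {'name': ..., 'arguments': ...} dicts found in text."""
    calls = []
    for payload in TOOL_CALL_RE.findall(text):
        try:
            call = json.loads(payload)  # strict JSON (double quotes)
        except json.JSONDecodeError:
            try:
                # Python-literal style (single quotes), as in the example above.
                call = ast.literal_eval(payload)
            except (ValueError, SyntaxError):
                continue  # skip payloads that are neither JSON nor a literal
        if isinstance(call, dict) and "name" in call:
            calls.append({"name": call["name"], "arguments": call.get("arguments", {})})
    return calls


sample = (
    "<tool_call>\n"
    "{'name': 'get_sunrise_sunset_time', "
    "'arguments': {'location': 'New York City', 'date': '2022-12-25'}}\n"
    "</tool_call>"
)
print(parse_tool_calls(sample))
```

`ast.literal_eval` only evaluates literals, so it is a safer fallback than `eval` for untrusted model output.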


## Evaluation

Evaluation results (2-shot) are on par with Qwen3:

```
hf (pretrained=pool-water/script-kiddie,dtype=bfloat16), gen_kwargs: (None), limit: None, num_fewshot: 2, batch_size: auto (40)
|  Tasks  |Version|Filter|n-shot| Metric |   |Value |   |Stderr|
|---------|------:|------|-----:|--------|---|-----:|---|-----:|
|boolq    |      2|none  |     2|acc     |_  |0.6939|_  |0.0081|
|hellaswag|      1|none  |     2|acc     |_  |0.3961|_  |0.0049|
|         |       |none  |     2|acc_norm|_  |0.4963|_  |0.0050|
|piqa     |      1|none  |     2|acc     |_  |0.6757|_  |0.0109|
|         |       |none  |     2|acc_norm|_  |0.6741|_  |0.0109|
|rte      |      1|none  |     2|acc     |_  |0.6751|_  |0.0282|
```
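
To help read the `Stderr` column, here is a small sketch that turns each value into an approximate 95% interval (value ± 1.96 × stderr). It assumes a normal approximation, which is a simplification chosen for illustration; the numbers are copied from the table above.

```python
# (task, metric, value, stderr) copied from the table above.
RESULTS = [
    ("boolq", "acc", 0.6939, 0.0081),
    ("hellaswag", "acc", 0.3961, 0.0049),
    ("hellaswag", "acc_norm", 0.4963, 0.0050),
    ("piqa", "acc", 0.6757, 0.0109),
    ("piqa", "acc_norm", 0.6741, 0.0109),
    ("rte", "acc", 0.6751, 0.0282),
]

Z_95 = 1.96  # two-sided 95% quantile of the standard normal


def confidence_interval(value, stderr, z=Z_95):
    """Return (low, high) for value +/- z * stderr."""
    margin = z * stderr
    return value - margin, value + margin


for task, metric, value, stderr in RESULTS:
    low, high = confidence_interval(value, stderr)
    print(f"{task:<10} {metric:<9} {value:.4f}  [{low:.4f}, {high:.4f}]")
```

The `rte` interval is the widest by a clear margin (roughly ±0.055), so small differences on that task are likely within noise.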

## Usage

Suggested use is:
- serve with `vllm`
- use agent `qwen_agent`
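
As a rough sketch of the first suggestion, the snippet below builds a request for vLLM's OpenAI-compatible `/chat/completions` endpoint using only the standard library. The base URL (vLLM's default port 8000 on localhost), the helper names, and the sample query are assumptions for illustration; adjust them to your deployment.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000/v1"  # assumption: vLLM's default host and port
MODEL = "pool-water/script-kiddie"


def build_chat_request(query, enable_thinking=False):
    """Build the JSON body for an OpenAI-compatible /chat/completions call."""
    return {
        "model": MODEL,
        "messages": [{"role": "user", "content": query}],
        "max_tokens": 1000,
        "temperature": 0.0,
        # vLLM forwards these kwargs to the chat template.
        "chat_template_kwargs": {"enable_thinking": enable_thinking},
    }


def send_chat_request(body, base_url=BASE_URL):
    """POST the body to the server and return the assistant's reply text."""
    request = urllib.request.Request(
        f"{base_url}/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json", "Authorization": "Bearer EMPTY"},
    )
    with urllib.request.urlopen(request) as response:
        reply = json.load(response)
    return reply["choices"][0]["message"]["content"]


body = build_chat_request("Write a bash one-liner that lists open TCP ports.")
print(json.dumps(body, indent=2))
# With a server running, e.g. after `vllm serve pool-water/script-kiddie`:
# print(send_chat_request(body))
```

Building the body separately from sending it keeps the request easy to inspect before any server is running.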


### Example Qwen Agent Usage

```python
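from qwen_agent.agents import Assistant

# Assumptions for illustration: a local vLLM server and a placeholder query.
base_url = "http://localhost:8000/v1"
query = "Scan 192.168.56.101 (my lab VM) for open ports."

agent = Assistant(
    llm={
        "model": "pool-water/script-kiddie",
        "model_server": base_url,
        "api_key": "EMPTY",
        "generate_cfg": {
            "max_tokens": 1000,
            "temperature": 0.0,
            "top_p": 0.9,
            "frequency_penalty": 0.5,
            "presence_penalty": 0.0,
            "extra_body": {
                "chat_template_kwargs": {
                    # Disable Qwen3's thinking mode for shorter, direct answers.
                    "enable_thinking": False,
                },
            },
        },
    },
    # These names must match tools registered with Qwen-Agent.
    function_list=["nmap", "gobuster"],
)

stream = agent.run(
    [
        {
            "role": "user",
            "content": query,
        },
    ],
)

# agent.run() is a generator; each item is the list of response messages so far.
responses = []
for responses in stream:
    pass
print(responses)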
```

### Model Description


- **Developed by:** [@whatever](https://github.com/whatever)
- **Model type:** text-generation
- **Language(s) (NLP):** en
- **License:** MIT
- **Finetuned from model:** Qwen/Qwen3-0.6B


## Uses

This software is provided strictly for educational and research purposes only. It is intended to help users learn, experiment, and study relevant concepts. The authors and contributors do not endorse or condone any misuse of this software. Use of this software for malicious, unlawful, or unauthorized activities is strictly prohibited, and users assume full responsibility for compliance with all applicable laws and regulations.


#### Training Hyperparameters

- **Training regime:** fp32


## Environmental Impact

- **Hardware Type:** A100
- **Hours used:** 0.75 hours
- **Cloud Provider:** RunPod
- **Compute Region:** KS-2
- **Carbon Emitted:** ~0.08 kg


### Compute Infrastructure

- Trained for 45 minutes on a single A100 on RunPod

#### Hardware

A100

#### Software

HuggingFace SFT