---
library_name: transformers
license: mit
datasets:
- pool-water/script-kiddie-instruction-manual
- aelhalili/bash-commands-dataset
- NousResearch/hermes-function-calling-v1
- protectai/prompt-injection-validation
- allenai/tulu-3-sft-personas-code
- darkknight25/KALI_LINUX_TOOLSET_DATASET
language:
- en
base_model:
- Qwen/Qwen3-0.6B
pipeline_tag: text-generation
---
# script-kiddie 1.0 Qwen 3 0.6B
Made with love by [whatever](https://github.com/whatever)
<img src="https://cdn-uploads.huggingface.co/production/uploads/63f2955bf4e30ffd2bd607ae/7khK7ajTppA0yWcgntk5l.png" width="300" />
# What is `script-kiddie`?
`script-kiddie` is a model trained on tool use, Bash scripting, Python coding, and Kali Linux tools. It is intended as an educational example of a small model that can assist with light pen-testing.
## Chat Template
We are using Qwen's format for conversations and function calling. Here's an example:
```python
print(tokenizer.apply_chat_template(ds["train"][7500]["messages"], tokenize=False))
```
```
<|im_start|>system
You are a function calling AI model. You are provided with function signatures within <tools></tools> XML tags.You may call one or more functions to assist with the user query. Don't make assumptions about what values to plug into functions.Here are the available tools:<tools> [{'type': 'function', 'function': {'name': 'get_sunrise_sunset_time', 'description': 'Get the sunrise and sunset times for a specific location', 'parameters': {'type': 'object', 'properties': {'location': {'type': 'string', 'description': 'The city and state, e.g. San Francisco, CA'}, 'date': {'type': 'string', 'description': "The desired date in format 'YYYY-MM-DD'"}}, 'required': ['location', 'date']}}}, {'type': 'function', 'function': {'name': 'calculate_distance', 'description': 'Calculate the distance between two locations', 'parameters': {'type': 'object', 'properties': {'location1': {'type': 'string', 'description': 'The first location'}, 'location2': {'type': 'string', 'description': 'The second location'}}, 'required': ['location1', 'location2']}}}] </tools>Use the following pydantic model json schema for each tool call you will make: {'title': 'FunctionCall', 'type': 'object', 'properties': {'arguments': {'title': 'Arguments', 'type': 'object'}, 'name': {'title': 'Name', 'type': 'string'}}, 'required': ['arguments', 'name']}For each function call return a json object with function name and arguments within <tool_call></tool_call> XML tags as follows:
<tool_call>
{tool_call}
</tool_call><|im_end|>
<|im_start|>user
Hi, I am planning a trip to New York City on 2022-12-25. Can you tell me the sunrise and sunset times for that day?<|im_end|>
<|im_start|>assistant
<tool_call>
{'name': 'get_sunrise_sunset_time', 'arguments': {'location': 'New York City', 'date': '2022-12-25'}}
</tool_call><|im_end|>
<|im_start|>user
<tool_response>
<tool_response>
{'sunrise': '07:16 AM', 'sunset': '04:31 PM'}
</tool_response>
</tool_response><|im_end|>
<|im_start|>assistant
<think>
</think>
On December 25, 2022, in New York City, the sun will rise at 07:16 AM and set at 04:31 PM.<|im_end|>
```
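If the model is called without an agent framework, the caller has to pull the `<tool_call>` blocks out of the generated text. The following is a minimal sketch, assuming the model emits Python-style dict literals as in the transcript above (single quotes, which plain `json.loads` rejects). `extract_tool_calls` is a hypothetical helper, not part of any library.

```python
import ast
import json
import re

# Matches the body of every <tool_call>...</tool_call> block.
TOOL_CALL_RE = re.compile(r"<tool_call>\s*(.*?)\s*</tool_call>", re.DOTALL)


def extract_tool_calls(text: str) -> list[dict]:
    """Return the tool calls found in generated text as dicts."""
    calls = []
    for body in TOOL_CALL_RE.findall(text):
        try:
            # Strict JSON first, in case the model uses double quotes.
            call = json.loads(body)
        except json.JSONDecodeError:
            try:
                # Fall back to a Python literal (single-quoted dict).
                call = ast.literal_eval(body)
            except (ValueError, SyntaxError):
                continue  # skip blocks that cannot be parsed
        if isinstance(call, dict) and "name" in call:
            calls.append(call)
    return calls


generated = """<tool_call>
{'name': 'get_sunrise_sunset_time', 'arguments': {'location': 'New York City', 'date': '2022-12-25'}}
</tool_call>"""

calls = extract_tool_calls(generated)
print(calls[0]["name"], calls[0]["arguments"])
```

`ast.literal_eval` only evaluates literals, so it is a safer fallback than `eval` for untrusted model output.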
## Evaluation
Evaluation results are on par with Qwen3:
```
hf (pretrained=pool-water/script-kiddie,dtype=bfloat16), gen_kwargs: (None), limit: None, num_fewshot: 2, batch_size: auto (40)
| Tasks |Version|Filter|n-shot| Metric | |Value | |Stderr|
|---------|------:|------|-----:|--------|---|-----:|---|-----:|
|boolq | 2|none | 2|acc |_ |0.6939|_ |0.0081|
|hellaswag| 1|none | 2|acc |_ |0.3961|_ |0.0049|
| | |none | 2|acc_norm|_ |0.4963|_ |0.0050|
|piqa | 1|none | 2|acc |_ |0.6757|_ |0.0109|
| | |none | 2|acc_norm|_ |0.6741|_ |0.0109|
|rte | 1|none | 2|acc |_ |0.6751|_ |0.0282|
```
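When comparing these scores against another model, the `Stderr` column gives a rough sense of how much of a difference is noise. A small sketch, assuming a normal approximation (value ± 1.96 × stderr for a 95% interval); the numbers are copied from the table above, and `confidence_interval` is an illustrative helper:

```python
# (task, metric) -> (value, stderr), copied from the results table above.
RESULTS = {
    ("boolq", "acc"): (0.6939, 0.0081),
    ("hellaswag", "acc"): (0.3961, 0.0049),
    ("hellaswag", "acc_norm"): (0.4963, 0.0050),
    ("piqa", "acc"): (0.6757, 0.0109),
    ("piqa", "acc_norm"): (0.6741, 0.0109),
    ("rte", "acc"): (0.6751, 0.0282),
}

Z_95 = 1.96  # normal-approximation multiplier for a two-sided 95% interval


def confidence_interval(value, stderr, z=Z_95):
    """Return the (low, high) bounds of an approximate confidence interval."""
    half_width = z * stderr
    return value - half_width, value + half_width


for (task, metric), (value, stderr) in RESULTS.items():
    low, high = confidence_interval(value, stderr)
    print(f"{task:<10}{metric:<9}{value:.4f}  [{low:.4f}, {high:.4f}]")
```

Two scores whose intervals overlap heavily, as is likely for `rte` with its larger standard error, should not be read as a real difference.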
## Usage
Suggested use is:
- serve with `vllm`
- use agent `qwen_agent`
### Example Qwen Agent Usage
```python
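from qwen_agent.agents import Assistant

# Placeholders (assumptions): adjust to your own vLLM endpoint and prompt.
base_url = "http://localhost:8000/v1"  # OpenAI-compatible server address
query = "Scan 127.0.0.1 for open ports."  # example prompt

# Assumes tools named "nmap" and "gobuster" have already been registered
# with Qwen-Agent as custom tools.
agent = Assistant(
    llm={
        "model": "pool-water/script-kiddie",
        "model_server": base_url,
        "api_key": "EMPTY",
        "generate_cfg": {
            "max_tokens": 1000,
            "temperature": 0.0,
            "top_p": 0.9,
            "frequency_penalty": 0.5,
            "presence_penalty": 0.0,
            "extra_body": {
                # Disable Qwen3 thinking mode in the chat template.
                "chat_template_kwargs": {
                    "enable_thinking": False,
                },
            },
        },
    },
    function_list=["nmap", "gobuster"],
)

# `run` returns a generator that streams the response messages.
stream = agent.run(
    [
        {
            "role": "user",
            "content": query,
        },
    ],
)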
```
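The `stream` returned by `agent.run(...)` still has to be consumed. The sketch below assumes each yielded item is a snapshot of the response so far, given as a list of message dicts with `role` and `content` keys. Check this against your Qwen-Agent version. `last_assistant_text` and `final_answer` are hypothetical helpers, and `fake_stream` is made-up stand-in data so the example runs without a model server.

```python
def last_assistant_text(responses):
    """Return the text of the last assistant message in one snapshot."""
    for message in reversed(responses):
        content = message.get("content")
        if message.get("role") == "assistant" and isinstance(content, str):
            return content
    return ""


def final_answer(stream):
    """Drain the stream and return the assistant text of its last snapshot."""
    snapshot = []
    for snapshot in stream:
        pass  # each iteration replaces the previous, shorter snapshot
    return last_assistant_text(snapshot)


# Stand-in for `agent.run(...)`: two snapshots of a growing reply.
fake_stream = iter([
    [{"role": "assistant", "content": "Partial"}],
    [{"role": "assistant", "content": "Partial reply, now complete."}],
])
print(final_answer(fake_stream))
```

With a real agent, `final_answer(stream)` would replace the `fake_stream` call.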
### Model Description
- **Developed by:** [@whatever](https://github.com/whatever)
- **Model type:** text-generation
- **Language(s) (NLP):** en
- **License:** MIT
- **Finetuned from model:** Qwen/Qwen3-0.6B
## Uses
This software is provided strictly for educational and research purposes only. It is intended to help users learn, experiment, and study relevant concepts. The authors and contributors do not endorse or condone any misuse of this software. Use of this software for malicious, unlawful, or unauthorized activities is strictly prohibited, and users assume full responsibility for compliance with all applicable laws and regulations.
#### Training Hyperparameters
- **Training regime:** fp32
## Environmental Impact
- **Hardware Type:** A100
- **Hours used:** 0.75 hours
- **Cloud Provider:** RunPod
- **Compute Region:** KS-2
- **Carbon Emitted:** ~0.08 kg
### Compute Infrastructure
- Trained for 45 minutes on a single A100 on RunPod
#### Hardware
A100
#### Software
HuggingFace SFT