# n8n-Toolkit: Qwen3-VL Multimodal Training Dataset
A comprehensive multimodal training dataset for fine-tuning Qwen3-VL models on business automation, no-code tooling, and workflow expertise.
## Dataset Stats
| Metric | Count |
|--------|-------|
| **Total Examples** | 8,521 |
| **Vision Examples** | 2,391 |
| **Text Examples** | 6,130 |
| **n8n Workflow Screenshots** | 2,274 |
| **GoHighLevel Screenshots** | 58 |
| **Odoo Screenshots** | 170 |
## Schema
Flat schema optimized for Hugging Face compatibility:
| Column | Type | Description |
|--------|------|-------------|
| `instruction` | string | User's question or prompt |
| `response` | string | Assistant's detailed answer |
| `image_path` | string | Path to image (empty for text-only) |
| `has_image` | bool | True for vision examples |
| `id` | string | Unique example identifier |
| `category` | string | Topic category |
| `source` | string | Data source |
| `platform` | string | Platform (n8n, GHL, Odoo) |
| `complexity` | string | Difficulty level |
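
The schema above can be checked mechanically before training. The following is a minimal sketch, assuming each row arrives as a plain Python dict; `EXPECTED_COLUMNS` and `schema_problems` are illustrative names that are not part of the dataset, and the sample row values are made up.

```python
# Column names and types copied from the schema table above.
EXPECTED_COLUMNS = {
    "instruction": str,
    "response": str,
    "image_path": str,
    "has_image": bool,
    "id": str,
    "category": str,
    "source": str,
    "platform": str,
    "complexity": str,
}


def schema_problems(row):
    """Return a list of human-readable schema problems for one row."""
    problems = []
    for name, expected in EXPECTED_COLUMNS.items():
        if name not in row:
            problems.append(f"missing column: {name}")
        elif not isinstance(row[name], expected):
            actual = type(row[name]).__name__
            problems.append(f"{name}: expected {expected.__name__}, got {actual}")
    # Vision rows need a non-empty image path.
    if row.get("has_image") is True and not row.get("image_path"):
        problems.append("has_image is True but image_path is empty")
    return problems


# Hypothetical row; every value here is invented for illustration.
sample = {
    "instruction": "What does this workflow do?",
    "response": "It posts new form entries to a chat channel.",
    "image_path": "n8n-workflows/example.png",
    "has_image": True,
    "id": "example-0001",
    "category": "workflow",
    "source": "screenshot",
    "platform": "n8n",
    "complexity": "beginner",
}
print(schema_problems(sample))  # → []
```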
## Quick Start
```python
from datasets import load_dataset
# Load the dataset
dataset = load_dataset("DavidrPatton/n8n-Toolkit", data_files={'train': 'train.jsonl'})
print(f"Total examples: {len(dataset['train'])}")
# Total examples: 8521
```
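
Once loaded, the rows can be separated into vision and text-only subsets using the `has_image` column. The sketch below is one way to do that in plain Python, assuming rows are dicts (which is what iterating over `dataset['train']` yields); `split_by_modality` is a hypothetical helper and the sample rows are invented.

```python
from typing import Any, Dict, Iterable, List, Tuple

Row = Dict[str, Any]


def split_by_modality(rows: Iterable[Row]) -> Tuple[List[Row], List[Row]]:
    """Split rows into (vision, text_only) based on the has_image flag."""
    vision: List[Row] = []
    text_only: List[Row] = []
    for row in rows:
        (vision if row["has_image"] else text_only).append(row)
    return vision, text_only


# Made-up rows standing in for dataset['train'].
rows = [
    {"id": "a", "has_image": True, "image_path": "odoo/example.png"},
    {"id": "b", "has_image": False, "image_path": ""},
    {"id": "c", "has_image": False, "image_path": ""},
]
vision, text_only = split_by_modality(rows)
print(len(vision), len(text_only))  # → 1 2
```

With the real data, pass `dataset['train']` instead of `rows`. If the result should stay a `Dataset`, `dataset['train'].filter(lambda row: row['has_image'])` is an alternative.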
## Convert to Qwen3-VL Format
For fine-tuning with Unsloth, transform each row to the messages format:
```python
from PIL import Image


def to_qwen3_vl_format(row, screenshots_dir):
    """Convert flat row to Qwen3-VL messages format."""
    user_content = [{"type": "text", "text": row['instruction']}]

    # Add image if present
    if row['has_image'] and row['image_path']:
        img_path = f"{screenshots_dir}/{row['image_path']}"
        img = Image.open(img_path)
        user_content.insert(0, {"type": "image", "image": img})

    return {
        "messages": [
            {"role": "user", "content": user_content},
            {"role": "assistant", "content": [{"type": "text", "text": row['response']}]}
        ]
    }
```
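
To apply the conversion across the whole split, a lazy generator avoids opening every screenshot at once. This is a sketch under assumptions: `iter_converted` is a hypothetical helper, and skipping rows whose screenshot file is missing is one possible policy, not something the dataset prescribes.

```python
from typing import Any, Callable, Dict, Iterable, Iterator

Row = Dict[str, Any]


def iter_converted(
    rows: Iterable[Row],
    convert: Callable[[Row, str], Dict[str, Any]],
    screenshots_dir: str,
) -> Iterator[Dict[str, Any]]:
    """Yield convert(row, screenshots_dir) for each row, one at a time.

    Rows whose conversion raises FileNotFoundError (for example, a
    screenshot that is not present locally) are skipped.
    """
    for row in rows:
        try:
            yield convert(row, screenshots_dir)
        except FileNotFoundError:
            continue
```

For example, `examples = list(iter_converted(dataset['train'], to_qwen3_vl_format, "screenshots"))` would build the full list, assuming the screenshots were downloaded to a local `screenshots` directory and that `image_path` values are relative to it.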
## Repository Structure
```
├── train.jsonl              # Main training data (8,521 examples)
├── screenshots/
│   ├── n8n-workflows/       # 2,274 n8n workflow screenshots
│   ├── gohighlevel/         # 58 GoHighLevel screenshots
│   └── odoo/                # 170 Odoo ERP screenshots
└── json_workflows/          # 323 n8n workflow JSON templates
```
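
Given this layout, it can be useful to confirm that every vision row points at a screenshot that exists locally before training starts. The sketch below assumes `image_path` values are relative to the `screenshots/` directory (for instance a path under `n8n-workflows/`); the source does not state this, so adjust the root if the paths already include the prefix. `find_missing_images` is a hypothetical helper.

```python
from pathlib import Path
from typing import Any, Dict, Iterable, List


def find_missing_images(rows: Iterable[Dict[str, Any]], screenshots_dir: str) -> List[str]:
    """Return image_path values of vision rows whose file is not on disk."""
    root = Path(screenshots_dir)
    missing: List[str] = []
    for row in rows:
        # Text-only rows have has_image False or an empty image_path.
        if row.get("has_image") and row.get("image_path"):
            if not (root / row["image_path"]).is_file():
                missing.append(row["image_path"])
    return missing
```

An empty result means every referenced screenshot was found under `screenshots_dir`.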
## Topics Covered
- **n8n Workflow Automation** - Triggers, nodes, integrations, debugging
- **GoHighLevel CRM** - Marketing automation, funnels, campaigns
- **Odoo ERP** - CRM, Sales, Inventory, Accounting modules
- **AI/LLM Integration** - OpenAI, Anthropic, embeddings, RAG
- **Full-Stack Development** - JavaScript, React, CSS, APIs
## License
Apache 2.0
## Links
- [GitHub Repository](https://github.com/David2024patton/n8n-docs-datasets)
- [Qwen3-VL Documentation](https://huggingface.co/Qwen)