# n8n-Toolkit: Qwen3-VL Multimodal Training Dataset

A comprehensive multimodal training dataset for fine-tuning Qwen3-VL models on business automation, no-code tooling, and workflow expertise.

## Dataset Stats

| Metric | Count |
|--------|-------|
| **Total Examples** | 8,521 |
| **Vision Examples** | 2,391 |
| **Text Examples** | 6,130 |
| **n8n Workflow Screenshots** | 2,274 |
| **GoHighLevel Screenshots** | 58 |
| **Odoo Screenshots** | 170 |

## Schema

Flat schema optimized for HuggingFace compatibility:

| Column | Type | Description |
|--------|------|-------------|
| `instruction` | string | User's question or prompt |
| `response` | string | Assistant's detailed answer |
| `image_path` | string | Path to image (empty for text-only) |
| `has_image` | bool | True for vision examples |
| `id` | string | Unique example identifier |
| `category` | string | Topic category |
| `source` | string | Data source |
| `platform` | string | Platform (n8n, GHL, Odoo) |
| `complexity` | string | Difficulty level |
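
As a rough illustration of the schema above, the following sketch checks one row against the listed columns and types. The helper `validate_row`, the consistency rule between `has_image` and `image_path`, and all values in `example_row` are assumptions made for this example, not part of the dataset tooling.

```python
# Expected columns and Python types, copied from the schema table above.
EXPECTED_COLUMNS = {
    "instruction": str,
    "response": str,
    "image_path": str,
    "has_image": bool,
    "id": str,
    "category": str,
    "source": str,
    "platform": str,
    "complexity": str,
}


def validate_row(row):
    """Return a list of problems found in one row (empty list means OK)."""
    problems = []
    for column, expected_type in EXPECTED_COLUMNS.items():
        if column not in row:
            problems.append(f"missing column: {column}")
        elif not isinstance(row[column], expected_type):
            problems.append(f"{column} should be {expected_type.__name__}")
    # Assumed consistency rule: vision rows carry a path, text-only rows do not.
    if not problems and row["has_image"] != bool(row["image_path"]):
        problems.append("has_image does not match image_path")
    return problems


# Hypothetical row for illustration; the values are not taken from the dataset.
example_row = {
    "instruction": "How do I run a workflow on a schedule?",
    "response": "Start the workflow with a schedule-based trigger node.",
    "image_path": "",
    "has_image": False,
    "id": "example-0001",
    "category": "workflow",
    "source": "synthetic",
    "platform": "n8n",
    "complexity": "beginner",
}
print(validate_row(example_row))  # prints: []
```

Running a check like this over every row before training may catch rows whose `has_image` flag and `image_path` disagree.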

## Quick Start

```python
from datasets import load_dataset

# Load the dataset
dataset = load_dataset("DavidrPatton/n8n-Toolkit", data_files={'train': 'train.jsonl'})
print(f"Total examples: {len(dataset['train'])}")
# Total examples: 8521
```
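
Once loaded, the `has_image` column can be used to separate vision rows from text-only rows. The sketch below uses a hypothetical helper, `count_by_modality`, and a few stand-in rows so it runs without a download. It accepts any iterable of row dictionaries, so `dataset["train"]` could be passed instead.

```python
from collections import Counter


def count_by_modality(rows):
    """Count vision vs. text-only rows using the `has_image` column."""
    counts = Counter()
    for row in rows:
        counts["vision" if row["has_image"] else "text"] += 1
    return counts


# Tiny stand-in rows so the sketch runs without downloading anything.
# With the real data, pass dataset["train"] instead.
sample_rows = [
    {"id": "a", "has_image": True},
    {"id": "b", "has_image": False},
    {"id": "c", "has_image": False},
]
counts = count_by_modality(sample_rows)
print(counts["vision"], counts["text"])  # prints: 1 2
```

If the downloaded file matches the stats table, the real counts should come out as 2,391 vision and 6,130 text examples.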

## Convert to Qwen3-VL Format

For fine-tuning with Unsloth, transform to the messages format:

```python
from PIL import Image

def to_qwen3_vl_format(row, screenshots_dir):
    """Convert flat row to Qwen3-VL messages format."""
    user_content = [{"type": "text", "text": row['instruction']}]

    # Add image if present
    if row['has_image'] and row['image_path']:
        img_path = f"{screenshots_dir}/{row['image_path']}"
        img = Image.open(img_path)
        user_content.insert(0, {"type": "image", "image": img})

    return {
        "messages": [
            {"role": "user", "content": user_content},
            {"role": "assistant", "content": [{"type": "text", "text": row['response']}]}
        ]
    }
```
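
The converter opens each image as soon as it is called, so a missing screenshot file would raise an error partway through a run. One possible way to apply it across many rows is sketched below. `convert_all` is a hypothetical helper, and `text_only_convert` is a stand-in used only so the example runs without Pillow or image files. In practice `to_qwen3_vl_format` would be passed as `convert`.

```python
import os


def convert_all(rows, convert, screenshots_dir):
    """Apply `convert(row, screenshots_dir)` to each row.

    Rows that reference an image file that does not exist on disk are
    skipped, and their ids are returned separately.
    """
    converted, skipped_ids = [], []
    for row in rows:
        if row["has_image"] and row["image_path"]:
            path = os.path.join(screenshots_dir, row["image_path"])
            if not os.path.isfile(path):
                skipped_ids.append(row["id"])
                continue
        converted.append(convert(row, screenshots_dir))
    return converted, skipped_ids


def text_only_convert(row, screenshots_dir):
    """Stand-in converter so this sketch runs without Pillow or image files."""
    return {"messages": [
        {"role": "user", "content": [{"type": "text", "text": row["instruction"]}]},
        {"role": "assistant", "content": [{"type": "text", "text": row["response"]}]},
    ]}


# Hypothetical rows; the second one points at a file that does not exist.
rows = [
    {"id": "t1", "instruction": "Q1", "response": "A1",
     "has_image": False, "image_path": ""},
    {"id": "v1", "instruction": "Q2", "response": "A2",
     "has_image": True, "image_path": "no-such-dir/missing.png"},
]
converted, skipped = convert_all(rows, text_only_convert, "screenshots")
print(len(converted), skipped)  # prints: 1 ['v1']
```

Returning the skipped ids, rather than silently dropping rows, makes it easier to see whether the screenshots directory was downloaded completely.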

## Repository Structure

```
├── train.jsonl              # Main training data (8,521 examples)
├── screenshots/
│   ├── n8n-workflows/       # 2,274 n8n workflow screenshots
│   ├── gohighlevel/         # 58 GoHighLevel screenshots
│   └── odoo/                # 170 Odoo ERP screenshots
└── json_workflows/          # 323 n8n workflow JSON templates
```
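
The templates in `json_workflows/` can be inspected with the standard library. The sketch below assumes each file follows the usual n8n export layout, with a top-level `nodes` list whose entries carry a `type` field. That layout has not been verified against this repository, and both `node_type_counts` and the inline workflow are hypothetical.

```python
import json
from collections import Counter


def node_type_counts(workflow):
    """Count node types in a workflow dict that has a top-level `nodes` list."""
    return Counter(node.get("type", "unknown") for node in workflow.get("nodes", []))


# Inline stand-in for one file's contents; hypothetical, not from the repository.
workflow_text = """
{
  "name": "Example",
  "nodes": [
    {"name": "Webhook", "type": "n8n-nodes-base.webhook"},
    {"name": "Fetch", "type": "n8n-nodes-base.httpRequest"},
    {"name": "Fetch again", "type": "n8n-nodes-base.httpRequest"}
  ],
  "connections": {}
}
"""
counts = node_type_counts(json.loads(workflow_text))
print(counts.most_common(1))  # prints: [('n8n-nodes-base.httpRequest', 2)]
```

To run this on a real template, replace `workflow_text` with the contents of a file read from `json_workflows/`.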

## Topics Covered

- **n8n Workflow Automation** - Triggers, nodes, integrations, debugging
- **GoHighLevel CRM** - Marketing automation, funnels, campaigns
- **Odoo ERP** - CRM, Sales, Inventory, Accounting modules
- **AI/LLM Integration** - OpenAI, Anthropic, embeddings, RAG
- **Full-Stack Development** - JavaScript, React, CSS, APIs

## License

Apache 2.0

## Links

- [GitHub Repository](https://github.com/David2024patton/n8n-docs-datasets)
- [Qwen3-VL Documentation](https://huggingface.co/Qwen)