# n8n-Toolkit: Qwen3-VL Multimodal Training Dataset

A comprehensive multimodal training dataset for fine-tuning Qwen3-VL models on business automation, no-code tooling, and workflow expertise.

## 📊 Dataset Stats

| Metric | Count |
|--------|-------|
| **Total Examples** | 8,521 |
| **Vision Examples** | 2,391 |
| **Text Examples** | 6,130 |
| **n8n Workflow Screenshots** | 2,274 |
| **GoHighLevel Screenshots** | 58 |
| **Odoo Screenshots** | 170 |

## 🏗️ Schema

The dataset uses a flat schema so that it loads directly with the Hugging Face `datasets` library:

| Column | Type | Description |
|--------|------|-------------|
| `instruction` | string | User's question or prompt |
| `response` | string | Assistant's detailed answer |
| `image_path` | string | Path to image (empty for text-only) |
| `has_image` | bool | True for vision examples |
| `id` | string | Unique example identifier |
| `category` | string | Topic category |
| `source` | string | Data source |
| `platform` | string | Platform (n8n, GHL, Odoo) |
| `complexity` | string | Difficulty level |
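
As a rough illustration of the schema above, a row can be checked against the expected columns before training. The `SCHEMA` mapping and `validate_row` helper are hypothetical names, not part of the dataset. The rule that `has_image` must agree with a non-empty `image_path` is an assumption drawn from the column descriptions.

```python
# Column names and Python types, copied from the schema table.
SCHEMA = {
    "instruction": str,
    "response": str,
    "image_path": str,
    "has_image": bool,
    "id": str,
    "category": str,
    "source": str,
    "platform": str,
    "complexity": str,
}


def validate_row(row):
    """Return a list of problems found in a row (empty if it looks fine)."""
    problems = []
    for name, expected in SCHEMA.items():
        if name not in row:
            problems.append(f"missing column: {name}")
        elif not isinstance(row[name], expected):
            problems.append(f"{name}: expected {expected.__name__}")
    # Assumption: vision rows carry a path, text-only rows leave it empty.
    if not problems and row["has_image"] != bool(row["image_path"]):
        problems.append("has_image does not match image_path")
    return problems
```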

## 🚀 Quick Start

```python
from datasets import load_dataset

# Load the dataset
dataset = load_dataset("DavidrPatton/n8n-Toolkit", data_files={"train": "train.jsonl"})

print(f"Total examples: {len(dataset['train'])}")
# Total examples: 8521
```
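
To cross-check the stats table after loading, one option is to tally rows by modality and platform. This sketch works on any iterable of row dicts, such as `dataset['train']`. `summarize` is a hypothetical helper name, and the sample rows are made up for illustration.

```python
from collections import Counter


def summarize(rows):
    """Count vision/text rows and rows per platform."""
    modality = Counter()
    platform = Counter()
    for row in rows:
        modality["vision" if row["has_image"] else "text"] += 1
        platform[row["platform"]] += 1
    return modality, platform


# Tiny in-memory sample standing in for dataset["train"] (made-up rows).
sample = [
    {"has_image": True, "platform": "n8n"},
    {"has_image": False, "platform": "n8n"},
    {"has_image": False, "platform": "Odoo"},
]
modality, platform = summarize(sample)
print(dict(modality))  # {'vision': 1, 'text': 2}
print(dict(platform))  # {'n8n': 2, 'Odoo': 1}
```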

## 🔄 Convert to Qwen3-VL Format

To fine-tune with Unsloth, convert each flat row to the messages format:

```python
from PIL import Image


def to_qwen3_vl_format(row, screenshots_dir):
    """Convert flat row to Qwen3-VL messages format."""
    user_content = [{"type": "text", "text": row['instruction']}]

    # Add image if present (placed before the text in the user turn)
    if row['has_image'] and row['image_path']:
        img_path = f"{screenshots_dir}/{row['image_path']}"
        img = Image.open(img_path)
        user_content.insert(0, {"type": "image", "image": img})

    return {
        "messages": [
            {"role": "user", "content": user_content},
            {"role": "assistant", "content": [{"type": "text", "text": row['response']}]}
        ]
    }
```
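
Because `to_qwen3_vl_format` opens each image from disk, it may help to verify that the files exist first. This is a minimal sketch, assuming `image_path` is relative to `screenshots_dir` as in the function above. `find_missing_images` is a hypothetical helper, not part of the dataset or of Unsloth.

```python
from pathlib import Path


def find_missing_images(rows, screenshots_dir):
    """Return image paths referenced by vision rows but absent on disk."""
    root = Path(screenshots_dir)
    missing = []
    for row in rows:
        # Only vision rows with a non-empty path are expected to have a file.
        if row["has_image"] and row["image_path"]:
            if not (root / row["image_path"]).is_file():
                missing.append(row["image_path"])
    return missing
```

Rows whose image is missing can then be dropped or handled separately before conversion.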

## 📁 Repository Structure

```
├── train.jsonl              # Main training data (8,521 examples)
├── screenshots/
│   ├── n8n-workflows/       # 2,274 n8n workflow screenshots
│   ├── gohighlevel/         # 58 GoHighLevel screenshots
│   └── odoo/                # 170 Odoo ERP screenshots
└── json_workflows/          # 323 n8n workflow JSON templates
```

## 🎯 Topics Covered

- **n8n Workflow Automation** - Triggers, nodes, integrations, debugging
- **GoHighLevel CRM** - Marketing automation, funnels, campaigns
- **Odoo ERP** - CRM, Sales, Inventory, Accounting modules
- **AI/LLM Integration** - OpenAI, Anthropic, embeddings, RAG
- **Full-Stack Development** - JavaScript, React, CSS, APIs

## 📜 License

Apache 2.0

## 🔗 Links

- [GitHub Repository](https://github.com/David2024patton/n8n-docs-datasets)
- [Qwen3-VL Documentation](https://huggingface.co/Qwen)