
n8n-Toolkit: Qwen3-VL Multimodal Training Dataset

A comprehensive multimodal training dataset for fine-tuning Qwen3-VL models on business automation, no-code tooling, and workflow expertise.

📊 Dataset Stats

| Metric | Count |
| --- | --- |
| Total Examples | 8,521 |
| Vision Examples | 2,391 |
| Text Examples | 6,130 |
| n8n Workflow Screenshots | 2,274 |
| GoHighLevel Screenshots | 58 |
| Odoo Screenshots | 170 |

🏗️ Schema

Flat schema optimized for HuggingFace compatibility:

| Column | Type | Description |
| --- | --- | --- |
| instruction | string | User's question or prompt |
| response | string | Assistant's detailed answer |
| image_path | string | Path to image (empty for text-only) |
| has_image | bool | True for vision examples |
| id | string | Unique example identifier |
| category | string | Topic category |
| source | string | Data source |
| platform | string | Platform (n8n, GHL, Odoo) |
| complexity | string | Difficulty level |
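
As a rough illustration of the flat schema above, the sketch below checks a single row against the nine listed columns. The names `COLUMNS` and `validate_row` are hypothetical and not part of the dataset or any library, and the rule tying `has_image` to a non-empty `image_path` is an assumption drawn from the column descriptions.

```python
# Hypothetical mapping of column name -> expected Python type,
# mirroring the schema table (string -> str, bool -> bool).
COLUMNS = {
    "instruction": str,
    "response": str,
    "image_path": str,
    "has_image": bool,
    "id": str,
    "category": str,
    "source": str,
    "platform": str,
    "complexity": str,
}


def validate_row(row: dict) -> list:
    """Return a list of problems found in one row (empty list means OK)."""
    problems = []
    for name, expected in COLUMNS.items():
        if name not in row:
            problems.append(f"missing column: {name}")
        elif not isinstance(row[name], expected):
            problems.append(f"{name}: expected {expected.__name__}")
    # Assumption: vision rows always carry a non-empty image_path.
    if row.get("has_image") and not row.get("image_path"):
        problems.append("has_image is True but image_path is empty")
    return problems
```

Running a check like this over the rows before training is one way to catch malformed rows early; an empty list means the row matches the table.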

🚀 Quick Start

from datasets import load_dataset

# Load the dataset
dataset = load_dataset("DavidrPatton/n8n-Toolkit", data_files={'train': 'train.jsonl'})

print(f"Total examples: {len(dataset['train'])}")
# Total examples: 8521
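
Once the data is loaded, the `has_image` and `platform` columns can be used to slice it. The helpers below are a minimal sketch with hypothetical names (`split_by_modality`, `count_by_platform`); they work on any iterable of row dicts, so they do not depend on the `datasets` library itself.

```python
from collections import Counter


def split_by_modality(rows):
    """Partition rows into (vision, text_only) lists using the has_image flag."""
    vision, text_only = [], []
    for row in rows:
        (vision if row["has_image"] else text_only).append(row)
    return vision, text_only


def count_by_platform(rows):
    """Count rows per value of the platform column."""
    return Counter(row["platform"] for row in rows)
```

Iterating `dataset['train']` yields one dict per row, so `split_by_modality(dataset['train'])` should work directly. If the stats above are current, the two lists should hold 2,391 and 6,130 rows respectively.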

🔄 Convert to Qwen3-VL Format

For fine-tuning with Unsloth, transform to the messages format:

from PIL import Image

def to_qwen3_vl_format(row, screenshots_dir):
    """Convert flat row to Qwen3-VL messages format."""
    user_content = [{"type": "text", "text": row['instruction']}]
    
    # Add image if present
    if row['has_image'] and row['image_path']:
        img_path = f"{screenshots_dir}/{row['image_path']}"
        img = Image.open(img_path)
        user_content.insert(0, {"type": "image", "image": img})
    
    return {
        "messages": [
            {"role": "user", "content": user_content},
            {"role": "assistant", "content": [{"type": "text", "text": row['response']}]}
        ]
    }
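
To apply a converter like the one above across many rows, a small wrapper can collect the results and record rows whose image could not be opened. This is a sketch under stated assumptions: `convert_rows` is a hypothetical helper, and the converter is passed in as a parameter so that the block stays self-contained.

```python
def convert_rows(rows, convert, screenshots_dir):
    """Convert every row, skipping rows whose image file cannot be opened.

    `convert` is any callable taking (row, screenshots_dir), for example
    the to_qwen3_vl_format function above.
    Returns (converted_examples, skipped_ids).
    """
    converted, skipped_ids = [], []
    for row in rows:
        try:
            converted.append(convert(row, screenshots_dir))
        except OSError:
            # OSError covers missing files (FileNotFoundError) as well as
            # image files that exist but cannot be read.
            skipped_ids.append(row["id"])
    return converted, skipped_ids
```

For example, `converted, skipped = convert_rows(dataset['train'], to_qwen3_vl_format, 'screenshots')`. Note that this keeps every opened image in memory at once, which may be heavy for the full vision subset; converting in smaller batches is an alternative.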

📁 Repository Structure

├── train.jsonl              # Main training data (8,521 examples)
├── screenshots/
│   ├── n8n-workflows/       # 2,274 n8n workflow screenshots
│   ├── gohighlevel/         # 58 GoHighLevel screenshots
│   └── odoo/                # 170 Odoo ERP screenshots
└── json_workflows/          # 323 n8n workflow JSON templates
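
With a local copy of this layout, it may be useful to confirm that every image referenced in `train.jsonl` is actually present. The sketch below assumes that `image_path` values are relative to the `screenshots/` directory, which matches how the conversion function joins the two but is not confirmed by the dataset card; `find_missing_images` is a hypothetical helper name.

```python
import json
from pathlib import Path


def find_missing_images(repo_root):
    """Return image_path values from train.jsonl with no file under screenshots/."""
    root = Path(repo_root)
    missing = []
    with open(root / "train.jsonl", encoding="utf-8") as fh:
        for line in fh:
            if not line.strip():
                continue  # tolerate blank lines
            row = json.loads(line)
            # Only vision rows reference an image.
            if row.get("has_image") and row.get("image_path"):
                # Assumption: image_path is relative to screenshots/.
                if not (root / "screenshots" / row["image_path"]).is_file():
                    missing.append(row["image_path"])
    return missing
```

An empty result suggests the local checkout is complete for the vision examples.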

🎯 Topics Covered

  • n8n Workflow Automation - Triggers, nodes, integrations, debugging
  • GoHighLevel CRM - Marketing automation, funnels, campaigns
  • Odoo ERP - CRM, Sales, Inventory, Accounting modules
  • AI/LLM Integration - OpenAI, Anthropic, embeddings, RAG
  • Full-Stack Development - JavaScript, React, CSS, APIs

📜 License

Apache 2.0

🔗 Links