---
license: apache-2.0
task_categories:
- text-generation
language:
- en
tags:
- n8n
- workflow-automation
- code-generation
- sft
- json
- low-code
- automation
pretty_name: n8n Workflows SFT Dataset
size_categories:
- 1K<n<10K
---

```python
from trl import SFTConfig, SFTTrainer

# `model`, `tokenizer`, and `dataset` are assumed to be loaded before this point.


def formatting_func(example):
    # Render one example as a ChatML conversation: system, user, assistant.
    return f"""<|im_start|>system
You are an n8n workflow expert. Generate valid n8n workflow JSON configurations.<|im_end|>
<|im_start|>user
{example['instruction']}<|im_end|>
<|im_start|>assistant
{example['output']}<|im_end|>"""


training_args = SFTConfig(
    output_dir="./n8n-sft-model",
    per_device_train_batch_size=4,
    gradient_accumulation_steps=4,  # effective batch size of 16 per device
    num_train_epochs=3,
    learning_rate=2e-5,
    bf16=True,
    logging_steps=10,
    save_strategy="epoch",
)

# Argument names depend on the installed TRL release; newer releases take
# `processing_class` in place of `tokenizer` and set the maximum sequence
# length on `SFTConfig` rather than on the trainer.
trainer = SFTTrainer(
    model=model,
    args=training_args,
    train_dataset=dataset["train"],
    formatting_func=formatting_func,
    tokenizer=tokenizer,
    max_seq_length=2048,
)

trainer.train()
```

## Covered n8n Nodes

The dataset includes workflows featuring common n8n integrations:

| Category | Nodes |
|----------|-------|
| **Triggers** | Webhook, Schedule, Manual |
| **Core** | HTTP Request, Code, Function, Set, Filter, Switch, Merge |
| **Communication** | Slack, Discord, Email, Telegram |
| **Data** | PostgreSQL, MySQL, MongoDB, Airtable, Google Sheets |
| **Dev Tools** | GitHub, GitLab, Jira |
| **Storage** | AWS S3, Google Drive, Dropbox |
| **CRM** | HubSpot, Salesforce |

## Intended Uses

### Primary Use

- Fine-tuning language models for n8n workflow generation
- Training code assistants specialized in automation

### Out-of-Scope Use

- Direct production deployment without validation
- Training models for other automation platforms (Zapier, Make, etc.)
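
Since generated workflows should not be deployed without validation, a lightweight structural check can catch obvious failures before a workflow is imported into n8n. The sketch below is one possible approach, not part of the dataset tooling: `check_workflow` is a hypothetical helper, and the keys it requires (`nodes`, `connections`, and per-node `name`, `type`, `parameters`) are assumptions based on the common n8n export layout rather than an official schema.

```python
import json


def check_workflow(text):
    """Return a list of structural problems found in a generated workflow.

    The required keys are assumptions drawn from the usual n8n export
    layout, not an authoritative schema.
    """
    try:
        workflow = json.loads(text)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(workflow, dict):
        return ["top level is not a JSON object"]

    problems = []
    nodes = workflow.get("nodes")
    connections = workflow.get("connections")
    if not isinstance(nodes, list) or not nodes:
        problems.append("'nodes' must be a non-empty list")
        nodes = []
    if not isinstance(connections, dict):
        problems.append("'connections' must be an object")
        connections = {}

    names = set()
    for i, node in enumerate(nodes):
        if not isinstance(node, dict):
            problems.append(f"node {i} is not an object")
            continue
        for key in ("name", "type", "parameters"):
            if key not in node:
                problems.append(f"node {i} is missing '{key}'")
        if isinstance(node.get("name"), str):
            names.add(node["name"])

    # Every connection source should refer to a node that exists.
    for source in connections:
        if source not in names:
            problems.append(f"connection from unknown node '{source}'")
    return problems


sample = (
    '{"nodes": [{"name": "Webhook", "type": "n8n-nodes-base.webhook", '
    '"parameters": {}}], "connections": {}}'
)
print(check_workflow(sample))  # prints []
```

An empty list means only that the basic shape looks right; credentials, node versions, and parameter values still need to be checked in n8n itself.
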
## Limitations

- **Node Coverage**: Not all 400+ n8n nodes are represented equally
- **Complexity**: Most workflows are simple to medium complexity (2-8 nodes)
- **Validation**: Workflows are structurally valid but may require credential configuration
- **Version**: Based on n8n workflow schema as of late 2024; may need updates for future n8n versions

## Dataset Creation

### Source Data

Workflows were collected and curated from:

- Public n8n workflow templates
- Community-shared automations
- Synthetically generated examples with manual validation

### Curation Process

1. Collection of raw workflow JSON files
2. Extraction and normalization of workflow structure
3. Generation of natural language descriptions
4. Manual review for quality and accuracy
5. Deduplication and filtering

## Models Trained on This Dataset

- [eclaude/qwen-coder-3b-n8n-sft](https://huggingface.co/eclaude/qwen-coder-3b-n8n-sft)

## Citation

```bibtex
@dataset{n8n_workflows_sft_2025,
  author = {eclaude},
  title = {n8n Workflows SFT Dataset},
  year = {2025},
  publisher = {Hugging Face},
  url = {https://huggingface.co/datasets/eclaude/n8n-workflows-sft}
}
```

## Contact

For questions, suggestions, or contributions, open a discussion on this repository or contact via [Hugging Face](https://huggingface.co/eclaude).