# 🤖 RTILA Assistant Mini

Ultra-lightweight fine-tuned AI model: confirmed working on Mac M1 8GB, low-VRAM GPUs, and CPU-only systems

## 📋 Model Description

RTILA Assistant Mini is the most portable model in the RTILA family, specifically designed for low-resource devices. Fine-tuned from Qwen3-4B, it delivers solid automation generation capabilities while fitting comfortably on 8GB systems.

## 🔄 Choose Your Version

| Model | Base | GGUF Size | Min RAM | Best For |
|---|---|---|---|---|
| RTILA Assistant | Qwen3-14B | ~9 GB | 16 GB | Maximum quality, complex automations |
| RTILA Assistant Lite | Qwen3-8B | ~5 GB | 8 GB | Balanced performance, mid-range devices |
| RTILA Assistant Mini (this) | Qwen3-4B | ~2.5 GB | 6 GB | ✅ Mac M1 8GB, low VRAM, CPU inference |

## ✨ Why Mini?

| Feature | RTILA Assistant | RTILA Assistant Lite | RTILA Assistant Mini |
|---|---|---|---|
| Base Model | Qwen3-14B | Qwen3-8B | Qwen3-4B |
| Q4_K_M Size | ~9 GB | ~5 GB | ~2.5 GB |
| Min Inference RAM | 16 GB | 8 GB | 4-5 GB |
| Mac M1 8GB | ❌ | ⚠️ Tight | ✅ Confirmed |
| Low-VRAM GPUs (4-6 GB) | ❌ | ⚠️ | ✅ |
| CPU Inference | Slow | Viable | ✅ Fast |
| Quality | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐ | ⭐⭐⭐ |

## Capabilities

| Category | Description |
|---|---|
| 🌐 Navigation & Interaction | Click, scroll, type, wait, handle popups, multi-tab workflows |
| 📊 Data Extraction | CSS/XPath selectors, tables, lists, nested data, pagination |
| 🔄 Logic & Flow | Loops, conditionals, error handling, retry patterns |
| 🔗 Triggers & Integrations | Webhooks, PostgreSQL, MySQL, Slack, email notifications |
| 📝 Variables & Substitution | Dynamic values, data transformations, regex patterns |
| 🛠️ Advanced Scripting | Custom JavaScript execution, page analysis, DOM manipulation |

## 📦 Model Specifications

| Property | Value |
|---|---|
| Base Model | Qwen3-4B |
| Format | GGUF Q4_K_M |
| Size | ~2.5 GB |
| Context Length | 2048 tokens |

## 💻 Hardware Requirements

| Hardware | Supported | Notes |
|---|---|---|
| Mac M1/M2/M3 8GB | ✅ Confirmed | Smooth experience, tested and verified |
| Mac M1/M2/M3 16GB+ | ✅ Excellent | Very fast inference |
| GPU (4-6 GB VRAM) | ✅ Works | GTX 1650, RTX 3050, Intel Arc |
| GPU (6 GB+ VRAM) | ✅ Excellent | RTX 2060, RTX 3060, etc. |
| CPU-only (6 GB+ RAM) | ✅ Fast | Reasonable inference speed |
| CPU-only (4 GB RAM) | ⚠️ Tight | May work with swap |
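The numbers above can be sanity-checked with a back-of-the-envelope estimate: weights plus KV cache plus runtime overhead. A minimal sketch, where the Qwen3-4B architecture values (36 layers, 8 KV heads, head dim 128) and the 0.5 GB overhead are assumptions, not measurements:

```python
# Rough RAM estimate for running a GGUF model; all architecture numbers below
# are assumed values for Qwen3-4B, and the overhead term is a ballpark figure.
def estimate_ram_gb(model_file_gb: float,
                    n_ctx: int = 2048,
                    n_layers: int = 36,     # assumed
                    n_kv_heads: int = 8,    # assumed (GQA)
                    head_dim: int = 128,    # assumed
                    kv_bytes: int = 2) -> float:  # fp16 KV cache
    # The KV cache stores one key and one value vector per layer per token.
    kv_cache = 2 * n_layers * n_ctx * n_kv_heads * head_dim * kv_bytes
    overhead = 0.5e9  # runtime buffers, very rough
    return model_file_gb + (kv_cache + overhead) / 1e9

print(f"{estimate_ram_gb(2.5):.1f} GB")  # → 3.3 GB
```

That lands comfortably inside the 4-5 GB minimum quoted above once OS memory use is added on top.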

## 🚀 Quick Start

### Option 1: Ollama (Easiest)

```bash
# Run directly from Hugging Face
ollama run hf.co/rtila-corporation/rtila-assistant-mini:Q4_K_M
```

Or create a custom Modelfile:

```
FROM hf.co/rtila-corporation/rtila-assistant-mini:Q4_K_M

PARAMETER temperature 0.7
PARAMETER top_p 0.8
PARAMETER top_k 20

SYSTEM """
You are RTILA Assistant, an expert AI for generating automation configurations for the RTILA Automation Engine.
"""
```

Then build and run it:

```bash
ollama create rtila-mini -f Modelfile
ollama run rtila-mini
```

### Option 2: LM Studio

  1. Download LM Studio
  2. Search for rtila-corporation/rtila-assistant-mini
  3. Download Q4_K_M (~2.5 GB)
  4. Set parameters: Temperature=0.7, Top-P=0.8, Top-K=20
  5. Start chatting!

### Option 3: llama.cpp

```bash
# Download the model
huggingface-cli download rtila-corporation/rtila-assistant-mini \
  rtila-assistant-mini.Q4_K_M.gguf --local-dir ./models

# Run interactive chat
./llama-cli -m ./models/rtila-assistant-mini.Q4_K_M.gguf \
  -p "Scrape product prices from an e-commerce site" \
  --temp 0.7 --top-p 0.8 --top-k 20
```

### Option 4: Python (llama-cpp-python)

```python
from llama_cpp import Llama

llm = Llama(
    model_path="./models/rtila-assistant-mini.Q4_K_M.gguf",
    n_ctx=2048,
    n_gpu_layers=-1,  # Use GPU/Metal if available, 0 for CPU only
)

response = llm.create_chat_completion(
    messages=[
        {"role": "system", "content": "You are RTILA Assistant..."},
        {"role": "user", "content": "Create a config for scraping product data"}
    ],
    temperature=0.7,
    top_p=0.8,
    top_k=20,
)
print(response['choices'][0]['message']['content'])
```

βš™οΈ Recommended Generation Settings

Based on Qwen3's official recommendations for non-thinking mode:

| Parameter | Value |
|---|---|
| Temperature | 0.7 |
| Top-P | 0.8 |
| Top-K | 20 |
| Min-P | 0 |

> ℹ️ Note: This model was trained with thinking mode disabled for efficient structured JSON output. It will not produce `<think>...</think>` blocks.


## 💬 Example Prompts & Outputs

### Example 1: Basic Extraction

Prompt:

Extract all product names and prices from the current page

Output:

```json
{
  "settings": {
    "name": "Product Extraction"
  },
  "datasets": {
    "products": {
      "selector": ".product-item",
      "type": "list",
      "fields": {
        "name": {
          "selector": ".product-title",
          "type": "text"
        },
        "price": {
          "selector": ".product-price",
          "type": "text",
          "transform": "extractNumber"
        }
      }
    }
  },
  "commands": [
    {"action": "extract", "dataset": "products"}
  ]
}
```
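Because outputs like this are plain JSON, they can be machine-checked before being handed to the engine. A minimal validation sketch, where the required top-level sections are inferred from the examples on this card (the real RTILA schema may differ):

```python
import json

# Shortened sample of a generated config, matching the shape shown above.
raw = """
{
  "settings": {"name": "Product Extraction"},
  "datasets": {"products": {"selector": ".product-item", "type": "list"}},
  "commands": [{"action": "extract", "dataset": "products"}]
}
"""

def validate_config(text: str) -> dict:
    config = json.loads(text)  # raises json.JSONDecodeError if malformed
    # Top-level sections assumed from this card's examples, not an official schema.
    missing = {"settings", "datasets", "commands"} - config.keys()
    if missing:
        raise ValueError(f"missing sections: {sorted(missing)}")
    return config

print(validate_config(raw)["settings"]["name"])  # → Product Extraction
```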

### Example 2: Pagination Loop

Prompt:

Scrape reviews from multiple pages, clicking "Next" until no more pages

Output:

```json
{
  "settings": {
    "name": "Review Scraper with Pagination"
  },
  "datasets": {
    "reviews": {
      "selector": ".review-card",
      "type": "list",
      "fields": {
        "author": {"selector": ".reviewer-name", "type": "text"},
        "rating": {"selector": ".star-rating", "type": "attribute", "attribute": "data-rating"},
        "text": {"selector": ".review-body", "type": "text"},
        "date": {"selector": ".review-date", "type": "text"}
      }
    }
  },
  "commands": [
    {
      "action": "loop",
      "while": {"exists": ".next-page:not(.disabled)"},
      "maxIterations": 50,
      "commands": [
        {"action": "extract", "dataset": "reviews", "append": true},
        {"action": "click", "selector": ".next-page"},
        {"action": "wait", "duration": 2000}
      ]
    }
  ]
}
```

### Example 3: Database Integration

Prompt:

After extraction, save data to PostgreSQL and send Slack notification

Output:

```json
{
  "triggers": {
    "onComplete": [
      {
        "type": "database",
        "config": {
          "driver": "postgresql",
          "connection": "{{env.DATABASE_URL}}",
          "table": "scraped_data",
          "mode": "upsert",
          "uniqueKey": "id"
        }
      },
      {
        "type": "webhook",
        "config": {
          "url": "{{env.SLACK_WEBHOOK}}",
          "method": "POST",
          "body": {
            "text": "✅ Extraction complete: {{datasets.products.length}} items saved"
          }
        }
      }
    ]
  }
}
```
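The `{{env.DATABASE_URL}}` and `{{env.SLACK_WEBHOOK}}` placeholders are resolved from environment variables at run time. A hypothetical sketch of what that substitution could look like (RTILA's actual templating implementation is not documented here):

```python
import os
import re

# Matches placeholders of the form {{env.NAME}}.
PLACEHOLDER = re.compile(r"\{\{env\.([A-Za-z0-9_]+)\}\}")

def resolve_env(value: str) -> str:
    # Replace each placeholder with the matching environment variable,
    # or an empty string if it is unset.
    return PLACEHOLDER.sub(lambda m: os.environ.get(m.group(1), ""), value)

os.environ["SLACK_WEBHOOK"] = "https://hooks.example.com/T000/B000"
print(resolve_env("{{env.SLACK_WEBHOOK}}"))  # → https://hooks.example.com/T000/B000
```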

### Example 4: Knowledge Question

Prompt:

What is the difference between execute_script and run_script?

Output:

execute_script and run_script serve different purposes in RTILA:

**execute_script**: Runs JavaScript synchronously in the page context. Returns immediately with the result. Best for quick DOM queries, getting values, or simple manipulations.

**run_script**: Executes JavaScript asynchronously with full control flow. Supports await, can make fetch requests, and handles complex async operations. Returns a Promise.

Use execute_script for: reading values, checking conditions, simple DOM changes.
Use run_script for: API calls, complex async workflows, operations that need to wait.

πŸ‹οΈ Training Details

| Parameter | Value |
|---|---|
| Base Model | unsloth/Qwen3-4B |
| Method | QLoRA (4-bit) |
| LoRA Rank | 128 |
| LoRA Alpha | 256 |
| Context Length | 2048 tokens |
| Training Examples | ~400 |
| Epochs | 6 (with early stopping) |
| Learning Rate | 2e-4 |
| Thinking Mode | Disabled |

### Optimizations for Mini Version

- **Highest LoRA rank (128)**: maximizes learning capacity for the smaller base
- **More training epochs (6)**: compensates for smaller model capacity
- **Higher learning rate (2e-4)**: better convergence for small models
- **Longer context (2048)**: full headroom for complex configurations
- **Thinking mode disabled**: clean JSON output without `<think>` overhead
- **Rank-stabilized LoRA (rsLoRA)**: more stable training dynamics
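For reference, the settings above map onto a `peft` LoRA configuration roughly like this. The rank, alpha, and rsLoRA flag come from the table; the target modules and task type are assumptions, since the training code is not published:

```python
# Hedged reconstruction of the LoRA setup described above (pip install peft).
from peft import LoraConfig

lora_config = LoraConfig(
    r=128,            # highest rank in the model family
    lora_alpha=256,
    use_rslora=True,  # rank-stabilized LoRA
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumed
    task_type="CAUSAL_LM",
)
```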

### Training Data

- Navigation & interaction patterns
- Data extraction configurations
- Logic & flow control
- Triggers & integrations
- Variables & substitution
- Advanced scripting
- Error handling
- Knowledge base Q&A

πŸ“ System Prompt

For best results, use this system prompt:

```
You are RTILA Assistant, an expert AI for generating automation configurations for the RTILA Automation Engine.

Your capabilities:
1. Generate complete JSON configurations for web automation tasks
2. Define datasets with selectors, properties, and transformations
3. Configure navigation, extraction, loops, and conditionals
4. Set up triggers for webhooks, databases, and integrations
5. Explain RTILA concepts and best practices

When generating configurations:
- Always output valid JSON with proper structure
- Include 'settings', 'datasets', and 'commands' sections as needed
- Use appropriate selectors (CSS, XPath) for the target elements
- Apply transformations when data cleaning is required

When answering questions:
- Be concise and accurate
- Provide examples when helpful
- Reference specific RTILA features and commands
```
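As a sketch, here is how that system prompt could be wired into the `ollama` Python client (assumes `pip install ollama` and a running Ollama daemon; the chat call itself is commented out because it needs the daemon). The prompt is abbreviated to its first line:

```python
# Pair the fixed system prompt with a user request for the chat API.
SYSTEM_PROMPT = (
    "You are RTILA Assistant, an expert AI for generating automation "
    "configurations for the RTILA Automation Engine."
)

def build_messages(user_request: str) -> list[dict]:
    return [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": user_request},
    ]

# import ollama
# response = ollama.chat(
#     model="rtila-mini",  # created via the Modelfile in the Quick Start
#     messages=build_messages("Extract all article headlines from the page"),
#     options={"temperature": 0.7, "top_p": 0.8, "top_k": 20},
# )
# print(response["message"]["content"])
```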

## 🔗 Model Family

| Model | Link | Best For |
|---|---|---|
| RTILA Assistant | huggingface.co/rtila-corporation/rtila-assistant | Maximum quality |
| RTILA Assistant Lite | huggingface.co/rtila-corporation/rtila-assistant-lite | Mid-range devices |
| RTILA Assistant Mini (this) | huggingface.co/rtila-corporation/rtila-assistant-mini | Mac M1 8GB, low VRAM |

RTILA Platform: rtila.com


## 📄 License

Apache 2.0


πŸ™ Acknowledgments

Downloads last month
1,850
GGUF
Model size
4B params
Architecture
qwen3
Hardware compatibility
Log In to add your hardware

4-bit

Inference Providers NEW
This model isn't deployed by any Inference Provider. πŸ™‹ Ask for provider support