
🤖 RTILA Assistant Lite

Balanced fine-tuned AI model for mid-range devices (8GB+ RAM systems)


📋 Model Description

RTILA Assistant Lite is the balanced option in the RTILA family, fine-tuned from Qwen3-8B. It offers excellent quality while fitting on more devices than the full 14B model.

🔄 Choose Your Version

| Model | Base | GGUF Size | Min RAM | Best For |
|---|---|---|---|---|
| RTILA Assistant | Qwen3-14B | ~9 GB | 16 GB | Maximum quality, complex automations |
| RTILA Assistant Lite (this) | Qwen3-8B | ~5 GB | 8 GB | 🎯 Balanced performance, mid-range devices |
| RTILA Assistant Mini | Qwen3-4B | ~2.5 GB | 6 GB | ✅ Mac M1 8GB, low VRAM, CPU inference |

✨ Why Lite?

| Feature | RTILA Assistant | RTILA Assistant Lite | RTILA Assistant Mini |
|---|---|---|---|
| Base Model | Qwen3-14B | Qwen3-8B | Qwen3-4B |
| Q4_K_M Size | ~9 GB | ~5 GB | ~2.5 GB |
| Min RAM | 16 GB | 8 GB | 6 GB |
| Mac M1 8GB | ❌ | ⚠️ Tight | ✅ |
| Quality | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐ | ⭐⭐⭐ |

⚠️ Mac M1 8GB Users: This model may run on 8GB systems, but memory will be tight. For a smoother experience, we recommend RTILA Assistant Mini, which is confirmed to work on Mac M1 8GB.

Capabilities

| Category | Description |
|---|---|
| 🌐 Navigation & Interaction | Click, scroll, type, wait, handle popups, multi-tab workflows |
| 📊 Data Extraction | CSS/XPath selectors, tables, lists, nested data, pagination |
| 🔄 Logic & Flow | Loops, conditionals, error handling, retry patterns |
| 🔗 Triggers & Integrations | Webhooks, PostgreSQL, MySQL, Slack, email notifications |
| 📝 Variables & Substitution | Dynamic values, data transformations, regex patterns |
| 🛠️ Advanced Scripting | Custom JavaScript execution, page analysis, DOM manipulation |

📦 Model Specifications

| Property | Value |
|---|---|
| Base Model | Qwen3-8B |
| Format | GGUF Q4_K_M |
| Size | ~5 GB |
| Context Length | 2048 tokens |

💻 Hardware Requirements

| Hardware | Supported | Notes |
|---|---|---|
| GPU (8GB+ VRAM) | ✅ Recommended | RTX 3060, RTX 4060, RTX 3070 |
| GPU (6GB VRAM) | ⚠️ May work | RTX 2060, GTX 1660; needs CPU offloading |
| Apple Silicon 16GB+ | ✅ Excellent | M1/M2/M3 Pro/Max; fast and smooth |
| Apple Silicon 8GB | ⚠️ Tight | May work but memory-constrained |
| CPU-only | ✅ Viable | 8GB+ RAM, reasonable inference speed |

💡 Memory tight? Try RTILA Assistant Mini (~2.5 GB GGUF, runs smoothly on 6GB)
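The GGUF sizes above follow from the quantization bit-width; as a rough back-of-the-envelope check (Q4_K_M averages roughly 4.85 bits per weight, an approximation that varies by tensor mix):

```python
# Rough GGUF size estimate: parameters * bits-per-weight / 8.
# ~4.85 bits/weight for Q4_K_M is an approximation, not an exact figure.
BITS_PER_WEIGHT_Q4_K_M = 4.85

def gguf_size_gb(params_billions: float) -> float:
    """Approximate Q4_K_M file size in GB for a given parameter count."""
    return params_billions * BITS_PER_WEIGHT_Q4_K_M / 8

for name, params in [("Qwen3-14B", 14), ("Qwen3-8B", 8), ("Qwen3-4B", 4)]:
    print(f"{name}: ~{gguf_size_gb(params):.1f} GB")
```

These estimates line up with the ~9 GB, ~5 GB, and ~2.5 GB figures in the table above.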


🚀 Quick Start

Option 1: Ollama (Easiest)

```shell
# Run directly from Hugging Face
ollama run hf.co/rtila-corporation/rtila-assistant-lite:Q4_K_M
```

Or create a custom Modelfile:

```
FROM hf.co/rtila-corporation/rtila-assistant-lite:Q4_K_M

PARAMETER temperature 0.7
PARAMETER top_p 0.8
PARAMETER top_k 20

SYSTEM """
You are RTILA Assistant, an expert AI for generating automation configurations for the RTILA Automation Engine.
"""
```

```shell
ollama create rtila-lite -f Modelfile
ollama run rtila-lite
```
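Once created, the model is also reachable over Ollama's local HTTP API (default port 11434); a minimal sketch using only Python's standard library (the `rtila-lite` model name assumes the `ollama create` step above was run):

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/chat"  # Ollama's default local endpoint

def build_chat_payload(prompt: str, model: str = "rtila-lite") -> dict:
    """Build a non-streaming chat request for the local Ollama API."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,  # single JSON response instead of a token stream
    }

payload = build_chat_payload("Extract all product names and prices from the current page")
req = urllib.request.Request(
    OLLAMA_URL,
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
try:
    with urllib.request.urlopen(req) as resp:
        print(json.loads(resp.read())["message"]["content"])
except OSError as exc:  # Ollama not running / unreachable
    print(f"Could not reach Ollama: {exc}")
```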

Option 2: LM Studio

  1. Download LM Studio
  2. Search for rtila-corporation/rtila-assistant-lite
  3. Download Q4_K_M
  4. Set parameters: Temperature=0.7, Top-P=0.8, Top-K=20
  5. Start chatting!

Option 3: llama.cpp

```shell
# Download model
huggingface-cli download rtila-corporation/rtila-assistant-lite \
  rtila-assistant-lite.Q4_K_M.gguf --local-dir ./models

# Run with a one-shot prompt
./llama-cli -m ./models/rtila-assistant-lite.Q4_K_M.gguf \
  -p "Scrape product prices from an e-commerce site" \
  --temp 0.7 --top-p 0.8 --top-k 20
```

Option 4: Python (llama-cpp-python)

```python
from llama_cpp import Llama

llm = Llama(
    model_path="./models/rtila-assistant-lite.Q4_K_M.gguf",
    n_ctx=2048,
    n_gpu_layers=-1,  # Use GPU if available, 0 for CPU only
)

response = llm.create_chat_completion(
    messages=[
        {"role": "system", "content": "You are RTILA Assistant..."},
        {"role": "user", "content": "Create a config for scraping product data"}
    ],
    temperature=0.7,
    top_p=0.8,
    top_k=20,
)
print(response['choices'][0]['message']['content'])
```
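Because the model answers with either a JSON configuration or prose, it helps to parse its output defensively; a minimal sketch (the top-level keys `settings`, `datasets`, `commands`, and `triggers` are taken from the examples later in this card):

```python
import json

def try_parse_config(text: str):
    """Return the parsed config dict if `text` is valid JSON, else None."""
    try:
        config = json.loads(text)
    except json.JSONDecodeError:
        return None  # prose answer (e.g. a knowledge question), not a config
    if not isinstance(config, dict):
        return None
    # Warn about unexpected top-level sections rather than failing hard.
    known = {"settings", "datasets", "commands", "triggers"}
    unknown = set(config) - known
    if unknown:
        print(f"Unrecognized top-level keys: {sorted(unknown)}")
    return config

config = try_parse_config('{"settings": {"name": "Product Extraction"}, "commands": []}')
print(config["settings"]["name"])  # Product Extraction
```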

βš™οΈ Recommended Generation Settings

Based on Qwen3's official recommendations for non-thinking mode:

| Parameter | Value |
|---|---|
| Temperature | 0.7 |
| Top-P | 0.8 |
| Top-K | 20 |
| Min-P | 0 |

ℹ️ Note: This model was trained with thinking mode disabled for efficient structured JSON output. It will not produce <think>...</think> blocks.


💬 Example Prompts & Outputs

Example 1: Basic Extraction

Prompt:

Extract all product names and prices from the current page

Output:

```json
{
  "settings": {
    "name": "Product Extraction"
  },
  "datasets": {
    "products": {
      "selector": ".product-item",
      "type": "list",
      "fields": {
        "name": {
          "selector": ".product-title",
          "type": "text"
        },
        "price": {
          "selector": ".product-price",
          "type": "text",
          "transform": "extractNumber"
        }
      }
    }
  },
  "commands": [
    {"action": "extract", "dataset": "products"}
  ]
}
```

Example 2: Pagination Loop

Prompt:

Scrape reviews from multiple pages, clicking "Next" until no more pages

Output:

```json
{
  "settings": {
    "name": "Review Scraper with Pagination"
  },
  "datasets": {
    "reviews": {
      "selector": ".review-card",
      "type": "list",
      "fields": {
        "author": {"selector": ".reviewer-name", "type": "text"},
        "rating": {"selector": ".star-rating", "type": "attribute", "attribute": "data-rating"},
        "text": {"selector": ".review-body", "type": "text"},
        "date": {"selector": ".review-date", "type": "text"}
      }
    }
  },
  "commands": [
    {
      "action": "loop",
      "while": {"exists": ".next-page:not(.disabled)"},
      "maxIterations": 50,
      "commands": [
        {"action": "extract", "dataset": "reviews", "append": true},
        {"action": "click", "selector": ".next-page"},
        {"action": "wait", "duration": 2000}
      ]
    }
  ]
}
```

Example 3: Database Integration

Prompt:

After extraction, save data to PostgreSQL and send Slack notification

Output:

```json
{
  "triggers": {
    "onComplete": [
      {
        "type": "database",
        "config": {
          "driver": "postgresql",
          "connection": "{{env.DATABASE_URL}}",
          "table": "scraped_data",
          "mode": "upsert",
          "uniqueKey": "id"
        }
      },
      {
        "type": "webhook",
        "config": {
          "url": "{{env.SLACK_WEBHOOK}}",
          "method": "POST",
          "body": {
            "text": "✅ Extraction complete: {{datasets.products.length}} items saved"
          }
        }
      }
    ]
  }
}
```
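The `{{env.DATABASE_URL}}` and `{{env.SLACK_WEBHOOK}}` placeholders suggest environment-variable substitution at runtime. RTILA's actual resolver is not documented here; a hypothetical sketch of the idea:

```python
import os
import re

# Hypothetical resolver for {{env.NAME}} placeholders; illustrates the idea
# only -- RTILA's real substitution engine may behave differently.
_ENV_PATTERN = re.compile(r"\{\{env\.([A-Za-z_][A-Za-z0-9_]*)\}\}")

def resolve_env(value: str) -> str:
    """Replace every {{env.NAME}} with os.environ['NAME'] (empty if unset)."""
    return _ENV_PATTERN.sub(lambda m: os.environ.get(m.group(1), ""), value)

os.environ["DATABASE_URL"] = "postgresql://localhost/rtila"
print(resolve_env("{{env.DATABASE_URL}}"))  # postgresql://localhost/rtila
```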

Example 4: Knowledge Question

Prompt:

What is the difference between execute_script and run_script?

Output:

execute_script and run_script serve different purposes in RTILA:

**execute_script**: Runs JavaScript synchronously in the page context. Returns immediately with the result. Best for quick DOM queries, getting values, or simple manipulations.

**run_script**: Executes JavaScript asynchronously with full control flow. Supports await, can make fetch requests, and handles complex async operations. Returns a Promise.

Use execute_script for: Reading values, checking conditions, simple DOM changes
Use run_script for: API calls, complex async workflows, operations that need to wait

πŸ‹οΈ Training Details

| Parameter | Value |
|---|---|
| Base Model | unsloth/Qwen3-8B |
| Method | QLoRA (4-bit) |
| LoRA Rank | 128 |
| LoRA Alpha | 256 |
| Context Length | 2048 tokens |
| Training Examples | ~400 |
| Epochs | 5 (with early stopping) |
| Learning Rate | 2e-4 |
| Thinking Mode | Disabled |

Optimizations for Lite Version

- Higher LoRA rank (128 vs 64): Compensates for the smaller base model
- Higher learning rate (2e-4 vs 2e-5): Better convergence for smaller models
- Longer context (2048 vs 1536): More headroom for complex configurations
- Thinking mode disabled: Eliminates `<think>` overhead for structured output
- Rank-stabilized LoRA (rsLoRA): More stable training
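The rank and rsLoRA choices interact: standard LoRA scales the adapter update by α/r, while rank-stabilized LoRA uses α/√r, which keeps the update magnitude stable as rank grows. A quick comparison at this card's settings (r=128, α=256):

```python
import math

def lora_scale(alpha: float, rank: int, rs: bool = False) -> float:
    """Adapter scaling factor: alpha/rank for LoRA, alpha/sqrt(rank) for rsLoRA."""
    return alpha / math.sqrt(rank) if rs else alpha / rank

r, alpha = 128, 256
print(f"LoRA scale:   {lora_scale(alpha, r):.2f}")           # 2.00
print(f"rsLoRA scale: {lora_scale(alpha, r, rs=True):.2f}")  # 22.63
```

At rank 128, the conventional α/r scaling would shrink the adapter's contribution; rsLoRA avoids that, which is why the two choices pair well here.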

Training Data

- Navigation & Interaction patterns
- Data extraction configurations
- Logic & flow control
- Triggers & integrations
- Variables & substitution
- Advanced scripting
- Error handling
- Knowledge base Q&A

πŸ“ System Prompt

For best results, use this system prompt:

```
You are RTILA Assistant, an expert AI for generating automation configurations for the RTILA Automation Engine.

Your capabilities:
1. Generate complete JSON configurations for web automation tasks
2. Define datasets with selectors, properties, and transformations
3. Configure navigation, extraction, loops, and conditionals
4. Set up triggers for webhooks, databases, and integrations
5. Explain RTILA concepts and best practices

When generating configurations:
- Always output valid JSON with proper structure
- Include 'settings', 'datasets', and 'commands' sections as needed
- Use appropriate selectors (CSS, XPath) for the target elements
- Apply transformations when data cleaning is required

When answering questions:
- Be concise and accurate
- Provide examples when helpful
- Reference specific RTILA features and commands
```

🔗 Model Family

| Model | Link | Best For |
|---|---|---|
| RTILA Assistant | huggingface.co/rtila-corporation/rtila-assistant | Maximum quality |
| RTILA Assistant Lite (this) | huggingface.co/rtila-corporation/rtila-assistant-lite | Mid-range devices |
| RTILA Assistant Mini | huggingface.co/rtila-corporation/rtila-assistant-mini | Mac M1 8GB, low VRAM |

RTILA Platform: rtila.com


📄 License

Apache 2.0


πŸ™ Acknowledgments
