Steel Browser Architecture

This document provides a comprehensive overview of Steel Browser's architecture, design decisions, and how the various components work together.

🏗️ High-Level Architecture

Steel Browser follows a modular, plugin-based architecture designed for extensibility and maintainability:

┌─────────────────────────────────────────────────────────────┐
│                        Steel Browser                        │
├─────────────────────────────────────────────────────────────┤
│  Frontend (React UI)           │  Backend (Fastify API)     │
│  ├── Session Management        │  ├── CDP Service           │
│  ├── Real-time Viewing         │  ├── Session Management    │
│  ├── DevTools Integration      │  ├── File Storage          │
│  └── Configuration UI          │  └── Plugin System         │
├─────────────────────────────────────────────────────────────┤
│                    Chrome/Chromium Browser                  │
│  ├── Chrome DevTools Protocol (CDP)                         │
│  ├── Browser Extensions                                     │
│  └── Page Contexts                                          │
└─────────────────────────────────────────────────────────────┘

🔧 Core Components

1. CDP Service (api/src/services/cdp/cdp.service.ts)

The Chrome DevTools Protocol (CDP) Service is the heart of Steel Browser, managing all browser interactions:

Responsibilities:

  • Browser lifecycle management (launch, close, restart)
  • Page creation and navigation
  • WebSocket proxy for CDP connections
  • Plugin system coordination
  • Session state management
  • Context isolation and fingerprinting

Key Features:

class CDPService extends EventEmitter {
  // Browser management
  async launch(options?: BrowserLauncherOptions): Promise<Browser>
  async shutdown(): Promise<void>
  async refreshPrimaryPage(): Promise<void>
  
  // Plugin system
  registerPlugin(plugin: BasePlugin): void
  unregisterPlugin(pluginName: string): boolean
  
  // Page management
  async createPage(): Promise<Page>
  async getPages(): Promise<Page[]>
}

2. Plugin System

Steel Browser's plugin architecture allows for extensible functionality without modifying core code.

Base Plugin (api/src/services/cdp/plugins/core/base-plugin.ts)

abstract class BasePlugin {
  // Lifecycle hooks
  async onBrowserLaunch(browser: Browser): Promise<void>
  async onPageCreated(page: Page): Promise<void>
  async onPageNavigate(page: Page): Promise<void>
  async onPageUnload(page: Page): Promise<void>
  async onBrowserClose(browser: Browser): Promise<void>
  async onBeforePageClose(page: Page): Promise<void>
  async onShutdown(): Promise<void>
}

Plugin Manager (api/src/services/cdp/plugins/core/plugin-manager.ts)

Coordinates plugin lifecycle and ensures error isolation:

  • Registration: Manages plugin registration and dependency injection
  • Event Distribution: Notifies all plugins of browser events
  • Error Handling: Isolates plugin errors to prevent system crashes
  • Lifecycle Management: Coordinates plugin startup and shutdown
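The event distribution and error isolation described above can be sketched as follows. This is a minimal illustration rather than the actual PluginManager: `SketchPlugin`, `SketchPluginManager`, and `dispatch` are hypothetical names, and only two of the lifecycle hooks are modeled.

```typescript
// Hypothetical, simplified plugin shape with two of the lifecycle hooks.
interface SketchPlugin {
  name: string;
  onBrowserLaunch?(): Promise<void>;
  onShutdown?(): Promise<void>;
}

type HookName = "onBrowserLaunch" | "onShutdown";

class SketchPluginManager {
  private plugins: SketchPlugin[] = [];

  register(plugin: SketchPlugin): void {
    this.plugins.push(plugin);
  }

  // Returns true if a plugin with that name was registered and removed.
  unregister(name: string): boolean {
    const before = this.plugins.length;
    this.plugins = this.plugins.filter((p) => p.name !== name);
    return this.plugins.length < before;
  }

  // Call one hook on every plugin, in registration order. A failure in one
  // plugin is recorded and does not stop the remaining plugins from running.
  async dispatch(hook: HookName): Promise<string[]> {
    const failed: string[] = [];
    for (const plugin of this.plugins) {
      try {
        await plugin[hook]?.();
      } catch {
        failed.push(plugin.name);
      }
    }
    return failed;
  }
}
```

Returning the names of failing plugins, instead of rethrowing, is one possible design: the caller can log or disable a misbehaving plugin while the browser lifecycle continues.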

3. Session Management (api/src/services/session.service.ts)

Manages browser sessions with isolated contexts:

Features:

  • Session creation with custom configurations
  • Context isolation (cookies, localStorage, sessionStorage)
  • Resource cleanup and garbage collection
  • Session persistence and restoration
  • Concurrent session management

interface SessionConfig {
  proxy?: ProxyConfig;
  userAgent?: string;
  viewport?: { width: number; height: number };
  extensions?: string[];
  fingerprint?: FingerprintOptions;
}
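As a sketch of how the optional fields might be resolved before a session starts, the helper below fills in defaults for omitted values. The `resolveSessionConfig` name and the 1920x1080 default viewport are assumptions for illustration, and `proxy` and `fingerprint` are left out because their types are not defined in this document.

```typescript
// Subset of SessionConfig; proxy and fingerprint are omitted because
// ProxyConfig and FingerprintOptions are not shown in this document.
interface SessionConfigSubset {
  userAgent?: string;
  viewport?: { width: number; height: number };
  extensions?: string[];
}

interface ResolvedSessionConfig {
  userAgent?: string; // undefined means "use the browser default"
  viewport: { width: number; height: number };
  extensions: string[];
}

// Assumed default, chosen for illustration only.
const DEFAULT_VIEWPORT = { width: 1920, height: 1080 };

function resolveSessionConfig(config: SessionConfigSubset = {}): ResolvedSessionConfig {
  return {
    userAgent: config.userAgent,
    // Copy the objects so later mutation cannot leak into the shared default
    // or back into the caller's input.
    viewport: { ...(config.viewport ?? DEFAULT_VIEWPORT) },
    extensions: [...(config.extensions ?? [])],
  };
}
```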

4. File Storage Service (api/src/services/file.service.ts)

Handles file operations with session-scoped storage:

  • Upload Management: Handles multipart file uploads
  • Download Coordination: Manages browser downloads
  • Storage Isolation: Session-scoped file storage
  • Cleanup: Automatic file cleanup on session end
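One way to picture session-scoped storage with a per-session size limit and cleanup on session end is the in-memory model below. It is a sketch under stated assumptions: `SessionFileStore` and its methods are hypothetical, files are tracked by size only, and the real service stores actual file contents.

```typescript
class SessionFileStore {
  // sessionId -> (file name -> size in bytes)
  private sessions = new Map<string, Map<string, number>>();
  private readonly maxBytesPerSession: number;

  constructor(maxBytesPerSession: number) {
    this.maxBytesPerSession = maxBytesPerSession;
  }

  // Total bytes currently stored for one session.
  usage(sessionId: string): number {
    let total = 0;
    this.sessions.get(sessionId)?.forEach((size) => {
      total += size;
    });
    return total;
  }

  // Record a file for a session. Returns false, and stores nothing,
  // if the session would exceed its quota.
  add(sessionId: string, name: string, sizeBytes: number): boolean {
    const files = this.sessions.get(sessionId) ?? new Map<string, number>();
    const previous = files.get(name) ?? 0; // overwriting replaces the old size
    if (this.usage(sessionId) - previous + sizeBytes > this.maxBytesPerSession) {
      return false;
    }
    files.set(name, sizeBytes);
    this.sessions.set(sessionId, files);
    return true;
  }

  // Called when a session ends: drop everything that session stored.
  cleanup(sessionId: string): void {
    this.sessions.delete(sessionId);
  }
}
```

Because every entry is keyed by session ID, one session can neither see nor consume another session's quota, and cleanup is a single delete.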

🔌 Plugin Architecture Deep Dive

Plugin Lifecycle

  1. Registration: Plugins register with the PluginManager
  2. Initialization: Service dependency injection
  3. Event Handling: Respond to browser lifecycle events
  4. Cleanup: Graceful shutdown and resource cleanup

Event Flow

Browser Launch → Plugin.onBrowserLaunch()
     ↓
Page Created → Plugin.onPageCreated()
     ↓
Page Navigate → Plugin.onPageNavigate()
     ↓
Page Unload → Plugin.onPageUnload()
     ↓
Page Close → Plugin.onBeforePageClose()
     ↓
Browser Close → Plugin.onBrowserClose()
     ↓
System Shutdown → Plugin.onShutdown()

Example Plugin Implementation

import { BasePlugin, PluginOptions } from '@steel-browser/api/cdp-plugin';
import { Browser, Page } from 'puppeteer-core';

export class AdBlockPlugin extends BasePlugin {
  private blockedDomains: Set<string>;

  constructor(options: PluginOptions & { blockedDomains?: string[] }) {
    super({ name: 'ad-blocker', ...options });
    this.blockedDomains = new Set(options.blockedDomains || []);
  }

  async onPageCreated(page: Page): Promise<void> {
    await page.setRequestInterception(true);
    
    page.on('request', (request) => {
      const url = new URL(request.url());
      if (this.blockedDomains.has(url.hostname)) {
        request.abort();
      } else {
        request.continue();
      }
    });
  }
}
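Note that the example compares hostnames exactly, so blocking `ads.example.com` would not affect `cdn.ads.example.com`. If subdomain matching is wanted, a helper along these lines could replace the `has` check. `isBlockedHost` is a hypothetical name, not part of the Steel API.

```typescript
// Returns true when hostname equals a blocked domain or is a subdomain of one.
function isBlockedHost(hostname: string, blockedDomains: Set<string>): boolean {
  const labels = hostname.toLowerCase().split(".");
  // For "a.b.example.com", check "a.b.example.com", "b.example.com",
  // "example.com", and finally "com".
  for (let i = 0; i < labels.length; i++) {
    if (blockedDomains.has(labels.slice(i).join("."))) {
      return true;
    }
  }
  return false;
}
```

Inside `onPageCreated`, the condition would then read `isBlockedHost(url.hostname, this.blockedDomains)`. Matching on whole labels means `badads.example.com` is not treated as a subdomain of `ads.example.com`.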

🌐 API Architecture

Fastify Plugin System

Steel Browser uses Fastify's plugin architecture for modular API design:

// Main plugin registration
const MB = 1024 * 1024; // assumed helper: bytes per megabyte
await fastify.register(steelBrowserPlugin, {
  fileStorage: { maxSizePerSession: 100 * MB }
});

// Individual plugins
await fastify.register(browserInstancePlugin);
await fastify.register(sessionPlugin);
await fastify.register(fileStoragePlugin);

Route Organization

Routes are organized by functionality:

  • Actions (/v1/*): Browser automation actions (scrape, screenshot, PDF)
  • Sessions (/v1/sessions/*): Session management
  • CDP (/v1/cdp/*): Direct CDP access
  • Files (/v1/files/*): File upload/download
  • Selenium (/selenium/*): Selenium WebDriver compatibility

Schema Validation

All API endpoints use Zod schemas for validation:

const ScrapeRequestSchema = z.object({
  url: z.string().url().optional(),
  delay: z.number().optional(),
  format: z.array(z.enum(['html','cleaned_html','markdown','readability'])).optional(),
  screenshot: z.boolean().optional(),
  pdf: z.boolean().optional(),
});

🎨 Frontend Architecture

React Component Structure

src/
├── components/           # Reusable UI components
│   ├── ui/              # Base UI components (buttons, inputs)
│   ├── badges/          # Status badges
│   ├── icons/           # Icon components
│   └── sessions/        # Session-specific components
├── containers/          # Page-level containers
├── contexts/           # React contexts for state management
├── hooks/              # Custom React hooks
└── steel-client/       # Auto-generated API client

State Management

  • React Query: Server state management and caching
  • React Context: Global application state
  • Local State: Component-specific state with hooks

Real-time Updates

WebSocket connections provide real-time updates. Session lists, as in this example, are kept current by polling with React Query:

// Session monitoring
const { data: sessions } = useQuery({
  queryKey: ['sessions'],
  queryFn: () => steelClient.sessions.getSessions(),
  refetchInterval: 1000 // Poll every second for near-real-time updates
});

🔒 Security Architecture

Input Validation

  • API Level: Zod schema validation for all inputs
  • Browser Level: Content Security Policy (CSP) headers
  • File Level: File type validation and size limits

Context Isolation

Each session runs in an isolated browser context:

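// Puppeteer before v22; later versions rename this to browser.createBrowserContext()
const context = await browser.createIncognitoBrowserContext();
const page = await context.newPage();
// In Puppeteer, default timeouts are set on the page, not on the context
page.setDefaultNavigationTimeout(30000);
page.setDefaultTimeout(30000);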

Resource Limits

  • Memory: Browser process memory limits
  • CPU: Process CPU throttling
  • Storage: Session-scoped file storage limits
  • Network: Request rate limiting and proxy support

📊 Performance Considerations

Browser Resource Management

  • Session Isolation: Each session runs in a separate browser context
  • Memory Cleanup: Automatic page and context cleanup
  • Connection Pooling: Reuse CDP connections where possible

Caching Strategy

  • Static Assets: Long-term caching for UI assets
  • API Responses: Short-term caching for session data
  • Browser Cache: Configurable per-session browser caching

Scaling Considerations

The current architecture supports:

  • Vertical Scaling: Multi-core CPU utilization
  • Session Concurrency: Multiple simultaneous sessions
  • Resource Monitoring: Memory and CPU usage tracking

Future scaling options:

  • Horizontal Scaling: Multiple Steel instances
  • Load Balancing: Session distribution
  • Distributed Storage: Shared file storage

🧪 Testing Architecture

Test Structure (Planned)

tests/
├── unit/               # Unit tests for individual components
├── integration/        # API endpoint integration tests
├── e2e/               # End-to-end browser automation tests
└── performance/       # Load and performance tests

Testing Strategy

  • Unit Tests: Core services and utilities
  • Integration Tests: API endpoints and database interactions
  • E2E Tests: Full browser automation workflows
  • Performance Tests: Load testing and benchmarking

🔧 Configuration Management

Environment Variables

Configuration is read from environment variables and validated with a Zod schema:

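import { z } from 'zod';

const envSchema = z.object({
  NODE_ENV: z.enum(['development', 'production', 'test']),
  HOST: z.string().default('0.0.0.0'),
  PORT: z.string().default('3000'),
  CHROME_EXECUTABLE_PATH: z.string().optional(),
  // Environment variables are always strings, so convert the flag explicitly
  CHROME_HEADLESS: z.enum(['true', 'false']).default('true').transform((v) => v === 'true'),
  // ... more configuration options
});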
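Environment variables reach the process as strings, so boolean flags such as CHROME_HEADLESS need explicit conversion before use. The dependency-free helper below sketches one way to do that; `parseBooleanEnv` and the accepted spellings are assumptions, not the project's actual parsing rules.

```typescript
// Convert an environment variable value to a boolean.
// Accepts "true"/"1" and "false"/"0" (case-insensitive).
// Missing or empty values yield the fallback; anything else is an error.
function parseBooleanEnv(value: string | undefined, fallback: boolean): boolean {
  if (value === undefined || value.trim() === "") {
    return fallback;
  }
  const normalized = value.trim().toLowerCase();
  if (normalized === "true" || normalized === "1") return true;
  if (normalized === "false" || normalized === "0") return false;
  throw new Error(`Invalid boolean value: "${value}"`);
}
```

A call such as `parseBooleanEnv(process.env.CHROME_HEADLESS, true)` would then default to headless mode. Rejecting unknown spellings, instead of silently treating them as false, surfaces typos at startup.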

Runtime Configuration

  • Browser Options: Per-session browser configuration
  • Plugin Configuration: Dynamic plugin options
  • Feature Flags: Runtime feature toggling

🚀 Deployment Architecture

Containerization

Multi-stage Docker builds keep build tooling out of the production image:

# Build stage
FROM node:22-slim AS build
# ... build steps

# Production stage  
FROM node:22-slim AS production
# ... production setup

Service Dependencies

  • Chrome/Chromium: Browser engine
  • Node.js: Runtime environment
  • Nginx: Reverse proxy (in containers)
  • File System: Session storage

🔄 Development Workflow

Hot Reloading

The development environment supports hot reloading:

npm run dev  # Starts both API and UI with hot reload

Debug Configuration

Built-in debugging support:

# API debugging
node --inspect ./api/build/index.js

# Enable verbose logging
ENABLE_VERBOSE_LOGGING=true npm run dev -w api

📈 Monitoring and Observability

Logging

Structured logging with Pino:

fastify.log.info({ 
  sessionId, 
  action: 'page_created',
  url: page.url() 
}, 'New page created');

Metrics (Planned)

  • Session Metrics: Creation, duration, success rates
  • Performance Metrics: Response times, resource usage
  • Error Tracking: Error rates and categorization

Health Checks

Built-in health check endpoints:

// Basic health check
GET /health

// Detailed readiness check
GET /ready
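The two endpoints usually answer different questions: `/health` reports that the process is up, while `/ready` reports whether it can serve requests. The response shapes are not specified here, so the sketch below uses a hypothetical `readinessStatus` function and made-up check names purely to illustrate how a readiness result might be aggregated.

```typescript
// Outcome of one dependency check, for example "browser" or "storage".
type CheckResult = { name: string; ok: boolean };

interface ReadinessReport {
  status: "ready" | "not_ready";
  failing: string[]; // names of the checks that did not pass
}

// The service counts as ready only when every dependency check passes.
function readinessStatus(checks: CheckResult[]): ReadinessReport {
  const failing = checks.filter((c) => !c.ok).map((c) => c.name);
  return { status: failing.length === 0 ? "ready" : "not_ready", failing };
}
```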

This architecture provides a solid foundation for browser automation while maintaining flexibility for future enhancements and scaling requirements.