# Lite Engine
PinchTab includes a **Lite Engine** that performs DOM capture (navigate, snapshot,
text extraction, click, and type) without requiring Chrome or Chromium. It is
powered by [Gost-DOM](https://github.com/gost-dom/browser) (v0.11.0, MIT), a headless
browser written in pure Go.
**Issue:** [#201](https://github.com/pinchtab/pinchtab/issues/201)
---
## Why a Lite Engine?
Chrome is the default execution backend for PinchTab. A real browser session handles
JavaScript rendering, bot-detection bypass, screenshots, and PDF generation. For many
workloads (static sites, wikis, news articles, APIs), none of these are needed.
| Property | Chrome | Lite |
|----------|--------|------|
| Memory per instance | ~200 MB | ~10 MB |
| Cold-start latency | 1–6 seconds | <100 ms |
| JavaScript rendering | yes | no |
| Screenshots / PDF | yes | no |
| Runs without Chrome installed | no | **yes** |
Lite wins at DOM-only workloads (3–4× faster navigate, 3× faster snapshot) and is the
right choice for containers, CI pipelines, and edge environments where Chrome is not
available.
---
## Architecture
### Engine Interface
All engines implement a common interface defined in `internal/engine/engine.go`:
```go
type Engine interface {
	Name() string
	Navigate(ctx context.Context, url string) (*NavigateResult, error)
	Snapshot(ctx context.Context, filter string) ([]SnapshotNode, error)
	Text(ctx context.Context) (string, error)
	Click(ctx context.Context, ref string) error
	Type(ctx context.Context, ref, text string) error
	Capabilities() []Capability
	Close() error
}
```
The Chrome engine wraps the existing CDP/chromedp pipeline. `LiteEngine` in
`internal/engine/lite.go` implements the same interface using Gost-DOM.
### Router (Strategy Pattern)
```
Request → Router → [Rule 1] → [Rule 2] → … → [Fallback Rule] → Engine
```
`Router` in `internal/engine/router.go` evaluates an ordered chain of `RouteRule`
implementations. The first rule that returns a non-`Undecided` verdict wins. Rules
are registered at startup and are hot-swappable via `AddRule()` / `RemoveRule()`.
No handler, bridge, or config change is needed when adding new routing logic: only a
`RouteRule` implementation and a single `router.AddRule(myRule)` call.
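
The first-match rule chain described above can be sketched as follows. This is a minimal illustration, not the code in `internal/engine/router.go`: the `Verdict` type, the `Decide` method signature, the `UseLite` / `UseChrome` constants, and the `funcRule` adapter are all assumptions made for the example, and the locking a hot-swappable router would need is omitted.

```go
package main

import "fmt"

// Verdict is a hypothetical routing decision; the real type may differ.
type Verdict int

const (
	Undecided Verdict = iota // the rule has no opinion; try the next one
	UseLite
	UseChrome
)

// RouteRule is an assumed shape for a routing rule.
type RouteRule interface {
	Name() string
	Decide(op, url string) Verdict
}

// Router evaluates rules in order; the first non-Undecided verdict wins.
type Router struct {
	rules []RouteRule
}

// AddRule appends a rule to the end of the chain.
func (r *Router) AddRule(rule RouteRule) { r.rules = append(r.rules, rule) }

// RemoveRule drops the first rule with the given name and reports whether it was found.
func (r *Router) RemoveRule(name string) bool {
	for i, rule := range r.rules {
		if rule.Name() == name {
			r.rules = append(r.rules[:i], r.rules[i+1:]...)
			return true
		}
	}
	return false
}

// Route returns the first decisive verdict, or Undecided if no rule decides.
func (r *Router) Route(op, url string) Verdict {
	for _, rule := range r.rules {
		if v := rule.Decide(op, url); v != Undecided {
			return v
		}
	}
	return Undecided
}

// funcRule adapts a plain function to RouteRule (illustrative helper).
type funcRule struct {
	name string
	fn   func(op, url string) Verdict
}

func (f funcRule) Name() string                  { return f.name }
func (f funcRule) Decide(op, url string) Verdict { return f.fn(op, url) }

func main() {
	r := &Router{}
	r.AddRule(funcRule{"capability", func(op, _ string) Verdict {
		if op == "screenshot" || op == "pdf" {
			return UseChrome
		}
		return Undecided
	}})
	r.AddRule(funcRule{"default-lite", func(_, _ string) Verdict { return UseLite }})

	fmt.Println(r.Route("screenshot", "https://example.com") == UseChrome) // true
	fmt.Println(r.Route("snapshot", "https://example.com") == UseLite)     // true
}
```

Because rules are tried in registration order, a catch-all rule has to be added last; anything registered after it is never consulted.
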
### Built-in Rules
| Rule | File | Behaviour |
|------|------|-----------|
| `CapabilityRule` | `rules.go` | Routes `screenshot`, `pdf`, `evaluate`, `cookies` → Chrome |
| `ContentHintRule` | `rules.go` | Routes URLs ending in `.html`, `.htm`, `.xml`, `.txt`, `.md` → Lite |
| `DefaultLiteRule` | `rules.go` | Catch-all: all remaining DOM ops → Lite (used in `lite` mode) |
| `DefaultChromeRule` | `rules.go` | Final fallback → Chrome (used in `chrome` and `auto` modes) |
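
As an illustration of the kind of check `ContentHintRule` performs, the sketch below tests whether a URL path ends in one of the listed extensions. The helper name `looksStatic` is hypothetical, and ignoring the query string and matching case-insensitively are assumptions; the actual rule in `rules.go` may behave differently.

```go
package main

import (
	"fmt"
	"net/url"
	"path"
	"strings"
)

// staticExts mirrors the extensions listed for ContentHintRule in the table.
var staticExts = map[string]bool{
	".html": true, ".htm": true, ".xml": true, ".txt": true, ".md": true,
}

// looksStatic reports whether the URL path ends in a static-content extension.
// Query string and fragment are ignored; unparseable URLs yield false.
func looksStatic(rawURL string) bool {
	u, err := url.Parse(rawURL)
	if err != nil {
		return false
	}
	return staticExts[strings.ToLower(path.Ext(u.Path))]
}

func main() {
	fmt.Println(looksStatic("https://example.com/docs/README.md?plain=1")) // true
	fmt.Println(looksStatic("https://example.com/app"))                    // false
}
```

A URL with no recognised extension is left for the later rules in the chain to decide.
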
### Three Modes
| Mode | Behaviour |
|------|-----------|
| `chrome` | All requests go through Chrome. Backward-compatible default. |
| `lite` | DOM operations (navigate, snapshot, text, click, type) use Gost-DOM. Screenshot / PDF / evaluate / cookies fall through to Chrome (501 if Chrome is unavailable). |
| `auto` | Per-request routing via rules: capability and content-hint rules are evaluated first; unknown URLs fall back to Chrome. |
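
One way to read the two tables together is as a per-mode rule chain. The sketch below is an inference from the descriptions above, not the wiring in `cmd/pinchtab/cmd_bridge.go`; the function name `rulesFor` is hypothetical, and treating an empty mode as `chrome` is an assumption based on `chrome` being described as the backward-compatible default.

```go
package main

import "fmt"

// rulesFor returns rule names in evaluation order for a mode.
// The ordering is inferred from the documentation tables (assumption).
// Unknown modes yield nil.
func rulesFor(mode string) []string {
	switch mode {
	case "", "chrome":
		return []string{"DefaultChromeRule"}
	case "lite":
		return []string{"CapabilityRule", "DefaultLiteRule"}
	case "auto":
		return []string{"CapabilityRule", "ContentHintRule", "DefaultChromeRule"}
	default:
		return nil
	}
}

func main() {
	for _, m := range []string{"chrome", "lite", "auto"} {
		fmt.Println(m, rulesFor(m))
	}
}
```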
---
## Request Flow (Lite Mode)
```
POST /navigate (server.engine=lite)
  │
  ▼
handlers/navigation.go: HandleNavigate()
  │
  ├─ useLite() == true
  │     │
  │     ▼
  │   LiteEngine.Navigate(ctx, url)
  │     ├─ HTTP GET url
  │     ├─ Strip <script> tags (x/net/html tokenizer)
  │     ├─ browser.NewWindowReader(reader) [Gost-DOM]
  │     └─ return NavigateResult{TabID, URL, Title}
  │
  └─ w.Header().Set("X-Engine", "lite")
     JSON {"tabId": "lp-1", "url": "…", "title": "…"}
```
Snapshot then traverses the Gost-DOM document tree and maps HTML semantics to
accessibility roles (heading, link, button, textbox, …). Text walks the same tree and
collapses whitespace runs.
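
The role mapping and whitespace collapsing described above might look roughly like this. It is a simplified sketch using only the standard library: the function names `roleFor` and `collapseSpace` are hypothetical, the tag table is deliberately incomplete, and the real mapping in `lite.go` presumably also considers attributes (for example, the `type` of an `<input>`).

```go
package main

import (
	"fmt"
	"strings"
)

// roleFor maps an HTML tag name to an accessibility role.
// The table is illustrative and far from complete.
func roleFor(tag string) string {
	switch strings.ToLower(tag) {
	case "h1", "h2", "h3", "h4", "h5", "h6":
		return "heading"
	case "a":
		return "link"
	case "button":
		return "button"
	case "input", "textarea":
		return "textbox"
	default:
		return "generic"
	}
}

// collapseSpace replaces every run of whitespace with one space and trims both ends.
func collapseSpace(s string) string {
	return strings.Join(strings.Fields(s), " ")
}

func main() {
	fmt.Println(roleFor("H2"))                               // heading
	fmt.Printf("%q\n", collapseSpace("  Lite \n\t Engine ")) // "Lite Engine"
}
```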
---
## Capability Boundaries
| Operation | Lite | Chrome |
|-----------|------|--------|
| Navigate | ✅ (HTTP fetch + DOM parse) | ✅ |
| Snapshot | ✅ | ✅ |
| Text extraction | ✅ | ✅ |
| Click | ✅ (DOM event dispatch) | ✅ |
| Type | ✅ (DOM input events) | ✅ |
| Screenshot | ❌ → `501 Not Implemented` | ✅ |
| PDF | ❌ → `501 Not Implemented` | ✅ |
| Evaluate (JS) | ❌ → `501 Not Implemented` | ✅ |
| Cookies | ❌ → `501 Not Implemented` | ✅ |
| JavaScript-rendered SPAs | ❌ | ✅ |
| Bot-detection bypass | ❌ | ✅ |
`CapabilityRule` ensures screenshot/pdf/evaluate/cookies are always routed to Chrome
even in `lite` mode.
---
## Known Limitations
| Limitation | Detail |
|------------|--------|
| `<script>` tags | Gost-DOM panics on an uninitialized `ScriptHost`. Scripts are stripped before parsing with the `x/net/html` tokenizer. |
| `<a href>` click | Gost-DOM navigates on anchor click and may encounter scripts. `Click()` wraps execution in a deferred `recover()` and returns an error instead of panicking. |
| CSS `display:none` | Lite has no CSS engine, so hidden elements still appear in the snapshot. |
| JavaScript-rendered content | Only the initial HTML is captured. SPAs (React, Next.js, etc.) should use Chrome. |
| Sites that block HTTP bots | Stack Overflow and similar sites return 4xx/5xx to plain HTTP clients. Chrome bypasses this via a real browser session. |
---
## Configuration
Set the engine in your config file:
```json
{
"server": {
"engine": "lite"
}
}
```
The `engine` field is also forwarded to child bridge instances so every managed
instance in a multi-instance deployment uses the same mode.
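
For illustration, reading and validating the `server.engine` field could be done as below. The `config` struct and `parseEngine` function are hypothetical names for this sketch rather than PinchTab's actual config types, and falling back to `chrome` when the field is absent is an assumption based on `chrome` being the backward-compatible default.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// config models only the fragment of the config file shown above.
type config struct {
	Server struct {
		Engine string `json:"engine"`
	} `json:"server"`
}

// parseEngine extracts server.engine and checks it against the three modes.
func parseEngine(data []byte) (string, error) {
	var c config
	if err := json.Unmarshal(data, &c); err != nil {
		return "", err
	}
	switch c.Server.Engine {
	case "":
		return "chrome", nil // assumed default when the field is missing
	case "chrome", "lite", "auto":
		return c.Server.Engine, nil
	}
	return "", fmt.Errorf("unknown server.engine %q", c.Server.Engine)
}

func main() {
	mode, err := parseEngine([]byte(`{"server": {"engine": "lite"}}`))
	fmt.Println(mode, err) // lite <nil>
}
```

Rejecting unknown values at startup keeps a typo in the config file from silently selecting the wrong engine.
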
### Response Header
Responses served by the Lite engine include:
```
X-Engine: lite
```
This header is present on `navigate`, `snapshot`, and `text` responses when the lite
path was taken and is useful for observability and debugging.
---
## Performance
Benchmark across 8 real-world websites, running the Navigate → Snapshot → Text
pipeline. Totals cover the 7 sites where both engines completed successfully:
| Metric | Lite | Chrome | Speedup |
|--------|-----:|-------:|--------:|
| Navigate total | 4,580 ms | 17,981 ms | **3.9×** faster |
| Snapshot total | 1,739 ms | 5,155 ms | **3.0×** faster |
| Text total | 925 ms | 500 ms | 0.5× (Chrome faster) |
| **Grand total** | **7,244 ms** | **23,636 ms** | **3.3× faster** |
Chrome is faster at text extraction because it runs Mozilla Readability.js in-browser.
Lite performs a raw DOM text walk, which is slower for very large pages (e.g. Wikipedia
CS: 687 ms for Lite vs 130 ms for Chrome).
### When to use each engine
| Workload | Recommendation |
|----------|---------------|
| Static sites, wikis, news, blogs | **Lite**: 3–12× faster, no Chrome overhead |
| JavaScript-rendered SPAs | **Chrome**: Lite captures pre-JS HTML only |
| Sites that block HTTP clients | **Chrome**: real browser bypasses bot detection |
| Large-page snapshot / traversal | **Lite**: 3× faster snapshot |
| Text extraction on large articles | **Chrome**: Readability.js is more accurate |
| Screenshots, PDF, evaluate, cookies | **Chrome**: not supported in Lite |
---
## Code Layout
| File | Purpose |
|------|---------|
| `internal/engine/engine.go` | `Engine` interface, `Capability` constants, `Mode` enum, `NavigateResult` / `SnapshotNode` types |
| `internal/engine/lite.go` | `LiteEngine` β€” HTTP fetch, script stripping, Gost-DOM parse, role mapping |
| `internal/engine/router.go` | `Router` β€” ordered rule chain, `AddRule` / `RemoveRule` |
| `internal/engine/rules.go` | `CapabilityRule`, `ContentHintRule`, `DefaultLiteRule`, `DefaultChromeRule` |
| `internal/handlers/navigation.go` | `useLite()` fast path, `X-Engine` header |
| `internal/handlers/snapshot.go` | `SnapshotNode → A11yNode` conversion for lite path |
| `internal/handlers/text.go` | Lite text fast path |
| `cmd/pinchtab/cmd_bridge.go` | Router wiring from `config.Engine` at startup |
---
## Dependency
| Package | Version | License | Purpose |
|---------|---------|---------|---------|
| `github.com/gost-dom/browser` | v0.11.0 | MIT | Headless browser: HTML parsing, DOM traversal, event dispatch |
| `github.com/gost-dom/css` | v0.1.0 | MIT | CSS selector evaluation |
| `golang.org/x/net` | existing | BSD-3 | HTML tokenizer used for script stripping |