# Find Architecture
This page covers the implementation details behind PinchTab's semantic `find` pipeline.
## Overview
The `find` system converts accessibility snapshot nodes into lightweight descriptors, scores them against a natural-language query, and returns the best matching `ref`.
The implementation is designed to stay:
- local
- fast
- dependency-light
- recoverable after page re-renders
## Pipeline
```text
accessibility snapshot
-> element descriptors
-> lexical matcher
-> embedding matcher
-> combined score
-> best ref
-> intent cache / recovery hooks
```
## Element Descriptors
Each accessibility node is converted into a descriptor with:
- `ref`
- `role`
- `name`
- `value`
Those fields are also combined into a composite string used for matching.
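A minimal sketch of this descriptor shape (the class name, field layout, and `composite` helper are assumptions for illustration, not the actual implementation):

```python
from dataclasses import dataclass

@dataclass
class ElementDescriptor:
    ref: str    # stable handle into the snapshot's ref cache
    role: str   # accessibility role, e.g. "button"
    name: str   # accessible name
    value: str  # current value, if any

    def composite(self) -> str:
        # Combine the non-empty fields into one lowercase string
        # that both matchers score against.
        return " ".join(p for p in (self.role, self.name, self.value) if p).lower()

btn = ElementDescriptor(ref="e12", role="button", name="Submit", value="")
```

Here `btn.composite()` yields `"button submit"`, which is what the query is matched against.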
## Matchers
PinchTab currently uses a combined matcher built from:
- a lexical matcher
- an embedding matcher based on a hashing embedder
Default weighting is:
```text
0.6 lexical + 0.4 embedding
```
Both weights can be overridden per request via the `lexicalWeight` and `embeddingWeight` parameters.
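The weighting itself is just a linear combination; a sketch (the function name and defaults mirror the documented `0.6 lexical + 0.4 embedding` split, everything else is assumed):

```python
def combined_score(lexical: float, embedding: float,
                   lexical_weight: float = 0.6,
                   embedding_weight: float = 0.4) -> float:
    # Weighted sum of the two matcher scores; the weights correspond
    # to the per-request lexicalWeight / embeddingWeight overrides.
    return lexical_weight * lexical + embedding_weight * embedding

# Default weighting: 0.6 * 0.9 + 0.4 * 0.5 = 0.74
score = combined_score(0.9, 0.5)
```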
## Lexical Side
The lexical matcher focuses on exact and near-exact token overlap, including role-aware matching behavior.
Useful properties:
- strong for exact words
- easy to reason about
- good precision on explicit queries like `submit button`
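In its simplest form, exact token overlap can be sketched like this (a toy version: the real matcher also includes role-aware behavior, which is omitted here):

```python
def lexical_score(query: str, composite: str) -> float:
    # Fraction of query tokens that appear verbatim in the element's
    # composite string -- high precision on explicit queries.
    q_tokens = set(query.lower().split())
    c_tokens = set(composite.lower().split())
    return len(q_tokens & c_tokens) / len(q_tokens) if q_tokens else 0.0
```

For the query `submit button` against a descriptor whose composite is `"button submit"`, every query token matches, so the score is 1.0.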
## Embedding Side
The embedding matcher uses a feature-hashing approach rather than an external ML model.
Useful properties:
- catches fuzzy similarity
- handles partial and sub-word overlap better
- has no model download or network dependency
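The idea behind feature hashing can be illustrated with character trigrams bucketed into a fixed-size vector (a sketch of the general technique, not PinchTab's actual embedder; the dimension, hash function, and n-gram size are assumptions):

```python
import hashlib
import math

def hash_embed(text: str, dims: int = 256) -> list:
    # Feature hashing: bucket character trigrams into a fixed-size
    # vector. No model weights, no network dependency; similar strings
    # share trigrams, so partial/sub-word overlap falls out naturally.
    vec = [0.0] * dims
    padded = f" {text.lower()} "
    for i in range(len(padded) - 2):
        trigram = padded[i:i + 3]
        bucket = int(hashlib.md5(trigram.encode()).hexdigest(), 16) % dims
        vec[bucket] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

def cosine(a: list, b: list) -> float:
    # Vectors are already unit-normalized, so the dot product suffices.
    return sum(x * y for x, y in zip(a, b))
```

This is why the embedding side handles fuzzy queries: `"submit"` and `"submitting"` share most of their trigrams, so their vectors are close even though the tokens differ.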
## Combined Matching
The combined matcher runs lexical and embedding scoring concurrently, merges results by element ref, and applies the weighted final score.
It also applies a lower internal threshold before the final merge, so candidates that are strong on only one side are not discarded too early.
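The merge-by-ref step with the two-threshold behavior can be sketched as follows (the threshold values and function shape are assumptions chosen to illustrate the described behavior):

```python
def merge_candidates(lexical: dict, embedding: dict,
                     lw: float = 0.6, ew: float = 0.4,
                     pre_threshold: float = 0.1,
                     final_threshold: float = 0.3):
    # Take the union of refs from both matchers: a candidate that is
    # strong on only one side still enters the merge, because the
    # pre-filter threshold is deliberately low. The real cut is
    # applied to the weighted combined score.
    merged = {}
    for ref in set(lexical) | set(embedding):
        l = lexical.get(ref, 0.0)
        e = embedding.get(ref, 0.0)
        if max(l, e) < pre_threshold:
            continue  # weak on both sides: drop early
        score = lw * l + ew * e
        if score >= final_threshold:
            merged[ref] = score
    return max(merged, key=merged.get) if merged else None
```

For example, a ref scoring 0.9 lexically but 0.0 on the embedding side still survives the pre-filter and can win on its combined score of 0.54.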
## Snapshot Dependency
`find` depends on the same accessibility snapshot/ref-cache infrastructure used by snapshot-driven interaction.
If a cached snapshot is missing, the handler tries to refresh it automatically before giving up.
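The refresh-before-giving-up behavior amounts to a single retry around the cache lookup; a sketch, with every name (`take_snapshot`, `match`, the cache shape) assumed for illustration:

```python
def find_with_refresh(query, snapshot_cache, tab_id, take_snapshot, match):
    # If the cached snapshot is missing, refresh it once before
    # giving up, then run the match against the fresh snapshot.
    snapshot = snapshot_cache.get(tab_id)
    if snapshot is None:
        snapshot = take_snapshot(tab_id)  # re-snapshot the page
        if snapshot is None:
            raise LookupError(f"no snapshot available for tab {tab_id}")
        snapshot_cache[tab_id] = snapshot
    return match(query, snapshot)
```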
## Intent Cache And Recovery
After a successful match, PinchTab records:
- the original query
- the matched descriptor
- score/confidence metadata
This allows recovery logic to attempt a semantic re-match if a later action fails because the old ref became stale after a page update.
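A sketch of what such a cached intent record and a re-match attempt might look like (the record fields follow the list above; the `recover` helper, its `match` callback, and the confidence floor are assumptions):

```python
from dataclasses import dataclass

@dataclass
class IntentRecord:
    query: str      # the original natural-language query
    ref: str        # the ref that matched at the time
    composite: str  # the matched descriptor's composite string
    score: float    # score/confidence metadata from the match

def recover(record, fresh_descriptors, match, min_score: float = 0.5):
    # The old ref went stale after a re-render: re-run the semantic
    # match with the original query against the fresh snapshot, and
    # only accept a sufficiently confident result.
    ref, score = match(record.query, fresh_descriptors)
    return ref if score >= min_score else None
```

The key point is that recovery keys off the stored query, not the stale ref, so a re-rendered page can still resolve to the semantically equivalent element.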
## Orchestrator Routing
The orchestrator exposes `POST /tabs/{id}/find` and proxies it to the correct running instance. The actual matching implementation remains in the shared handler layer.
## Design Constraints
The current design intentionally avoids:
- external embedding services
- heavyweight model dependencies
- selector-first coupling
That keeps the system portable and fast, but it also means the quality ceiling is bounded by the in-process matcher design and the quality of the accessibility snapshot.
## Performance
Benchmarks on Intel i5-4300U @ 1.90GHz:
| Operation | Elements | Latency | Allocations |
| --- | --- | --- | --- |
| Lexical Find | 16 | ~71 µs | 134 allocs |
| HashingEmbedder (single) | 1 | ~11 µs | 3 allocs |
| HashingEmbedder (batch) | 16 | ~171 µs | 49 allocs |
| Embedding Find | 16 | ~180 µs | 98 allocs |
| **Combined Find** | **16** | **~233 µs** | **263 allocs** |
| Combined Find | 100 | ~1.5 ms | 1685 allocs |