# Find Architecture

This page covers the implementation details behind PinchTab's semantic `find` pipeline.

## Overview

The `find` system converts accessibility snapshot nodes into lightweight descriptors, scores them against a natural-language query, and returns the best matching `ref`.

The implementation is designed to stay:

- local
- fast
- dependency-light
- recoverable after page re-renders

## Pipeline

```text
accessibility snapshot
-> element descriptors
-> lexical matcher
-> embedding matcher
-> combined score
-> best ref
-> intent cache / recovery hooks
```

## Element Descriptors

Each accessibility node is converted into a descriptor with:

- `ref`
- `role`
- `name`
- `value`

Those fields are also combined into a composite string used for matching.
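
A descriptor of this shape, together with its composite string, can be sketched as follows (a minimal Python sketch; the `Descriptor` class, the `composite` method, and the choice to omit `ref` from the matching text are illustrative assumptions, not PinchTab's actual code):

```python
from dataclasses import dataclass

@dataclass
class Descriptor:
    ref: str    # stable handle used to act on the element later
    role: str   # accessibility role, e.g. "button"
    name: str   # accessible name, e.g. "Submit"
    value: str  # current value, if any

    def composite(self) -> str:
        # Join the non-empty text fields into one string for the matchers.
        # (`ref` is left out of the matching text here; an assumption.)
        return " ".join(p for p in (self.role, self.name, self.value) if p)

d = Descriptor(ref="e12", role="button", name="Submit", value="")
print(d.composite())  # -> "button Submit"
```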

## Matchers

PinchTab currently uses a combined matcher built from:

- a lexical matcher
- an embedding matcher based on a hashing embedder

The default weighting is:

```text
0.6 lexical + 0.4 embedding
```

Per-request overrides are available through `lexicalWeight` and `embeddingWeight`.

## Lexical Side

The lexical matcher focuses on exact and near-exact token overlap, including role-aware matching behavior.

Useful properties:

- strong for exact words
- easy to reason about
- good precision on explicit queries like `submit button`
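
In spirit, a token-overlap scorer with a role-aware bonus might look like this (a sketch of the general idea; the scoring formula and the 0.1 role bonus are assumptions, not PinchTab's actual matcher):

```python
def lexical_score(query: str, descriptor: dict) -> float:
    """Score a descriptor by token overlap with the query (0.0 to 1.0)."""
    q_tokens = set(query.lower().split())
    if not q_tokens:
        return 0.0
    text = " ".join((descriptor.get("role", ""),
                     descriptor.get("name", ""),
                     descriptor.get("value", "")))
    d_tokens = set(text.lower().split())
    score = len(q_tokens & d_tokens) / len(q_tokens)
    # Role-aware behavior (assumed form): boost when the query names the role.
    if descriptor.get("role", "").lower() in q_tokens:
        score = min(1.0, score + 0.1)
    return score

print(lexical_score("submit button", {"role": "button", "name": "Submit"}))  # -> 1.0
```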

## Embedding Side

The embedding matcher uses a feature-hashing approach rather than an external ML model.

Useful properties:

- catches fuzzy similarity
- handles partial and sub-word overlap better
- has no model download or network dependency
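
Feature hashing over character trigrams is one way to build such an embedder with no model download (a sketch of the technique; the dimension, n-gram size, and hash choice are assumptions about the approach, not PinchTab's parameters):

```python
import hashlib
import math

def embed(text: str, dim: int = 1024) -> list[float]:
    """Hash character trigrams into a fixed-size vector, L2-normalized."""
    vec = [0.0] * dim
    padded = f" {text.lower()} "
    for i in range(len(padded) - 2):
        gram = padded[i:i + 3]
        bucket = int(hashlib.md5(gram.encode()).hexdigest(), 16) % dim
        vec[bucket] += 1.0
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]

def cosine(a: list[float], b: list[float]) -> float:
    # Vectors are unit-length, so the dot product is the cosine similarity.
    return sum(x * y for x, y in zip(a, b))

q = embed("submit button")
print(cosine(q, embed("submission btn")))   # sub-word overlap: clearly above zero
print(cosine(q, embed("privacy policy")))   # unrelated text: near zero
```

Because "submit" and "submission" share trigrams like `sub` and `ubm`, they land in the same hash buckets, which is exactly the partial and sub-word overlap a pure token matcher misses.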

## Combined Matching

The combined matcher runs lexical and embedding scoring concurrently, merges results by element ref, and applies the weighted final score.

It also uses a lower internal threshold before the final merge so that candidates that are strong on only one side are not discarded too early.
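
The merge step can be sketched as follows (shown sequentially rather than concurrently; the threshold values and function shape are assumptions):

```python
def merge_by_ref(lexical_results: dict, embedding_results: dict,
                 w_lex: float = 0.6, w_emb: float = 0.4,
                 internal_threshold: float = 0.05,
                 final_threshold: float = 0.3) -> dict:
    """Merge per-matcher scores keyed by ref, then apply the weighted score.

    The internal threshold is deliberately lower than the final one, so a
    candidate that is strong on only one side survives into the merge.
    """
    merged = {}
    for ref in set(lexical_results) | set(embedding_results):
        lex = lexical_results.get(ref, 0.0)
        emb = embedding_results.get(ref, 0.0)
        if lex < internal_threshold and emb < internal_threshold:
            continue  # weak on both sides: drop early
        merged[ref] = w_lex * lex + w_emb * emb
    return {ref: s for ref, s in merged.items() if s >= final_threshold}

scores = merge_by_ref({"e1": 0.9, "e2": 0.02}, {"e1": 0.1, "e3": 0.8})
# e1 scores on both sides; e3 scores only on the embedding side but survives;
# e2 is below the internal threshold on both sides and is dropped early.
```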

## Snapshot Dependency

`find` depends on the same accessibility snapshot/ref-cache infrastructure used by snapshot-driven interaction.

If a cached snapshot is missing, the handler tries to refresh it automatically before giving up.
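
That refresh-on-miss behavior amounts to something like the following (a sketch; the cache dict and `refresh` callable are hypothetical stand-ins for the real snapshot/ref-cache infrastructure):

```python
def get_snapshot(tab_id: str, cache: dict, refresh):
    """Return the cached snapshot for a tab, refreshing once before giving up.

    `cache` maps tab ids to snapshots; `refresh(tab_id)` re-captures the
    accessibility tree and returns the new snapshot (or None on failure).
    """
    snapshot = cache.get(tab_id)
    if snapshot is None:
        snapshot = refresh(tab_id)  # one automatic refresh attempt
        if snapshot is not None:
            cache[tab_id] = snapshot
    if snapshot is None:
        raise LookupError(f"no accessibility snapshot for tab {tab_id}")
    return snapshot
```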

## Intent Cache And Recovery

After a successful match, PinchTab records:

- the original query
- the matched descriptor
- score/confidence metadata

This allows recovery logic to attempt a semantic re-match if a later action fails because the old ref became stale after a page update.
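
With those three pieces recorded, recovery is essentially a re-run of the original match (a sketch; the cache shape and the `find_again` callable are assumptions):

```python
intent_cache: dict = {}

def record_intent(ref: str, query: str, descriptor: dict, score: float) -> None:
    # Remember what the query asked for and what it resolved to.
    intent_cache[ref] = {"query": query, "descriptor": descriptor, "score": score}

def recover(stale_ref: str, find_again):
    """Re-match the original query when an action fails on a stale ref.

    `find_again(query)` runs the find pipeline against a fresh snapshot.
    """
    intent = intent_cache.get(stale_ref)
    if intent is None:
        return None  # nothing recorded for this ref; cannot recover
    return find_again(intent["query"])

record_intent("e7", "submit button", {"role": "button", "name": "Submit"}, 0.92)
new_ref = recover("e7", lambda query: "e21")  # fresh snapshot assigned a new ref
```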

## Orchestrator Routing

The orchestrator exposes `POST /tabs/{id}/find` and proxies it to the correct running instance. The actual matching implementation remains in the shared handler layer.
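
The routing layer is then little more than a lookup plus a forward (a sketch; the instance registry and `forward` callable are hypothetical, and the matching logic itself stays in the shared handler):

```python
def route_find(tab_id: str, payload: dict, instances: dict, forward):
    """Proxy a POST /tabs/{id}/find request to the instance owning the tab.

    `instances` maps tab ids to instance addresses; `forward` performs the
    actual request against the shared find handler on that instance.
    """
    address = instances.get(tab_id)
    if address is None:
        return {"status": 404, "error": f"unknown tab {tab_id}"}
    return forward(address, f"/tabs/{tab_id}/find", payload)
```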

## Design Constraints

The current design intentionally avoids:

- external embedding services
- heavyweight model dependencies
- selector-first coupling

That keeps the system portable and fast, but it also means the quality ceiling is bounded by the in-process matcher design and the quality of the accessibility snapshot.

## Performance

Benchmarks on an Intel i5-4300U @ 1.90GHz:

| Operation | Elements | Latency | Allocations |
| --- | --- | --- | --- |
| Lexical Find | 16 | ~71 µs | 134 allocs |
| HashingEmbedder (single) | 1 | ~11 µs | 3 allocs |
| HashingEmbedder (batch) | 16 | ~171 µs | 49 allocs |
| Embedding Find | 16 | ~180 µs | 98 allocs |
| **Combined Find** | **16** | **~233 µs** | **263 allocs** |
| Combined Find | 100 | ~1.5 ms | 1685 allocs |