Core refactoring of list-chat-models, and a plan for the follow-up.
plans/2025-08-01-model-filtering/4-worker-refactoring-async-iterator.md
CHANGED

@@ -103,3 +103,62 @@ Design rule: yield final `{ status: 'done' }` explicitly; do not rely on generat

This plan keeps the code minimal, uses language-native iterator cancellation, isolates side-effects to the generator, and leaves the worker wrapper trivial. Proceed to implement when ready.

## Follow-up improvements (post-cleanup)

After the extraction and wrapper are in place, the following three low-risk improvements should be applied as follow-ups. Each item below is written as an exact, small TODO with acceptance criteria and a short testing checklist.

1) Cleanup `boot-worker.js` (remove duplicated helpers)

- Goal: remove the leftover duplicate helper implementations from `boot-worker.js` so there is a single source of truth for `fetchConfigForModel` and `classifyModel` (the new action module). This reduces the maintenance surface and avoids accidental drift.
- Changes (exact):
  - Delete the `fetchConfigForModel` and `classifyModel` helper functions from `src/worker/boot-worker.js` (the ones that are no longer used after extraction).
  - Ensure the only import of the action is `import { listChatModelsIterator } from './list-chat-models.js';` and do not re-export the helpers from the action module unless tests need them. If other parts of `boot-worker.js` still need config/classify helpers, move the shared helpers to a small `src/worker/lib/model-utils.js` and import them from both places.
  - Run a quick grep for remaining references to the unused symbols (`fetchConfigForModel`, `classifyModel`) and remove any stale code.
- Tests / acceptance:
  - Build/lint passes with no unused-variable warnings for the removed symbols.
  - The `listChatModels` worker flow behaves identically after cleanup (run the smoke test: progress + final response + cancellation).
- Risk: trivial; mitigate by running the smoke test immediately after the change.

2) Adaptive concurrency on repeated 429 responses

- Goal: make the config-fetch promise pool responsive to Hugging Face rate limits by reducing parallelism when many 429s occur and gradually increasing it again when the rate improves.
- Design (practical, minimal; see the sketch after this item):
  - Instrument a small rate-limit counter in the action module: keep a sliding-window count of `configFetch.429` events (for example, track the timestamps of recent 429s in an array limited to the last 30s).
  - Add two simple thresholds: if the 429 count in the window reaches 10, reduce `effectiveConcurrency = Math.max(1, Math.floor(effectiveConcurrency / 2))` and mark `rateLimitedUntil = Date.now() + backoffWindowMs` (with `backoffWindowMs` of, say, 30s). If no 429s are recorded for `backoffWindowMs`, gradually restore `effectiveConcurrency = Math.min(initialConcurrency, effectiveConcurrency + 1)` every `backoffWindowMs / 2`.
  - Implementation notes: do not tear down and recreate the worker loops; instead implement a token/semaphore scheme where each worker must acquire a token before starting a config fetch. Maintain `tokenCount = effectiveConcurrency`. When the thresholds change, adjust `tokenCount` (release or withhold available tokens). This lets workers respect the new concurrency without tearing down the pool.
  - Track metrics: increment `counters.configFetch429++` and `counters.configFetch200++` for telemetry (simple numeric fields inside the action module). Expose the counters in the final meta when `params.debug` is true.
- Changes (exact):
  - Add `const counters = { configFetch429: 0, configFetch200: 0, configFetchError: 0 }` at the top of the `list-chat-models.js` action module.
  - Replace the fixed `workerCount = Math.min(concurrency, survivors.length || 1)` with a token-based semaphore driven by `effectiveConcurrency`, which the rate-limit detector can modify at runtime.
  - On each fetch response with a 429, update the sliding window and possibly reduce `effectiveConcurrency` and `tokenCount`; when `tokenCount` changes, wake waiting workers so they re-evaluate.
- Tests / acceptance:
  - Simulate many 429s using a mocked fetch; confirm `effectiveConcurrency` is reduced and fewer concurrent requests are outstanding (use the counters or a small instrumentation hook).
  - After a simulated quiet period, confirm `effectiveConcurrency` slowly ramps back up.
- Risk: medium (the bookkeeping is subtle), but it reduces to a simple token count plus counters; keep the logic intentionally conservative and well tested.
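
A minimal sketch of the adjustable semaphore and 429 detector described above. `AdjustableSemaphore` and `RateLimitDetector` are hypothetical names, and for simplicity the detector resets its window after backing off instead of tracking `rateLimitedUntil`:

```js
// Sketch only: adjustable token semaphore + sliding-window 429 detector.
class AdjustableSemaphore {
  constructor(tokens) {
    this.tokens = tokens;    // tokens currently available
    this.capacity = tokens;  // current effectiveConcurrency
    this.waiters = [];       // resolvers for workers waiting on a token
  }
  async acquire() {
    if (this.tokens > 0) { this.tokens--; return; }
    await new Promise(resolve => this.waiters.push(resolve));
  }
  release() {
    const next = this.waiters.shift();
    if (next) next();        // hand the token straight to a waiting worker
    else if (this.tokens < this.capacity) this.tokens++;
  }
  setCapacity(n) {
    const delta = n - this.capacity;
    this.capacity = n;
    // Growing: issue the new tokens so waiting workers wake up.
    for (let i = 0; i < delta; i++) this.release();
    // Shrinking: withhold tokens; in-flight fetches finish, but fewer restart.
    if (delta < 0) this.tokens = Math.max(0, this.tokens + delta);
  }
}

class RateLimitDetector {
  constructor(semaphore, initialConcurrency, { windowMs = 30000, threshold = 10 } = {}) {
    this.semaphore = semaphore;
    this.initialConcurrency = initialConcurrency;
    this.windowMs = windowMs;   // backoffWindowMs from the plan
    this.threshold = threshold; // 429s in the window before halving
    this.hits = [];             // timestamps of recent 429s
  }
  record429(now = Date.now()) {
    this.hits = this.hits.filter(t => now - t < this.windowMs);
    this.hits.push(now);
    if (this.hits.length >= this.threshold) {
      this.semaphore.setCapacity(Math.max(1, Math.floor(this.semaphore.capacity / 2)));
      this.hits = []; // start a fresh window after backing off
    }
  }
  maybeRampUp(now = Date.now()) {
    // Call every windowMs / 2: grow by one when the window has been quiet.
    const quiet = this.hits.every(t => now - t >= this.windowMs);
    if (quiet && this.semaphore.capacity < this.initialConcurrency) {
      this.semaphore.setCapacity(this.semaphore.capacity + 1);
    }
  }
}
```

Each pool worker then brackets its config fetch with `await sem.acquire(); try { ... } finally { sem.release(); }`, and a periodic `detector.maybeRampUp()` call restores capacity after quiet periods.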

3) Batching of progress messages to avoid UI flood

- Goal: reduce main-thread overhead when the generator emits many small per-model progress objects in a short time window by coalescing them into small batches sent by the wrapper.
- Design (minimal; see the sketch after this item):
  - Implement a tiny buffer in `boot-worker.js`'s wrapper around `self.postMessage` for progress messages: collect deltas into an array `batchBuffer` and flush every `BATCH_MS` (recommended 50ms) or when the buffer reaches `BATCH_MAX` (recommended 50 items).
  - Message shape: send `{ id, type: 'progress', batch: true, items: [ ...deltas ] }` for batched messages. Single-item messages of the form `{ id, type: 'progress', ...delta }` can be kept for backward compatibility, but consistent batching is preferable (the UI can accept either once updated). Document the change for the UI consumer and update the UI progress handler to accept `msg.batch ? msg.items : [msg]`.
- Implementation (exact steps):
  1. In `boot-worker.js` create `let batchBuffer = []; let batchTimer = null; const BATCH_MS = 50; const BATCH_MAX = 50;`
  2. Replace the immediate `self.postMessage(Object.assign({ id, type: 'progress' }, delta))` with `enqueueProgress(delta)`, where `enqueueProgress` pushes the delta into `batchBuffer`, starts the timer if it is not already running, and flushes when the length reaches `BATCH_MAX`.
  3. `flush` sends one message, `self.postMessage({ id, type: 'progress', batch: true, items: batchBuffer.splice(0) })`, and clears the timer.
  4. On `done` or `error`, ensure `flush()` is called synchronously before sending the `response`/`error` message so the UI receives the final state.
- Tests / acceptance:
  - Smoke test: when scanning many models, verify the UI receives fewer, larger progress messages (monitor the frequency in devtools) and still updates correctly.
  - Ensure cancellation still works and that `flush` is invoked on iterator termination.
- Risk: low; the main caution is updating the UI to accept `batch` messages (a small consumer change). For zero UI changes, the wrapper can detect whether `pendingEntry.onProgress` exists and deliver single items, but preferably it should call `onProgress({ batch: true, items })`.
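
A minimal sketch of steps 1-4 and the matching UI-side handler; `enqueueProgress`, `flush`, and `applyProgressDelta` are the hypothetical helper names used in the steps:

```js
// Sketch only: batching wrapper inside boot-worker.js.
let batchBuffer = [];
let batchTimer = null;
const BATCH_MS = 50;   // flush at most every 50ms
const BATCH_MAX = 50;  // or as soon as 50 deltas are buffered

function flush(id) {
  if (batchTimer) { clearTimeout(batchTimer); batchTimer = null; }
  if (batchBuffer.length === 0) return;
  // splice(0) empties the buffer and ships all pending deltas in one message
  self.postMessage({ id, type: 'progress', batch: true, items: batchBuffer.splice(0) });
}

function enqueueProgress(id, delta) {
  batchBuffer.push(delta);
  if (batchBuffer.length >= BATCH_MAX) { flush(id); return; }
  if (!batchTimer) batchTimer = setTimeout(() => flush(id), BATCH_MS);
}

// UI side: accept both batched and single progress messages.
function onProgressMessage(msg) {
  const items = msg.batch ? msg.items : [msg];
  for (const delta of items) applyProgressDelta(delta); // applyProgressDelta: existing UI update, assumed
}
```

Per step 4, the wrapper would call `flush(id)` synchronously before posting the final `response` or `error` message so no buffered deltas arrive after the terminal state.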

Order of follow-ups and rollout suggestion

- Apply the `boot-worker.js` cleanup first (trivial, low-risk). Run the smoke tests.
- Add batching next (low-risk). Update the UI progress handler to accept batch messages; smoke test with a large run to ensure the UI stays smooth.
- Add adaptive concurrency last (medium risk). Implement it with conservative thresholds and unit tests that simulate 429 storms.

Verification checklist after all follow-ups

- Lint/build: PASS
- Worker: streams progress in batches; the final `response` arrives unchanged
- Cancellation: `cancelListChatModels` aborts the iterator quickly, and no further progress messages are sent after the final cancellation response (or only those intentionally emitted during cleanup)
- Under repeated 429s: concurrency is reduced and total in-flight requests observably drop; counters are exposed in `meta` when `params.debug` is enabled

When you want, I can implement these three follow-ups in that order; tell me to proceed and I'll make small, focused edits and run quick smoke tests after each one.

src/app/worker-connection.js
CHANGED

@@ -99,8 +99,6 @@ export function workerConnection() {
       pending.delete(id);
       return reject(err);
     }
-    // also send via send to allow worker to reply with final response via same flow
-    send({ type: 'listChatModels', params }).then(resolve).catch(reject);
   });
 }

src/worker/boot-worker.js
CHANGED

@@ -1,6 +1,7 @@
 // @ts-check

 import { ModelCache } from './model-cache';
+import { listChatModelsIterator } from './list-chat-models.js';

 export function bootWorker() {
   const modelCache = new ModelCache();
@@ -71,199 +72,35 @@ export function bootWorker() {
   }
 }

-// Implementation of the listChatModels worker action
+// Implementation of the listChatModels worker action using the async-iterator action.
 async function handleListChatModels({ id, params = {} }) {
-  const PAGE_SIZE = 100;
-  const RETRIES = 3;
-  const BACKOFF_BASE_MS = 200;
-
-  let cancelled = false;
-  const abortControllers = new Set();
-  activeTasks.set(id, { abort: () => { cancelled = true; for (const c of abortControllers) c.abort(); } });
-
-  try {
-      for (let attempt = 0; attempt <= RETRIES && !ok && !cancelled; attempt++) {
-        try {
-          const controller = new AbortController();
-          abortControllers.add(controller);
-          const resp = await fetch(url, { signal: controller.signal, headers: hfToken ? { Authorization: `Bearer ${hfToken}` } : {} });
-          abortControllers.delete(controller);
-          if (resp.status === 429) {
-            const backoff = BACKOFF_BASE_MS * Math.pow(2, attempt);
-            await new Promise(r => setTimeout(r, backoff));
-            continue;
-          }
-          if (!resp.ok) throw new Error(`listing-fetch-failed:${resp.status}`);
-          const page = await resp.json();
-          if (!Array.isArray(page) || page.length === 0) { ok = true; break; }
-          listing.push(...page);
-          offset += PAGE_SIZE;
-          ok = true;
-        } catch (err) {
-          if (attempt === RETRIES) throw err;
-          await new Promise(r => setTimeout(r, BACKOFF_BASE_MS * Math.pow(2, attempt)));
-        }
-      }
-      if (!ok) break;
-    }
-
-    // send listing_done progress
-    self.postMessage({ id, type: 'progress', status: 'listing_done', data: { totalFound: listing.length } });
-
-    if (cancelled) {
-      activeTasks.delete(id);
-      return self.postMessage({ id, type: 'response', result: { cancelled: true } });
-    }
-
-    const survivors = [];
-    for (const m of listing) {
-      if (survivors.length >= maxCandidates) break;
-      const pipeline = m.pipeline_tag;
-      if (pipeline && denyPipeline.has(pipeline)) continue;
-      if (typeof m.modelId === 'string' && m.modelId.includes('sentence-transformers')) continue;
-      // siblings check: allow if tokenizer or vocab present
-      const siblings = m.siblings || [];
-      const hasTokenizer = siblings.some(s => /tokenizer|vocab|merges|sentencepiece/i.test(s));
-      if (!hasTokenizer) continue;
-      survivors.push(m);
-    }
-
-    self.postMessage({ id, type: 'progress', status: 'prefiltered', data: { survivors: survivors.length } });
-
-    // 3) Config fetch & classification with concurrency
-    const results = [];
-    const errors = [];
-    let idx = 0;
-    const pool = new Array(Math.min(concurrency, survivors.length)).fill(0).map(async () => {
-      while (!cancelled && idx < survivors.length) {
-        const i = idx++;
-        const model = survivors[i];
-        const modelId = model.modelId || model.id || model.model || model.modelId;
-        try {
-          self.postMessage({ id, type: 'progress', modelId, status: 'config_fetching' });
-          const fetchResult = await fetchConfigForModel(modelId, hfToken, timeoutMs, RETRIES, BACKOFF_BASE_MS);
-          const entry = classifyModel(model, fetchResult);
-          results.push(entry);
-          self.postMessage({ id, type: 'progress', modelId, status: 'classified', data: entry });
-        } catch (err) {
-          errors.push({ modelId, message: String(err) });
-          self.postMessage({ id, type: 'progress', modelId, status: 'error', data: { message: String(err) } });
-        }
-      }
-    });
-
-    await Promise.all(pool);
-
-    if (cancelled) {
-      activeTasks.delete(id);
-      return self.postMessage({ id, type: 'response', result: { cancelled: true } });
-    }
-
-    // finalize
-    const models = results.map(r => ({ id: r.id, model_type: r.model_type, architectures: r.architectures, classification: r.classification, confidence: r.confidence, fetchStatus: r.fetchStatus }));
-    const meta = { fetched: listing.length, filtered: survivors.length, errors };
-    activeTasks.delete(id);
-    return self.postMessage({ id, type: 'response', result: { models, meta } });
-  } catch (err) {
-    activeTasks.delete(id);
-    return self.postMessage({ id, type: 'error', error: String(err) });
-  }
-}
-
-// helper: fetchConfigForModel
-    `https://huggingface.co/${encodeURIComponent(modelId)}/resolve/main/config/config.json`,
-    `https://huggingface.co/${encodeURIComponent(modelId)}/resolve/main/adapter_config.json`
-  ];
-  for (const url of urls) {
-    for (let attempt = 0; attempt <= RETRIES; attempt++) {
-      const controller = new AbortController();
-      const timeout = setTimeout(() => controller.abort(), timeoutMs);
-      try {
-        const resp = await fetch(url, { signal: controller.signal, headers: hfToken ? { Authorization: `Bearer ${hfToken}` } : {} });
-        clearTimeout(timeout);
-        if (resp.status === 200) {
-          const json = await resp.json();
-          return { status: 'ok', model_type: json.model_type || null, architectures: json.architectures || null };
-        }
-        if (resp.status === 401 || resp.status === 403) return { status: 'auth', code: resp.status };
-        if (resp.status === 404) break; // try next fallback
-        if (resp.status === 429) {
-          const backoff = BACKOFF_BASE_MS * Math.pow(2, attempt);
-          await new Promise(r => setTimeout(r, backoff));
-          continue;
-        }
-        // other non-200 -> treat as error
-        return { status: 'error', code: resp.status, message: `fetch failed ${resp.status}` };
-      } catch (err) {
-        clearTimeout(timeout);
-        if (attempt === RETRIES) return { status: 'error', message: String(err) };
-        const backoff = BACKOFF_BASE_MS * Math.pow(2, attempt);
-        await new Promise(r => setTimeout(r, backoff));
-      }
-    }
-  }
-  return { status: 'no-config' };
-}
-
-// helper: classifyModel
-function classifyModel(rawModel, fetchResult) {
-  const id = rawModel.modelId || rawModel.id || rawModel.modelId || rawModel.modelId || rawModel.id;
-  const entry = { id, model_type: null, architectures: null, classification: 'unknown', confidence: 'low', fetchStatus: 'error' };
-  if (!fetchResult) return entry;
-  if (fetchResult.status === 'auth') {
-    entry.classification = 'auth-protected'; entry.confidence = 'high'; entry.fetchStatus = '401';
-    return entry;
-  }
-  if (fetchResult.status === 'ok') {
-    entry.model_type = fetchResult.model_type || null;
-    entry.architectures = Array.isArray(fetchResult.architectures) ? fetchResult.architectures : null;
-    entry.fetchStatus = 'ok';
-    const deny = ['bert','roberta','distilbert','electra','albert','deberta','mobilebert','convbert','sentence-transformers'];
-    const allow = ['gpt2','gptj','gpt_neox','llama','qwen','mistral','phi','gpt','t5','bart','pegasus'];
-    if (entry.model_type && deny.includes(entry.model_type)) { entry.classification = 'encoder'; entry.confidence = 'high'; return entry; }
-    if (entry.model_type && allow.includes(entry.model_type)) { entry.classification = 'gen'; entry.confidence = 'high'; return entry; }
-    const arch = entry.architectures;
-    if (arch && Array.isArray(arch)) {
-      /** @type {any[]} */
-      const archArr = /** @type {any[]} */ (arch);
-      for (let i = 0; i < archArr.length; i++) {
-        const a = archArr[i];
-        const al = String(a).toLowerCase();
-        if (allow.includes(al)) { entry.classification = 'gen'; entry.confidence = 'high'; return entry; }
-        if (deny.includes(al)) { entry.classification = 'encoder'; entry.confidence = 'high'; return entry; }
-      }
-    }
-    entry.classification = 'unknown'; entry.confidence = 'low'; return entry;
-  }
-  if (fetchResult.status === 'no-config') {
-    // fallback heuristics
-    const pipeline = rawModel.pipeline_tag || '';
-    if (pipeline && pipeline.startsWith('text-generation')) { entry.classification = 'gen'; entry.confidence = 'medium'; }
-    else entry.classification = 'unknown'; entry.confidence = 'low';
-    entry.fetchStatus = '404';
-    return entry;
-  }
-  if (fetchResult.status === 'error') {
-    entry.classification = 'unknown'; entry.confidence = 'low'; entry.fetchStatus = 'error';
-    entry.fetchError = { message: fetchResult.message, code: fetchResult.code };
-    return entry;
-  }
-  return entry;
-}
+  const iterator = listChatModelsIterator(params);
+  let sawDone = false;
+  activeTasks.set(id, { abort: () => { try { iterator.return(); } catch (e) {} } });
+  try {
+    for await (const delta of iterator) {
+      try { self.postMessage(Object.assign({ id, type: 'progress' }, delta)); } catch (e) {}
+      if (delta && delta.status === 'done') {
+        sawDone = true;
+        try { self.postMessage({ id, type: 'response', result: { models: delta.models, meta: delta.meta } }); } catch (e) {}
+        break;
+      }
+    }
+    if (!sawDone) {
+      // iterator exited early (likely cancelled)
+      try { self.postMessage({ id, type: 'response', result: { cancelled: true } }); } catch (e) {}
+    }
+  } catch (err) {
+    try { self.postMessage({ id, type: 'error', error: String(err), code: err.code || null }); } catch (e) {}
+  } finally {
+    activeTasks.delete(id);
+  }
+}
+
+// helper: fetchConfigForModel
+// Note: fetchConfigForModel and classifyModel were moved to the
+// `src/worker/list-chat-models.js` async-iterator action. Keep this file
+// minimal and delegate to the iterator for listing/classification logic.
 }

 // helper to extract generated text from various runtime outputs

src/worker/list-chat-models.js
ADDED

@@ -0,0 +1,223 @@
// Minimal async-iterator implementation of the listChatModels pipeline.
// Yields plain JSON-serializable progress objects. Uses per-request AbortControllers
// and a finally block so iterator.return() causes cleanup.

export async function* listChatModelsIterator(params = {}) {
  const opts = Object.assign({ maxCandidates: 250, concurrency: 12, hfToken: null, timeoutMs: 10000, maxListing: 5000 }, params || {});
  const { maxCandidates, concurrency, hfToken, timeoutMs, maxListing } = opts;
  const MAX_TOTAL_TO_FETCH = Math.min(maxListing, 5000);
  const PAGE_SIZE = 100;
  const RETRIES = 3;
  const BACKOFF_BASE_MS = 200;

  const inFlight = new Set();

  async function fetchWithController(url, init = {}) {
    const c = new AbortController();
    inFlight.add(c);
    try {
      const merged = Object.assign({}, init, { signal: c.signal });
      return await fetch(url, merged);
    } finally {
      inFlight.delete(c);
    }
  }

  // helper: fetchConfigForModel (tries multiple paths, per-request timeouts & retries)
  async function fetchConfigForModel(modelId) {
    const urls = [
      `https://huggingface.co/${encodeURIComponent(modelId)}/resolve/main/config.json`,
      `https://huggingface.co/${encodeURIComponent(modelId)}/resolve/main/config/config.json`,
      `https://huggingface.co/${encodeURIComponent(modelId)}/resolve/main/adapter_config.json`
    ];
    for (const url of urls) {
      for (let attempt = 0; attempt <= RETRIES; attempt++) {
        // per-request timeout via AbortController
        const controller = new AbortController();
        inFlight.add(controller);
        const timeout = setTimeout(() => controller.abort(), timeoutMs);
        try {
          const resp = await fetch(url, { signal: controller.signal, headers: hfToken ? { Authorization: `Bearer ${hfToken}` } : {} });
          clearTimeout(timeout);
          inFlight.delete(controller);
          if (resp.status === 200) {
            const json = await resp.json();
            return { status: 'ok', model_type: json.model_type || null, architectures: json.architectures || null };
          }
          if (resp.status === 401 || resp.status === 403) return { status: 'auth', code: resp.status };
          if (resp.status === 404) break; // try next fallback URL
          if (resp.status === 429) {
            const backoff = BACKOFF_BASE_MS * Math.pow(2, attempt);
            await new Promise(r => setTimeout(r, backoff));
            continue;
          }
          return { status: 'error', code: resp.status, message: `fetch failed ${resp.status}` };
        } catch (err) {
          clearTimeout(timeout);
          inFlight.delete(controller);
          if (attempt === RETRIES) return { status: 'error', message: String(err) };
          const backoff = BACKOFF_BASE_MS * Math.pow(2, attempt);
          await new Promise(r => setTimeout(r, backoff));
        }
      }
    }
    return { status: 'no-config' };
  }

  function classifyModel(rawModel, fetchResult) {
    const id = rawModel.modelId || rawModel.id || rawModel.model;
    const entry = { id, model_type: null, architectures: null, classification: 'unknown', confidence: 'low', fetchStatus: 'error' };
    if (!fetchResult) return entry;
    if (fetchResult.status === 'auth') {
      entry.classification = 'auth-protected'; entry.confidence = 'high'; entry.fetchStatus = String(fetchResult.code || 401);
      return entry;
    }
    if (fetchResult.status === 'ok') {
      entry.model_type = fetchResult.model_type || null;
      entry.architectures = Array.isArray(fetchResult.architectures) ? fetchResult.architectures : null;
      entry.fetchStatus = 'ok';
      const deny = ['bert', 'roberta', 'distilbert', 'electra', 'albert', 'deberta', 'mobilebert', 'convbert', 'sentence-transformers'];
      const allow = ['gpt2', 'gptj', 'gpt_neox', 'llama', 'qwen', 'mistral', 'phi', 'gpt', 't5', 'bart', 'pegasus'];
      if (entry.model_type && deny.includes(entry.model_type)) { entry.classification = 'encoder'; entry.confidence = 'high'; return entry; }
      if (entry.model_type && allow.includes(entry.model_type)) { entry.classification = 'gen'; entry.confidence = 'high'; return entry; }
      const arch = entry.architectures;
      if (arch && Array.isArray(arch)) {
        for (let i = 0; i < arch.length; i++) {
          const a = String(arch[i]).toLowerCase();
          if (allow.includes(a)) { entry.classification = 'gen'; entry.confidence = 'high'; return entry; }
          if (deny.includes(a)) { entry.classification = 'encoder'; entry.confidence = 'high'; return entry; }
        }
      }
      entry.classification = 'unknown'; entry.confidence = 'low'; return entry;
    }
    if (fetchResult.status === 'no-config') {
      const pipeline = rawModel.pipeline_tag || '';
      if (pipeline && pipeline.startsWith('text-generation')) {
        entry.classification = 'gen'; entry.confidence = 'medium';
      } else {
        entry.classification = 'unknown'; entry.confidence = 'low';
      }
      entry.fetchStatus = '404';
      return entry;
    }
    if (fetchResult.status === 'error') {
      entry.classification = 'unknown'; entry.confidence = 'low'; entry.fetchStatus = 'error';
      entry.fetchError = { message: fetchResult.message, code: fetchResult.code };
      return entry;
    }
    return entry;
  }

  // Main pipeline
  let listing = [];
  try {
    // 1) listing
    let offset = 0;
    let reachedEnd = false; // stop paging once the API returns an empty page
    while (listing.length < MAX_TOTAL_TO_FETCH && !reachedEnd) {
      const url = `https://huggingface.co/api/models?full=true&limit=${PAGE_SIZE}&offset=${offset}`;
      let ok = false;
      for (let attempt = 0; attempt <= RETRIES && !ok; attempt++) {
        try {
          // fetchWithController registers the request so iterator.return() can abort it
          const resp = await fetchWithController(url, { headers: hfToken ? { Authorization: `Bearer ${hfToken}` } : {} });
          if (resp.status === 429) {
            const backoff = BACKOFF_BASE_MS * Math.pow(2, attempt);
            await new Promise(r => setTimeout(r, backoff));
            continue;
          }
          if (!resp.ok) throw Object.assign(new Error(`listing-fetch-failed:${resp.status}`), { code: 'listing_fetch_failed', status: resp.status });
          const page = await resp.json();
          if (!Array.isArray(page) || page.length === 0) { ok = true; reachedEnd = true; break; }
          listing.push(...page);
          offset += PAGE_SIZE;
          ok = true;
        } catch (err) {
          if (attempt === RETRIES) throw err;
          await new Promise(r => setTimeout(r, BACKOFF_BASE_MS * Math.pow(2, attempt)));
        }
      }
      if (!ok) break;
    }

    // emit listing_done
    yield { status: 'listing_done', totalFound: listing.length };

    // 2) prefilter
    const denyPipeline = new Set(['feature-extraction', 'fill-mask', 'sentence-similarity', 'masked-lm']);
    const survivors = [];
    for (const m of listing) {
      if (survivors.length >= maxCandidates) break;
      const pipeline = m.pipeline_tag;
      if (pipeline && denyPipeline.has(pipeline)) continue;
      if (typeof m.modelId === 'string' && m.modelId.includes('sentence-transformers')) continue;
      const siblings = m.siblings || [];
      // siblings entries are objects shaped like { rfilename } on the HF API
      const hasTokenizer = siblings.some(s => /tokenizer|vocab|merges|sentencepiece/i.test(s.rfilename || s));
      if (!hasTokenizer) continue;
      survivors.push(m);
    }

    yield { status: 'prefiltered', survivors: survivors.length };

    // 3) concurrent config fetch & classify using an event queue so workers can emit
    // progress while the generator yields them.
    const results = [];
    const errors = [];
    let idx = 0;
    let processed = 0;
    const events = [];
    let resolveNext = null;
    function emit(ev) {
      events.push(ev);
      if (resolveNext) {
        resolveNext();
        resolveNext = null;
      }
    }
    async function nextEvent() {
      while (events.length === 0) {
        await new Promise(r => { resolveNext = r; });
      }
      return events.shift();
    }

    const workerCount = Math.min(concurrency, survivors.length || 1);
    const pool = new Array(workerCount).fill(0).map(async () => {
      while (true) {
        const i = idx++;
        if (i >= survivors.length) break;
        const model = survivors[i];
        const modelId = model.modelId || model.id || model.model;
        try {
          emit({ modelId, status: 'config_fetching' });
          const fetchResult = await fetchConfigForModel(modelId);
          const entry = classifyModel(model, fetchResult);
          results.push(entry);
          emit({ modelId, status: 'classified', data: entry });
        } catch (err) {
          errors.push({ modelId, message: String(err) });
          emit({ modelId, status: 'error', data: { message: String(err) } });
        } finally {
          processed++;
        }
      }
    });

    // consume events as workers produce them
    while (processed < survivors.length) {
      const ev = await nextEvent();
      yield ev;
    }

    // make sure any remaining events are yielded
    while (events.length > 0) {
      yield events.shift();
    }

    await Promise.all(pool);

    // final
    const models = results.map(r => ({ id: r.id, model_type: r.model_type, architectures: r.architectures, classification: r.classification, confidence: r.confidence, fetchStatus: r.fetchStatus }));
    const meta = { fetched: listing.length, filtered: survivors.length, errors };
    yield { status: 'done', models, meta };
  } finally {
    // abort any in-flight fetches if iteration stopped early
    for (const c of Array.from(inFlight)) { try { c.abort(); } catch (e) {} }
  }
}
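
For reference, a minimal sketch of how a caller might consume and cancel the iterator; `runScan` and the timeout wiring are hypothetical. Calling `iterator.return()` triggers the generator's `finally` block, which aborts the registered in-flight fetches:

```js
// Consume progress until 'done'; cancel by calling it.return().
async function runScan() {
  const it = listChatModelsIterator({ maxCandidates: 50, concurrency: 4 });
  const stopAfterMs = 5000; // hypothetical cancellation deadline
  const timer = setTimeout(() => it.return(), stopAfterMs);
  try {
    for await (const delta of it) {
      console.log(delta.status, delta.modelId || '');
      if (delta.status === 'done') return delta; // { status: 'done', models, meta }
    }
    return { cancelled: true }; // iterator ended without yielding 'done'
  } finally {
    clearTimeout(timer);
  }
}
```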
|