F-17: Infinite loop in tensorflowjs_converter StatelessWhile monopolises microtask queue → in-process watchdog never fires

Authorized security research artifact disclosed via huntr.com's TensorFlow.js Model Format Vulnerability program. Source commit 7f5309fef0a47545e34049903dbdae0f97285f7e. All capture data was collected against a synthetic /tmp/victim_host/ CI-runner lab; no real PII is present.

Real impact captured (sanitized)

In-process watchdog never fires: process must be SIGKILLed externally

  • Parent spawned child with setTimeout(watchdog, 1500ms) inside the executor; the timer never fired
  • 8 s wall-clock burn → only the parent's external SIGKILL terminated the child
  • Confirms the bug bypasses timer-driven in-process timeout and cancellation primitives

All proof data above was captured against a synthetic CI-runner lab at /tmp/victim_host/ (no real PII present). Full capture: F17_REAL_IMPACT_PROOF_2026-06-11.txt.


Summary

A Node.js service that calls model.executeAsync(...) on an attacker-supplied GraphModel containing a StatelessWhile (or While) op with a permanently truthy condition will hang forever. Any setTimeout watchdog or HTTP request timeout the caller relies on cannot fire, because the loop runs as a continuous chain of microtasks that never returns control to the macrotask queue. Only an external SIGKILL (from a supervisor, container OOM killer, or K8s liveness probe) recovers the process.

The PoC confirms this end-to-end: child hung for 8 s, parent watchdog issued SIGKILL, exit code null, signal SIGKILL.
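The parent/child arrangement can be approximated without TensorFlow.js. The following is a minimal sketch, not the repository's reproduce_real_impact.js: the child stands in for the StatelessWhile executor with a bare loop over already-resolved promises, and the names CHILD_SOURCE and runWithExternalWatchdog, together with the 100 ms and 1000 ms timings, are assumptions chosen for illustration. It assumes a POSIX host where SIGKILL is reported as the terminating signal.

```javascript
const { spawn } = require('node:child_process');

// Hypothetical child script: arms an in-process watchdog, then spins on
// microtasks forever. No tfjs involved; the loop only imitates the shape
// of the executor's await chain.
const CHILD_SOURCE = `
  setTimeout(() => {
    console.log('in-process watchdog fired');
    process.exit(2);
  }, 100);
  (async () => { for (;;) { await Promise.resolve(); } })();
`;

// Spawn the child and kill it from the outside after killAfterMs.
function runWithExternalWatchdog(killAfterMs) {
  return new Promise((resolve, reject) => {
    const child = spawn(process.execPath, ['-e', CHILD_SOURCE], {
      stdio: ['ignore', 'pipe', 'inherit'],
    });
    let stdout = '';
    child.stdout.setEncoding('utf8');
    child.stdout.on('data', (chunk) => { stdout += chunk; });

    // External watchdog: lives in the parent, so the child cannot starve it.
    const killer = setTimeout(() => child.kill('SIGKILL'), killAfterMs);

    child.on('error', reject);
    child.on('close', (code, signal) => {
      clearTimeout(killer);
      resolve({ code, signal, stdout });
    });
  });
}

runWithExternalWatchdog(1000).then(({ code, signal, stdout }) => {
  console.log(`[exit c=${code} s=${signal}] stdout=${JSON.stringify(stdout)}`);
});
```

If the child's own timer could run, it would print a message and exit with code 2; a result with a null exit code and a SIGKILL signal indicates that only the parent's timer took effect.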

Root Cause

Lines of Code:

In control_executor.ts:49-103:

case 'While':
case 'StatelessWhile': {
  const bodyFunc = getParamValue('body', node, tensorMap, context) as string;
  const condFunc = getParamValue('cond', node, tensorMap, context) as string;
  const args     = getParamValue('args', node, tensorMap, context) as Tensor[];

  const condResult = await context.functionMap[condFunc].executeFunctionAsync(...);
  let condValue    = await condResult[0].data();
  let result: Tensor[] = args;

  while (condValue[0]) {                                   // ← no iteration cap
    result = await context.functionMap[bodyFunc].executeFunctionAsync(...);
    const condResult = await context.functionMap[condFunc].executeFunctionAsync(...);
    condValue = await condResult[0].data();
  }
  return result;
}

TF's protobuf WhileLoop op carries a maximum_iterations attribute precisely to bound this loop. tfjs's operation mapper parses the attr into node.attr['maximum_iterations'], but the executor never reads or enforces it.

Why a setTimeout watchdog cannot save you

Each iteration is a chain of microtasks (await context.functionMap[bodyFunc].executeFunctionAsync(...) → await context.functionMap[condFunc].executeFunctionAsync(...) → await condResult[0].data()). Node drains the microtask queue completely before it runs the next macrotask, and every iteration enqueues further microtasks, so the loop runs forever without ever returning control to the macrotask queue.

In-process watchdogs (setTimeout, HTTP server request timeouts, Express req.setTimeout, k6 client-side time limits, custom abort controllers driven by setInterval) cannot fire because they are macrotasks.

Only an external SIGKILL recovers the process.
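The starvation effect can be observed in isolation with a bounded loop. This is a minimal sketch under stated assumptions: spinOnMicrotasks and demo are hypothetical names, the loop awaits already-resolved promises rather than real tfjs executor calls, and it stops after a fixed wall-clock budget so that the script terminates.

```javascript
// Awaiting an already-resolved promise resumes the function as a microtask,
// so this loop keeps the microtask queue busy for durationMs.
async function spinOnMicrotasks(durationMs) {
  const start = Date.now();
  let iterations = 0;
  while (Date.now() - start < durationMs) {
    await Promise.resolve(); // never yields to the timers phase
    iterations++;
  }
  return iterations;
}

async function demo() {
  const start = Date.now();
  let timerFiredAt = null;

  // Stand-in for a watchdog: due after 10 ms.
  setTimeout(() => { timerFiredAt = Date.now() - start; }, 10);

  // Occupy the microtask queue for 200 ms.
  const iterations = await spinOnMicrotasks(200);
  const firedDuringLoop = timerFiredAt !== null;

  // Yield to the macrotask queue so the overdue timer can finally run.
  await new Promise((resolve) => setTimeout(resolve, 20));

  return { iterations, firedDuringLoop, timerFiredAt };
}

demo().then((result) => console.log(result));
```

The 10 ms timer is only serviced once the loop stops enqueueing microtasks, so firedDuringLoop stays false and timerFiredAt lands at or after the 200 ms mark. With an unbounded loop, as in the executor above, that point is never reached.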

Why this is NOT a duplicate of F-23 (mutual function recursion): F-23 uses StatelessIf calling itself or a sibling function unboundedly through the GraphDef's library.function table, exhausting stack frames via recursion. F-17 uses StatelessWhile with no recursion and iterates without bound in a flat loop. Different op (While vs If), different attack shape (loop vs recursion), independent fix (read the maximum_iterations attr vs add a recursion-depth counter). Both live in the same executor file but require different patches.

Internal Pre-conditions

  1. Victim Node.js process calls tf.loadGraphModel(<url>) followed by model.executeAsync(...) on the attacker model.
  2. Process uses @tensorflow/tfjs-converter ≤ 4.22.0.

External Pre-conditions

None.

Attack Path

  1. Attacker authors a model.json GraphDef with two function definitions in library.function:
    • cond_fn(x, t) = Greater(x, t) (with x=0.0, t=-1.0 → permanently true),
    • body_fn(x, t) = (Identity(x), Identity(t)) (no progress).
  2. Top-level node: StatelessWhile(args=[0.0, -1.0], cond=cond_fn, body=body_fn).
  3. Attacker delivers model.json + 24-byte weight shard to the victim.
  4. Victim loads the model and calls model.executeAsync({}, ['loop']).
  5. The executor enters while (condValue[0]) and never exits. The Promise returned to the caller never resolves.
  6. Victim's setTimeout(req.abort, 30_000) watchdog never fires; the worker process is permanently consumed; container metrics show 100% CPU forever.
  7. Only an external SIGKILL (supervisor, K8s liveness probe, OOM killer) recovers the worker.

Impact

Captured PoC F17_REAL_IMPACT_PROOF_2026-06-11.txt:

PoC F-17 v2 - StatelessWhile(cond=true forever)
init   = 0.0  (carry state)
thresh = -1.0
condition: Greater(x, -1.0) → always true
body: (Identity(x), Identity(t)) → no progress
[watchdog 8s SIGKILL]
[exit c=null s=SIGKILL]

exit c=null s=SIGKILL is conclusive: executeAsync never returned; the in-process setTimeout watchdog never fired; the only path to recover the process was an external supervisor kill.

Service-level impact:

  • Each worker that handles one malicious model.executeAsync request becomes permanently consumed.
  • Sustained attack → full service DoS as every worker is taken offline.
  • Even k8s/Docker SIGKILL recovery is slow (deployments must roll new pods), so it is easy to keep the service offline indefinitely.

Mitigation

In control_executor.ts:49-103, read maximum_iterations from the node attr (already parsed by the operation mapper) and break when exceeded:

case 'While':
case 'StatelessWhile': {
  // The attr may be absent from the node, hence the Infinity fallback.
  const maxIters = (getParamValue(
      'maximum_iterations', node, tensorMap, context) as number) ?? Infinity;
  // Hard ceiling applied even when the attr is missing or very large.
  const HARD_DEFAULT_CEILING = 1_000_000;
  const limit = Math.min(maxIters, HARD_DEFAULT_CEILING);
  let i = 0;
  while (condValue[0]) {
    if (i++ >= limit) {
      // Fail loudly instead of spinning forever.
      throw new Error(
        `While/StatelessWhile exceeded maximum_iterations=${maxIters} (ceiling ${HARD_DEFAULT_CEILING})`);
    }
    result = await context.functionMap[bodyFunc].executeFunctionAsync(...);
    ...
  }
}

The hard ceiling (e.g. 10⁶) applies both when the attr is missing and when the attr's value exceeds the ceiling.
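The capping logic can be exercised outside the executor. The helper below is a sketch only: runBoundedWhile and its cond/body callbacks are hypothetical stand-ins for the executor's condFunc and bodyFunc calls, not part of the tfjs API, and the ceiling value simply mirrors the constant proposed in the patch.

```javascript
// Mirrors the ceiling proposed in the mitigation; the value is a policy choice.
const HARD_DEFAULT_CEILING = 1_000_000;

// Hypothetical helper: runs `body` while `cond` holds, but never more than
// min(maxIterations, HARD_DEFAULT_CEILING) times.
async function runBoundedWhile(cond, body, initialState, maxIterations) {
  const limit = Math.min(maxIterations ?? Infinity, HARD_DEFAULT_CEILING);
  let state = initialState;
  let iterations = 0;
  while (await cond(state)) {
    if (iterations >= limit) {
      throw new Error(`loop exceeded iteration limit ${limit}`);
    }
    state = await body(state);
    iterations++;
  }
  return state;
}

// An always-true condition with a no-progress body now rejects
// instead of spinning forever.
runBoundedWhile(async () => true, async (s) => s, 0, 1000)
  .catch((err) => console.error(err.message));
```

Because the check runs before each body call, the body executes at most `limit` times, and a loop that terminates on its own is unaffected by the cap.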

CVSS

CVSS 3.1: 7.5 / High (AV:N/AC:L/PR:N/UI:R/S:U/C:N/I:N/A:H).

Bug classification

  • CWE-835 (Loop with Unreachable Exit Condition, "Infinite Loop")
  • CWE-770 (Allocation of Resources Without Limits or Throttling)

Affected versions

@tensorflow/tfjs-converter ≤ 4.22.0.

Files in this repository

| File | Purpose |
| --- | --- |
| README.md | this disclosure |
| package.json | npm dependencies for one-step npm install |
| reproduce.js | minimal PoC: StatelessWhile with always-true predicate monopolises the microtask queue |
| reproduce_real_impact.js | watchdog-bypass demo: spawns child with in-process setTimeout(1500ms) watchdog; only external SIGKILL terminates |
| F17_REAL_IMPACT_PROOF_2026-06-11.txt | captured 8 s wall-clock burn with in-process watchdog never firing |