PoC: RWKVTokenizer eval() - Arbitrary Code Execution via .keras Model File
Vulnerability: eval() on attacker-controlled vocabulary in keras_hub.models.RWKVTokenizer
Affected: keras-hub 0.26.0 to 0.28.0 | keras 3.9.0 to 3.12.1
CWE: CWE-95 (Eval Injection)
Bypasses: safe_mode=True (keras default)
What this repo contains
malicious_rwkv_tokenizer.keras - a crafted .keras model archive.
When loaded with keras.models.load_model(), the vocabulary field in config.json
reaches eval() inside RWKVTokenizerBase.__init__ (line 117) and
RWKVTokenizer.set_vocabulary (line 275) in rwkv7_tokenizer.py.
The payload in this file is benign: it writes the string 'RCE_via_load_model' to
<tempdir>/rwkv_poc.txt. No network, no persistence, no destruction.
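Because a .keras file is a ZIP archive, the vocabulary can be examined without ever calling load_model(). The following is a minimal sketch of such an inspection, not part of the PoC itself. It assumes config.json sits at the archive root and that the field is keyed "vocabulary". Its nesting depth is not stated here, so the search is recursive. The helper names find_vocabularies and read_keras_config are hypothetical.

```python
import json
import zipfile


def find_vocabularies(node, path="config"):
    """Yield (path, value) for every "vocabulary" key in nested JSON data."""
    if isinstance(node, dict):
        for key, value in node.items():
            child = f"{path}.{key}"
            if key == "vocabulary":
                yield child, value
            else:
                yield from find_vocabularies(value, child)
    elif isinstance(node, list):
        for i, item in enumerate(node):
            yield from find_vocabularies(item, f"{path}[{i}]")


def read_keras_config(archive_path):
    """Parse config.json from a .keras ZIP archive as plain JSON.

    Nothing is deserialized into Keras objects, so no tokenizer code runs.
    """
    with zipfile.ZipFile(archive_path) as zf:
        with zf.open("config.json") as fh:
            return json.load(fh)


# Example usage (commented out so the block runs without the file present):
# config = read_keras_config("malicious_rwkv_tokenizer.keras")
# for where, vocab in find_vocabularies(config):
#     print(where, vocab)
```

Reading the archive with zipfile and json keeps the inspection independent of keras and keras_hub, so it is safe to run on an untrusted file.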
Reproduction
import sys
from unittest.mock import MagicMock
sys.modules.setdefault("tensorflow_text", MagicMock()) # satisfy TF deployment prereq
import keras
import keras_hub # required: registers keras_hub>RWKVTokenizer in Keras object registry
model = keras.models.load_model("malicious_rwkv_tokenizer.keras", safe_mode=True)
# eval() fires during load - marker written to tempdir, no exception raised
Note: keras_hub must be imported before load_model(). This is satisfied
automatically in any real deployment using keras_hub models - the attack prerequisite
is standard, not exceptional.
Note on tensorflow_text: assert_tf_libs_installed() is present in all keras-hub
tokenizers, so TF and tensorflow-text are a functional prerequisite for using any
keras-hub tokenizer in production. The MagicMock above stands in for tensorflow_text
and simulates a real deployment where both packages are installed.
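After running the reproduction, one way to confirm that the payload fired is to look for the marker file. The sketch below assumes that <tempdir> corresponds to Python's tempfile.gettempdir(), which is not stated explicitly above. The helper name marker_present is hypothetical.

```python
import os
import tempfile

MARKER_NAME = "rwkv_poc.txt"        # file name given in this README
MARKER_TEXT = "RCE_via_load_model"  # string given in this README


def marker_present(directory=None):
    """Return True if the PoC marker file exists and holds the expected text."""
    # Assumption: <tempdir> is the directory reported by tempfile.gettempdir().
    directory = directory or tempfile.gettempdir()
    path = os.path.join(directory, MARKER_NAME)
    if not os.path.isfile(path):
        return False
    with open(path, encoding="utf-8") as fh:
        return MARKER_TEXT in fh.read()


# Example usage after the load_model() call:
# print("payload executed:", marker_present())
```

Deleting the marker before each run avoids a false positive from an earlier reproduction.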
Root cause
rwkv7_tokenizer.py calls eval() on every vocabulary entry string:
# line 117 - RWKVTokenizerBase.__init__
x = eval(line[line.index(" ") : line.rindex(" ")])
# line 275 - RWKVTokenizer.set_vocabulary
repr_str = eval(line[line.index(" ") : line.rindex(" ")])
The vocabulary list is stored verbatim in config.json inside the .keras ZIP and
passed directly to __init__ on deserialization. keras-hub is in Keras's unconditional
deserialization allowlist (serialization_lib.py:816), so the tokenizer class is
instantiated even when safe_mode=True. SafeModeScope is active at that point, but
the tokenizer never calls in_safe_mode(), so both eval() calls run unguarded.
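To illustrate why the slice-then-eval pattern is unsafe, here is a small self-contained sketch that does not use keras or keras_hub. It assumes each vocabulary line has the shape "index literal length", which is an assumption about the file format rather than something stated above. The record function is a harmless, hypothetical stand-in for attacker-chosen code.

```python
calls = []


def record(tag):
    """Harmless stand-in for attacker code: only notes that it was called."""
    calls.append(tag)
    return "x"


def middle_field(line):
    # Same slice as the two quoted lines: first space up to the last space.
    return line[line.index(" "):line.rindex(" ")]


benign = "65 'A' 1"                  # assumed format: index, literal, length
crafted = "65 record('executed') 1"  # middle field is a call, not a literal

# A literal evaluates to its value, as the tokenizer expects.
assert eval(middle_field(benign)) == "A"

# eval() accepts any expression, so the call in the crafted line runs.
# The namespace is passed explicitly only to keep this sketch self-contained.
eval(middle_field(crafted), {"record": record})
assert calls == ["executed"]
```

The point of the sketch is that eval() makes no distinction between a string literal and a function call, so whoever controls the vocabulary text controls what runs.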
Fix
Replace both eval() calls with ast.literal_eval().
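A minimal sketch of what that replacement could look like is shown below. It is not the upstream patch. The helper name parse_vocab_field is hypothetical, the "index literal length" line shape is an assumption, and the extra str/bytes type check is an additional hardening choice rather than something the fix requires.

```python
import ast


def parse_vocab_field(line):
    """Parse the middle field of a vocabulary line as a literal only."""
    # Same slice as the original code; strip() removes the leading space so
    # the result parses on Python versions that reject leading whitespace.
    source = line[line.index(" "):line.rindex(" ")].strip()
    try:
        value = ast.literal_eval(source)  # accepts literals, never calls code
    except (ValueError, SyntaxError) as exc:
        raise ValueError(f"vocabulary field is not a literal: {source!r}") from exc
    # Assumption: tokens are always str or bytes; reject other literal types.
    if not isinstance(value, (str, bytes)):
        raise ValueError(f"expected str or bytes, got {type(value).__name__}")
    return value


# Example usage:
# parse_vocab_field("65 'A' 1")                   returns 'A'
# parse_vocab_field("1 __import__('os') 1")       raises ValueError
```

ast.literal_eval() only builds values from literal syntax (strings, bytes, numbers, tuples, lists, dicts, sets, booleans, None), so a function call or attribute access in the vocabulary raises an error instead of executing.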