Google Tensor G4 NPU: DGC0 Compiler Toolchains
Two independently-developed reverse-engineering toolchains for producing DGC0 (EdgeTPU bytecode) for the Google Tensor G4 NPU (the darwinn EdgeTPU in Pixel 9-series phones). Both were built to answer one question: how do you compile a model to run on the G4 NPU when the compiler is gated/undocumented?
What DGC0 is: the on-device NPU consumes a compiled bytecode blob whose magic bytes are `44 47 43 30` = "DGC0". The compiler that produces it is what these toolchains drive.
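As a small illustration of the magic-byte check described above, here is a minimal sketch in C. The helper name `has_dgc0_magic` is hypothetical and not taken from this repo. The offset at which the magic sits is also an assumption: this README does not state it, and because the blob is described elsewhere as a FlatBuffer, the identifier may live at byte 4 (where FlatBuffers places a file identifier) rather than byte 0, so the offset is left as a parameter.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* The four magic bytes quoted in the text: 44 47 43 30 = "DGC0". */
const uint8_t DGC0_MAGIC[4] = {0x44, 0x47, 0x43, 0x30};

/* Hypothetical helper: returns 1 if `buf` holds the DGC0 magic at `offset`,
 * 0 otherwise. The correct offset is an assumption the caller must supply. */
int has_dgc0_magic(const uint8_t *buf, size_t len, size_t offset) {
    if (buf == NULL || offset > len || len - offset < sizeof DGC0_MAGIC) {
        return 0; /* too short to contain the magic at this offset */
    }
    return memcmp(buf + offset, DGC0_MAGIC, sizeof DGC0_MAGIC) == 0;
}
```

Calling it with offset 0 and then offset 4 on the first bytes of a candidate file is a cheap way to find out which layout a given blob uses before handing it to a real parser.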
Status (honest version): these are RE toolchains from before official access existed. The official Google Tensor ML SDK (Beta) (https://ai.google.dev/edge/litert) is the sanctioned production path, and its big advantage is fusing the transformer into monolithic DGCs. But it is not strictly a superset of what's here: the Beta has its own hard walls (a hardcoded ~60 s HAL compile deadline, a per-context capacity cap, death on certain `odml` STABLEHLO composites, INTERNAL crashes on some 4-bit / large-tensor graphs). These toolchains drive the compiler more directly, with fewer wrapper constraints, so they remain useful and may still compile cases the Beta chokes on, especially with tighter quantization (smaller DGC0s) or reconfigured partitioning to beat the fragmentation wall. Published as active research, not a closed chapter: prefer the official SDK for production, reach for these when it hits a wall.
What is NOT in this repo: none of the vendor's closed compiler binaries are redistributed here: not `liblitert_plugin_compiler.so`, `libLiteRtCompilerPlugin_google_tensor.so`, or `libedgetpu_tflite_compiler.so`. These toolchains call those; they don't contain them. Reference third-party source (the open `libedgetpu` runtime, the Pixel kernel edgetpu driver) is also linked, not bundled. Only original RE work + docs are here.
1. Cross-Compile Bridge (cross-compile-bridge/)
(formerly nicknamed "cracked SDK")
What it does: compiles DGC0 on an x86_64 Linux host (in Docker) and deploys it to the arm64 phone over adb. It is a cross-compilation bridge, because the phone can't run the x86 compiler natively.
How it works:
- The Google Tensor compiler adapter + engine ship publicly on PyPI: `pip install ai-edge-litert-nightly` drops `vendors/google_tensor/compiler/libLiteRtCompilerPlugin_google_tensor.so` (an x86_64 adapter) which loads the real engine.
- The "experimental access" gate turned out to be an env-var check + a filename `dlopen` check over that already-public compiler (see `cross-compile-bridge/00_TLDR.md`: "mostly theater").
- The RE work decoded the C ABI of `GoogleTensorCompileFlatbuffer` (arg layout, `1` = success return, `*a6 = bytecode_count`, `*a7 = error_msg`, the config protobuf, the options struct with `soc_model="Tensor_G4"`) so the public compiler can be driven directly from a stub.
To rebuild:
- Compiler: `pip install ai-edge-litert-nightly` (PyPI), which provides the x86_64 Tensor compiler adapter + engine.
- Official SDK context: https://ai.google.dev/edge/litert and the Tensor ML SDK page https://ai.google.dev/edge/litert/next/tensor_ml_sdk.
- Then: the stub source (`docker/stub_compiler.c`), Dockerfiles, and `docker/reproduce.sh` in this folder drive it.
Read: `00_TLDR.md` → `02_C_ABI.md` → `03_STUB_BUILD.md` → `05_WALL_BROKEN.md`.
2. On-Device Compiler Driver (on-device-compiler-driver/)
(formerly nicknamed "probe6" / compiler_probe6)
What it does: produces DGC0 natively on the phone (no AICore, no NNAPI, no edgetpu_app_service) by driving the vendor's own on-device compiler through its public C entry point.
How it works:
- `dlopen("/vendor/lib64/libedgetpu_tflite_compiler.so", RTLD_NOW|RTLD_GLOBAL)`: the device's own compiler; constructors run and populate the filewrapper TOC.
- Resolve and call `CompileTfliteFlatbuffer2` (the V2 C ABI: 8 args, AArch64 AAPCS x0..x7). V1 is a buggy 20-byte shim that zeroes the new slot-4 arg.
- Output: a `DGC0` blob (proven: a 64 MB DGC written on-device, `status: OK`).
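The first two steps above (load the library, resolve the entry point) can be sketched as follows. This is a hedged illustration, not the code in `compiler_probe6.c`: the helper name `resolve_entry` is hypothetical, and the sketch deliberately stops at resolving the symbol, because the 8-argument signature of `CompileTfliteFlatbuffer2` is not spelled out in this README and should be taken from the repo's own docs. Only the library path, the symbol name, and the `dlopen` flags come from the text above.

```c
#include <dlfcn.h>
#include <stdio.h>

/* Hypothetical helper: load a shared library with the flags named in the
 * text and look up one symbol. Returns the symbol address, or NULL if the
 * library or the symbol cannot be found. */
void *resolve_entry(const char *lib_path, const char *symbol) {
    /* RTLD_GLOBAL exposes the library's symbols to later loads;
     * RTLD_NOW forces all relocations up front so failures show here. */
    void *handle = dlopen(lib_path, RTLD_NOW | RTLD_GLOBAL);
    if (handle == NULL) {
        const char *err = dlerror();
        fprintf(stderr, "dlopen failed: %s\n", err ? err : "unknown error");
        return NULL;
    }

    dlerror(); /* clear any stale error before dlsym */
    void *fn = dlsym(handle, symbol);
    if (fn == NULL) {
        const char *err = dlerror();
        fprintf(stderr, "dlsym failed: %s\n", err ? err : "symbol not found");
        dlclose(handle);
        return NULL;
    }

    /* The handle is intentionally left open: the returned pointer is only
     * valid while the library stays loaded. */
    return fn;
}
```

On the device this would be invoked as `resolve_entry("/vendor/lib64/libedgetpu_tflite_compiler.so", "CompileTfliteFlatbuffer2")`, and the result cast to the correct function-pointer type before calling. On a host without that library the call simply returns NULL. Link with `-ldl` where the platform requires it.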
To rebuild:
- The compiler itself is the device's `/vendor/lib64/libedgetpu_tflite_compiler.so`; pull it from a rooted Tensor G4 (Pixel 9-series) device via adb. It is NOT redistributed here.
- Build the driver (`compiler_probe6.c`) for arm64 with the Android NDK, then run on-device.
- Reference source for understanding the runtime + DGC0 format (for study, not required to run):
  - Open EdgeTPU runtime: https://github.com/google-coral/libedgetpu
  - Pixel/Tensor kernel edgetpu driver: ships in the Pixel kernel source (`drivers/edgetpu`).
  - LiteRT-LM (the LLM runtime that consumes these): https://github.com/google-ai-edge/litert-lm
Read: `00_NAMING_On-Device-Compiler-Driver.md` → `BREAKTHROUGH_PROBE6.md` → `BEST_ARCHITECTURE.md`. The driver source is `compiler_probe6.c` (earlier iterations `compiler_probe.c` … `_probe5.c`); the DGC0 FlatBuffer parser is `dgc0_parse.{cc,h}`.
License
Original RE code, drivers, and documentation in this repo: Apache-2.0. Third-party source referenced above keeps its own upstream license (Apache-2.0 for libedgetpu, GPL-2.0 for the kernel driver) and is intentionally not bundled here; follow the links. No vendor closed binaries are included.
Credits
Reverse-engineering + toolchains by xThr45hx (AI-assisted). Published as a technical record; use responsibly and prefer the official Tensor ML SDK for production.