LiteRT-LM: CMake Overview

The LiteRT-LM CMake build system provides a unified infrastructure for building all required third-party dependencies, internal libraries, and the primary litert_lm_main executable.

Additional executable targets (such as litert_lm_advanced_main) are defined but currently gated behind the _unverified_targets flag. They remain in an unverified state until they can be validated within the standalone CMake environment.

Dependency Management & Project Structure

This project implements a Super-Build pattern to ensure One Definition Rule (ODR) adherence. This approach is necessary to manage a converging dependency tree where multiple components rely on different versions of the same core libraries.

Dependency Strategy

The build system leverages a hybrid approach to dependency management:

  • FetchContent: Used for conventional third-party libraries that can be integrated without modification.
  • ExternalProject: Reserved for dependencies requiring heavy modification. These are orchestrated to redirect include and library paths to a unified "source of truth" within the build directory. This prevents the symbol collisions that occur when multiple dependencies introduce conflicting versions of the same provider (e.g., Abseil or Protobuf).

Package Infrastructure

Orchestration logic for external dependencies is modularized within cmake/packages/<name>. To ensure a hermetic build environment and strict ODR adherence, these modules implement a Source Transformation and Target Mapping framework.

This framework does not simply wrap dependencies; it actively transforms them by "de-nesting" internal third-party code and normalizing source-level paths to align with the LiteRT-LM unified build structure.

cmake/packages/sentencepiece
├── sentencepiece.cmake            # Primary orchestration module (ExternalProject_Add)
├── sentencepiece_patcher.cmake    # Source-level transformation (Path normalization and dependency de-nesting)
├── sentencepiece_root_shim.cmake  # Injected logic for the package-level configuration
├── sentencepiece_src_shim.cmake   # Injected logic for the source-level build definitions
├── sentencepiece_aggregate.cmake  # Logic to consolidate build artifacts into a unified interface
└── sentencepiece_target_map.cmake # Dictionary mapping internal project targets to local static archives

Transformation Highlights

  • Hermeticity: By removing nested third_party directories within dependencies, we force all components to resolve a single, verified version of core libraries (e.g., Abseil, Protobuf).

  • Path Normalization: Source files are patched in-place to canonicalize include paths, ensuring compatibility with the standalone CMake layout.

  • Target Redirection: Using custom mapping logic, internal targets are transparently redirected to local static archives, maintaining consistency with the original project structure without requiring a full monorepo environment.
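The hermeticity point can be spot-checked after the patcher has run. The following is a minimal sketch, not the project's own tooling: find_nested_third_party is a hypothetical helper, and the path passed to it is whatever directory the dependency's sources were unpacked into. It relies on the -mindepth option of GNU find.

```shell
# Hypothetical audit: list third_party directories found below the root of a
# dependency's source tree. An empty result means nothing is left to de-nest.
find_nested_third_party() {
  # -mindepth 1 skips the root itself; sort keeps the output stable.
  find "$1" -mindepth 1 -type d -name third_party -print | sort
}

# Usage (the path is an assumption; substitute the real source directory):
# find_nested_third_party /path/to/unpacked/sentencepiece
```

Any path this prints is a directory that could still supply a second copy of a core library.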

Project Layout

To maintain parity with the internal codebase and facilitate automated maintenance, LiteRT-LM's CMake target definitions mirror the source tree.

  • Source-Locality: Target definitions for LiteRT-LM components generally reside in the CMakeLists.txt file located in the same directory as their respective source files.

  • Exceptions: Shared resources (such as proto_lib) are consolidated into centralized configuration files to manage global visibility and reuse.

Build Guide

This project targets a modern high-performance C++ environment. Currently, the build system is verified only with the GNU toolchain on Debian-based Linux (e.g., Ubuntu 24.04).

Prerequisites

Ensure your local environment meets these minimum version requirements to avoid compilation errors related to C++20 standards and build-time orchestration.

  • Compiler: gcc / g++ 13+ (Required for stable C++20 feature support).

  • Build Tools: cmake (3.25+) and make.

  • Python: 3.12+ (Required for development scripts).

  • Java: openjdk-17-jre-headless or newer (Required for ANTLR 4).

  • Rust: The Rust toolchain is required for specific sub-components.

  • System Libraries: zlib1g-dev, libssl-dev, libcurl4-openssl-dev.
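The minimum versions above can be checked before configuring. This is a minimal sketch under stated assumptions: version_ge and check_min are hypothetical helpers, not part of the repository, and the comparison relies on GNU sort -V, which ships with Debian and Ubuntu.

```shell
# Hypothetical helper: succeeds when version $1 is >= version $2.
# The smaller of the two versions sorts first under `sort -V`.
version_ge() {
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n1)" = "$2" ]
}

# Hypothetical helper: report whether a tool's version meets the minimum.
check_min() {
  if version_ge "$2" "$3"; then
    echo "ok: $1 $2 (>= $3)"
  else
    echo "too old: $1 $2 (need >= $3)"
    return 1
  fi
}

# Usage against the local toolchain:
# check_min gcc    "$(gcc -dumpfullversion)" 13
# check_min cmake  "$(cmake --version | awk 'NR==1 { print $3 }')" 3.25
# check_min python "$(python3 -c 'import platform; print(platform.python_version())')" 3.12
```

Version sort is used rather than plain string comparison so that 3.9 correctly ranks below 3.25.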


1. Configuration

Create a build directory to maintain a clean source tree. Note that while the build system is currently hard-coded to C++20, future updates will transition this to a configurable variable.

cmake -B cmake/build -G "Unix Makefiles" \
  -DCMAKE_BUILD_TYPE=Release \
  -DCMAKE_CXX_STANDARD=20

2. Executing the Build

Parallel execution is highly recommended to manage the complex dependency tree, but it must be balanced against available system resources.

WARNING (High Memory Usage): Running too many parallel jobs can cause a SEGFAULT or an OOM kill (signal 9). Use a conservative job count to keep the build stable.

Recommended Formula: Available RAM / 8GB = Max -j value

# Example for 32GB RAM
cmake --build cmake/build -t litert_lm_main -j4
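The formula can also be computed rather than worked out by hand. The sketch below assumes a Linux host that exposes MemAvailable in /proc/meminfo; jobs_for_ram_gb and available_ram_gb are hypothetical helper names.

```shell
# Hypothetical helper: maximum -j value for a given amount of RAM in GB,
# using the 8 GB-per-job rule (integer division, never less than 1).
jobs_for_ram_gb() {
  n=$(( $1 / 8 ))
  if [ "$n" -lt 1 ]; then
    n=1
  fi
  echo "$n"
}

# Hypothetical helper: currently available RAM in whole GB (Linux only).
# /proc/meminfo reports MemAvailable in kB.
available_ram_gb() {
  awk '/^MemAvailable:/ { print int($2 / 1024 / 1024) }' /proc/meminfo
}

# Usage:
# cmake --build cmake/build -t litert_lm_main -j"$(jobs_for_ram_gb "$(available_ram_gb)")"
```

Rounding down errs on the side of fewer jobs, which matches the conservative intent of the formula.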

3. Verification

Verify the binary integrity and Package Infrastructure mapping via a CPU-based inference test. This confirms that all internal symbols and external shims (Abseil, Protobuf, etc.) are correctly linked and functional.

Validation Scope & Environment Notes
  • Primary Target: The current verification suite focuses on the CPU-reference implementation, leveraging the high-compute density (48-core) of the development environment.

  • Future Target: Verification of hardware-accelerated backends (GPU/NPU) is deferred until dedicated hardware can be reserved. It requires a physical target or an instance with direct hardware passthrough.

cmake/build/litert_lm_main \
  --model_path=/path/to/gemma-3n-E2B-it-int4.litertlm \
  --backend=cpu \
  --input_prompt="What is the tallest building in the world?"

Expected Output: A successful run initializes the XNNPACK delegate and returns the model response along with benchmark metrics:

  • Init Phases: Executor and Tokenizer initialization times.

  • Prefill/Decode Speeds: Performance stats (tokens/sec) indicating the backend is optimized.

Example

dev-sh@:LiteRT-LM$ cmake/build/litert_lm_main --model_path=$model_path/gemma-3n-E2B-it-int4.litertlm --backend=cpu --input_prompt="What is the tallest building in the world?"
INFO: Created TensorFlow Lite XNNPACK delegate for CPU.
input_prompt: What is the tallest building in the world?
The tallest building in the world is the **Burj Khalifa** in Dubai, United Arab Emirates.

It stands at a staggering **828 meters (2,717 feet)** tall.

It was completed in 2010 and continues to hold the record.

BenchmarkInfo:
  Init Phases (2):
    - Executor initialization: 844.54 ms
    - Tokenizer initialization: 66.70 ms
    Total init time: 911.25 ms
--------------------------------------------------
  Time to first token: 2.40 s
--------------------------------------------------
  Prefill Turns (Total 1 turns):
    Prefill Turn 1: Processed 18 tokens in 2.311920273s duration.
      Prefill Speed: 7.79 tokens/sec.
--------------------------------------------------
  Decode Turns (Total 1 turns):
    Decode Turn 1: Processed 62 tokens in 5.53092314s duration.
      Decode Speed: 11.21 tokens/sec.
--------------------------------------------------
--------------------------------------------------
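The markers described under Expected Output can be checked mechanically. This is a minimal sketch, assuming the output format shown in the example run stays stable; check_run_output is a hypothetical helper, and redirecting stderr into the pipe is an assumption about where the INFO line is written.

```shell
# Hypothetical check: read litert_lm_main output on stdin and confirm that
# the delegate line and the benchmark fields from the example are present.
check_run_output() {
  out=$(cat)
  for marker in "XNNPACK delegate" "Executor initialization:" \
                "Tokenizer initialization:" "Prefill Speed:" "Decode Speed:"; do
    # -F matches the marker as a fixed string, -q suppresses output.
    if ! printf '%s\n' "$out" | grep -qF -- "$marker"; then
      echo "missing: $marker"
      return 1
    fi
  done
  echo "all markers found"
}

# Usage:
# cmake/build/litert_lm_main --model_path=/path/to/gemma-3n-E2B-it-int4.litertlm \
#   --backend=cpu --input_prompt="What is the tallest building in the world?" 2>&1 \
#   | check_run_output
```

The check only confirms that the fields exist; it does not judge whether the reported speeds are acceptable.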


This project is licensed under the Apache 2.0 License.


Getting Started: Running the LiteRT-LM Container

To get the environment up and running, follow these steps from the root directory of the project. The process is divided into build, create, and attach phases to ensure container persistence is handled correctly.

1. Build the Image

First, we'll build the image using the configuration in the cmake/ directory. This might take a moment if it's your first time, as it pulls in our build dependencies.

podman build -f /path/to/repo/cmake/Containerfile -t litert_lm /path/to/repo

2. Create the Persistent Container

Instead of executing a one-off run, create a named container to preserve the workspace state for future sessions. The --interactive and --tty flags give the container a usable terminal when you attach to it.

podman container create --interactive --tty --name litert_lm litert_lm:latest

3. Start and Join the Session

Finally, start the container and attach your shell to it.

podman start --attach litert_lm

Quick Note: If you exit the container and want to get back in later, you don't need to rebuild or recreate it. Just run the podman start --attach litert_lm command again and you'll be right back where you left off.
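The create and attach steps can be wrapped so that the container is created only when it does not exist yet. This is a sketch rather than a script shipped with the project: enter_litert_lm and the PODMAN override variable are hypothetical, while podman container exists is a standard podman subcommand that exits 0 when the named container is present.

```shell
# PODMAN can be overridden (for example with a stub); it defaults to podman.
PODMAN="${PODMAN:-podman}"

# Hypothetical wrapper: create the named container on first use, then attach.
enter_litert_lm() {
  name="${1:-litert_lm}"
  image="${2:-litert_lm:latest}"
  if ! "$PODMAN" container exists "$name"; then
    "$PODMAN" container create --interactive --tty --name "$name" "$image"
  fi
  "$PODMAN" start --attach "$name"
}

# Usage (after the image has been built):
# enter_litert_lm
```

The image still has to be built once with podman build before the first call.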


