# LiteRT-LM: CMake Overview

The LiteRT-LM CMake build system provides a unified infrastructure for building
all required third-party dependencies, internal libraries, and the primary
`litert_lm_main` executable.

Additional executable targets (such as `litert_lm_advanced_main`) are defined but
currently gated behind the `_unverified_targets` flag. They remain unverified
until they can be validated within the standalone CMake environment.
## Dependency Management & Project Structure

This project implements a Super-Build pattern to ensure One Definition Rule
(ODR) adherence. This approach is necessary to manage a converging dependency
tree where multiple components rely on different versions of the same core
libraries.
### Dependency Strategy

The build system takes a hybrid approach to dependency management:

- **FetchContent**: Used for conventional third-party libraries where
  direct, unmodified integration is sufficient.
- **ExternalProject**: Reserved for dependencies requiring heavy modification.
  These are orchestrated to redirect include and library paths to a unified
  "source of truth" within the build directory. This prevents the symbol
  collisions that occur when multiple dependencies introduce conflicting
  versions of the same provider (e.g., Abseil or Protobuf).
### Package Infrastructure

Orchestration logic for external dependencies is modularized within
`cmake/packages/<name>`. To ensure a hermetic build environment and strict ODR
adherence, these modules implement a Source Transformation and Target Mapping
framework.

This framework does not simply wrap dependencies; it actively transforms them by
"de-nesting" internal third-party code and normalizing source-level paths to
align with the LiteRT-LM unified build structure.
```bash
cmake/packages/sentencepiece
├── sentencepiece.cmake             # Primary orchestration module (ExternalProject_Add)
├── sentencepiece_patcher.cmake     # Source-level transformation (path normalization and dependency de-nesting)
├── sentencepiece_root_shim.cmake   # Injected logic for the package-level configuration
├── sentencepiece_src_shim.cmake    # Injected logic for the source-level build definitions
├── sentencepiece_aggregate.cmake   # Logic to consolidate build artifacts into a unified interface
└── sentencepiece_target_map.cmake  # Dictionary mapping internal project targets to local static archives
```
#### Transformation Highlights

- **Hermeticity**: By removing nested `third_party` directories within
  dependencies, we force all components to resolve a single, verified version
  of core libraries (e.g., Abseil, Protobuf).
- **Path Normalization**: Source files are patched in-place to canonicalize
  include paths, ensuring compatibility with the standalone CMake layout.
- **Target Redirection**: Using custom mapping logic, internal targets are
  transparently redirected to local static archives, maintaining consistency
  with the original project structure without requiring a full monorepo
  environment.
### Project Layout

To maintain parity with the internal codebase and facilitate automated
maintenance, LiteRT-LM's CMake target definitions mirror the source tree.

- **Source-Locality**: Target definitions for LiteRT-LM components generally
  reside in the `CMakeLists.txt` file located in the same directory as their
  respective source files.
- **Exceptions**: Shared resources (such as `proto_lib`) are consolidated into
  centralized configuration files to manage global visibility and reuse.
## Build Guide

This project targets a modern, high-performance C++ environment. Currently, the
build system has been verified only with the GNU toolchain on Debian-based Linux
(e.g., Ubuntu 24.04).
#### Prerequisites

Ensure your local environment meets these minimum version requirements to avoid
compilation errors related to C++20 standards and build-time orchestration.

- **Compiler**: gcc / g++ 13+ (required for stable C++20 feature support).
- **Build Tools**: cmake (3.25+) and make.
- **Python**: 3.12+ (required for development scripts).
- **Java**: openjdk-17-jre-headless or newer (required for ANTLR 4).
- **Rust**: The Rust toolchain is required for specific sub-components.
- **System Libraries**: zlib1g-dev, libssl-dev, libcurl4-openssl-dev.
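
A quick pre-flight check along the following lines can catch version mismatches before starting a long build. This is only a sketch and not part of the repository: `version_ge` and `check_tool` are hypothetical helper names, the comparison assumes GNU `sort -V` is available, and only three of the tools listed above are probed.

```bash
# version_ge A B: succeeds when version A >= version B (relies on GNU sort -V).
version_ge() {
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n1)" = "$2" ]
}

# check_tool <name> <minimum> <detected-version>: print a one-line verdict.
check_tool() {
  if [ -z "$3" ]; then
    echo "$1: not found"
  elif version_ge "$3" "$2"; then
    echo "$1 $3: ok"
  else
    echo "$1 $3: needs >= $2"
  fi
}

# Detected versions are empty strings when a tool is missing.
check_tool gcc 13 "$(gcc -dumpfullversion 2>/dev/null)"
check_tool cmake 3.25 "$(cmake --version 2>/dev/null | awk 'NR==1 {print $3}')"
check_tool python3 3.12 "$(python3 --version 2>/dev/null | awk '{print $2}')"
```

The same `check_tool` call pattern could be extended to Java or Rust, provided the version string is extracted in a way that matches the output of those tools on your system.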

--------------------------------------------------------------------------------

**1. Configuration**

Create a build directory to maintain a clean source tree. Note that while the
build system is currently hard-coded to C++20, future updates will transition
this to a configurable variable.

```bash
cmake -B cmake/build -G "Unix Makefiles" \
  -DCMAKE_BUILD_TYPE=Release \
  -DCMAKE_CXX_STANDARD=20
```

**2. Executing the Build**

Parallel execution is highly recommended to manage the complex dependency tree,
but it must be balanced against available system resources.

**WARNING: High Memory Usage.** Running too many parallel jobs can cause a
segmentation fault or an OOM kill (signal 9). To ensure a stable build, use a
conservative job count.

**Recommended Formula**: Max `-j` value = available RAM / 8 GB

```bash
# Example for 32 GB RAM: 32 / 8 = 4 jobs
cmake --build cmake/build -t litert_lm_main -j4
```
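
The rule of thumb above can be scripted so the job count follows the memory that is actually free. The sketch below assumes a Linux host that exposes `MemAvailable` in `/proc/meminfo`; `jobs_for_mem_kib` is a hypothetical helper name, not something the build system provides.

```bash
# jobs_for_mem_kib <KiB>: apply the "RAM / 8 GB" rule, never returning less than 1.
jobs_for_mem_kib() {
  local mem_kib=${1:-0}
  local per_job_kib=$((8 * 1024 * 1024))  # 8 GiB expressed in KiB
  local jobs=$((mem_kib / per_job_kib))
  if [ "$jobs" -lt 1 ]; then jobs=1; fi
  echo "$jobs"
}

# Linux reports MemAvailable in KiB; fall back to 0 if it cannot be read.
avail_kib=$(awk '/^MemAvailable:/ {print $2}' /proc/meminfo 2>/dev/null || true)
echo "Suggested: -j$(jobs_for_mem_kib "${avail_kib:-0}")"
```

Because `MemAvailable` is usually somewhat below the installed RAM, this tends to suggest a slightly lower value than the formula applied to total memory, which errs on the safe side. The result can be passed straight to the build, for example `cmake --build cmake/build -t litert_lm_main -j"$(jobs_for_mem_kib "$avail_kib")"`.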

**3. Verification**

Verify the binary integrity and Package Infrastructure mapping via a CPU-based
inference test. This confirms that all internal symbols and external shims
(Abseil, Protobuf, etc.) are correctly linked and functional.

###### Validation Scope & Environment Notes

- **Primary Target:** The current verification suite focuses on the
  CPU-reference implementation, leveraging the high compute density (48 cores)
  of the development environment.
- **Future Target:** Verification of hardware-accelerated backends (GPU/NPU)
  is deferred to environments with dedicated hardware resource reservation.
  Validation requires a physical target or an instance with direct hardware
  passthrough to establish the necessary compute-level interface.

```bash
# Run from the repository root; the binary is written to the build directory.
cmake/build/litert_lm_main \
  --model_path=/path/to/gemma-3n-E2B-it-int4.litertlm \
  --backend=cpu \
  --input_prompt="What is the tallest building in the world?"
```

**Expected Output**: A successful run initializes the XNNPACK delegate and
returns the model response along with benchmark metrics:

- *Init Phases*: Executor and Tokenizer initialization times.
- *Prefill/Decode Speeds*: Throughput (tokens/sec) for each phase on the
  CPU backend.

*Example*

```
dev-sh@:LiteRT-LM$ cmake/build/litert_lm_main --model_path=$model_path/gemma-3n-E2B-it-int4.litertlm --backend=cpu --input_prompt="What is the tallest building in the world?"
INFO: Created TensorFlow Lite XNNPACK delegate for CPU.
input_prompt: What is the tallest building in the world?
The tallest building in the world is the **Burj Khalifa** in Dubai, United Arab Emirates.
It stands at a staggering **828 meters (2,717 feet)** tall.
It was completed in 2010 and continues to hold the record.
BenchmarkInfo:
  Init Phases (2):
    - Executor initialization: 844.54 ms
    - Tokenizer initialization: 66.70 ms
  Total init time: 911.25 ms
--------------------------------------------------
  Time to first token: 2.40 s
--------------------------------------------------
  Prefill Turns (Total 1 turns):
    Prefill Turn 1: Processed 18 tokens in 2.311920273s duration.
    Prefill Speed: 7.79 tokens/sec.
--------------------------------------------------
  Decode Turns (Total 1 turns):
    Decode Turn 1: Processed 62 tokens in 5.53092314s duration.
    Decode Speed: 11.21 tokens/sec.
--------------------------------------------------
--------------------------------------------------
```
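
To compare runs, the throughput figures can be pulled out of a captured log. The sketch below assumes the log keeps the `Prefill Speed:` / `Decode Speed:` line format shown in the example output, which may change between releases; `extract_speed` is a hypothetical helper name.

```bash
# extract_speed <Prefill|Decode>: read litert_lm_main output on stdin and print
# the tokens/sec figure from the matching "<Phase> Speed:" line.
extract_speed() {
  awk -v phase="$1" '$1 == phase && $2 == "Speed:" {print $3}'
}

# Demonstration on two lines copied from the example output above.
sample='Prefill Speed: 7.79 tokens/sec.
Decode Speed: 11.21 tokens/sec.'
echo "prefill tokens/sec: $(printf '%s\n' "$sample" | extract_speed Prefill)"
echo "decode tokens/sec: $(printf '%s\n' "$sample" | extract_speed Decode)"
```

In practice the binary's output would be saved first (for example with `2>&1 | tee run.log`) and then fed to the helper with `extract_speed Decode < run.log`.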

<br>

--------------------------------------------------------------------------------

This project is licensed under the
[Apache 2.0 License.](https://github.com/google-ai-edge/LiteRT-LM/blob/main/LICENSE)

--------------------------------------------------------------------------------

## Getting Started: Running the LiteRT-LM Container

To get the environment up and running, follow these steps from the root
directory of the project. The process is divided into build, create, and attach
phases to ensure container persistence is handled correctly.

### 1. Build the Image

First, we'll build the image using the configuration in the `cmake/` directory.
This might take a moment if it's your first time, as it pulls in our build
dependencies.

```bash
podman build -f /path/to/repo/cmake/Containerfile -t litert_lm /path/to/repo
```

### 2. Create the Persistent Container

Instead of executing a one-off run, create a named container to preserve the
workspace state for future sessions. The `--interactive` and `--tty` flags
ensure the container has a functional terminal when you attach to it.

```bash
podman container create --interactive --tty --name litert_lm litert_lm:latest
```

### 3. Start and Join the Session

Finally, start the container and attach your shell to it.

```bash
podman start --attach litert_lm
```

**Quick Note:** If you exit the container and want to get back in later, you
don't need to rebuild or recreate it. Just run the
`podman start --attach litert_lm` command again and you'll be right back where
you left off.
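
The create and attach steps can be folded into one helper that creates the container only when it is missing. This is a sketch rather than a script shipped with the project: `enter_litert_lm` is a hypothetical function name, and it relies on `podman container exists`, which returns a non-zero status when no container with that name is present.

```bash
# enter_litert_lm [name]: create the named container on first use, then attach.
# Assumes the image was already built and tagged litert_lm:latest (step 1).
enter_litert_lm() {
  local name=${1:-litert_lm}
  if ! podman container exists "$name"; then
    podman container create --interactive --tty --name "$name" litert_lm:latest
  fi
  podman start --attach "$name"
}
```

Calling `enter_litert_lm` with no arguments uses the container name from the steps above; on later sessions the create step is skipped and the existing workspace state is reused.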