# LiteRT-LM: CMake Overview
The LiteRT-LM CMake build system provides a unified infrastructure for building
all required third-party dependencies, internal libraries, and the primary
litert_lm_main executable.
Additional executable targets (such as litert_lm_advanced_main) are defined but
currently gated behind the `_unverified_targets` flag. They remain in an
unverified state until they can be validated within the standalone CMake
environment.
## Dependency Management & Project Structure
This project implements a Super-Build pattern to ensure One Definition Rule
(ODR) adherence. This approach is necessary to manage a converging dependency
tree where multiple components rely on different versions of the same core
libraries.
### Dependency Strategy
The build system leverages a hybrid approach to dependency management:
- **FetchContent**: Used for conventional third-party libraries where standard,
unmodified integration is sufficient.
- **ExternalProject**: Reserved for dependencies requiring heavy modification.
These are orchestrated to redirect include and library paths to a unified
"source of truth" within the build directory. This prevents the symbol
collisions that occur when multiple dependencies introduce conflicting
versions of the same provider (e.g., Abseil or Protobuf).
### Package Infrastructure
Orchestration logic for external dependencies is modularized within
`cmake/packages/<name>`. To ensure a hermetic build environment and strict ODR
adherence, these modules implement a Source Transformation and Target Mapping
framework.
This framework does not simply wrap dependencies; it actively transforms them by
"de-nesting" internal third-party code and normalizing source-level paths to
align with the LiteRT-LM unified build structure.
```bash
cmake/packages/sentencepiece
├── sentencepiece.cmake            # Primary orchestration module (ExternalProject_Add)
├── sentencepiece_patcher.cmake    # Source-level transformation (path normalization and dependency de-nesting)
├── sentencepiece_root_shim.cmake  # Injected logic for the package-level configuration
├── sentencepiece_src_shim.cmake   # Injected logic for the source-level build definitions
├── sentencepiece_aggregate.cmake  # Logic to consolidate build artifacts into a unified interface
└── sentencepiece_target_map.cmake # Dictionary mapping internal project targets to local static archives
```
#### Transformation Highlights
- **Hermeticity**: By removing nested third_party directories within
dependencies, we force all components to resolve a single, verified version
of core libraries (e.g., Abseil, Protobuf).
- **Path Normalization**: Source files are patched in-place to canonicalize
include paths, ensuring compatibility with the standalone CMake layout.
- **Target Redirection**: Using custom mapping logic, internal targets are
transparently redirected to local static archives, maintaining consistency
with the original project structure without requiring a full monorepo
environment.
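The de-nesting and path-normalization ideas above can be illustrated with a small shell sketch. This is not the project's patcher (that logic lives in the `*_patcher.cmake` modules). The helper names `find_nested_third_party` and `normalize_includes`, and the `third_party/absl/` include prefix, are assumptions chosen for illustration only.

```bash
#!/usr/bin/env bash
# Illustrative sketch only; the real transformation is done in CMake.

# List nested third_party directories inside a dependency's source tree.
# (hypothetical helper) $1 = root of the dependency checkout
find_nested_third_party() {
  find "$1" -mindepth 1 -type d -name third_party -prune -print
}

# Rewrite includes such as "third_party/absl/strings/str_cat.h" to
# "absl/strings/str_cat.h" in place, so the file resolves the single
# shared copy. Requires GNU sed for -i.
# (hypothetical helper; the include prefix is an assumed example) $1 = source file
normalize_includes() {
  sed -i -E 's|#include "third_party/(absl/[^"]+)"|#include "\1"|' "$1"
}
```

A dry run with `find_nested_third_party path/to/dependency` before any rewrite shows which vendored copies would have to be removed for every component to resolve the same version.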
### Project Layout
To maintain parity with the internal codebase and facilitate automated
maintenance, LiteRT-LM's CMake target definitions mirror the source tree.
- **Source-Locality**: Target definitions for LiteRT-LM components generally
reside in the CMakeLists.txt file located in the same directory as their
respective source files.
- **Exceptions**: Shared resources (such as proto_lib) are consolidated into
centralized configuration files to manage global visibility and reuse.
## Build Guide
This project targets a modern high-performance C++ environment. Currently, the
build system is strictly verified for the GNU toolchain on Debian-based Linux
(e.g., Ubuntu 24.04).
#### Prerequisites
Ensure your local environment meets these minimum version requirements to avoid
compilation errors related to C++20 standards and build-time orchestration.
- **Compiler**: gcc / g++ 13+ (Required for stable C++20 feature support).
- **Build Tools**: cmake (3.25+) and make.
- **Python**: 3.12+ (Required for development scripts).
- **Java**: openjdk-17-jre-headless or newer (Required for ANTLR 4).
- **Rust**: The Rust toolchain is required for specific sub-components.
- **System Libraries**: zlib1g-dev, libssl-dev, libcurl4-openssl-dev.
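A quick way to compare installed tool versions against the minimums listed above is sketched below. It assumes GNU coreutils (`sort -V`), which is standard on Debian-based systems. `version_ge` and `check_tool` are hypothetical helpers, not part of LiteRT-LM.

```bash
#!/usr/bin/env bash
# Sketch: compare dotted version strings using GNU `sort -V`.

# (hypothetical helper) Succeeds when version $1 >= version $2.
version_ge() {
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n1)" = "$2" ]
}

# (hypothetical helper) $1 = tool name, $2 = detected version, $3 = minimum
check_tool() {
  if version_ge "$2" "$3"; then
    echo "ok: $1 $2 (>= $3)"
  else
    echo "too old: $1 $2 (need >= $3)"
    return 1
  fi
}

# Probe only the tools that are actually installed.
if command -v cmake >/dev/null 2>&1; then
  check_tool cmake "$(cmake --version | head -n1 | awk '{print $3}')" 3.25 || true
fi
if command -v g++ >/dev/null 2>&1; then
  check_tool g++ "$(g++ -dumpversion)" 13 || true
fi
```

The same `check_tool` call can be repeated for Python and Java once their version strings are extracted. That parsing is left out here because the output format differs by distribution.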
--------------------------------------------------------------------------------
**1. Configuration**
Create a build directory to maintain a clean source tree. Note that while the
build system is currently hard-coded to C++20, future updates will transition
this to a configurable variable.
```bash
cmake -B cmake/build -G "Unix Makefiles" \
-DCMAKE_BUILD_TYPE=Release \
-DCMAKE_CXX_STANDARD=20
```
**2. Executing the Build**
Parallel execution is highly recommended to manage the complex dependency tree,
but it must be balanced against available system resources.
**WARNING: High Memory Usage.** Running too many parallel jobs can cause the
build to fail with a SEGFAULT or an OOM-kill (Signal 9). To ensure a stable
build, use a conservative job count.
**Recommended Formula**: Max `-j` value = Available RAM / 8GB (rounded down)
```bash
# Example for 32GB RAM
cmake --build cmake/build -t litert_lm_main -j4
```
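The formula can also be applied automatically. The sketch below assumes a Linux host that exposes `MemAvailable` in `/proc/meminfo`. `max_jobs` is a hypothetical helper, and the floor of one job is an added safeguard rather than something the formula itself states.

```bash
#!/usr/bin/env bash
# Sketch: derive a conservative -j value from available RAM / 8 GB.

# (hypothetical helper) $1 = available RAM in whole GB
max_jobs() {
  jobs_count=$(( $1 / 8 ))        # integer division rounds down
  if [ "$jobs_count" -lt 1 ]; then
    jobs_count=1                  # always allow at least one job
  fi
  echo "$jobs_count"
}

# Print a suggested build command based on currently available memory.
if [ -r /proc/meminfo ]; then
  avail_kb=$(awk '/^MemAvailable:/ {print $2}' /proc/meminfo)
  avail_gb=$(( ${avail_kb:-0} / 1024 / 1024 ))
  echo "cmake --build cmake/build -t litert_lm_main -j$(max_jobs "$avail_gb")"
fi
```

Using `MemAvailable` rather than `MemTotal` accounts for memory already in use by other processes, which tends to give a safer estimate on shared machines.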
**3. Verification**
Verify the binary integrity and Package Infrastructure mapping via a CPU-based
inference test. This confirms that all internal symbols and external shims
(Abseil, Protobuf, etc.) are correctly linked and functional.
###### Validation Scope & Environment Notes
- **Primary Target:** The current verification suite focuses on the
CPU-reference implementation, leveraging the high-compute density (48-core)
of the development environment.
- **Future Target:** Verification of hardware-accelerated backends (GPU/NPU)
is deferred to environments with dedicated hardware resource reservation.
Validation requires a physical target or an instance with direct hardware
passthrough to establish the necessary compute-level interface.
```bash
cmake/build/litert_lm_main \
--model_path=/path/to/gemma-3n-E2B-it-int4.litertlm \
--backend=cpu \
--input_prompt="What is the tallest building in the world?"
```
**Expected Output**: A successful run initializes the XNNPACK delegate and
returns the model response along with benchmark metrics:
- *Init Phases*: Executor and Tokenizer initialization times.
- *Prefill/Decode Speeds*: Throughput (tokens/sec) for each phase, confirming
that the CPU backend is functional.
*Example*
```
dev-sh@:LiteRT-LM$ cmake/build/litert_lm_main --model_path=$model_path/gemma-3n-E2B-it-int4.litertlm --backend=cpu --input_prompt="What is the tallest building in the world?"
INFO: Created TensorFlow Lite XNNPACK delegate for CPU.
input_prompt: What is the tallest building in the world?
The tallest building in the world is the **Burj Khalifa** in Dubai, United Arab Emirates.
It stands at a staggering **828 meters (2,717 feet)** tall.
It was completed in 2010 and continues to hold the record.
BenchmarkInfo:
Init Phases (2):
- Executor initialization: 844.54 ms
- Tokenizer initialization: 66.70 ms
Total init time: 911.25 ms
--------------------------------------------------
Time to first token: 2.40 s
--------------------------------------------------
Prefill Turns (Total 1 turns):
Prefill Turn 1: Processed 18 tokens in 2.311920273s duration.
Prefill Speed: 7.79 tokens/sec.
--------------------------------------------------
Decode Turns (Total 1 turns):
Decode Turn 1: Processed 62 tokens in 5.53092314s duration.
Decode Speed: 11.21 tokens/sec.
--------------------------------------------------
--------------------------------------------------
```
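For scripted checks, the throughput figures can be pulled out of a captured log. The sketch below assumes the log format matches the example above (`Prefill Speed: <n> tokens/sec.`). `extract_speed` is a hypothetical helper, and capturing both stdout and stderr is a precaution, since the source does not say which stream the benchmark block is written to.

```bash
#!/usr/bin/env bash
# Sketch: extract "Prefill Speed" / "Decode Speed" values from a run log.

# (hypothetical helper) $1 = "Prefill" or "Decode"; reads the log on stdin.
extract_speed() {
  awk -v phase="$1" '$1 == phase && $2 == "Speed:" { print $3 }'
}

# Example usage (not executed here):
#   cmake/build/litert_lm_main --model_path=... --backend=cpu \
#     --input_prompt="..." > run.log 2>&1
#   extract_speed Decode < run.log
```

A non-empty result for both phases is a simple signal that the binary ran to completion. The values themselves depend on the host, so the figures above should not be treated as pass thresholds.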
<br>
--------------------------------------------------------------------------------
This project is licensed under the
[Apache 2.0 License.](https://github.com/google-ai-edge/LiteRT-LM/blob/main/LICENSE)
--------------------------------------------------------------------------------
## Getting Started: Running the LiteRT-LM Container
To get the environment up and running, follow these steps from the root
directory of the project. The process is divided into build, create, and attach
phases to ensure container persistence is handled correctly.
### 1. Build the Image
First, we'll build the image using the configuration in the cmake/ directory.
This might take a moment if it's your first time, as it pulls in our build
dependencies.
```bash
podman build -f /path/to/repo/cmake/Containerfile -t litert_lm /path/to/repo
```
### 2. Create the Persistent Container
Instead of executing a one-off run, create a named container to preserve the
workspace state for future sessions. The `--interactive` and `--tty` flags
ensure the container has a functional terminal when you attach to it.
```bash
podman container create --interactive --tty --name litert_lm litert_lm:latest
```
### 3. Start and Join the Session
Finally, start the container and attach your shell to it.
```bash
podman start --attach litert_lm
```
**Quick Note:** If you exit the container and want to get back in later, you
don't need to rebuild or recreate it. Just run `podman start --attach litert_lm`
again and you'll be right back where you left off.