# LiteRT-LM Conversation API

[**`Conversation`**][Conversation] is a high-level API, representing a single,
stateful conversation with the LLM and is the recommended entry point for most
users. It internally manages a [`Session`][Session] and handles complex data
processing tasks. These tasks include maintaining the initial context, managing
tool definitions, preprocessing multimodal data, and applying Jinja prompt
templates with role-based message formatting.

## Conversation API Workflow

The typical lifecycle for using the Conversation API is:

1.  **Create an [`Engine`][Engine]**: Initialize a single [`Engine`][Engine]
    with the model path and configuration. This is a heavyweight object that
    holds the model weights.
2.  **Create a [`Conversation`][Conversation]**: Use the [`Engine`][Engine] to
    create one or more lightweight [`Conversation`][Conversation] objects.
3.  **Send Message**: Utilize the [`Conversation`][Conversation] object's
    methods to send messages to the LLM and receive responses, effectively
    enabling a chat-like interaction.

Below is the simplest way to send a message and get the model's response, and
it is recommended for most use cases. It mirrors the
[Gemini Chat APIs](https://ai.google.dev/gemini-api/docs/text-generation#multi-turn-conversations).

-   [`SendMessage`][SendMessage]: A blocking call that takes user input and
    returns the complete model response.
-   [`SendMessageAsync`][SendMessageAsync]: A non-blocking call that streams the
    model's response back token-by-token through callbacks.

Here is an example code snippet:

### Text only content

```cpp
#include "runtime/engine/engine.h"

// ...

// 1. Define model assets and engine settings.
auto model_assets = ModelAssets::Create(model_path);
CHECK_OK(model_assets);

auto engine_settings = EngineSettings::CreateDefault(
    model_assets,
    /*backend=*/litert::lm::Backend::CPU);

// 2. Create the main Engine object.
absl::StatusOr<std::unique_ptr<Engine>> engine = Engine::CreateEngine(engine_settings);
CHECK_OK(engine);

// 3. Create a Conversation
auto conversation_config = ConversationConfig::CreateDefault(**engine);
CHECK_OK(conversation_config);
absl::StatusOr<std::unique_ptr<Conversation>> conversation =
    Conversation::Create(**engine, *conversation_config);
CHECK_OK(conversation);

// 4. Send a message to the LLM with a blocking call.
absl::StatusOr<Message> model_message = (*conversation)->SendMessage(
    JsonMessage{
        {"role", "user"},
        {"content", "What is the tallest building in the world?"}
    });
CHECK_OK(model_message);

// 5. Print the model message.
std::cout << *model_message << std::endl;

// 6. Send a message to the LLM with an asynchronous call, where
// CreatePrintMessageCallback is a user-implemented callback that processes
// each chunk of the message output as it is received.
std::stringstream captured_output;
(*conversation)->SendMessageAsync(
    JsonMessage{
        {"role", "user"},
        {"content", "What is the tallest building in the world?"}
    },
    CreatePrintMessageCallback(captured_output)
);
// Wait until the asynchronous call finishes or times out.
CHECK_OK((*engine)->WaitUntilDone(absl::Seconds(10)));
```

<details>
<summary>

Example `CreatePrintMessageCallback`

</summary>

```cpp
// Forward declaration; the definition below adds the default argument.
absl::Status PrintJsonMessage(const JsonMessage& message,
                              std::stringstream& captured_output,
                              bool streaming);

absl::AnyInvocable<void(absl::StatusOr<Message>)> CreatePrintMessageCallback(
    std::stringstream& captured_output) {
  return [&captured_output](absl::StatusOr<Message> message) {
    if (!message.ok()) {
      std::cout << message.status().message() << std::endl;
      return;
    }
    if (auto json_message = std::get_if<JsonMessage>(&(*message))) {
      if (json_message->is_null()) {
        std::cout << std::endl << std::flush;
        return;
      }
      ABSL_CHECK_OK(PrintJsonMessage(*json_message, captured_output,
                                     /*streaming=*/true));
    }
  };
}

absl::Status PrintJsonMessage(const JsonMessage& message,
                              std::stringstream& captured_output,
                              bool streaming = false) {
  if (message["content"].is_array()) {
    for (const auto& content : message["content"]) {
      if (content["type"] == "text") {
        captured_output << content["text"].get<std::string>();
        std::cout << content["text"].get<std::string>();
      }
    }
    if (!streaming) {
      captured_output << std::endl << std::flush;
      std::cout << std::endl << std::flush;
    } else {
      captured_output << std::flush;
      std::cout << std::flush;
    }
  } else if (message["content"]["text"].is_string()) {
    if (!streaming) {
      captured_output << message["content"]["text"].get<std::string>()
                      << std::endl
                      << std::flush;
      std::cout << message["content"]["text"].get<std::string>() << std::endl
                << std::flush;
    } else {
      captured_output << message["content"]["text"].get<std::string>()
                      << std::flush;
      std::cout << message["content"]["text"].get<std::string>() << std::flush;
    }
  } else {
    return absl::InvalidArgumentError("Invalid message: " + message.dump());
  }
  return absl::OkStatus();
}
```

</details>

### Multimodal data content

```cpp
// To use multimodality, the engine must be created with a vision and/or audio
// backend, depending on which modalities will be used.
auto engine_settings = EngineSettings::CreateDefault(
    model_assets,
    /*backend=*/litert::lm::Backend::CPU,
    /*vision_backend=*/litert::lm::Backend::GPU,
    /*audio_backend=*/litert::lm::Backend::CPU);

// The same steps to create Engine and Conversation as above...

// Send message to the LLM with image data.
absl::StatusOr<Message> model_message = (*conversation)->SendMessage(
    JsonMessage{
        {"role", "user"},
        {"content", { // Now content must be an array.
          {{"type", "text"}, {"text", "Describe the following image: "}},
          {{"type", "image"}, {"path", "/file/path/to/image.jpg"}}
        }},
    });
CHECK_OK(model_message);

// Print the model message.
std::cout << *model_message << std::endl;

// Send message to the LLM with audio data.
model_message = (*conversation)->SendMessage(
    JsonMessage{
        {"role", "user"},
        {"content", { // Now content must be an array.
          {{"type", "text"}, {"text", "Transcribe the audio: "}},
          {{"type", "audio"}, {"path", "/file/path/to/audio.wav"}}
        }},
    });
CHECK_OK(model_message);

// Print the model message.
std::cout << *model_message << std::endl;

// The content can include multiple image or audio data.
model_message = (*conversation)->SendMessage(
    JsonMessage{
        {"role", "user"},
        {"content", { // Now content must be an array.
          {{"type", "text"}, {"text", "First briefly describe the two images "}},
          {{"type", "image"}, {"path", "/file/path/to/image1.jpg"}},
          {{"type", "text"}, {"text", "and "}},
          {{"type", "image"}, {"path", "/file/path/to/image2.jpg"}},
          {{"type", "text"}, {"text", " then transcribe the content in the audio"}},
          {{"type", "audio"}, {"path", "/file/path/to/audio.wav"}}
        }},
    });
CHECK_OK(model_message);

// Print the model message.
std::cout << *model_message << std::endl;

```

### Use Conversation with Tools

Please refer to [Tool Use](./tool_use.md) for details on using tools with the
Conversation API.

## Components in Conversation

[`Conversation`][Conversation] can be regarded as a delegate that maintains the
[`Session`][Session] on the user's behalf and performs the complex data
processing required before sending data to the Session.

### I/O Types

The core input and output format for the Conversation API is
[`Message`][Message]. Currently, this is implemented as
[`JsonMessage`][JsonMessage], which is a type alias for
[`ordered_json`][ordered_json], a flexible nested key-value data structure.

The [`Conversation`][Conversation] API operates on a message-in-message-out
basis, mimicking a typical chat experience. The flexibility of
[`Message`][Message] allows users to include arbitrary fields as needed by
specific prompt templates or LLM models, enabling LiteRT-LM to support a wide
variety of models.

While there isn't a single rigid standard, most prompt templates and models
expect [`Message`][Message] to follow conventions similar to those used in the
[Gemini API Content](https://ai.google.dev/api/caching#Content) or the
[OpenAI Message structure](https://platform.openai.com/docs/api-reference/chat/create).

A [`Message`][Message] must contain a `role` field, representing who sent the
message. The `content` field can be as simple as a text string.

```json
{
  "role": "model", // Represent who the message is sent from.
  "content": "Hello World!" // Naive text only content.
}
```

For multimodal data input, `content` is a list of `part` objects. Again, a
`part` is not a predefined data structure but an
[ordered key-value pair data type][ordered_json]. The specific fields depend on
what the prompt template and the model expect.

```json
{
  "role": "user",
  "content": [  // Multimodal content.
    // Now the content is composed of parts
    {
      "type": "text",
      "text": "Describe the image in details: "
    },
    {
      "type": "image",
      "path": "/path/to/image.jpg"
    }
  ]
}
```

For a multimodal `part`, the following formats are supported, handled by
[`data_utils.h`](https://github.com/google-ai-edge/LiteRT-LM/blob/e5ead3aa409b8e631111d487aaa5e54e91ead747/runtime/conversation/model_data_processor/data_utils.h#L26-L57):

```json
{
  "type": "text",
  "text": "this is a text"
}

{
  "type": "image",
  "path": "/path/to/image.jpg"
}

{
  "type": "image",
  "blob": "base64 encoded image bytes as string"
}

{
  "type": "audio",
  "path": "/path/to/audio.wav"
}

{
  "type": "audio",
  "blob": "base64 encoded audio bytes as string"
}
```
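
As a quick illustration of the shapes above, the helper below sketches the
validation rules in plain C++, with a `std::map` standing in for the
`ordered_json` objects used by the real API. `Part` and `IsValidPart` are
hypothetical names for this sketch, not part of the LiteRT-LM API.

```cpp
#include <cassert>
#include <map>
#include <string>

// A part is modeled here as a flat key-value map, standing in for the
// ordered_json objects used by the real API.
using Part = std::map<std::string, std::string>;

// Returns true if the part matches one of the supported shapes:
//   {"type": "text",  "text": ...}
//   {"type": "image"|"audio", "path": ...} or {"type": ..., "blob": ...}
bool IsValidPart(const Part& part) {
  auto it = part.find("type");
  if (it == part.end()) return false;
  const std::string& type = it->second;
  if (type == "text") return part.count("text") == 1;
  if (type == "image" || type == "audio") {
    // Exactly one of "path" (file on disk) or "blob" (base64-encoded bytes).
    return part.count("path") + part.count("blob") == 1;
  }
  return false;
}
```

Note that `path` and `blob` are mutually exclusive for image and audio parts:
a part supplies the data either as a file path or as inline base64 bytes.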

### Model Data Processor

The [`ModelDataProcessor`][ModelDataProcessor] is a crucial model-specific
component responsible for converting the generic [`Message`][Message] format
into the [`InputData`][InputData] type required by the [`Session`][Session],
and for converting the [`Session`][Session]'s [`Responses`][Responses] back
into [`Message`][Message]. It functions similarly to
[Hugging Face multimodal processors](https://huggingface.co/docs/transformers/en/main_classes/processors#transformers.ProcessorMixin),
handling all necessary data preprocessing before the main LLM model execution.

The specific responsibilities of a [`ModelDataProcessor`][ModelDataProcessor]
vary based on the model's capabilities and the expected prompt template format.
For instance:

*   **Multimodal Models (e.g., `Gemma3N`):** The
    [`Gemma3DataProcessor`](https://github.com/google-ai-edge/LiteRT-LM/blob/main/runtime/conversation/model_data_processor/gemma3_data_processor.h)
    includes image and audio preprocessors.
*   **Function Calling Models (e.g., `Qwen3`):** The
    [`Qwen3DataProcessor`](https://github.com/google-ai-edge/LiteRT-LM/blob/main/runtime/conversation/model_data_processor/qwen3_data_processor.h)
    handles `tool_calls` and `tool_response` content.

Each [`ModelDataProcessor`][ModelDataProcessor] is associated with a
[`DataProcessorConfig`][DataProcessorConfig] for initialization and can accept
[`DataProcessorArguments`][DataProcessorArguments] during message sending.

-   **[`DataProcessorConfig`][DataProcessorConfig]:** This configuration is used
    to initialize the [`ModelDataProcessor`][ModelDataProcessor]. It corresponds
    to the [`LlmModelType`][LlmModelType] protobuf stored in the
    [model file metadata](https://github.com/google-ai-edge/LiteRT-LM/blob/63f7dec93ac85560e64194a00b5d7c407de40846/runtime/proto/llm_metadata.proto#L91).
    The [`DataProcessorConfig`][DataProcessorConfig] provides default values for
    any fields not specified in the metadata's [`LlmModelType`][LlmModelType].

-   **[`DataProcessorArguments`][DataProcessorArguments]:** This allows passing
    turn-specific arguments to the [`ModelDataProcessor`][ModelDataProcessor]
    via [`SendMessage`][SendMessage] or [`SendMessageAsync`][SendMessageAsync].
    These arguments can alter the processor's behavior for a single turn.
    However, most models do not require such turn-specific arguments, so this is
    often an empty struct.

To support a new LLM model type, you will typically need to implement a new
[`ModelDataProcessor`][ModelDataProcessor] to encapsulate its unique data
handling logic. See
[Add ModelDataProcessor for a new LLM model type](#add-modeldataprocessor-for-new-llm-model-type)
for detailed instructions.
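
To make the contract concrete, here is a minimal standalone sketch of the two
conversions a processor performs. The names (`DataProcessorSketch`,
`MessageToInput`, `ResponsesToMessage`, `ToyGemmaProcessor`) are hypothetical,
and plain strings stand in for the real `Message`/`InputData`/`Responses`
types; see `runtime/conversation/model_data_processor/` for the actual
interface and signatures.

```cpp
#include <cassert>
#include <string>

// Illustrative stand-ins for the real Message / InputData / Responses types.
using Message = std::string;
using InputData = std::string;
using Responses = std::string;

// A simplified sketch of the ModelDataProcessor contract: convert generic
// messages into model-specific input, and model output back into a message.
class DataProcessorSketch {
 public:
  virtual ~DataProcessorSketch() = default;
  virtual InputData MessageToInput(const Message& message) = 0;
  virtual Message ResponsesToMessage(const Responses& responses) = 0;
};

// A toy processor that wraps user text in Gemma-style turn markers.
class ToyGemmaProcessor : public DataProcessorSketch {
 public:
  InputData MessageToInput(const Message& message) override {
    return "<start_of_turn>user\n" + message + "<end_of_turn>\n";
  }
  Message ResponsesToMessage(const Responses& responses) override {
    return responses;  // Pass model text through unchanged.
  }
};
```

A real processor would additionally run image/audio preprocessors or parse
tool-call markup here, which is exactly the model-specific logic a new
processor encapsulates.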

### Prompt Template

To maintain flexibility for variant models, [PromptTemplate][PromptTemplate] is
implemented as a thin wrapper around [Minja](https://github.com/google/minja).
Minja is a C++ implementation of the [Jinja template engine][Jinja], which
processes JSON input to generate formatted prompts.

The [Jinja template engine][Jinja] is a widely adopted format for LLM prompt
templates. Here are a few examples:

-   [`google-gemma-3n-e2b-it.jinja`](https://github.com/google-ai-edge/LiteRT-LM/blob/main/runtime/components/testdata/google-gemma-3n-e2b-it.jinja)
-   [`HuggingFaceTB-SmolLM3-3B.jinja`](https://github.com/google-ai-edge/LiteRT-LM/blob/main/runtime/components/testdata/HuggingFaceTB-SmolLM3-3B.jinja)
-   [`Qwen-Qwen3-0.6B.jinja`](https://github.com/google-ai-edge/LiteRT-LM/blob/main/runtime/components/testdata/Qwen-Qwen3-0.6B.jinja)

The Jinja template should strictly match the structure expected by the
instruction-tuned model. Model releases typically include the standard Jinja
template to ensure proper model usage.

The Jinja template used by the model is provided in the model file metadata.

> [!NOTE] A subtle change in the prompt caused by incorrect formatting can lead
> to significant model degradation, as reported in [Quantifying Language
> Models' Sensitivity to Spurious Features in Prompt Design or: How I learned
> to start worrying about prompt formatting](https://arxiv.org/abs/2310.11324).

### Preface

[`Preface`][Preface] sets the initial context for the conversation. It can
include initial messages, tool definitions, and any other background information
the LLM needs to start the interaction. This achieves functionality similar to
the
[`Gemini API system instruction`](https://ai.google.dev/gemini-api/docs/text-generation#system-instructions)
and [`Gemini API Tools`](https://ai.google.dev/api/caching#Tool).

[Preface][Preface] contains the following fields:

-   `messages` The messages in the preface. These provide the initial
    background for the conversation; for example, the conversation history,
    prompt-engineering system instructions, few-shot examples, etc.

-   `tools` The tools the model can use in the conversation. The format of tools
    is again not fixed, but mostly follows
    [`Gemini API FunctionDeclaration`](https://ai.google.dev/api/caching#FunctionDeclaration).

-   `extra_context` The extra context, which keeps the API extensible so that
    models can customize the context information they require to start a
    conversation. For example:

    -   `enable_thinking` for models with thinking mode, e.g.
        [Qwen3](https://huggingface.co/Qwen/Qwen3-0.6B) or
        [SmolLM3-3B](https://huggingface.co/HuggingFaceTB/SmolLM3-3B).

Example preface that provides an initial system instruction and tools, and
disables thinking mode:

```cpp
Preface preface = JsonPreface({
  .messages = {{
      {"role", "system"},
      {"content", "You are a model that can do function calling."}
    }},
  .tools = {
    {
      {"name", "get_weather"},
      {"description", "Returns the weather for a given location."},
      {"parameters", {
        {"type", "object"},
        {"properties", {
          {"location", {
            {"type", "string"},
            {"description", "The location to get the weather for."}
          }}
        }},
        {"required", {"location"}}
      }}
    },
    {
      {"name", "get_stock_price"},
      {"description", "Returns the stock price for a given stock symbol."},
      {"parameters", {
        {"type", "object"},
        {"properties", {
          {"stock_symbol", {
            {"type", "string"},
            {"description", "The stock symbol to get the price for."}
          }}
        }},
        {"required", {"stock_symbol"}}
      }}
    }
  },
  .extra_context = {
    {"enable_thinking", false}
  }
});
```

### History

[Conversation][Conversation] maintains a list of all [Message][Message]
exchanges within the session. This history is crucial for prompt template
rendering, as the Jinja prompt template typically requires the entire
conversation history to generate the correct prompt for the LLM.

However, the LiteRT-LM [Session][Session] is stateful, meaning it processes
inputs incrementally. To bridge this gap, [Conversation][Conversation] generates
the necessary incremental prompt by rendering the prompt template twice: once
with the history up to the previous turn, and once including the current
message. By comparing these two rendered prompts, it extracts the new portion to
be sent to the [Session][Session].
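
The two-render diffing described above can be sketched with plain strings.
This is an illustrative standalone example, not the actual implementation:
`RenderPrompt` stands in for Jinja template rendering, and the suffix
extraction assumes the template is append-only, so the previous render is a
prefix of the current one.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Stand-in for Jinja rendering: concatenates messages with turn markers.
std::string RenderPrompt(const std::vector<std::string>& messages) {
  std::string prompt;
  for (const std::string& m : messages) {
    prompt += "<turn>" + m + "</turn>";
  }
  return prompt;
}

// Renders the template with and without the new message, and returns only the
// new suffix, i.e. the incremental prompt to feed the stateful Session.
std::string IncrementalPrompt(const std::vector<std::string>& history,
                              const std::string& new_message) {
  const std::string previous = RenderPrompt(history);
  std::vector<std::string> extended = history;
  extended.push_back(new_message);
  const std::string current = RenderPrompt(extended);
  // Assumes `previous` is a prefix of `current`, which holds for typical
  // append-only chat templates.
  return current.substr(previous.size());
}
```

This is why the Session only ever processes the new portion of the prompt,
even though the template is rendered over the full history each turn.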

### ConversationConfig

[`ConversationConfig`][ConversationConfig] is used to initialize a
[`Conversation`][Conversation] instance. You can create this configuration in a
couple of ways:

1.  **From an [`Engine`][Engine]:** This method uses the default
    [`SessionConfig`][SessionConfig] associated with the engine.
2.  **From a specific [`SessionConfig`][SessionConfig]:** This allows for more
    fine-grained control over the session settings.

Beyond session settings, you can further customize the
[`Conversation`][Conversation] behavior within the
[`ConversationConfig`][ConversationConfig]. This includes:

*   Providing a [`Preface`][Preface].
*   Overwriting the default [`PromptTemplate`][PromptTemplate].
*   Overwriting the default [`DataProcessorConfig`][DataProcessorConfig].

These overwrites are particularly useful for fine-tuned models, which might
require different configurations or prompt templates than the base model they
were derived from.

### MessageCallback

[`MessageCallback`][SendMessageAsync] is the callback function that users
should implement when using the asynchronous
[`SendMessageAsync`][SendMessageAsync] method.

The callback signature is `absl::AnyInvocable<void(absl::StatusOr<Message>)>`.
This function is triggered under the following conditions:

*   When a new chunk of the [`Message`][Message] is received from the Model.
*   If an error occurs during LiteRT-LM's message processing.
*   Upon completion of the LLM's inference, the callback is triggered with an
    empty [`Message`][Message] (e.g., `JsonMessage()`) to signal the end of the
    response.

Refer to the [Step 6 asynchronous call](#text-only-content) for an example
implementation.

> [!IMPORTANT] The [`Message`][Message] received by the callback contains only
> the latest chunk of the model's output, not the entire message history.

For example, if the complete model response expected from a blocking
[`SendMessage`][SendMessage] call would be:

```json
{
  "role": "model",
  "content": [
    {
      "type": "text",
      "text": "Hello World!"
    }
  ]
}
```

The callback in [`SendMessageAsync`][SendMessageAsync] might be invoked multiple
times, each time with a subsequent piece of the text:

```json
// 1st Message
{
  "role": "model",
  "content": [
    {
      "type": "text",
      "text": "He"
    }
  ]
}

// 2nd Message
{
  "role": "model",
  "content": [
    {
      "type": "text",
      "text": "llo"
    }
  ]
}

// 3rd Message
{
  "role": "model",
  "content": [
    {
      "type": "text",
      "text": " Wo"
    }
  ]
}

// 4th Message
{
  "role": "model",
  "content": [
    {
      "type": "text",
      "text": "rl"
    }
  ]
}

// 5th Message
{
  "role": "model",
  "content": [
    {
      "type": "text",
      "text": "d!"
    }
  ]
}
```

The implementer is responsible for accumulating these chunks if the complete
response is needed during the asynchronous stream. Alternatively, the full
response will be available as the last entry in the [`History`](#history) once
the asynchronous call is complete.

## Add ModelDataProcessor for new LLM model type

[WIP] 🚧

[conversation]: https://github.com/google-ai-edge/LiteRT-LM/blob/63f7dec93ac85560e64194a00b5d7c407de40846/runtime/conversation/conversation.h#L160 "litert::lm::Conversation"
[session]: https://github.com/google-ai-edge/LiteRT-LM/blob/63f7dec93ac85560e64194a00b5d7c407de40846/runtime/engine/engine.h#L71 "litert::lm::Session"
[engine]: https://github.com/google-ai-edge/LiteRT-LM/blob/63f7dec93ac85560e64194a00b5d7c407de40846/runtime/engine/engine.h#L65 "litert::lm::Engine"
[ModelDataProcessor]: https://github.com/google-ai-edge/LiteRT-LM/blob/63f7dec93ac85560e64194a00b5d7c407de40846/runtime/conversation/conversation.h#L205 "litert::lm::ModelDataProcessor"
[InputData]: https://github.com/google-ai-edge/LiteRT-LM/blob/63f7dec93ac85560e64194a00b5d7c407de40846/runtime/engine/io_types.h#L154 "litert::lm::InputData"
[Responses]: https://github.com/google-ai-edge/LiteRT-LM/blob/63f7dec93ac85560e64194a00b5d7c407de40846/runtime/engine/io_types.h#L170 "litert::lm::Responses"
[sendmessage]: https://github.com/google-ai-edge/LiteRT-LM/blob/63f7dec93ac85560e64194a00b5d7c407de40846/runtime/conversation/conversation.h#L180 "Conversation::SendMessage"
[sendmessageasync]: https://github.com/google-ai-edge/LiteRT-LM/blob/63f7dec93ac85560e64194a00b5d7c407de40846/runtime/conversation/conversation.h#L205 "Conversation::SendMessageAsync"
[Jinja]: https://jinja.palletsprojects.com/en/stable/ "jinja prompt template"
[PromptTemplate]: https://github.com/google-ai-edge/LiteRT-LM/blob/main/runtime/components/prompt_template.h "litert::lm::PromptTemplate"
[message]: https://github.com/google-ai-edge/LiteRT-LM/blob/63f7dec93ac85560e64194a00b5d7c407de40846/runtime/conversation/io_types.h#L28 "litert::lm::Message"
[jsonmessage]: https://github.com/google-ai-edge/LiteRT-LM/blob/63f7dec93ac85560e64194a00b5d7c407de40846/runtime/conversation/io_types.h#L25 "litert::lm::JsonMessage"
[ordered_json]: https://json.nlohmann.me/api/ordered_json/ "ordered_json"
[preface]: https://github.com/google-ai-edge/LiteRT-LM/blob/63f7dec93ac85560e64194a00b5d7c407de40846/runtime/conversation/io_types.h#L48 "litert::lm::Preface"
[ConversationConfig]: https://github.com/google-ai-edge/LiteRT-LM/blob/63f7dec93ac85560e64194a00b5d7c407de40846/runtime/conversation/conversation.h#L44 "litert::lm::ConversationConfig"
[SessionConfig]: https://github.com/google-ai-edge/LiteRT-LM/blob/63f7dec93ac85560e64194a00b5d7c407de40846/runtime/engine/engine_settings.h#L145 "litert::lm::SessionConfig"
[DataProcessorConfig]: https://github.com/google-ai-edge/LiteRT-LM/blob/63f7dec93ac85560e64194a00b5d7c407de40846/runtime/conversation/model_data_processor/config_registry.h#L29 "litert::lm::DataProcessorConfig"
[DataProcessorArguments]: https://github.com/google-ai-edge/LiteRT-LM/blob/63f7dec93ac85560e64194a00b5d7c407de40846/runtime/conversation/model_data_processor/config_registry.h#L38 "litert::lm::DataProcessorArguments"
[LlmModelType]: https://github.com/google-ai-edge/LiteRT-LM/blob/main/runtime/proto/llm_model_type.proto "litert.lm.proto.LlmModelType"