# Constrained Decoding in LiteRT-LM
LiteRT-LM supports constrained decoding, allowing you to enforce specific
structures on the model's output. This is particularly useful for tasks like:
- **Function Calling**: Ensuring the model outputs a valid function call
matching a specific schema.
- **Structured Data Extraction**: Forcing the output to adhere to a fixed
format (e.g., a regex pattern).
- **Grammar Enforcement**: Using context-free grammars (via Lark) to guide
generation.
This document explains how to enable, configure, and use constrained decoding in
your application.
## Enabling Constrained Decoding
To use constrained decoding, you must enable it in the `ConversationConfig` when
creating your `Conversation` instance.
```cpp
#include "runtime/conversation/conversation.h"
// ...
ConversationConfig::Builder builder;
builder.SetEnableConstrainedDecoding(true);
// Set a ConstraintProviderConfig in the ConversationConfig::Builder.
// This line sets the constraint provider to LLGuidance with default settings.
builder.SetConstraintProviderConfig(LlGuidanceConfig());
auto config = builder.Build(*engine);
```
### Constraint Providers
LiteRT-LM supports different backends for constrained decoding, configured via
`ConstraintProviderConfig`:
1. **LLGuidance (`LlGuidanceConfig`)**: Uses the
[LLGuidance](https://github.com/guidance-ai/llguidance) library. Supports
Regex, JSON Schema, and Lark grammars.
2. **External (`ExternalConstraintConfig`)**: Allows passing a pre-constructed
`Constraint` object per-request. Useful for custom C++ constraint
implementations.
## Using Constraints in `SendMessage`
Once enabled, you can apply constraints to individual messages using the
`decoding_constraint` field in the `OptionalArgs` struct passed to `SendMessage`
or `SendMessageAsync`. This field is of type `std::optional<ConstraintArg>`.
### 1. LLGuidance Constraints
LLGuidance constraints can be specified as Regex, JSON Schema, or Lark grammars.
#### Regex Constraint
Constrain the output to match a regular expression.
```cpp
#include "runtime/components/constrained_decoding/llg_constraint_config.h"
// ...
LlGuidanceConstraintArg constraint_arg;
constraint_arg.constraint_type = LlgConstraintType::kRegex;
// Example: Force output to be a sequence of 'a's followed by 'b's
constraint_arg.constraint_string = "a+b+";
auto response = conversation->SendMessage(
    user_message,
    {.decoding_constraint = constraint_arg}
);
```
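As a sanity check during development, the returned text can be tested against the same pattern using the standard library. The sketch below is not part of LiteRT-LM: `MatchesPattern` is a hypothetical helper, and `std::regex` uses ECMAScript syntax by default, which may differ from the regex dialect LLGuidance accepts for anything beyond simple patterns like this one.

```cpp
#include <regex>
#include <string>

// Hypothetical helper (not a LiteRT-LM API): returns true only if the
// entire `output` string matches `pattern`.
bool MatchesPattern(const std::string& output, const std::string& pattern) {
  // std::regex defaults to the ECMAScript grammar.
  const std::regex re(pattern);
  // regex_match requires a full-string match, unlike regex_search.
  return std::regex_match(output, re);
}
```

With the pattern from the example, `MatchesPattern("aaabb", "a+b+")` holds, while strings such as `"ba"` or the empty string do not match.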
#### JSON Schema Constraint
Constrain the output to be a valid JSON object matching a schema.
```cpp
LlGuidanceConstraintArg constraint_arg;
constraint_arg.constraint_type = LlgConstraintType::kJsonSchema;
// Example: Simple JSON object with a "name" field
constraint_arg.constraint_string = R"({
  "type": "object",
  "properties": {
    "name": {"type": "string"}
  },
  "required": ["name"]
})";
auto response = conversation->SendMessage(
    user_message,
    {.decoding_constraint = constraint_arg}
);
```
#### Lark Grammar Constraint
Constrain the output to follow a Lark grammar.
```cpp
LlGuidanceConstraintArg constraint_arg;
constraint_arg.constraint_type = LlgConstraintType::kLark;
// Example: A simple calculator grammar
constraint_arg.constraint_string = R"(
start: expr
expr: atom
    | expr "+" atom
    | expr "-" atom
    | expr "*" atom
    | expr "/" atom
    | "(" expr ")"
atom: /[0-9]+/
WS: /[ \t\n\f]+/
%ignore WS
)";
auto response = conversation->SendMessage(
    user_message,
    {.decoding_constraint = constraint_arg}
);
```
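To see which strings this grammar admits, it can help to hand-write a small recognizer for it. The following is a minimal sketch that does not use LLGuidance or LiteRT-LM at all: `CalcRecognizer` and `AcceptsCalcExpression` are hypothetical names, and the left-recursive `expr` rule is rewritten as `primary (op atom)*`, where `primary` is either an `atom` or a parenthesized `expr`. It assumes the usual Lark reading of `%ignore WS`, namely that whitespace may appear between tokens.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical recursive-descent recognizer for the calculator grammar.
class CalcRecognizer {
 public:
  explicit CalcRecognizer(std::string text) : text_(std::move(text)) {}

  // True if the whole input is a valid `expr`.
  bool Accepts() {
    pos_ = 0;
    if (!ParseExpr()) return false;
    SkipWhitespace();
    return pos_ == text_.size();
  }

 private:
  static bool IsWs(char c) {
    return c == ' ' || c == '\t' || c == '\n' || c == '\f';
  }

  void SkipWhitespace() {
    while (pos_ < text_.size() && IsWs(text_[pos_])) ++pos_;
  }

  // atom: /[0-9]+/
  bool ParseAtom() {
    SkipWhitespace();
    const std::size_t start = pos_;
    while (pos_ < text_.size() &&
           std::isdigit(static_cast<unsigned char>(text_[pos_]))) {
      ++pos_;
    }
    return pos_ > start;
  }

  // primary: atom | "(" expr ")"
  bool ParsePrimary() {
    SkipWhitespace();
    if (pos_ < text_.size() && text_[pos_] == '(') {
      ++pos_;
      if (!ParseExpr()) return false;
      SkipWhitespace();
      if (pos_ >= text_.size() || text_[pos_] != ')') return false;
      ++pos_;
      return true;
    }
    return ParseAtom();
  }

  // expr: primary (("+" | "-" | "*" | "/") atom)*
  bool ParseExpr() {
    if (!ParsePrimary()) return false;
    while (true) {
      SkipWhitespace();
      if (pos_ >= text_.size()) return true;
      const char c = text_[pos_];
      if (c != '+' && c != '-' && c != '*' && c != '/') return true;
      ++pos_;
      if (!ParseAtom()) return false;
    }
  }

  std::string text_;
  std::size_t pos_ = 0;
};

bool AcceptsCalcExpression(const std::string& text) {
  return CalcRecognizer(text).Accepts();
}
```

Note that the right-hand operand of every operator is an `atom`, so `AcceptsCalcExpression("(1+2)*3")` is true but `AcceptsCalcExpression("3*(1+2)")` is false. If parenthesized operands on the right are wanted, the grammar itself would need to change.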
### 2. External Constraints
If you have a custom implementation of the `Constraint` interface (e.g., a
highly specialized C++ state machine), you can use `ExternalConstraintArg`.
Prerequisite: You must have initialized `Conversation` with
`ExternalConstraintConfig`.
```cpp
#include <memory>
#include <utility>
// ...
// 1. Initialize with ExternalConstraintConfig
auto config = ConversationConfig::Builder()
                  .SetEnableConstrainedDecoding(true)
                  .SetConstraintProviderConfig(ExternalConstraintConfig())
                  .Build(*engine);
auto conversation = Conversation::Create(*engine, config);
// 2. Create your custom constraint (must implement litert::lm::Constraint)
class MyCustomConstraint : public Constraint {
  // Implement Start, ComputeNext, etc.
};
auto my_constraint = std::make_unique<MyCustomConstraint>();
// 3. Pass it to SendMessage. ExternalConstraintArg holds a std::unique_ptr,
//    so it is move-only and must be moved into the call.
ExternalConstraintArg external_constraint;
external_constraint.constraint = std::move(my_constraint);
auto response = conversation->SendMessage(
    user_message,
    {.decoding_constraint = std::move(external_constraint)}
);
```
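The general shape of a state-machine constraint can be sketched without the library. The example below is a toy and deliberately does NOT implement `litert::lm::Constraint`: the method names `Start` and `ComputeNext` echo the comment above, but their signatures, the `AllowedTokens` method, the three-token vocabulary, and every type name here are assumptions made for illustration. It enforces the same `a+b+` language as the regex example, one token at a time.

```cpp
#include <array>
#include <cstddef>
#include <optional>

// Toy vocabulary of three token IDs. Real tokenizers have many thousands of
// multi-character tokens; this only illustrates the state-machine idea.
enum Token : std::size_t { kTokA = 0, kTokB = 1, kTokEos = 2 };
constexpr std::size_t kVocabSize = 3;

enum class State { kStart, kInA, kInB, kDone };

// One flag per token ID: true means the token may be sampled next.
using TokenMask = std::array<bool, kVocabSize>;

// Hypothetical stand-in, not the litert::lm::Constraint interface.
class ABPlusConstraint {
 public:
  State Start() const { return State::kStart; }

  // Tokens permitted from `state` for the language a+b+ followed by EOS.
  TokenMask AllowedTokens(State state) const {
    switch (state) {
      case State::kStart: return {true, false, false};  // must begin with 'a'
      case State::kInA:   return {true, true, false};   // more 'a' or first 'b'
      case State::kInB:   return {false, true, true};   // more 'b' or stop
      case State::kDone:  return {false, false, false}; // nothing after EOS
    }
    return {false, false, false};
  }

  // Next state after consuming `token`, or nullopt if it is not allowed.
  std::optional<State> ComputeNext(State state, Token token) const {
    if (!AllowedTokens(state)[token]) return std::nullopt;
    if (token == kTokEos) return State::kDone;
    return token == kTokA ? State::kInA : State::kInB;
  }
};
```

A decoder using such an object would typically mask the logits with `AllowedTokens` before sampling and then advance the state with `ComputeNext`. How LiteRT-LM's actual interface divides these responsibilities should be checked against the `Constraint` header.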
## API Reference
### `ConstraintProviderConfig`
A variant configuration passed to `ConversationConfig`.
- `LlGuidanceConfig`: Configures LLGuidance.
    - `eos_id`: Optional override for the End-of-Sequence token ID.
- `ExternalConstraintConfig`: Empty struct (marker) to enable external
  constraints.
### `ConstraintArg`
A variant argument passed via `OptionalArgs` to `SendMessage`.
- `LlGuidanceConstraintArg`:
    - `constraint_type`: `kRegex`, `kJsonSchema`, or `kLark`.
    - `constraint_string`: The pattern/schema/grammar string.
- `ExternalConstraintArg`:
    - `constraint`: `std::unique_ptr<Constraint>`. Ownership is transferred to
      the decoder for that request.
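The ownership rule for `ExternalConstraintArg` follows from ordinary `std::unique_ptr` and `std::variant` semantics, which the sketch below demonstrates with stand-in types. Everything prefixed `Demo` and the `ConsumeArg` function are hypothetical and only mirror the shapes listed above; they are not the LiteRT-LM definitions.

```cpp
#include <memory>
#include <optional>
#include <string>
#include <utility>
#include <variant>

// Hypothetical stand-ins shaped like the fields described above.
struct DemoConstraint {
  virtual ~DemoConstraint() = default;
};
struct DemoLlgArg {
  std::string constraint_string;
};
struct DemoExternalArg {
  std::unique_ptr<DemoConstraint> constraint;  // makes the struct move-only
};
using DemoConstraintArg = std::variant<DemoLlgArg, DemoExternalArg>;

// Takes the argument by value, so a move-only alternative must be moved in.
// Returns true if it received a non-null external constraint.
bool ConsumeArg(std::optional<DemoConstraintArg> arg) {
  if (!arg.has_value()) return false;
  const auto* external = std::get_if<DemoExternalArg>(&*arg);
  return external != nullptr && external->constraint != nullptr;
}
```

After `ConsumeArg(std::move(ext))`, the caller's `ext.constraint` is null because the pointer now lives inside the callee's copy. This is why the external-constraint example passes the argument with `std::move` and why the same constraint object cannot be reused for a second request.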