# task_generator — Procedural Task-Brief Generator

**Module path:** `driftcall/task_generator.py`
**Owner:** Person A (Environment)
**Implements:** DESIGN.md §4.2 (`reset()` semantics), §8 (Dataset Strategy — §8.2, §8.3, §8.4), §10.3 (curriculum language mix)
**Consumed by:** `driftcall/env.py` (`DriftCallEnv.reset()`)
**Status:** Design spec — no code yet.

---

## 1. Purpose

`task_generator` is the deterministic, seeded source of every `GoalSpec` consumed by `DriftCallEnv.reset()`. It expands a small hand-authored template library (4 domains × 5 templates × 10 source cities × 10 destinations × 5 languages × 20 drift-compatible slot combinations = **200,000 distinct episode variants**, DESIGN.md §8.4) into concrete per-episode briefs.

One call — `generate(seed, stage, language_weights)` — returns a single fully-populated `GoalSpec` with:

1. A domain (`airline` | `cab` | `restaurant` | `hotel`) chosen deterministically from `seed`.
2. A template variant for that domain, filled with sampled slots (cities, dates, budgets, time windows, dietary flags, etc.).
3. A language picked from the caller-supplied `language_weights` distribution.
4. A `seed_utterance` — the natural-language voice brief in the chosen language, with Unicode-correct Devanagari / Tamil / Kannada script and Hinglish Roman transliteration.
5. `slots` and `constraints` dicts suitable for the reward graders (R1 task completion, R3 constraint adherence — DESIGN.md §7.1).

**Determinism is the contract.** Identical `(seed, stage, language_weights)` triples always produce identical `GoalSpec`s, byte-for-byte after NFC normalization. This enables reproducible training, reproducible evals, and reproducible drift scheduling downstream (DESIGN.md §6.2 — drift schedules are themselves seeded off the same episode ID).

The generator owns **no random global state**. Every stochastic choice threads through `random.Random(seed_for_this_decision)` where the sub-seed is derived from `(seed, decision_tag)` via a stable hash. It does **not** own drift selection — that belongs to `drift_injector` (DESIGN.md §6), which receives the same `seed` and composes its own schedule against the `GoalSpec.domain`.

---

## 2. Interface

All types are imported from `driftcall.models` (see `docs/modules/models.md`). All dataclasses are frozen.

### 2.1 Primary entry point

```python
from __future__ import annotations
from typing import Literal

from driftcall.models import GoalSpec, LanguageCode

def generate(
    seed: int,
    stage: Literal[1, 2, 3],
    language_weights: dict[LanguageCode, float],
) -> GoalSpec:
    """
    Produce a single fully-populated GoalSpec for episode ``seed`` at curriculum ``stage``.

    Determinism: identical (seed, stage, language_weights) ⇒ identical GoalSpec
    after Unicode NFC normalization of ``seed_utterance``.

    :param seed:              non-negative int, episode identifier; also the root
                              seed for all sub-choices (domain, template, slots,
                              language, utterance variant).
    :param stage:             curriculum stage ∈ {1, 2, 3}; affects allowed
                              template complexity (stage 1 uses simple templates
                              only; stage 3 enables drift-compatible slots).
    :param language_weights:  normalized distribution over LanguageCode keys;
                              values must be non-negative and sum to 1.0 ± 1e-6.

    :returns: GoalSpec whose .seed_utterance is NFC-normalized UTF-8.

    :raises InvalidLanguageError:       language_weights has a key outside LanguageCode.
    :raises InvalidLanguageWeightError: weights empty, negative, or sum ≠ 1.0.
    :raises InvalidStageError:          stage ∉ {1, 2, 3}.
    :raises InvalidBudgetError:         sampled budget outside template's declared
                                        [low, high] range (indicates corrupt template).
    :raises MissingSlotError:           template variant references a {slot}
                                        placeholder not present in the filled slot dict.
    :raises TemplateFileMissingError:   ``data/task_briefs/templates.yaml`` not found
                                        or malformed.
    :raises UnicodeNormalizationError:  rendered utterance fails NFC round-trip check
                                        (raised defensively — should never fire in practice).
    """
```

### 2.2 Helper signatures (all module-private except where noted)

```python
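from __future__ import annotations
from pathlib import Path
from typing import Iterator, Literal

from driftcall.models import GoalSpec, LanguageCode
# TemplateLibrary, Template, and SlotGrid are the in-memory types from §4.2.

# --- template loader (public for tests + corpus packaging) ---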
def load_templates(path: Path | str = "data/task_briefs/templates.yaml") -> TemplateLibrary:
    """
    Parse the YAML template file, validate the schema (§4 below),
    and return an in-memory TemplateLibrary.

    Called once, on first use, via a lazy singleton; callers inside the module
    should go through ``_get_library()``. Exposed publicly for unit tests
    and the dataset-packaging script that writes ``train/briefs.jsonl``
    (DESIGN.md §8.6).

    :raises TemplateFileMissingError:   path does not exist.
    :raises TemplateSchemaError:        YAML present but fails schema validation
                                        (missing required key, wrong type, etc.).
    """

# --- domain + template picker ---
def _pick_domain(seed: int) -> Literal["airline", "cab", "restaurant", "hotel"]:
    """Uniform over 4 domains, seeded by hash(seed, 'domain')."""

def _pick_template(seed: int, stage: int, domain: str, library: TemplateLibrary) -> Template:
    """
    Uniform over templates for ``domain`` whose ``min_stage`` ≤ ``stage``.
    Seeded by hash(seed, 'template').
    """

# --- slot expander ---
def _expand_slots(seed: int, template: Template) -> SlotGrid:
    """
    For each slot in the template's required_slots + optional_slots + constraints_template,
    sample one concrete value per the slot's declared distribution.

    Returns a SlotGrid: a frozen mapping of slot-name -> concrete value.

    Handles:
      - enum slots (``choices: [...]``)
      - uniform numeric ranges (``distribution: uniform, low, high, step``)
      - city slots (from the 10×10 city/destination grid, domain-filtered)
      - date slots (relative to a fixed reference date, DESIGN.md §11.1 — deterministic)
      - boolean slots (veg_only, etc.)
    """

# --- language picker ---
def _pick_language(seed: int, language_weights: dict[LanguageCode, float]) -> LanguageCode:
    """
    Weighted draw from ``language_weights`` seeded by hash(seed, 'language').
    ``language_weights`` is validated by ``generate()`` before this is called.
    """

# --- utterance formatter ---
def _format_utterance(
    seed: int,
    template: Template,
    slots: SlotGrid,
    language: LanguageCode,
) -> str:
    """
    Pick one of the template.language_variants[language] strings (uniform,
    seeded by hash(seed, 'variant')), substitute every {slot} placeholder
    with the Unicode-correct rendering of slots[slot], and return the
    NFC-normalized result.

    :raises MissingSlotError:          format string references {X} but X not in slots.
    :raises UnicodeNormalizationError: NFC round-trip fails.
    """

# --- public helper: list all (seed, stage, lang_weights) combos for dataset packaging ---
def enumerate_variants(
    limit: int | None = None,
    stage: int = 3,
    language_weights: dict[LanguageCode, float] | None = None,
) -> Iterator[GoalSpec]:
    """
    Deterministic walk over the procedural grid, yielding up to ``limit``
    GoalSpecs. Used by DESIGN.md §8.6 to produce ``train/briefs.jsonl``
    and ``val/briefs.jsonl``. Not called from env.reset().

    Walk order: domain (4) → template (5) → from×to (10×10) → language (5)
    → utterance variant. Stable across runs.
    """
```
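
The walk order described for `enumerate_variants` can be pictured as a nested product in which the first axis varies slowest. The sketch below is only an illustration of that ordering: `walk_grid`, the integer template and city indices, and the default sizes are hypothetical stand-ins, and the utterance-variant axis is left out for brevity.

```python
from itertools import islice, product

DOMAINS = ("airline", "cab", "restaurant", "hotel")
LANGUAGES = ("hi", "ta", "kn", "en", "hinglish")


def walk_grid(templates_per_domain=5, cities=10, limit=None):
    """Yield (domain, template_idx, from_idx, to_idx, language) tuples.

    itertools.product varies its last argument fastest, so the domain is the
    outermost loop and the language the innermost, matching the stated order.
    """
    grid = product(
        DOMAINS,
        range(templates_per_domain),
        range(cities),
        range(cities),
        LANGUAGES,
    )
    # islice with stop=None walks the whole grid; otherwise it stops at `limit`.
    return islice(grid, limit)


first_three = list(walk_grid(limit=3))
```

Because the product is built from fixed tuples and ranges, two runs always visit the cells in the same order, which is the property the packaging script relies on.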

---

## 3. Behavior Spec

### 3.1 Determinism via seed (DESIGN.md §4.2, §8.4)

- Every sub-decision uses `random.Random(stable_sub_seed(seed, tag))` where `stable_sub_seed` is `int.from_bytes(hashlib.blake2b(f"{seed}:{tag}".encode(), digest_size=8).digest(), "big")`.
- Valid tags: `"domain"`, `"template"`, `"slots"`, `"language"`, `"variant"`, plus per-slot tags `f"slot:{slot_name}"`.
- Never call `random.random()` (global state) or `time.time()` anywhere in the module.
- `generate(42, 1, W)` on two machines with identical Python versions returns byte-identical `GoalSpec.seed_utterance` after NFC normalization.
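
The sub-seed derivation above can be sketched in a few lines. `stable_sub_seed` follows the formula given in the first bullet; `rng_for` is a hypothetical convenience wrapper that is not named in the spec.

```python
import hashlib
import random


def stable_sub_seed(seed: int, tag: str) -> int:
    """Derive a 64-bit sub-seed from (seed, tag) with blake2b, as in §3.1."""
    digest = hashlib.blake2b(f"{seed}:{tag}".encode(), digest_size=8).digest()
    return int.from_bytes(digest, "big")


def rng_for(seed: int, tag: str) -> random.Random:
    """Hypothetical helper: one private RNG per decision, never the global one."""
    return random.Random(stable_sub_seed(seed, tag))


domains = ("airline", "cab", "restaurant", "hotel")
first = rng_for(42, "domain").choice(domains)
second = rng_for(42, "domain").choice(domains)
# `first` and `second` are the same domain on every run and every process.
```

A keyed digest is used here rather than Python's built-in `hash()` because string hashing is salted per process (see `PYTHONHASHSEED`), which would break cross-run reproducibility.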

### 3.2 Language-weight sampling

- `language_weights` is the caller's contract for the curriculum mix (DESIGN.md §10.3 defines Stage-1 50/30/20 and Stage-2/3 30/30/20/10/10 splits).
- `generate()` **validates** weights before sampling. Each check binds to exactly one exception:
  - **Unsupported key**: any key ∉ `{"hi", "ta", "kn", "en", "hinglish"}` (LanguageCode) → raises `InvalidLanguageError`.
  - **Empty weights dict**: `len(language_weights) == 0` → raises `InvalidLanguageWeightError`.
  - **Negative weight value**: any `w < 0` → raises `InvalidLanguageWeightError`.
  - **Sum outside tolerance**: `|sum(weights) − 1.0| > 1e-6` → raises `InvalidLanguageWeightError`.
  - **All weights zero**: defensive assertion, redundant with the sum check (if every weight is ≥ 0 and the sum is 1 ± 1e-6, at least one weight must be > 0). Kept as an explicit invariant so that `random.Random.choices` is never handed an all-zero weight vector. Raises `InvalidLanguageWeightError`.
- `_pick_language` uses `random.Random(sub_seed).choices(population, weights=w, k=1)[0]`.
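
A minimal sketch of the validation order and the weighted draw follows. The two exception classes are local stand-ins for the ones named in §5, and sorting the population before the draw is an assumption made here so that dict insertion order cannot change the result; the spec does not state how the population is ordered.

```python
from __future__ import annotations

import random

SUPPORTED = ("hi", "ta", "kn", "en", "hinglish")


class InvalidLanguageError(ValueError):
    """Stand-in for the spec's exception of the same name."""


class InvalidLanguageWeightError(ValueError):
    """Stand-in for the spec's exception of the same name."""


def validate_language_weights(weights: dict[str, float]) -> None:
    unsupported = sorted(k for k in weights if k not in SUPPORTED)
    if unsupported:
        raise InvalidLanguageError(f"unsupported language keys: {unsupported}")
    if not weights:
        raise InvalidLanguageWeightError("language_weights is empty")
    if any(w < 0 for w in weights.values()):
        raise InvalidLanguageWeightError("negative weight")
    if abs(sum(weights.values()) - 1.0) > 1e-6:
        raise InvalidLanguageWeightError("weights must sum to 1.0 within 1e-6")
    if all(w == 0 for w in weights.values()):
        raise InvalidLanguageWeightError("all weights are zero")


def pick_language(sub_seed: int, weights: dict[str, float]) -> str:
    validate_language_weights(weights)
    population = sorted(weights)  # assumption: fixed order, independent of dict order
    w = [weights[k] for k in population]
    return random.Random(sub_seed).choices(population, weights=w, k=1)[0]
```

A language with weight 0 can stay in the dict (as in §8) without ever being drawn, since `choices` only selects entries with positive cumulative mass.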

### 3.3 Slot combinatorial grid (DESIGN.md §8.4)

- Each template declares `required_slots`, `optional_slots`, and `constraints_template` (§4 below).
- The source × destination city grid is **domain-scoped**: airline + hotel draw from inter-city pairs; cab + restaurant draw from intra-city locations. Both lists are 10 entries each per domain (40 total unique cities across domains, deduped in the YAML).
- Optional slots are included with probability 0.5 (seeded).
- Date slots are sampled relative to a fixed reference date `2026-04-25` from a 60-day forward window (so train/val sets are temporally stable).
- Budget slots sample on the declared `step` grid: e.g., `uniform 3000..15000 step 500` yields one of `{3000, 3500, …, 15000}`.
- Stage 1 uses only templates flagged `min_stage: 1`; stage 2 also admits `min_stage: 2`; stage 3 admits every template, including `min_stage: 3` (more complex compound-constraint templates with drift-compatible slot layouts).
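
The step-grid, date-window, and optional-slot rules above could be sampled roughly as follows. The helper names are hypothetical, and treating the 60-day window as days 1 through 60 after the reference date is an assumption; the spec only says "60-day forward window".

```python
import random
from datetime import date, timedelta

REFERENCE_DATE = date(2026, 4, 25)  # fixed reference date from §3.3
WINDOW_DAYS = 60


def sample_on_grid(rng: random.Random, low: int, high: int, step: int) -> int:
    """Pick one value from {low, low+step, ..., high}.

    Assumes (high - low) is a multiple of step, which the loader is
    described as enforcing at load time.
    """
    n_steps = (high - low) // step
    return low + step * rng.randint(0, n_steps)  # randint is inclusive on both ends


def sample_date(rng: random.Random) -> date:
    """Pick a day 1..60 days after the reference date (bounds are an assumption)."""
    return REFERENCE_DATE + timedelta(days=rng.randint(1, WINDOW_DAYS))


def include_optional(rng: random.Random) -> bool:
    """Optional slots are included with probability 0.5."""
    return rng.random() < 0.5


rng = random.Random(2026)  # in the module this would be a per-slot sub-seeded RNG
budget = sample_on_grid(rng, 3000, 15000, 500)
when = sample_date(rng)
```

Sampling an integer step index and multiplying keeps the result exactly on the grid, which avoids the float rounding that a `low + random() * (high - low)` approach would need to correct.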

### 3.4 Unicode handling for Hindi, Tamil, Kannada

- Template YAML is authored in NFC (Unicode Normalization Form C). The loader **re-normalizes** on read (defensive).
- After slot substitution, `_format_utterance` calls `unicodedata.normalize("NFC", s)` and asserts `unicodedata.is_normalized("NFC", s)` — if not, raises `UnicodeNormalizationError`.
- City names, dish names, and day-of-week translations for Hindi / Tamil / Kannada live in a static lookup table (`data/task_briefs/i18n.yaml`, loaded by `load_templates`). English and Hinglish both use Roman script only (ASCII plus `₹`), with no Devanagari glyphs.
- **`i18n.yaml` is NFC-normalized at load time.** `load_templates` applies `unicodedata.normalize("NFC", v)` to every string value parsed out of `data/task_briefs/i18n.yaml` (city names, weekday names, dish names, domain-specific nouns — across `hi`, `ta`, `kn`, `en`, `hinglish`) before those strings are stored in `TemplateLibrary.i18n`. The same NFC pass is applied to every string inside `templates.yaml` (variant strings, choices enums, slot labels). Consequence: every string that `_expand_slots` pulls into a `SlotGrid` is already NFC, so downstream consumers — `_format_utterance`, reward R1 string-equality comparisons (DESIGN.md §7.1), and audit logging — may assume NFC without re-normalizing.
- Hinglish is **always Roman-script** (no mixed scripts); Hindi is **always Devanagari-script**. A template that tries to mix the two in a single variant is rejected at load time.
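
As an illustration of the NFC pass and the script-mixing rule, here is a small sketch. The function names are hypothetical, `ValueError` stands in for the spec's exceptions, and the exact definition of "mixed" (checked here on the literal text of a variant, with `{slot}` placeholders excluded) is an assumption.

```python
import unicodedata
from string import Formatter


def to_nfc(text: str) -> str:
    """Normalize to NFC and verify the result, mirroring the defensive check."""
    normalized = unicodedata.normalize("NFC", text)
    if not unicodedata.is_normalized("NFC", normalized):
        raise ValueError("NFC round-trip failed")  # stand-in for UnicodeNormalizationError
    return normalized


def literal_text(variant: str) -> str:
    """Return the variant with its {slot} placeholders removed."""
    return "".join(literal for literal, _, _, _ in Formatter().parse(variant))


def has_devanagari(text: str) -> bool:
    return any(0x0900 <= ord(ch) <= 0x097F for ch in text)


def has_latin_letters(text: str) -> bool:
    return any(ch.isascii() and ch.isalpha() for ch in text)


def check_script(language: str, variant: str) -> None:
    text = literal_text(to_nfc(variant))
    if language == "hinglish" and has_devanagari(text):
        raise ValueError("hinglish variant contains Devanagari")  # stand-in for TemplateSchemaError
    if language == "hi" and has_latin_letters(text):
        raise ValueError("hi variant contains Roman letters")


# Tamil vowel sign O typed as two code points (NFD) composes to one under NFC.
decomposed = "\u0b95\u0bc6\u0bbe"
composed = to_nfc(decomposed)
```

Stripping placeholders first matters because slot names such as `{when}` are ASCII and would otherwise make every Hindi variant look mixed.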

### 3.5 Stage-aware complexity

| Stage | Templates allowed | Compound constraints | Drift-compatible slot layout |
|---|---|---|---|
| 1 | `min_stage: 1` only (simple: domain + 1 required slot + up to 2 constraints) | No | No — slots chosen from v1-schema-compatible fields only |
| 2 | `min_stage` ≤ 2 | Up to 2 constraints | Slots cover fields likely to be renamed (`price`, `fare_inr`) so drift is observable |
| 3 | all templates | Up to 3 constraints | Slots must include ≥ 1 field that a Stage-3 compound drift will touch |
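
The "Templates allowed" column reduces to a `min_stage ≤ stage` filter followed by a uniform draw. The sketch below uses a hypothetical `TemplateStub` with only the fields the filter needs, and `LookupError` is a placeholder since the spec does not name an exception for an empty pool.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class TemplateStub:
    """Hypothetical cut-down Template: just the fields the stage filter reads."""
    template_id: str
    domain: str
    min_stage: int


def eligible_templates(templates, domain, stage):
    """Templates for `domain` whose min_stage does not exceed the current stage."""
    return tuple(t for t in templates if t.domain == domain and t.min_stage <= stage)


def pick_template(rng: random.Random, templates, domain, stage):
    pool = eligible_templates(templates, domain, stage)
    if not pool:
        raise LookupError(f"no template for {domain!r} at stage {stage}")
    return rng.choice(pool)


library = (
    TemplateStub("airline.simple", "airline", 1),
    TemplateStub("airline.compound", "airline", 3),
    TemplateStub("cab.simple", "cab", 1),
)
stage1_pool = eligible_templates(library, "airline", 1)
```

Filtering before the draw keeps the distribution uniform over the allowed templates at each stage, rather than rejecting and resampling.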

"Drift-compatible slot layout" is a static property of the template (declared in YAML via `drift_slot_tags: [price, passenger_count, …]`) — the generator does **not** itself pick drifts; it only guarantees the slot surface is rich enough for `drift_injector` to have something meaningful to mutate.

### 3.6 Invariants (enforced by tests)

1. `generate(s, k, w) == generate(s, k, w)` for any valid `(s, k, w)`.
2. The returned `GoalSpec.language` appears in `language_weights` with weight > 0.
3. Every `{slot}` placeholder in `seed_utterance` is resolved — no literal `{…}` survives in the output.
4. `GoalSpec.seed_utterance` is in NFC.
5. Stage 1 never yields a template with `min_stage > 1`.
6. Numeric constraints (e.g., `budget_inr`) fall in the template's declared `[low, high]` range.
7. `seed_utterance` length ≤ 280 characters (a short-message-sized cap that keeps ASR inputs bounded at deploy time; see DESIGN.md §9).
8. Every string value in `SlotGrid.values` is NFC-normalized before `generate()` returns (guaranteed by the `i18n.yaml` + `templates.yaml` NFC pass in `load_templates`, §3.4). Reward R1 (string equality) and other downstream consumers may assume NFC on every slot string — they do not need to re-normalize.
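
Invariants 3, 4, and 7 are cheap to check on any rendered utterance. The checker below is a hypothetical test helper, not part of the module interface; it returns a list of violated invariants so that a test can assert the list is empty.

```python
from __future__ import annotations

import re
import unicodedata

MAX_UTTERANCE_CHARS = 280
_PLACEHOLDER = re.compile(r"\{[^{}]*\}")  # any surviving {slot}-style token


def utterance_violations(utterance: str) -> list[str]:
    """Return the names of the utterance invariants that `utterance` breaks."""
    problems = []
    if _PLACEHOLDER.search(utterance):
        problems.append("unresolved placeholder")
    if not unicodedata.is_normalized("NFC", utterance):
        problems.append("not NFC")
    if len(utterance) > MAX_UTTERANCE_CHARS:
        problems.append("too long")
    return problems


ok = utterance_violations(
    "Book the cheapest flight from HYD to BLR on 2026-05-02, budget under ₹7500"
)
```

Note that `len()` counts code points, so a Kannada or Tamil utterance is measured in code points rather than bytes or grapheme clusters; whether that is the intended unit for the 280 limit is not stated in the spec.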

---

## 4. Data Structures

### 4.1 Template YAML schema (matches DESIGN.md §8.3 exactly)

```yaml
# data/task_briefs/templates.yaml
- template_id: airline.book.budget_timewindow
  domain: airline                 # {airline, cab, restaurant, hotel}
  intent: book_flight             # free string; mirrored into GoalSpec.intent
  min_stage: 1                    # 1 | 2 | 3
  required_slots: [from, to, when]
  optional_slots: [seat_pref]
  constraints_template:
    budget_inr:
      distribution: uniform
      low: 3000
      high: 15000
      step: 500
    time_window:
      choices: [morning, afternoon, evening, late_night]
  drift_slot_tags: [price, total_fare_inr]  # used by drift_injector for targeting
  # Language keys are ISO short codes matching LanguageCode = Literal["hi","ta","kn","en","hinglish"].
  # Long names (hindi/tamil/kannada/english) are NOT accepted — loader rejects them via TemplateSchemaError.
  language_variants:
    hinglish:
      - "Bhai {when} ko {to} jaana hai, cheapest flight {time_window} mein, {budget_inr} rupees max"
      - "{when} ko {from} se {to} ka ticket book kar de, under {budget_inr}, {time_window} ke baad"
    hi:
      - "मुझे {when} को {from} से {to} जाना है, {budget_inr} रुपये से कम में"
    ta:
      - "{when} அன்று {from} லிருந்து {to} க்கு டிக்கெட் வேண்டும், {budget_inr} ரூபாய்க்கு கீழ்"
    kn:
      - "{when} ರಂದು {from} ಇಂದ {to} ಗೆ ಅಗ್ಗದ ವಿಮಾನ ಟಿಕೆಟ್ ಬೇಕು, {budget_inr} ರೂಪಾಯಿಗಳ ಒಳಗೆ"
    en:
      - "Book the cheapest flight from {from} to {to} on {when}, budget under ₹{budget_inr}, departing {time_window}"
```

### 4.2 In-memory types

```python
from __future__ import annotations
from dataclasses import dataclass
from typing import Literal, Mapping

LanguageCode = Literal["hi", "ta", "kn", "en", "hinglish"]
Domain = Literal["airline", "cab", "restaurant", "hotel"]

@dataclass(frozen=True)
class SlotDistribution:
    """Either an enum (``choices``) or a uniform numeric grid (``low``, ``high``, ``step``)."""
    kind: Literal["choices", "uniform"]
    choices: tuple[str, ...] | None = None
    low: float | None = None
    high: float | None = None
    step: float | None = None

@dataclass(frozen=True)
class Template:
    template_id: str
    domain: Domain
    intent: str
    min_stage: Literal[1, 2, 3]
    required_slots: tuple[str, ...]
    optional_slots: tuple[str, ...]
    constraints_template: Mapping[str, SlotDistribution]
    drift_slot_tags: tuple[str, ...]
    language_variants: Mapping[LanguageCode, tuple[str, ...]]  # ≥ 1 string per language

@dataclass(frozen=True)
class TemplateLibrary:
    templates: tuple[Template, ...]
    cities_by_domain: Mapping[Domain, tuple[str, ...]]
    i18n: Mapping[LanguageCode, Mapping[str, str]]  # e.g., {"hi": {"BLR": "बेंगलुरु", …}}

@dataclass(frozen=True)
class SlotGrid:
    """Concrete slot values after expansion. Keys are slot names; values are already
    localized to the chosen language (e.g., city rendered in Devanagari for 'hi')."""
    values: Mapping[str, object]  # str | int | float | bool

@dataclass(frozen=True)
class RawBrief:
    """Intermediate product: slots filled, language chosen, utterance not yet rendered.
    Used internally for testability — generate() returns a GoalSpec, not a RawBrief."""
    template_id: str
    domain: Domain
    intent: str
    slots: SlotGrid
    constraints: Mapping[str, object]
    language: LanguageCode
```

`GoalSpec` itself is defined in `driftcall/models.py` (DESIGN.md §4.1) and is the final product of `generate()`. The generator copies `RawBrief` fields into `GoalSpec` and adds the rendered `seed_utterance`.

---

## 5. Error Modes

All exceptions subclass `TaskGeneratorError(Exception)`. Each is raised from exactly one function in the module (the "Where raised" column below) and has a test asserting it.

| Exception | Trigger | Where raised |
|---|---|---|
| `MissingSlotError` | template variant references `{X}` but X not in filled `SlotGrid` | `_format_utterance` |
| `InvalidLanguageError` | `language_weights` contains a key ∉ LanguageCode (e.g., `"hindi"`, `"marathi"`) | `generate` (pre-sample validation) |
| `InvalidLanguageWeightError` | empty dict, OR any value < 0, OR sum ∉ [1−1e-6, 1+1e-6], OR all weights = 0 (defensive, redundant with sum-check) | `generate` |
| `InvalidStageError` | `stage ∉ {1, 2, 3}` | `generate` |
| `InvalidBudgetError` | sampled numeric falls outside declared `[low, high]` (indicates corrupt template or step misalignment) | `_expand_slots` |
| `TemplateFileMissingError` | `data/task_briefs/templates.yaml` absent or unreadable | `load_templates` |
| `TemplateSchemaError` | YAML present but fails required-key / type / shape validation | `load_templates` |
| `UnicodeNormalizationError` | NFC round-trip check fails on rendered utterance (defensive) | `_format_utterance` |
| `NoVariantForLanguageError` | chosen template has no `language_variants[chosen_language]` entry | `_format_utterance` |

**No silent fallbacks.** The generator never substitutes a default city, a default language, or a default template on failure — it raises. The env's `reset()` is expected to let these propagate (callers catch and restart with a different seed, never mask).
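
The hierarchy in the table could be declared as below. The class names come from the table; the docstrings and the `describe` helper are illustrative additions only.

```python
class TaskGeneratorError(Exception):
    """Base class for every error the generator raises."""


class MissingSlotError(TaskGeneratorError): ...
class InvalidLanguageError(TaskGeneratorError): ...
class InvalidLanguageWeightError(TaskGeneratorError): ...
class InvalidStageError(TaskGeneratorError): ...
class InvalidBudgetError(TaskGeneratorError): ...
class TemplateFileMissingError(TaskGeneratorError): ...
class TemplateSchemaError(TaskGeneratorError): ...
class UnicodeNormalizationError(TaskGeneratorError): ...
class NoVariantForLanguageError(TaskGeneratorError): ...


def describe(exc: TaskGeneratorError) -> str:
    """Hypothetical helper: a one-line summary suitable for an episode log."""
    return f"{type(exc).__name__}: {exc}"


# A caller can catch the base class and still see which concrete error fired.
try:
    raise InvalidStageError("stage=4 not in {1, 2, 3}")
except TaskGeneratorError as exc:
    message = describe(exc)
```

A shared base class lets `reset()` callers catch every generator failure in one `except` clause while the concrete type stays available for logging and tests.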

---

## 6. Dependencies

### 6.1 Reads

- `data/task_briefs/templates.yaml` — the template library (§4.1 schema). Authored by hand in Phase D; never modified at runtime. NFC-normalized at load time (§3.4).
- `data/task_briefs/i18n.yaml` — localized strings for city names, weekdays, domain-specific nouns, in Hindi / Tamil / Kannada. Same load path as templates; separate file for readability. `load_templates` applies `unicodedata.normalize("NFC", v)` to every string value (§3.4) so that `TemplateLibrary.i18n` is NFC-clean before any slot expansion runs.

Both files ship inside the Docker image for the env Space (DESIGN.md §11.1).

### 6.2 Imports

- `driftcall.models`: `GoalSpec`, `LanguageCode`, `Domain`. The generator does **not** import from `env.py`, `rewards.py`, `drift_injector.py`, or any vendor module. Strict one-way dependency.
- Python stdlib: `random`, `hashlib`, `unicodedata`, `dataclasses`, `pathlib`, `typing`.
- Third-party: `PyYAML` (already in `requirements.txt` per DESIGN.md §11.1).

### 6.3 Produces

- `GoalSpec` instance returned to `DriftCallEnv.reset()` (DESIGN.md §4.2).
- `Iterator[GoalSpec]` via `enumerate_variants` for the dataset-packaging script that writes `train/briefs.jsonl` and `val/briefs.jsonl` (DESIGN.md §8.6).

### 6.4 Consumers

- `driftcall/env.py::DriftCallEnv.reset` — the single production caller of `generate()`.
- `training/data_export.py` (Phase C4) — batch-calls `enumerate_variants()` to build the HF Hub dataset artifact.
- `tests/test_task_generator.py` — exercises every branch + every error mode.

### 6.5 Non-dependencies (explicit)

- Does **not** depend on the drift injector. The generator never picks a drift; it only declares `drift_slot_tags` on the template so the injector can target slots later.
- Does **not** depend on audio pipeline. All output is text; TTS happens at the env boundary (DESIGN.md §9.4).

---

## 7. Edge Cases

1. **Missing slot placeholder in a template variant.** YAML author writes `"Bhai {when} ko {destination} jaana hai"` but declares `required_slots: [from, to, when]`, so `{destination}` has no fill source. `_format_utterance` detects this by iterating `string.Formatter().parse()` over the variant and raises `MissingSlotError` naming both the template_id and the missing slot. Where possible the problem is caught earlier: `load_templates` does a static scan and raises `TemplateSchemaError` at load time, so runtime failures are rare.

2. **Invalid language code in `language_weights`.** Caller passes `{"marathi": 1.0}`. `generate` validates keys against the `LanguageCode` literal before any sampling and raises `InvalidLanguageError` listing the unsupported keys. No partial `GoalSpec` is constructed.

3. **Budget out of declared range.** Template declares `uniform 3000..15000 step 500`. An implementation bug rounds to `step 1000` and yields `16000`. `_expand_slots` post-condition-checks every numeric against `[low, high]` and raises `InvalidBudgetError`. This should never fire with the spec implementation but exists as a defense — catching corrupt templates or future implementation regressions during unit tests.

4. **Unicode NFC / NFD collision in Kannada or Tamil.** Author pastes a Kannada string copied from macOS (NFD) into `templates.yaml`. `load_templates` re-normalizes to NFC on read; `_format_utterance` final-normalizes the substituted string. A direct byte comparison against the input YAML may differ, but the rendered `seed_utterance` is guaranteed NFC. `UnicodeNormalizationError` only fires if the round-trip assertion itself fails (indicates a bug in Python's `unicodedata` module, not a data bug).

5. **Seed collision across episodes.** Training loop calls `generate(seed=42, …)` twice across two different training epochs. Both calls return identical `GoalSpec`s — that is the contract. Upstream training code is responsible for using non-colliding seeds (e.g., `seed = epoch * 10_000 + step`); the generator does not deduplicate. Documented in the training spec (`docs/modules/training.md`, not here).

6. **Language weights sum ≠ 1.0.** Caller passes `{"en": 0.5, "hi": 0.3}` (sum 0.8). `generate` raises `InvalidLanguageWeightError`. Rationale: silent renormalization would mask curriculum-config bugs where a language is silently dropped. Caller must normalize explicitly.

7. **Template with zero variants for requested language.** `_pick_language` picks `"ta"` but the chosen template has no `language_variants["ta"]`. The generator **does not** resample language — that would bias the distribution. Instead it raises `NoVariantForLanguageError`. The template library invariant (enforced at `load_templates`) is **every template has ≥ 1 variant in every LanguageCode**; this exception is defense against YAML authoring regressions and is tested via a malformed fixture.

8. **Step-misaligned uniform range.** Template declares `low: 3000, high: 15000, step: 700`. `(15000-3000) % 700 ≠ 0` — the grid doesn't cleanly terminate at `high`. `load_templates` detects this at load time and raises `TemplateSchemaError`, preventing runtime surprise.

9. **Negative seed.** `generate(seed=-1, …)` — stable hash handles negatives fine (blake2b accepts any UTF-8 bytes), but by convention the env passes non-negative episode IDs. The generator does not reject negatives; it just uses them verbatim. Documented in the interface docstring.

10. **Very large seed (> 2^63).** Same as #9 — blake2b handles arbitrary strings. No overflow.
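
Edge cases 1 and 8 both describe static checks that can run at load time. A minimal sketch, using hypothetical helper names, of the placeholder scan with `string.Formatter().parse()` and of the step-alignment test:

```python
from __future__ import annotations

from string import Formatter


def placeholders(variant: str) -> set[str]:
    """Names of all {slot} fields referenced by a variant string."""
    return {name for _, name, _, _ in Formatter().parse(variant) if name}


def undeclared_placeholders(variant: str, declared: list[str]) -> set[str]:
    """Placeholders that have no fill source among the declared slot names."""
    return placeholders(variant) - set(declared)


def step_is_aligned(low: int, high: int, step: int) -> bool:
    """True when the grid low, low+step, ... lands exactly on high."""
    return step > 0 and high >= low and (high - low) % step == 0


missing = undeclared_placeholders(
    "Bhai {when} ko {destination} jaana hai",
    ["from", "to", "when"],
)
aligned = step_is_aligned(3000, 15000, 500)
misaligned = step_is_aligned(3000, 15000, 700)
```

Here `missing` contains only `destination`, `aligned` is true, and `misaligned` is false because 12000 leaves a remainder of 100 when divided by 700. In the module, a non-empty `missing` set or a misaligned range would be reported as `TemplateSchemaError` by the loader.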

---

## 8. Examples

### 8.1 Stage-1 airline, English

```python
>>> W = {"en": 1.0, "hi": 0.0, "ta": 0.0, "kn": 0.0, "hinglish": 0.0}
>>> goal = generate(seed=42, stage=1, language_weights=W)
>>> goal.domain
'airline'
>>> goal.intent
'book_flight'
>>> goal.language
'en'
>>> goal.slots
{'from': 'HYD', 'to': 'BLR', 'when': '2026-05-02'}
>>> goal.constraints
{'budget_inr': 7500, 'time_window': 'evening'}
>>> goal.seed_utterance
'Book the cheapest flight from HYD to BLR on 2026-05-02, budget under ₹7500, departing evening'
```

Determinism check:

```python
>>> generate(42, 1, W) == generate(42, 1, W)
True
>>> generate(42, 1, W).seed_utterance == generate(42, 1, W).seed_utterance
True
```

### 8.2 Stage-3 restaurant, Hinglish, drift-compatible slot layout

```python
>>> W = {"en": 0.3, "hi": 0.2, "ta": 0.1, "kn": 0.1, "hinglish": 0.3}
>>> goal = generate(seed=43, stage=3, language_weights=W)  # a different seed from §8.1: the domain depends only on the seed
>>> goal.domain
'restaurant'
>>> goal.language
'hinglish'
>>> goal.slots
{'city': 'Mumbai', 'cuisine': 'Biryani', 'when': '2026-05-10T20:00'}
>>> goal.constraints
{'budget_inr': 400, 'veg_only': True, 'min_order_buffer': 100}
>>> goal.seed_utterance
"Bhai tonight Mumbai mein Biryani order karna hai, 400 rupees se kam, veg option chahiye"
```

This brief's slot surface (`budget_inr` + `veg_only`) overlaps the drift patterns `restaurant.min_order_bump` and `restaurant.veg_filter_semantic` (DESIGN.md §5.3) — so when `drift_injector` selects a Stage-3 compound drift, the agent's goal is genuinely affected. That is what "drift-compatible slot layout" means.

### 8.3 Kannada utterance (Unicode-correct Kannada script, U+0C80–U+0CFF)

```python
>>> W = {"kn": 1.0, "en": 0.0, "hi": 0.0, "ta": 0.0, "hinglish": 0.0}
>>> goal = generate(seed=7, stage=2, language_weights=W)
>>> goal.domain
'airline'
>>> goal.language
'kn'
>>> goal.slots
{'from': 'BLR', 'to': 'MAA', 'when': '2026-05-08'}
>>> goal.constraints
{'budget_inr': 5500}
>>> goal.seed_utterance
'2026-05-08 ರಂದು BLR ಇಂದ MAA ಗೆ ಅಗ್ಗದ ವಿಮಾನ ಟಿಕೆಟ್ ಬೇಕು, 5500 ರೂಪಾಯಿಗಳ ಒಳಗೆ'
>>> import unicodedata
>>> unicodedata.is_normalized("NFC", goal.seed_utterance)
True
>>> # At least one codepoint in the Kannada block (U+0C80–U+0CFF)
>>> any(0x0C80 <= ord(c) <= 0x0CFF for c in goal.seed_utterance)
True
>>> # No Devanagari codepoints leaked in (U+0900–U+097F)
>>> any(0x0900 <= ord(c) <= 0x097F for c in goal.seed_utterance)
False
```

This example uses the genuine-Kannada-script variant declared in §4.1. City codes (`BLR`, `MAA`) remain in Roman because IATA/AAI airport codes are canonical identifiers in every language; full Kannada place names (`ಬೆಂಗಳೂರು`, `ಚೆನ್ನೈ`) are available in `i18n.yaml` and used by variants that reference `{from_city_local}` instead of `{from}`.

### 8.4 Tamil utterance with Devanagari-free script

```python
>>> W = {"ta": 1.0, "en": 0.0, "hi": 0.0, "kn": 0.0, "hinglish": 0.0}
>>> goal = generate(seed=101, stage=2, language_weights=W)
>>> goal.language
'ta'
>>> goal.seed_utterance
'2026-05-04 அன்று HYD லிருந்து BLR க்கு டிக்கெட் வேண்டும், 6500 ரூபாய்க்கு கீழ்'
>>> import unicodedata
>>> unicodedata.is_normalized("NFC", goal.seed_utterance)
True
>>> # No Devanagari codepoints (U+0900–U+097F) present
>>> any(0x0900 <= ord(c) <= 0x097F for c in goal.seed_utterance)
False
```

### 8.5 Hindi utterance (Devanagari)

```python
>>> W = {"hi": 1.0, "en": 0.0, "ta": 0.0, "kn": 0.0, "hinglish": 0.0}
>>> goal = generate(seed=5, stage=1, language_weights=W)
>>> goal.language
'hi'
>>> goal.seed_utterance
'मुझे 2026-05-01 को DEL से BOM जाना है, 6000 रुपये से कम में'
```

---

## 9. Open Questions

None — spec is complete.

All decisions referenced in §§1–8 follow DESIGN.md §4.1, §4.2, §8.3, §8.4, §10.3 without extension. The generator is a pure function of its inputs; no side effects, no mutable global state, no dependencies on drift or reward subsystems. Edge cases 1–10 cover the full error surface identified during review.

Cross-doc references established:

- `docs/modules/models.md`: `GoalSpec`, `LanguageCode`, `Domain` definitions
- `docs/modules/drift_injector.md`: consumes `GoalSpec.domain` and template `drift_slot_tags` to schedule drifts
- `docs/modules/env.md`: calls `generate()` from `DriftCallEnv.reset()`
- `docs/modules/rewards.md`: consumes `GoalSpec.slots` + `GoalSpec.constraints` for R1 and R3
- `docs/modules/datasets.md`: calls `enumerate_variants()` to package HF Hub dataset