"""altered_minds — framework-side, generic LMA integration glue (ADR-013).

This package is the *model-agnostic* scaffold that lets the Composer Replication
Framework drive the sister project llm-mental-alterations (LMA): take a
personality-altered SFT checkpoint and apply the framework's 3-channel RL to ask
whether task-driven RL washes out, preserves, or AMPLIFIES the alteration's
cognitive-distortion signature.

Nothing here loads an LMA checkpoint, calls Modal, or spends budget — that is
explicitly user-gated (ADR-013 "out of scope"). This package provides:

  - ``MMLUFormatReward``     : structured-answer reward that scores only the
                               final letter and the format, never the
                               rationale style. It is paired with
                               ``randomize_options`` and a logged option
                               distribution so an "always C" exploit is
                               detectable.
  - ``dual_kl_logger``       : logs KL(policy||altered_init) AND KL(policy||base)
                               each step — the washout/amplification instrument.
  - ``channel_ladder_configs``: the A0-A4 isolated-channel ladder that REPLACES
                               the old combined alpha=0.2/beta=0.4 recipe.
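
To make the dual-KL idea concrete, here is a minimal, dependency-free sketch
of a token-mean KL measured against two references at once. It is an
illustration only: ``sketch_token_mean_kl`` and ``sketch_dual_kl`` are
hypothetical names, the list-of-lists input layout and the metric keys are
assumptions, and none of this is claimed to match the signatures of the real
``token_mean_kl`` or ``dual_kl_logger`` in ``kl_logging``.

```python
import math


def sketch_token_mean_kl(policy_logprobs, ref_logprobs):
    # Hypothetical helper (not the package API): mean over token positions
    # of KL(policy || ref). Each argument is a list with one entry per token;
    # each entry is a list of log-probabilities over the same vocabulary.
    if len(policy_logprobs) != len(ref_logprobs):
        raise ValueError("policy and reference must cover the same tokens")
    if not policy_logprobs:
        return 0.0
    total = 0.0
    for p_row, q_row in zip(policy_logprobs, ref_logprobs):
        if len(p_row) != len(q_row):
            raise ValueError("vocabulary sizes differ")
        # KL(p || q) = sum_v p(v) * (log p(v) - log q(v))
        total += sum(math.exp(lp) * (lp - lq) for lp, lq in zip(p_row, q_row))
    return total / len(policy_logprobs)


def sketch_dual_kl(policy_logprobs, altered_init_logprobs, base_logprobs):
    # Report both divergences side by side, as the bullet above describes.
    # The key names are assumptions chosen for this example.
    return {
        "kl_policy_altered_init": sketch_token_mean_kl(
            policy_logprobs, altered_init_logprobs
        ),
        "kl_policy_base": sketch_token_mean_kl(policy_logprobs, base_logprobs),
    }


# Toy check: a policy identical to the altered init but skewed away from a
# uniform base model, over a two-word vocabulary and a single token.
uniform = [math.log(0.5), math.log(0.5)]
skewed = [math.log(0.9), math.log(0.1)]
metrics = sketch_dual_kl([skewed], [skewed], [uniform])
```

In this toy case the first value is zero and the second is positive, which
is the pattern one would expect at step 0 of a run that starts from the
altered checkpoint. How the two curves move relative to each other during RL
is what the washout/amplification question turns on.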

See docs/adrs/ADR-013-lma-integration-channel-ladder.md.
"""
from __future__ import annotations

from composer_replication.integrations.altered_minds.kl_logging import (
    dual_kl_logger,
    token_mean_kl,
)
from composer_replication.integrations.altered_minds.ladder import (
    LADDER_KL_BETA,
    channel_ladder_configs,
)
from composer_replication.integrations.altered_minds.reward import (
    MMLUFormatReward,
    parse_final_answer,
    randomize_options,
)

__all__ = [
    "MMLUFormatReward",
    "parse_final_answer",
    "randomize_options",
    "dual_kl_logger",
    "token_mean_kl",
    "channel_ladder_configs",
    "LADDER_KL_BETA",
]