| <html lang="en"> | |
| <head> | |
| <meta charset="utf-8"> | |
| <meta name="viewport" content="width=device-width, initial-scale=1, minimum-scale=1" /> | |
| <meta name="generator" content="pdoc 0.10.0" /> | |
| <title>tinytroupe.validation.propositions API documentation</title> | |
| <meta name="description" content="There are various general desirable simulation properties. These can be useful under various | |
| circumstances, for example to validate the simulation, …" /> | |
| <link rel="preload stylesheet" as="style" href="https://cdnjs.cloudflare.com/ajax/libs/10up-sanitize.css/11.0.1/sanitize.min.css" integrity="sha256-PK9q560IAAa6WVRRh76LtCaI8pjTJ2z11v0miyNNjrs=" crossorigin> | |
| <link rel="preload stylesheet" as="style" href="https://cdnjs.cloudflare.com/ajax/libs/10up-sanitize.css/11.0.1/typography.min.css" integrity="sha256-7l/o7C8jubJiy74VsKTidCy1yBkRtiUGbVkYBylBqUg=" crossorigin> | |
| <link rel="stylesheet preload" as="style" href="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/10.1.1/styles/github.min.css" crossorigin> | |
| <style>:root{--highlight-color:#fe9}.flex{display:flex }body{line-height:1.5em}#content{padding:20px}#sidebar{padding:30px;overflow:hidden}#sidebar > *:last-child{margin-bottom:2cm}.http-server-breadcrumbs{font-size:130%;margin:0 0 15px 0}#footer{font-size:.75em;padding:5px 30px;border-top:1px solid #ddd;text-align:right}#footer p{margin:0 0 0 1em;display:inline-block}#footer p:last-child{margin-right:30px}h1,h2,h3,h4,h5{font-weight:300}h1{font-size:2.5em;line-height:1.1em}h2{font-size:1.75em;margin:1em 0 .50em 0}h3{font-size:1.4em;margin:25px 0 10px 0}h4{margin:0;font-size:105%}h1:target,h2:target,h3:target,h4:target,h5:target,h6:target{background:var(--highlight-color);padding:.2em 0}a{color:#058;text-decoration:none;transition:color .3s ease-in-out}a:hover{color:#e82}.title code{font-weight:bold}h2[id^="header-"]{margin-top:2em}.ident{color:#900}pre code{background:#f8f8f8;font-size:.8em;line-height:1.4em}code{background:#f2f2f1;padding:1px 4px;overflow-wrap:break-word}h1 code{background:transparent}pre{background:#f8f8f8;border:0;border-top:1px solid #ccc;border-bottom:1px solid #ccc;margin:1em 0;padding:1ex}#http-server-module-list{display:flex;flex-flow:column}#http-server-module-list div{display:flex}#http-server-module-list dt{min-width:10%}#http-server-module-list p{margin-top:0}.toc ul,#index{list-style-type:none;margin:0;padding:0}#index code{background:transparent}#index h3{border-bottom:1px solid #ddd}#index ul{padding:0}#index h4{margin-top:.6em;font-weight:bold}@media (min-width:200ex){#index .two-column{column-count:2}}@media (min-width:300ex){#index .two-column{column-count:3}}dl{margin-bottom:2em}dl dl:last-child{margin-bottom:4em}dd{margin:0 0 1em 3em}#header-classes + dl > dd{margin-bottom:3em}dd dd{margin-left:2em}dd p{margin:10px 0}.name{background:#eee;font-weight:bold;font-size:.85em;padding:5px 10px;display:inline-block;min-width:40%}.name:hover{background:#e0e0e0}dt:target .name{background:var(--highlight-color)}.name > 
span:first-child{white-space:nowrap}.name.class > span:nth-child(2){margin-left:.4em}.inherited{color:#999;border-left:5px solid #eee;padding-left:1em}.inheritance em{font-style:normal;font-weight:bold}.desc h2{font-weight:400;font-size:1.25em}.desc h3{font-size:1em}.desc dt code{background:inherit}.source summary,.git-link-div{color:#666;text-align:right;font-weight:400;font-size:.8em;text-transform:uppercase}.source summary > *{white-space:nowrap;cursor:pointer}.git-link{color:inherit;margin-left:1em}.source pre{max-height:500px;overflow:auto;margin:0}.source pre code{font-size:12px;overflow:visible}.hlist{list-style:none}.hlist li{display:inline}.hlist li:after{content:',\2002'}.hlist li:last-child:after{content:none}.hlist .hlist{display:inline;padding-left:1em}img{max-width:100%}td{padding:0 .5em}.admonition{padding:.1em .5em;margin-bottom:1em}.admonition-title{font-weight:bold}.admonition.note,.admonition.info,.admonition.important{background:#aef}.admonition.todo,.admonition.versionadded,.admonition.tip,.admonition.hint{background:#dfd}.admonition.warning,.admonition.versionchanged,.admonition.deprecated{background:#fd4}.admonition.error,.admonition.danger,.admonition.caution{background:lightpink}</style> | |
| <style media="screen and (min-width: 700px)">@media screen and (min-width:700px){#sidebar{width:30%;height:100vh;overflow:auto;position:sticky;top:0}#content{width:70%;max-width:100ch;padding:3em 4em;border-left:1px solid #ddd}pre code{font-size:1em}.item .name{font-size:1em}main{display:flex;flex-direction:row-reverse;justify-content:flex-end}.toc ul ul,#index ul{padding-left:1.5em}.toc > ul > li{margin-top:.5em}}</style> | |
| <style media="print">@media print{#sidebar h1{page-break-before:always}.source{display:none}}@media print{*{background:transparent ;color:#000 ;box-shadow:none ;text-shadow:none }a[href]:after{content:" (" attr(href) ")";font-size:90%}a[href][title]:after{content:none}abbr[title]:after{content:" (" attr(title) ")"}.ir a:after,a[href^="javascript:"]:after,a[href^="#"]:after{content:""}pre,blockquote{border:1px solid #999;page-break-inside:avoid}thead{display:table-header-group}tr,img{page-break-inside:avoid}img{max-width:100% }@page{margin:0.5cm}p,h2,h3{orphans:3;widows:3}h1,h2,h3,h4,h5,h6{page-break-after:avoid}}</style> | |
| <script defer src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/10.1.1/highlight.min.js" integrity="sha256-Uv3H6lx7dJmRfRvH8TH6kJD1TSK1aFcwgx+mdg3epi8=" crossorigin></script> | |
| <script>window.addEventListener('DOMContentLoaded', () => hljs.initHighlighting())</script> | |
| </head> | |
| <body> | |
| <main> | |
| <article id="content"> | |
| <header> | |
| <h1 class="title">Module <code>tinytroupe.validation.propositions</code></h1> | |
| </header> | |
| <section id="section-intro"> | |
| <p>There are various general desirable simulation properties. These can be useful under various | |
| circumstances, for example to validate the simulation, or to monitor it during its execution.</p> | |
| <details class="source"> | |
| <summary> | |
| <span>Expand source code</span> | |
| </summary> | |
| <pre><code class="python">""" | |
| There are various general desirable simulation properties. These can be useful under various | |
| circumstances, for example to validate the simulation, or to monitor it during its execution. | |
| """ | |
| from tinytroupe.experimentation import Proposition | |
| ################################# | |
| # Auxiliary internal functions | |
| ################################# | |
| def _build_precondition_function_for_action_types(action_types:list, check_for_presence:bool): | |
|     """ | |
|     Builds a precondition function that checks if the action is or is not in a list of action types. | |
|     The resulting function is meant to be used as a precondition function for propositions. | |
|     Args: | |
|         action_types (list): A list of action types to check against. | |
|         check_for_presence (bool): If True, the function checks if the action type is in the list. | |
|             If False, it checks if the action type is NOT in the list. | |
|     Returns: | |
|         function: A precondition function that takes a target, additional context, and claim variables as arguments. | |
|     """ | |
|     def precondition_function(target, additional_context, claim_variables): | |
|         action_type = claim_variables.get("action").get("type") | |
|         if check_for_presence: | |
|             # Check if the action type is in the list of valid action types | |
|             return action_type in action_types | |
|         else: | |
|             # Check if the action type is NOT in the list of valid action types | |
|             return action_type not in action_types | |
|     return precondition_function | |
| ############################### | |
| # Agent properties | |
| ############################### | |
| persona_adherence = \ | |
| Proposition(\ | |
| f""" | |
| THE AGENT ADHERES TO THE PERSONA SPECIFICATION: | |
| the agent behavior seen during the simulation is consistent with the agent's persona specification, it is | |
| what is expected from the agent's persona specification. In particular, consider these criteria: | |
| - The personality traits specified in the persona are respected. | |
| - The persona style is respected. | |
| - The persona beliefs are respected. | |
| - The persona behaviors are respected. | |
| - The persona skills are respected. | |
| - Any other aspect of the persona specification is respected. | |
| How to evaluate adherence: | |
| - Each of the above criteria should have equal weight in the evaluation, meaning that the score is the average of the scores of each criterion. | |
| - The adherence should be checked against all actions in the simulation trajectory. The final score should be an average of the scores of all | |
| actions in the trajectory. | |
| """, | |
| include_personas=True, | |
| double_check=True) | |
| action_persona_adherence = \ | |
| Proposition(\ | |
| """ | |
| THE NEXT AGENT ACTION ADHERES TO THE PERSONA SPECIFICATION: | |
| the agent's next action is consistent with the agent's persona specification, it is | |
| what is expected from the agent's persona specification. In particular, consider these criteria: | |
| - The personality traits specified in the persona are respected. | |
| - The persona style is respected. | |
| - The persona beliefs are respected. | |
| - The persona behaviors are respected. | |
| - The persona skills are respected. | |
| - Any other aspect of the persona specification is respected. | |
| THIS IS THE NEXT ACTION: {{action}} | |
| How to evaluate adherence: | |
| - Each of the above criteria should have equal weight in the evaluation, meaning that the score is the average of the scores of each criterion. | |
| - The adherence is ONLY ABOUT the next action mentioned above and the persona specification. DO NOT take into account previous actions or stimuli. | |
| - The general situation context is irrelevant to this evaluation, you should ONLY consider the persona specification as context. | |
| - Do not imagine what would be the next action, but instead judge the proposed next action mentioned above! | |
| - The simulation trajectories provided in the context DO NOT contain the next action, but only the actions and stimuli | |
| that have already happened. | |
| """, | |
| include_personas=True, | |
| double_check=False, | |
| first_n=5, last_n=10, | |
| precondition_function=_build_precondition_function_for_action_types(["THINK", "TALK"], check_for_presence=True)) | |
| hard_persona_adherence = \ | |
| Proposition(\ | |
| f""" | |
| THE AGENT FULLY ADHERES TO THE PERSONA SPECIFICATION: | |
| the agent behavior seen during the simulation is completely consistent with the agent's persona specification, it is | |
| exactly what is expected from the agent's persona specification. Nothing at all contradicts the persona specification. | |
| How to evaluate adherence: | |
| - For any flaw found, you **must** subtract 20% of the score, regardless of its severity. This is to be very harsh and avoid any ambiguity. | |
| """, | |
| include_personas=True, | |
| double_check=True) | |
| hard_action_persona_adherence = \ | |
| Proposition(\ | |
| """ | |
| THE NEXT AGENT ACTION FULLY ADHERES TO THE PERSONA SPECIFICATION: | |
| the agent's next action is completely consistent with the agent's persona specification, it is | |
| exactly what is expected from the agent's persona specification. Nothing at all contradicts the persona specification. | |
| THIS IS THE NEXT ACTION: {{action}} | |
| How to evaluate adherence: | |
| - For any flaw found, you **must** subtract 20% of the score, regardless of its severity. This is to be very harsh and avoid any ambiguity. | |
| - The adherence is ONLY ABOUT the next action mentioned above and the persona specification. DO NOT take into account previous actions or stimuli. | |
| - The general situation context is irrelevant to this evaluation, you should ONLY consider the persona specification as context. | |
| - Do not imagine what would be the next action, but instead judge the proposed next action mentioned above! | |
| - The simulation trajectories provided in the context DO NOT contain the next action, but only the actions and stimuli | |
| that have already happened. | |
| """, | |
| include_personas=True, | |
| double_check=False, | |
| first_n=5, last_n=10, | |
| precondition_function=_build_precondition_function_for_action_types(["THINK", "TALK"], check_for_presence=True)) | |
| self_consistency = \ | |
| Proposition( | |
| f""" | |
| THE AGENT IS SELF-CONSISTENT: | |
| the agent never behaves in contradictory or inconsistent ways. | |
| """, | |
| include_personas=False, | |
| double_check=True) | |
| action_self_consistency = \ | |
| Proposition( | |
| """ | |
| THE NEXT AGENT ACTION IS SELF-CONSISTENT: | |
| the agent's next action does not contradict or conflict with the agent's previous actions. | |
| THIS IS THE NEXT ACTION: {{action}} | |
| How to evaluate action self-consistency: | |
| - Consider the previous actions ONLY to form your opinion about whether the next action is consistent with them | |
| - Ignore stimuli and other previous events, the self-consistency concerns ONLY actions. | |
| - Actions and stimuli ARE NOT part of the persona specification. Rather, they are part of the simulation trajectories. | |
| - Ignore the agent's persona or general background, the self-consistency concerns ONLY the actions observed | |
| in simulation trajectories. | |
| - If there are no previous actions, the next action is self-consistent by default. | |
| """, | |
| include_personas=False, | |
| first_n=5, last_n=10, | |
| precondition_function=_build_precondition_function_for_action_types(["THINK", "TALK"], check_for_presence=True)) | |
| fluency = \ | |
| Proposition(\ | |
| """ | |
| THE AGENT IS FLUENT. During the simulation, the agent thinks and speaks fluently. This means that: | |
| - The agent doesn't repeat the same thoughts or words over and over again. | |
| - The agent doesn't use overly formulaic language. | |
| - The agent doesn't use overly repetitive language. | |
| - The agent's words sound natural and human-like. | |
| """, | |
| include_personas=False, | |
| double_check=True) | |
| action_fluency = \ | |
| Proposition(\ | |
| """ | |
| THE NEXT AGENT ACTION IS FLUENT. | |
| The next action's words sound natural and human-like, avoiding excessive repetition and formulaic language. | |
| THIS IS THE NEXT ACTION: {{action}} | |
| How to evaluate fluency: | |
| - Fluency here is ONLY ABOUT the next action mentioned above. Previous actions are the **context** for this evaluation, | |
| but will not be evaluated themselves. | |
| - Previous stimuli and events that are not actions should be completely ignored. Here we are only concerned about actions. | |
| """, | |
| include_personas=False, | |
| first_n=5, last_n=10, | |
| precondition_function=_build_precondition_function_for_action_types(["THINK", "TALK"], check_for_presence=True)) | |
| action_suitability = \ | |
| Proposition(\ | |
| """ | |
| THE NEXT AGENT ACTION IS SUITABLE: | |
| the next action is suitable for the situation, task and context. In particular, if the agent is pursuing some | |
| specific goal, instructions or guidelines, the next action must be coherent and consistent with them. | |
| More precisely, the next action is suitable if at least *one* of the following conditions is satisfied: | |
| - the next action is a reasonable step in the right direction, even if it does not fully solve the overall problem, task or situation. | |
| - the next action produces relevant information for the situation, task or context, even if it does not actually advance a solution. | |
| - the next action is a reasonable response to the recent stimuli received, even if it does not actually advance a solution. | |
| It suffices to meet ONLY ONE of these conditions to be considered **FULLY** suitable. | |
| THIS IS THE NEXT ACTION: {{action}} | |
| How to evaluate action suitability: | |
| - The score of suitability is proportional to the degree to which the next action satisfies *any* of the above conditions | |
| - If only **one** condition is **fully** met, the next action is **completely** suitable and gets **maximum** score. That is to say, | |
| the next action **does not** need to satisfy all conditions to be suitable! A single satisfied condition is enough! | |
| - The suitability is ONLY ABOUT the next action mentioned above and the situation context. | |
| - If a previous action or stimulus is inconsistent or conflicting with the situation context, you should ignore it | |
| when evaluating the next action. Consider ONLY the situation context. | |
| - The simulation trajectories provided in the context DO NOT contain the next action, but only the actions and stimuli | |
| that have already happened. | |
| """, | |
| include_personas=True, | |
| first_n=5, last_n=10, | |
| precondition_function=_build_precondition_function_for_action_types(["THINK", "TALK"], check_for_presence=True)) | |
| task_completion = \ | |
| Proposition(\ | |
| """ | |
| THE AGENT COMPLETES THE GIVEN TASK. | |
| Given the following task: "{{task_description}}" | |
| The agent completes the task by the end of the simulation. | |
| This means that: | |
| - If the task requires the agent to discuss or talk about something, the agent does so. | |
| - If the task requires the agent to think about something, the agent does so. | |
| - If the task requires the agent to do something via another action, the agent does so. | |
| - If the task requires the agent to adopt some specific variations of behavior, the agent does so. | |
| - If the task includes other specific requirements, the agent observes them. | |
| """, | |
| include_personas=False, | |
| double_check=True) | |
| quiet_recently = \ | |
| Proposition( | |
| """ | |
| THE AGENT HAS BEEN QUIET RECENTLY: | |
| The agent has been executing multiple DONE actions in a row with few or no TALK, THINK or | |
| other actions in between. | |
| How to evaluate quietness: | |
| - The last 2 (or more) actions of the agent are consecutive DONE actions. This means that the agent | |
| was done with his turn before doing anything else for a couple of turns. | |
| - There are no other actions in between the last 2 (or more) DONE actions. | |
| """, | |
| include_personas=False | |
| ) | |
| ################################## | |
| # Environment properties | |
| ################################## | |
| divergence = \ | |
| Proposition(""" | |
| AGENTS DIVERGE FROM ONE ANOTHER. | |
| As the simulation progresses, the agents' behaviors diverge from one another, | |
| instead of becoming more similar. This includes what they think, what they say and what they do. The topics discussed become | |
| more varied at the end of the simulation than at the beginning. Discussions do not converge to a single topic or perspective | |
| at the end. | |
| """, | |
| include_personas=False, | |
| double_check=True) | |
| convergence = \ | |
| Proposition(""" | |
| AGENTS CONVERGE TO ONE ANOTHER. | |
| As the simulation progresses, the agents' behaviors converge to one another, | |
| instead of becoming more different. This includes what they think, what they say and what they do. The topics discussed become | |
| more similar at the end of the simulation than at the beginning. Discussions converge to a single topic or perspective | |
| at the end. | |
| """, | |
| include_personas=False, | |
| double_check=True)</code></pre> | |
| </details> | |
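<p>The precondition helper in the source above can be tried out without TinyTroupe installed. The following is a minimal sketch, not the module's own code: it restates the helper in condensed form under the illustrative name <code>build_precondition</code>, and the sample <code>claim_variables</code> dictionaries (in particular the <code>"content"</code> key) are assumptions made up for the example. It shows how a proposition restricted to <code>THINK</code> and <code>TALK</code> actions would be skipped for a <code>DONE</code> action.</p>

```python
def build_precondition(action_types, check_for_presence):
    """Condensed, standalone restatement of the module's private helper (illustrative name)."""
    def precondition_function(target, additional_context, claim_variables):
        # The source reads the action type from claim_variables["action"]["type"].
        action_type = claim_variables.get("action").get("type")
        is_listed = action_type in action_types
        return is_listed if check_for_presence else not is_listed
    return precondition_function


# Same configuration the action-level propositions use in the listing.
only_think_or_talk = build_precondition(["THINK", "TALK"], check_for_presence=True)

# Hypothetical claim variables; only the "type" key is taken from the source.
talk_vars = {"action": {"type": "TALK", "content": "Hello!"}}
done_vars = {"action": {"type": "DONE", "content": ""}}

applies_to_talk = only_think_or_talk(None, None, talk_vars)
applies_to_done = only_think_or_talk(None, None, done_vars)
print(applies_to_talk, applies_to_done)
```

<p>Passing <code>check_for_presence=False</code> inverts the test, so the same list can be used to exclude action types instead of selecting them.</p>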
| </section> | |
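<p>Several claims in this module contain double-brace placeholders such as <code>{{action}}</code> and <code>{{task_description}}</code>. How <code>Proposition</code> fills them is not shown in this listing; the sketch below only assumes that they are replaced with values of the same name taken from the claim variables. The function <code>render_claim</code> is hypothetical and exists purely to illustrate that assumption.</p>

```python
import re


def render_claim(template, claim_variables):
    """Hypothetical helper: replace {{name}} with claim_variables[name] when present."""
    def substitute(match):
        name = match.group(1)
        # Leave unknown placeholders untouched instead of guessing a value.
        return str(claim_variables.get(name, match.group(0)))
    return re.sub(r"\{\{\s*(\w+)\s*\}\}", substitute, template)


claim = 'Given the following task: "{{task_description}}"'
rendered = render_claim(claim, {"task_description": "Discuss the new product launch"})
print(rendered)
```

<p>Under this assumption, a placeholder with no matching claim variable stays visible in the rendered claim, which makes a missing variable easy to spot.</p>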
| <section> | |
| </section> | |
| <section> | |
| </section> | |
| <section> | |
| </section> | |
| <section> | |
| </section> | |
| </article> | |
| <nav id="sidebar"> | |
| <h1>Index</h1> | |
| <div class="toc"> | |
| <ul></ul> | |
| </div> | |
| <ul id="index"> | |
| <li><h3>Super-module</h3> | |
| <ul> | |
| <li><code><a title="tinytroupe.validation" href="index.html">tinytroupe.validation</a></code></li> | |
| </ul> | |
| </li> | |
| </ul> | |
| </nav> | |
| </main> | |
| <footer id="footer"> | |
| <p>Generated by <a href="https://pdoc3.github.io/pdoc" title="pdoc: Python API documentation generator"><cite>pdoc</cite> 0.10.0</a>.</p> | |
| </footer> | |
| </body> | |
| </html> |