| <meta charset="utf-8" /><meta http-equiv="content-security-policy" content=""><meta name="hf:doc:metadata" content="{&quot;local&quot;:&quot;simulate.RewardFunction&quot;,&quot;title&quot;:&quot;Reward Functions&quot;}" data-svelte="svelte-1phssyn"> | |
| <link rel="modulepreload" href="/docs/simulate/v0.0.1/en/_app/assets/pages/__layout.svelte-hf-doc-builder.css"> | |
| <link rel="modulepreload" href="/docs/simulate/v0.0.1/en/_app/start-hf-doc-builder.js"> | |
| <link rel="modulepreload" href="/docs/simulate/v0.0.1/en/_app/chunks/vendor-hf-doc-builder.js"> | |
| <link rel="modulepreload" href="/docs/simulate/v0.0.1/en/_app/chunks/paths-hf-doc-builder.js"> | |
| <link rel="modulepreload" href="/docs/simulate/v0.0.1/en/_app/pages/__layout.svelte-hf-doc-builder.js"> | |
| <link rel="modulepreload" href="/docs/simulate/v0.0.1/en/_app/pages/api/reward_functions.mdx-hf-doc-builder.js"> | |
| <link rel="modulepreload" href="/docs/simulate/v0.0.1/en/_app/chunks/Docstring-hf-doc-builder.js"> | |
| <link rel="modulepreload" href="/docs/simulate/v0.0.1/en/_app/chunks/IconCopyLink-hf-doc-builder.js"> | |
| <h1 class="relative group"><a id="simulate.RewardFunction" class="header-link block pr-1.5 text-lg no-hover:hidden with-hover:absolute with-hover:p-1.5 with-hover:opacity-0 with-hover:group-hover:opacity-100 with-hover:right-full" href="#simulate.RewardFunction"><span><svg class="" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" aria-hidden="true" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 256 256"><path d="M167.594 88.393a8.001 8.001 0 0 1 0 11.314l-67.882 67.882a8 8 0 1 1-11.314-11.315l67.882-67.881a8.003 8.003 0 0 1 11.314 0zm-28.287 84.86l-28.284 28.284a40 40 0 0 1-56.567-56.567l28.284-28.284a8 8 0 0 0-11.315-11.315l-28.284 28.284a56 56 0 0 0 79.196 79.197l28.285-28.285a8 8 0 1 0-11.315-11.314zM212.852 43.14a56.002 56.002 0 0 0-79.196 0l-28.284 28.284a8 8 0 1 0 11.314 11.314l28.284-28.284a40 40 0 0 1 56.568 56.567l-28.285 28.285a8 8 0 0 0 11.315 11.314l28.284-28.284a56.065 56.065 0 0 0 0-79.196z" fill="currentColor"></path></svg></span></a> | |
| <span>Reward Functions | |
| </span></h1> | |
| <p>🤗 Simulate provides a system for defining both simple and complex reward functions. This is achieved by combining “leaf” reward | |
| functions, such as sparse and dense reward functions, with predicate reward functions.</p> | |
| <p>Reward functions can be parameterized with a variety of distance metrics. Currently “euclidean”, “cosine” and “best_euclidean” are supported. | |
| Through the combination of predicates and leaf rewards, complex reward functions can be created. A good example of this is the | |
| <a href="https://github.com/huggingface/simulate/blob/main/examples/rl/sb3_move_boxes.py" rel="nofollow">Move Boxes</a> example.</p> | |
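<p>To make the metric names concrete, here is a minimal sketch of the textbook Euclidean and cosine distances between two 3D vectors. The function names are hypothetical and this is not Simulate's implementation; how the library computes these metrics internally, and what “best_euclidean” does, is not described on this page.</p>

```python
import math
from typing import Sequence


def euclidean_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Straight-line distance between two points (hypothetical helper)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """One minus the cosine of the angle between two vectors (hypothetical helper)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # The angle is undefined for a zero-length vector.
        raise ValueError("cosine distance is undefined for zero vectors")
    return 1.0 - dot / (norm_a * norm_b)


if __name__ == "__main__":
    print(euclidean_distance((0.0, 0.0, 0.0), (3.0, 4.0, 0.0)))
    print(cosine_distance((1.0, 0.0, 0.0), (0.0, 1.0, 0.0)))
```

<p>Euclidean distance depends on how far apart two positions are, while cosine distance depends only on the angle between the two vectors, so the choice of metric changes what a threshold means.</p>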
| <p>The following “leaf” rewards are available in Simulate: </p> | |
| <ul><li>“dense”: A reward that is non-zero at every time-step.</li> | |
| <li>“sparse”: A reward that is triggered by the proximity of another object.</li> | |
| <li>“timeout”: A timeout reward that is triggered after a certain number of time-steps.</li> | |
| <li>“see”: Triggered when an object is in the field of view of an Actor.</li> | |
| <li>“angle_to”: Triggered when the angle between two objects and a certain direction is less than a threshold.</li></ul> | |
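<p>The difference between a few of the leaf rewards listed above can be sketched in plain Python. The functions below are hypothetical stand-ins, not Simulate's code: the parameter names <code>scalar</code> and <code>threshold</code> mirror the class signature on this page, but the exact formulas (negative distance for the dense reward, a strict <code>&lt;</code> comparison for the sparse one, <code>&gt;=</code> for the timeout) are assumptions made for illustration.</p>

```python
import math
from typing import Sequence


def _distance(a: Sequence[float], b: Sequence[float]) -> float:
    # Euclidean distance, used here as the assumed metric.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dense_reward(pos_a, pos_b, scalar: float = 1.0) -> float:
    """Assumed dense signal: returned at every step, scaled negative distance."""
    return -scalar * _distance(pos_a, pos_b)


def sparse_reward(pos_a, pos_b, scalar: float = 1.0, threshold: float = 1.0) -> float:
    """Assumed sparse signal: pays `scalar` only when the two objects are close."""
    return scalar if _distance(pos_a, pos_b) < threshold else 0.0


def timeout_reward(step: int, threshold: int, scalar: float = 1.0) -> float:
    """Assumed timeout signal: pays `scalar` once `threshold` time-steps have passed."""
    return scalar if step >= threshold else 0.0


if __name__ == "__main__":
    print(dense_reward((0, 0, 0), (3, 4, 0)))
    print(sparse_reward((0, 0, 0), (0.5, 0, 0)))
    print(timeout_reward(step=10, threshold=10, scalar=-1.0))
```

<p>A dense reward gives the agent feedback at every step, whereas the sparse and timeout rewards stay at zero until their condition is met.</p>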
| <p>The “leaf” reward functions can be combined in a tree structure with the following predicate functions: </p> | |
| <ul><li>“not”: Triggers when a reward is not triggered.</li> | |
| <li>“and”: Triggers when both children of this node are returning a positive reward.</li> | |
| <li>“or”: Triggers when one or both of the children of this node are returning a positive reward.</li> | |
| <li>“xor”: Triggers when exactly one of the children of this node is returning a positive reward.</li></ul> | |
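<p>The tree structure described above can be sketched with a small stand-alone class. <code>RewardNode</code> and its fields are hypothetical names for illustration only and are not part of the Simulate API; the sketch assumes that a node counts as triggered when it returns a positive reward, as the list above states for “and”, “or” and “xor”.</p>

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class RewardNode:
    """Hypothetical tree node: either a leaf reward or a predicate over children."""

    kind: str  # "leaf", "not", "and", "or" or "xor"
    scalar: float = 1.0  # reward paid when a predicate node triggers
    leaf_fn: Optional[Callable[[], float]] = None  # computes a leaf's reward
    a: Optional["RewardNode"] = None  # first (or only) child
    b: Optional["RewardNode"] = None  # second child

    def triggered(self) -> bool:
        # A node is considered triggered when its reward is positive.
        return self.evaluate() > 0

    def evaluate(self) -> float:
        if self.kind == "leaf":
            return self.leaf_fn()
        if self.kind == "not":
            fired = not self.a.triggered()
        elif self.kind == "and":
            fired = self.a.triggered() and self.b.triggered()
        elif self.kind == "or":
            fired = self.a.triggered() or self.b.triggered()
        elif self.kind == "xor":
            fired = self.a.triggered() != self.b.triggered()
        else:
            raise ValueError(f"unknown node kind: {self.kind}")
        return self.scalar if fired else 0.0


if __name__ == "__main__":
    near = RewardNode("leaf", leaf_fn=lambda: 1.0)  # a leaf that is firing
    far = RewardNode("leaf", leaf_fn=lambda: 0.0)  # a leaf that is not firing
    print(RewardNode("and", a=near, b=far).evaluate())
    print(RewardNode("xor", a=near, b=far).evaluate())
```

<p>Because predicate nodes return a reward themselves, they can be nested to any depth, for example an “and” over an “or” node and a “not” node.</p>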
| <div class="docstring border-l-2 border-t-2 pl-4 pt-3.5 border-gray-100 rounded-tl-xl mb-6 mt-8"> | |
| <div><span class="group flex space-x-1.5 items-center text-gray-800 bg-gradient-to-r rounded-tr-lg -mt-4 -ml-4 pt-3 px-2.5" id="simulate.RewardFunction"><!-- HTML_TAG_START --><h3 class="!m-0"><span class="flex-1 break-all md:text-lg bg-gradient-to-r px-2.5 py-1.5 rounded-xl from-indigo-50/70 to-white dark:from-gray-900 dark:to-gray-950 dark:text-indigo-300 text-indigo-700"><svg class="mr-1.5 text-indigo-500 inline-block -mt-0.5" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" aria-hidden="true" focusable="false" role="img" width=".8em" height=".8em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 24 24"><path class="uim-quaternary" d="M20.23 7.24L12 12L3.77 7.24a1.98 1.98 0 0 1 .7-.71L11 2.76c.62-.35 1.38-.35 2 0l6.53 3.77c.29.173.531.418.7.71z" opacity=".25" fill="currentColor"></path><path class="uim-tertiary" d="M12 12v9.5a2.09 2.09 0 0 1-.91-.21L4.5 17.48a2.003 2.003 0 0 1-1-1.73v-7.5a2.06 2.06 0 0 1 .27-1.01L12 12z" opacity=".5" fill="currentColor"></path><path class="uim-primary" d="M20.5 8.25v7.5a2.003 2.003 0 0 1-1 1.73l-6.62 3.82c-.275.13-.576.198-.88.2V12l8.23-4.76c.175.308.268.656.27 1.01z" fill="currentColor"></path></svg><span class="font-light">class</span> <span class="font-medium">simulate.</span><span class="font-semibold">RewardFunction</span></span></h3><!-- HTML_TAG_END --> | |
| <a id="simulate.RewardFunction" class="header-link invisible with-hover:group-hover:visible pr-2" href="#simulate.RewardFunction"><svg class="" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" aria-hidden="true" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 256 256"><path d="M167.594 88.393a8.001 8.001 0 0 1 0 11.314l-67.882 67.882a8 8 0 1 1-11.314-11.315l67.882-67.881a8.003 8.003 0 0 1 11.314 0zm-28.287 84.86l-28.284 28.284a40 40 0 0 1-56.567-56.567l28.284-28.284a8 8 0 0 0-11.315-11.315l-28.284 28.284a56 56 0 0 0 79.196 79.197l28.285-28.285a8 8 0 1 0-11.315-11.314zM212.852 43.14a56.002 56.002 0 0 0-79.196 0l-28.284 28.284a8 8 0 1 0 11.314 11.314l28.284-28.284a40 40 0 0 1 56.568 56.567l-28.285 28.285a8 8 0 0 0 11.315 11.314l28.284-28.284a56.065 56.065 0 0 0 0-79.196z" fill="currentColor"></path></svg></a> | |
| <a class="!ml-auto !text-gray-400 !no-underline text-sm flex items-center" href="https://github.com/huggingface/simulate/blob/v0.0.1/src/simulate/assets/reward_functions.py#L16" target="_blank"><span>&lt;</span> | |
| <span class="hidden md:block mx-0.5 hover:!underline">source</span> | |
| <span>&gt;</span></a></span> | |
| <p class="font-mono text-xs md:text-sm !leading-relaxed !my-6"><span>(</span> | |
| <span class="comma cursor-default"><span class="rounded hover:bg-black hover:text-white dark:hover:bg-white dark:hover:text-black">type<span class="opacity-60">: typing.Optional[str] = None</span></span> | |
| </span><span class="comma cursor-default"><span class="rounded hover:bg-black hover:text-white dark:hover:bg-white dark:hover:text-black">entity_a<span class="opacity-60">: typing.Optional[typing.Any] = None</span></span> | |
| </span><span class="comma cursor-default"><span class="rounded hover:bg-black hover:text-white dark:hover:bg-white dark:hover:text-black">entity_b<span class="opacity-60">: typing.Optional[typing.Any] = None</span></span> | |
| </span><span class="comma cursor-default"><span class="rounded hover:bg-black hover:text-white dark:hover:bg-white dark:hover:text-black">distance_metric<span class="opacity-60">: typing.Optional[str] = None</span></span> | |
| </span><span class="comma cursor-default"><span class="rounded hover:bg-black hover:text-white dark:hover:bg-white dark:hover:text-black">direction<span class="opacity-60">: typing.Optional[typing.List[float]] = None</span></span> | |
| </span><span class="comma cursor-default"><span class="rounded hover:bg-black hover:text-white dark:hover:bg-white dark:hover:text-black">scalar<span class="opacity-60">: float = 1.0</span></span> | |
| </span><span class="comma cursor-default"><span class="rounded hover:bg-black hover:text-white dark:hover:bg-white dark:hover:text-black">threshold<span class="opacity-60">: float = 1.0</span></span> | |
| </span><span class="comma cursor-default"><span class="rounded hover:bg-black hover:text-white dark:hover:bg-white dark:hover:text-black">is_terminal<span class="opacity-60">: bool = False</span></span> | |
| </span><span class="comma cursor-default"><span class="rounded hover:bg-black hover:text-white dark:hover:bg-white dark:hover:text-black">is_collectable<span class="opacity-60">: bool = False</span></span> | |
| </span><span class="comma cursor-default"><span class="rounded hover:bg-black hover:text-white dark:hover:bg-white dark:hover:text-black">trigger_once<span class="opacity-60">: bool = True</span></span> | |
| </span><span class="comma cursor-default"><span class="rounded hover:bg-black hover:text-white dark:hover:bg-white dark:hover:text-black">reward_function_a<span class="opacity-60">: dataclasses.InitVar[typing.Optional[ForwardRef('RewardFunction')]] = None</span></span> | |
| </span><span class="comma cursor-default"><span class="rounded hover:bg-black hover:text-white dark:hover:bg-white dark:hover:text-black">reward_function_b<span class="opacity-60">: dataclasses.InitVar[typing.Optional[ForwardRef('RewardFunction')]] = None</span></span> | |
| </span><span class="comma cursor-default"><span class="rounded hover:bg-black hover:text-white dark:hover:bg-white dark:hover:text-black">name<span class="opacity-60">: dataclasses.InitVar[typing.Optional[str]] = None</span></span> | |
| </span><span class="comma cursor-default"><span class="rounded hover:bg-black hover:text-white dark:hover:bg-white dark:hover:text-black">position<span class="opacity-60">: dataclasses.InitVar[typing.Optional[typing.List[float]]] = &lt;property object at 0x7fca120c6450&gt;</span></span> | |
| </span><span class="comma cursor-default"><span class="rounded hover:bg-black hover:text-white dark:hover:bg-white dark:hover:text-black">rotation<span class="opacity-60">: dataclasses.InitVar[typing.Optional[typing.List[float]]] = &lt;property object at 0x7fca120c62c0&gt;</span></span> | |
| </span><span class="comma cursor-default"><span class="rounded hover:bg-black hover:text-white dark:hover:bg-white dark:hover:text-black">scaling<span class="opacity-60">: dataclasses.InitVar[typing.Union[float, typing.List[float], NoneType]] = &lt;property object at 0x7fca120c6310&gt;</span></span> | |
| </span><span class="comma cursor-default"><span class="rounded hover:bg-black hover:text-white dark:hover:bg-white dark:hover:text-black">transformation_matrix<span class="opacity-60">: dataclasses.InitVar[typing.Optional[typing.List[float]]] = &lt;property object at 0x7fca120c6360&gt;</span></span> | |
| </span><span class="comma cursor-default"><span class="rounded hover:bg-black hover:text-white dark:hover:bg-white dark:hover:text-black">parent<span class="opacity-60">: dataclasses.InitVar[typing.Optional[typing.Any]] = None</span></span> | |
| </span><span class="comma cursor-default"><span class="rounded hover:bg-black hover:text-white dark:hover:bg-white dark:hover:text-black">children<span class="opacity-60">: dataclasses.InitVar[typing.Optional[typing.List[typing.Any]]] = None</span></span> | |
| </span><span class="comma cursor-default"><span class="rounded hover:bg-black hover:text-white dark:hover:bg-white dark:hover:text-black">created_from_file<span class="opacity-60">: dataclasses.InitVar[typing.Optional[str]] = None</span></span> | |
| </span> | |
| <span>)</span> | |
| </p> | |
| <div class="!mb-10 relative docstring-details "> | |
| </div></div> | |
| <p>An RL reward function</p></div> | |
| <script type="module" data-hydrate="441ssd"> | |
| import { start } from "/docs/simulate/v0.0.1/en/_app/start-hf-doc-builder.js"; | |
| start({ | |
| target: document.querySelector('[data-hydrate="441ssd"]').parentNode, | |
| paths: {"base":"/docs/simulate/v0.0.1/en","assets":"/docs/simulate/v0.0.1/en"}, | |
| session: {}, | |
| route: false, | |
| spa: false, | |
| trailing_slash: "never", | |
| hydrate: { | |
| status: 200, | |
| error: null, | |
| nodes: [ | |
| import("/docs/simulate/v0.0.1/en/_app/pages/__layout.svelte-hf-doc-builder.js"), | |
| import("/docs/simulate/v0.0.1/en/_app/pages/api/reward_functions.mdx-hf-doc-builder.js") | |
| ], | |
| params: {} | |
| } | |
| }); | |
| </script> | |