Runtime Settings
================
Warp has settings at the global, module, and kernel levels that can be used to fine-tune the compilation and verbosity
of Warp programs. When a setting can be changed at more than one level (e.g., ``enable_backward``),
the setting at the more specific scope takes precedence.

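The precedence rule can be pictured as a lookup that checks the most specific scope first.
The sketch below is a plain-Python model of that rule, not Warp's implementation:
``resolve_setting`` and the three dictionaries are hypothetical names introduced only for illustration.

```python
def resolve_setting(name, global_config, module_options=None, kernel_options=None):
    """Return the value of `name` from the most specific scope that defines it.

    Hypothetical helper for illustration; Warp does not expose this function.
    """
    # Check scopes from most specific (kernel) to least specific (global).
    for scope in (kernel_options, module_options, global_config):
        if scope is not None and name in scope:
            return scope[name]
    raise KeyError(f"unknown setting: {name}")


# Made-up values standing in for the three levels of configuration.
global_config = {"enable_backward": True, "mode": "release"}
module_options = {"enable_backward": False}
kernel_options = {"enable_backward": True}

# Only the global scope defines "mode", so the global value is used.
mode = resolve_setting("mode", global_config, module_options, kernel_options)

# The module-level value overrides the global one.
module_level = resolve_setting("enable_backward", global_config, module_options)

# The kernel-level value overrides both.
kernel_level = resolve_setting("enable_backward", global_config, module_options, kernel_options)

print(mode, module_level, kernel_level)  # prints "release False True"
```
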
Global Settings
---------------
To change a setting, prepend ``wp.config.`` to the name of the variable and assign a value to it.
Some settings may be changed on the fly, while others need to be set prior to calling ``wp.init()`` to take effect.
For example, the location of the user kernel cache can be changed with:

.. code-block:: python

    import os

    import warp as wp

    example_dir = os.path.dirname(os.path.realpath(__file__))

    # set default cache directory before wp.init()
    wp.config.kernel_cache_dir = os.path.join(example_dir, "tmp", "warpcache1")

    wp.init()

Basic Global Settings
^^^^^^^^^^^^^^^^^^^^^
+--------------------+---------+-------------+--------------------------------------------------------------------------+
| Field              | Type    |Default Value| Description                                                              |
+====================+=========+=============+==========================================================================+
|``verify_fp``       | Boolean | ``False``   | If ``True``, Warp will check that inputs and outputs are finite before   |
|                    |         |             | and/or after various operations. **Has performance implications.**       |
+--------------------+---------+-------------+--------------------------------------------------------------------------+
|``verify_cuda``     | Boolean | ``False``   | If ``True``, Warp will check for CUDA errors after every launch and      |
|                    |         |             | memory operation. CUDA error verification cannot be used during graph    |
|                    |         |             | capture. **Has performance implications.**                               |
+--------------------+---------+-------------+--------------------------------------------------------------------------+
|``print_launches``  | Boolean | ``False``   | If ``True``, Warp will print details of every kernel launch to standard  |
|                    |         |             | out (e.g. launch dimensions, inputs, outputs, device, etc.).             |
|                    |         |             | **Has performance implications.**                                        |
+--------------------+---------+-------------+--------------------------------------------------------------------------+
|``mode``            | String  |``"release"``| Controls whether to compile Warp kernels in debug or release mode.       |
|                    |         |             | Valid choices are ``"release"`` or ``"debug"``.                          |
|                    |         |             | **Has performance implications.**                                        |
+--------------------+---------+-------------+--------------------------------------------------------------------------+
|``verbose``         | Boolean | ``False``   | If ``True``, additional information will be printed to standard out      |
|                    |         |             | during code generation, compilation, etc.                                |
+--------------------+---------+-------------+--------------------------------------------------------------------------+
|``quiet``           | Boolean | ``False``   | If ``True``, Warp module initialization messages will be disabled.       |
|                    |         |             | This setting does not affect error messages and warnings.                |
+--------------------+---------+-------------+--------------------------------------------------------------------------+
|``kernel_cache_dir``| String  | ``None``    | The path to the directory used for the user kernel cache. Subdirectories |
|                    |         |             | named ``gen`` and ``bin`` will be created in this directory. If ``None``,|
|                    |         |             | a directory will be automatically determined using                       |
|                    |         |             | `appdirs.user_cache_dir <https://github.com/ActiveState/appdirs>`_.      |
+--------------------+---------+-------------+--------------------------------------------------------------------------+
|``enable_backward`` | Boolean | ``True``    | If ``True``, backward passes of kernels will be compiled by default.     |
|                    |         |             | Disabling this setting can reduce kernel compilation times.              |
+--------------------+---------+-------------+--------------------------------------------------------------------------+

Advanced Global Settings
^^^^^^^^^^^^^^^^^^^^^^^^
+--------------------+---------+-------------+--------------------------------------------------------------------------+
| Field              | Type    |Default Value| Description                                                              |
+====================+=========+=============+==========================================================================+
|``cache_kernels``   | Boolean | ``True``    | If ``True``, kernels that have already been compiled from previous       |
|                    |         |             | application launches will not be recompiled.                             |
+--------------------+---------+-------------+--------------------------------------------------------------------------+
|``cuda_output``     | String  | ``None``    | The preferred CUDA output format for kernels. Valid choices are ``None``,|
|                    |         |             | ``"ptx"``, and ``"cubin"``. If ``None``, a format will be determined     |
|                    |         |             | automatically.                                                           |
+--------------------+---------+-------------+--------------------------------------------------------------------------+
|``ptx_target_arch`` | Integer | 70          | The target architecture for PTX generation.                              |
+--------------------+---------+-------------+--------------------------------------------------------------------------+
|``llvm_cuda``       | Boolean | ``False``   | If ``True``, Clang/LLVM will be used to compile CUDA code instead of     |
|                    |         |             | NVRTC.                                                                   |
+--------------------+---------+-------------+--------------------------------------------------------------------------+

Module Settings
---------------
Module-level settings to control runtime compilation and code generation may be changed by passing a dictionary of
option pairs to ``wp.set_module_options()``.
For example, compilation of backward passes for all kernels in a module can be disabled with:

.. code:: python

    wp.set_module_options({"enable_backward": False})

The options for a module can also be queried using ``wp.get_module_options()``.
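
As a quick check, an option can be set and then read back. This is a minimal sketch that assumes
``wp.get_module_options()`` returns a dictionary-like object keyed by option name; it needs Warp installed to run.

```python
import warp as wp

# Override one option for the current module.
wp.set_module_options({"enable_backward": False})

# Read the module's options back (assumed here to behave like a dict).
options = wp.get_module_options()

# The queried value should reflect the override made above.
print(options["enable_backward"])
```
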

+--------------------+---------+-------------+--------------------------------------------------------------------------+
| Field              | Type    |Default Value| Description                                                              |
+====================+=========+=============+==========================================================================+
|``mode``            | String  | Global      | Controls whether to compile the module's kernels in debug or release     |
|                    |         | setting     | mode by default. Valid choices are ``"release"`` or ``"debug"``.         |
+--------------------+---------+-------------+--------------------------------------------------------------------------+
|``max_unroll``      | Integer | 16          | The maximum fixed-size loop to unroll. Note that ``max_unroll`` does not |
|                    |         |             | consider the total number of iterations in nested loops. This can result |
|                    |         |             | in a large amount of automatically generated code if each nested loop is |
|                    |         |             | below the ``max_unroll`` threshold.                                      |
+--------------------+---------+-------------+--------------------------------------------------------------------------+
|``enable_backward`` | Boolean | Global      | If ``True``, backward passes of kernels will be compiled by default.     |
|                    |         | setting     | Disabling this setting can reduce kernel compilation times.              |
+--------------------+---------+-------------+--------------------------------------------------------------------------+
|``fast_math``       | Boolean | ``False``   | If ``True``, CUDA kernels will be compiled with the ``--use_fast_math``  |
|                    |         |             | compiler option, which enables some fast math operations that are faster |
|                    |         |             | but less accurate.                                                       |
+--------------------+---------+-------------+--------------------------------------------------------------------------+
|``cuda_output``     | String  | ``None``    | The preferred CUDA output format for kernels. Valid choices are ``None``,|
|                    |         |             | ``"ptx"``, and ``"cubin"``. If ``None``, a format will be determined     |
|                    |         |             | automatically. The module-level setting takes precedence over the global |
|                    |         |             | setting.                                                                 |
+--------------------+---------+-------------+--------------------------------------------------------------------------+

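
The note on ``max_unroll`` and nested loops is easier to judge with numbers. The sketch below uses a simplified
model in which every loop whose fixed trip count is at most ``max_unroll`` is fully unrolled, so the innermost body
is duplicated once per combination of iterations. ``unrolled_body_copies`` is a hypothetical helper for this
arithmetic, not part of Warp, and the inclusive threshold is an assumption.

```python
from math import prod


def unrolled_body_copies(trip_counts, max_unroll=16):
    """Estimate how many copies of the innermost loop body are emitted.

    Simplified model (an assumption, not Warp's code generator): a loop is
    fully unrolled when its fixed trip count is <= max_unroll; larger loops
    are kept as loops and contribute a factor of 1.
    """
    return prod(n if n <= max_unroll else 1 for n in trip_counts)


# Three nested loops of 16 iterations each all fall under the threshold.
nested = unrolled_body_copies([16, 16, 16])

# An outer loop of 32 iterations is not unrolled, so only the inner one counts.
mixed = unrolled_body_copies([32, 16])

print(nested, mixed)  # prints "4096 16"
```

Under this model, lowering ``max_unroll`` for the module with ``wp.set_module_options()`` is one way to limit the growth.
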
Kernel Settings
---------------
``enable_backward`` is currently the only setting that can also be configured on a per-kernel level.
Backward-pass compilation can be disabled by passing an argument into the ``@wp.kernel`` decorator
as in the following example:

.. code-block:: python

    import warp as wp

    # no backward-pass code is compiled for this kernel
    @wp.kernel(enable_backward=False)
    def scale_2(
        x: wp.array(dtype=float),
        y: wp.array(dtype=float),
    ):
        y[0] = x[0] ** 2.0