CHANGELOG
[1.0.0-beta.6] - 2024-01-10
- Do not create CPU copy of grad array when calling `array.numpy()`
- Fix `assert_np_equal()` bug
- Support Linux AArch64 platforms, including Jetson/Tegra devices
- Add parallel testing runner (invoke with `python -m warp.tests`, use `warp/tests/unittest_serial.py` for serial testing)
- Fix support for function calls in `range()`
- `matmul` adjoints now accumulate
- Expand available operators (e.g. vector @ matrix, scalar as dividend) and improve support for calling native built-ins
- Fix multi-gpu synchronization issue in `sparse.py`
- Add depth rendering to `OpenGLRenderer`, document `warp.render`
- Make `atomic_min`, `atomic_max` differentiable
- Fix error reporting using the exact source segment
- Add user-friendly mesh query overloads, returning a struct instead of overwriting parameters
- Address multiple differentiability issues
- Fix backpropagation for returning array element references
- Support passing the return value to adjoints
- Add point basis space and explicit point-based quadrature for `warp.fem`
- Support overriding the LLVM project source directory path using `build_lib.py --build_llvm --llvm_source_path=`
- Fix the error message for accessing non-existing attributes
- Flatten faces array for Mesh constructor in URDF parser
[1.0.0-beta.5] - 2023-11-22
- Fix for kernel caching when function argument types change
- Fix code-gen ordering of dependent structs
- Fix for `wp.Mesh` build on MGPU systems
- Fix for name clash bug with adjoint code: https://github.com/NVIDIA/warp/issues/154
- Add `wp.frac()` for returning the fractional part of a floating point value
- Add support for custom native CUDA snippets using `@wp.func_native` decorator
- Add support for batched matmul with batch size > 2^16-1
- Add support for transposed CUTLASS `wp.matmul()` and additional error checking
- Add support for quad and hex meshes in `wp.fem`
- Detect and warn when C++ runtime doesn't match compiler during build, e.g.: ``libstdc++.so.6: version `GLIBCXX_3.4.30' not found``
- Documentation update for `wp.BVH`
- Documentation and simplified API for runtime kernel specialization `wp.Kernel`
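As a rough illustration of what "the fractional part of a floating point value" means, here is a plain-Python sketch. It assumes the truncation-based definition `x - trunc(x)`; the function name `frac` below is a stand-in written for this note, not Warp's implementation, and it runs outside of any kernel.

```python
import math


def frac(x: float) -> float:
    # Assumed definition: discard the integer part by truncating toward zero.
    return x - math.trunc(x)


positive = frac(2.75)   # integer part 2 is removed
negative = frac(-2.75)  # integer part -2 is removed, the sign is kept
```

Under this assumption the result keeps the sign of the input; a floor-based definition (`x - floor(x)`) would instead always return a value in [0, 1).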
[1.0.0-beta.4] - 2023-11-01
- Add `wp.cbrt()` for cube root calculation
- Add `wp.mesh_furthest_point_no_sign()` to compute furthest point on a surface from a query point
- Add support for GPU BVH builds, 10-100x faster than CPU builds for large meshes
- Add support for chained comparisons, e.g.: `0 < x < 2`
- Add support for running `warp.fem` examples headless
- Fix for unit test determinism
- Fix for possible GC collection of array during graph capture
- Fix for `wp.utils.array_sum()` output initialization when used with vector types
- Coverage and documentation updates
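For readers unfamiliar with chained comparisons, the plain-Python sketch below shows the semantics that `0 < x < 2` has in Python itself: it is shorthand for `(0 < x) and (x < 2)`. The helper names are hypothetical, and the code runs in the ordinary interpreter rather than inside a Warp kernel.

```python
def in_open_interval(x: float, lo: float = 0.0, hi: float = 2.0) -> bool:
    # Chained form: operands are compared pairwise, left to right.
    return lo < x < hi


def in_open_interval_expanded(x: float, lo: float = 0.0, hi: float = 2.0) -> bool:
    # Equivalent expanded form with an explicit logical AND.
    return (lo < x) and (x < hi)


samples = [-1.0, 0.0, 0.5, 1.999, 2.0, 3.0]
chained = [in_open_interval(x) for x in samples]
expanded = [in_open_interval_expanded(x) for x in samples]
```

Both lists agree for every sample, including the excluded end points 0.0 and 2.0.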
[1.0.0-beta.3] - 2023-10-19
- Add support for code coverage scans (test_coverage.py), coverage at 85% in omni.warp.core
- Add support for named component access for vector types, e.g.: `a = v.x`
- Add support for lvalue expressions, e.g.: `array[i] += b`
- Add casting constructors for matrix and vector types
- Add support for `type()` operator that can be used to return type inside kernels
- Add support for grid-stride kernels to support kernels with > 2^31-1 thread blocks
- Fix for multi-process initialization warnings
- Fix alignment issues with empty `wp.struct`
- Fix for return statement warning with tuple-returning functions
- Fix for `wp.batched_matmul()` registering the wrong function in the Tape
- Fix and document for `wp.sim` forward + inverse kinematics
- Fix for `wp.func` to return a default value if function does not return on all control paths
- Refactor `wp.fem` support for new basis functions, decoupled function spaces
- Optimizations for `wp.noise` functions, up to 10x faster in most cases
- Optimizations for `type_size_in_bytes()` used in array construction
Breaking Changes
- To support grid-stride kernels, `wp.tid()` can no longer be called inside `wp.func` functions.
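The general grid-stride pattern can be sketched in plain Python as follows. This is a sequential simulation of the idea (a fixed number of threads each stepping through the work items with a stride equal to the thread count), not Warp's generated code; `grid_stride_indices` and `run_grid_stride` are hypothetical names.

```python
from typing import List


def grid_stride_indices(thread_id: int, num_threads: int, total_work: int) -> List[int]:
    """Work-item indices handled by one thread in a grid-stride loop."""
    return list(range(thread_id, total_work, num_threads))


def run_grid_stride(num_threads: int, total_work: int) -> List[int]:
    """Simulate every thread in turn and count how often each item is visited."""
    visits = [0] * total_work
    for thread_id in range(num_threads):
        for i in grid_stride_indices(thread_id, num_threads, total_work):
            visits[i] += 1
    return visits


# 4 simulated threads cover 10 work items.
visits = run_grid_stride(num_threads=4, total_work=10)
thread_one = grid_stride_indices(1, 4, 10)
```

In this pattern the index of the current work item is produced by the loop in the kernel body, so a helper function that needs it would receive it as an ordinary argument.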
[1.0.0-beta.2] - 2023-09-01
- Fix for passing bool into `wp.func` functions
- Fix for deprecation warnings appearing on `stderr`, now redirected to `stdout`
- Fix for using `for i in wp.hash_grid_query(..)` syntax
[1.0.0-beta.1] - 2023-08-29
- Fix for `wp.float16` being passed as kernel arguments
- Fix for compile errors with kernels using structs in backward pass
- Fix for `wp.Mesh.refit()` not being CUDA graph capturable due to synchronous temp. allocs
- Fix for dynamic texture example flickering / MGPU crashes demo in Kit by reusing `ui.DynamicImageProvider` instances
- Fix for a regression that disabled bundle change tracking in samples
- Fix for incorrect surface velocities when meshes are deforming in `OgnClothSimulate`
- Fix for incorrect lower-case when setting USD stage "up_axis" in examples
- Fix for incompatible gradient types when wrapping PyTorch tensor as a vector or matrix type
- Fix for adding open edges when building cloth constraints from meshes in `wp.sim.ModelBuilder.add_cloth_mesh()`
- Add support for `wp.fabricarray` to directly access Fabric data from Warp kernels, see https://omniverse.gitlab-master-pages.nvidia.com/usdrt/docs/usdrt_prim_selection.html for examples
- Add support for user defined gradient functions, see `@wp.func_replay`, and `@wp.func_grad` decorators
- Add support for more OG attribute types in `omni.warp.from_omni_graph()`
- Add support for creating NanoVDB `wp.Volume` objects from dense NumPy arrays
- Add support for `wp.volume_sample_grad_f()` which returns the value + gradient efficiently from an NVDB volume
- Add support for LLVM fp16 intrinsics for half-precision arithmetic
- Add implementation of stochastic gradient descent, see `wp.optim.SGD`
- Add `warp.fem` framework for solving weak-form PDE problems (see https://nvidia.github.io/warp/_build/html/modules/fem.html)
- Optimizations for `omni.warp` extension load time (2.2s to 625ms cold start)
- Make all `omni.ui` dependencies optional so that Warp unit tests can run headless
- Deprecation of `wp.tid()` outside of kernel functions, users should pass `tid()` values to `wp.func` functions explicitly
- Deprecation of `wp.sim.Model.flatten()` for returning all contained tensors from the model
- Add support for clamping particle max velocity in `wp.sim.Model.particle_max_velocity`
- Remove dependency on `urdfpy` package, improve MJCF parser handling of default values
[0.10.1] - 2023-07-25
- Fix for large multidimensional kernel launches (> 2^32 threads)
- Fix for module hashing with generics
- Fix for unrolling loops with break or continue statements (will skip unrolling)
- Fix for passing boolean arguments to build_lib.py (previously ignored)
- Fix build warnings on Linux
- Fix for creating array of structs from NumPy structured array
- Fix for regression on kernel load times in Kit when using warp.sim
- Update `warp.array.reshape()` to handle `-1` dimensions
- Update margin used for mesh queries when using `wp.sim.create_soft_body_contacts()`
- Improvements to gradient handling with `warp.from_torch()`, `warp.to_torch()` plus documentation
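To make the `-1` reshape entry concrete, the sketch below shows how a single `-1` dimension can be inferred from the total element count. It assumes the NumPy convention (at most one `-1`, resolved so that the element count is preserved); `resolve_shape` is a hypothetical helper written for this note and is not part of Warp.

```python
from math import prod
from typing import Tuple


def resolve_shape(size: int, shape: Tuple[int, ...]) -> Tuple[int, ...]:
    """Replace a single -1 entry with the dimension implied by `size`."""
    unknown = [i for i, d in enumerate(shape) if d == -1]
    if len(unknown) > 1:
        raise ValueError("only one dimension may be -1")
    if not unknown:
        if prod(shape) != size:
            raise ValueError("shape does not match the number of elements")
        return tuple(shape)
    # Product of the explicitly given dimensions (1 if there are none).
    known = prod(d for d in shape if d != -1)
    if known == 0 or size % known != 0:
        raise ValueError("cannot infer the -1 dimension")
    resolved = list(shape)
    resolved[unknown[0]] = size // known
    return tuple(resolved)


two_d = resolve_shape(12, (3, -1))
three_d = resolve_shape(12, (2, -1, 3))
```

For a 12-element array, `(3, -1)` resolves to three rows of four and `(2, -1, 3)` resolves its middle dimension to 2.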
[0.10.0] - 2023-07-05
- Add support for macOS universal binaries (x86 + aarch64) for M1+ support
- Add additional methods for SDF generation, see the following new methods:
  - `wp.mesh_query_point_nosign()` - closest point query with no sign determination
  - `wp.mesh_query_point_sign_normal()` - closest point query with sign from angle-weighted normal
  - `wp.mesh_query_point_sign_winding_number()` - closest point query with fast winding number sign determination
- Add CSR/BSR sparse matrix support, see `warp.sparse` module:
  - `wp.sparse.BsrMatrix`
  - `wp.sparse.bsr_zeros()`, `wp.sparse.bsr_set_from_triplets()` for construction
  - `wp.sparse.bsr_mm()`, `wp.sparse.bsr_mv()` for matrix-matrix and matrix-vector products respectively
- Add array-wide utilities:
  - `wp.utils.array_scan()` - prefix sum (inclusive or exclusive)
  - `wp.utils.array_sum()` - sum across array
  - `wp.utils.radix_sort_pairs()` - in-place radix sort (key,value) pairs
- Add support for calling `@wp.func` functions from Python (outside of kernel scope)
- Add support for recording kernel launches using a `wp.Launch` object that can be replayed with low overhead, use `wp.launch(..., record_cmd=True)` to generate a command object
- Optimizations for `wp.struct` kernel arguments, up to 20x faster launches for kernels with large structs or number of params
- Refresh USD samples to use bundle based workflow + change tracking
- Add Python API for manipulating mesh and point bundle data in OmniGraph, see `omni.warp.nodes` module
  - See `omni.warp.nodes.mesh_create_bundle()`, `omni.warp.nodes.mesh_get_points()`, etc.
- Improvements to `wp.array`:
  - Fix a number of array methods misbehaving with empty arrays
  - Fix a number of bugs and memory leaks related to gradient arrays
  - Fix array construction when creating arrays in pinned memory from a data source in pageable memory
  - `wp.empty()` no longer zeroes-out memory and returns an uninitialized array, as intended
  - `array.zero_()` and `array.fill_()` work with non-contiguous arrays
  - Support wrapping non-contiguous NumPy arrays without a copy
  - Support preserving the outer dimensions of NumPy arrays when wrapping them as Warp arrays of vector or matrix types
  - Improve PyTorch and DLPack interop with Warp arrays of arbitrary vectors and matrices
  - `array.fill_()` can now take lists or other sequences when filling arrays of vectors or matrices, e.g. `arr.fill_([[1, 2], [3, 4]])`
  - `array.fill_()` now works with arrays of structs (pass a struct instance)
  - `wp.copy()` gracefully handles copying between non-contiguous arrays on different devices
  - Add `wp.full()` and `wp.full_like()`, e.g., `a = wp.full(shape, value)`
  - Add optional `device` argument to `wp.empty_like()`, `wp.zeros_like()`, `wp.full_like()`, and `wp.clone()`
  - Add `indexedarray` methods `.zero_()`, `.fill_()`, and `.assign()`
  - Fix `indexedarray` methods `.numpy()` and `.list()`
  - Fix `array.list()` to work with arrays of any Warp data type
  - Fix `array.list()` synchronization issue with CUDA arrays
  - `array.numpy()` called on an array of structs returns a structured NumPy array with named fields
  - Improve the performance of creating arrays
- Fix for `Error: No module named 'omni.warp.core'` when running some Kit configurations (e.g.: stubgen)
- Fix for `wp.struct` instance address being included in module content hash
- Fix codegen with overridden function names
- Fix for kernel hashing so it occurs after code generation and before loading to fix a bug with stale kernel cache
- Fix for `wp.BVH.refit()` when executed on the CPU
- Fix adjoint of `wp.struct` constructor
- Fix element accessors for `wp.float16` vectors and matrices in Python
- Fix `wp.float16` members in structs
- Remove deprecated `wp.ScopedCudaGuard()`, please use `wp.ScopedDevice()` instead
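The array-wide scan utility listed in this release computes a prefix sum, which can be inclusive or exclusive. The plain-Python sketch below shows the difference between the two variants using only the standard library; it illustrates the definitions and does not call Warp, and the function names are hypothetical.

```python
from itertools import accumulate
from typing import List, Sequence


def inclusive_scan(values: Sequence[int]) -> List[int]:
    """out[i] = values[0] + ... + values[i]"""
    return list(accumulate(values))


def exclusive_scan(values: Sequence[int]) -> List[int]:
    """out[i] = values[0] + ... + values[i - 1], with out[0] = 0"""
    if not values:
        return []
    # Shift the inclusive result right by one and start from zero.
    return [0] + list(accumulate(values))[:-1]


data = [1, 2, 3, 4]
inclusive = inclusive_scan(data)
exclusive = exclusive_scan(data)
```

The exclusive variant is the one typically used to turn per-item counts into start offsets, since element `i` holds the sum of everything before it.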
[0.9.0] - 2023-06-01
- Add support for in-place modifications to vector, matrix, and struct types inside kernels (will warn during backward pass with `wp.verbose` if using gradients)
- Add support for step-through VSCode debugging of kernel code with standalone LLVM compiler, see `wp.breakpoint()`, and `walkthrough_debug.py`
- Add support for default values on built-in functions
- Add support for multi-valued `@wp.func` functions
- Add support for `pass`, `continue`, and `break` statements
- Add missing `__sincos_stret` symbol for macOS
- Add support for gradient propagation through `wp.Mesh.points`, and other cases where arrays are passed to native functions
- Add support for Python `@` operator as an alias for `wp.matmul()`
- Add XPBD support for particle-particle collision
- Add support for individual particle radii: `ModelBuilder.add_particle` has a new `radius` argument, `Model.particle_radius` is now a Warp array
- Add per-particle flags as a `Model.particle_flags` Warp array, introduce `PARTICLE_FLAG_ACTIVE` to define whether a particle is being simulated and participates in contact dynamics
- Add support for Python bitwise operators `&`, `|`, `~`, `<<`, `>>`
- Switch to using standalone LLVM compiler by default for `cpu` devices
- Split `omni.warp` into `omni.warp.core` for Omniverse applications that want to use the Warp Python module with minimal additional dependencies
- Disable kernel gradient generation by default inside Omniverse for improved compile times
- Fix for bounds checking on element access of vector/matrix types
- Fix for stream initialization when a custom (non-primary) external CUDA context has been set on the calling thread
- Fix for duplicate `@wp.struct` registration during hot reload
- Fix for array `unot()` operator so kernel writers can use `if not array:` syntax
- Fix for case where dynamic loops are nested within unrolled loops
- Change `wp.hash_grid_point_id()` to return -1 if the `wp.HashGrid` has not been reserved before
- Deprecate `wp.Model.soft_contact_distance` which is now replaced by `wp.Model.particle_radius`
- Deprecate single scalar particle radius (should be a per-particle array)
[0.8.2] - 2023-04-21
- Add `ModelBuilder.soft_contact_max` to control the maximum number of soft contacts that can be registered. Use `Model.allocate_soft_contacts(new_count)` to change count on existing `Model` objects.
- Add support for `bool` parameters
- Add support for logical boolean operators with `int` types
- Fix for `wp.quat()` default constructor
- Fix conditional reassignments
- Add sign determination using angle weighted normal version of `wp.mesh_query_point()` as `wp.mesh_query_sign_normal()`
- Add sign determination using winding number of `wp.mesh_query_point()` as `wp.mesh_query_sign_winding_number()`
- Add query point without sign determination `wp.mesh_query_no_sign()`
[0.8.1] - 2023-04-13
- Fix for regression when passing flattened numeric lists as matrix arguments to kernels
- Fix for regressions when passing `wp.struct` types with uninitialized (`None`) member attributes
[0.8.0] - 2023-04-05
- Add `Texture Write` node for updating dynamic RTX textures from Warp kernels / nodes
- Add multi-dimensional kernel support to Warp Kernel Node
- Add `wp.load_module()` to pre-load specific modules (pass `recursive=True` to load recursively)
- Add `wp.poisson()` for sampling Poisson distributions
- Add support for UsdPhysics schema, see `warp.sim.parse_usd()`
- Add XPBD rigid body implementation plus diff. simulation examples
- Add support for standalone CPU compilation (no host-compiler) with LLVM backend, enable with `--standalone` build option
- Add support for per-timer color in `wp.ScopedTimer()`
- Add support for row-based construction of matrix types outside of kernels
- Add support for setting and getting row vectors for Python matrices, see `matrix.get_row()`, `matrix.set_row()`
- Add support for instantiating `wp.struct` types within kernels
- Add support for indexed arrays, `slice = array[indices]` will now generate a sparse slice of array data
- Add support for generic kernel params, use `def compute(param: Any):`
- Add support for `with wp.ScopedDevice("cuda") as device:` syntax (same for `wp.ScopedStream()`, `wp.Tape()`)
- Add support for creating custom length vector/matrices inside kernels, see `wp.vector()`, and `wp.matrix()`
- Add support for creating identity matrices in kernels with, e.g.: `I = wp.identity(n=3, dtype=float)`
- Add support for unary plus operator (`wp.pos()`)
- Add support for `wp.constant` variables to be used directly in Python without having to use `.val` member
- Add support for nested `wp.struct` types
- Add support for returning `wp.struct` from functions
- Add `--quick` build for faster local dev. iteration (uses a reduced set of SASS arches)
- Add optional `requires_grad` parameter to `wp.from_torch()` to override gradient allocation
- Add type hints for generic vector / matrix types in Python stubs
- Add support for custom user function recording in `wp.Tape()`
- Add support for registering CUTLASS `wp.matmul()` with tape backward pass
- Add support for grids with > 2^31 threads (each dimension may be up to INT_MAX in length)
- Add CPU fallback for `wp.matmul()`
- Optimizations for `wp.launch()`, up to 3x faster launches in common cases
- Fix `wp.randf()` conversion to float to reduce bias for uniform sampling
- Fix capture of `wp.func` and `wp.constant` types from inside Python closures
- Fix for CUDA on WSL
- Fix for matrices in structs
- Fix for transpose indexing for some non-square matrices
- Enable Python faulthandler by default
- Update to VS2019
Breaking Changes
- `wp.constant` variables can now be treated as their true type, accessing the underlying value through `constant.val` is no longer supported
- `wp.sim.model.ground_plane` is now a `wp.array` to support gradient, users should call `builder.set_ground_plane()` to create the ground
- `wp.sim` capsule, cones, and cylinders are now aligned with the default USD up-axis
[0.7.2] - 2023-02-15
- Reduce test time for vec/math types
- Clean-up CUDA disabled build pipeline
- Remove extension.gen.toml to make Kit packages Python version independent
- Handle additional cases for array indexing inside Python
[0.7.1] - 2023-02-14
- Disabling some slow tests for Kit
- Make unit tests run on first GPU only by default
[0.7.0] - 2023-02-13
- Add support for arbitrary length / type vector and matrices e.g.: `wp.vec(length=7, dtype=wp.float16)`, see `wp.vec()`, and `wp.mat()`
- Add support for `array.flatten()`, `array.reshape()`, and `array.view()` with NumPy semantics
- Add support for slicing `wp.array` types in Python
- Add `wp.from_ptr()` helper to construct arrays from an existing allocation
- Add support for `break` statements in ranged-for and while loops (backward pass support currently not implemented)
- Add built-in mathematic constants, see `wp.pi`, `wp.e`, `wp.log2e`, etc.
- Add built-in conversion between degrees and radians, see `wp.degrees()`, `wp.radians()`
- Add security pop-up for Kernel Node
- Improve error handling for kernel return values
[0.6.3] - 2023-01-31
- Add DLPack utilities, see `wp.from_dlpack()`, `wp.to_dlpack()`
- Add Jax utilities, see `wp.from_jax()`, `wp.to_jax()`, `wp.device_from_jax()`, `wp.device_to_jax()`
- Fix for Linux Kit extensions OM-80132, OM-80133
[0.6.2] - 2023-01-19
- Updated `wp.from_torch()` to support more data types
- Updated `wp.from_torch()` to automatically determine the target Warp data type if not specified
- Updated `wp.from_torch()` to support non-contiguous tensors with arbitrary strides
- Add CUTLASS integration for dense GEMMs, see `wp.matmul()` and `wp.matmul_batched()`
- Add QR and Eigen decompositions for `mat33` types, see `wp.qr3()`, and `wp.eig3()`
- Add default (zero) constructors for matrix types
- Add a flag to suppress all output except errors and warnings (set `wp.config.quiet = True`)
- Skip recompilation when Kernel Node attributes are edited
- Allow optional attributes for Kernel Node
- Allow disabling backward pass code-gen on a per-kernel basis, use `@wp.kernel(enable_backward=False)`
- Replace Python `imp` package with `importlib`
- Fix for quaternion slerp gradients (`wp.quat_slerp()`)
[0.6.1] - 2022-12-05
- Fix for non-CUDA builds
- Fix strides computation in array_t constructor, fixes a bug with accessing mesh indices through mesh.indices[]
- Disable backward pass code generation for kernel node (4-6x faster compilation)
- Switch to linbuild for universal Linux binaries (affects TeamCity builds only)
[0.6.0] - 2022-11-28
- Add support for CUDA streams, see `wp.Stream`, `wp.get_stream()`, `wp.set_stream()`, `wp.synchronize_stream()`, `wp.ScopedStream`
- Add support for CUDA events, see `wp.Event`, `wp.record_event()`, `wp.wait_event()`, `wp.wait_stream()`, `wp.Stream.record_event()`, `wp.Stream.wait_event()`, `wp.Stream.wait_stream()`
- Add support for PyTorch stream interop, see `wp.stream_from_torch()`, `wp.stream_to_torch()`
- Add support for allocating host arrays in pinned memory for asynchronous data transfers, use `wp.array(..., pinned=True)` (default is non-pinned)
- Add support for direct conversions between all scalar types, e.g.: `x = wp.uint8(wp.float64(3.0))`
- Add per-module option to enable fast math, use `wp.set_module_options({"fast_math": True})`, fast math is now disabled by default
- Add support for generating CUBIN kernels instead of PTX on systems with older drivers
- Add user preference options for CUDA kernel output ("ptx" or "cubin", e.g.: `wp.config.cuda_output = "ptx"` or per-module `wp.set_module_options({"cuda_output": "ptx"})`)
- Add kernel node for OmniGraph
- Add `wp.quat_slerp()`, `wp.quat_to_axis_angle()`, `wp.rotate_rodriquez()` and adjoints for all remaining quaternion operations
- Add support for unrolling for-loops when range is a `wp.constant`
- Add support for arithmetic operators on built-in vector / matrix types outside of `wp.kernel`
- Add support for multiple solution variables in `wp.optim` Adam optimization
- Add nested attribute support for `wp.struct` attributes
- Add missing adjoint implementations for spatial math types, and document all functions with missing adjoints
- Add support for retrieving NanoVDB tiles and voxel size, see `wp.Volume.get_tiles()`, and `wp.Volume.get_voxel_size()`
- Add support for store operations on integer NanoVDB volumes, see `wp.volume_store_i()`
- Expose `wp.Mesh` points, indices, as arrays inside kernels, see `wp.mesh_get()`
- Optimizations for `wp.array` construction, 2-3x faster on average
- Optimizations for URDF import
- Fix various deployment issues by statically linking with all CUDA libs
- Update warp.so/warp.dll to CUDA Toolkit 11.5
[0.5.1] - 2022-11-01
- Fix for unit tests in Kit
[0.5.0] - 2022-10-31
- Add smoothed particle hydrodynamics (SPH) example, see `example_sph.py`
- Add support for accessing `array.shape` inside kernels, e.g.: `width = arr.shape[0]`
- Add dependency tracking to hot-reload modules if dependencies were modified
- Add lazy acquisition of CUDA kernel contexts (save ~300Mb of GPU memory in MGPU environments)
- Add BVH object, see `wp.Bvh` and `bvh_query_ray()`, `bvh_query_aabb()` functions
- Add component index operations for `spatial_vector`, `spatial_matrix` types
- Add `wp.lerp()` and `wp.smoothstep()` builtins
- Add `wp.optim` module with implementation of the Adam optimizer for float and vector types
- Add support for transient Python modules (fix for Houdini integration)
- Add `wp.length_sq()`, `wp.trace()` for vector / matrix types respectively
- Add missing adjoints for `wp.quat_rpy()`, `wp.determinant()`
- Add `wp.atomic_min()`, `wp.atomic_max()` operators
- Add vectorized version of `warp.sim.model.add_cloth_mesh()`
- Add NVDB volume allocation API, see `wp.Volume.allocate()`, and `wp.Volume.allocate_by_tiles()`
- Add NVDB volume write methods, see `wp.volume_store_i()`, `wp.volume_store_f()`, `wp.volume_store_v()`
- Add MGPU documentation
- Add example showing how to compute Jacobian of multiple environments in parallel, see `example_jacobian_ik.py`
- Add `wp.Tape.zero()` support for `wp.struct` types
- Make SampleBrowser an optional dependency for Kit extension
- Make `wp.Mesh` object accept both 1d and 2d arrays of face vertex indices
- Fix for reloading of class member kernel / function definitions using `importlib.reload()`
- Fix for hashing of `wp.constants()` not invalidating kernels
- Fix for reload when multiple `.ptx` versions are present
- Improved error reporting during code-gen
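For reference, linear interpolation and the cubic Hermite "smoothstep" are commonly defined as in the plain-Python sketch below. These are the textbook formulas, written here under the assumption that the builtins follow them; the argument order shown is also an assumption and should be checked against the Warp documentation.

```python
def lerp(a: float, b: float, t: float) -> float:
    # Blend from a (t = 0) to b (t = 1).
    return a * (1.0 - t) + b * t


def smoothstep(edge0: float, edge1: float, x: float) -> float:
    # Normalize x to [0, 1] between the two edges, clamping outside values.
    t = (x - edge0) / (edge1 - edge0)
    t = min(max(t, 0.0), 1.0)
    # Cubic Hermite polynomial with zero slope at both ends.
    return t * t * (3.0 - 2.0 * t)


quarter = lerp(0.0, 10.0, 0.25)
midpoint = smoothstep(0.0, 1.0, 0.5)
```

Unlike `lerp`, `smoothstep` flattens out near both edges, which avoids visible kinks when it drives a blend weight.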
[0.4.3] - 2022-09-20
- Update all samples to use GPU interop path by default
- Fix for arrays > 2GB in length
- Add support for per-vertex USD mesh colors with warp.render class
[0.4.2] - 2022-09-07
- Register Warp samples to the sample browser in Kit
- Add NDEBUG flag to release mode kernel builds
- Fix for particle solver node when using a large number of particles
- Fix for broken cameras in Warp sample scenes
[0.4.1] - 2022-08-30
- Add geometry sampling methods, see `wp.sample_unit_cube()`, `wp.sample_unit_disk()`, etc
- Add `wp.lower_bound()` for searching sorted arrays
- Add an option for disabling code-gen of backward pass to improve compilation times, see `wp.set_module_options({"enable_backward": False})`, True by default
- Fix for using Warp from Script Editor or when module does not have a `__file__` attribute
- Fix for hot reload of modules containing `wp.func()` definitions
- Fix for debug flags not being set correctly on CUDA when `wp.config.mode == "debug"`, this enables bounds checking on CUDA kernels in debug mode
- Fix for code gen of functions that do not return a value
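A lower-bound search on a sorted array conventionally returns the index of the first element that is not less than the query value. The plain-Python sketch below implements that convention with a binary search and cross-checks it against the standard library's `bisect_left`; the signature and return convention of the Warp builtin are assumed here, not confirmed by this changelog.

```python
from bisect import bisect_left
from typing import Sequence


def lower_bound(sorted_values: Sequence[float], value: float) -> int:
    """Index of the first element >= value, or len(sorted_values) if none."""
    lo, hi = 0, len(sorted_values)
    while lo < hi:
        mid = (lo + hi) // 2
        if sorted_values[mid] < value:
            lo = mid + 1  # answer lies strictly to the right of mid
        else:
            hi = mid      # mid itself may be the answer
    return lo


keys = [1, 3, 3, 7]
results = [lower_bound(keys, v) for v in (0, 3, 4, 8)]
reference = [bisect_left(keys, v) for v in (0, 3, 4, 8)]
```

A query larger than every element returns the array length, so callers usually compare the result against the length before indexing.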
[0.4.0] - 2022-08-09
- Fix for FP16 conversions on GPUs without hardware support
- Fix for `runtime = None` errors when reloading the Warp module
- Fix for PTX architecture version when running with older drivers, see `wp.config.ptx_target_arch`
- Fix for USD imports from `__init__.py`, defer them to individual functions that need them
- Fix for robustness issues with sign determination for `wp.mesh_query_point()`
- Fix for `wp.HashGrid` memory leak when creating/destroying grids
- Add CUDA version checks for toolkit and driver
- Add support for cross-module `@wp.struct` references
- Support running even if CUDA initialization failed, use `wp.is_cuda_available()` to check availability
- Statically linking with the CUDA runtime library to avoid deployment issues
Breaking Changes
- Removed `wp.runtime` reference from the top-level module, as it should be considered private
[0.3.2] - 2022-07-19
- Remove Torch import from `__init__.py`, defer import to `wp.from_torch()`, `wp.to_torch()`
[0.3.1] - 2022-07-12
- Fix for marching cubes reallocation after initialization
- Add support for closest point between line segment tests, see `wp.closest_point_edge_edge()` builtin
- Add support for per-triangle elasticity coefficients in simulation, see `wp.sim.ModelBuilder.add_cloth_mesh()`
- Add support for specifying default device, see `wp.set_device()`, `wp.get_device()`, `wp.ScopedDevice`
- Add support for multiple GPUs (e.g., `"cuda:0"`, `"cuda:1"`), see `wp.get_cuda_devices()`, `wp.get_cuda_device_count()`, `wp.get_cuda_device()`
- Add support for explicitly targeting the current CUDA context using device alias `"cuda"`
- Add support for using arbitrary external CUDA contexts, see `wp.map_cuda_device()`, `wp.unmap_cuda_device()`
- Add PyTorch device aliasing functions, see `wp.device_from_torch()`, `wp.device_to_torch()`
Breaking Changes
- A CUDA device is used by default, if available (aligned with `wp.get_preferred_device()`)
- `wp.ScopedCudaGuard` is deprecated, use `wp.ScopedDevice` instead
- `wp.synchronize()` now synchronizes all devices; for finer-grained control, use `wp.synchronize_device()`
- Device alias `"cuda"` now refers to the current CUDA context, rather than a specific device like `"cuda:0"` or `"cuda:1"`
[0.3.0] - 2022-07-08
- Add support for FP16 storage type, see `wp.float16`
- Add support for per-dimension byte strides, see `wp.array.strides`
- Add support for passing Python classes as kernel arguments, see `@wp.struct` decorator
- Add additional bounds checks for builtin matrix types
- Add additional floating point checks, see `wp.config.verify_fp`
- Add interleaved user source with generated code to aid debugging
- Add generalized GPU marching cubes implementation, see `wp.MarchingCubes` class
- Add additional scalar*matrix vector operators
- Add support for retrieving a single row from builtin types, e.g.: `r = m33[i]`
- Add `wp.log2()` and `wp.log10()` builtins
- Add support for quickly instancing `wp.sim.ModelBuilder` objects to improve env. creation performance for RL
- Remove custom CUB version and improve compatibility with CUDA 11.7
- Fix to preserve external user-gradients when calling `wp.Tape.zero()`
- Fix to only allocate gradient of a Torch tensor if `requires_grad=True`
- Fix for missing `wp.mat22` constructor adjoint
- Fix for ray-cast precision in edge case on GPU (watertightness issue)
- Fix for kernel hot-reload when definition changes
- Fix for NVCC warnings on Linux
- Fix for generated function names when kernels are defined as class functions
- Fix for reload of generated CPU kernel code on Linux
- Fix for example scripts to output USD at 60 timecodes per-second (better Kit compatibility)
[0.2.3] - 2022-06-13
- Fix for incorrect 4d array bounds checking
- Fix for `wp.constant` changes not updating module hash
- Fix for stale CUDA kernel cache when CPU kernels launched first
- Array gradients are now allocated along with the arrays and accessible as `wp.array.grad`, users should take care to always call `wp.Tape.zero()` to clear gradients between different invocations of `wp.Tape.backward()`
- Added `wp.array.fill_()` to set all entries to a scalar value (4-byte values only currently)
Breaking Changes
- Tape `capture` option has been removed, users can now capture tapes inside existing CUDA graphs (e.g.: inside Torch)
- Scalar loss arrays should now explicitly set `requires_grad=True` at creation time
[0.2.2] - 2022-05-30
- Fix for `from import *` inside Warp initialization
- Fix for body space velocity when using deforming Mesh objects with scale
- Fix for noise gradient discontinuities affecting `wp.curlnoise()`
- Fix for `wp.from_torch()` to correctly preserve shape
- Fix for URDF parser incorrectly passing density to scale parameter
- Optimizations for startup time from 3s -> 0.3s
- Add support for custom kernel cache location, Warp will now store generated binaries in the user's application directory
- Add support for cross-module function references, e.g.: call another module's `@wp.func` functions
- Add support for overloading `@wp.func` functions based on argument type
- Add support for calling built-in functions directly from Python interpreter outside kernels (experimental)
- Add support for auto-complete and docstring lookup for builtins in IDEs like VSCode, PyCharm, etc
- Add support for doing partial array copies, see `wp.copy()` for details
- Add support for accessing mesh data directly in kernels, see `wp.mesh_get_point()`, `wp.mesh_get_index()`, `wp.mesh_eval_face_normal()`
- Change to only compile for targets where kernel is launched (e.g.: will not compile CPU unless explicitly requested)
Breaking Changes
- Builtin methods such as `wp.quat_identity()` now call the Warp native implementation directly and will return a `wp.quat` object instead of NumPy array
- NumPy implementations of many builtin methods have been moved to `warp.utils` and will be deprecated
- Local `@wp.func` functions should not be namespaced when called, e.g.: previously `wp.myfunc()` would work even if `myfunc()` was not a builtin
- Removed `wp.rpy2quat()`, please use `wp.quat_rpy()` instead
[0.2.1] - 2022-05-11
- Fix for unit tests in Kit
[0.2.0] - 2022-05-02
Warp Core
- Fix for unrolling loops with negative bounds
- Fix for unresolved symbol `hash_grid_build_device()` not found when lib is compiled without CUDA support
- Fix for failure to load nvrtc-builtins64_113.dll when user has a newer CUDA toolkit installed on their machine
- Fix for conversion of Torch tensors to wp.arrays() with a vector dtype (incorrect row count)
- Fix for `warp.dll` not found on some Windows installations
- Fix for macOS builds on Clang 13.x
- Fix for step-through debugging of kernels on Linux
- Add argument type checking for user defined `@wp.func` functions
- Add support for custom iterable types, supports ranges, hash grid, and mesh query objects
- Add support for multi-dimensional arrays, for example use `x = array[i,j,k]` syntax to address a 3-dimensional array
- Add support for multi-dimensional kernel launches, use `launch(kernel, dim=(i,j,k), ...` and `i,j,k = wp.tid()` to obtain thread indices
- Add support for bounds-checking array memory accesses in debug mode, use `wp.config.mode = "debug"` to enable
- Add support for differentiating through dynamic and nested for-loops
- Add support for evaluating MLP neural network layers inside kernels with custom activation functions, see `wp.mlp()`
- Add additional NVDB sampling methods and adjoints, see `wp.volume_sample_i()`, `wp.volume_sample_f()`, and `wp.volume_sample_vec()`
- Add support for loading zlib compressed NVDB volumes, see `wp.Volume.load_from_nvdb()`
- Add support for triangle intersection testing, see `wp.intersect_tri_tri()`
- Add support for NVTX profile zones in `wp.ScopedTimer()`
- Add support for additional transform and quaternion math operations, see `wp.inverse()`, `wp.quat_to_matrix()`, `wp.quat_from_matrix()`
- Add fast math (`--fast-math`) to kernel compilation by default
- Add `warp.torch` import by default (if PyTorch is installed)
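One way to picture a multi-dimensional launch is as a flat range of threads whose linear index is unpacked into `(i, j, k)` coordinates. The plain-Python sketch below does this in row-major order (last dimension varying fastest). The ordering is an assumption made for illustration, `unravel_index` is a hypothetical helper, and how Warp enumerates threads internally is not specified in this changelog.

```python
from typing import Tuple


def unravel_index(linear: int, dim: Tuple[int, int, int]) -> Tuple[int, int, int]:
    """Convert a flat thread index into (i, j, k) for a launch of shape `dim`."""
    _, nj, nk = dim
    i = linear // (nj * nk)
    j = (linear // nk) % nj
    k = linear % nk
    return i, j, k


dim = (2, 3, 4)
total_threads = dim[0] * dim[1] * dim[2]
coords = [unravel_index(t, dim) for t in range(total_threads)]
```

Every coordinate triple inside the launch bounds appears exactly once, so a kernel body indexed by `(i, j, k)` touches each element of a 3-dimensional array once.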
Warp Kit
- Add Kit menu for browsing Warp documentation and example scenes under 'Window->Warp'
- Fix for OgnParticleSolver.py example when collider is coming from Read Prim into Bundle node
Warp Sim
- Fix for joint attachment forces
- Fix for URDF importer and floating base support
- Add examples showing how to use differentiable forward kinematics to solve inverse kinematics
- Add examples for URDF cartpole and quadruped simulation
Breaking Changes
- wp.volume_sample_world() is now replaced by wp.volume_sample_f/i/vec() which operate in index (local) space. Users should use wp.volume_world_to_index() to transform points from world space to index space before sampling.
- wp.mlp() expects multi-dimensional arrays instead of one-dimensional arrays for inference, all other semantics remain the same as earlier versions of this API.
- wp.array.length member has been removed, please use wp.array.shape to access array dimensions, or use wp.array.size to get total element count
- Marking dense_gemm(), dense_chol(), etc methods as experimental until we revisit them
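For code that relied on the removed wp.array.length member, the relationship between shape and size can be sketched in plain Python. This assumes, as the breaking-change note states, that shape holds the per-dimension extents and size holds the total element count. The tuple below is only a stand-in for a real array's shape.

```python
import math

# Stand-in for the shape of a 3-dimensional array (not a real wp.array).
shape = (4, 3, 2)

# Number of dimensions and extent of the first dimension.
ndim = len(shape)
first_extent = shape[0]

# The total element count is the product of all extents.
size = math.prod(shape)
```

Code that previously read a single length value needs to decide whether it wants one dimension's extent or the total element count.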
[0.1.25] - 2022-03-20
- Add support for class methods to be Warp kernels
- Add HashGrid reserve() so it can be used with CUDA graphs
- Add support for CUDA graph capture of tape forward/backward passes
- Add support for Python 3.8.x and 3.9.x
- Add hyperbolic trigonometric functions, see wp.tanh(), wp.sinh(), wp.cosh()
- Add support for floored division on integer types
- Move tests into core library so they can be run in Kit environment
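Floored division differs from C-style truncating division only when the operands have opposite signs. The sketch below contrasts the two in plain Python. It assumes that floored division in kernels follows the same rule as Python's // operator. The helper trunc_div is hypothetical and exists only for comparison.

```python
def trunc_div(a, b):
    """Integer division that rounds toward zero, as in C.

    Hypothetical helper used only to contrast with floored division.
    """
    quotient = abs(a) // abs(b)
    same_sign = (a >= 0) == (b >= 0)
    return quotient if same_sign else -quotient


# Python's // operator rounds toward negative infinity.
floored = -7 // 2
truncated = trunc_div(-7, 2)

# With operands of the same sign, both conventions agree.
agree = (7 // 2) == trunc_div(7, 2)
```

Here floored is -4 while truncated is -3, which matters for index arithmetic involving negative coordinates.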
[0.1.24] - 2022-03-03
Warp Core
- Add NanoVDB support, see wp.volume_sample*() methods
- Add support for reading compile-time constants in kernels, see wp.constant()
- Add support for cuda_array_interface protocol for zero-copy interop with PyTorch, see wp.torch.to_torch()
- Add support for additional numeric types, i8, u8, i16, u16, etc
- Add better checks for device strings during allocation / launch
- Add support for sampling random numbers with a normal distribution, see wp.randn()
- Upgrade to CUDA 11.3
- Update example scenes to Kit 103.1
- Deduce array dtype from np.array when one is not provided
- Fix for ranged for loops with negative step sizes
- Fix for 3d and 4d spherical gradient distributions
[0.1.23] - 2022-02-17
Warp Core
- Fix for generated code folder being removed during Showroom installation
- Fix for macOS support
- Fix for dynamic for-loop code gen edge case
- Add procedural noise primitives, see noise(), pnoise(), curlnoise()
- Move simulation helpers out of test into warp.sim module
[0.1.22] - 2022-02-14
Warp Core
- Fix for .so reloading on Linux
- Fix for while loop code-gen in some edge cases
- Add rounding functions round(), rint(), trunc(), floor(), ceil()
- Add support for printing strings and formatted strings from kernels
- Add MSVC compiler version detection and require minimum
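The five rounding functions listed above differ mainly in how they treat negative values and halfway cases. The following plain-Python sketch illustrates the usual conventions. It assumes that round() rounds halfway cases away from zero and rint() rounds them to the nearest even integer, which should be checked against the Warp documentation. The function names here are stand-ins, not the Warp built-ins.

```python
import math


def round_half_away(x):
    """Round to nearest, halfway cases away from zero (assumed round() behavior)."""
    return math.copysign(math.floor(abs(x) + 0.5), x)


def rint(x):
    """Round to nearest, halfway cases to even (assumed rint() behavior).

    Python's built-in round() already uses this rule for floats.
    """
    return float(round(x))


x = -2.5
results = {
    "round": round_half_away(x),   # halfway case moves away from zero
    "rint": rint(x),               # halfway case moves to the even neighbor
    "trunc": float(math.trunc(x)), # drops the fractional part
    "floor": float(math.floor(x)), # toward negative infinity
    "ceil": float(math.ceil(x)),   # toward positive infinity
}
```

This is a sketch for comparing conventions. The floor(abs(x) + 0.5) formulation has known floating-point edge cases very close to 0.5.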
Warp Sim
- Add support for universal and compound joint types
[0.1.21] - 2022-01-19
Warp Core
- Fix for exception on shutdown in empty wp.array objects
- Fix for hot reload of CPU kernels in Kit
- Add hash grid primitive for point-based spatial queries, see hash_grid_query(), hash_grid_query_next()
- Add new PRNG methods using PCG-based generators, see rand_init(), randf(), randi()
- Add support for AABB mesh queries, see mesh_query_aabb(), mesh_query_aabb_next()
- Add support for all Python range() loop variants
- Add builtin vec2 type and additional math operators, pow(), tan(), atan(), atan2()
- Remove dependency on CUDA driver library at build time
- Remove unused NVRTC binary dependencies (50mb smaller Linux distribution)
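To illustrate the kind of PCG-based generator the PRNG entry above refers to, here is a minimal pure-Python sketch of a widely used 32-bit PCG hash and a float sampler built on it. It is not Warp's implementation. The function names mirror the changelog entry only for readability, and the state-passing style (returning the new state) is an assumption made because Python integers are immutable.

```python
MASK32 = 0xFFFFFFFF  # emulate 32-bit unsigned wraparound


def pcg_hash(value):
    """One round of a common 32-bit PCG hash (illustrative, not Warp's code)."""
    state = (value * 747796405 + 2891336453) & MASK32
    word = (((state >> ((state >> 28) + 4)) ^ state) * 277803737) & MASK32
    return ((word >> 22) ^ word) & MASK32


def rand_init(seed):
    """Derive an initial generator state from a seed."""
    return pcg_hash(seed & MASK32)


def randf(state):
    """Advance the state and return (new_state, float in [0, 1))."""
    state = pcg_hash(state)
    # Keep the top 24 bits so the result is exactly representable.
    return state, (state >> 8) / float(1 << 24)


state = rand_init(42)
samples = []
for _ in range(5):
    state, value = randf(state)
    samples.append(value)
```

Because the generator is a pure function of its state, each thread can seed its own stream from a seed and its thread index without shared state.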
Warp Sim
- Bundle import of multiple shapes for simulation nodes
- New OgnParticleVolume node for sampling shapes -> particles
- New OgnParticleSolver node for DEM style granular materials
[0.1.20] - 2021-11-02
- Updates to the ripple solver for GTC (support for multiple colliders, buoyancy, etc)
[0.1.19] - 2021-10-15
- Publish from 2021.3 to avoid omni.graph database incompatibilities
[0.1.18] - 2021-10-08
- Enable Linux support (tested on 20.04)
[0.1.17] - 2021-09-30
- Fix for 3x3 SVD adjoint
- Fix for A6000 GPU (bump compute model to sm_52 minimum)
- Fix for .dll unload on rebuild
- Fix for possible array destruction warnings on shutdown
- Rename spatial_transform -> transform
- Documentation update
[0.1.16] - 2021-09-06
- Fix for case where simple assignments (a = b) incorrectly generated reference rather than value copy
- Handle passing zero-length (empty) arrays to kernels
[0.1.15] - 2021-09-03
- Add additional math library functions (asin, etc)
- Add builtin 3x3 SVD support
- Add support for named constants (True, False, None)
- Add support for if/else statements (differentiable)
- Add custom memset kernel to avoid CPU overhead of cudaMemset()
- Add rigid body joint model to warp.sim (based on Brax)
- Add Linux, MacOS support in core library
- Fix for incorrectly treating pure assignment as reference instead of value copy
- Removes the need to transfer array to CPU before numpy conversion (will be done implicitly)
- Update the example OgnRipple wave equation solver to use bundles
[0.1.14] - 2021-08-09
- Fix for out-of-bounds memory access in CUDA BVH
- Better error checking after kernel launches (use warp.config.verify_cuda=True)
- Fix for vec3 normalize adjoint code
[0.1.13] - 2021-07-29
- Remove OgnShrinkWrap.py test node
[0.1.12] - 2021-07-29
- Switch to Woop et al.'s watertight ray-tri intersection test
- Disable --fast-math in CUDA compilation step for improved precision
[0.1.11] - 2021-07-28
- Fix for mesh_query_ray() returning incorrect t-value
[0.1.10] - 2021-07-28
- Fix for OV extension fwatcher filters to avoid hot-reload loop due to OGN regeneration
[0.1.9] - 2021-07-21
- Fix for loading sibling DLL paths
- Better type checking for built-in function arguments
- Added runtime docs, can now list all builtins using wp.print_builtins()
[0.1.8] - 2021-07-14
- Fix for hot-reload of CUDA kernels
- Add Tape object for replaying differentiable kernels
- Add helpers for Torch interop (convert torch.Tensor to wp.Array)
[0.1.7] - 2021-07-05
- Switch to NVRTC for CUDA runtime
- Allow running without host compiler
- Disable asserts in kernel release mode (small perf. improvement)
[0.1.6] - 2021-06-14
- Look for CUDA toolchain in target-deps
[0.1.5] - 2021-06-14
- Rename OgLang -> Warp
- Improve CUDA environment error checking
- Clean-up some logging, add verbose mode (warp.config.verbose)
[0.1.4] - 2021-06-10
- Add support for mesh raycast
[0.1.3] - 2021-06-09
- Add support for unary negation operator
- Add support for mutating variables during dynamic loops (non-differentiable)
- Add support for in-place operators
- Improve kernel cache start up times (avoids adjointing before cache check)
- Update README.md with requirements / examples
[0.1.2] - 2021-06-03
- Add support for querying mesh velocities
- Add CUDA graph support, see warp.capture_begin(), warp.capture_end(), warp.capture_launch()
- Add explicit initialization phase, warp.init()
- Add variational Euler solver (sim)
- Add contact caching, switch to nonlinear friction model (sim)
- Fix for Linux/macOS support
[0.1.1] - 2021-05-18
- Fix bug with conflicting CUDA contexts
[0.1.0] - 2021-05-17
- Initial publish for alpha testing