| # CHANGELOG | |
| ## [1.0.0-beta.6] - 2024-01-10 | |
| - Do not create CPU copy of grad array when calling `array.numpy()` | |
| - Fix `assert_np_equal()` bug | |
| - Support Linux AArch64 platforms, including Jetson/Tegra devices | |
| - Add parallel testing runner (invoke with `python -m warp.tests`, use `warp/tests/unittest_serial.py` for serial testing) | |
| - Fix support for function calls in `range()` | |
| - `matmul` adjoints now accumulate | |
| - Expand available operators (e.g. vector @ matrix, scalar as dividend) and improve support for calling native built-ins | |
| - Fix multi-gpu synchronization issue in `sparse.py` | |
| - Add depth rendering to `OpenGLRenderer`, document `warp.render` | |
| - Make `atomic_min`, `atomic_max` differentiable | |
| - Fix error reporting using the exact source segment | |
| - Add user-friendly mesh query overloads, returning a struct instead of overwriting parameters | |
| - Address multiple differentiability issues | |
| - Fix backpropagation for returning array element references | |
| - Support passing the return value to adjoints | |
| - Add point basis space and explicit point-based quadrature for `warp.fem` | |
| - Support overriding the LLVM project source directory path using `build_lib.py --build_llvm --llvm_source_path=` | |
| - Fix the error message for accessing non-existing attributes | |
| - Flatten faces array for Mesh constructor in URDF parser | |
| ## [1.0.0-beta.5] - 2023-11-22 | |
| - Fix for kernel caching when function argument types change | |
| - Fix code-gen ordering of dependent structs | |
| - Fix for `wp.Mesh` build on MGPU systems | |
| - Fix for name clash bug with adjoint code: https://github.com/NVIDIA/warp/issues/154 | |
| - Add `wp.frac()` for returning the fractional part of a floating point value | |
| - Add support for custom native CUDA snippets using `@wp.func_native` decorator | |
| - Add support for batched matmul with batch size > 2^16-1 | |
| - Add support for transposed CUTLASS `wp.matmul()` and additional error checking | |
| - Add support for quad and hex meshes in `wp.fem` | |
| - Detect and warn when C++ runtime doesn't match compiler during build, e.g.: ``libstdc++.so.6: version `GLIBCXX_3.4.30' not found`` | |
| - Documentation update for `wp.BVH` | |
| - Documentation and simplified API for runtime kernel specialization `wp.Kernel` | |
| ## [1.0.0-beta.4] - 2023-11-01 | |
| - Add `wp.cbrt()` for cube root calculation | |
| - Add `wp.mesh_furthest_point_no_sign()` to compute furthest point on a surface from a query point | |
| - Add support for GPU BVH builds, 10-100x faster than CPU builds for large meshes | |
| - Add support for chained comparisons, e.g.: `0 < x < 2` | |
| - Add support for running `warp.fem` examples headless | |
| - Fix for unit test determinism | |
| - Fix for possible GC collection of array during graph capture | |
| - Fix for `wp.utils.array_sum()` output initialization when used with vector types | |
| - Coverage and documentation updates | |
| ## [1.0.0-beta.3] - 2023-10-19 | |
| - Add support for code coverage scans (test_coverage.py), coverage at 85% in omni.warp.core | |
| - Add support for named component access for vector types, e.g.: `a = v.x` | |
| - Add support for lvalue expressions, e.g.: `array[i] += b` | |
| - Add casting constructors for matrix and vector types | |
| - Add support for `type()` operator that can be used to return type inside kernels | |
| - Add support for grid-stride kernels to support kernels with > 2^31-1 thread blocks | |
| - Fix for multi-process initialization warnings | |
| - Fix alignment issues with empty `wp.struct` | |
| - Fix for return statement warning with tuple-returning functions | |
| - Fix for `wp.batched_matmul()` registering the wrong function in the Tape | |
| - Fix and document `wp.sim` forward + inverse kinematics | |
| - Fix for `wp.func` to return a default value if function does not return on all control paths | |
| - Refactor `wp.fem` support for new basis functions, decoupled function spaces | |
| - Optimizations for `wp.noise` functions, up to 10x faster in most cases | |
| - Optimizations for `type_size_in_bytes()` used in array construction | |
| ### Breaking Changes | |
| - To support grid-stride kernels, `wp.tid()` can no longer be called inside `wp.func` functions. | |
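The grid-stride pattern mentioned in this release can be illustrated without Warp. The following is a minimal pure-Python sketch, not Warp's implementation: `grid_stride_indices`, `num_threads`, and `n` are hypothetical names, and the loop only simulates how a fixed pool of threads can cover an index range larger than the pool.

```python
def grid_stride_indices(thread_id, num_threads, n):
    """Return the work-item indices one simulated thread would process."""
    # Each thread starts at its own id and advances by the total thread count,
    # so a fixed pool of threads covers all n items between them.
    return list(range(thread_id, n, num_threads))


num_threads = 4  # hypothetical launch size, smaller than the problem
n = 10           # hypothetical number of work items

# Work assigned to each simulated thread.
assignments = [grid_stride_indices(t, num_threads, n) for t in range(num_threads)]

# Every index is visited exactly once across all threads.
covered = sorted(i for items in assignments for i in items)
```

Here thread 0 handles indices 0, 4, and 8, so a single thread may process several work items during one launch.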
| ## [1.0.0-beta.2] - 2023-09-01 | |
| - Fix for passing bool into `wp.func` functions | |
| - Fix for deprecation warnings appearing on `stderr`, now redirected to `stdout` | |
| - Fix for using `for i in wp.hash_grid_query(..)` syntax | |
| ## [1.0.0-beta.1] - 2023-08-29 | |
| - Fix for `wp.float16` being passed as kernel arguments | |
| - Fix for compile errors with kernels using structs in backward pass | |
| - Fix for `wp.Mesh.refit()` not being CUDA graph capturable due to synchronous temp. allocs | |
| - Fix for dynamic texture example flickering / MGPU crashes in the Kit demo by reusing `ui.DynamicImageProvider` instances | |
| - Fix for a regression that disabled bundle change tracking in samples | |
| - Fix for incorrect surface velocities when meshes are deforming in `OgnClothSimulate` | |
| - Fix for incorrect lower-case when setting USD stage "up_axis" in examples | |
| - Fix for incompatible gradient types when wrapping PyTorch tensor as a vector or matrix type | |
| - Fix for adding open edges when building cloth constraints from meshes in `wp.sim.ModelBuilder.add_cloth_mesh()` | |
| - Add support for `wp.fabricarray` to directly access Fabric data from Warp kernels, see https://omniverse.gitlab-master-pages.nvidia.com/usdrt/docs/usdrt_prim_selection.html for examples | |
| - Add support for user defined gradient functions, see `@wp.func_replay`, and `@wp.func_grad` decorators | |
| - Add support for more OG attribute types in `omni.warp.from_omni_graph()` | |
| - Add support for creating NanoVDB `wp.Volume` objects from dense NumPy arrays | |
| - Add support for `wp.volume_sample_grad_f()` which returns the value + gradient efficiently from an NVDB volume | |
| - Add support for LLVM fp16 intrinsics for half-precision arithmetic | |
| - Add implementation of stochastic gradient descent, see `wp.optim.SGD` | |
| - Add `warp.fem` framework for solving weak-form PDE problems (see https://nvidia.github.io/warp/_build/html/modules/fem.html) | |
| - Optimizations for `omni.warp` extension load time (2.2s to 625ms cold start) | |
| - Make all `omni.ui` dependencies optional so that Warp unit tests can run headless | |
| - Deprecation of `wp.tid()` outside of kernel functions, users should pass `tid()` values to `wp.func` functions explicitly | |
| - Deprecation of `wp.sim.Model.flatten()` for returning all contained tensors from the model | |
| - Add support for clamping particle max velocity in `wp.sim.Model.particle_max_velocity` | |
| - Remove dependency on `urdfpy` package, improve MJCF parser handling of default values | |
| ## [0.10.1] - 2023-07-25 | |
| - Fix for large multidimensional kernel launches (> 2^32 threads) | |
| - Fix for module hashing with generics | |
| - Fix for unrolling loops with break or continue statements (will skip unrolling) | |
| - Fix for passing boolean arguments to build_lib.py (previously ignored) | |
| - Fix build warnings on Linux | |
| - Fix for creating array of structs from NumPy structured array | |
| - Fix for regression on kernel load times in Kit when using warp.sim | |
| - Update `warp.array.reshape()` to handle `-1` dimensions | |
| - Update margin used for mesh queries when using `wp.sim.create_soft_body_contacts()` | |
| - Improvements to gradient handling with `warp.from_torch()`, `warp.to_torch()` plus documentation | |
| ## [0.10.0] - 2023-07-05 | |
| - Add support for macOS universal binaries (x86 + aarch64) for M1+ support | |
| - Add additional methods for SDF generation, see the following new methods: | |
|   - `wp.mesh_query_point_nosign()` - closest point query with no sign determination | |
|   - `wp.mesh_query_point_sign_normal()` - closest point query with sign from angle-weighted normal | |
|   - `wp.mesh_query_point_sign_winding_number()` - closest point query with fast winding number sign determination | |
| - Add CSR/BSR sparse matrix support, see `warp.sparse` module: | |
|   - `wp.sparse.BsrMatrix` | |
|   - `wp.sparse.bsr_zeros()`, `wp.sparse.bsr_set_from_triplets()` for construction | |
|   - `wp.sparse.bsr_mm()`, `wp.sparse.bsr_mv()` for matrix-matrix and matrix-vector products respectively | |
| - Add array-wide utilities: | |
|   - `wp.utils.array_scan()` - prefix sum (inclusive or exclusive) | |
|   - `wp.utils.array_sum()` - sum across array | |
|   - `wp.utils.radix_sort_pairs()` - in-place radix sort (key,value) pairs | |
| - Add support for calling `@wp.func` functions from Python (outside of kernel scope) | |
| - Add support for recording kernel launches using a `wp.Launch` object that can be replayed with low overhead, use `wp.launch(..., record_cmd=True)` to generate a command object | |
| - Optimizations for `wp.struct` kernel arguments, up to 20x faster launches for kernels with large structs or number of params | |
| - Refresh USD samples to use bundle based workflow + change tracking | |
| - Add Python API for manipulating mesh and point bundle data in OmniGraph, see `omni.warp.nodes` module | |
|   - See `omni.warp.nodes.mesh_create_bundle()`, `omni.warp.nodes.mesh_get_points()`, etc. | |
| - Improvements to `wp.array`: | |
|   - Fix a number of array methods misbehaving with empty arrays | |
|   - Fix a number of bugs and memory leaks related to gradient arrays | |
|   - Fix array construction when creating arrays in pinned memory from a data source in pageable memory | |
|   - `wp.empty()` no longer zeroes-out memory and returns an uninitialized array, as intended | |
|   - `array.zero_()` and `array.fill_()` work with non-contiguous arrays | |
|   - Support wrapping non-contiguous NumPy arrays without a copy | |
|   - Support preserving the outer dimensions of NumPy arrays when wrapping them as Warp arrays of vector or matrix types | |
|   - Improve PyTorch and DLPack interop with Warp arrays of arbitrary vectors and matrices | |
|   - `array.fill_()` can now take lists or other sequences when filling arrays of vectors or matrices, e.g. `arr.fill_([[1, 2], [3, 4]])` | |
|   - `array.fill_()` now works with arrays of structs (pass a struct instance) | |
|   - `wp.copy()` gracefully handles copying between non-contiguous arrays on different devices | |
|   - Add `wp.full()` and `wp.full_like()`, e.g., `a = wp.full(shape, value)` | |
|   - Add optional `device` argument to `wp.empty_like()`, `wp.zeros_like()`, `wp.full_like()`, and `wp.clone()` | |
|   - Add `indexedarray` methods `.zero_()`, `.fill_()`, and `.assign()` | |
|   - Fix `indexedarray` methods `.numpy()` and `.list()` | |
|   - Fix `array.list()` to work with arrays of any Warp data type | |
|   - Fix `array.list()` synchronization issue with CUDA arrays | |
|   - `array.numpy()` called on an array of structs returns a structured NumPy array with named fields | |
|   - Improve the performance of creating arrays | |
| - Fix for `Error: No module named 'omni.warp.core'` when running some Kit configurations (e.g.: stubgen) | |
| - Fix for `wp.struct` instance address being included in module content hash | |
| - Fix codegen with overridden function names | |
| - Fix for kernel hashing so it occurs after code generation and before loading to fix a bug with stale kernel cache | |
| - Fix for `wp.BVH.refit()` when executed on the CPU | |
| - Fix adjoint of `wp.struct` constructor | |
| - Fix element accessors for `wp.float16` vectors and matrices in Python | |
| - Fix `wp.float16` members in structs | |
| - Remove deprecated `wp.ScopedCudaGuard()`, please use `wp.ScopedDevice()` instead | |
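The difference between the inclusive and exclusive prefix sums offered by `wp.utils.array_scan()` in this release can be shown in plain Python. This is only a sketch of the two conventions, not Warp's implementation or call signature: `inclusive_scan` and `exclusive_scan` are hypothetical helper names.

```python
from itertools import accumulate


def inclusive_scan(values):
    """Element i of the result is the sum of values[0..i]."""
    return list(accumulate(values))


def exclusive_scan(values):
    """Element i of the result is the sum of values[0..i-1]; the first is 0."""
    if not values:
        return []
    # Shift the inclusive result right by one and start from zero.
    return [0] + list(accumulate(values))[:-1]


values = [3, 1, 4, 1, 5]
inclusive = inclusive_scan(values)  # [3, 4, 8, 9, 14]
exclusive = exclusive_scan(values)  # [0, 3, 4, 8, 9]
```

The exclusive form is often the convenient one for turning per-item counts into start offsets, since each entry gives the total of everything before it.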
| ## [0.9.0] - 2023-06-01 | |
| - Add support for in-place modifications to vector, matrix, and struct types inside kernels (will warn during backward pass with `wp.verbose` if using gradients) | |
| - Add support for step-through VSCode debugging of kernel code with standalone LLVM compiler, see `wp.breakpoint()`, and `walkthrough_debug.py` | |
| - Add support for default values on built-in functions | |
| - Add support for multi-valued `@wp.func` functions | |
| - Add support for `pass`, `continue`, and `break` statements | |
| - Add missing `__sincos_stret` symbol for macOS | |
| - Add support for gradient propagation through `wp.Mesh.points`, and other cases where arrays are passed to native functions | |
| - Add support for Python `@` operator as an alias for `wp.matmul()` | |
| - Add XPBD support for particle-particle collision | |
| - Add support for individual particle radii: `ModelBuilder.add_particle` has a new `radius` argument, `Model.particle_radius` is now a Warp array | |
| - Add per-particle flags as a `Model.particle_flags` Warp array, introduce `PARTICLE_FLAG_ACTIVE` to define whether a particle is being simulated and participates in contact dynamics | |
| - Add support for Python bitwise operators `&`, `|`, `~`, `<<`, `>>` | |
| - Switch to using standalone LLVM compiler by default for `cpu` devices | |
| - Split `omni.warp` into `omni.warp.core` for Omniverse applications that want to use the Warp Python module with minimal additional dependencies | |
| - Disable kernel gradient generation by default inside Omniverse for improved compile times | |
| - Fix for bounds checking on element access of vector/matrix types | |
| - Fix for stream initialization when a custom (non-primary) external CUDA context has been set on the calling thread | |
| - Fix for duplicate `@wp.struct` registration during hot reload | |
| - Fix for array `unot()` operator so kernel writers can use `if not array:` syntax | |
| - Fix for case where dynamic loops are nested within unrolled loops | |
| - Change `wp.hash_grid_point_id()` to return -1 if the `wp.HashGrid` has not been reserved beforehand | |
| - Deprecate `wp.Model.soft_contact_distance` which is now replaced by `wp.Model.particle_radius` | |
| - Deprecate single scalar particle radius (should be a per-particle array) | |
| ## [0.8.2] - 2023-04-21 | |
| - Add `ModelBuilder.soft_contact_max` to control the maximum number of soft contacts that can be registered. Use `Model.allocate_soft_contacts(new_count)` to change count on existing `Model` objects. | |
| - Add support for `bool` parameters | |
| - Add support for logical boolean operators with `int` types | |
| - Fix for `wp.quat()` default constructor | |
| - Fix conditional reassignments | |
| - Add sign determination using angle weighted normal version of `wp.mesh_query_point()` as `wp.mesh_query_sign_normal()` | |
| - Add sign determination using winding number of `wp.mesh_query_point()` as `wp.mesh_query_sign_winding_number()` | |
| - Add query point without sign determination `wp.mesh_query_no_sign()` | |
| ## [0.8.1] - 2023-04-13 | |
| - Fix for regression when passing flattened numeric lists as matrix arguments to kernels | |
| - Fix for regressions when passing `wp.struct` types with uninitialized (`None`) member attributes | |
| ## [0.8.0] - 2023-04-05 | |
| - Add `Texture Write` node for updating dynamic RTX textures from Warp kernels / nodes | |
| - Add multi-dimensional kernel support to Warp Kernel Node | |
| - Add `wp.load_module()` to pre-load specific modules (pass `recursive=True` to load recursively) | |
| - Add `wp.poisson()` for sampling Poisson distributions | |
| - Add support for UsdPhysics schema, see `warp.sim.parse_usd()` | |
| - Add XPBD rigid body implementation plus diff. simulation examples | |
| - Add support for standalone CPU compilation (no host-compiler) with LLVM backend, enable with `--standalone` build option | |
| - Add support for per-timer color in `wp.ScopedTimer()` | |
| - Add support for row-based construction of matrix types outside of kernels | |
| - Add support for setting and getting row vectors for Python matrices, see `matrix.get_row()`, `matrix.set_row()` | |
| - Add support for instantiating `wp.struct` types within kernels | |
| - Add support for indexed arrays, `slice = array[indices]` will now generate a sparse slice of array data | |
| - Add support for generic kernel params, use `def compute(param: Any):` | |
| - Add support for `with wp.ScopedDevice("cuda") as device:` syntax (same for `wp.ScopedStream()`, `wp.Tape()`) | |
| - Add support for creating custom length vector/matrices inside kernels, see `wp.vector()`, and `wp.matrix()` | |
| - Add support for creating identity matrices in kernels with, e.g.: `I = wp.identity(n=3, dtype=float)` | |
| - Add support for unary plus operator (`wp.pos()`) | |
| - Add support for `wp.constant` variables to be used directly in Python without having to use `.val` member | |
| - Add support for nested `wp.struct` types | |
| - Add support for returning `wp.struct` from functions | |
| - Add `--quick` build for faster local dev. iteration (uses a reduced set of SASS arches) | |
| - Add optional `requires_grad` parameter to `wp.from_torch()` to override gradient allocation | |
| - Add type hints for generic vector / matrix types in Python stubs | |
| - Add support for custom user function recording in `wp.Tape()` | |
| - Add support for registering CUTLASS `wp.matmul()` with tape backward pass | |
| - Add support for grids with > 2^31 threads (each dimension may be up to INT_MAX in length) | |
| - Add CPU fallback for `wp.matmul()` | |
| - Optimizations for `wp.launch()`, up to 3x faster launches in common cases | |
| - Fix `wp.randf()` conversion to float to reduce bias for uniform sampling | |
| - Fix capture of `wp.func` and `wp.constant` types from inside Python closures | |
| - Fix for CUDA on WSL | |
| - Fix for matrices in structs | |
| - Fix for transpose indexing for some non-square matrices | |
| - Enable Python faulthandler by default | |
| - Update to VS2019 | |
| ### Breaking Changes | |
| - `wp.constant` variables can now be treated as their true type, accessing the underlying value through `constant.val` is no longer supported | |
| - `wp.sim.model.ground_plane` is now a `wp.array` to support gradient, users should call `builder.set_ground_plane()` to create the ground | |
| - `wp.sim` capsule, cones, and cylinders are now aligned with the default USD up-axis | |
| ## [0.7.2] - 2023-02-15 | |
| - Reduce test time for vec/math types | |
| - Clean-up CUDA disabled build pipeline | |
| - Remove extension.gen.toml to make Kit packages Python version independent | |
| - Handle additional cases for array indexing inside Python | |
| ## [0.7.1] - 2023-02-14 | |
| - Disabling some slow tests for Kit | |
| - Make unit tests run on first GPU only by default | |
| ## [0.7.0] - 2023-02-13 | |
| - Add support for arbitrary length / type vector and matrices e.g.: `wp.vec(length=7, dtype=wp.float16)`, see `wp.vec()`, and `wp.mat()` | |
| - Add support for `array.flatten()`, `array.reshape()`, and `array.view()` with NumPy semantics | |
| - Add support for slicing `wp.array` types in Python | |
| - Add `wp.from_ptr()` helper to construct arrays from an existing allocation | |
| - Add support for `break` statements in ranged-for and while loops (backward pass support currently not implemented) | |
| - Add built-in mathematical constants, see `wp.pi`, `wp.e`, `wp.log2e`, etc. | |
| - Add built-in conversion between degrees and radians, see `wp.degrees()`, `wp.radians()` | |
| - Add security pop-up for Kernel Node | |
| - Improve error handling for kernel return values | |
| ## [0.6.3] - 2023-01-31 | |
| - Add DLPack utilities, see `wp.from_dlpack()`, `wp.to_dlpack()` | |
| - Add Jax utilities, see `wp.from_jax()`, `wp.to_jax()`, `wp.device_from_jax()`, `wp.device_to_jax()` | |
| - Fix for Linux Kit extensions OM-80132, OM-80133 | |
| ## [0.6.2] - 2023-01-19 | |
| - Updated `wp.from_torch()` to support more data types | |
| - Updated `wp.from_torch()` to automatically determine the target Warp data type if not specified | |
| - Updated `wp.from_torch()` to support non-contiguous tensors with arbitrary strides | |
| - Add CUTLASS integration for dense GEMMs, see `wp.matmul()` and `wp.matmul_batched()` | |
| - Add QR and Eigen decompositions for `mat33` types, see `wp.qr3()`, and `wp.eig3()` | |
| - Add default (zero) constructors for matrix types | |
| - Add a flag to suppress all output except errors and warnings (set `wp.config.quiet = True`) | |
| - Skip recompilation when Kernel Node attributes are edited | |
| - Allow optional attributes for Kernel Node | |
| - Allow disabling backward pass code-gen on a per-kernel basis, use `@wp.kernel(enable_backward=False)` | |
| - Replace Python `imp` package with `importlib` | |
| - Fix for quaternion slerp gradients (`wp.quat_slerp()`) | |
| ## [0.6.1] - 2022-12-05 | |
| - Fix for non-CUDA builds | |
| - Fix strides computation in array_t constructor, fixes a bug with accessing mesh indices through mesh.indices[] | |
| - Disable backward pass code generation for kernel node (4-6x faster compilation) | |
| - Switch to linbuild for universal Linux binaries (affects TeamCity builds only) | |
| ## [0.6.0] - 2022-11-28 | |
| - Add support for CUDA streams, see `wp.Stream`, `wp.get_stream()`, `wp.set_stream()`, `wp.synchronize_stream()`, `wp.ScopedStream` | |
| - Add support for CUDA events, see `wp.Event`, `wp.record_event()`, `wp.wait_event()`, `wp.wait_stream()`, `wp.Stream.record_event()`, `wp.Stream.wait_event()`, `wp.Stream.wait_stream()` | |
| - Add support for PyTorch stream interop, see `wp.stream_from_torch()`, `wp.stream_to_torch()` | |
| - Add support for allocating host arrays in pinned memory for asynchronous data transfers, use `wp.array(..., pinned=True)` (default is non-pinned) | |
| - Add support for direct conversions between all scalar types, e.g.: `x = wp.uint8(wp.float64(3.0))` | |
| - Add per-module option to enable fast math, use `wp.set_module_options({"fast_math": True})`, fast math is now *disabled* by default | |
| - Add support for generating CUBIN kernels instead of PTX on systems with older drivers | |
| - Add user preference options for CUDA kernel output ("ptx" or "cubin", e.g.: `wp.config.cuda_output = "ptx"` or per-module `wp.set_module_options({"cuda_output": "ptx"})`) | |
| - Add kernel node for OmniGraph | |
| - Add `wp.quat_slerp()`, `wp.quat_to_axis_angle()`, `wp.rotate_rodrigues()` and adjoints for all remaining quaternion operations | |
| - Add support for unrolling for-loops when range is a `wp.constant` | |
| - Add support for arithmetic operators on built-in vector / matrix types outside of `wp.kernel` | |
| - Add support for multiple solution variables in `wp.optim` Adam optimization | |
| - Add nested attribute support for `wp.struct` attributes | |
| - Add missing adjoint implementations for spatial math types, and document all functions with missing adjoints | |
| - Add support for retrieving NanoVDB tiles and voxel size, see `wp.Volume.get_tiles()`, and `wp.Volume.get_voxel_size()` | |
| - Add support for store operations on integer NanoVDB volumes, see `wp.volume_store_i()` | |
| - Expose `wp.Mesh` points, indices, as arrays inside kernels, see `wp.mesh_get()` | |
| - Optimizations for `wp.array` construction, 2-3x faster on average | |
| - Optimizations for URDF import | |
| - Fix various deployment issues by statically linking with all CUDA libs | |
| - Update warp.so/warp.dll to CUDA Toolkit 11.5 | |
| ## [0.5.1] - 2022-11-01 | |
| - Fix for unit tests in Kit | |
| ## [0.5.0] - 2022-10-31 | |
| - Add smoothed particle hydrodynamics (SPH) example, see `example_sph.py` | |
| - Add support for accessing `array.shape` inside kernels, e.g.: `width = arr.shape[0]` | |
| - Add dependency tracking to hot-reload modules if dependencies were modified | |
| - Add lazy acquisition of CUDA kernel contexts (save ~300 MB of GPU memory in MGPU environments) | |
| - Add BVH object, see `wp.Bvh` and `bvh_query_ray()`, `bvh_query_aabb()` functions | |
| - Add component index operations for `spatial_vector`, `spatial_matrix` types | |
| - Add `wp.lerp()` and `wp.smoothstep()` builtins | |
| - Add `wp.optim` module with implementation of the Adam optimizer for float and vector types | |
| - Add support for transient Python modules (fix for Houdini integration) | |
| - Add `wp.length_sq()`, `wp.trace()` for vector / matrix types respectively | |
| - Add missing adjoints for `wp.quat_rpy()`, `wp.determinant()` | |
| - Add `wp.atomic_min()`, `wp.atomic_max()` operators | |
| - Add vectorized version of `warp.sim.model.add_cloth_mesh()` | |
| - Add NVDB volume allocation API, see `wp.Volume.allocate()`, and `wp.Volume.allocate_by_tiles()` | |
| - Add NVDB volume write methods, see `wp.volume_store_i()`, `wp.volume_store_f()`, `wp.volume_store_v()` | |
| - Add MGPU documentation | |
| - Add example showing how to compute Jacobian of multiple environments in parallel, see `example_jacobian_ik.py` | |
| - Add `wp.Tape.zero()` support for `wp.struct` types | |
| - Make SampleBrowser an optional dependency for Kit extension | |
| - Make `wp.Mesh` object accept both 1d and 2d arrays of face vertex indices | |
| - Fix for reloading of class member kernel / function definitions using `importlib.reload()` | |
| - Fix for hashing of `wp.constants()` not invalidating kernels | |
| - Fix for reload when multiple `.ptx` versions are present | |
| - Improved error reporting during code-gen | |
| ## [0.4.3] - 2022-09-20 | |
| - Update all samples to use GPU interop path by default | |
| - Fix for arrays > 2GB in length | |
| - Add support for per-vertex USD mesh colors with warp.render class | |
| ## [0.4.2] - 2022-09-07 | |
| - Register Warp samples to the sample browser in Kit | |
| - Add NDEBUG flag to release mode kernel builds | |
| - Fix for particle solver node when using a large number of particles | |
| - Fix for broken cameras in Warp sample scenes | |
| ## [0.4.1] - 2022-08-30 | |
| - Add geometry sampling methods, see `wp.sample_unit_cube()`, `wp.sample_unit_disk()`, etc | |
| - Add `wp.lower_bound()` for searching sorted arrays | |
| - Add an option for disabling code-gen of backward pass to improve compilation times, see `wp.set_module_options({"enable_backward": False})`, True by default | |
| - Fix for using Warp from Script Editor or when module does not have a `__file__` attribute | |
| - Fix for hot reload of modules containing `wp.func()` definitions | |
| - Fix for debug flags not being set correctly on CUDA when `wp.config.mode == "debug"`, this enables bounds checking on CUDA kernels in debug mode | |
| - Fix for code gen of functions that do not return a value | |
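The kind of search `wp.lower_bound()` provides on sorted arrays can be sketched with the Python standard library. This assumes the conventional lower-bound semantics (the index of the first element that is not less than the query value); the helper `lower_bound` below is a hypothetical stand-in built on `bisect`, not Warp's built-in.

```python
from bisect import bisect_left


def lower_bound(sorted_values, value):
    """Index of the first element >= value, or len(sorted_values) if none."""
    return bisect_left(sorted_values, value)


sorted_values = [1, 3, 3, 7, 9]

first_three = lower_bound(sorted_values, 3)    # first of the two 3s
between = lower_bound(sorted_values, 4)        # position of 7
past_the_end = lower_bound(sorted_values, 10)  # no element is >= 10
```

The input must already be sorted; on unsorted data a binary search of this kind gives no meaningful result.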
| ## [0.4.0] - 2022-08-09 | |
| - Fix for FP16 conversions on GPUs without hardware support | |
| - Fix for `runtime = None` errors when reloading the Warp module | |
| - Fix for PTX architecture version when running with older drivers, see `wp.config.ptx_target_arch` | |
| - Fix for USD imports from `__init__.py`, defer them to individual functions that need them | |
| - Fix for robustness issues with sign determination for `wp.mesh_query_point()` | |
| - Fix for `wp.HashGrid` memory leak when creating/destroying grids | |
| - Add CUDA version checks for toolkit and driver | |
| - Add support for cross-module `@wp.struct` references | |
| - Support running even if CUDA initialization failed, use `wp.is_cuda_available()` to check availability | |
| - Statically linking with the CUDA runtime library to avoid deployment issues | |
| ### Breaking Changes | |
| - Removed `wp.runtime` reference from the top-level module, as it should be considered private | |
| ## [0.3.2] - 2022-07-19 | |
| - Remove Torch import from `__init__.py`, defer import to `wp.from_torch()`, `wp.to_torch()` | |
| ## [0.3.1] - 2022-07-12 | |
| - Fix for marching cubes reallocation after initialization | |
| - Add support for closest-point tests between line segments, see `wp.closest_point_edge_edge()` builtin | |
| - Add support for per-triangle elasticity coefficients in simulation, see `wp.sim.ModelBuilder.add_cloth_mesh()` | |
| - Add support for specifying default device, see `wp.set_device()`, `wp.get_device()`, `wp.ScopedDevice` | |
| - Add support for multiple GPUs (e.g., `"cuda:0"`, `"cuda:1"`), see `wp.get_cuda_devices()`, `wp.get_cuda_device_count()`, `wp.get_cuda_device()` | |
| - Add support for explicitly targeting the current CUDA context using device alias `"cuda"` | |
| - Add support for using arbitrary external CUDA contexts, see `wp.map_cuda_device()`, `wp.unmap_cuda_device()` | |
| - Add PyTorch device aliasing functions, see `wp.device_from_torch()`, `wp.device_to_torch()` | |
| ### Breaking Changes | |
| - A CUDA device is used by default, if available (aligned with `wp.get_preferred_device()`) | |
| - `wp.ScopedCudaGuard` is deprecated, use `wp.ScopedDevice` instead | |
| - `wp.synchronize()` now synchronizes all devices; for finer-grained control, use `wp.synchronize_device()` | |
| - Device alias `"cuda"` now refers to the current CUDA context, rather than a specific device like `"cuda:0"` or `"cuda:1"` | |
| ## [0.3.0] - 2022-07-08 | |
| - Add support for FP16 storage type, see `wp.float16` | |
| - Add support for per-dimension byte strides, see `wp.array.strides` | |
| - Add support for passing Python classes as kernel arguments, see `@wp.struct` decorator | |
| - Add additional bounds checks for builtin matrix types | |
| - Add additional floating point checks, see `wp.config.verify_fp` | |
| - Add interleaved user source with generated code to aid debugging | |
| - Add generalized GPU marching cubes implementation, see `wp.MarchingCubes` class | |
| - Add additional scalar*matrix vector operators | |
| - Add support for retrieving a single row from builtin types, e.g.: `r = m33[i]` | |
| - Add `wp.log2()` and `wp.log10()` builtins | |
| - Add support for quickly instancing `wp.sim.ModelBuilder` objects to improve env. creation performance for RL | |
| - Remove custom CUB version and improve compatibility with CUDA 11.7 | |
| - Fix to preserve external user-gradients when calling `wp.Tape.zero()` | |
| - Fix to only allocate gradient of a Torch tensor if `requires_grad=True` | |
| - Fix for missing `wp.mat22` constructor adjoint | |
| - Fix for ray-cast precision in edge case on GPU (watertightness issue) | |
| - Fix for kernel hot-reload when definition changes | |
| - Fix for NVCC warnings on Linux | |
| - Fix for generated function names when kernels are defined as class functions | |
| - Fix for reload of generated CPU kernel code on Linux | |
| - Fix for example scripts to output USD at 60 timecodes per-second (better Kit compatibility) | |
| ## [0.2.3] - 2022-06-13 | |
| - Fix for incorrect 4d array bounds checking | |
| - Fix for `wp.constant` changes not updating module hash | |
| - Fix for stale CUDA kernel cache when CPU kernels launched first | |
| - Array gradients are now allocated along with the arrays and accessible as `wp.array.grad`, users should take care to always call `wp.Tape.zero()` to clear gradients between different invocations of `wp.Tape.backward()` | |
| - Added `wp.array.fill_()` to set all entries to a scalar value (4-byte values only currently) | |
| ### Breaking Changes | |
| - Tape `capture` option has been removed, users can now capture tapes inside existing CUDA graphs (e.g.: inside Torch) | |
| - Scalar loss arrays should now explicitly set `requires_grad=True` at creation time | |
| ## [0.2.2] - 2022-05-30 | |
| - Fix for `from import *` inside Warp initialization | |
| - Fix for body space velocity when using deforming Mesh objects with scale | |
| - Fix for noise gradient discontinuities affecting `wp.curlnoise()` | |
| - Fix for `wp.from_torch()` to correctly preserve shape | |
| - Fix for URDF parser incorrectly passing density to scale parameter | |
| - Optimizations for startup time from 3s -> 0.3s | |
| - Add support for custom kernel cache location, Warp will now store generated binaries in the user's application directory | |
| - Add support for cross-module function references, e.g.: call another module's `@wp.func` functions | |
| - Add support for overloading `@wp.func` functions based on argument type | |
| - Add support for calling built-in functions directly from Python interpreter outside kernels (experimental) | |
| - Add support for auto-complete and docstring lookup for builtins in IDEs like VSCode, PyCharm, etc | |
| - Add support for doing partial array copies, see `wp.copy()` for details | |
| - Add support for accessing mesh data directly in kernels, see `wp.mesh_get_point()`, `wp.mesh_get_index()`, `wp.mesh_eval_face_normal()` | |
| - Change to only compile for targets where kernel is launched (e.g.: will not compile CPU unless explicitly requested) | |
| ### Breaking Changes | |
| - Builtin methods such as `wp.quat_identity()` now call the Warp native implementation directly and will return a `wp.quat` object instead of a NumPy array | |
| - NumPy implementations of many builtin methods have been moved to `warp.utils` and will be deprecated | |
| - Local `@wp.func` functions should not be namespaced when called, e.g.: previously `wp.myfunc()` would work even if `myfunc()` was not a builtin | |
| - Removed `wp.rpy2quat()`, please use `wp.quat_rpy()` instead | |
| ## [0.2.1] - 2022-05-11 | |
| - Fix for unit tests in Kit | |
| ## [0.2.0] - 2022-05-02 | |
| ### Warp Core | |
| - Fix for unrolling loops with negative bounds | |
| - Fix for unresolved symbol `hash_grid_build_device()` not found when lib is compiled without CUDA support | |
| - Fix for failure to load nvrtc-builtins64_113.dll when user has a newer CUDA toolkit installed on their machine | |
| - Fix for conversion of Torch tensors to `wp.array` objects with a vector dtype (incorrect row count) | |
| - Fix for `warp.dll` not found on some Windows installations | |
| - Fix for macOS builds on Clang 13.x | |
| - Fix for step-through debugging of kernels on Linux | |
| - Add argument type checking for user defined `@wp.func` functions | |
| - Add support for custom iterable types, supports ranges, hash grid, and mesh query objects | |
| - Add support for multi-dimensional arrays, for example use `x = array[i,j,k]` syntax to address a 3-dimensional array | |
| - Add support for multi-dimensional kernel launches, use `launch(kernel, dim=(i,j,k), ...)` and `i,j,k = wp.tid()` to obtain thread indices | |
| - Add support for bounds-checking array memory accesses in debug mode, use `wp.config.mode = "debug"` to enable | |
| - Add support for differentiating through dynamic and nested for-loops | |
| - Add support for evaluating MLP neural network layers inside kernels with custom activation functions, see `wp.mlp()` | |
| - Add additional NVDB sampling methods and adjoints, see `wp.volume_sample_i()`, `wp.volume_sample_f()`, and `wp.volume_sample_vec()` | |
| - Add support for loading zlib compressed NVDB volumes, see `wp.Volume.load_from_nvdb()` | |
| - Add support for triangle intersection testing, see `wp.intersect_tri_tri()` | |
| - Add support for NVTX profile zones in `wp.ScopedTimer()` | |
| - Add support for additional transform and quaternion math operations, see `wp.inverse()`, `wp.quat_to_matrix()`, `wp.quat_from_matrix()` | |
| - Add fast math (`--fast-math`) to kernel compilation by default | |
| - Add `warp.torch` import by default (if PyTorch is installed) | |
| ### Warp Kit | |
| - Add Kit menu for browsing Warp documentation and example scenes under 'Window->Warp' | |
| - Fix for OgnParticleSolver.py example when collider is coming from Read Prim into Bundle node | |
| ### Warp Sim | |
| - Fix for joint attachment forces | |
| - Fix for URDF importer and floating base support | |
| - Add examples showing how to use differentiable forward kinematics to solve inverse kinematics | |
| - Add examples for URDF cartpole and quadruped simulation | |
| ### Breaking Changes | |
| - `wp.volume_sample_world()` is now replaced by `wp.volume_sample_f/i/vec()` which operate in index (local) space. Users should use `wp.volume_world_to_index()` to transform points from world space to index space before sampling. | |
| - `wp.mlp()` expects multi-dimensional arrays instead of one-dimensional arrays for inference; all other semantics remain the same as in earlier versions of this API. | |
| - `wp.array.length` member has been removed, please use `wp.array.shape` to access array dimensions, or use `wp.array.size` to get total element count | |
| - Marking `dense_gemm()`, `dense_chol()`, etc methods as experimental until we revisit them | |
| ## [0.1.25] - 2022-03-20 | |
| - Add support for class methods to be Warp kernels | |
| - Add HashGrid reserve() so it can be used with CUDA graphs | |
| - Add support for CUDA graph capture of tape forward/backward passes | |
| - Add support for Python 3.8.x and 3.9.x | |
| - Add hyperbolic trigonometric functions, see wp.tanh(), wp.sinh(), wp.cosh() | |
| - Add support for floored division on integer types | |
| - Move tests into core library so they can be run in Kit environment | |
| ## [0.1.24] - 2022-03-03 | |
| ### Warp Core | |
| - Add NanoVDB support, see wp.volume_sample*() methods | |
| - Add support for reading compile-time constants in kernels, see wp.constant() | |
| - Add support for __cuda_array_interface__ protocol for zero-copy interop with PyTorch, see wp.torch.to_torch() | |
| - Add support for additional numeric types, i8, u8, i16, u16, etc | |
| - Add better checks for device strings during allocation / launch | |
| - Add support for sampling random numbers with a normal distribution, see wp.randn() | |
| - Upgrade to CUDA 11.3 | |
| - Update example scenes to Kit 103.1 | |
| - Deduce array dtype from np.array when one is not provided | |
| - Fix for ranged for loops with negative step sizes | |
| - Fix for 3d and 4d spherical gradient distributions | |
| ## [0.1.23] - 2022-02-17 | |
| ### Warp Core | |
| - Fix for generated code folder being removed during Showroom installation | |
| - Fix for macOS support | |
| - Fix for dynamic for-loop code gen edge case | |
| - Add procedural noise primitives, see noise(), pnoise(), curlnoise() | |
| - Move simulation helpers out of test into warp.sim module | |
| ## [0.1.22] - 2022-02-14 | |
| ### Warp Core | |
| - Fix for .so reloading on Linux | |
| - Fix for while loop code-gen in some edge cases | |
| - Add rounding functions round(), rint(), trunc(), floor(), ceil() | |
| - Add support for printing strings and formatted strings from kernels | |
| - Add MSVC compiler version detection and require a minimum version | |
| ### Warp Sim | |
| - Add support for universal and compound joint types | |
| ## [0.1.21] - 2022-01-19 | |
| ### Warp Core | |
| - Fix for exception on shutdown in empty wp.array objects | |
| - Fix for hot reload of CPU kernels in Kit | |
| - Add hash grid primitive for point-based spatial queries, see hash_grid_query(), hash_grid_query_next() | |
| - Add new PRNG methods using PCG-based generators, see rand_init(), randf(), randi() | |
| - Add support for AABB mesh queries, see mesh_query_aabb(), mesh_query_aabb_next() | |
| - Add support for all Python range() loop variants | |
| - Add builtin vec2 type and additional math operators, pow(), tan(), atan(), atan2() | |
| - Remove dependency on CUDA driver library at build time | |
| - Remove unused NVRTC binary dependencies (50mb smaller Linux distribution) | |
| ### Warp Sim | |
| - Bundle import of multiple shapes for simulation nodes | |
| - New OgnParticleVolume node for sampling shapes -> particles | |
| - New OgnParticleSolver node for DEM style granular materials | |
| ## [0.1.20] - 2021-11-02 | |
| - Updates to the ripple solver for GTC (support for multiple colliders, buoyancy, etc) | |
| ## [0.1.19] - 2021-10-15 | |
| - Publish from 2021.3 to avoid omni.graph database incompatibilities | |
| ## [0.1.18] - 2021-10-08 | |
| - Enable Linux support (tested on 20.04) | |
| ## [0.1.17] - 2021-09-30 | |
| - Fix for 3x3 SVD adjoint | |
| - Fix for A6000 GPU (bump compute model to sm_52 minimum) | |
| - Fix for .dll unload on rebuild | |
| - Fix for possible array destruction warnings on shutdown | |
| - Rename spatial_transform -> transform | |
| - Documentation update | |
| ## [0.1.16] - 2021-09-06 | |
| - Fix for case where simple assignments (a = b) incorrectly generated reference rather than value copy | |
| - Handle passing zero-length (empty) arrays to kernels | |
| ## [0.1.15] - 2021-09-03 | |
| - Add additional math library functions (asin, etc) | |
| - Add builtin 3x3 SVD support | |
| - Add support for named constants (True, False, None) | |
| - Add support for if/else statements (differentiable) | |
| - Add custom memset kernel to avoid CPU overhead of cudaMemset() | |
| - Add rigid body joint model to warp.sim (based on Brax) | |
| - Add Linux, macOS support in core library | |
| - Fix for incorrectly treating pure assignment as reference instead of value copy | |
| - Remove the need to transfer arrays to the CPU before NumPy conversion (now done implicitly) | |
| - Update the example OgnRipple wave equation solver to use bundles | |
| ## [0.1.14] - 2021-08-09 | |
| - Fix for out-of-bounds memory access in CUDA BVH | |
| - Better error checking after kernel launches (use warp.config.verify_cuda=True) | |
| - Fix for vec3 normalize adjoint code | |
| ## [0.1.13] - 2021-07-29 | |
| - Remove OgnShrinkWrap.py test node | |
| ## [0.1.12] - 2021-07-29 | |
| - Switch to Woop et al.'s watertight ray-tri intersection test | |
| - Disable --fast-math in CUDA compilation step for improved precision | |
| ## [0.1.11] - 2021-07-28 | |
| - Fix for mesh_query_ray() returning incorrect t-value | |
| ## [0.1.10] - 2021-07-28 | |
| - Fix for OV extension fwatcher filters to avoid hot-reload loop due to OGN regeneration | |
| ## [0.1.9] - 2021-07-21 | |
| - Fix for loading sibling DLL paths | |
| - Better type checking for built-in function arguments | |
| - Added runtime docs, can now list all builtins using wp.print_builtins() | |
| ## [0.1.8] - 2021-07-14 | |
| - Fix for hot-reload of CUDA kernels | |
| - Add Tape object for replaying differentiable kernels | |
| - Add helpers for Torch interop (convert torch.Tensor to wp.Array) | |
| ## [0.1.7] - 2021-07-05 | |
| - Switch to NVRTC for CUDA runtime | |
| - Allow running without host compiler | |
| - Disable asserts in kernel release mode (small perf. improvement) | |
| ## [0.1.6] - 2021-06-14 | |
| - Look for CUDA toolchain in target-deps | |
| ## [0.1.5] - 2021-06-14 | |
| - Rename OgLang -> Warp | |
| - Improve CUDA environment error checking | |
| - Clean-up some logging, add verbose mode (warp.config.verbose) | |
| ## [0.1.4] - 2021-06-10 | |
| - Add support for mesh raycast | |
| ## [0.1.3] - 2021-06-09 | |
| - Add support for unary negation operator | |
| - Add support for mutating variables during dynamic loops (non-differentiable) | |
| - Add support for in-place operators | |
| - Improve kernel cache startup times (avoids adjointing before cache check) | |
| - Update README.md with requirements / examples | |
| ## [0.1.2] - 2021-06-03 | |
| - Add support for querying mesh velocities | |
| - Add CUDA graph support, see warp.capture_begin(), warp.capture_end(), warp.capture_launch() | |
| - Add explicit initialization phase, warp.init() | |
| - Add variational Euler solver (sim) | |
| - Add contact caching, switch to nonlinear friction model (sim) | |
| - Fix for Linux/macOS support | |
| ## [0.1.1] - 2021-05-18 | |
| - Fix bug with conflicting CUDA contexts | |
| ## [0.1.0] - 2021-05-17 | |
| - Initial publish for alpha testing | |