Spaces:
Runtime error
Runtime error
BioMistral_gradio
/
llama-cpp-python
/vendor
/llama.cpp
/kompute
/docs
/overview
/async-parallel.rst
| Asynchronous and Parallel Operations | |
| ============= | |
| In GPU computing it is possible to have multiple levels of asynchronous and parallel processing of GPU tasks. | |
| It is important to understand the conceptual distinctions of the diffent terminology when using each of these components. | |
| In this section we will cover the following points: | |
| * Asynchronous operation submission | |
| * Parallel processing of operations | |
| You can also find the published `blog post on the topic using Kompute <https://towardsdatascience.com/parallelizing-heavy-gpu-workloads-via-multi-queue-operations-50a38b15a1dc>`_, which covers the points discussed in this section further. | |
| Below is the architecture we'll be covering further in the parallel operations section through command submission across multiple family queues. | |
| .. image:: ../images/queue-allocation.jpg | |
| :width: 100% | |
| Asynchronous operation submission | |
| --------------------------------- | |
| As the name implies, this refers to the asynchronous submission of operations. This means that operations can be submitted to the GPU, and the C++ / host CPU can continue performing tasks, until when the user desires to run `await` to wait until the operation finishes. | |
| This basically provides further granularity on vk::Fences, which is its means to enable the CPU host to know when GPU commands have finished executing. | |
| It is important that submitting tasks asynchronously, does not mean that these will be executed in parallel. Parallel execution of operations will be covered in the following section. | |
| Asynchronous operation submission can be achieved through the :class:`kp::Manager`, or directly through the :class:`kp::Sequence`. Below is an example using the Kompute manager. | |
| Conceptual Overview | |
| ^^^^^^^^^^^^^^^^^^^^^ | |
| Asynchronous job submission is done using `evalOpAsync` and `evalOpAwait` functions. | |
| For simplicity the `evalOpAsyncDefault` and `evalOpAwaitDefault` functions are provided, which can be used similar to the synchronous counterparts (which basically use the default named sequence). | |
| One important thing to bare in mind when using asynchronous submissions, is that you should make sure that any overlapping asynchronous functions are run in separate sequences. | |
| The reason why this is important is that the Await function not only waits for the fence, but also runs the `postEval` functions across all operations, which is required for several operations. | |
| Async and Parallel Examples | |
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | |
| We have added a set of examples for asynchronous and parallel processing examples in the `Advanced Examples documentation page <advanced-examples.rst>`_ | |