tmp
/
pip-install-ghxuqwgs
/numpy_78e94bf2b6094bf9a1f3d92042f9bf46
/doc
/source
/reference
/c-api.generalized-ufuncs.rst
| ================================== | |
| Generalized Universal Function API | |
| ================================== | |
| There is a general need for looping over not only functions on scalars | |
| but also over functions on vectors (or arrays). | |
| This concept is realized in Numpy by generalizing the universal functions | |
| (ufuncs). In regular ufuncs, the elementary function is limited to | |
| element-by-element operations, whereas the generalized version (gufuncs) | |
| supports "sub-array" by "sub-array" operations. The Perl vector library PDL | |
| provides a similar functionality and its terms are re-used in the following. | |
| Each generalized ufunc has information associated with it that states | |
| what the "core" dimensionality of the inputs is, as well as the | |
| corresponding dimensionality of the outputs (the element-wise ufuncs | |
| have zero core dimensions). The list of the core dimensions for all | |
| arguments is called the "signature" of a ufunc. For example, the | |
| ufunc numpy.add has signature ``(),()->()`` defining two scalar inputs | |
| and one scalar output. | |
| Another example is the function ``inner1d(a,b)`` with a signature of | |
| ``(i),(i)->()``. This applies the inner product along the last axis of | |
| each input, but keeps the remaining indices intact. | |
| For example, where ``a`` is of shape ``(3,5,N)`` | |
| and ``b`` is of shape ``(5,N)``, this will return an output of shape ``(3,5)``. | |
| The underlying elementary function is called ``3 * 5`` times. In the | |
| signature, we specify one core dimension ``(i)`` for each input and zero core | |
| dimensions ``()`` for the output, since it takes two 1-d arrays and | |
| returns a scalar. By using the same name ``i``, we specify that the two | |
| corresponding dimensions should be of the same size (or one of them is | |
| of size 1 and will be broadcasted). | |
| The dimensions beyond the core dimensions are called "loop" dimensions. In | |
| the above example, this corresponds to ``(3,5)``. | |
| The usual numpy "broadcasting" rules apply, where the signature | |
| determines how the dimensions of each input/output object are split | |
| into core and loop dimensions: | |
| #. While an input array has a smaller dimensionality than the corresponding | |
| number of core dimensions, 1's are pre-pended to its shape. | |
| #. The core dimensions are removed from all inputs and the remaining | |
| dimensions are broadcasted; defining the loop dimensions. | |
| #. The output is given by the loop dimensions plus the output core dimensions. | |
| Definitions | |
| ----------- | |
| Elementary Function | |
| Each ufunc consists of an elementary function that performs the | |
| most basic operation on the smallest portion of array arguments | |
| (e.g. adding two numbers is the most basic operation in adding two | |
| arrays). The ufunc applies the elementary function multiple times | |
| on different parts of the arrays. The input/output of elementary | |
| functions can be vectors; e.g., the elementary function of inner1d | |
| takes two vectors as input. | |
| Signature | |
| A signature is a string describing the input/output dimensions of | |
| the elementary function of a ufunc. See section below for more | |
| details. | |
| Core Dimension | |
| The dimensionality of each input/output of an elementary function | |
| is defined by its core dimensions (zero core dimensions correspond | |
| to a scalar input/output). The core dimensions are mapped to the | |
| last dimensions of the input/output arrays. | |
| Dimension Name | |
| A dimension name represents a core dimension in the signature. | |
| Different dimensions may share a name, indicating that they are of | |
| the same size (or are broadcastable). | |
| Dimension Index | |
| A dimension index is an integer representing a dimension name. It | |
| enumerates the dimension names according to the order of the first | |
| occurrence of each name in the signature. | |
| Details of Signature | |
| -------------------- | |
| The signature defines "core" dimensionality of input and output | |
| variables, and thereby also defines the contraction of the | |
| dimensions. The signature is represented by a string of the | |
| following format: | |
| * Core dimensions of each input or output array are represented by a | |
| list of dimension names in parentheses, ``(i_1,...,i_N)``; a scalar | |
| input/output is denoted by ``()``. Instead of ``i_1``, ``i_2``, | |
| etc, one can use any valid Python variable name. | |
| * Dimension lists for different arguments are separated by ``","``. | |
| Input/output arguments are separated by ``"->"``. | |
| * If one uses the same dimension name in multiple locations, this | |
| enforces the same size (or broadcastable size) of the corresponding | |
| dimensions. | |
| The formal syntax of signatures is as follows:: | |
| <Signature> ::= <Input arguments> "->" <Output arguments> | |
| <Input arguments> ::= <Argument list> | |
| <Output arguments> ::= <Argument list> | |
| <Argument list> ::= nil | <Argument> | <Argument> "," <Argument list> | |
| <Argument> ::= "(" <Core dimension list> ")" | |
| <Core dimension list> ::= nil | <Dimension name> | | |
| <Dimension name> "," <Core dimension list> | |
| <Dimension name> ::= valid Python variable name | |
| Notes: | |
| #. All quotes are for clarity. | |
| #. Core dimensions that share the same name must be broadcastable, as | |
| the two ``i`` in our example above. Each dimension name typically | |
| corresponding to one level of looping in the elementary function's | |
| implementation. | |
| #. White spaces are ignored. | |
| Here are some examples of signatures: | |
| +-------------+------------------------+-----------------------------------+ | |
| | add | ``(),()->()`` | | | |
| +-------------+------------------------+-----------------------------------+ | |
| | inner1d | ``(i),(i)->()`` | | | |
| +-------------+------------------------+-----------------------------------+ | |
| | sum1d | ``(i)->()`` | | | |
| +-------------+------------------------+-----------------------------------+ | |
| | dot2d | ``(m,n),(n,p)->(m,p)`` | matrix multiplication | | |
| +-------------+------------------------+-----------------------------------+ | |
| | outer_inner | ``(i,t),(j,t)->(i,j)`` | inner over the last dimension, | | |
| | | | outer over the second to last, | | |
| | | | and loop/broadcast over the rest. | | |
| +-------------+------------------------+-----------------------------------+ | |
| C-API for implementing Elementary Functions | |
| ------------------------------------------- | |
| The current interface remains unchanged, and ``PyUFunc_FromFuncAndData`` | |
| can still be used to implement (specialized) ufuncs, consisting of | |
| scalar elementary functions. | |
| One can use ``PyUFunc_FromFuncAndDataAndSignature`` to declare a more | |
| general ufunc. The argument list is the same as | |
| ``PyUFunc_FromFuncAndData``, with an additional argument specifying the | |
| signature as C string. | |
| Furthermore, the callback function is of the same type as before, | |
| ``void (*foo)(char **args, intp *dimensions, intp *steps, void *func)``. | |
| When invoked, ``args`` is a list of length ``nargs`` containing | |
| the data of all input/output arguments. For a scalar elementary | |
| function, ``steps`` is also of length ``nargs``, denoting the strides used | |
| for the arguments. ``dimensions`` is a pointer to a single integer | |
| defining the size of the axis to be looped over. | |
| For a non-trivial signature, ``dimensions`` will also contain the sizes | |
| of the core dimensions as well, starting at the second entry. Only | |
| one size is provided for each unique dimension name and the sizes are | |
| given according to the first occurrence of a dimension name in the | |
| signature. | |
| The first ``nargs`` elements of ``steps`` remain the same as for scalar | |
| ufuncs. The following elements contain the strides of all core | |
| dimensions for all arguments in order. | |
| For example, consider a ufunc with signature ``(i,j),(i)->()``. In | |
| this case, ``args`` will contain three pointers to the data of the | |
| input/output arrays ``a``, ``b``, ``c``. Furthermore, ``dimensions`` will be | |
| ``[N, I, J]`` to define the size of ``N`` of the loop and the sizes ``I`` and ``J`` | |
| for the core dimensions ``i`` and ``j``. Finally, ``steps`` will be | |
| ``[a_N, b_N, c_N, a_i, a_j, b_i]``, containing all necessary strides. | |