Kernels:

kernels-community
/

triton-layer-norm

Trusted publisher

Kernel card Files Files and versions

triton-layer-norm / README.md

drbh

Migrated from kernels-community/triton-layer-norm

f17bf36 unverified about 2 months ago

|

history blame contribute delete

3.3 kB

	---
	license: bsd-3-clause
	tags:
	- kernels
	---
	# Triton layer normalization kernels.

	This kernel implements layers normalization using Triton. This kernel is from
	the [flash-attention](https://github.com/Dao-AILab/flash-attention) project.

	## Functions

	### Function `layer_norm`

	`(x: torch.Tensor, weight: torch.Tensor, bias: torch.Tensor, residual: Optional[torch.Tensor] = None, x1: Optional[torch.Tensor] = None, weight1: Optional[torch.Tensor] = None, bias1: Optional[torch.Tensor] = None, eps: float = 1e-06, dropout_p: float = 0.0, rowscale=None, prenorm: bool = False, residual_in_fp32: bool = False, is_rms_norm: bool = False, return_dropout_mask: bool = False, out: Optional[torch.Tensor] = None, residual_out: Optional[torch.Tensor] = None)`

	Apply layer normalization to the input tensor with Triton acceleration.

	### Parameters

	- x (torch.Tensor) --
	Input tensor to normalize.
	- weight (torch.Tensor) --
	Scale parameter for normalization.
	- bias (torch.Tensor) --
	Shift parameter for normalization.
	- residual (torch.Tensor, optional) --
	Optional residual tensor to add to the input before normalization.
	- x1 (torch.Tensor, optional) --
	Optional second input tensor to combine with x. When provided, the function
	first adds x1 to x and then applies normalization.
	- weight1 (torch.Tensor, optional) --
	Scale parameter for the second normalization.
	- bias1 (torch.Tensor, optional) --
	Shift parameter for the second normalization.
	- eps (float, optional, defaults to 1e-6) --
	Small constant added for numerical stability in normalization.
	- dropout_p (float, optional, defaults to 0.0) --
	Dropout probability. If greater than 0, applies dropout to the input before
	normalization and residual addition.
	- rowscale (torch.Tensor, optional) --
	Optional scaling factor applied to each row of the input tensor.
	Not compatible with the use of x1.
	- prenorm (bool, optional, defaults to False) --
	If True, returns both the normalized output and the unnormalized input+residual.
	- residual_in_fp32 (bool, optional, defaults to False) --
	If True, performs the residual connection in FP32 precision.
	- is_rms_norm (bool, optional, defaults to False) --
	If True, uses RMS normalization instead of layer normalization.
	- return_dropout_mask (bool, optional, defaults to False) --
	If True, returns the dropout mask used for the computation.
	- out (torch.Tensor, optional) --
	Output tensor for the normalized result. If None, a new tensor is allocated.
	- residual_out (torch.Tensor, optional) --
	Output tensor for the residual result when using prenorm. If None, a new tensor
	is allocated when needed.

	### Returns

	Type: torch.Tensor or tuple of torch.Tensor

	- The normalized input.
	- The second normalization of the input if weight1 is provided.
	- The residual tensor if prenorm is set.
	- The dropout mask if return_dropout_mask is set.
	- The dropout mask for x1 if x1 is provided and return_dropout_mask is set.

	## Layers

	### Class `LlamaRMSNorm`

	No documentation available.

	#### Methods

	##### Method `forward`

	`(self, hidden_states: torch.Tensor) -> torch.Tensor`

	No documentation available.