// Copyright 2025 The Go Authors. All rights reserved.
// Use of this source code is governed by a BSD-style
// license that can be found in the LICENSE file.

//go:build goexperiment.simd

// Package archsimd provides access to architecture-specific SIMD operations.
//
// This is a low-level package that exposes hardware-specific functionality.
// It currently supports AMD64.
//
// This package is experimental and is not subject to the Go 1 compatibility promise.
// It exists only when building with the GOEXPERIMENT=simd environment variable set.
//
// # Vector types and operations
//
// Vector types are defined as structs, such as Int8x16 and Float64x8, corresponding
// to the hardware's vector registers. On AMD64, 128-, 256-, and 512-bit vectors are
// supported.
//
// Mask types are defined similarly, such as Mask8x16, and are represented as
// opaque types, handling the differences in the underlying representations.
// A mask can be converted to/from the corresponding integer vector type, or
// to/from a bitmask.
//
// Operations are mostly defined as methods on the vector types. Most of them
// are compiler intrinsics and correspond directly to hardware instructions.
//
// Common operations include:
//   - Load/Store: Load a vector from memory or store a vector to memory.
//   - Arithmetic: Add, Sub, Mul, etc.
//   - Bitwise: And, Or, Xor, etc.
//   - Comparison: Equal, Greater, etc., which produce a mask.
//   - Conversion: Convert between different vector types.
//   - Field selection and rearrangement: GetElem, Permute, etc.
//   - Masking: Masked, Merge.
//
// The compiler recognizes certain patterns of operations and may optimize
// them to more efficient instructions. For example, on AVX512, an Add operation
// followed by Masked may be optimized to a masked add instruction.
// For this reason, not every hardware instruction has a corresponding API.
//
// # CPU feature checks
//
// The package provides global variables to check for CPU features available
// at runtime. For example, on AMD64, the [X86] variable provides methods to
// check for AVX2, AVX512, etc.
// It is recommended to check for CPU features before using the corresponding
// vector operations.
//
// # Notes
//
//   - This package is not portable, as the available types and operations depend
//     on the target architecture. It is not recommended to expose the SIMD types
//     defined in this package in public APIs.
//   - For performance reasons, it is recommended to use the vector types directly
//     as values. It is not recommended to take the address of a vector value,
//     allocate it on the heap, or put it in an aggregate type.
package archsimd

// BUG(cherry): Using a vector type as a type parameter may not work.

// BUG(cherry): Using reflect.Value.Call to call a vector function or method may not work.