Enumerations | |
enum class | contraction_type { local , no_local } |
enum class | data_source { global_mem , local_mem , private_mem } |
enum class | reduction_dim { inner_most , outer_most } |
enum class | scan_step { first , second } |
Functions | |
template<bool is_internal> | |
bool | check_boundary (bool) |
check_boundary: checks the edge condition for non-internal blocks. | |
template<> | |
bool | check_boundary< false > (bool cond) |
check_boundary: specialization of check_boundary for non-internal blocks. | |
template<bool PacketLoad, bool , bool IsRhs, typename PacketType , typename TensorMapper , typename StorageIndex > | |
static std::enable_if_t<!PacketLoad, PacketType > | read (const TensorMapper &tensorMapper, const StorageIndex &NCIndex, const StorageIndex &CIndex, const StorageIndex &) |
read: special overload of the read function, used when the read access is not vectorized. | |
template<bool PacketLoad, bool is_coalesced_layout, bool , typename PacketType , typename TensorMapper , typename StorageIndex > | |
static std::enable_if_t< PacketLoad, PacketType > | read (const TensorMapper &tensorMapper, const StorageIndex &NCIndex, const StorageIndex &CIndex, const StorageIndex &ld) |
read: a template function used for loading data from global memory. It guarantees coalesced and vectorized loads whenever possible. | |
template<data_source dt, typename PacketType , typename DataScalar > | |
static std::enable_if_t< Eigen::internal::unpacket_traits< PacketType >::size !=1 &&dt==data_source::global_mem, void > | write (PacketType &packet_data, DataScalar *ptr) |
write: overload for storing data to global memory when vectorization is enabled. It guarantees coalesced and vectorized stores whenever possible. | |
template<data_source dt, typename PacketType , typename DataScalar > | |
static std::enable_if_t< Eigen::internal::unpacket_traits< PacketType >::size==1 &&dt==data_source::global_mem, void > | write (PacketType &packet_data, DataScalar *ptr) |
write: overload for storing data to global memory when vectorization is disabled. | |
template<typename StorageIndex , StorageIndex ld, data_source dt, typename PacketType , typename DataScalar > | |
static std::enable_if_t< dt !=data_source::global_mem, void > | write (PacketType &packet_data, DataScalar ptr) |
write: a template function used for storing data to local memory. It guarantees coalesced and vectorized stores whenever possible. | |
enum class data_source | strong |
Enumerator | |
---|---|
global_mem | |
local_mem | |
private_mem |
Definition at line 132 of file TensorContractionSycl.h.
enum class scan_step | strong |
Enumerator | |
---|---|
first | |
second |
Definition at line 82 of file TensorScanSycl.h.
template<bool is_internal> bool check_boundary (bool) | inline |
check_boundary: checks the edge condition for non-internal blocks.
is_internal | determines if the block is internal |
Definition at line 279 of file TensorContractionSycl.h.
template<> bool check_boundary< false > (bool cond) | inline |
check_boundary: specialization of check_boundary for non-internal blocks.
cond | true when the data is in range; otherwise false |
Definition at line 289 of file TensorContractionSycl.h.
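The relationship between the primary template and its `false` specialization can be illustrated with a minimal standalone sketch. It assumes that the primary template ignores its argument and always returns `true` (internal blocks need no range check), while the specialization returns the condition unchanged. The helper `guarded_load` is hypothetical and exists only to show a call site.

```cpp
#include <cstddef>
#include <vector>

// Primary template: an internal block lies fully inside the tensor,
// so the condition is ignored and the access is always allowed.
// (Assumption: the primary template returns true unconditionally.)
template <bool is_internal>
inline bool check_boundary(bool) {
  return true;
}

// Specialization for non-internal (edge) blocks: the caller's range
// check decides whether the access is allowed.
template <>
inline bool check_boundary<false>(bool cond) {
  return cond;
}

// Hypothetical helper: returns element i, or 0 when an edge block
// would read past the end of the buffer. With is_internal == true the
// caller must already guarantee that i is in range.
template <bool is_internal>
float guarded_load(const std::vector<float>& data, std::size_t i) {
  return check_boundary<is_internal>(i < data.size()) ? data[i] : 0.0f;
}
```

Because `is_internal` is a compile-time constant, the range comparison can be optimized away entirely for internal blocks, which is presumably the reason for using a template parameter rather than a run-time flag.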
template<bool PacketLoad, bool , bool IsRhs, typename PacketType , typename TensorMapper , typename StorageIndex > static std::enable_if_t<!PacketLoad, PacketType > read (const TensorMapper &tensorMapper, const StorageIndex &NCIndex, const StorageIndex &CIndex, const StorageIndex &) | inline, static |
read: special overload of the read function, used when the read access is not vectorized.
PacketLoad | determines whether each element of this tensor block should be loaded in packet mode |
is_coalesced_layout | determines whether the tensor data in memory can be accessed in a coalesced and vectorized way when possible. Coalesced memory access is a key factor in kernel performance. When a tensor is 2D and the contracting dimension is 1, it is always possible to access the tensor data in a coalesced and vectorized way. This is the case when the RHS (right-hand side) tensor is transposed or when the LHS (left-hand side) tensor is not transposed. |
PacketType | determines the type of packet |
TensorMapper | determines the input tensor mapper type |
StorageIndex | determines the Index type |
tensorMapper | is the input tensor |
NCIndex | is the non-contracting dim index |
CIndex | is the contracting dim index |
Definition at line 191 of file TensorContractionSycl.h.
template<bool PacketLoad, bool is_coalesced_layout, bool , typename PacketType , typename TensorMapper , typename StorageIndex > static std::enable_if_t< PacketLoad, PacketType > read (const TensorMapper &tensorMapper, const StorageIndex &NCIndex, const StorageIndex &CIndex, const StorageIndex &ld) | inline, static |
read: a template function used for loading data from global memory. It guarantees coalesced and vectorized loads whenever possible.
PacketLoad | determines whether each element of this tensor block should be loaded in packet mode |
is_coalesced_layout | determines whether the tensor data in memory can be accessed in a coalesced and vectorized way when possible. Coalesced memory access is a key factor in kernel performance. When a tensor is 2D and the contracting dimension is 1, it is always possible to access the tensor data in a coalesced and vectorized way. This is the case when the RHS (right-hand side) tensor is transposed or when the LHS (left-hand side) tensor is not transposed. |
PacketType | determines the type of packet |
TensorMapper | determines the input tensor mapper type |
StorageIndex | determines the Index type |
tensorMapper | is the input tensor |
NCIndex | is the non-contracting dim index |
CIndex | is the contracting dim index |
ld | is the leading dimension of the flattened tensor |
Definition at line 161 of file TensorContractionSycl.h.
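The two `read` overloads share one name and are selected at compile time through `std::enable_if_t` on `PacketLoad`. The sketch below reproduces only that dispatch pattern outside Eigen. `Packet4`, `ToyMapper`, and `toy_read` are hypothetical names, and the index-swapping rules (the packet path chooses row and column from `is_coalesced_layout`, the scalar path from `IsRhs`) are assumptions made for illustration, not a statement of what the library does.

```cpp
#include <array>
#include <cstddef>
#include <type_traits>
#include <vector>

// Hypothetical 4-wide packet standing in for a real SIMD packet type.
using Packet4 = std::array<float, 4>;

// Hypothetical column-major mapper over a flat buffer.
struct ToyMapper {
  const std::vector<float>& data;
  std::size_t ld;  // leading dimension (rows per column)
  float operator()(std::size_t row, std::size_t col) const {
    return data[row + col * ld];
  }
};

// Scalar path: selected when PacketLoad is false. The layout flag and
// the leading dimension are unused, as in the documented signature.
template <bool PacketLoad, bool, bool IsRhs, typename PacketType,
          typename Mapper, typename Index>
std::enable_if_t<!PacketLoad, PacketType> toy_read(const Mapper& m,
                                                   const Index& NCIndex,
                                                   const Index& CIndex,
                                                   const Index&) {
  const Index row = IsRhs ? CIndex : NCIndex;  // assumed index order
  const Index col = IsRhs ? NCIndex : CIndex;
  return m(row, col);
}

// Packet path: selected when PacketLoad is true. Loads consecutive
// elements starting at row + col * ld from the flat buffer.
template <bool PacketLoad, bool is_coalesced_layout, bool,
          typename PacketType, typename Mapper, typename Index>
std::enable_if_t<PacketLoad, PacketType> toy_read(const Mapper& m,
                                                  const Index& NCIndex,
                                                  const Index& CIndex,
                                                  const Index& ld) {
  const Index row = is_coalesced_layout ? NCIndex : CIndex;
  const Index col = is_coalesced_layout ? CIndex : NCIndex;
  const Index base = row + col * ld;
  PacketType p{};
  for (std::size_t i = 0; i < p.size(); ++i) p[i] = m.data[base + i];
  return p;
}
```

A call such as `toy_read<true, true, false, Packet4>(mapper, nc, c, ld)` resolves to the packet path, while `toy_read<false, false, false, float>(mapper, nc, c, ld)` resolves to the scalar path. Exactly one overload survives substitution for any value of `PacketLoad`, so the call is never ambiguous.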
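template<data_source dt, typename PacketType , typename DataScalar > static std::enable_if_t< Eigen::internal::unpacket_traits< PacketType >::size != 1 && dt == data_source::global_mem, void > write (PacketType &packet_data, DataScalar *ptr) | inline, static |
write: overload for storing data to global memory when vectorization is enabled. It guarantees coalesced and vectorized stores whenever possible.
data_source | an enum value representing the location of the data in the memory hierarchy |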
PacketType | determines the type of packet |
DataScalar | determines the output data type |
packet_data | the data to be written to the global memory |
ptr | a pointer to the global memory |
Definition at line 249 of file TensorContractionSycl.h.
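template<data_source dt, typename PacketType , typename DataScalar > static std::enable_if_t< Eigen::internal::unpacket_traits< PacketType >::size == 1 && dt == data_source::global_mem, void > write (PacketType &packet_data, DataScalar *ptr) | inline, static |
write: overload for storing data to global memory when vectorization is disabled.
data_source | an enum value representing the location of the data in the memory hierarchy |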
PacketType | determines the type of packet |
DataScalar | determines the output data type |
packet_data | the data to be written to the global memory |
ptr | a pointer to the global memory |
Definition at line 269 of file TensorContractionSycl.h.
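The two global-memory `write` overloads differ only in the packet-size condition inside `std::enable_if_t`. The following standalone sketch shows that selection mechanism. `toy_packet_size` is a hypothetical stand-in for `Eigen::internal::unpacket_traits`, `toy_write` is a hypothetical name, and a `std::array` plays the role of a SIMD packet; the loop in the vectorized path merely imitates a contiguous packet store.

```cpp
#include <array>
#include <cstddef>
#include <type_traits>

enum class data_source { global_mem, local_mem, private_mem };

// Hypothetical stand-in for Eigen::internal::unpacket_traits:
// plain scalars have size 1, std::array<S, N> has size N.
template <typename T>
struct toy_packet_size {
  static constexpr int value = 1;
};
template <typename S, std::size_t N>
struct toy_packet_size<std::array<S, N>> {
  static constexpr int value = static_cast<int>(N);
};

// Vectorized path: packet size != 1 and the target is global memory.
// Stores the whole packet to consecutive addresses.
template <data_source dt, typename PacketType, typename DataScalar>
std::enable_if_t<toy_packet_size<PacketType>::value != 1 &&
                     dt == data_source::global_mem,
                 void>
toy_write(PacketType& packet_data, DataScalar* ptr) {
  for (std::size_t i = 0; i < packet_data.size(); ++i) ptr[i] = packet_data[i];
}

// Scalar path: packet size == 1 and the target is global memory.
template <data_source dt, typename PacketType, typename DataScalar>
std::enable_if_t<toy_packet_size<PacketType>::value == 1 &&
                     dt == data_source::global_mem,
                 void>
toy_write(PacketType& packet_data, DataScalar* ptr) {
  *ptr = packet_data;
}
```

With this arrangement the caller writes `toy_write<data_source::global_mem>(value, ptr)` in both cases and the compiler picks the store width from the type of `value`.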
template<typename StorageIndex , StorageIndex ld, data_source dt, typename PacketType , typename DataScalar > static std::enable_if_t< dt != data_source::global_mem, void > write (PacketType &packet_data, DataScalar ptr) | inline, static |
write: a template function used for storing data to local memory. It guarantees coalesced and vectorized stores whenever possible.
StorageIndex | determines the Index type |
ld | is the leading dimension of the local memory; it is a compile-time value |
data_source | an enum value representing the location of the data in the memory hierarchy |
PacketType | determines the type of packet |
DataScalar | determines the output data type |
packet_data | the data to be written in the local memory |
ptr | a pointer to the local memory |
CIndex | is the contracting dim index |
Definition at line 222 of file TensorContractionSycl.h.
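Since `ld` is a non-type template parameter here, the stride between stored elements is known at compile time. The sketch below assumes one plausible use of that stride: each packet element is written `ld` positions after the previous one, so that a packet read along one dimension lands along the other dimension of a local tile. That behavior, the name `toy_write_local`, and the use of `std::array` as a packet are assumptions for illustration only.

```cpp
#include <array>
#include <cstddef>
#include <type_traits>

enum class data_source { global_mem, local_mem, private_mem };

// Enabled only for non-global targets (local or private memory).
// DataScalar is deduced as a pointer type, matching the documented
// signature, which takes `DataScalar ptr` by value.
template <typename StorageIndex, StorageIndex ld, data_source dt,
          typename PacketType, typename DataScalar>
std::enable_if_t<dt != data_source::global_mem, void> toy_write_local(
    PacketType& packet_data, DataScalar ptr) {
  // Assumed behavior: scatter the packet with a fixed stride of ld.
  for (std::size_t i = 0; i < packet_data.size(); ++i) {
    *ptr = packet_data[i];
    ptr += ld;
  }
}
```

For example, `toy_write_local<int, 3, data_source::local_mem>(pkt, tile + 1)` writes the four elements of `pkt` to `tile[1]`, `tile[4]`, `tile[7]`, and `tile[10]`. Passing `data_source::global_mem` removes the function from overload resolution, so such a call fails to compile.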