|
template<bool is_internal> |
bool | check_boundary (bool) |
| check_boundary: checks the edge condition for non-internal blocks.
|
|
template<> |
bool | check_boundary< false > (bool cond) |
| check_boundary: specialization of check_boundary for non-internal blocks.
|
|
template<bool PacketLoad, bool , bool IsRhs, typename PacketType , typename TensorMapper , typename StorageIndex > |
static std::enable_if_t<!PacketLoad, PacketType > | read (const TensorMapper &tensorMapper, const StorageIndex &NCIndex, const StorageIndex &CIndex, const StorageIndex &) |
| read: overload of the read function used when the read access is not vectorized.
|
|
template<bool PacketLoad, bool is_coalesced_layout, bool , typename PacketType , typename TensorMapper , typename StorageIndex > |
static std::enable_if_t< PacketLoad, PacketType > | read (const TensorMapper &tensorMapper, const StorageIndex &NCIndex, const StorageIndex &CIndex, const StorageIndex &ld) |
| read: a template function used for loading the data from global memory. It guarantees coalesced and vectorized loads whenever possible.
|
|
template<data_source dt, typename PacketType , typename DataScalar > |
static std::enable_if_t< Eigen::internal::unpacket_traits< PacketType >::size !=1 &&dt==data_source::global_mem, void > | write (PacketType &packet_data, DataScalar *ptr) |
| write: overload of the write function for storing the data to global memory when vectorization is enabled. It guarantees coalesced and vectorized stores whenever possible.
|
|
template<data_source dt, typename PacketType , typename DataScalar > |
static std::enable_if_t< Eigen::internal::unpacket_traits< PacketType >::size==1 &&dt==data_source::global_mem, void > | write (PacketType &packet_data, DataScalar *ptr) |
| write: overload of the write function for storing the data to global memory when vectorization is disabled.
|
|
template<typename StorageIndex , StorageIndex ld, data_source dt, typename PacketType , typename DataScalar > |
static std::enable_if_t< dt !=data_source::global_mem, void > | write (PacketType &packet_data, DataScalar ptr) |
| write: a template function used for storing the data to local memory. It guarantees coalesced and vectorized stores whenever possible.
|
|
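The check_boundary pair above relies on explicit specialization of a function template on a bool. The following is a minimal sketch of that pattern, not the library code. The function bodies are assumptions (the summary lists only the signatures): the primary template is assumed to accept every access, and the false specialization is assumed to return the caller's condition. load_or_zero is a hypothetical helper added for illustration.

```cpp
#include <cassert>

// ASSUMPTION: body not shown in the reference; internal blocks are taken to be
// always in range, so the condition is ignored.
template <bool is_internal>
inline bool check_boundary(bool) {
  return true;
}

// ASSUMPTION: for non-internal (edge) blocks the caller's range test decides.
template <>
inline bool check_boundary<false>(bool cond) {
  return cond;
}

// Hypothetical helper: reads data[index] only when the boundary check passes,
// otherwise returns zero padding.
template <bool is_internal>
float load_or_zero(const float* data, int index, int size) {
  return check_boundary<is_internal>(index < size) ? data[index] : 0.0f;
}
```

Because is_internal is a compile-time value, the range test can be removed entirely from the code path for internal blocks.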
template<bool PacketLoad, bool, bool IsRhs, typename PacketType, typename TensorMapper, typename StorageIndex>
static std::enable_if_t<!PacketLoad, PacketType> Eigen::TensorSycl::internal::read(const TensorMapper &tensorMapper, const StorageIndex &NCIndex, const StorageIndex &CIndex, const StorageIndex &)
inline static
read: overload of the read function used when the read access is not vectorized.
- Template Parameters
-
PacketLoad | determines whether each element of this tensor block should be loaded in packet mode |
is_coalesced_layout | determines whether the tensor data in memory can be accessed in a coalesced and vectorized way when possible. Coalesced memory access is a key factor in kernel performance. When a tensor is 2D and the contracting dimension is 1, it is always possible to access the tensor data in a coalesced and vectorized way. This is the case when the RHS (right-hand side) tensor is transposed or when the LHS (left-hand side) tensor is not transposed. |
PacketType | determines the type of packet |
TensorMapper | determines the input tensor mapper type |
StorageIndex | determines the Index type |
- Parameters
-
tensorMapper | is the input tensor |
NCIndex | is the non-contracting dim index |
CIndex | is the contracting dim index |
Definition at line 191 of file TensorContractionSycl.h.
const StorageIndex row = (IsRhs) ? CIndex : NCIndex;
const StorageIndex col = (IsRhs) ? NCIndex : CIndex;
return tensorMapper(row, col);
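The index selection in this overload can be tried outside Eigen. Below is a minimal sketch, assuming a hypothetical ToyMapper over column-major storage in place of the real TensorMapper; scalar_read is likewise a hypothetical name, and the unused template and function parameters of the documented signature are omitted.

```cpp
#include <cassert>
#include <vector>

// Hypothetical 2D mapper over column-major storage; not Eigen's TensorMapper.
struct ToyMapper {
  const std::vector<float>& data;
  int rows;
  float operator()(int row, int col) const { return data[row + col * rows]; }
};

// Scalar (non-vectorized) read: IsRhs decides which of the two indices is
// used as the row, mirroring the ternaries in the documented definition.
template <bool IsRhs, typename Mapper, typename Index>
float scalar_read(const Mapper& mapper, const Index& NCIndex, const Index& CIndex) {
  const Index row = IsRhs ? CIndex : NCIndex;
  const Index col = IsRhs ? NCIndex : CIndex;
  return mapper(row, col);
}
```

With the same (NCIndex, CIndex) pair, the LHS instantiation addresses element (NCIndex, CIndex) and the RHS instantiation addresses element (CIndex, NCIndex).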
template<bool PacketLoad, bool is_coalesced_layout, bool, typename PacketType, typename TensorMapper, typename StorageIndex>
static std::enable_if_t<PacketLoad, PacketType> Eigen::TensorSycl::internal::read(const TensorMapper &tensorMapper, const StorageIndex &NCIndex, const StorageIndex &CIndex, const StorageIndex &ld)
inline static
read: a template function used for loading the data from global memory. It guarantees coalesced and vectorized loads whenever possible.
- Template Parameters
-
PacketLoad | determines whether each element of this tensor block should be loaded in packet mode |
is_coalesced_layout | determines whether the tensor data in memory can be accessed in a coalesced and vectorized way when possible. Coalesced memory access is a key factor in kernel performance. When a tensor is 2D and the contracting dimension is 1, it is always possible to access the tensor data in a coalesced and vectorized way. This is the case when the RHS (right-hand side) tensor is transposed or when the LHS (left-hand side) tensor is not transposed. |
PacketType | determines the type of packet |
TensorMapper | determines the input tensor mapper type |
StorageIndex | determines the Index type |
- Parameters
-
tensorMapper | is the input tensor |
NCIndex | is the non-contracting dim index |
CIndex | is the contracting dim index |
ld | is the leading dimension of the flattened tensor |
Definition at line 161 of file TensorContractionSycl.h.
const StorageIndex row = (is_coalesced_layout) ? NCIndex : CIndex;
const StorageIndex col = (is_coalesced_layout) ? CIndex : NCIndex;
return tensorMapper.get_tensor().template packet<Unaligned>(row + (col * ld));
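The flattened offset row + col * ld used by the vectorized overload can be illustrated with plain containers. This is a sketch under stated assumptions: Packet4, load_unaligned and packet_read are hypothetical stand-ins, and a std::vector replaces the tensor returned by get_tensor().

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical 4-wide packet; Eigen's real packet types depend on the device.
using Packet4 = std::array<float, 4>;

// Loads 4 consecutive floats starting at `offset`, with no alignment requirement.
inline Packet4 load_unaligned(const std::vector<float>& buffer, std::size_t offset) {
  Packet4 p{};
  for (std::size_t i = 0; i < p.size(); ++i) p[i] = buffer[offset + i];
  return p;
}

// is_coalesced_layout picks which index varies fastest in memory, so that the
// consecutive lanes of one packet map to consecutive addresses.
template <bool is_coalesced_layout>
Packet4 packet_read(const std::vector<float>& buffer, std::size_t NCIndex,
                    std::size_t CIndex, std::size_t ld) {
  const std::size_t row = is_coalesced_layout ? NCIndex : CIndex;
  const std::size_t col = is_coalesced_layout ? CIndex : NCIndex;
  return load_unaligned(buffer, row + col * ld);
}
```

In this sketch the caller is responsible for making sure that offset + 4 does not run past the end of the buffer.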
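template<data_source dt, typename PacketType, typename DataScalar>
static std::enable_if_t<Eigen::internal::unpacket_traits<PacketType>::size != 1 && dt == data_source::global_mem, void> Eigen::TensorSycl::internal::write(PacketType &packet_data, DataScalar *ptr)
write: overload of the write function for storing the data to global memory when vectorization is enabled. It guarantees coalesced and vectorized stores whenever possible.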
- Template Parameters
-
data_source | an enum value representing the location of the data in the memory hierarchy |
PacketType | determines the type of packet |
DataScalar | determines the output data type |
- Parameters
-
packet_data | the data to be written to global memory |
ptr | a pointer to global memory |
Definition at line 249 of file TensorContractionSycl.h.
::Eigen::internal::pstoreu<DataScalar, PacketType>(ptr, packet_data);
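The two global-memory write overloads are selected at compile time through std::enable_if_t on the packet size. The sketch below shows that dispatch in isolation. Everything in it is hypothetical: Packet and packet_size stand in for Eigen's packet types and unpacket_traits, the data_source enumerators other than global_mem are assumed, and store returns the number of lanes written (the documented functions return void) so that the selected overload is observable.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <type_traits>

// Only global_mem appears in the documented signatures; local_mem is assumed.
enum class data_source { global_mem, local_mem };

// Hypothetical packet type and size trait standing in for unpacket_traits.
template <std::size_t N>
using Packet = std::array<float, N>;

template <typename P>
struct packet_size : std::tuple_size<P> {};

// Chosen when the packet holds more than one scalar: all lanes are stored
// contiguously, which is what a vectorized store does.
template <data_source dt, typename P>
std::enable_if_t<packet_size<P>::value != 1 && dt == data_source::global_mem, int>
store(const P& packet, float* ptr) {
  for (std::size_t i = 0; i < packet_size<P>::value; ++i) ptr[i] = packet[i];
  return static_cast<int>(packet_size<P>::value);
}

// Chosen when the packet is a single scalar: a plain scalar store.
template <data_source dt, typename P>
std::enable_if_t<packet_size<P>::value == 1 && dt == data_source::global_mem, int>
store(const P& packet, float* ptr) {
  *ptr = packet[0];
  return 1;
}
```

The two conditions are mutually exclusive, so exactly one overload survives substitution for any packet type and the call is never ambiguous.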
template<typename StorageIndex, StorageIndex ld, data_source dt, typename PacketType, typename DataScalar>
static std::enable_if_t<dt != data_source::global_mem, void> Eigen::TensorSycl::internal::write(PacketType &packet_data, DataScalar ptr)
write: a template function used for storing the data to local memory. It guarantees coalesced and vectorized stores whenever possible.
- Template Parameters
-
StorageIndex | determines the Index type |
ld | is the leading dimension of the local memory. ld is a compile-time value for the local memory |
data_source | an enum value representing the location of the data in the memory hierarchy |
PacketType | determines the type of packet |
DataScalar | determines the output data type |
- Parameters
-
packet_data | the data to be written to the local memory |
ptr | a pointer to the local memory |
Definition at line 222 of file TensorContractionSycl.h.
EIGEN_CONSTEXPR int PacketSize = Eigen::internal::unpacket_traits<PacketType>::size;
for (int i = 0; i < PacketSize; i++) {
  *ptr = PacketWrapper<PacketType, PacketSize>::scalarize(i, packet_data);
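The local-memory write scalarizes a packet and stores its lanes one at a time. The listing above is cut off after the assignment, so the sketch below rests on an assumption: consecutive lanes are taken to land ld elements apart, which matches the description of ld as the leading dimension of the local memory. Packet4 and strided_store are hypothetical names, and a plain float buffer replaces SYCL local memory.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical 4-wide packet standing in for an Eigen packet type.
using Packet4 = std::array<float, 4>;

// Writes each lane of `packet` into the buffer, `ld` elements apart.
// ASSUMPTION: the stride between lanes is ld; the truncated listing does not
// show how the pointer advances.
template <std::size_t ld>
void strided_store(const Packet4& packet, float* ptr) {
  for (std::size_t i = 0; i < packet.size(); ++i) {
    ptr[i * ld] = packet[i];  // lane i goes to offset i * ld
  }
}
```

Since ld is a template parameter, the offsets i * ld are compile-time constants once the loop is unrolled.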