cudnnMultiHeadAttention
This is a draft implementation of scaled dot-product attention, Attention(Q, K, V) = softmax(QK^T / sqrt(d_k)) V, where d_k is the dimension of the key vectors.
The reference paper is "Attention Is All You Need" (https://arxiv.org/abs/1706.03762).
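
For checking results by hand, the formula can be sketched as a small CPU reference in plain Python. This is only an illustration under stated assumptions, not the cuDNN code path: the function names `softmax` and `attention` are hypothetical, matrices are lists of rows, and there is no batching, masking, or head splitting.

```python
import math

def softmax(row):
    # Subtract the row maximum before exponentiating for numerical stability.
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def attention(Q, K, V):
    """Compute softmax(Q K^T / sqrt(d_k)) V.

    Q is n_q x d_k, K is n_k x d_k, V is n_k x d_v (lists of rows).
    Hypothetical helper for illustration; not part of cuDNN.
    """
    d_k = len(K[0])
    d_v = len(V[0])
    scale = 1.0 / math.sqrt(d_k)
    out = []
    for q in Q:
        # Scaled dot product of this query with every key.
        scores = [scale * sum(qi * ki for qi, ki in zip(q, k)) for k in K]
        # Attention weights over the keys (non-negative, sum to 1).
        weights = softmax(scores)
        # Weighted sum of the value rows.
        out.append([sum(w * v[j] for w, v in zip(weights, V)) for j in range(d_v)])
    return out
```

With Q = [[0, 0]], K = [[1, 0], [0, 1]] and V = [[1, 2], [3, 4]], both scores are zero, the weights are uniform, and the output is the mean of the value rows, [[2.0, 3.0]]. A small case like this is a convenient sanity check when comparing against GPU output.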