Self-Attention through Kernel-Eigen Pair Sparse Variational Gaussian Processes
The study proposes Kernel-Eigen Pair Sparse Variational Gaussian Processes (KEP-SVGP) for building uncertainty-aware self-attention in transformers. The asymmetry of attention kernels is addressed using Kernel SVD (KSVD), yielding reduced complexity…
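To make the asymmetry point concrete, here is a minimal NumPy sketch. It is not the KEP-SVGP method itself: it only shows that an attention kernel matrix built from separate query and key projections is generally not symmetric, and that a singular value decomposition gives distinct left and right singular vectors plus a low-rank approximation. The toy sizes, the random weights `W_q` and `W_k`, and the truncation `rank` are all assumptions made for illustration, and plain matrix SVD stands in here for KSVD.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d_model, d_head = 6, 8, 4   # toy sequence length and widths (assumed)

# Hypothetical inputs and projection weights, for illustration only.
X = rng.standard_normal((n, d_model))
W_q = 0.3 * rng.standard_normal((d_model, d_head))
W_k = 0.3 * rng.standard_normal((d_model, d_head))

Q, K = X @ W_q, X @ W_k

# Unnormalized attention kernel: A[i, j] = exp(q_i . k_j / sqrt(d_head)).
A = np.exp(Q @ K.T / np.sqrt(d_head))

# Because queries and keys use different projections, A != A.T in general.
is_symmetric = np.allclose(A, A.T)

# SVD handles the asymmetric matrix: U and Vt hold different bases.
U, S, Vt = np.linalg.svd(A)

# Keep only the leading singular triplets for a low-rank approximation.
rank = 3
A_low = (U[:, :rank] * S[:rank]) @ Vt[:rank]
rel_err = np.linalg.norm(A - A_low) / np.linalg.norm(A)

print("symmetric:", is_symmetric)
print("relative error at rank", rank, ":", rel_err)
```

Working with a few leading singular triplets instead of the full n-by-n matrix is the general sense in which a low-rank decomposition can reduce cost. How KEP-SVGP actually exploits this is described in the study, not in this sketch.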