The research paper ‘Elephant Neural Networks: Born to be a Continual Learner’ by Qingfeng Lan and A. Rupam Mahmood addresses catastrophic forgetting in continual learning. The study finds that the gradient sparsity of activation functions plays a critical role in reducing forgetting. The researchers propose a new class of activation functions, called elephant activation functions, which generate both sparse representations and sparse gradients and thereby make neural networks more resilient to catastrophic forgetting. The method is applicable to regression, class-incremental learning, and reinforcement learning tasks.
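
To make the idea of sparse outputs and sparse gradients concrete, here is a minimal sketch in plain Python. The summary above does not give the formula for an elephant activation function, so the bump-shaped form `1 / (1 + |x/a|**d)` used here, along with the parameter names `a` (width) and `d` (steepness), is an assumption for illustration and should be checked against the paper. The point of the sketch is only that both the output and its derivative are close to zero for inputs far from the center, so an update driven by one input barely changes units that respond to distant inputs.

```python
def elephant(x, a=1.0, d=4.0):
    """Assumed bump-shaped activation: 1 / (1 + |x/a|**d).

    a (width) and d (steepness) are illustrative parameter names.
    """
    return 1.0 / (1.0 + abs(x / a) ** d)


def elephant_grad(x, a=1.0, d=4.0):
    """Analytic derivative of elephant() with respect to x (for d > 1)."""
    if x == 0.0:
        return 0.0  # the bump is flat at its peak
    u = abs(x / a)
    sign = 1.0 if x > 0 else -1.0
    return -sign * (d / a) * u ** (d - 1) / (1.0 + u ** d) ** 2


if __name__ == "__main__":
    # Near the center the unit is active; far away both the output
    # and the gradient shrink toward zero.
    for x in (0.0, 0.5, 1.0, 2.0, 10.0):
        print(f"x={x:5.1f}  f(x)={elephant(x):.6f}  f'(x)={elephant_grad(x):.6f}")
```

By contrast, ReLU produces sparse outputs but has a gradient of 1 over its entire positive half-line, so its gradients are not sparse in the same sense.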

Publication date: 2 Oct 2023
Project Page: https://arxiv.org/abs/2310.01365
Paper: https://arxiv.org/pdf/2310.01365