Paper Title

TanhSoft -- a family of activation functions combining Tanh and Softplus

Paper Authors

Koushik Biswas, Sandeep Kumar, Shilpak Banerjee, Ashish Kumar Pandey

Paper Abstract

Deep learning, at its core, contains functions that are compositions of a linear transformation with a non-linear function known as an activation function. In the past few years, there has been increasing interest in constructing novel activation functions that result in better learning. In this work, we propose a family of novel activation functions, namely TanhSoft, of the form tanh(αx + βe^{γx})ln(δ + e^x) with four undetermined hyper-parameters, and tune these hyper-parameters to obtain activation functions that are shown to outperform several well-known activation functions. For instance, replacing ReLU with xtanh(0.6e^x) improves top-1 classification accuracy on CIFAR-10 by 0.46% for DenseNet-169 and 0.7% for Inception-v3, while with tanh(0.87x)ln(1 + e^x), top-1 classification accuracy on CIFAR-100 improves by 1.24% for DenseNet-169 and 2.57% for the SimpleNet model.
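
As a quick illustration of the family defined above, the sketch below evaluates the general form tanh(αx + βe^{γx})ln(δ + e^x) and the two specific instances named in the abstract, using NumPy. The function and parameter names (tanhsoft, alpha, beta, gamma, delta) are illustrative choices, not identifiers from the paper, and this is a minimal sketch rather than the authors' reference implementation.

```python
import numpy as np


def tanhsoft(x, alpha, beta, gamma, delta):
    """General TanhSoft form from the abstract:
    f(x) = tanh(alpha*x + beta*e^(gamma*x)) * ln(delta + e^x)."""
    return np.tanh(alpha * x + beta * np.exp(gamma * x)) * np.log(delta + np.exp(x))


def tanhsoft_cifar10(x):
    """The x*tanh(0.6*e^x) instance reported for CIFAR-10 in the abstract."""
    return x * np.tanh(0.6 * np.exp(x))


def tanhsoft_cifar100(x):
    """The tanh(0.87*x)*ln(1 + e^x) instance reported for CIFAR-100; with
    beta = 0 it coincides with tanhsoft(x, 0.87, 0.0, 0.0, 1.0)."""
    return np.tanh(0.87 * x) * np.log1p(np.exp(x))


if __name__ == "__main__":
    xs = np.linspace(-3.0, 3.0, 7)
    print(tanhsoft(xs, alpha=0.87, beta=0.0, gamma=0.0, delta=1.0))
    print(tanhsoft_cifar100(xs))  # matches the line above
    print(tanhsoft_cifar10(xs))
```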
