Paper Title

TanhSoft -- a family of activation functions combining Tanh and Softplus

Paper Authors

Koushik Biswas, Sandeep Kumar, Shilpak Banerjee, Ashish Kumar Pandey

Paper Abstract

Deep learning, at its core, contains functions that are compositions of a linear transformation with a non-linear function known as an activation function. In the past few years, there has been increasing interest in constructing novel activation functions that result in better learning. In this work, we propose a family of novel activation functions, namely TanhSoft, of the form tanh(αx + βe^{γx})ln(δ + e^x) with four undetermined hyper-parameters, and tune these hyper-parameters to obtain activation functions that are shown to outperform several well-known activation functions. For instance, replacing ReLU with xtanh(0.6e^x) improves top-1 classification accuracy on CIFAR-10 by 0.46% for DenseNet-169 and 0.7% for Inception-v3, while with tanh(0.87x)ln(1 + e^x), top-1 classification accuracy on CIFAR-100 improves by 1.24% for DenseNet-169 and 2.57% for the SimpleNet model.
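
As a quick illustration of the family defined above, the sketch below evaluates the general form tanh(αx + βe^{γx})ln(δ + e^x) and the two specific instances named in the abstract, using NumPy. The function and parameter names (tanhsoft, alpha, beta, gamma, delta) are illustrative choices, not identifiers from the paper, and this is a minimal sketch rather than the authors' reference implementation.

```python
import numpy as np


def tanhsoft(x, alpha, beta, gamma, delta):
    """General TanhSoft form from the abstract:
    f(x) = tanh(alpha*x + beta*e^(gamma*x)) * ln(delta + e^x)."""
    return np.tanh(alpha * x + beta * np.exp(gamma * x)) * np.log(delta + np.exp(x))


def tanhsoft_cifar10(x):
    """The x*tanh(0.6*e^x) instance reported for CIFAR-10 in the abstract."""
    return x * np.tanh(0.6 * np.exp(x))


def tanhsoft_cifar100(x):
    """The tanh(0.87*x)*ln(1 + e^x) instance reported for CIFAR-100; with
    beta = 0 it coincides with tanhsoft(x, 0.87, 0.0, 0.0, 1.0)."""
    return np.tanh(0.87 * x) * np.log1p(np.exp(x))


if __name__ == "__main__":
    xs = np.linspace(-3.0, 3.0, 7)
    print(tanhsoft(xs, alpha=0.87, beta=0.0, gamma=0.0, delta=1.0))
    print(tanhsoft_cifar100(xs))  # matches the line above
    print(tanhsoft_cifar10(xs))
```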
