Paper Title

Performance Evaluation of t-SNE and MDS Dimensionality Reduction Techniques with KNN, ENN and SVM Classifiers

Authors

Sakib, Shadman; Siddique, Md. Abu Bakr; Rahman, Md. Abdur

Abstract

The central goal of this paper is to implement two commonly available dimensionality reduction (DR) methods, t-distributed Stochastic Neighbor Embedding (t-SNE) and Multidimensional Scaling (MDS), in MATLAB and to observe their application to several datasets. These DR techniques are applied to nine datasets acquired from the UCI Machine Learning Repository: CNAE9, Segmentation, Seeds, Pima Indians Diabetes, Parkinsons, Movement Libras, Mammographic Masses, Knowledge, and Ionosphere. By applying the t-SNE and MDS algorithms, each dataset is reduced to half of its original dimension, eliminating unnecessary features. The reduced datasets are then fed into three supervised classification algorithms: K Nearest Neighbors (KNN), Extended Nearest Neighbors (ENN), and Support Vector Machine (SVM). These classifiers are also implemented in MATLAB. The training-to-test data ratio is kept at 90:10 for each dataset. Based on the observed accuracy, the efficiency of each dimensionality reduction technique with each classification algorithm is analyzed, and the performance of each classifier is evaluated.
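The evaluation protocol described in the abstract (reduce to half the original dimension, split 90:10, classify, measure accuracy) can be sketched roughly as follows. This is a minimal illustration in plain Python, not the authors' MATLAB implementation. Several things here are assumptions: the helper names (`half_dimension`, `split_90_10`, `knn_predict`, `holdout_accuracy`), the rounding rule for odd feature counts, the choice of `k=3`, the random shuffle before splitting, and the synthetic two-cluster data, which merely stands in for an embedding already produced by t-SNE or MDS. Only KNN is shown; ENN and SVM are omitted.

```python
import math
import random
from collections import Counter


def half_dimension(n_features):
    """Target dimensionality: half the original feature count.
    Rounding down (with a floor of 1) is an assumption of this sketch."""
    return max(1, n_features // 2)


def split_90_10(rows, labels, seed=0):
    """Shuffle the samples and hold out one tenth of them for testing."""
    idx = list(range(len(rows)))
    random.Random(seed).shuffle(idx)
    n_test = max(1, len(idx) // 10)
    test_idx, train_idx = idx[:n_test], idx[n_test:]
    return ([rows[i] for i in train_idx], [labels[i] for i in train_idx],
            [rows[i] for i in test_idx], [labels[i] for i in test_idx])


def knn_predict(train_x, train_y, query, k=3):
    """Majority vote among the k training points closest to `query`
    (Euclidean distance)."""
    nearest = sorted(zip(train_x, train_y),
                     key=lambda pair: math.dist(pair[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def holdout_accuracy(train_x, train_y, test_x, test_y, k=3):
    """Fraction of test points whose predicted label matches the true label."""
    hits = sum(knn_predict(train_x, train_y, q, k) == y
               for q, y in zip(test_x, test_y))
    return hits / len(test_y)


# Synthetic stand-in for a reduced dataset: a hypothetical 4-feature dataset
# embedded into half_dimension(4) == 2 dimensions, with two well-separated
# classes. Real embeddings would come from t-SNE or MDS instead.
target_dim = half_dimension(4)
rng = random.Random(42)
points, classes = [], []
for label, center in ((0, 0.0), (1, 10.0)):
    for _ in range(50):
        points.append([rng.gauss(center, 1.0) for _ in range(target_dim)])
        classes.append(label)

train_x, train_y, test_x, test_y = split_90_10(points, classes)
acc = holdout_accuracy(train_x, train_y, test_x, test_y, k=3)
print(f"train={len(train_x)} test={len(test_x)} accuracy={acc:.2f}")
```

Repeating the same hold-out measurement for each DR technique and classifier pair would give the kind of accuracy comparison the abstract describes.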
