Paper Title

Latent Semantic Structure in Malicious Programs

Authors

John Musgrave, Temesguen Messay-Kebede, David Kapp, Anca Ralescu

Abstract

Latent Semantic Analysis (LSA) is a matrix decomposition method used to discover topics and topic weights in natural language documents. This study uses LSA to analyze the composition of malicious program binaries. Applying LSA to the term frequency vector representation yields a set of topics, each topic being a composition of terms. The vectors and topics are evaluated quantitatively using a spatial representation. This semantic analysis provides a more abstract representation of a program than the term frequency analysis from which it is derived. We use a metric space to represent a program as a collection of vectors, and a distance metric to evaluate their similarity within a topic. Segmenting the vectors in this dataset provides increased resolution into program structure.
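
The pipeline described in the abstract (term frequency vectors, matrix decomposition into topics, and a distance between vectors in topic space) can be illustrated with a minimal sketch. This is not the authors' implementation. The abstract does not say what a "term" is or how programs are segmented, so the instruction-mnemonic tokens, the four toy segments, the choice of two topics, and the use of Euclidean distance are all assumptions made for illustration, as are the helper names `term_frequency_matrix`, `lsa`, and `euclidean`.

```python
import numpy as np

def term_frequency_matrix(segments, vocab):
    """Build a count matrix: one row per segment, one column per term."""
    index = {term: j for j, term in enumerate(vocab)}
    X = np.zeros((len(segments), len(vocab)))
    for i, segment in enumerate(segments):
        for token in segment:
            X[i, index[token]] += 1
    return X

def lsa(X, k):
    """Truncated SVD of the term frequency matrix, keeping k topics."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    segment_topics = U[:, :k] * s[:k]  # topic weights for each segment
    topics = Vt[:k]                    # each topic is a weighting over terms
    return segment_topics, topics

def euclidean(a, b):
    """Euclidean distance, a true metric on the topic space."""
    return float(np.linalg.norm(a - b))

# Hypothetical segments of a program, tokenized into instruction mnemonics.
segments = [
    ["mov", "push", "call"],
    ["mov", "push", "call", "call"],
    ["xor", "jmp", "xor"],
    ["xor", "jmp", "jmp"],
]
vocab = sorted({token for segment in segments for token in segment})

X = term_frequency_matrix(segments, vocab)
segment_topics, topics = lsa(X, k=2)

# Segments that share terms should lie closer together in topic space.
d_similar = euclidean(segment_topics[0], segment_topics[1])
d_different = euclidean(segment_topics[0], segment_topics[2])
print(d_similar, d_different)
```

In this toy data the first two segments share one set of terms and the last two share another, so the two retained topics separate the groups and the distance within a group is smaller than the distance across groups. Euclidean distance is used here because it satisfies the metric axioms; cosine distance is a common alternative in LSA work but is not a true metric.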
