Academic Literature Library

Paper Title

Product Market Demand Analysis Using NLP in Banglish Text with Sentiment Analysis and Named Entity Recognition

Authors

Hossain, Md Sabbir; Nayla, Nishat; Rasel, Annajiat Alim

Abstract

Product market demand analysis plays a significant role in shaping business strategy because of its impact on competitive markets. There are roughly 228 million native Bengali speakers, the majority of whom use Banglish text to interact with one another on social media. As social media emerges as an online marketplace for entrepreneurs, consumers are buying and reviewing items there in Banglish. People use social media to find preferred smartphone brands and models by sharing their positive and negative experiences with them. Our goal is therefore to gather Banglish text data and apply sentiment analysis and named entity recognition to assess Bangladeshi market demand for smartphones and to determine the most popular smartphones by gender. We scraped product-related data from social media with instant data scrapers, and crawled Wikipedia and other sites for product information with Python web scrapers. The raw data was filtered using NLP methods together with Python's Pandas and Seaborn libraries. For named entity recognition, we trained our datasets with spaCy's custom NER model and with Amazon Comprehend Custom NER. For sentiment analysis, we deployed a TensorFlow Sequential model with parameter tuning. We also used the Google Cloud Translation API together with the BanglaLinga library to estimate the gender of the reviewers. In this article, we use natural language processing (NLP) approaches and several machine learning models to identify the most in-demand items and services in the Bangladeshi market. Our models achieve an accuracy of 87.99% with spaCy custom named entity recognition, 95.51% with Amazon Comprehend Custom NER, and 87.02% with the Sequential model for demand analysis. After the spaCy analysis, we were able to handle 80% of the mistakes related to misspelled words using a combination of Levenshtein distance and ratio algorithms.
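
The abstract does not describe how Levenshtein distance and a ratio score were combined, so the following is only a minimal sketch of one way such a correction step could work. The brand vocabulary, the 0.75 threshold, the function names, and the ratio definition (one minus the distance divided by the longer string's length) are all assumptions made for illustration, not the authors' implementation.

```python
from typing import List, Optional


def levenshtein(a: str, b: str) -> int:
    """Edit distance where insertion, deletion and substitution each cost 1."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute or match
            ))
        previous = current
    return previous[-1]


def similarity_ratio(a: str, b: str) -> float:
    """Assumed normalization: 1 - distance / length of the longer string."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest


def correct_entity(token: str, vocabulary: List[str],
                   threshold: float = 0.75) -> Optional[str]:
    """Return the closest vocabulary entry, or None if nothing is close enough."""
    lowered = token.lower()
    best = max(vocabulary, key=lambda name: similarity_ratio(lowered, name.lower()))
    if similarity_ratio(lowered, best.lower()) >= threshold:
        return best
    return None


# Hypothetical brand list; a real one would come from the crawled product data.
BRANDS = ["Samsung", "Xiaomi", "Realme"]

print(correct_entity("samsang", BRANDS))  # Samsung
print(correct_entity("laptop", BRANDS))   # None
```

The threshold trades recall for precision: lowering it catches heavier misspellings but risks mapping unrelated words onto a brand name.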