Mining and Analysis of Comment Data on Two Earthquake Events Using Information Technology

Abstract: Using a web crawler, we collected user comments posted under two trending Weibo posts published by CCTV News, one on the Beiliu M_S5.2 earthquake and one on the Jingxi M_S5.2 earthquake. The comments were processed by cleaning, word segmentation, stop-word removal, and word-frequency counting for word clouds, yielding a word cloud and a table of the ten most frequent words for each event. Analysis of the public-opinion information in these word clouds and frequency tables shows that, for both events, the comments mainly expressed wishes for safety. User participation was more active for the Beiliu event, and that event reached a wider range of users than the Jingxi event.
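The processing chain named in the abstract (cleaning, word segmentation, stop-word removal, frequency counting) can be illustrated with a minimal sketch. This is not the authors' code: the function names, the cleaning rules, and the stop-word list are all hypothetical, and `simple_segment` is a stand-in regex tokenizer. For real Chinese comments a dedicated segmenter (for example `jieba.lcut`) would presumably be passed in as `segment` instead.

```python
import re
from collections import Counter
from typing import Callable, Iterable, List, Set, Tuple

# Hypothetical stop-word list for illustration; the paper does not specify one.
STOP_WORDS = {"the", "a", "is", "and", "to", "of"}


def clean(comment: str) -> str:
    """Remove URLs, @mentions, and bracketed emoticon codes such as [pray]."""
    comment = re.sub(r"https?://\S+", " ", comment)
    comment = re.sub(r"@\S+", " ", comment)
    comment = re.sub(r"\[[^\]]*\]", " ", comment)
    return comment.strip()


def simple_segment(text: str) -> List[str]:
    """Stand-in tokenizer (assumption): lowercase runs of word characters."""
    return re.findall(r"\w+", text.lower())


def top_words(
    comments: Iterable[str],
    segment: Callable[[str], List[str]] = simple_segment,
    stop_words: Set[str] = STOP_WORDS,
    n: int = 10,
) -> List[Tuple[str, int]]:
    """Return the n most frequent tokens after cleaning and stop-word removal."""
    counts: Counter = Counter()
    for comment in comments:
        for token in segment(clean(comment)):
            # Skip stop words and single-character tokens.
            if token not in stop_words and len(token) > 1:
                counts[token] += 1
    return counts.most_common(n)


if __name__ == "__main__":
    sample = [
        "Stay safe everyone @user1 https://t.cn/abc",
        "Hope everyone is safe [pray]",
        "safe and sound",
    ]
    print(top_words(sample, n=2))
```

The resulting (word, count) pairs correspond to the top-ten frequency table described above, and the same counts could feed a word-cloud renderer.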