Article abstract
Feng Lin, Yao Yuan, Chen Feng, Jin Bo. A dynamic data stream classification algorithm based on MapReduce [J]., 2014, 54(4): 461-468
A dynamic data stream classification algorithm based on MapReduce
  
DOI:10.7511/dllgxb201404014
Keywords: data stream classification; incremental learning; extreme support vector machine (ESVM); MapReduce; forgetting factor; robustness
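Funding: National Natural Science Foundation of China (61173163, 51105052); Program for New Century Excellent Talents in University, Ministry of Education (NCET-09-0251); Liaoning Provincial Department of Education (201102037).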
Authors: Feng Lin, Yao Yuan, Chen Feng, Jin Bo
Abstract:
      Real-time classification of dynamic data streams poses three difficulties: processing massive data in real time; tracking concept drift and updating the model accordingly; and keeping the model stable and robust. To address these problems, the extreme support vector machine (ESVM) is combined with the MapReduce framework, and a forgetting factor robust ESVM algorithm (FFR-ESVM) is proposed. The algorithm corrects the residuals by constructing a residual weight matrix and introduces a forgetting factor to increase the influence of new samples, thereby providing a solution to the massive-data processing problem. Experimental results show that the proposed algorithm classifies dynamic data streams quickly and effectively, and that its results are stable and little affected by noise.
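The abstract gives no formulas, so the following is only a minimal sketch of how the three ingredients it names (an ESVM-style model, a residual weight matrix, and a forgetting factor) could fit a map/reduce pattern. It is not the authors' FFR-ESVM. Everything concrete here is an assumption made for illustration: the random sigmoid feature map, the Huber-style residual weight, the exponential discounting by `lam`, the regularization constant `nu`, and all function names. MapReduce is simulated with plain in-process functions rather than a real cluster.

```python
import math
import random

def make_feature_map(n_inputs, n_hidden, seed=0):
    """Fixed random sigmoid feature map (ESVM/ELM style), plus a constant bias feature."""
    rng = random.Random(seed)
    W = [[rng.uniform(-1, 1) for _ in range(n_inputs)] for _ in range(n_hidden)]
    b = [rng.uniform(-1, 1) for _ in range(n_hidden)]

    def phi(x):
        z = [sum(w * v for w, v in zip(row, x)) + bias for row, bias in zip(W, b)]
        return [1.0 / (1.0 + math.exp(-t)) for t in z] + [1.0]
    return phi

def predict(beta, phi, x):
    """Real-valued decision score; its sign is the predicted class."""
    return sum(bi * ei for bi, ei in zip(beta, phi(x)))

def map_chunk(chunk, phi, beta, c=1.0):
    """Map step: weighted statistics (E^T W E, E^T W y) for one chunk of (x, y) pairs."""
    d = len(beta)
    S = [[0.0] * d for _ in range(d)]
    q = [0.0] * d
    for x, y in chunk:  # y is +1 or -1
        e = phi(x)
        r = abs(y - sum(bi * ei for bi, ei in zip(beta, e)))
        w = 1.0 if r <= c else c / r  # assumed Huber-style residual weight
        for i in range(d):
            q[i] += w * e[i] * y
            for j in range(d):
                S[i][j] += w * e[i] * e[j]
    return S, q

def reduce_stats(parts):
    """Reduce step: element-wise sum of the per-chunk statistics."""
    d = len(parts[0][1])
    S = [[sum(p[0][i][j] for p in parts) for j in range(d)] for i in range(d)]
    q = [sum(p[1][i] for p in parts) for i in range(d)]
    return S, q

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for k in range(n):
        p = max(range(k, n), key=lambda i: abs(M[i][k]))
        M[k], M[p] = M[p], M[k]
        for i in range(k + 1, n):
            f = M[i][k] / M[k][k]
            for j in range(k, n + 1):
                M[i][j] -= f * M[k][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x

def init_state(d):
    """Empty statistics and a zero model for d features."""
    return [[0.0] * d for _ in range(d)], [0.0] * d, [0.0] * d

def update(state, chunks, phi, lam=0.9, nu=100.0, c=1.0):
    """Fold one batch (a list of chunks) into the model state (S, q, beta)."""
    S_old, q_old, beta = state
    S_new, q_new = reduce_stats([map_chunk(ch, phi, beta, c) for ch in chunks])
    d = len(beta)
    # The forgetting factor lam in (0, 1] discounts statistics from older batches.
    S = [[lam * S_old[i][j] + S_new[i][j] for j in range(d)] for i in range(d)]
    q = [lam * q_old[i] + q_new[i] for i in range(d)]
    # Regularized least squares: (I / nu + S) beta = q.
    A = [[S[i][j] + (1.0 / nu if i == j else 0.0) for j in range(d)] for i in range(d)]
    return S, q, solve(A, q)

if __name__ == "__main__":
    rng = random.Random(1)
    phi = make_feature_map(n_inputs=1, n_hidden=10)
    state = init_state(11)  # 10 hidden features plus the bias feature
    for _ in range(5):  # five batches, each split into two chunks
        xs = [rng.uniform(-1, 1) for _ in range(100)]
        batch = [([x], 1 if x > 0 else -1) for x in xs]
        state = update(state, [batch[:50], batch[50:]], phi)
    print(predict(state[2], phi, [0.5]) > 0)
```

The design point is that each chunk contributes only a small matrix and vector whose size depends on the number of features, not on the number of samples. The reduce step is therefore a plain sum, and the model can be refreshed batch by batch without revisiting old data.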